Sohail Mohammad

Production AI and Adjacent Thoughts

Forward deployed engineer at Together AI. Notes on production AI and adjacent thoughts.

Together AI: Inference optimization and post-training with enterprise customers

Amazon: Agent platforms for 1.5M+ employees

JPMorgan Chase: RAG system from 0 to 10k+ users

Wendy's: GenAI drive-thru systems

Jack Henry: GPU infrastructure

Start Here

Flagship: Production Inference Economics
A field guide, calculator, and essay series for measuring AI systems by loaded cost per accepted result.

Book: The Honest Field Guide to Production Inference
How to reason about provider choice, routing, retries, quality gates, serving, and migration economics.

Tool: LCPR Calculator
Run the loaded-cost math on your own workload instead of trusting token-price theater.

Selected Work

Latest Research

Latest Writings