The Inference Field Guide · MMXXVI edition 1

Sohail Mohammad

Production Inference Economics

How to measure, model, and operate production inference decisions. Loaded cost per accepted result, not price per million tokens.

Opener: The Deterministic Gate
Part 0: How to Use This Book
Part 1: The Economic Unit
Part 2: Serving Physics
Part 3: Workload Economics
Part 4: Migration Gates
Part 5: Operating the Decision
Appendix: Calculator Manual and Living Reference

❦ ❦ ❦