How to measure, model, and operate production inference decisions. The metric that matters is loaded cost per accepted result, not price per million tokens.