Why we will keep lighting cigars with a blowtorch.
Local and open source models aren't the answer to token frugality.
TL;DR: The winners are still Anthropic, OpenAI, Google, hyperscalers, and the AI infra layer underneath them.
—
The narrative of local, smaller, cheaper open source models as an alternative to overpaying for frontier tokens keeps popping up as a bear case. I think it’s a seductive and intuitive idea: why use a blowtorch to light a cigar? We have a focused job to do, so why all of this general intelligence, scale, cost, latency, and additional risk? Let’s have a small, affordable, lobotomized intelligence for each task.
For focused tasks like summarizing small chunks of text or general data munging, fine. But for coding and agentic workflows operating over longer horizons? I don’t think so.
I see three dynamics:
Total Cost to Resolution (TCR): for complex tasks, the more powerful model will likely ultimately cost less. Stronger models require fewer reasoning tokens, one shot the best solution more quickly, and reduce the number of brute force generate-verify iterations.
The Horizon Wall: Over long trajectories, these small differences have a big impact on quality and thus how long the horizons can be that the models work independently. Small errors compound into catastrophic impact on viability.
Parallelism Capacity: Because of B, smarter models allow more parallelism because they can work longer between human (or agent) review.
I think the compounding nature is the crucial point because it propagates over every phase. A poorer research session compounds into poorer planning which compounds into an inferior implementation and weaker code reviews and so on.
The math: if you compare a 95% success rate to an 80% success rate over a simple 10 step procedure, the former will succeed 60% of the time and the latter 10%. What if the chain is hundreds or thousands of steps?
Perhaps harnesses will become more model heterogeneous (likely at the behest of the orchestrating models themselves) to reduce latency and cost. For planning and agentic trajectories, there is no substitute for the best.

