In the world of AI, demos are easy. They can also be a lie. A large language model's ability to generalize and improvise makes it simple to create a dazzling proof-of-concept, but as Alex Salazar, co-founder of Arcade, warns, "going from that easy demo to something production grade, it might be an entirely different product."
This is the central challenge facing engineering leaders today: bridging the gap between a flashy demo and a secure, reliable, and cost-effective AI system that delivers consistent value. Salazar argues that the key is to intentionally constrain the AI's creative freedom, a strategy he calls "dialing up determinism."
This article explores Salazar's insights on the four critical blockers to production-grade AI, his "calculator" approach to solving them, and the agent-native paradigm that this new way of building enables.
The four production blockers
According to Salazar, 90% of the work in building an AI product happens after the initial demo is working. The journey to production is blocked by four critical challenges:
- Consistency and accuracy: A production system needs to be trustworthy, performing correctly 80-90% of the time, not just in cherry-picked examples.
- Security and safety: Real-world deployments must protect sensitive data. The lack of secure, reliable authentication is why a true "personal assistant agent" still doesn't exist.
- Token costs: A demo that costs $50 to run can quickly scale to millions of dollars when deployed to thousands of users.
- Latency: Slow AI responses kill the user experience, but making models more accurate often makes them slower.
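The token-cost blocker can be made concrete with some rough arithmetic. The sketch below is illustrative only: the function name, the usage figures, and the price per million tokens are all assumptions chosen for the example, not numbers from Salazar or from any provider's price list.

```python
def monthly_token_cost(users, requests_per_user_per_day, tokens_per_request,
                       usd_per_million_tokens, days=30):
    """Estimate monthly spend in USD. Every input is a hypothetical figure."""
    total_tokens = users * requests_per_user_per_day * tokens_per_request * days
    return total_tokens / 1_000_000 * usd_per_million_tokens


# One developer exercising a demo (assumed usage and pricing).
demo_cost = monthly_token_cost(
    users=1, requests_per_user_per_day=20,
    tokens_per_request=8_000, usd_per_million_tokens=10.0,
)

# The same workload per user, rolled out to 5,000 users.
production_cost = monthly_token_cost(
    users=5_000, requests_per_user_per_day=20,
    tokens_per_request=8_000, usd_per_million_tokens=10.0,
)

print(f"demo: ${demo_cost:,.0f}/month, production: ${production_cost:,.0f}/month")
```

Under these assumed inputs the cost grows linearly with the user count, so a bill of roughly $50 a month for one person becomes a six-figure monthly bill for a few thousand.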
The solution: dialing up determinism with the "calculator" approach
Salazar's core insight is to intentionally constrain the AI's decision space. He explains it with a metaphor: instead of giving the AI a blank slate, his team provides it with a very limited set of pre-approved actions. It's like handing the AI a calculator, but first removing its general-purpose buttons and replacing them with a few specific ones that represent the only valid choices. The AI can then only "pick from the buttons" you've given it, which keeps its actions predictable and safe.
This "calculator" approach dramatically improves reliability. By forcing the model to choose from a discrete set of well-defined tools or actions, rather than generating open-ended responses, you dial up determinism. For security in particular, this means handling authentication outside the model. The AI can call a secure tool, but it can't directly manipulate credentials. The security logic is "inside the tool call, inside the button."
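A minimal sketch of the "buttons" idea follows. It is not Arcade's implementation: the tool names, the `_TOKEN_VAULT` dictionary, and the `dispatch` function are hypothetical, and a real system would use a proper secrets store and a real email API. The point is only that the model's output selects from a fixed registry, and the credential lookup happens inside the tool.

```python
from typing import Callable, Dict

# Hypothetical credential store; stands in for a secrets manager or token vault.
_TOKEN_VAULT: Dict[str, str] = {"user-123": "secret-oauth-token"}


def send_reply(user_id: str, args: dict) -> str:
    token = _TOKEN_VAULT[user_id]  # resolved inside the tool; never shown to the model
    # A real tool would call the email API with `token` here.
    return f"reply sent to thread {args['thread_id']}"


def archive_thread(user_id: str, args: dict) -> str:
    token = _TOKEN_VAULT[user_id]  # same pattern: auth stays inside the button
    return f"thread {args['thread_id']} archived"


# The only "buttons" the model is allowed to press.
TOOLS: Dict[str, Callable[[str, dict], str]] = {
    "send_reply": send_reply,
    "archive_thread": archive_thread,
}


def dispatch(user_id: str, model_choice: dict) -> str:
    """Run the tool the model selected, rejecting anything outside the registry."""
    name = model_choice.get("tool")
    if name not in TOOLS:
        raise ValueError(f"tool {name!r} is not an allowed button")
    return TOOLS[name](user_id, model_choice.get("args", {}))


result = dispatch("user-123", {"tool": "send_reply", "args": {"thread_id": "t1"}})
print(result)
```

Because `user_id` comes from the application's session rather than from the model's output, the model can request an action but has no way to read or substitute the credential.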
To get started, Salazar offers two key pieces of advice: first, build a muscle of evaluations right out of the gate, because you can't achieve consistency without rigorous testing. Second, "descope the living daylights out of your project" and focus on narrow workflows where modern APIs already exist.
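What an evaluation habit looks like in its simplest form can be sketched as a table of prompts with expected outcomes and a pass rate computed over it. Everything here is an assumption for illustration: `fake_agent` is a keyword-matching stand-in for a real model call, and the cases are invented.

```python
from typing import Callable, List, Tuple


def run_evals(agent: Callable[[str], str], cases: List[Tuple[str, str]]) -> float:
    """Return the fraction of cases where the agent chose the expected tool."""
    if not cases:
        raise ValueError("need at least one eval case")
    passed = sum(1 for prompt, expected in cases if agent(prompt) == expected)
    return passed / len(cases)


def fake_agent(prompt: str) -> str:
    # Stand-in for a model call: a naive keyword rule, deliberately imperfect.
    return "send_reply" if "reply" in prompt.lower() else "archive_thread"


CASES = [
    ("Reply to Dana's email", "send_reply"),
    ("Archive the newsletter thread", "archive_thread"),
    ("Please reply that I'm out of office", "send_reply"),
    ("Respond to the invoice question", "send_reply"),  # the stand-in misses this one
]

pass_rate = run_evals(fake_agent, CASES)
print(f"pass rate: {pass_rate:.0%}")
```

Running the same cases after every prompt, model, or tool change turns "it seems to work" into a number that can be tracked, which is the habit Salazar is describing.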
Embracing the agent-native paradigm
This deterministic approach enables a new way of building software. Traditional APIs are resource-based (Create, Read, Update, Delete), but AI agents are intention-based. An agent thinks, "I'm gonna reply to an email," an intention that might involve a hundred API calls or zero.
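The difference between a resource-based API and an intention-based tool can be sketched as follows. `FakeMailApi` is a hypothetical in-memory client invented for this example, not a real library; `reply_to_email` is the single intent the agent sees, and it hides the sequence of resource-level calls behind it.

```python
class FakeMailApi:
    """Hypothetical resource-style (CRUD) mail client, kept in memory."""

    def __init__(self):
        self.messages = {
            "m1": {"thread": "t1", "sender": "dana@example.com", "subject": "Q3 plan"}
        }
        self.drafts = {}
        self.sent = []
        self.calls = 0  # counts resource-level calls made on the agent's behalf

    def read_message(self, message_id):
        self.calls += 1
        return self.messages[message_id]

    def create_draft(self, thread, to, subject, body):
        self.calls += 1
        draft_id = f"d{len(self.drafts) + 1}"
        self.drafts[draft_id] = {"thread": thread, "to": to, "subject": subject, "body": body}
        return draft_id

    def send_draft(self, draft_id):
        self.calls += 1
        self.sent.append(self.drafts.pop(draft_id))


def reply_to_email(api, message_id, body):
    """Intent-level tool: one call for the agent, several API calls underneath."""
    original = api.read_message(message_id)
    draft_id = api.create_draft(
        thread=original["thread"],
        to=original["sender"],
        subject="Re: " + original["subject"],
        body=body,
    )
    api.send_draft(draft_id)
    return {"status": "sent", "to": original["sender"]}


api = FakeMailApi()
outcome = reply_to_email(api, "m1", "Sounds good, let's proceed.")
print(outcome, "resource calls:", api.calls)
```

The agent only has to decide to reply; the ordering of read, draft, and send is fixed in ordinary code rather than left to the model.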
This requires a paradigm shift for developers. You can't just consume a generic API; you have to build custom tools grounded in the specific domain and use case of your agent. This is also the key to solving "compounding error rates": because the success probabilities of the individual steps multiply, a multi-step AI workflow fails if any single step goes wrong, and it becomes less reliable with every probabilistic step added. By replacing probabilistic AI steps with deterministic tool calls, you dramatically reduce the chance of failure.
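The compounding effect is easy to see with a little arithmetic. The sketch below assumes that steps fail independently and that a deterministic tool call succeeds essentially every time (modeled here as 1.0); the per-step reliabilities are illustrative numbers, not measurements.

```python
import math


def chain_success(step_reliabilities):
    """Probability that every step succeeds, assuming independent failures."""
    return math.prod(step_reliabilities)


# Ten probabilistic steps, each assumed to be right 95% of the time.
all_probabilistic = chain_success([0.95] * 10)

# Same workflow, with eight steps replaced by deterministic tool calls
# (assumed reliability 1.0) and only two left to the model.
mostly_deterministic = chain_success([1.0] * 8 + [0.95] * 2)

print(f"all probabilistic:    {all_probabilistic:.2f}")     # about 0.60
print(f"mostly deterministic: {mostly_deterministic:.2f}")  # about 0.90
```

Under these assumptions, a chain of ten 95%-reliable steps completes only about six times in ten, while leaving just two steps to the model brings it back to roughly nine in ten.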
Building this way requires new skills. It's not enough to be good at using AI tools; you need "agent-native" engineers who understand this new world. The most effective teams, like Arcade's, create a "cross-pollination" of expertise, where agent experts, auth experts, and distributed systems experts learn from each other to solve these novel challenges.
From dazzling demos to durable value
The path from a dazzling AI demo to a durable, production-grade system is paved with discipline. As Alex Salazar's insights reveal, the magic of generative AI is not enough. Real-world value is only created when that magic is contained, directed, and made predictable.
The "calculator" approach, dialing up determinism by giving AI a specific set of tools to choose from, is the key to bridging this gap. It's a trade-off: you sacrifice some of the model's free-form creativity in exchange for consistency, security, and reliability.
For leaders and developers entering this new agent-native world, the lesson is clear. The most successful AI products won't come from the teams that build the most impressive demos, but from those who have the discipline to engineer a production-ready reality.
To hear more from Alex Salazar on building production-grade AI, listen to him discuss these ideas in depth on the Dev Interrupted podcast.