Engineering leaders face critical decisions about which AI approaches will deliver the most value while managing risks. As large language models (LLMs) like GPT-4 and Claude dominate headlines, specialized AI models designed for specific use cases are emerging as compelling alternatives, particularly for code generation.
Brandon Jung, VP of Ecosystem at Tabnine, argues that specialized models offer distinct advantages in transparency, cost-effectiveness, and tailored functionality that make them better suited for many enterprise coding scenarios. This perspective challenges the notion that the path to artificial general intelligence (AGI) through ever-larger models is the only approach worth pursuing.
AI-Assisted Coding: The Case for Specialization
The fundamental principle of "good data in, good data out" remains as relevant for generative AI as it was for previous generations of machine learning. However, specialized models offer distinct advantages when it comes to code generation.
"We've always known that AI is good data in, good data out, bad data in, bad data out. So from that aspect, that's not really changed just because it's generative," explains Jung. This principle becomes particularly important when considering what code your developers are using as inspiration.
For organizations concerned about data provenance, specialized models provide greater transparency. "Tabnine early on said, hey, based on our customers, we've got a model that is based on only fully permissive open source and that you can audit, right? So that you can see everything that was in it," Jung notes.
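To make the provenance idea concrete, here is a minimal sketch of the kind of license gate an auditable training pipeline implies. The license list, record fields, and repository names are illustrative assumptions, not Tabnine's actual pipeline:

```python
# Hypothetical sketch: filtering candidate training repositories down to
# fully permissive licenses so the resulting dataset can be audited.
# The license set and repo records below are illustrative only.

PERMISSIVE_LICENSES = {"MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "ISC"}

def audit_filter(repos):
    """Keep only repos with a known permissive license; collect the rest for audit logs."""
    accepted, rejected = [], []
    for repo in repos:
        if repo.get("license") in PERMISSIVE_LICENSES:
            accepted.append(repo)
        else:
            rejected.append(repo)
    return accepted, rejected

repos = [
    {"name": "fastjson", "license": "MIT"},
    {"name": "corelib", "license": "GPL-3.0"},   # copyleft: excluded
    {"name": "httputil", "license": "Apache-2.0"},
    {"name": "legacy-app", "license": None},     # unknown license: excluded
]

accepted, rejected = audit_filter(repos)
print([r["name"] for r in accepted])  # ['fastjson', 'httputil']
```

Keeping the rejected list, rather than silently dropping repos, is what makes the dataset auditable: every exclusion decision can be reviewed later.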
This transparency becomes crucial when considering how AI tools operate differently based on the coding context. Jung explains that AI tools need to function differently depending on whether developers are in "discovery mode" (requiring responses within 150 milliseconds) versus chat-style interactions where longer response times are acceptable. Specialized models can be optimized for these different use cases in ways that massive general-purpose models cannot.
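The two interaction modes Jung describes can be sketched as a simple routing table: inline "discovery" completions get a small, fast model under a tight latency budget, while chat requests can afford a larger model. The model names and budgets here are assumptions for illustration, not any vendor's actual configuration:

```python
# Hypothetical sketch of routing by interaction mode. The 150 ms discovery
# budget comes from the article; model names and the chat budget are assumed.

from dataclasses import dataclass

@dataclass
class Route:
    model: str
    latency_budget_ms: int

ROUTES = {
    # Inline completions must feel instantaneous.
    "discovery": Route(model="small-code-model", latency_budget_ms=150),
    # Conversational answers can tolerate much longer responses.
    "chat": Route(model="large-chat-model", latency_budget_ms=3000),
}

def pick_route(mode: str) -> Route:
    """Select a model route for the given interaction mode."""
    return ROUTES[mode]

print(pick_route("discovery"))  # Route(model='small-code-model', latency_budget_ms=150)
```

A single massive general-purpose model cannot satisfy both rows of this table at once, which is the core of the specialization argument.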
Why LLMs Fall Short for Enterprise Code Generation
While large language models have captured public attention with their impressive capabilities, they face significant challenges that specialized models can address more effectively.
Data transparency stands as one of the most significant concerns. "LLMs first off, they just want lots of data. And that's just fundamentally the way that they're set up," Jung explains. This insatiable appetite for data creates problems when considering where that data comes from and what rights are associated with it.
Regulatory challenges are emerging that may further complicate the use of large models. Jung highlights recent European regulations that could dramatically impact model development: "The EU passed what is the Artificial Intelligence Act, or AI Act, and in it, there's some very important aspects that talk specifically about how you cannot have any copyrighted data in your model."
This requirement poses a significant challenge for large models that have ingested vast amounts of internet data, including copyrighted materials. Specialized models built on carefully curated datasets can more easily demonstrate compliance with such regulations.
Cost considerations also favor specialized approaches. While many cloud providers currently subsidize access to their large models, Jung points out that this situation won't last forever: "An API call right now looks to be dramatically subsidized by all the providers in the gold rush of trying to get developers and others to use their APIs... but naturally you have to pay for this and these are continually updated. They are not a one and done option."
Why Data Provenance Is Key for AI-Assisted Development
For engineering leaders implementing AI coding tools, understanding data provenance is essential for building trust and ensuring quality outputs.
Jung emphasizes the importance of identifying and maintaining high-quality source repositories as a foundational practice for engineering organizations: "Think of the idea of a golden repo."
This focus on identifying high-quality code repositories within your organization provides the foundation for effective AI implementation. Without this step, organizations risk reinforcing poor practices or introducing inconsistencies.
Jung notes that the rise of generative AI has heightened the urgency for organizations to prioritize data quality. Because these models rely heavily on clean, reliable inputs, their adoption is driving a renewed focus on improving data hygiene and establishing more rigorous data practices.
"A good amount of the initial adoption has been super, general front-end web development, which, again, is going to look very standard across what you see on general GitHub," Jung notes. However, as organizations look to apply AI to more specialized codebases, including legacy systems, the importance of data quality and specificity increases dramatically.
Prerequisites for Scaling AI Coding Tools
Implementing AI coding tools without having solid development practices in place can amplify existing problems rather than solve them.
Jung emphasizes that organizations should establish DORA metrics (Deployment Frequency, Lead Time for Changes, Mean Time to Restore, and Change Failure Rate) and good development practices before implementing AI tools: "Adding generative AI at least into the coding space without having those in place is a recipe for complete disaster because now you've just put more in the front end of your process and it's not gonna go well."
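Two of the four DORA metrics can be computed from nothing more than a log of deployments. The sketch below shows deployment frequency and change failure rate; the record fields and dates are assumptions for illustration, not a specific tool's schema:

```python
# Illustrative sketch: computing two DORA metrics (deployment frequency and
# change failure rate) from a list of deployment records. Field names and
# the sample data are assumed for this example.

from datetime import date

deployments = [
    {"day": date(2024, 5, 1), "failed": False},
    {"day": date(2024, 5, 2), "failed": True},
    {"day": date(2024, 5, 2), "failed": False},
    {"day": date(2024, 5, 5), "failed": False},
]

# Observation window spans first to last deployment, inclusive.
days_observed = (max(d["day"] for d in deployments)
                 - min(d["day"] for d in deployments)).days + 1

deploy_frequency = len(deployments) / days_observed            # deploys per day
change_failure_rate = sum(d["failed"] for d in deployments) / len(deployments)

print(f"{deploy_frequency:.2f} deploys/day")      # 0.80 deploys/day
print(f"{change_failure_rate:.0%} failure rate")  # 25% failure rate
```

Having these baselines before rolling out AI tools is what lets a team tell whether generated code is actually improving throughput or just inflating change failure rate.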
Security concerns require particular attention, as AI tools don't automatically produce secure code. "One other fun question we usually get is does it write secure code? I swear that question comes up like almost every time," Jung notes. The answer depends entirely on the quality of your security pipelines and processes to validate AI-generated code.
Organizations need robust code review processes to identify complex algorithms that may have been introduced by junior developers using AI tools. Without these guardrails, teams risk incorporating code that no one fully understands, creating potential maintenance and security issues down the line.
How AI Coding Tools Accelerate Developer Onboarding
AI coding tools are transforming how new developers join teams and learn codebases, creating both opportunities and challenges.
"These tools are also pretty good at describing exactly what's going on in a code base. So, when you see the code and go, 'What is this code doing?', it's very good at giving you a very good high level of what's going on," Jung explains. This capability can dramatically accelerate a new developer's understanding of existing code.
Jung described how these tools can pave the way for more meaningful knowledge transfer between senior and junior developers. Rather than spending time on surface-level questions, juniors can quickly get answers to basic queries through AI assistance. This frees up time for deeper discussions—like understanding the reasoning behind architectural decisions or algorithmic choices—leading to richer learning opportunities and more productive collaboration.
In short, by handling the basic "what" and "how" questions, AI tools leave senior developers free to transfer the "why" behind code design decisions. This may require adjusting the ratio of junior to senior developers on a team, though Jung acknowledges the industry is still determining the optimal balance.
Rethinking Code Review in the Age of AI
As AI tools become more integrated into development workflows, code review processes must evolve to address new challenges.
One significant concern is identifying when junior developers introduce complex algorithms they don't fully understand. "The code review process. Okay, before I put it into production, you know, does this look like something that we already have? Is this something completely new? What's the level of complexity of the code that I'm about to implement?" Jung asks.
Jung recommends that teams implement systems to identify and review high-complexity code before it reaches production, ensuring potential risks are addressed early. Organizations should consider tracking which code was AI-assisted for future reference, enabling them to trace back issues that might emerge later. This creates a balance between leveraging AI for innovation while maintaining appropriate control and understanding of the codebase.
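One way to operationalize the complexity screen Jung describes is a pre-merge check that estimates per-function complexity and flags outliers for human review. The sketch below uses a rough cyclomatic-complexity count built only on Python's standard-library `ast` module; the counting rules and the threshold are simplified assumptions, not a production linter:

```python
# Hedged sketch of a pre-merge complexity gate: a rough cyclomatic-complexity
# estimate per function, stdlib only. Threshold and counting rules are
# simplified assumptions for illustration.

import ast

BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.With,
                ast.BoolOp, ast.IfExp, ast.ExceptHandler)

def complexity(source: str) -> dict:
    """Return {function_name: rough cyclomatic complexity} for a module."""
    scores = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            # 1 for the function entry, +1 per branching construct inside it.
            branches = sum(isinstance(n, BRANCH_NODES) for n in ast.walk(node))
            scores[node.name] = 1 + branches
    return scores

def flag_for_review(source: str, threshold: int = 5) -> list:
    """Names of functions complex enough to warrant extra human review."""
    return [name for name, score in complexity(source).items() if score > threshold]

snippet = """
def simple(x):
    return x + 1

def tangled(items):
    out = []
    for i in items:
        if i > 0:
            if i % 2:
                out.append(i)
            elif i % 3:
                out.append(-i)
        while i > 10:
            i -= 1
    return out
"""
print(flag_for_review(snippet))  # ['tangled']
```

Running a check like this in CI, together with tagging AI-assisted commits, gives reviewers exactly the signal Jung asks for: which newly introduced code is complex enough that someone needs to confirm the team actually understands it.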
Choosing the Right AI Approach for Engineering Teams
As AI continues to transform software development, engineering leaders must approach implementation with intentionality and a focus on fundamentals. Specialized AI models offer compelling advantages in transparency, cost, and tailored functionality that make them well-suited for enterprise coding scenarios.
The key to successful implementation lies in understanding your data, establishing strong development practices, and adapting team structures and processes to leverage AI effectively. By focusing on these fundamentals, organizations can harness the power of AI-assisted coding while managing risks and building sustainable competitive advantages.
As Jung concludes: "Know your data. Like, there's no two ways about this. We've known this forever. I think that's when we talk about the piece, that's the hard work... Hey, get good at your data. It's the lifeblood of what you got."
Go Beyond Specialized Models with AI Orchestration
You've explored the powerful advantages of specialized AI models for enterprise coding. Now it's time to scale these benefits across your entire development workflow. Join our workshop, Beyond Copilot: Gaining the AI Advantage, to learn how leading organizations are integrating AI into their daily engineering processes with advanced orchestration practices. What you'll gain:
- Discover your current AI maturity level with our AI Collaboration Matrix, benchmarking your team's adoption against industry leaders.
- Identify actionable opportunities to introduce agentic AI into your critical workflows—moving beyond assistance toward proactive AI-driven development.
- Hear real-world examples from enterprises that successfully transitioned to AI-orchestration workflows, achieving meaningful competitive advantages.
Don't miss this chance to elevate your AI strategy and position your team for sustained success.
AI Adoption & Best Practices
Dive deeper into effective AI adoption:
- AI-powered Code Reviews
- Measuring Impact: The Gen AI Code Report
- How Agentic AI Will Disrupt Your Software Delivery Lifecycle
- AI Metrics: How to Track and Improve Generative AI Coding
- Boost Productivity with AI-Powered Developer Support
Listen to Brandon Jung’s full Dev Interrupted episode below, and read the full transcript here.