Gen AI Research: Software Development Productivity At Google

Gen AI tools have gone from experimental concept to essential tooling across many roles in a few short years. Tools like GitHub Copilot are making waves in the software development industry, prompting deeper discussion about the impact of Gen AI on developer productivity. But how much does Gen AI actually improve developer productivity, particularly in enterprise environments where tasks demand a high degree of context awareness?

New research from Google sheds some light on the productivity impact of Gen AI tooling. The company ran a randomized controlled trial with its own engineers. Most notably, developers using AI tools completed a software development task 21% faster. This article breaks down Google's study and shares the critical insights engineering leaders should take away from it.

The Study: Assessing AI’s Impact on Developer Productivity

Google’s research stands out because it evaluates multiple AI tools working in unison rather than focusing on just one. While previous studies have centered on tools like GitHub Copilot, Google's trial examined the combined effects of three generative AI tools that have been available to Google engineers for a while:

AI Code Completion - Semantic AI code completion that enables single and multi-line code suggestions as developers type.
Smart Paste - Context-aware adjustments when a developer copies code from one area into another location.
Natural Language to Code - An AI assistant trained on various programming languages. Developers activate it by highlighting code and selecting a menu option, and the assistant then makes recommendations.

The researchers sought to design an "enterprise-grade" task that accurately reflected the type of work developers typically handle and utilized the full range of Google’s developer tools. Participants were given ten files containing 474 lines of code and detailed instructions on how to edit the code. Their task was implementing a new service that logs messages from a fake product onto Google’s internal file-storage service.

To complete this, participants needed to update the build, data structure, and test files to ensure alignment with the existing codebase and then build and test the project. The task was considered complete once all the tests passed. It was designed to be complex, requiring a solid understanding of infrastructure, code search, editing, writing, and refactoring test plans. Before the research project began, the task was tested on a small group of engineers, who established that it should take between 30 minutes and 2.5 hours to complete.

The study aimed to answer three questions:

What impact does AI have on time spent completing an enterprise-grade development task?
How do developer and task characteristics influence the AI's impact on task speed?
How do these characteristics interact with AI to accelerate or slow down developers?

Key Findings: 21% Faster Development, But Context Matters

The study's headline result is clear: developers using AI tools completed the task an estimated 21% faster than their counterparts without AI. On average, the AI-assisted group finished in 96 minutes, compared to 114 minutes for the control group. While the improvement is notable, context matters.
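It is worth checking how the headline figure relates to the two group averages. The sketch below computes the relative reduction in task time from the reported means alone. The function name `percent_reduction` is illustrative, not something taken from the study. The raw means give roughly 16%, not 21%, so the headline number presumably comes from the study's statistical model rather than from a simple comparison of averages. That is an inference from the numbers above, not something this article confirms.

```python
def percent_reduction(control_minutes: float, treatment_minutes: float) -> float:
    """Relative reduction in task time versus the control group, in percent."""
    if control_minutes <= 0:
        raise ValueError("control_minutes must be positive")
    return (control_minutes - treatment_minutes) / control_minutes * 100.0


# Group averages as reported in the article.
NO_AI_MEAN_MINUTES = 114.0
AI_MEAN_MINUTES = 96.0

raw_reduction = percent_reduction(NO_AI_MEAN_MINUTES, AI_MEAN_MINUTES)
print(f"Raw reduction in mean task time: {raw_reduction:.1f}%")  # prints 15.8%
```

When reading productivity claims, it helps to ask whether a percentage is a raw difference between groups or an estimate that adjusts for other factors.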

Figure: Box-and-whisker plot comparing time on task between the AI and No AI groups, showing interquartile ranges and average task times, with outliers represented as circles. The AI group has lower task times on average than the No AI group.

The results reveal interesting contrasts with previous research. Another recent study found a 56% productivity increase, but that research involved Upwork freelancers with varying experience levels, while Google’s study focused on 96 full-time developers familiar with the company’s systems. This highlights how AI’s impact on productivity can vary based on the developer’s expertise and environment.

Another study, which examined developers at Microsoft, Accenture, and an anonymous Fortune 100 company, reported a 26% productivity boost with GitHub Copilot. Taken together, these results suggest that Gen AI tools can meaningfully enhance productivity, but the size of the gain varies widely from one setting to the next.

Senior Developers See the Biggest Gains

One surprising result from Google’s study was that senior developers appeared to see the largest productivity gains. This challenges the assumption that AI primarily benefits junior developers by compensating for their lack of experience.

Why do senior developers benefit more? The answer likely lies in task complexity and the deep expertise required to work efficiently in large-scale codebases. While AI tools offer powerful suggestions, they still require significant skill to refine and implement effectively. With their broader experience, senior developers can leverage AI to make faster, smarter decisions, optimizing their workflow in ways that less-experienced developers might not be able to.

This finding underscores a critical insight for software engineering leaders: AI tools aren’t just for boosting junior developer efficiency. Senior developers can use AI to drive even greater productivity and serve as examples for effectively integrating AI into workflows.

Limitations and Opportunities for Future Research

Google’s study provides strong evidence that Gen AI can measurably improve developer productivity, but the relatively small sample size means more research is needed to understand the implications fully. In particular, the seniority subgroups were small enough that it is difficult to know whether that result reflects a real trend or statistical noise. The task was also designed specifically for Google’s infrastructure and internal tools, which limits how far the findings apply outside the company. Lastly, Google didn’t consider code quality, which frequently comes up as an area of concern when adopting Gen AI.
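To see why small subgroups make results hard to interpret, consider a minimal sketch of a bootstrap confidence interval for the difference in mean task time. This is a generic technique, not the method Google used, and every number below is synthetic: the data is generated from an assumed distribution purely for illustration, and `bootstrap_ci` and `synthetic_times` are hypothetical helper names.

```python
import random
import statistics


def bootstrap_ci(control, treatment, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(control) - mean(treatment)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        c = rng.choices(control, k=len(control))
        t = rng.choices(treatment, k=len(treatment))
        diffs.append(statistics.fmean(c) - statistics.fmean(t))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def synthetic_times(n, mean_minutes, rng, sd_minutes=30.0):
    """SYNTHETIC task times in minutes; not data from any study."""
    return [rng.gauss(mean_minutes, sd_minutes) for _ in range(n)]


data_rng = random.Random(42)

# A small subgroup (8 per arm) versus a large one (200 per arm),
# both drawn from the same assumed distributions.
small_ci = bootstrap_ci(synthetic_times(8, 114, data_rng),
                        synthetic_times(8, 96, data_rng))
large_ci = bootstrap_ci(synthetic_times(200, 114, data_rng),
                        synthetic_times(200, 96, data_rng))

small_width = small_ci[1] - small_ci[0]
large_width = large_ci[1] - large_ci[0]
print(f"n=8 per arm:   interval width {small_width:.1f} minutes")
print(f"n=200 per arm: interval width {large_width:.1f} minutes")
```

With only a handful of developers per arm, the interval is several times wider than with a few hundred. In this synthetic setup the underlying gap is 18 minutes, which is smaller than the uncertainty a subgroup of eight carries, so a gap of that size could easily be obscured, or an apparent difference could come from sampling alone.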

Google’s study also invites future research on how AI tools can be tailored to different developer skill levels. For instance, how can junior developers be "up-leveled" to better use AI tools? And what further benefits might be unlocked by offering personalized AI experiences for different types of developers?

Is Gen AI Improving Your Developer Productivity?

Google’s research supports the argument that Gen AI is quickly becoming a strategic asset for organizations seeking to accelerate software delivery. More importantly, their research indicates you should strategically apply multiple Gen AI tools and give them the internal context they need to complete tasks successfully. Do you want to measure the impact of Gen AI tools on your software engineering efficiency? Watch our workshop to learn how.

Ben Lloyd Pearson

Ben hosts Dev Interrupted, a podcast and newsletter for engineering leaders, and is Director of DevEx Strategy at LinearB. Ben has spent the last decade working in platform engineering and developer advocacy to help teams improve workflows, foster internal and external communities, and deliver better developer experiences.
