
Nine in ten engineers now report using AI tools daily. Most engineering leaders evaluate the investment based on Copilot acceptance rates and self-reported speed, and both are trending upward.

AI pays off across the SDLC, but the returns are unevenly distributed. The only randomized controlled trial of experienced developers working in familiar codebases measured a 19% slowdown, even as those developers believed they were 20% faster. The DORA report confirms the pattern: throughput gains are now positive, but instability persists.

Yet most mid-market AI budgets sit in coding, while the phases with the strongest measured payoffs draw almost no investment.

Testing and observability deliver the clearest AI returns today

The strongest AI returns in the SDLC come from two phases most teams haven’t tried yet.

Meta’s TestGen-LLM is the best-documented AI testing deployment at an industrial scale. Nearly three-quarters of its generated test recommendations were accepted into production during Instagram and Facebook test-a-thons, and more than one in ten classes it touched saw measurable coverage improvements.

A reinforcement-learning study confirmed the pattern from a different angle, surfacing the first failing test within the top 16% of the suite. Both approaches converge on the same principle: score components by defect history and change frequency, then focus testing effort where it matters most.
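That scoring principle can be sketched in a few lines. This is a hedged illustration, not Meta's or the study's actual system: the `Component` fields, the 0.7 defect weight, and the sample numbers are all assumptions chosen for the example.

```python
from dataclasses import dataclass

@dataclass
class Component:
    name: str
    defect_count: int   # defects attributed to this component (e.g. last 12 months)
    change_count: int   # commits touching it over the same window

def risk_score(c: Component, defect_weight: float = 0.7) -> float:
    """Blend defect history and change frequency into one priority score."""
    return defect_weight * c.defect_count + (1 - defect_weight) * c.change_count

def prioritize(components: list[Component]) -> list[Component]:
    """Highest-risk components first: generate or run their tests earliest."""
    return sorted(components, key=risk_score, reverse=True)

suite = [
    Component("billing", defect_count=14, change_count=40),
    Component("search", defect_count=2, change_count=30),
    Component("auth", defect_count=9, change_count=10),
]
ordered = prioritize(suite)
```

A real deployment would normalize both signals and recompute scores on every merge, but the ranking logic stays this simple.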

On the operations side, Gartner projects AIOps adopters will reduce MTTR by up to 40% by 2027. The DORA report adds an important qualifier: AI amplifies both existing operational strengths and dysfunctions equally.

What makes both phases safer first investments than coding tools is the blast radius. A bad AI test gets caught by the CI gate. A bad AI alert gets triaged by the on-call engineer. The failure mode is noise, not production defects.

What you can do

Pick one test-heavy codebase. Deploy AI test generation with acceptance gates: tests must build, pass reliably, and increase coverage. If testing ROI proves out, AIOps on one high-alert-volume production service is the natural second move.
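The three acceptance gates above can be expressed as a single filter. A minimal sketch, assuming you have already collected build status, repeated-run results, and a coverage delta for each candidate test; the `CandidateTest` shape and thresholds are illustrative, not a real tool's API.

```python
from dataclasses import dataclass

@dataclass
class CandidateTest:
    name: str
    builds: bool          # compiled / imported without errors
    pass_runs: int        # passing runs out of total_runs (flake check)
    total_runs: int
    coverage_delta: float # change in line coverage, in percentage points

def accept(t: CandidateTest, min_pass_rate: float = 1.0,
           min_coverage_gain: float = 0.0) -> bool:
    """Apply the three gates: must build, pass reliably, increase coverage."""
    if not t.builds:
        return False
    if t.total_runs == 0 or t.pass_runs / t.total_runs < min_pass_rate:
        return False
    return t.coverage_delta > min_coverage_gain

candidates = [
    CandidateTest("test_refund_rounding", True, 10, 10, 0.4),
    CandidateTest("test_login_flaky", True, 8, 10, 1.2),    # flaky: rejected
    CandidateTest("test_dup_coverage", True, 10, 10, 0.0),  # no new coverage: rejected
]
accepted = [t.name for t in candidates if accept(t)]
```

Running each candidate ten times before admission is what makes the flake gate meaningful; a single green run tells you little.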


Requirements and legacy discovery deliver the highest leverage

AI’s per-hour value is highest in the phases that sit before coding.

A peer-reviewed study tested AI-generated acceptance criteria against independent domain expert judgment. More than 80% of the AI-generated criteria were judged as relevant additions to existing user stories.

In structured pilots we have seen run on complex-domain projects, BA and QA pairs who loaded domain context into AI tools and generated acceptance criteria reported shorter rework cycles and better edge-case coverage than manual analysis alone.

Defects caught at requirements cost up to 100 times less to fix than those that reach production, which makes every improvement here a multiplier for everything downstream.

McKinsey’s LegacyX program reports a 40-50% acceleration in modernization timelines. In one case, 20,000 lines of COBOL estimated at 700 to 800 hours of manual effort saw a 40% reduction, with AI agents handling discovery and mapping. AI accelerates the discovery phase that precedes a rewrite; it does not automate the rewrite itself.
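The discovery-and-mapping step is, at its core, building a dependency graph from legacy sources. A hypothetical sketch of the simplest version, scanning COBOL text for static `CALL` statements; real discovery tooling (and anything McKinsey's agents do) would also resolve dynamic calls, copybooks, and JCL job flows, none of which this toy handles.

```python
import re

# Matches static calls like: CALL 'TAXCALC' USING WS-REC.
CALL_RE = re.compile(r"\bCALL\s+'([A-Z0-9-]+)'", re.IGNORECASE)

def call_graph(sources: dict[str, str]) -> dict[str, set[str]]:
    """Map each program name to the set of programs it statically calls."""
    return {name: set(CALL_RE.findall(text)) for name, text in sources.items()}

# Illustrative two-program estate.
sources = {
    "PAYROLL": "PROCEDURE DIVISION. CALL 'TAXCALC' USING WS-REC. CALL 'PRINTCHK'.",
    "TAXCALC": "PROCEDURE DIVISION. COMPUTE WS-TAX = WS-GROSS * 0.2.",
}
graph = call_graph(sources)
```

Even this crude map answers the question a modernization estimate depends on: which programs can be rewritten in isolation, and which sit at the center of the graph.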

What you can do

Pilot AI-assisted requirements on your highest-rework feature area. For legacy code older than 3 years with the original engineers gone, start with a discovery-only AI scoping engagement to map dependencies and surface undocumented business logic before committing to a full modernization program. NeuVantage codifies this discovery into a structured modernization assessment.

Architecture decisions need more caution than any other phase

Architecture is a trade-off analysis. AI is pattern completion. The gap between those two tasks is structural, not a tool limitation.

A peer-reviewed analysis published in TMLR provides the clearest evidence that LLMs can articulate correct architectural principles but do not reliably apply them. The authors call this “computational split-brain syndrome,” where the model understands the trade-off in theory but cannot execute the reasoning that resolves it.

InfoQ’s architect community reached a similar conclusion. AI can suggest alternatives when given sufficient context, but it cannot make decisions.

A large-scale analysis of AI-generated code found that roughly a quarter to a third of the output contained exploitable security weaknesses, which is why security architecture choices in particular should remain with humans.

The less visible risk is the gap between AI-generated code and the team’s understanding of the system it runs. When AI writes faster than engineers build mental models, architectural review becomes reconstruction rather than judgment.

What you can do

Use AI to draft architecture decision records, conduct prior art research, and document. Exclude it from trade-off decisions, novel component design, and security architecture choices.

Sequencing drives more value than switching tools

Most teams invested in coding tools first and are now planning to expand. Gartner estimates that teams applying AI only to coding capture roughly 10% productivity gains, while teams deploying across the full SDLC could reach 25 to 30% by 2028.

Bain’s report finds that coding accounts for only 25 to 35% of the time from idea to product launch. A tool that accelerates a quarter of the timeline has a ceiling.

For teams with less platform discipline than Microsoft or Google, a coding-first rollout risks yielding throughput gains at the expense of downstream instability.

What you can do

Before expanding your coding-tool rollout, instrument downstream effects such as code churn, PR size, review time, and change-failure rate. If quality signals worsen after 90 days, redirect the budget to testing or observability first.
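The before/after comparison can be kept deliberately simple. A sketch under stated assumptions: the metric names, the 10% degradation tolerance, and the sample values are all hypothetical, and all four metrics are treated as lower-is-better.

```python
from statistics import mean

def signals_worsened(before: dict[str, list[float]],
                     after: dict[str, list[float]],
                     tolerance: float = 0.10) -> list[str]:
    """Return metrics whose mean degraded by more than `tolerance` (10%)."""
    worsened = []
    for metric in before:
        b, a = mean(before[metric]), mean(after[metric])
        if b > 0 and (a - b) / b > tolerance:
            worsened.append(metric)
    return worsened

# Illustrative weekly samples from the 90 days before and after rollout.
before = {
    "code_churn_pct":      [12, 14, 13],
    "pr_size_loc":         [180, 200, 190],
    "review_time_hours":   [6, 7, 6],
    "change_failure_rate": [0.10, 0.12, 0.11],
}
after = {
    "code_churn_pct":      [18, 20, 19],        # up ~46%: flagged
    "pr_size_loc":         [185, 195, 200],     # within tolerance
    "review_time_hours":   [6, 6, 7],           # unchanged
    "change_failure_rate": [0.16, 0.18, 0.17],  # up ~55%: flagged
}
flagged = signals_worsened(before, after)
```

Two flagged signals out of four is exactly the case the article describes: throughput may look fine while churn and change-failure rate say the surrounding system is not absorbing the speed.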

The teams getting the most from AI in their SDLC treat it as a sequencing decision. Testing and observability first, where evidence is strongest. Requirements and legacy discovery next, where upstream gains compound. Architecture stays human. Coding tools earn their place when the surrounding system can absorb the speed.

PexAI provides the operating framework for this sequencing, with standardized blueprints that govern how AI integrates across each phase. Explore where to start.


Hiren is CTO at Simform, with extensive experience helping enterprises and startups streamline their business performance through data-driven innovation.
