Agentic AI projects tend to get canceled as they move closer to production, when the organization starts asking three questions at once:

  • What value is this delivering?
  • What does it cost to run at scale?
  • Who is accountable when it takes the wrong action?

Gartner expects over 40% of agentic AI projects to be canceled by the end of 2027 due to escalating costs, unclear business value, or inadequate risk controls.

In this edition, I’ll break down the reasons those three problems recur and what you can do to address them.

You’re treating “agentic” as a category and skipping the form-factor decision

Agentic programs stall early when “agentic” is treated as a label and not a capability.

Gartner calls out agent washing and estimates only about 130 of the “thousands” of agentic AI vendors are real, which makes tool selection and scoping a primary failure point.

Gartner’s own rule keeps teams out of that trap: use agents when decisions are needed, automation for routine workflows, and assistants for simple retrieval.

Iceland used Azure OpenAI to build Genie, a conversational interface to its internal knowledge base, so store teams can ask questions and get a summarized answer with a link back to the source. They started with a narrow rollout (the 2023 Christmas guides) and expanded from there. That’s the point. Match the form factor to the workflow first.

  • Use assistants for knowledge work: retrieve, summarize, and point people to the right source.
  • Use automation for repeatable steps with clear rules.
  • Use agents when the workflow requires decisions and tool-driven actions across systems.

What can you do?

  • Write the form factor on one line: assistant, automation, or agent.
  • Write the action surface on one line: what systems it can touch and what actions it can take.
  • Start with a read-only scope: move to draft actions, then to limited autonomous actions once reliability is measured (a minimal sketch of this spec follows below).
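
One way to pin those three lines down is a small, reviewable spec kept next to the workflow definition. This is a minimal sketch; the names (WorkflowSpec, FormFactor, AutonomyTier) are illustrative and not part of any Azure SDK.

```python
from dataclasses import dataclass, field
from enum import Enum


class FormFactor(Enum):
    ASSISTANT = "assistant"    # retrieve, summarize, point to the source
    AUTOMATION = "automation"  # repeatable steps with clear rules
    AGENT = "agent"            # decisions and tool-driven actions across systems


class AutonomyTier(Enum):
    READ_ONLY = 1          # can look things up, cannot change anything
    DRAFT_ACTIONS = 2      # proposes changes, a human applies them
    LIMITED_AUTONOMY = 3   # executes a narrow set of pre-approved actions


@dataclass
class WorkflowSpec:
    """One line per field: what is being built and what it may touch."""
    name: str
    form_factor: FormFactor
    action_surface: list[str] = field(default_factory=list)  # systems and actions it can touch
    autonomy: AutonomyTier = AutonomyTier.READ_ONLY           # start read-only, promote later


# Example: an internal policy lookup stays an assistant with a read-only surface.
policy_lookup = WorkflowSpec(
    name="internal-policy-lookup",
    form_factor=FormFactor.ASSISTANT,
    action_surface=["knowledge-base:read"],
)
```

Writing the spec down before any prompt work forces the form-factor decision to happen once, explicitly, instead of drifting as the project grows.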


You’re adding agents to workflows before tool calls and permissions are designed

Agentic programs move or stall based on workflow fit. Gartner points out that connecting agents to legacy systems can disrupt workflows and require costly modifications, so teams move faster when workflows are designed around tool calls, permissions, and escalation paths.

AUDI AG treated this as an engineering problem from day one. They built on Azure AI Foundry with App Service and Cosmos DB, and they describe the supporting plumbing (Azure Functions with Key Vault and Cosmos DB) that connects an assistant framework to secure data and services.

What can you do?

  • Pick one workflow with a clear start and finish (HR request, IT helpdesk, order status, internal policy lookup).
  • List the tool calls it needs in plain language (read record, create ticket, update status, notify owner).
  • Map permissions and escalation rules for each tool call before you touch prompts (see the sketch after this list).
  • Build one working path end-to-end for that workflow, then expand scope by adding the next tool call, not the next “agent.”
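
Here is one way to capture that mapping before any prompts exist: a plain, framework-agnostic sketch in which every tool call declares the permission it needs and where it escalates. The tool names and permission strings are illustrative.

```python
# Each tool call is declared with the permission it requires and its escalation path.
TOOL_CALLS = {
    "read_record":   {"permission": "tickets:read",  "escalate_to": None},
    "create_ticket": {"permission": "tickets:write", "escalate_to": "helpdesk-lead"},
    "update_status": {"permission": "tickets:write", "escalate_to": "helpdesk-lead"},
    "notify_owner":  {"permission": "email:send",    "escalate_to": None},
}


def authorize_tool_call(tool_name: str, granted_permissions: set[str]) -> str:
    """Return 'allow', 'escalate', or 'deny' before any tool call is executed."""
    spec = TOOL_CALLS.get(tool_name)
    if spec is None:
        return "deny"  # unknown tools are never executed
    if spec["permission"] in granted_permissions:
        return "allow"
    return "escalate" if spec["escalate_to"] else "deny"


# The agent identity for this workflow starts with read access only.
print(authorize_tool_call("read_record", {"tickets:read"}))    # allow
print(authorize_tool_call("create_ticket", {"tickets:read"}))  # escalate
```

The same table doubles as the checklist for expanding scope: the next increment is one more row, not one more "agent."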

You’re shipping agents without release gates, traces, and drift checks

Agentic projects keep going when teams can answer three production questions quickly: what the agent did, why it did it, and whether it is getting better or worse over time. Scaling agentic AI responsibly requires governance and oversight that matches autonomy, including observability and risk controls that stay in place after launch.

That control surface has two parts:

  • Evaluation gates before release: test agent behavior on known tasks, compare versions, and block regressions.
  • Monitoring after release: track real behavior in production and catch drift, recurring failures, or unsafe tool use before they become incidents. Teams often operationalize this by sending traces and logs to Azure Monitor Application Insights (a minimal wiring sketch follows this list), then using the Agents view to drill into failed runs and tool calls.
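
A minimal wiring sketch of that monitoring path, assuming the azure-monitor-opentelemetry package and an Application Insights connection string exposed through the APPLICATIONINSIGHTS_CONNECTION_STRING environment variable; the span attribute names are illustrative, not a fixed schema.

```python
from azure.monitor.opentelemetry import configure_azure_monitor
from opentelemetry import trace

# Exports OpenTelemetry traces and logs to Application Insights.
configure_azure_monitor()  # reads APPLICATIONINSIGHTS_CONNECTION_STRING from the environment
tracer = trace.get_tracer("agent-workflow")


def traced_tool_call(tool_name: str, arguments: dict) -> dict:
    """Wrap each tool call in a span so failed runs can be drilled into later."""
    with tracer.start_as_current_span(f"tool.{tool_name}") as span:
        span.set_attribute("tool.name", tool_name)
        span.set_attribute("tool.arguments", str(arguments))
        result = {"status": "ok"}  # placeholder for the real tool invocation
        span.set_attribute("tool.status", result["status"])
        return result
```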

Microsoft’s guidance ties evaluation to production telemetry and to metrics designed for agent behavior (task adherence, tool-call accuracy, intent resolution) rather than generic chatbot scores.

Microsoft also emphasizes observability as a combination of traces, logs, and evaluation results to troubleshoot quality, safety, and operational health.

What can you do?

  • Make evaluations a release gate. Every change ships with a scorecard and a pass/fail threshold. You can run these evaluations in Azure AI Foundry using the agent metrics Microsoft documents, then compare results across versions before promoting a change.
  • Trace tool calls end-to-end. Log inputs, tool arguments, outputs, and outcomes to diagnose incidents.
  • Pick three operational metrics and stick to them: task adherence (rework and exceptions), tool-call accuracy (operational errors), and intent resolution (handoffs and escalations).
  • Add drift alerts. Run scheduled evaluations against a small live-traffic sample to catch degradation early, and route alerts through Azure Monitor when a threshold is breached (the sketch below shows the threshold logic).
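
A sketch of the release gate and drift check in plain Python. However the scores are produced (for example with the agent evaluators documented for Azure AI Foundry), the candidate must clear fixed floors and must not regress against the shipped baseline; the metric names, thresholds, and scores below are illustrative.

```python
THRESHOLDS = {"task_adherence": 0.85, "tool_call_accuracy": 0.90, "intent_resolution": 0.85}
MAX_REGRESSION = 0.02  # allowed drop versus the baseline before the gate fails


def release_gate(candidate: dict, baseline: dict) -> bool:
    """Block promotion if any metric is below its floor or regresses past the baseline."""
    for metric, floor in THRESHOLDS.items():
        score = candidate.get(metric, 0.0)
        if score < floor:
            return False  # below the absolute bar
        if score < baseline.get(metric, 0.0) - MAX_REGRESSION:
            return False  # regression versus the shipped version
    return True


baseline = {"task_adherence": 0.91, "tool_call_accuracy": 0.94, "intent_resolution": 0.90}
candidate = {"task_adherence": 0.92, "tool_call_accuracy": 0.88, "intent_resolution": 0.91}
print(release_gate(candidate, baseline))  # False: tool-call accuracy fell below its floor
```

The same check run on a scored sample of live traffic becomes the drift alert: when a metric slips under its floor in production, raise an alert through Azure Monitor instead of failing a release.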

You’re granting broad action access before approval tiers are defined

Agentic workflows are AI-driven processes in which software can make decisions and take actions via tools such as APIs.

That shifts the control problem from “does it answer correctly” to “what can it touch, and what happens when it acts.” Higher autonomy increases open-ended tool use and the risk of irreversible actions, while broad access increases exposure to sensitive information.

A second issue arises when agents read emails, tickets, or documents. Prompt injection can be embedded in user input or hidden inside documents the agent consumes, which is why Microsoft includes prompt shields for both user and document attacks.

What can you do?

  • Set action tiers per workflow and require approval for write actions that change records, access, or money (see the sketch after this list).
  • Use least-privilege identities for tool access and keep secrets in Key Vault with Entra-based access control.
  • Turn on prompt shield defenses for user input and for documents the agent reads.
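
A sketch of the first two controls, assuming the azure-identity and azure-keyvault-secrets packages; the action tier table, vault URL, and secret name are placeholders.

```python
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# Approval tiers per action: anything that writes records, changes access, or moves money is gated.
ACTION_TIERS = {
    "lookup_order":   "auto",      # read-only, runs without approval
    "update_address": "approval",  # changes a record, needs human sign-off
    "issue_refund":   "approval",  # moves money, always gated
}


def requires_approval(action: str) -> bool:
    return ACTION_TIERS.get(action, "approval") != "auto"  # unknown actions default to approval


# The agent's tool identity pulls credentials from Key Vault via Entra ID
# instead of embedding secrets in prompts or config files.
credential = DefaultAzureCredential()
vault = SecretClient(vault_url="https://<your-vault>.vault.azure.net", credential=credential)
api_key = vault.get_secret("ticketing-api-key").value
```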

Agentic programs survive when safety testing keeps pace with releases.

Microsoft has added an AI red teaming agent in Azure AI Foundry through the Azure AI Evaluation SDK, built on its open-source PyRIT framework.

Teams can run automated scans against model and app endpoints, generate reports, and review results alongside the same evaluation scores used for releases.

If you’re moving agents toward production, ThoughtMesh helps teams orchestrate and govern agent workflows.

Stay updated with Simform’s weekly insights.

Hiren is CTO at Simform, with extensive experience in helping enterprises and startups streamline their business performance through data-driven innovation.
