How enterprises can start building agentic AI


Aug 16, 2024

Discover the seven essential resources and skills companies need to build AI agents and tap into the next frontier of generative AI.
Retrieval-augmented generation (RAG) is a game-changer for enterprise-grade AI systems, giving professionals the ability to obtain highly accurate, up-to-date, company-specific responses to their large language model (LLM) queries. Many organizations are now starting to consider how to build on that to access the next frontier for the technology, and potentially its most valuable yet — agentic AI.
Agentic AI expands on the capabilities of RAG by using generative AI to interact with tools, enabling developers to create and run autonomous, multistep workflows. In simple terms, agentic AI systems can carry out complex tasks and take actions based on guided and informed reasoning as opposed to just providing answers. With tools, LLMs can connect and interact with external resources (such as the internet, databases, CRMs, and APIs) to achieve an overall objective.
Running AI agents requires enterprises to have certain resources in place and presents new risks that need to be mitigated by careful planning and guardrails.
Here are seven key elements companies need to consider in order to build safe, effective AI agents.
1. Orchestration
Agentic AI systems need an orchestration element built into their structure to coordinate the various tools and processes that run workflows. This also gives organizations greater visibility into the system’s reasoning and logic — and prevents it from becoming a “black box.” A leading approach to incorporating structured flow into the decision-making process of an agentic AI system is through state machines. A state machine manages different conditions in a workflow, effectively giving agents situational awareness to understand their context, how to react to triggers, and the assets they can call on. The logic built into the state machine will govern how the system defines different states, handles transitions, and executes appropriate actions for each state. State machines and other components of the system can be developed using open-source frameworks for building applications that integrate LLMs with other tools. Engineers can implement the required chains (the sequence of operations an agent performs) for different tasks.
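To make this concrete, the sketch below shows a tiny state-machine-driven workflow in Python. The states, handler functions, and stubbed LLM and tool calls are hypothetical placeholders rather than part of any specific framework; a production system would wire these to real models and data sources.

```python
from enum import Enum, auto

class State(Enum):
    PLAN = auto()       # decide which tool (if any) the agent should call next
    CALL_TOOL = auto()  # execute the chosen tool and capture its result
    RESPOND = auto()    # generate the final answer from accumulated context
    DONE = auto()

def plan(context):
    # Placeholder for an LLM call that inspects the context and picks an action.
    needs_tool = "tool_result" not in context
    return State.CALL_TOOL if needs_tool else State.RESPOND

def call_tool(context):
    # Placeholder for a real tool invocation (database query, API request, etc.).
    context["tool_result"] = "rows returned from a hypothetical database lookup"
    return State.PLAN

def respond(context):
    context["answer"] = f"Answer drafted from: {context['tool_result']}"
    return State.DONE

HANDLERS = {State.PLAN: plan, State.CALL_TOOL: call_tool, State.RESPOND: respond}

def run_workflow(query, max_steps=10):
    """Drive the agent through states until it finishes or hits the step limit."""
    context, state = {"query": query}, State.PLAN
    for _ in range(max_steps):  # the step limit doubles as a simple guardrail
        if state is State.DONE:
            break
        state = HANDLERS[state](context)
    return context.get("answer")

print(run_workflow("Which customers are overdue on invoices?"))
```

The transition logic lives in one place, so the workflow's reasoning can be inspected and audited rather than hidden inside a single monolithic prompt.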
2. Guardrails
Because agentic AI systems act autonomously, it is essential to set up guardrails that define and limit their scope of action. This makes it crucial to use an advanced LLM with built-in traceability and transparency. Such systems can explain their reasoning and produce logs that show where and why they make decisions. Our Command models, for example, show their reasoning through citations and explicit planning steps, allowing their decisions to be audited and explained. Specialized third-party frameworks can further improve traceability by creating visual graphs that show each decision point in a system’s process. Another important guardrail is keeping humans in the loop to limit the system’s autonomy. A robust agentic AI system needs transparent decision-making criteria and a user-friendly interface for manual human approval, ensuring that the most complex, sensitive decisions rest on human judgment. Guardrails can also prevent undesired actions or outputs through detection mechanisms such as AI-driven monitoring, automated alerts, and predefined rule enforcement.
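As an illustration, here is a minimal sketch of two such guardrails in Python: a predefined rule check and a human-approval gate for sensitive actions. The action names, threshold, and console prompt are hypothetical stand-ins for a real policy engine and review interface.

```python
# Hypothetical guardrails: rule-based checks on proposed actions plus a
# human-approval gate for sensitive ones.

SENSITIVE_ACTIONS = {"send_payment", "delete_record", "email_customer"}
MAX_PAYMENT = 1_000  # predefined rule: payments above this are always blocked

def violates_rules(action, params):
    # Enforce hard limits before anything reaches a human reviewer.
    return action == "send_payment" and params.get("amount", 0) > MAX_PAYMENT

def human_approves(action, params):
    # Stand-in for a user-facing approval UI; here we simply ask on the console.
    reply = input(f"Agent wants to run {action} with {params}. Approve? [y/N] ")
    return reply.strip().lower() == "y"

def execute_with_guardrails(action, params, run_action):
    if violates_rules(action, params):
        return "blocked: action violates a predefined rule"
    if action in SENSITIVE_ACTIONS and not human_approves(action, params):
        return "blocked: human reviewer rejected the action"
    return run_action(action, params)

# Example: this payment is under the limit but still requires human sign-off.
result = execute_with_guardrails(
    "send_payment", {"amount": 250, "recipient": "ACME Corp"},
    run_action=lambda action, params: f"{action} executed with {params}",
)
print(result)
```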
3. Knowledgeable teams
It’s vital to have personnel in place who understand generative models, basic RAG systems, necessary guardrails, and common AI pitfalls. Where that expertise is lacking or needs updating, there are excellent educational resources that anyone can access. LLM University, for example, has a wealth of learning modules covering everything from AI deployment to RAG to tool use. Beyond upskilling, companies may also need to hire new talent, as well as reinforce cultural values around this technology to ensure they have the right knowledge and mindsets in place.
4. A powerful, enterprise-grade LLM
Agentic AI workflows need LLMs that are specifically trained to perform multi-step tool use. That’s not the case with every LLM, especially older models. Cohere Command A, for example, is specifically designed to act as a core reasoning engine for enterprises. It can combine multiple tools over different steps to accomplish complex tasks and correct itself in the face of a bug or failure, increasing its success rate.
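The self-correction behavior described above can be sketched as a simple retry loop that feeds a tool failure back to the model. The `ask_model` and `run_tool` functions below are hypothetical placeholders, not any vendor’s actual API.

```python
# A hedged sketch of multi-step tool use with self-correction: when a tool call
# fails, the error message is returned to the model so it can revise its request.

def ask_model(prompt):
    # Placeholder for an LLM call that returns a proposed tool invocation.
    return {"tool": "sql_query", "args": {"query": "SELECT * FROM orders"}}

def run_tool(call):
    # Placeholder for executing the tool; raises on malformed input.
    if "query" not in call["args"]:
        raise ValueError("missing query")
    return "42 rows"

def call_with_self_correction(task, max_attempts=3):
    prompt = task
    for attempt in range(max_attempts):
        call = ask_model(prompt)
        try:
            return run_tool(call)
        except Exception as err:
            # Feed the failure back so the model can adjust its next attempt.
            prompt = f"{task}\nPrevious attempt failed with: {err}. Please fix the call."
    return None
```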
5. Tool architecture
Building agentic AI requires you to define the various tools you’ll be using to take actions and interact with other assets, such as databases or APIs. Those could be chatbot and sentiment analysis tools for customer service, fraud detection and expense management tools for accounting, or candidate screening and employee engagement tools for HR functions. The first step is to define how different tools connect to an external system and the specific calls that will be made. Second, it’s important to define the tool schema, which explains to the LLM what a tool does and which parameters it accepts.
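Here is a minimal sketch of what a tool and its schema might look like in Python. The JSON-schema-style field names are illustrative; the exact format depends on the model and framework being used.

```python
# Hypothetical tool plus the schema that tells the model what it does and
# which parameters it accepts.

def lookup_customer(customer_id: str) -> dict:
    """Hypothetical tool: fetch a customer record from an internal CRM."""
    return {"customer_id": customer_id, "status": "active"}  # stubbed response

lookup_customer_schema = {
    "name": "lookup_customer",
    "description": "Fetch a customer's record from the CRM by customer ID.",
    "parameters": {
        "type": "object",
        "properties": {
            "customer_id": {"type": "string", "description": "Unique CRM customer ID."}
        },
        "required": ["customer_id"],
    },
}

# The schema is passed to the LLM alongside the conversation so it can decide when
# to call the tool; this mapping lets the application execute whatever it picks.
AVAILABLE_TOOLS = {"lookup_customer": lookup_customer}
```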
6. Evaluation
It is essential to consider multiple factors when evaluating an agentic AI system. First, the generative language models should be tested against an evaluation dataset that represents the kind of real-world data the agent will encounter, so their performance can be judged across a range of scenarios. Second, the agent's overall architecture, including input/output mechanisms, data processing pipelines, and decision-making processes, should be evaluated. Finally, it is critical to evaluate the deployment platform to ensure it matches the enterprise architecture strategy and can scale with production needs.
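A bare-bones version of the first step, testing against an evaluation dataset, might look like the sketch below. The example cases, the `run_agent` placeholder, and the keyword-match scoring are simplified assumptions; real evaluations typically use domain-specific data and richer metrics.

```python
# Hypothetical evaluation harness: run each labelled case through the agent
# and report a simple accuracy score.

EVAL_CASES = [
    {"input": "What is our refund policy for damaged goods?", "expected": "full refund"},
    {"input": "Which plan includes priority support?", "expected": "enterprise"},
]

def run_agent(query):
    # Placeholder for the full agentic pipeline being evaluated.
    return "full refund" if "refund" in query else "enterprise"

def evaluate(cases):
    correct = sum(1 for c in cases if c["expected"] in run_agent(c["input"]).lower())
    return correct / len(cases)

print(f"accuracy: {evaluate(EVAL_CASES):.0%}")
```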
7. Moving to production
Agentic AI systems need to undergo extensive testing before “going live” in production. A unit test bank — a collection of predefined inputs and expected outputs used to verify the accuracy and performance of system components — is necessary to validate agentic outputs, and its contents will depend on the use case. Additionally, since agentic AI systems consume a large number of tokens (the building blocks of text they process), companies must consider how an agentic system will scale and secure the necessary resources to avoid running out of capacity once the system goes into production. This step is critical: failing to plan for cost and resource requirements can stop an agentic AI system from ever reaching production.
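A unit test bank can be as simple as a table of inputs and expected outputs driven by a test runner. The sketch below assumes pytest and a hypothetical `run_agent` entry point; the cases themselves would be curated per use case.

```python
# Hypothetical unit test bank: predefined inputs paired with keywords the
# agent's output is expected to contain.
import pytest

TEST_BANK = [
    ("Summarise the Q3 sales report", "q3"),
    ("List open support tickets older than 30 days", "tickets"),
]

def run_agent(query):
    # Placeholder for the deployed agentic workflow under test.
    return f"Stubbed response mentioning {'q3' if 'Q3' in query else 'tickets'}"

@pytest.mark.parametrize("query,expected_keyword", TEST_BANK)
def test_agent_output_contains_expected_keyword(query, expected_keyword):
    assert expected_keyword in run_agent(query).lower()
```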
Agentic systems are still in the early stages of adoption, but the technology is maturing quickly. In the coming years, organizations will likely use agentic AI to run increasingly complex tasks. Agents can take on the more tedious parts of workflows and free up employees to focus on higher-value work. For example, a sales and lead-generation agent, with points of human review built into its workflow, could scan company databases and LinkedIn profiles, identify businesses that match the company’s target audience, and send them tailored outreach messages. Now is the perfect time for companies to start building the foundational resources and skills they’ll need to thrive and compete in this next stage of generative AI.
Learn how to build an AI agent today.
Get Started!