Agentic AI: Orchestration Frameworks
Build AI agent systems with orchestration frameworks. Covers agent architectures, tool calling, multi-agent coordination, LangGraph, CrewAI, AutoGen, evaluation, and production deployment.
Agentic AI represents the shift from “AI that answers questions” to “AI that takes actions.” Instead of generating text and stopping, agents plan multi-step workflows, invoke tools, make decisions based on intermediate results, and loop until a task is complete. This is the architecture behind autonomous coding assistants, research agents, and enterprise workflow automation.
But the gap between “agent demo” and “agent in production” is enormous. Demos work because the happy path is scripted. Production fails because agents encounter ambiguity, tools return errors, plans need revision, and costs spiral when agents get stuck in loops. This guide covers the frameworks, patterns, and engineering discipline needed to deploy agents that actually work.
Agent Architecture Patterns
Pattern 1: ReAct (Reason + Act)
The foundational pattern. The agent reasons about what to do, takes an action, observes the result, and iterates:
```
Thought: I need to find the customer's order status
Action: query_database(customer_id="C-12345", table="orders")
Observation: [Order #9876: shipped, tracking: 1Z999...]
Thought: I have the order status. Let me format the response.
Action: respond("Your order #9876 has shipped. Tracking: 1Z999...")
```
```python
class ReActAgent:
    def __init__(self, llm, tools, max_iterations=10):
        self.llm = llm
        self.tools = {t.name: t for t in tools}
        self.tool_definitions = [t.schema for t in tools]
        self.max_iterations = max_iterations

    def run(self, task: str) -> str:
        messages = [{"role": "system", "content": self.system_prompt}]
        messages.append({"role": "user", "content": task})

        for _ in range(self.max_iterations):
            response = self.llm.generate(
                messages=messages,
                tools=self.tool_definitions,
            )

            if response.tool_calls:
                # Record the assistant turn first: chat-completion-style APIs
                # require each tool result to reference a prior tool call.
                messages.append({
                    "role": "assistant",
                    "content": response.content,
                    "tool_calls": response.tool_calls,
                })
                for call in response.tool_calls:
                    tool = self.tools[call.function.name]
                    result = tool.execute(**call.function.arguments)
                    messages.append({
                        "role": "tool",
                        "content": str(result),
                        "tool_call_id": call.id,
                    })
            else:
                return response.content  # No tool calls: agent is done

        return "Agent reached maximum iterations without completing."
```
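The control flow above can be exercised end to end with stubbed-out pieces. A minimal sketch: `FakeLLM`, `ToolCall`, and `lookup_order` below are hypothetical stand-ins (not part of any real SDK) that script one tool call followed by a final answer.

```python
from dataclasses import dataclass, field

@dataclass
class ToolCall:
    name: str
    arguments: dict

@dataclass
class LLMResponse:
    content: str = ""
    tool_calls: list = field(default_factory=list)

class FakeLLM:
    """Scripted LLM: first turn requests a tool, second turn answers."""
    def __init__(self):
        self.turn = 0

    def generate(self, messages):
        self.turn += 1
        if self.turn == 1:
            return LLMResponse(tool_calls=[ToolCall("lookup_order", {"customer_id": "C-12345"})])
        # Final answer incorporates the last tool observation
        return LLMResponse(content=f"Order status: {messages[-1]['content']}")

def lookup_order(customer_id):
    return f"shipped ({customer_id})"

TOOLS = {"lookup_order": lookup_order}

def react_loop(llm, task, max_iterations=5):
    messages = [{"role": "user", "content": task}]
    for _ in range(max_iterations):
        response = llm.generate(messages)
        if not response.tool_calls:
            return response.content
        for call in response.tool_calls:
            result = TOOLS[call.name](**call.arguments)
            messages.append({"role": "tool", "content": str(result)})
    return "max iterations reached"

print(react_loop(FakeLLM(), "Where is my order?"))
# → Order status: shipped (C-12345)
```

Swapping `FakeLLM` for a real client leaves the loop unchanged: the agent logic only depends on "tool calls present or not."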
Pattern 2: Plan-Execute
The agent creates a plan first, then executes steps sequentially:
```python
class PlanExecuteAgent:
    def __init__(self, planner_llm, executor_llm, tools):
        self.planner = planner_llm
        self.executor = executor_llm
        self.tools = tools

    def run(self, task: str) -> str:
        # Step 1: Create plan
        plan = self.planner.generate(
            f"Create a step-by-step plan to accomplish: {task}\n"
            f"Available tools: {[t.name for t in self.tools]}\n"
            f"Return as numbered steps."
        )
        steps = parse_plan(plan)

        results = []
        i = 0
        while i < len(steps):
            # Step 2: Execute each step
            result = self.executor.generate(
                f"Execute this step: {steps[i]}\n"
                f"Previous results: {results}\n"
                f"Use the appropriate tool.",
                tools=self.tools,
            )
            results.append({"step": steps[i], "result": result})
            i += 1

            # Step 3: Replan if needed, then continue on the revised steps
            # (the planner sees completed work, so the new plan should
            # contain only the remaining steps)
            if should_replan(results):
                plan = self.planner.generate(
                    f"Original task: {task}\n"
                    f"Completed steps: {results}\n"
                    f"The previous plan needs adjustment. Create an updated plan."
                )
                steps = parse_plan(plan)
                i = 0

        return synthesize_results(results)
```
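The `parse_plan` helper is left undefined above. A minimal version, assuming the planner returns one numbered line per step, might look like:

```python
import re

def parse_plan(plan_text: str) -> list[str]:
    """Extract '1. ...' or '2) ...' style numbered steps from planner output."""
    steps = []
    for line in plan_text.splitlines():
        match = re.match(r"\s*\d+[.)]\s+(.*)", line)
        if match:
            steps.append(match.group(1).strip())
    return steps

plan = """Here is the plan:
1. Search the web for recent sales figures.
2. Aggregate the numbers by region.
3) Draft the summary report."""

print(parse_plan(plan))
# → ['Search the web for recent sales figures.', 'Aggregate the numbers by region.', 'Draft the summary report.']
```

In practice it is more robust to ask the planner for JSON output and parse that, since numbered-list formats drift across models.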
Pattern 3: Multi-Agent Coordination
Multiple specialized agents collaborate on complex tasks:
```
┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│ Researcher  │      │ Analyst     │      │ Writer      │
│ Agent       │────▶│ Agent       │────▶│ Agent       │
│             │      │             │      │             │
│ Tools:      │      │ Tools:      │      │ Tools:      │
│ - web_search│      │ - calculate │      │ - format_doc│
│ - read_url  │      │ - query_db  │      │ - review    │
│ - summarize │      │ - visualize │      │ - publish   │
└─────────────┘      └─────────────┘      └─────────────┘
       │                    │                    │
       └────────────────────┴────────────────────┘
                            │
                   ┌───────────────┐
                   │ Orchestrator  │
                   │ (Coordinator) │
                   └───────────────┘
```
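One minimal way to wire the coordinator above is a sequential pipeline, with each specialist reduced to a plain function. All names here are illustrative; in a real system each stage would be a full agent with its own tools.

```python
def researcher(task: str) -> str:
    # Would call web_search / read_url / summarize in a real system
    return f"findings for '{task}'"

def analyst(findings: str) -> str:
    # Would call calculate / query_db / visualize
    return f"analysis of [{findings}]"

def writer(analysis: str) -> str:
    # Would call format_doc / review / publish
    return f"report: {analysis}"

class Orchestrator:
    """Runs specialists in sequence, passing each output to the next."""
    def __init__(self, stages):
        self.stages = stages

    def run(self, task: str) -> str:
        payload = task
        for stage in self.stages:
            payload = stage(payload)
        return payload

pipeline = Orchestrator([researcher, analyst, writer])
print(pipeline.run("Q3 revenue trends"))
# → report: analysis of [findings for 'Q3 revenue trends']
```

The value of a dedicated orchestrator shows up once routing becomes conditional (e.g. send back to the researcher if the analyst flags missing data); that is exactly the case the graph-based frameworks below are built for.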
Framework Comparison
| Framework | Architecture | Strengths | Weakness | Best For |
|---|---|---|---|---|
| LangGraph | Graph-based state machines | Fine-grained control, conditional branching | Steep learning curve | Complex workflows with conditional logic |
| CrewAI | Role-based multi-agent | Natural role definition, simple API | Less control over execution flow | Team-like collaboration patterns |
| AutoGen | Conversation-based | Supports human-in-the-loop, multi-agent chat | Verbose, hard to debug | Research, group decision-making |
| LlamaIndex Workflows | Event-driven | Strong RAG integration | Newer, smaller ecosystem | RAG-heavy agent applications |
| Custom (from scratch) | Whatever you need | Full control, no framework overhead | Maintenance burden | Simple agents, production-critical systems |
LangGraph Example
```python
from langgraph.graph import StateGraph, END
from typing import TypedDict

class AgentState(TypedDict):
    task: str
    plan: list[str]
    current_step: int
    results: list[dict]
    final_output: str

def create_plan(state: AgentState) -> dict:
    plan = planner_llm.generate(
        f"Create a plan for: {state['task']}"
    )
    return {"plan": parse_steps(plan), "current_step": 0, "results": []}

def execute_step(state: AgentState) -> dict:
    step = state["plan"][state["current_step"]]
    result = executor_llm.generate(
        f"Execute: {step}\nPrevious: {state['results']}",
        tools=available_tools,
    )
    # Return partial state updates rather than mutating state in place
    return {
        "results": state["results"] + [{"step": step, "result": result}],
        "current_step": state["current_step"] + 1,
    }

def should_continue(state: AgentState) -> str:
    if state["current_step"] >= len(state["plan"]):
        return "synthesize"
    return "execute"

def synthesize(state: AgentState) -> dict:
    output = llm.generate(
        f"Synthesize results for '{state['task']}': {state['results']}"
    )
    return {"final_output": output}

# Build graph
workflow = StateGraph(AgentState)
workflow.add_node("plan", create_plan)
workflow.add_node("execute", execute_step)
workflow.add_node("synthesize", synthesize)

workflow.set_entry_point("plan")
workflow.add_edge("plan", "execute")
workflow.add_conditional_edges("execute", should_continue, {
    "execute": "execute",
    "synthesize": "synthesize",
})
workflow.add_edge("synthesize", END)

agent = workflow.compile()
```
Tool Design
Well-designed tools are the foundation of reliable agents:
```python
import asyncio

class Tool:
    def __init__(self, name, description, function, parameters_schema):
        self.name = name
        self.description = description
        self.function = function  # an async callable
        self.schema = parameters_schema

    async def execute(self, **kwargs):
        # Validate inputs
        validated = self.validate(kwargs)

        # Execute with timeout (wait_for must be awaited)
        try:
            result = await asyncio.wait_for(
                self.function(**validated),
                timeout=30.0,
            )
            return {"status": "success", "data": result}
        except asyncio.TimeoutError:
            return {"status": "error", "message": "Tool execution timed out"}
        except Exception as e:
            return {"status": "error", "message": str(e)}

# Tool design principles:
# 1. Clear, unambiguous descriptions
# 2. Strict parameter validation
# 3. Timeout protection
# 4. Structured error responses
# 5. Idempotent where possible
```
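Principle 2 deserves a concrete shape. Below is one possible hand-rolled `validate` against a JSON-Schema-style parameter spec; a production system would more likely use `jsonschema` or Pydantic, and the `schema` dict here is an illustrative example.

```python
def validate(kwargs: dict, schema: dict) -> dict:
    """Check required keys and primitive types against a JSON-Schema-like spec."""
    type_map = {"string": str, "integer": int, "number": (int, float), "boolean": bool}
    for key in schema.get("required", []):
        if key not in kwargs:
            raise ValueError(f"Missing required parameter: {key}")
    for key, value in kwargs.items():
        spec = schema["properties"].get(key)
        if spec is None:
            raise ValueError(f"Unexpected parameter: {key}")
        expected = type_map[spec["type"]]
        if not isinstance(value, expected):
            raise ValueError(f"{key}: expected {spec['type']}, got {type(value).__name__}")
    return kwargs

schema = {
    "properties": {
        "customer_id": {"type": "string"},
        "limit": {"type": "integer"},
    },
    "required": ["customer_id"],
}

print(validate({"customer_id": "C-12345", "limit": 5}, schema))
# → {'customer_id': 'C-12345', 'limit': 5}
```

Raising on unexpected parameters matters for agents specifically: a hallucinated argument name should surface as a structured error the agent can correct, not be silently dropped.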
Production Safety
Cost Controls
```python
import time

class AgentBudget:
    def __init__(self, max_tokens=50_000, max_tool_calls=20, max_time_seconds=120):
        self.max_tokens = max_tokens
        self.max_tool_calls = max_tool_calls
        self.max_time = max_time_seconds
        self.tokens_used = 0
        self.tool_calls = 0
        self.start_time = time.time()

    def check(self):
        if self.tokens_used > self.max_tokens:
            raise BudgetExceeded(f"Token limit: {self.tokens_used}/{self.max_tokens}")
        if self.tool_calls > self.max_tool_calls:
            raise BudgetExceeded(f"Tool call limit: {self.tool_calls}/{self.max_tool_calls}")
        if time.time() - self.start_time > self.max_time:
            raise BudgetExceeded(f"Time limit: {self.max_time}s exceeded")
```
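Hooked into the agent loop, the budget is charged after every LLM turn and checked before the next one. `AgentBudget` and `BudgetExceeded` are restated minimally here so the sketch runs standalone; the token and call counts are fabricated for the demo.

```python
import time

class BudgetExceeded(Exception):
    pass

class AgentBudget:
    # Minimal restatement of the class above, for a runnable demo
    def __init__(self, max_tokens=50_000, max_tool_calls=20, max_time_seconds=120):
        self.max_tokens = max_tokens
        self.max_tool_calls = max_tool_calls
        self.max_time = max_time_seconds
        self.tokens_used = 0
        self.tool_calls = 0
        self.start_time = time.time()

    def check(self):
        if self.tokens_used > self.max_tokens:
            raise BudgetExceeded(f"Token limit: {self.tokens_used}/{self.max_tokens}")
        if self.tool_calls > self.max_tool_calls:
            raise BudgetExceeded(f"Tool call limit: {self.tool_calls}/{self.max_tool_calls}")
        if time.time() - self.start_time > self.max_time:
            raise BudgetExceeded(f"Time limit: {self.max_time}s exceeded")

budget = AgentBudget(max_tokens=1_000, max_tool_calls=3)
try:
    for turn in range(10):          # stand-in for the agent loop
        budget.tokens_used += 400   # charge tokens from each LLM response
        budget.tool_calls += 1      # charge one tool call per turn
        budget.check()              # raises once any limit is crossed
except BudgetExceeded as e:
    print(f"Stopped: {e}")
# → Stopped: Token limit: 1200/1000
```

The important property is that `check()` sits inside the loop, so a runaway agent is stopped mid-task instead of after the bill arrives.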
Human-in-the-Loop
```python
def execute_with_approval(agent, task, approval_required_tools):
    """Pause agent execution for human approval on sensitive actions."""
    for step in agent.plan(task):
        if step.tool_name in approval_required_tools:
            approval = request_human_approval(
                action=step.tool_name,
                parameters=step.parameters,
                context=step.reasoning,
            )
            if not approval.approved:
                agent.replan(f"Action '{step.tool_name}' was rejected: {approval.reason}")
                continue
        result = agent.execute_step(step)
```
Evaluation
| Metric | What It Measures | Target |
|---|---|---|
| Task completion rate | % of tasks completed successfully | > 80% |
| Steps to completion | Average number of agent steps | Lower is better (efficiency) |
| Tool selection accuracy | Did the agent pick the right tool? | > 90% |
| Cost per task | Total LLM + tool costs per task | Track trend, optimize |
| Error recovery rate | Agent recovers from tool failures | > 70% |
| Latency (end-to-end) | Time from task submission to completion | < 2 minutes for standard tasks |
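A lightweight harness for the first two metrics, computed over a batch of recorded runs. The record shape (`completed`, `steps`) is an assumption, not a standard format; the sample data is fabricated.

```python
def evaluate_runs(runs: list[dict]) -> dict:
    """Compute completion rate and average steps from recorded agent runs."""
    completed = [r for r in runs if r["completed"]]
    completion_rate = len(completed) / len(runs)
    avg_steps = sum(r["steps"] for r in completed) / max(len(completed), 1)
    return {
        "task_completion_rate": round(completion_rate, 3),
        "avg_steps_to_completion": round(avg_steps, 2),
        "meets_target": completion_rate > 0.80,  # target from the table above
    }

runs = [
    {"task": "t1", "completed": True, "steps": 4},
    {"task": "t2", "completed": True, "steps": 6},
    {"task": "t3", "completed": False, "steps": 10},
    {"task": "t4", "completed": True, "steps": 5},
]

print(evaluate_runs(runs))
# → {'task_completion_rate': 0.75, 'avg_steps_to_completion': 5.0, 'meets_target': False}
```

Averaging steps over completed runs only is deliberate: failed runs that hit the iteration cap would otherwise inflate the efficiency number.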
Anti-Patterns
| Anti-Pattern | Problem | Fix |
|---|---|---|
| No iteration limit | Agent loops infinitely on unsolvable tasks | Hard limits: max iterations, max tokens, max time |
| Vague tool descriptions | Agent picks wrong tool, wastes steps | Write descriptions like API docs: precise, with examples |
| No error handling | Tool failures crash the agent | Return structured errors, teach agent to handle failures |
| Premature multi-agent | Coordination overhead exceeds single-agent simplicity | Start with one agent, add more only when tasks are truly parallelizable |
| No observability | Can’t debug why agent made bad decisions | Log all reasoning, tool calls, and observations |
| Trusting agent output | Agent outputs used without validation | Human review for high-stakes actions, automated validation for others |
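The first anti-pattern has a cheap runtime guard beyond hard iteration caps: detect when the agent re-issues the same tool call with the same arguments. A minimal sketch:

```python
import json
from collections import Counter

class LoopDetector:
    """Flags an agent that repeats an identical tool call too many times."""
    def __init__(self, max_repeats=2):
        self.max_repeats = max_repeats
        self.seen = Counter()

    def record(self, tool_name: str, arguments: dict) -> bool:
        """Return True once this exact call has exceeded the repeat limit."""
        key = (tool_name, json.dumps(arguments, sort_keys=True))
        self.seen[key] += 1
        return self.seen[key] > self.max_repeats

detector = LoopDetector(max_repeats=2)
print(detector.record("web_search", {"q": "order status"}))  # False (1st call)
print(detector.record("web_search", {"q": "order status"}))  # False (2nd call)
print(detector.record("web_search", {"q": "order status"}))  # True  (3rd: likely loop)
```

When the detector fires, inject a message telling the agent the repeated call will not yield new information, or abort and escalate; silently allowing the repeat just burns budget.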
Agentic AI Checklist
- Agent pattern selected (ReAct, Plan-Execute, Multi-Agent)
- Tools designed with clear descriptions, validation, and error handling
- Framework selected or custom implementation justified
- Cost controls: token budget, iteration limit, time limit
- Human-in-the-loop configured for sensitive actions
- Observability: full trace logging for reasoning and tool calls
- Evaluation framework: completion rate, efficiency, cost tracking
- Error recovery: agent handles tool failures gracefully
- Security: tools respect user permissions, no privilege escalation
- Testing: adversarial tasks, edge cases, loop detection
- Monitoring: production dashboards for completion, cost, latency
- Documentation: agent capabilities, limitations, escalation paths
:::note[Source]
This guide is derived from operational intelligence at Garnet Grid Consulting. For agentic AI consulting, visit garnetgrid.com.
:::