ReAct (Reason + Act)
The foundational agent loop. If you only know one pattern, know this one.
ReAct Agent Loop Architecture
ReAct loop implementation
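import json

def run_react(question, llm, tool_router, max_steps=8):
    """Run a ReAct loop until the model answers or max_steps is reached."""
    history = [
        {"role": "system", "content": "Respond with JSON: "
            "{type: 'thought', text} OR "
            "{type: 'action', name, arguments} OR "
            "{type: 'answer', text}."},
        {"role": "user", "content": question},
    ]
    for step in range(max_steps):
        raw = llm(history)
        history.append({"role": "assistant", "content": raw})
        try:
            msg = json.loads(raw)
        except json.JSONDecodeError:
            # Malformed output: report it as an observation and let the model retry.
            history.append({
                "role": "user",
                "content": "observation: invalid JSON, respond with a single JSON object"
            })
            continue
        if msg.get("type") == "answer":
            return msg["text"]
        elif msg.get("type") == "action":
            # Run the tool and feed its result back as the next observation.
            result = tool_router.dispatch(msg)
            history.append({
                "role": "user",
                "content": f"observation: {json.dumps(result)}"
            })
        # "thought" -> just continue loop
    return "max steps exhausted"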
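The loop calls `tool_router.dispatch(msg)` but the router itself is not shown. The following is a minimal sketch of what such a router might look like, assuming tools are plain Python callables registered by name and that tool errors are returned as data rather than raised. `ToolRouter`, `register`, and the stubbed `get_account` are hypothetical names used for illustration only.

```python
import re


class ToolRouter:
    """Hypothetical dispatcher: maps action names to plain Python callables."""

    def __init__(self):
        self._tools = {}

    def register(self, name, fn):
        self._tools[name] = fn

    def dispatch(self, msg):
        """Run the tool named in msg; return errors as data so the model can recover."""
        name = msg.get("name")
        args = msg.get("arguments") or {}
        if name not in self._tools:
            return {"error": f"Unknown tool '{name}'. Available: {sorted(self._tools)}"}
        try:
            return self._tools[name](**args)
        except (TypeError, ValueError) as exc:
            # Bad or missing arguments become an observation, not a crash.
            return {"error": str(exc)}


def get_account(account_id):
    # Stub tool (assumption): validates the ID format before doing any lookup.
    if not re.fullmatch(r"ACC-\d{4}", account_id):
        raise ValueError("Invalid ID format. Expected ACC-XXXX")
    return {"name": "Acme Corp", "arr": 450000, "status": "active"}


router = ToolRouter()
router.register("get_account", get_account)

ok = router.dispatch({"type": "action", "name": "get_account",
                      "arguments": {"account_id": "ACC-1234"}})
bad = router.dispatch({"type": "action", "name": "get_account",
                       "arguments": {"account_id": "1234"}})
unknown = router.dispatch({"type": "action", "name": "delete_everything"})
```

Returning `{"error": ...}` instead of raising keeps the loop alive: the error text goes back to the model as an observation, and the model can correct its next call.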
ReAct worked example: multi-step customer query
USER QUERY: "What's the total spend for Acme Corp this quarter, and create a ticket if it dropped?"
Step 1: THOUGHT -> "I need to look up Acme Corp's account first"
Step 2: ACTION -> get_account(account_id="ACC-1234")
Step 3: OBSERVE -> {name: "Acme Corp", arr: 450000, status: "active"}
Step 4: THOUGHT -> "Got the account. Now I need quarterly spend data"
Step 5: ACTION -> get_quarterly_spend(account_id="ACC-1234", quarter="Q2-2026")
Step 6: OBSERVE -> {current: 98000, previous: 142000, change: -31%}
Step 7: THOUGHT -> "Spend dropped 31%. That's significant. I should create a ticket"
Step 8: ACTION -> create_ticket(account="ACC-1234", summary="Q2 spend drop 31%", priority="high")
Step 9: OBSERVE -> {ticket_id: "SUP-5678", url: "https://jira.company.com/SUP-5678"}
Step 10: ANSWER -> "Acme Corp's Q2 spend is $98K, down 31% from Q1 ($142K).
I've created ticket SUP-5678 to flag this for the account team."
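To connect the trace to the loop implementation, here is a sketch of the same run written as the JSON messages the system prompt asks for. The message contents are copied from the trace. The mapping onto `name` and `arguments` keys is an assumption based on the prompt format. Note that the trace numbers observations as steps, whereas the loop counts only model calls.

```python
import json

# Model outputs from the trace, one dict per LLM call.
scripted = [
    {"type": "thought", "text": "I need to look up Acme Corp's account first"},
    {"type": "action", "name": "get_account",
     "arguments": {"account_id": "ACC-1234"}},
    {"type": "thought", "text": "Got the account. Now I need quarterly spend data"},
    {"type": "action", "name": "get_quarterly_spend",
     "arguments": {"account_id": "ACC-1234", "quarter": "Q2-2026"}},
    {"type": "thought",
     "text": "Spend dropped 31%. That's significant. I should create a ticket"},
    {"type": "action", "name": "create_ticket",
     "arguments": {"account": "ACC-1234", "summary": "Q2 spend drop 31%",
                   "priority": "high"}},
    {"type": "answer",
     "text": "Acme Corp's Q2 spend is $98K, down 31% from Q1 ($142K). "
             "I've created ticket SUP-5678 to flag this for the account team."},
]

# Each message must survive the json.loads call that the loop performs.
round_tripped = [json.loads(json.dumps(m)) for m in scripted]

model_calls = len(scripted)                                        # 7
observations = sum(1 for m in scripted if m["type"] == "action")   # 3

# The 31% figure in the trace: (98000 - 142000) / 142000, as a whole percent.
pct_change = round((98_000 - 142_000) / 142_000 * 100)             # -31
```

The ten numbered lines in the trace are therefore 7 model calls plus 3 observations, so this query fits within `max_steps=8` with one call to spare.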
ReAct failure modes and mitigations
| FAILURE MODE | WHAT HAPPENS | MITIGATION |
|---|---|---|
| Infinite loop | Model keeps calling the same tool with the same args, or alternates between two tools | Set max_steps (8-15 for most tasks). Detect repeated tool calls: if the same tool and args are called twice, break out of the loop |
| Drift | Model forgets the original question as context fills with observations | Re-inject the original question every 3-5 steps: "Remember, the user asked: {original_question}" |
| Runaway cost | Each step is one API call. At $0.03 per call, 15 steps cost $0.45 per query; at 10K queries/day that is $4,500/day | Set per-query token budgets and step limits. Monitor p99 step counts. Alert on queries exceeding 10 steps |
| Wrong tool order | Model tries to create a ticket before looking up the account | Add dependency hints in system prompt: "Always look up the account before taking any action on it" |
| Hallucinated tool args | Model invents account IDs or user IDs not provided by the user | Input validation in dispatcher. Return clear errors: "Invalid ID format. Expected ACC-XXXX" |
| Premature answer | Model answers after 1 step when it should do more research | System prompt: "Before answering, verify you have all necessary data. If uncertain, use another tool" |
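Two of the mitigations in the table, repeated-call detection and re-injecting the original question, can be sketched as a small helper that the loop consults on each step. This is an illustrative sketch, not a reference implementation: `LoopGuard`, `call_signature`, and the `remind_every=4` default (chosen from the 3-5 step range above) are hypothetical.

```python
import json


def call_signature(msg):
    """Stable key for an action: tool name plus canonically serialized arguments."""
    return (msg["name"], json.dumps(msg.get("arguments", {}), sort_keys=True))


class LoopGuard:
    """Hypothetical helper tracking repeats and reminder cadence for one query."""

    def __init__(self, question, remind_every=4):
        self.question = question
        self.remind_every = remind_every
        self.seen = set()

    def is_repeat(self, msg):
        """True if this exact tool+args pair was already called in this query."""
        sig = call_signature(msg)
        if sig in self.seen:
            return True
        self.seen.add(sig)
        return False

    def reminder(self, step):
        """Return a reminder message every `remind_every` steps, else None."""
        if step > 0 and step % self.remind_every == 0:
            return {"role": "user",
                    "content": f"Remember, the user asked: {self.question}"}
        return None


guard = LoopGuard("What's the total spend for Acme Corp this quarter?")
call = {"type": "action", "name": "get_account",
        "arguments": {"account_id": "ACC-1234"}}
first = guard.is_repeat(call)    # False: first time this call is seen
second = guard.is_repeat(call)   # True: identical tool and args
nudge = guard.reminder(4)        # reminder message at step 4
```

Because `seen` covers the whole query, an A, B, A, B alternation is caught on the third call as well. The trade-off is that a legitimate repeat, such as polling a job status, is also flagged, so a production version might allow a small repeat count per signature instead of zero.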
PRODUCTION COST REALITY: ReAct loops are the #1 source of unexpected LLM costs. Every step is a full API call that resends the entire conversation history: step 10 sends all of steps 1-9 again. Per-call input size therefore grows roughly linearly with the step number, and cumulative input tokens over a run grow roughly quadratically. For high-volume systems, implement: (1) step limits, (2) per-user token budgets, (3) intermediate result caching, (4) alerts on p99 step counts.
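The quadratic growth can be checked with a small calculation. The sketch below assumes a fixed-size base prompt and a fixed number of tokens added to the history per step. Both numbers (500 and 300) are made up for illustration and are not measurements.

```python
def cumulative_input_tokens(steps, base_tokens, tokens_per_step):
    """Total input tokens across a run where every call resends the full history."""
    # Call k (0-indexed) sends the base prompt plus everything the k earlier steps added.
    return sum(base_tokens + k * tokens_per_step for k in range(steps))


def closed_form(steps, base_tokens, tokens_per_step):
    """Same total: n*b + t*n*(n-1)/2, which is quadratic in the step count n."""
    return steps * base_tokens + tokens_per_step * steps * (steps - 1) // 2


BASE = 500       # assumed system prompt + question size, in tokens
PER_STEP = 300   # assumed tokens added to the history per step

totals = {n: cumulative_input_tokens(n, BASE, PER_STEP) for n in (5, 10, 15)}
# totals == {5: 5500, 10: 18500, 15: 39000}
```

Under these assumptions, doubling the run from 5 to 10 steps more than triples the input tokens (5,500 to 18,500), which is why step limits and alerts on p99 step counts have an outsized effect on cost.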