LangGraph vs Vanilla Agents: When the Graph Earns Its Keep
After shipping the Saudi hospital voice agent on LangGraph and a half-dozen vanilla tool-calling bots, here's when the graph abstraction is worth its weight — and when it just slows you down.
Most tool-calling agents I've shipped do not need LangGraph. A FastAPI endpoint, an OpenAI client, a tools list, and a while loop will take you surprisingly far. The moment you reach for a graph is the moment the conversation stops being linear.
The Saudi hospital calling agent is the cleanest example I have. A patient phone call has at least five interleaved sub-flows — identity verification, booking, cancellation, rescheduling, doctor-availability lookup — and the caller can jump between them mid-sentence. With a vanilla while-loop you end up reconstructing state from the message history on every turn, which is fragile the moment the model hallucinates a slot value or the caller corrects themselves.
LangGraph pays off here because the state object is explicit. Each node owns one concern, transitions are typed, and the conversation history becomes incidental rather than load-bearing. When the production model occasionally fumbles a date, the graph still knows we are inside the 'booking' subgraph and can re-prompt deterministically instead of asking the LLM to recover.
The cost is real though. Debugging a graph is harder than debugging a single function. Tracing a misrouted edge takes more cognitive load than reading a linear handler. And for the multi-tenant RAG chatbot SaaS, where 90% of conversations are single-turn Q&A with the occasional escalation tool call, a graph would be over-engineering.
My rule of thumb after a year of shipping both: if your agent has more than three persistent state slots that survive across turns, or more than two sub-flows the user can jump between, use LangGraph. Otherwise, a clean tools loop is faster to build, faster to debug, and easier to hand off.