Run the same multi-agent system two ways: each agent as its own API orchestrated by LangGraph, vs one LangGraph monolith in a single codebase. Compare latency, deploys, scaling and failure isolation.
This lab answers one question: where does each agent run? Either all agents live in one LangGraph codebase, or each agent runs as its own service. How the agents communicate (synchronous calls vs event-driven) is a separate, independent choice (see Backpressure). A monolith LangGraph can still run asynchronously; the two dials are explained in Topology vs Communication.
LangGraph Application (one deployable, in-process function calls)
🧩 langgraph · StateGraph(nodes)
📥 Intake → 🔎 Retrieval → ✍️ Answer → 🧪 Evaluation → 🔔 Notify
Orchestration overhead: ~5 ms (in-process calls)
Redeploy blast radius: whole graph
Scaling unit: whole graph
Failure isolation: shared fate (one crash kills the graph)
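The monolith view above amounts to five functions sharing one state object inside a single process. A minimal plain-Python sketch of that shape follows. It deliberately uses no LangGraph API; the node bodies, the state keys, and the `run_graph` helper are hypothetical stand-ins for illustration only.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Node = Callable[[State], State]


def intake(state: State) -> State:
    # Normalize the raw user input.
    return {**state, "question": str(state["raw"]).strip()}


def retrieval(state: State) -> State:
    # Stand-in for a real search step; returns one fake document.
    return {**state, "docs": [f"doc about {state['question']}"]}


def answer(state: State) -> State:
    # Stand-in for an LLM call that drafts an answer from the documents.
    return {**state, "answer": f"Answer drawn from {len(state['docs'])} document(s)"}


def evaluation(state: State) -> State:
    # Trivial quality gate: pass if an answer was produced at all.
    return {**state, "passed": bool(state["answer"])}


def notify(state: State) -> State:
    # Only notify when the evaluation passed.
    return {**state, "notified": state["passed"]}


PIPELINE: List[Node] = [intake, retrieval, answer, evaluation, notify]


def run_graph(state: State) -> State:
    # Every hop is an ordinary function call in the same process.
    # An unhandled exception in any node aborts the whole run (shared fate).
    for node in PIPELINE:
        state = node(state)
    return state


result = run_graph({"raw": "  reset my password  "})
```

Because all five nodes live in one process and one codebase, changing any node means shipping the whole pipeline again, which is the "whole graph" blast radius listed above.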
| Dimension | Monolith LangGraph | Microservice Agents |
| --- | --- | --- |
| Agent-to-agent calls | In-process (~1ms) | Network API (~45ms) |
| Codebase | One repo, one graph | One repo per agent |
| Deploy a single agent | Redeploys everything | Deploy that agent only |
| Scale a hot agent | Clone the whole graph | Add replicas of that agent |
| A crashing agent | Takes down the graph | Isolated, others keep serving |
| Polyglot / per-agent model | Hard (shared runtime) | Easy (independent stacks) |
| Ops & infra effort | Low | Higher (mesh, discovery, tracing) |
| Best when | Small team, early product | Independent scale & team autonomy |
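To make the latency row concrete, here is a rough back-of-the-envelope calculation. It assumes one sequential call per agent across the five agents and takes the ~1 ms and ~45 ms per-hop figures from the comparison above at face value; real numbers depend on network, payload size, and serialization.

```python
AGENTS = ["Intake", "Retrieval", "Answer", "Evaluation", "Notify"]

IN_PROCESS_MS = 1   # per-hop cost quoted for the monolith
NETWORK_MS = 45     # per-hop cost quoted for microservice agents


def orchestration_overhead_ms(per_hop_ms: int, hops: int = len(AGENTS)) -> int:
    # Assumption: hops run one after another, so their costs simply add up.
    return per_hop_ms * hops


monolith_ms = orchestration_overhead_ms(IN_PROCESS_MS)    # 5 hops x 1 ms
microservice_ms = orchestration_overhead_ms(NETWORK_MS)   # 5 hops x 45 ms
extra_ms = microservice_ms - monolith_ms

print(monolith_ms, microservice_ms, extra_ms)  # 5 225 220
```

Under these assumptions the network topology adds roughly 220 ms of orchestration overhead per request, which is the price paid for independent deploys, scaling, and failure isolation.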
MCP gateway — the control boundary
Splitting into services isn't enough — the agent still needs a boundary in front of the tools. An MCP gateway is a service with an LLM-oriented interface that decides what the agent is allowed to do. Try a tool call with the gateway off, then on.
🤖 Agent → (no boundary) → 🏢 ERP / systems
allowlist per agent
input schema check
rate limits
tenant isolation
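The four checks listed above can be sketched as a single decision function that sits between the agent and the tools. This is an illustrative plain-Python model, not the MCP protocol or any particular gateway product: the `ToolCall` and `Gateway` classes, the tool names, the `tenant_id` argument convention, and the limit of two calls per minute are all assumptions made for the example.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Set, Tuple


@dataclass
class ToolCall:
    agent: str            # which agent is asking
    tenant: str           # tenant the agent is acting for
    tool: str             # hypothetical tool name, e.g. "erp.read_invoice"
    args: Dict[str, Any]  # arguments proposed by the agent


@dataclass
class Gateway:
    allowlist: Dict[str, Set[str]]       # agent -> tools it may call
    schemas: Dict[str, Dict[str, type]]  # tool -> {argument name: expected type}
    max_calls_per_minute: int = 2
    _calls: Dict[str, List[float]] = field(default_factory=dict)

    def check(self, call: ToolCall, now: Optional[float] = None) -> Tuple[bool, str]:
        now = time.monotonic() if now is None else now

        # 1. Allowlist per agent.
        if call.tool not in self.allowlist.get(call.agent, set()):
            return False, "tool not in allowlist"

        # 2. Input schema check: exact argument names and expected types.
        schema = self.schemas.get(call.tool, {})
        if set(call.args) != set(schema) or any(
            not isinstance(call.args[name], kind) for name, kind in schema.items()
        ):
            return False, "input schema check failed"

        # 3. Tenant isolation: a call may only touch its own tenant's data.
        if call.args.get("tenant_id") != call.tenant:
            return False, "tenant isolation violation"

        # 4. Rate limit: sliding 60-second window per agent.
        recent = [t for t in self._calls.get(call.agent, []) if now - t < 60.0]
        if len(recent) >= self.max_calls_per_minute:
            self._calls[call.agent] = recent
            return False, "rate limit exceeded"
        self._calls[call.agent] = recent + [now]
        return True, "allowed"


gateway = Gateway(
    allowlist={"answer-agent": {"erp.read_invoice"}},
    schemas={"erp.read_invoice": {"tenant_id": str, "invoice_id": str}},
)
read = ToolCall("answer-agent", "acme", "erp.read_invoice",
                {"tenant_id": "acme", "invoice_id": "INV-1"})
delete = ToolCall("answer-agent", "acme", "erp.delete_invoice",
                  {"tenant_id": "acme", "invoice_id": "INV-1"})

print(gateway.check(read, now=0.0))    # (True, 'allowed')
print(gateway.check(delete, now=0.0))  # (False, 'tool not in allowlist')
```

The rate limit is evaluated last so that rejected calls do not consume quota; a real gateway might reasonably choose the opposite to throttle abusive callers.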
What just happened
▹Monolith LangGraph: every agent is a node in one codebase, calls are in-process (~1ms), and the whole graph deploys and scales as a single unit — simple, fast, but coarse.
▹Microservice agents: each agent is its own API the orchestrator calls over the network (~45ms/hop). You pay latency + ops complexity, and gain independent deploys, per-agent scaling and failure isolation.
▹Same five agents, two architectures. Ship a new Answer model: the monolith redeploys all 5; microservices touch only 1. Scale Retrieval: the monolith clones the whole graph; microservices add Retrieval replicas alone.
▹The MCP / tool gateway is a control boundary, not just a connector: it decides which tools this agent may call, validates the input schema, holds write actions for approval, and enforces rate limits and tenant isolation. Never let the agent call enterprise systems directly.
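The deploy and scaling claims in the bullets above reduce to counting agents. A small sketch, assuming the five agents from this lab, one instance of each at baseline, and a hypothetical target of three Retrieval instances:

```python
from typing import List

AGENTS = ["Intake", "Retrieval", "Answer", "Evaluation", "Notify"]


def redeployed_agents(architecture: str, changed: str) -> List[str]:
    # Monolith: one deployable, so any change ships every agent again.
    # Microservices: only the changed agent's service is redeployed.
    if changed not in AGENTS:
        raise ValueError(f"unknown agent: {changed}")
    return list(AGENTS) if architecture == "monolith" else [changed]


def total_instances(architecture: str, hot_agent: str, replicas: int) -> int:
    # Count running agent instances once `hot_agent` needs `replicas` copies.
    if hot_agent not in AGENTS:
        raise ValueError(f"unknown agent: {hot_agent}")
    if architecture == "monolith":
        # The scaling unit is the whole graph: each replica carries all agents.
        return replicas * len(AGENTS)
    # Only the hot agent is replicated; the others stay at one instance.
    return (len(AGENTS) - 1) + replicas


print(len(redeployed_agents("monolith", "Answer")))       # 5
print(redeployed_agents("microservices", "Answer"))       # ['Answer']
print(total_instances("monolith", "Retrieval", 3))        # 15
print(total_instances("microservices", "Retrieval", 3))   # 7
```

Under these assumptions, tripling Retrieval capacity costs 15 running agent instances in the monolith versus 7 with microservice agents; the gap widens as more agents or replicas are added.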