Lab 14

Agentic AI

Agent Architecture — Microservices vs Monolith

Run the same multi-agent system two ways: each agent as its own API orchestrated by LangGraph, vs one LangGraph monolith in a single codebase. Compare latency, deploys, scaling and failure isolation.

This lab answers one question: where does each agent run? Either all agents live in one LangGraph codebase, or each agent runs as its own service. How the agents communicate (synchronous calls vs event-driven) is a separate, independent choice; see Backpressure. A monolith LangGraph can still run asynchronously; the two dials are explained in Topology vs Communication.

LangGraph Application: one deployable, in-process function calls

🧩 langgraph · StateGraph(nodes)

📥 Intake → 🔎 Retrieval → ✍️ Answer → 🧪 Evaluation → 🔔 Notify

Orchestration overhead: ~5 ms (in-process calls)
Redeploy blast radius: whole graph
Scaling unit: whole graph
Failure isolation: shared fate (one crash kills the graph)
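To make "in-process calls" and "shared fate" concrete, here is a minimal sketch of the monolith shape. It is a plain-Python stand-in for the graph, not the LangGraph API: the node functions, the state keys and the run_graph helper are all hypothetical names chosen for illustration.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Node = Callable[[State], State]

# Hypothetical agent nodes: each one reads the shared state and returns an updated copy.
def intake(state: State) -> State:
    return {**state, "query": str(state["raw"]).strip()}

def retrieval(state: State) -> State:
    return {**state, "docs": [f"doc about {state['query']}"]}

def answer(state: State) -> State:
    return {**state, "answer": f"Answer to '{state['query']}' from {len(state['docs'])} doc(s)"}

def evaluation(state: State) -> State:
    return {**state, "passed": len(state["answer"]) > 0}

def notify(state: State) -> State:
    return {**state, "notified": state["passed"]}

PIPELINE: List[Node] = [intake, retrieval, answer, evaluation, notify]

def run_graph(raw: str, nodes: List[Node] = PIPELINE) -> State:
    """Run every node in one process; each hop is an ordinary function call."""
    state: State = {"raw": raw}
    for node in nodes:
        state = node(state)  # no network hop, no serialization
    return state

def crashing_retrieval(state: State) -> State:
    raise RuntimeError("retrieval crashed")

result = run_graph("  refund policy  ")
print(result["answer"], "| notified:", result["notified"])

# Shared fate: one failing node aborts the whole run.
try:
    run_graph("refund policy", [intake, crashing_retrieval, answer, evaluation, notify])
except RuntimeError as exc:
    print("whole graph failed:", exc)
```

Because every node lives in the same process, an unhandled exception in one node ends the run for all of them, which is the shared-fate behavior listed above.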

Dimension | Monolith LangGraph | Microservice Agents
Agent-to-agent calls | In-process (~1ms) | Network API (~45ms)
Codebase | One repo, one graph | One repo per agent
Deploy a single agent | Redeploys everything | Deploy that agent only
Scale a hot agent | Clone the whole graph | Add replicas of that agent
A crashing agent | Takes down the graph | Isolated, others keep serving
Polyglot / per-agent model | Hard (shared runtime) | Easy (independent stacks)
Ops & infra effort | Low | Higher (mesh, discovery, tracing)
Best when | Small team, early product | Independent scale & team autonomy
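The per-call figures in the comparison add up across the pipeline. The sketch below does that arithmetic under two assumptions that are not stated in the lab: each of the five agents is called exactly once per request, and calls are sequential. The Topology class and function name are hypothetical, and model inference time is deliberately left out.

```python
from dataclasses import dataclass
from typing import List

AGENTS: List[str] = ["Intake", "Retrieval", "Answer", "Evaluation", "Notify"]

@dataclass(frozen=True)
class Topology:
    name: str
    per_call_ms: float  # approximate cost of one agent-to-agent call

def orchestration_overhead_ms(topology: Topology, agents: List[str] = AGENTS) -> float:
    """Overhead of calling each agent once, one after another."""
    return topology.per_call_ms * len(agents)

MONOLITH = Topology("Monolith LangGraph", per_call_ms=1.0)        # in-process
MICROSERVICES = Topology("Microservice Agents", per_call_ms=45.0)  # network API

for topology in (MONOLITH, MICROSERVICES):
    print(f"{topology.name}: ~{orchestration_overhead_ms(topology):.0f} ms per request")
```

Under these assumptions the monolith spends about 5 ms on orchestration and the microservice version about 225 ms. Whether that gap matters depends on how long the agents themselves take to run.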

MCP gateway — the control boundary

Splitting into services isn't enough — the agent still needs a boundary in front of the tools. An MCP gateway is a service with an LLM-oriented interface that decides what the agent is allowed to do. Try a tool call with the gateway off, then on.

🤖 Agent → (no boundary) → 🏢 ERP / systems


Gateway checks: allowlist per agent, input schema check, rate limits, tenant isolation
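The gateway checks can be sketched as a single decision function that sits between the agent and the tools. This is an illustrative sketch only, not an MCP implementation: the Gateway and ToolSpec classes, the tool names (erp.read_invoice, erp.create_payment), the tenant names and the limit of two calls per minute are all assumptions made for the example. It also holds write actions for approval, which this lab lists as one of the gateway's jobs.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Set

@dataclass
class ToolSpec:
    required: Dict[str, type]  # argument name -> expected Python type
    is_write: bool = False     # write actions are held for approval

@dataclass
class Gateway:
    tools: Dict[str, ToolSpec]
    allowlist: Dict[str, Set[str]]  # agent name -> tools it may call
    max_calls_per_minute: int = 2   # illustrative limit, per agent
    _calls: Dict[str, List[float]] = field(default_factory=dict)

    def check(self, agent: str, tenant: str, tool: str,
              args: Dict[str, Any], now: Optional[float] = None) -> str:
        now = time.monotonic() if now is None else now
        spec = self.tools.get(tool)
        # 1. Allowlist per agent (unknown tools are denied too).
        if spec is None or tool not in self.allowlist.get(agent, set()):
            return "denied: tool not on this agent's allowlist"
        # 2. Input schema check: every required argument present with the right type.
        for name, expected in spec.required.items():
            if not isinstance(args.get(name), expected):
                return f"denied: argument '{name}' must be {expected.__name__}"
        # 3. Tenant isolation: the call may only touch the caller's own tenant.
        if args.get("tenant") != tenant:
            return "denied: cross-tenant access"
        # 4. Rate limit over a sliding 60-second window.
        recent = [t for t in self._calls.get(agent, []) if now - t < 60.0]
        if len(recent) >= self.max_calls_per_minute:
            return "denied: rate limit exceeded"
        self._calls[agent] = recent + [now]
        return "held: write action awaits approval" if spec.is_write else "allowed"

def make_gateway() -> Gateway:
    """Build a gateway with two hypothetical ERP tools."""
    return Gateway(
        tools={
            "erp.read_invoice": ToolSpec({"tenant": str, "invoice_id": str}),
            "erp.create_payment": ToolSpec(
                {"tenant": str, "invoice_id": str, "amount": float}, is_write=True),
        },
        allowlist={"answer": {"erp.read_invoice"}, "notify": {"erp.create_payment"}},
    )

gateway = make_gateway()
payment = {"tenant": "acme", "invoice_id": "INV-1", "amount": 10.0}
print(gateway.check("answer", "acme", "erp.read_invoice",
                    {"tenant": "acme", "invoice_id": "INV-1"}))
print(gateway.check("answer", "acme", "erp.create_payment", payment))
print(gateway.check("notify", "acme", "erp.create_payment", payment))
```

With the gateway off, all three calls would reach the ERP directly. With it on, the first is allowed, the second is denied by the allowlist, and the third is held for approval.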

What just happened

▹Monolith LangGraph: every agent is a node in one codebase, calls are in-process (~1ms), and the whole graph deploys and scales as a single unit — simple, fast, but coarse.
▹Microservice agents: each agent is its own API the orchestrator calls over the network (~45ms/hop). You pay latency + ops complexity, and gain independent deploys, per-agent scaling and failure isolation.
▹Same five agents, two architectures. Ship a new Answer model: the monolith redeploys all 5; microservices touch only 1. Scale Retrieval: the monolith clones the whole graph; microservices add Retrieval replicas alone.
▹The MCP / tool gateway is a control boundary, not just a connector: it decides which tools this agent may call, validates the input schema, holds write actions for approval, and enforces rate limits and tenant isolation. Never let the agent call enterprise systems directly.
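The deploy and scaling comparison above can be worked through as a small calculation. This is a sketch under assumptions that are not in the lab: every agent starts with one running instance, and Retrieval is scaled to three replicas (an arbitrary number picked for the example). The function names are hypothetical.

```python
from typing import Dict, List

AGENTS: List[str] = ["Intake", "Retrieval", "Answer", "Evaluation", "Notify"]

def redeployed_agents(topology: str, changed_agent: str) -> List[str]:
    """Which agents are redeployed when one agent changes."""
    if topology == "monolith":
        return list(AGENTS)  # the whole graph ships as one unit
    return [changed_agent]   # only the changed service ships

def running_instances(topology: str, hot_agent: str, replicas: int) -> Dict[str, int]:
    """Instances per agent after scaling one hot agent to `replicas` copies."""
    if topology == "monolith":
        return {agent: replicas for agent in AGENTS}  # clone the whole graph
    return {agent: (replicas if agent == hot_agent else 1) for agent in AGENTS}

for topology in ("monolith", "microservices"):
    shipped = redeployed_agents(topology, "Answer")
    instances = running_instances(topology, "Retrieval", replicas=3)
    print(f"{topology}: redeploys {len(shipped)} agent(s), "
          f"runs {sum(instances.values())} agent instance(s) after scaling Retrieval")
```

Under these assumptions the monolith redeploys 5 agents and runs 15 agent instances, while the microservice version redeploys 1 and runs 7.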