Runtime Substrate for Multi-Agent Fleets.

Replayable, federated context for multi-agent runtimes. Ezra manages what each agent in the fleet sees, believes, and remembers, with complete provenance.

curl -fsSL https://ezra128.vercel.app/install.sh | bash
<40 ms · Per-Agent Overhead
N agents · Dynamic Fleet Scaling
3-tier · Memory Topology
Replay · Branching Belief History

Three steps.
Running in minutes.

ezra / quickstart
1. Install
curl -fsSL https://ezra128.vercel.app/install.sh | bash
2. Spawn an agent
# await needs an async context, e.g. a coroutine run with asyncio.run()
from ezra_core.runtime import Ezra
ezra = Ezra.from_env()
graph = await ezra.create_session_graph(session_graph_id="race-weekend")
svc = await ezra.spawn_agent(graph, agent_id="strategist", permission_scope=["tyres"])
3. Run a turn
result = await svc.complete("What tyre for the final stint?")
print(result.response)
Full Setup Guide — Google Cloud ADK

Six things one
platform handles.

Most tools solve federation or memory — for a single agent. Ezra does both, for any number of agents, as one runtime.

// Federation

Pushdown Data Access

Connect to MongoDB, Snowflake, BigQuery, REST. The database does the math — the model gets typed summaries with provenance, never raw dumps.
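A minimal sketch of the pushdown idea, assuming an in-memory SQLite database as a stand-in for the real sources. The `Summary` type, the `laps` table, and `average_lap_time` are hypothetical names for illustration, not part of Ezra's API. The aggregate runs inside the database, and only a typed summary with provenance fields is returned.

```python
import sqlite3
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Summary:
    """Typed result handed to the model instead of raw rows (hypothetical shape)."""
    metric: str
    value: float
    row_count: int
    source: str      # provenance: where the number came from
    query: str       # provenance: the statement that produced it
    fetched_at: str  # provenance: when it was computed


def average_lap_time(conn: sqlite3.Connection, compound: str) -> Summary:
    # The aggregate runs inside the database; a single row comes back.
    sql = "SELECT AVG(lap_seconds), COUNT(*) FROM laps WHERE compound = ?"
    avg, count = conn.execute(sql, (compound,)).fetchone()
    return Summary(
        metric=f"avg_lap_seconds[{compound}]",
        value=round(avg, 3),
        row_count=count,
        source="sqlite:laps",
        query=sql,
        fetched_at=datetime.now(timezone.utc).isoformat(),
    )


# Stand-in data: three laps on two compounds.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE laps (compound TEXT, lap_seconds REAL)")
conn.executemany(
    "INSERT INTO laps VALUES (?, ?)",
    [("soft", 91.2), ("soft", 91.8), ("medium", 92.5)],
)
summary = average_lap_time(conn, "soft")
print(summary.metric, summary.value, summary.row_count)
```

The model would receive `summary`, never the rows in `laps`.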

// Memory

Three-Tier Recall

Hot, warm, cold. Automatic eviction, compaction, and salience-ranked hydration. Each agent gets the most relevant slice of what the fleet has ever known.

// Beliefs

Four-Strategy Reconciliation

When agents disagree, four strategies resolve it before the model sees it: last-write, highest-trust, manual escalation, or your own custom resolver.
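The four strategies can be sketched as interchangeable resolver functions. This is an illustration under assumptions: the `Belief` fields, the `trust` score, and the `reconcile` signature are hypothetical and do not describe Ezra's actual interface.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Belief:
    agent_id: str
    claim: str
    trust: float       # hypothetical per-agent trust score
    written_at: float  # seconds since epoch


class NeedsHuman(Exception):
    """Raised when a conflict is escalated for manual review."""


Resolver = Callable[[list[Belief]], Belief]


def last_write(conflict: list[Belief]) -> Belief:
    return max(conflict, key=lambda b: b.written_at)


def highest_trust(conflict: list[Belief]) -> Belief:
    return max(conflict, key=lambda b: b.trust)


def manual_escalation(conflict: list[Belief]) -> Belief:
    raise NeedsHuman(f"{len(conflict)} conflicting beliefs need review")


STRATEGIES: dict[str, Resolver] = {
    "last-write": last_write,
    "highest-trust": highest_trust,
    "manual": manual_escalation,
}


def reconcile(
    conflict: list[Belief],
    strategy: str,
    custom: Optional[Resolver] = None,
) -> Belief:
    """Pick one belief from a conflicting set before any model call."""
    if strategy == "custom":
        if custom is None:
            raise ValueError("strategy 'custom' needs a resolver")
        return custom(conflict)
    return STRATEGIES[strategy](conflict)


a = Belief("strategist", "soft", trust=0.6, written_at=200.0)
b = Belief("tyre_eng", "medium", trust=0.9, written_at=100.0)
print(reconcile([a, b], "last-write").claim)
print(reconcile([a, b], "highest-trust").claim)
```

A custom resolver is any function with the same shape, for example one that prefers the belief from a named agent.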

// Replay

Branching Replay

Reconstruct what any agent knew at any prior moment. Mutate state. Run forward. Diff against reality. The compliance and counterfactual artifact.
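One way to picture this is an append-only event log: reconstruct by folding events up to a timestamp, branch by forking the log and appending a mutation, then diff the two resulting states. The sketch below assumes that model. `Event`, `state_at`, `branch`, and `diff` are hypothetical names, and re-running a model on the branch is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ts: int
    agent_id: str
    key: str
    value: str


def state_at(log: list[Event], agent_id: str, ts: int) -> dict[str, str]:
    """Fold the agent's events up to and including ts into a state dict."""
    state: dict[str, str] = {}
    for e in sorted(log, key=lambda e: e.ts):
        if e.agent_id == agent_id and e.ts <= ts:
            state[e.key] = e.value
    return state


def branch(log: list[Event], ts: int, mutation: Event) -> list[Event]:
    """Fork history at ts: keep events up to ts, then apply the mutation."""
    return [e for e in log if e.ts <= ts] + [mutation]


def diff(a: dict[str, str], b: dict[str, str]) -> dict[str, tuple]:
    """Keys whose values differ between two states, as (a, b) pairs."""
    return {
        k: (a.get(k), b.get(k))
        for k in sorted(a.keys() | b.keys())
        if a.get(k) != b.get(k)
    }


log = [
    Event(1, "strategist", "tyre", "medium"),
    Event(2, "strategist", "gap", "4.1s"),
    Event(3, "strategist", "tyre", "soft"),
]
past = state_at(log, "strategist", 2)          # what the agent knew at ts=2
forked = branch(log, 2, Event(3, "strategist", "tyre", "hard"))
delta = diff(state_at(log, "strategist", 3), state_at(forked, "strategist", 3))
print(past)
print(delta)
```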

// Security

Per-Agent Scopes

Every agent sees only what its permission scope allows. Enforced at the platform layer on every memory query and data fetch — not in the prompt.
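A minimal sketch of enforcement at the store rather than in the prompt, assuming each record carries a scope label. `Record` and `ScopedStore` are hypothetical. Only the `permission_scope` parameter name comes from the quickstart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    scope: str
    text: str


class ScopedStore:
    """Every query is filtered by the caller's scopes before matching."""

    def __init__(self, records: list[Record]) -> None:
        self._records = list(records)

    def query(self, permission_scope: list[str], term: str) -> list[Record]:
        allowed = set(permission_scope)
        return [
            r for r in self._records
            if r.scope in allowed and term.lower() in r.text.lower()
        ]


store = ScopedStore([
    Record("tyres", "Soft degradation is 0.08s per lap"),
    Record("fuel", "Fuel degradation model updated"),
])
visible = store.query(permission_scope=["tyres"], term="degradation")
print([r.scope for r in visible])
```

Because the filter runs inside `query`, an agent scoped to `tyres` cannot reach the `fuel` record no matter what its prompt says.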

// Scale

Flat Latency at N

Adding agents doesn't slow per-agent decisions. O(1) hot tier, scope-filtered shared tiers, independent mesh fetches. Constant latency, linear throughput.

The 8-step router.
Per agent. <40ms.

Every agent call runs through eight steps in the same order. Each agent gets its own pipeline instance. The output of every step is observable, replayable, and pinned to the belief store.

// Pipeline · EZRA.RUNTIME / ROUTER · P50 target: <40 ms
01 Parse: intent, entities
02 Policy: scope check
03 Belief: reconcile
04 Hydrate: memory pull
05 Fetch: mesh decision
06 Assemble: context build
07 Call: LLM via litellm
08 Write: beliefs, trace
Most agent platforms only do steps 1, 7, and 8. Steps 2–6 are where Ezra differs — policy enforcement, belief reconciliation, scope-filtered memory, pushdown federation, and salience-ranked assembly all happen before the model is called.
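The fixed step order can be sketched as a list of functions that each take and return a context dict, with a per-step trace. Every step body here is a stub and every name is hypothetical. In particular, the call step returns a canned string instead of invoking a model.

```python
import time
from typing import Callable

Step = Callable[[dict], dict]


def parse(ctx: dict) -> dict:
    return {**ctx, "intent": "question"}


def policy(ctx: dict) -> dict:
    return {**ctx, "allowed": True}


def belief(ctx: dict) -> dict:
    return {**ctx, "beliefs": []}


def hydrate(ctx: dict) -> dict:
    return {**ctx, "memory": []}


def fetch(ctx: dict) -> dict:
    return {**ctx, "mesh": None}


def assemble(ctx: dict) -> dict:
    return {**ctx, "context": ctx["prompt"]}


def call(ctx: dict) -> dict:
    # Stand-in for the model call; no LLM is invoked in this sketch.
    return {**ctx, "response": f"stub answer to: {ctx['context']}"}


def write(ctx: dict) -> dict:
    return {**ctx, "committed": True}


PIPELINE: list[tuple[str, Step]] = [
    ("parse", parse), ("policy", policy), ("belief", belief),
    ("hydrate", hydrate), ("fetch", fetch), ("assemble", assemble),
    ("call", call), ("write", write),
]


def run_turn(agent_id: str, prompt: str) -> tuple[dict, list[tuple[str, float]]]:
    """Run the eight steps in order and record how long each one took."""
    ctx: dict = {"agent_id": agent_id, "prompt": prompt}
    trace: list[tuple[str, float]] = []
    for name, step in PIPELINE:
        start = time.perf_counter()
        ctx = step(ctx)
        trace.append((name, time.perf_counter() - start))
    return ctx, trace


ctx, trace = run_turn("strategist", "What tyre for the final stint?")
print([name for name, _ in trace])
```

Keeping each step's output in the context and its timing in the trace is what makes a turn observable step by step.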

A fleet that spawns
and shares.

One session graph binds N agents — spawned and terminated as the operation needs. Each has its own scoped view, but they share one belief store and one tiered memory.


Hot, warm,
and cold.

Memory is not flat. The runtime evicts, compacts, and hydrates across three tiers — so each agent always sees the most relevant slice, filtered by its permission scope.

Hot · Redis, per-agent hash · ~0 ms
Active turn, pinned beliefs, and current mesh result, isolated per agent in the session graph.

Warm · Qdrant, vector filtered · 5–15 ms
Compressed prior turns, topic summaries, recent tool results. Searchable by salience, evicted on decay.

Cold · MongoDB Atlas, durable · 20–50 ms
Episodic, semantic core, archival, procedural rules, and the full versioned belief history. Never evicted.
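The tier movement described above can be sketched with plain dictionaries standing in for Redis, Qdrant, and MongoDB. The decay factor, the eviction floor, and all class and method names are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Item:
    key: str
    text: str
    salience: float


@dataclass
class TieredMemory:
    hot: dict[str, Item] = field(default_factory=dict)
    warm: dict[str, Item] = field(default_factory=dict)
    cold: dict[str, Item] = field(default_factory=dict)
    decay: float = 0.5   # assumed salience multiplier per turn
    floor: float = 0.2   # assumed eviction threshold for the warm tier

    def write(self, item: Item) -> None:
        self.hot[item.key] = item
        # Durable copy; the cold tier is never evicted.
        self.cold[item.key] = Item(item.key, item.text, item.salience)

    def end_turn(self) -> None:
        # Decay warm items and evict those that fall below the floor.
        for key in list(self.warm):
            self.warm[key].salience *= self.decay
            if self.warm[key].salience < self.floor:
                del self.warm[key]
        # Move the finished turn from hot to warm.
        self.warm.update(self.hot)
        self.hot = {}

    def recall(self, key: str) -> Optional[tuple[str, Item]]:
        # Check the fastest tier first and fall through to slower ones.
        for name, tier in (("hot", self.hot), ("warm", self.warm), ("cold", self.cold)):
            if key in tier:
                return name, tier[key]
        return None


mem = TieredMemory()
mem.write(Item("stint", "Soft tyres fading after lap 12", salience=0.3))
tiers = [mem.recall("stint")[0]]
mem.end_turn()
tiers.append(mem.recall("stint")[0])
mem.end_turn()
tiers.append(mem.recall("stint")[0])
print(tiers)
```

The item starts in the hot tier, moves to warm after its turn, and is served from cold once decay evicts it from warm.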

Five things no other
platform ships.

Persistent memory is table stakes. The wedge is what sits one layer above it: provenance, audit, per-agent replay, time travel, and cross-agent reconciliation.

i. Pushdown execution
The database does the math. Typed queries go to MongoDB, Snowflake, and BigQuery; the model never touches raw rows.

ii. Versioned belief audit
Every commitment any agent made, every fact it relied on, attributed by agent_id. Queryable and exportable at any timestamp.

iii. Branching replay
Reconstruct any agent's view. Mutate state. Run forward with a new model or policy. Diff branches. The compliance and eval surface.

iv. Time-travel federated query
Run any query as of any prior timestamp. Source-native where it exists. Best-effort with explicit provenance where it does not.

v. Cross-agent reconciliation
When agents commit conflicting beliefs, four strategies resolve it: last-write, highest-trust, manual escalation, or a custom application-defined resolver.
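The versioned audit and as-of query ideas from this list can be sketched together: keep every version of a belief with its timestamp and agent_id, then answer "as of ts" with a binary search. `Version`, `BeliefHistory`, and their methods are hypothetical names, and the store is an in-memory stand-in.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Version:
    ts: int
    agent_id: str  # attribution for the audit trail
    value: str


class BeliefHistory:
    """Append-only versions per key, kept sorted by timestamp."""

    def __init__(self) -> None:
        self._versions: dict[str, list[Version]] = {}

    def commit(self, key: str, version: Version) -> None:
        versions = self._versions.setdefault(key, [])
        versions.append(version)
        versions.sort(key=lambda v: v.ts)

    def as_of(self, key: str, ts: int) -> Optional[Version]:
        """Latest version written at or before ts, or None if none existed."""
        versions = self._versions.get(key, [])
        idx = bisect_right([v.ts for v in versions], ts)
        return versions[idx - 1] if idx else None


history = BeliefHistory()
history.commit("tyre", Version(10, "strategist", "medium"))
history.commit("tyre", Version(20, "tyre_eng", "soft"))

then = history.as_of("tyre", 15)
now = history.as_of("tyre", 25)
print(then.value, then.agent_id)
print(now.value, now.agent_id)
```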

Three ways in.
Pick one.

Ezra is framework-agnostic. ADK is the primary path. The Python SDK covers LangGraph and LangChain. REST covers everything else.

// Path 01
ADK Service

Direct Python integration for agents built with Google ADK. No MCP indirection. Best for production fleets on Google Cloud.

from ezra.adk_service import EzraService

svc = EzraService(session_graph_id="race-weekend")
ctx = svc.recall(agent_id="tyre_eng")
svc.belief_check(claim="soft optimal")
// Path 02
Python SDK

Six methods for LangGraph, LangChain, or any custom framework. Adopt belief audit alone, or memory, or replay.

from ezra_core import Ezra

ezra = Ezra.from_env(session_graph_id="incident-triage")
await ezra.complete(
    agent_id="triage",
    messages=[...],
)
// Path 03
REST API

Thirteen endpoints. Bearer or OIDC auth. Use from any language or external system.

# Eleven of the thirteen endpoints are listed here
POST /ezra/belief/check
POST /ezra/belief/snapshot
POST /ezra/mesh/query
POST /ezra/commit
POST /ezra/recall
POST /ezra/rewind
POST /ezra/revert
POST /ezra/replay
POST /ezra/branch
POST /ezra/branch/diff
GET /ezra/health
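A hedged sketch of building a bearer-authenticated request for one of these endpoints using only the Python standard library. The host, the token, and the JSON body fields are assumptions. Only the path and the example values (`race-weekend`, `tyre_eng`, `soft optimal`) appear elsewhere on this page. The request is constructed but not sent.

```python
import json
import urllib.request

BASE_URL = "https://ezra.example.com"  # hypothetical host


def build_belief_check(
    token: str, session_graph_id: str, agent_id: str, claim: str
) -> urllib.request.Request:
    """Build (but do not send) a POST to /ezra/belief/check."""
    # The body shape is an assumption; check the API reference for the schema.
    body = json.dumps({
        "session_graph_id": session_graph_id,
        "agent_id": agent_id,
        "claim": claim,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/ezra/belief/check",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_belief_check("TOKEN", "race-weekend", "tyre_eng", "soft optimal")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it once the host and token are real.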
Full API Reference

Four rooms it
belongs in.

// AI Platform Engineer
"One platform. One audit trail. Any number of agents."

Wins back weeks of glue code per agent. Federation, memory, and belief tracking for the entire fleet.

// Enterprise Security
"Agents inherit user identity. The wrong agent never sees the wrong data."

Per-agent permission scopes enforced on every query and fetch. The security chokepoint for the fleet.

// Compliance Officer
"Every fact, every source, every permission — queryable and replayable."

The belief history is the SOC 2, GDPR, and financial-audit artifact. Branching replay is the root-cause tool.

// CTO
"Smallest context per call. Lower tokens, lower latency, at 100+ agents."

Performance stays flat as the fleet scales. Constant per-agent latency, linear total throughput. Provable.

Ezra is the
multi-agent platform
for enterprises.