Runtime Substrate for Multi-Agent Fleets.

Replayable, federated context for multi-agent runtimes: Ezra manages what each agent in the fleet sees, believes, and remembers, with complete provenance.

curl -fsSL https://ezra128.vercel.app/install.sh | bash


<40 ms · Per-Agent Overhead

N agents · Dynamic Fleet Scaling

3 tiers · Memory Topology

∞ replay · Branching Belief History

Quickstart

Three steps. Running in minutes.


Install

curl -fsSL https://ezra128.vercel.app/install.sh | bash

Spawn an agent

    # Uses top-level await: run inside an async function or `python -m asyncio`.
    from ezra_core.runtime import Ezra

    ezra = Ezra.from_env()
    graph = await ezra.create_session_graph(session_graph_id="race-weekend")
    svc = await ezra.spawn_agent(graph, agent_id="strategist", permission_scope=["tyres"])

Run a turn

    result = await svc.complete("What tyre for the final stint?")
    print(result.response)

Full Setup Guide — Google Cloud ADK →

Capabilities

Six things one platform handles.

Most tools solve federation or memory — for a single agent. Ezra does both, for any number of agents, as one runtime.

// Federation

Pushdown Data Access

Connect to MongoDB, Snowflake, BigQuery, REST. The database does the math — the model gets typed summaries with provenance, never raw dumps.
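A minimal sketch of the pushdown idea, using the standard-library `sqlite3` module as a stand-in for a real warehouse. The `TypedSummary` shape, the `laps` table, and `avg_lap_time` are assumptions made for illustration, not Ezra's API. The point is only that the aggregate is computed inside the database and a single typed row with provenance comes back.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedSummary:
    """Hypothetical shape of what the model receives instead of raw rows."""
    metric: str
    value: float
    row_count: int
    source: str  # provenance: where the number came from
    query: str   # provenance: the exact query the database ran


def avg_lap_time(conn: sqlite3.Connection, compound: str) -> TypedSummary:
    # The aggregation runs inside the database; only one row comes back.
    query = "SELECT AVG(lap_seconds), COUNT(*) FROM laps WHERE compound = ?"
    avg, count = conn.execute(query, (compound,)).fetchone()
    return TypedSummary("avg_lap_seconds", float(avg), count, "sqlite:laps", query)


# Tiny in-memory dataset standing in for a real source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE laps (compound TEXT, lap_seconds REAL)")
conn.executemany(
    "INSERT INTO laps VALUES (?, ?)",
    [("soft", 90.0), ("soft", 92.0), ("hard", 95.0)],
)

summary = avg_lap_time(conn, "soft")
print(summary)
```

The model-facing payload here is one small record no matter how many rows the table holds.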

// Memory

Three-Tier Recall

Hot, warm, cold. Automatic eviction, compaction, and salience-ranked hydration. Each agent gets the most relevant slice of what the fleet has ever known.

// Beliefs

Four-Strategy Reconciliation

When agents disagree, four strategies resolve it before the model sees it: last-write, highest-trust, manual escalation, or your own custom resolver.
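The four strategies can be sketched as interchangeable resolver functions. This is an illustration under stated assumptions: the `Belief` fields, the `NeedsHuman` exception, the strategy keys, and `reconcile` are hypothetical names, not Ezra's actual interface.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Belief:
    agent_id: str
    claim: str
    trust: float       # hypothetical per-agent trust score
    written_at: float  # hypothetical commit timestamp


class NeedsHuman(Exception):
    """Raised by the manual-escalation strategy (hypothetical)."""


Resolver = Callable[[Sequence[Belief]], Belief]


def last_write(conflict: Sequence[Belief]) -> Belief:
    return max(conflict, key=lambda b: b.written_at)


def highest_trust(conflict: Sequence[Belief]) -> Belief:
    return max(conflict, key=lambda b: b.trust)


def manual_escalation(conflict: Sequence[Belief]) -> Belief:
    raise NeedsHuman(f"{len(conflict)} conflicting beliefs need review")


STRATEGIES = {
    "last-write": last_write,
    "highest-trust": highest_trust,
    "manual-escalation": manual_escalation,
}


def reconcile(
    conflict: Sequence[Belief],
    strategy: str = "last-write",
    custom: Optional[Resolver] = None,
) -> Belief:
    # The fourth strategy is whatever resolver the application supplies.
    if strategy == "custom":
        if custom is None:
            raise ValueError("strategy 'custom' needs a resolver")
        return custom(conflict)
    return STRATEGIES[strategy](conflict)


conflict = [
    Belief("tyre_eng", "soft optimal", trust=0.9, written_at=10.0),
    Belief("strategist", "hard optimal", trust=0.6, written_at=20.0),
]
print(reconcile(conflict, "last-write").claim)
print(reconcile(conflict, "highest-trust").claim)
```

Because resolution happens before context assembly, the model would see one winning belief rather than the disagreement.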

// Replay

Branching Replay

Reconstruct what any agent knew at any prior moment. Mutate state. Run forward. Diff against reality. The compliance and counterfactual artifact.
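The reconstruct, mutate, run-forward, diff loop can be sketched over a plain event log. Everything here is assumed for illustration: the `(timestamp, key, value)` event tuples, the `choose_tyre` policy, and the helper names. It does not show how Ezra stores or replays state.

```python
from typing import Any, Callable

Event = tuple[float, str, Any]  # hypothetical (timestamp, key, value)


def reconstruct(events: list[Event], ts: float) -> dict[str, Any]:
    """Rebuild the state as it stood at `ts` by replaying earlier events."""
    state: dict[str, Any] = {}
    for when, key, value in sorted(events, key=lambda e: e[0]):
        if when <= ts:
            state[key] = value
    return state


def branch(events, ts, mutations, policy: Callable[[dict], str]):
    """Fork the state at `ts`, apply mutations, and run the policy forward."""
    state = {**reconstruct(events, ts), **mutations}
    return state, policy(state)


def diff(a: dict, b: dict) -> dict[str, tuple]:
    """Keys whose values differ between two states, as (a, b) pairs."""
    return {k: (a.get(k), b.get(k)) for k in a.keys() | b.keys() if a.get(k) != b.get(k)}


def choose_tyre(state: dict) -> str:
    # Hypothetical decision rule standing in for a model call.
    return "soft" if state["track_temp_c"] < 30 else "hard"


events = [(1.0, "track_temp_c", 28), (2.0, "laps_left", 12), (3.0, "track_temp_c", 35)]

reality = reconstruct(events, ts=3.0)
what_if, decision = branch(events, ts=2.0, mutations={"laps_left": 8}, policy=choose_tyre)

print(choose_tyre(reality), decision)
print(diff(reality, what_if))
```

The diff is the artifact: it records exactly which inputs separate the counterfactual branch from what actually happened.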

// Security

Per-Agent Scopes

Every agent sees only what its permission scope allows. Enforced at the platform layer on every memory query and data fetch — not in the prompt.
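A minimal sketch of enforcing scope in the store rather than in the prompt. `MemoryItem`, `Agent`, and `ScopedStore` are hypothetical names; only `agent_id` and `permission_scope` echo the quickstart. The filter sits inside `query`, so a caller cannot skip it.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MemoryItem:
    topic: str
    text: str


@dataclass(frozen=True)
class Agent:
    agent_id: str
    permission_scope: frozenset[str]


class ScopedStore:
    """Hypothetical store that applies the scope filter on every query."""

    def __init__(self, items: list[MemoryItem]) -> None:
        self._items = list(items)

    def query(self, agent: Agent, topic: Optional[str] = None) -> list[MemoryItem]:
        # Out-of-scope items are dropped here, whatever the prompt asked for.
        return [
            item
            for item in self._items
            if item.topic in agent.permission_scope
            and (topic is None or item.topic == topic)
        ]


store = ScopedStore([
    MemoryItem("tyres", "soft compound degrades after 15 laps"),
    MemoryItem("fuel", "target lift-and-coast from lap 40"),
])
strategist = Agent("strategist", frozenset({"tyres"}))

print(store.query(strategist))
print(store.query(strategist, topic="fuel"))
```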

// Scale

Flat Latency at N

Adding agents doesn't slow per-agent decisions. O(1) hot tier, scope-filtered shared tiers, independent mesh fetches. Constant latency, linear throughput.
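One way to picture the claim, as a toy model only: each agent's turn does an O(1) lookup in its own hot-tier entry and then awaits an independent fetch, so turns overlap instead of queueing. `HOT_TIER`, `mesh_fetch`, and the 20 ms simulated delay are assumptions; this demonstrates the concurrency pattern, not Ezra's measured performance.

```python
import asyncio
import time

# Hypothetical per-agent hot tier: one entry per agent, so a lookup is O(1).
HOT_TIER = {f"agent-{i}": {"active_turn": f"turn for agent-{i}"} for i in range(50)}


async def mesh_fetch(agent_id: str) -> str:
    await asyncio.sleep(0.02)  # simulated I/O; no agent waits on another
    return f"summary for {agent_id}"


async def run_turn(agent_id: str) -> tuple[str, str]:
    hot = HOT_TIER[agent_id]["active_turn"]  # constant-time dict lookup
    fetched = await mesh_fetch(agent_id)
    return hot, fetched


async def run_fleet(agent_ids: list[str]):
    start = time.perf_counter()
    # gather runs the turns concurrently and returns results in input order.
    results = await asyncio.gather(*(run_turn(a) for a in agent_ids))
    return results, time.perf_counter() - start


results, elapsed = asyncio.run(run_fleet(list(HOT_TIER)))
print(f"{len(results)} agents in {elapsed:.3f}s")
```

Run sequentially, 50 turns at 20 ms each would take about a second; run concurrently, wall time stays close to a single turn.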

Architecture

The 8-step router. Per agent. <40ms.

Every agent call runs through eight steps in the same order. Each agent gets its own pipeline instance. The output of every step is observable, replayable, and pinned to the belief store.

// Pipeline · EZRA.RUNTIME / ROUTER · P50 target <40ms

1. Parse: intent, entities
2. Policy: scope check
3. Belief: reconcile
4. Hydrate: memory pull
5. Fetch: mesh decision
6. Assemble: context build
7. Call: LLM via litellm
8. Write: beliefs, trace

Most agent platforms only do steps 1, 7, and 8. Steps 2–6 are where Ezra differs — policy enforcement, belief reconciliation, scope-filtered memory, pushdown federation, and salience-ranked assembly all happen before the model is called.
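The fixed ordering can be sketched as a list of step functions run over a shared context, with every step's output appended to a trace. The step bodies are stubs and all names are hypothetical; in particular `call` is a stand-in and makes no real LLM request.

```python
KNOWN_ENTITIES = {"tyre", "fuel"}  # hypothetical vocabulary for the stub parser


def parse(ctx):
    words = [w.strip("?.,!").lower() for w in ctx["message"].split()]
    return {"entities": [w for w in words if w in KNOWN_ENTITIES]}


def policy(ctx):
    return {"allowed": [e for e in ctx["entities"] if e in ctx["scope"]]}


def belief(ctx):
    return {"beliefs": {e: f"reconciled belief about {e}" for e in ctx["allowed"]}}


def hydrate(ctx):
    return {"memory": [f"prior notes on {e}" for e in ctx["allowed"]]}


def fetch(ctx):
    return {"mesh": {e: f"typed summary for {e}" for e in ctx["allowed"]}}


def assemble(ctx):
    parts = [*ctx["beliefs"].values(), *ctx["memory"], *ctx["mesh"].values()]
    return {"prompt": "\n".join([*parts, ctx["message"]])}


def call(ctx):
    # Stand-in for the model call; returns a canned string.
    return {"response": f"stub answer from {len(ctx['prompt'])} chars of context"}


def write(ctx):
    return {"committed": True}


PIPELINE = [
    ("parse", parse), ("policy", policy), ("belief", belief), ("hydrate", hydrate),
    ("fetch", fetch), ("assemble", assemble), ("call", call), ("write", write),
]


def route(message: str, scope: set):
    ctx = {"message": message, "scope": scope}
    trace = []
    for name, step in PIPELINE:  # same eight steps, same order, every call
        output = step(ctx)
        ctx.update(output)
        trace.append((name, output))  # each step's output stays observable
    return ctx, trace


ctx, trace = route("What tyre and fuel for the final stint?", scope={"tyre"})
print([name for name, _ in trace])
print(ctx["allowed"])
```

Note how the out-of-scope entity is dropped at step 2, so nothing about it reaches the prompt built at step 6.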

Session graph

A fleet that spawns and shares.

One session graph binds N agents — spawned and terminated as the operation needs. Each has its own scoped view, but they share one belief store and one tiered memory.


Hot, warm, and cold.

Memory is not flat. The runtime evicts, compacts, and hydrates across three tiers — so each agent always sees the most relevant slice, filtered by its permission scope.

Hot · Redis · per-agent hash · ~0ms

Active turn, pinned beliefs, and current mesh result, isolated per agent in the session graph.

Warm · Qdrant · vector filtered · 5–15ms

Compressed prior turns, topic summaries, recent tool results. Searchable by salience, evicted on decay.

Cold · MongoDB Atlas · durable · 20–50ms

Episodic, semantic core, archival, procedural rules, and the full versioned belief history. Never evicted.
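The tier behaviour described above can be sketched with three in-memory dicts standing in for Redis, Qdrant, and MongoDB Atlas. The class, its capacity rule, and the salience numbers are assumptions for illustration; real eviction, compaction, and decay would be more involved.

```python
from typing import Any, Optional


class TieredMemory:
    """Toy three-tier store: dicts stand in for the real backends."""

    def __init__(self, hot_capacity: int = 2) -> None:
        self.hot: dict[str, tuple[float, Any]] = {}
        self.warm: dict[str, tuple[float, Any]] = {}
        self.cold: dict[str, tuple[float, Any]] = {}
        self.hot_capacity = hot_capacity

    def write(self, key: str, value: Any, salience: float) -> None:
        self.cold[key] = (salience, value)  # durable copy, never evicted
        self._put_hot(key, value, salience)

    def _put_hot(self, key: str, value: Any, salience: float) -> None:
        self.hot[key] = (salience, value)
        while len(self.hot) > self.hot_capacity:
            # Evict the least salient hot entry down to the warm tier.
            victim = min(self.hot, key=lambda k: self.hot[k][0])
            self.warm[victim] = self.hot.pop(victim)

    def read(self, key: str) -> tuple[Optional[Any], str]:
        """Return (value, tier_it_was_found_in), hydrating on a lower-tier hit."""
        for name, tier in (("hot", self.hot), ("warm", self.warm), ("cold", self.cold)):
            if key in tier:
                salience, value = tier[key]
                if name != "hot":
                    self.warm.pop(key, None)
                    self._put_hot(key, value, salience)
                return value, name
        return None, "miss"


memory = TieredMemory(hot_capacity=2)
memory.write("a", "pinned belief", salience=0.9)
memory.write("b", "old tool result", salience=0.1)
memory.write("c", "topic summary", salience=0.5)  # pushes "b" down to warm

print(memory.read("a"))
print(memory.read("b"))
```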

Whitespace

Five things no other platform ships.

Persistent memory is table stakes. The wedge is what sits one layer above it: provenance, audit, per-agent replay, time-travel, and cross-agent reconciliation.

i.

Pushdown execution

The database does the math. Typed queries to MongoDB, Snowflake, BigQuery — the model never touches raw rows.

ii.

Versioned belief audit

Every commitment any agent made, every fact it relied on, attributed by agent_id. Queryable and exportable at any timestamp.
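A sketch of what "queryable and exportable at any timestamp" could look like over an append-only log. `BeliefVersion`, `BeliefLog`, and their fields are hypothetical; only `agent_id` comes from the text above.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class BeliefVersion:
    key: str
    value: str
    agent_id: str  # who committed it
    source: str    # hypothetical provenance field: what the belief relied on
    ts: float      # hypothetical commit timestamp


class BeliefLog:
    """Append-only log: versions are added, never overwritten."""

    def __init__(self) -> None:
        self._versions: list[BeliefVersion] = []

    def commit(self, version: BeliefVersion) -> None:
        self._versions.append(version)

    def as_of(self, ts: float, agent_id: Optional[str] = None) -> dict[str, BeliefVersion]:
        """Latest version of each belief at `ts`, optionally for one agent."""
        view: dict[str, BeliefVersion] = {}
        for v in sorted(self._versions, key=lambda v: v.ts):
            if v.ts <= ts and (agent_id is None or v.agent_id == agent_id):
                view[v.key] = v
        return view

    def export(self, ts: float) -> str:
        return json.dumps([asdict(v) for v in self.as_of(ts).values()])


log = BeliefLog()
log.commit(BeliefVersion("tyre", "soft optimal", "tyre_eng", "telemetry", 10.0))
log.commit(BeliefVersion("tyre", "hard optimal", "strategist", "weather feed", 20.0))

print(log.as_of(15.0)["tyre"].value)
print(log.export(25.0))
```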

iii.

Branching replay

Reconstruct any agent's view. Mutate state. Run forward with a new model or policy. Diff branches. The compliance and eval surface.

iv.

Time-travel federated query

Run any query as of any prior timestamp. Source-native where it exists. Best-effort with explicit provenance where it does not.

v.

Cross-agent reconciliation

When agents commit conflicting beliefs, four strategies resolve it: last-write, highest-trust, manual escalation, or a custom application-defined resolver.

Integration

Three ways in. Pick one.

Ezra is framework-agnostic. ADK is the primary path. The Python SDK covers LangGraph and LangChain. REST covers everything else.

// Path 01

ADK Service

Direct Python integration for agents built with Google ADK. No MCP indirection. Best for production fleets on Google Cloud.

    from ezra.adk_service import EzraService

    svc = EzraService(session_graph_id="race-weekend")
    ctx = svc.recall(agent_id="tyre_eng")
    svc.belief_check(claim="soft optimal")

// Path 02

Python SDK

Six methods for LangGraph, LangChain, or any custom framework. Adopt belief audit alone, or memory, or replay.

    from ezra_core import Ezra

    ezra = Ezra.from_env(session_graph_id="incident-triage")
    await ezra.complete(
        agent_id="triage",
        messages=[...],
    )

// Path 03

REST API

Thirteen endpoints. Bearer or OIDC auth. Use from any language or external system.

    # Thirteen endpoints (eleven shown here)
    POST /ezra/belief/check
    POST /ezra/belief/snapshot
    POST /ezra/mesh/query
    POST /ezra/commit
    POST /ezra/recall
    POST /ezra/rewind
    POST /ezra/revert
    POST /ezra/replay
    POST /ezra/branch
    POST /ezra/branch/diff
    GET  /ezra/health
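A hedged sketch of calling one of these endpoints with bearer auth from the Python standard library. Only the `/ezra/recall` path and the bearer scheme come from the list above; the base URL, token, and JSON body fields are assumptions, so check the API reference for the real request schema. The request is built but deliberately not sent.

```python
import json
from urllib.request import Request


def build_recall_request(
    base_url: str, token: str, agent_id: str, session_graph_id: str
) -> Request:
    # Body fields are hypothetical; the real schema may differ.
    body = json.dumps(
        {"agent_id": agent_id, "session_graph_id": session_graph_id}
    ).encode("utf-8")
    return Request(
        url=f"{base_url}/ezra/recall",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder host and token, for illustration only.
req = build_recall_request(
    "https://ezra.example.com", "test-token", "tyre_eng", "race-weekend"
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; any HTTP client in any language can follow the same shape.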

Full API Reference →

Audience

Four rooms it belongs in.

// AI Platform Engineer

"One platform. One audit trail. Any number of agents."

Wins back weeks of glue code per agent. Federation, memory, and belief tracking for the entire fleet.

// Enterprise Security

"Agents inherit user identity. The wrong agent never sees wrong data."

Per-agent permission scopes enforced on every query and fetch. The security chokepoint for the fleet.

// Compliance Officer

"Every fact, every source, every permission — queryable and replayable."

The belief history is the SOC 2, GDPR, and financial-audit artifact. Branching replay is the root-cause tool.

// CTO

"Smallest context per call. Lower tokens, lower latency, at 100+ agents."

Performance stays flat as the fleet scales. Constant per-agent latency, linear total throughput. Provable.

Ezra is the multi-agent platform for enterprises.


// Repo

github.com/xavio2495/ezra

// Docs

ezra128.vercel.app/docs

// Contact

2495.immanuel@gmail.com