Three-Tier Memory
Ezra manages memory across three tiers modelled on the hardware memory hierarchy. Each tier covers a different time scale and is backed by a different store, and all three are scope-filtered per agent.
| Tier | Analog | Latency | Holds | Backing store |
|---|---|---|---|---|
| Hot | CPU cache | ~0 ms | Active turn, pinned beliefs, fetched mesh result | Redis (per-agent hash) |
| Warm | RAM | 5–15 ms | Compressed prior turns, topic summaries | Qdrant (graph + scope filtered) |
| Cold | SSD | 20–50 ms | Episodic, semantic, procedural, belief history | MongoDB Atlas |
Hot — Redis
One Redis hash per agent holds the recent turns (a capped ring buffer), pinned beliefs, and the latest mesh result. Because each agent owns its own key, reading or writing an agent's hot state is a single key lookup: O(1) per agent, however many agents exist. Adding agents adds rows, not iterations.
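The per-agent layout can be illustrated without a Redis server. The sketch below is an in-memory stand-in, not Ezra's implementation: `HotTier`, `HotSlot`, and `MAX_TURNS` are hypothetical names, and the cap of 4 is an arbitrary value chosen for the example.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Optional

MAX_TURNS = 4  # hypothetical cap; the real buffer size is not stated in this document


@dataclass
class HotSlot:
    """In-memory stand-in for one agent's Redis hash (assumed field layout)."""
    turns: deque = field(default_factory=lambda: deque(maxlen=MAX_TURNS))
    pinned_beliefs: dict = field(default_factory=dict)
    mesh_result: Optional[Any] = None


class HotTier:
    """Maps agent id -> slot, mirroring 'one key per agent'."""

    def __init__(self) -> None:
        self._slots: dict = {}

    def slot(self, agent_id: str) -> HotSlot:
        # A single dictionary lookup, independent of the number of agents.
        return self._slots.setdefault(agent_id, HotSlot())

    def record_turn(self, agent_id: str, turn: str) -> None:
        # deque(maxlen=...) silently drops the oldest entry once the cap is hit.
        self.slot(agent_id).turns.append(turn)


hot = HotTier()
for i in range(6):
    hot.record_turn("strategist", f"turn-{i}")
hot.record_turn("engineer", "turn-0")

recent = list(hot.slot("strategist").turns)
engineer_turns = list(hot.slot("engineer").turns)
```

After six writes with a cap of four, `recent` holds only the last four turns, and the second agent's slot is unaffected by the first agent's traffic.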
Warm — Qdrant
The warm tier holds compressed prior-turn and topic summaries. They are recalled by semantic similarity to the current input, filtered server-side by session_graph_id, and then filtered again in Python by scope and TTL. Summaries past their TTL are evicted by the lifecycle meta-agent.
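The second, Python-side filter might look roughly like the sketch below. It assumes each hit carries a topic set, a creation timestamp, and a TTL, and that "in scope" means the hit shares at least one topic with the agent's scope. The `Summary` type and `filter_hits` function are hypothetical and are not part of Ezra's API.

```python
import time
from dataclasses import dataclass


@dataclass
class Summary:
    """Hypothetical shape of a warm-tier hit after the server-side graph filter."""
    text: str
    topics: frozenset
    created_at: float   # Unix seconds
    ttl_seconds: float


def filter_hits(hits, scope_topics, now=None):
    """Keep summaries that overlap the agent's scope and have not expired."""
    now = time.time() if now is None else now
    kept = []
    for hit in hits:
        in_scope = bool(hit.topics & scope_topics)          # assumed rule: any shared topic
        alive = (now - hit.created_at) < hit.ttl_seconds    # assumed rule: age below TTL
        if in_scope and alive:
            kept.append(hit)
    return kept


hits = [
    Summary("Medium tyre degradation summary", frozenset({"tyres"}), 1_000.0, 600.0),
    Summary("Pit-lane incident recap", frozenset({"incidents"}), 1_000.0, 600.0),
    Summary("Old strategy notes", frozenset({"strategy"}), 0.0, 600.0),
]
visible = filter_hits(hits, scope_topics={"tyres", "strategy"}, now=1_200.0)
visible_texts = [s.text for s in visible]
```

Here the incident recap is dropped for being out of scope and the old strategy notes are dropped for being past their TTL, so only the tyre summary remains.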
Cold — MongoDB Atlas
Four sub-stores share one Atlas client; the semantic store appears as two rows below because its core and archival parts load differently:
| Sub-store | Loading | Notes |
|---|---|---|
| Episodic | by session-graph / timestamp | Time-windowed event memory |
| Semantic — core | always, at agent spawn, for scope | The always-needed facts |
| Semantic — archival | per turn, by vector similarity | Atlas Vector Search |
| Procedural | by intent pattern | Inferred behavioural rules |
| Belief history | append-only, queryable | The compliance artifact — never compacts |
Core vs. archival semantic memory
Core facts load at every spawn for the agent's scope (institutional knowledge it always needs). Archival facts are the long tail: on every turn, the router embeds the input and pulls the most-similar archival facts via Atlas $vectorSearch, scope-filtered. Each recall bumps the fact's access count; once it crosses a threshold the learning meta-agent promotes it to core, so frequently-needed knowledge graduates into the always-loaded set.
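The promotion rule at the end of the paragraph above can be sketched in a few lines. This is an illustration only: `Fact`, `record_recall`, `promote_frequent_facts`, and the threshold of 3 are hypothetical, and treating "crosses a threshold" as "greater than or equal to" is an assumption.

```python
from dataclasses import dataclass

PROMOTION_THRESHOLD = 3  # hypothetical; the actual threshold is not stated in this document


@dataclass
class Fact:
    """Hypothetical semantic fact with the bookkeeping the promotion rule needs."""
    text: str
    tier: str = "archival"
    access_count: int = 0


def record_recall(fact: Fact) -> None:
    """Bump the access count each time a fact is recalled."""
    fact.access_count += 1


def promote_frequent_facts(facts, threshold=PROMOTION_THRESHOLD):
    """Move archival facts whose access count has reached the threshold into core."""
    promoted = []
    for fact in facts:
        if fact.tier == "archival" and fact.access_count >= threshold:
            fact.tier = "core"
            promoted.append(fact)
    return promoted


stint_fact = Fact("medium tyre stint length")
limit_fact = Fact("pit-lane speed limit")
for _ in range(3):
    record_recall(stint_fact)
record_recall(limit_fact)

promoted = promote_frequent_facts([stint_fact, limit_fact])
```

Only `stint_fact` is promoted, since it is the only fact recalled three times. In this sketch a promoted fact would be picked up by the next spawn's core load rather than by further per-turn recalls.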
```python
# Per-turn archival recall (run inside router step 4):
facts = await semantic_store.recall_archival(
    "how long do the medium tyres last?",
    user_id="team-ezra",
    scope_topics={"tyres", "strategy"},
    limit=3,
)
```

The vector index is created with MongoSemanticStore.ensure_vector_index(num_dimensions) at wiring/ingest time. It is sized to the embedder: gemini-embedding-001 is 3072-dim; Vertex text-embedding-005 is 768-dim.
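For orientation, an Atlas Vector Search index definition and a scope-filtered $vectorSearch stage generally take the shape below. This is a sketch of the Atlas document formats, not the body of ensure_vector_index or recall_archival. The field names (`embedding`, `topics`, `user_id`, `text`), the index name, and the 20x candidate oversampling are all assumptions made for the example.

```python
def vector_index_definition(num_dimensions: int) -> dict:
    """Atlas Vector Search index definition, sized to the embedder's output."""
    return {
        "fields": [
            {
                "type": "vector",
                "path": "embedding",            # assumed field holding the vector
                "numDimensions": num_dimensions,
                "similarity": "cosine",         # assumed metric
            },
            # Fields used in the $vectorSearch filter must be indexed as "filter".
            {"type": "filter", "path": "user_id"},
            {"type": "filter", "path": "topics"},
        ]
    }


def archival_recall_pipeline(query_vector, user_id, scope_topics, limit=3):
    """Aggregation pipeline: nearest archival facts, restricted to user and scope."""
    return [
        {
            "$vectorSearch": {
                "index": "archival_vector_index",   # hypothetical index name
                "path": "embedding",
                "queryVector": list(query_vector),
                "numCandidates": limit * 20,        # assumed oversampling factor
                "limit": limit,
                "filter": {
                    "$and": [
                        {"user_id": {"$eq": user_id}},
                        {"topics": {"$in": sorted(scope_topics)}},
                    ]
                },
            }
        },
        {
            "$project": {
                "_id": 0,
                "text": 1,
                "topics": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


index_def = vector_index_definition(768)
pipeline = archival_recall_pipeline([0.0] * 768, "team-ezra", {"tyres", "strategy"})
```

The one hard constraint is that `numDimensions` matches the length of `queryVector`. Switching embedders (for example from a 768-dim to a 3072-dim model) therefore means rebuilding the index and re-embedding stored facts.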
Cross-graph inheritance
A new session graph can hydrate core semantic memory from any number of prior graphs:
```python
graph = await ezra.create_session_graph(
    session_graph_id="race-weekend-imola-2026",
    inherits_from=["race-weekend-monaco-2026", "race-weekend-bahrain-2026"],
)
```

Inherited core facts load at spawn, scope-filtered. Belief history is never inherited; only semantic core (and, optionally, procedural rules) carries over.
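Conceptually, hydration from several parent graphs amounts to a scope-filtered merge. The sketch below assumes facts are plain dictionaries with `text` and `topics` keys and that duplicates across parents are collapsed by text, with the first parent listed winning. `hydrate_core` and that de-duplication rule are hypothetical, not documented Ezra behaviour.

```python
def hydrate_core(prior_graphs, inherits_from, scope_topics):
    """Collect in-scope core facts from the listed parent graphs, in order."""
    seen = set()
    inherited = []
    for graph_id in inherits_from:
        for fact in prior_graphs.get(graph_id, []):
            if not (fact["topics"] & scope_topics):
                continue  # out of the spawning agent's scope
            if fact["text"] in seen:
                continue  # assumed rule: first parent listed wins
            seen.add(fact["text"])
            inherited.append(fact)
    return inherited


# Placeholder facts; real core facts would come from the cold tier.
prior_graphs = {
    "race-weekend-monaco-2026": [
        {"text": "fact A", "topics": {"tyres"}},
        {"text": "fact B", "topics": {"catering"}},
    ],
    "race-weekend-bahrain-2026": [
        {"text": "fact A", "topics": {"tyres"}},
        {"text": "fact C", "topics": {"strategy"}},
    ],
}

core = hydrate_core(
    prior_graphs,
    inherits_from=["race-weekend-monaco-2026", "race-weekend-bahrain-2026"],
    scope_topics={"tyres", "strategy"},
)
core_texts = [fact["text"] for fact in core]
```

The out-of-scope fact is skipped and the fact shared by both parents appears once. Belief history is deliberately absent from the inputs, matching the rule that it is never inherited.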