Understanding Agent Memory: A Developer's Guide
A deep dive into the four types of memory AI agents need and why each matters for building production-grade systems.
One API for four memory types. Working, semantic, episodic, and procedural — all serverless, all AWS-native. Give your agents persistent memory in minutes, not months.
To give them memory, developers stitch together four databases: Redis for state, Pinecone for vectors, Postgres for structured data, S3 for logs. Four clients, four billing accounts, zero unified memory layer.
State lives here. Vectors live there. Events are somewhere else. You write glue code instead of writing agents.
Without persistence, every invocation starts from scratch. Context, preferences, and history vanish between calls.
AWS-native serverless infrastructure. No LLM required for CRUD operations.
Key-value state in DynamoDB. Sub-10ms reads with optimistic locking and configurable TTL. Ideal for agent step state and task context.
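As a concept sketch of the optimistic-locking write path described above: store a version counter next to each record and only accept a write against the version the caller last read (the pattern DynamoDB conditional writes support). This is an illustration of the idea, not Mnemora's actual implementation.

```python
class VersionConflict(Exception):
    """Raised when another writer updated the record first."""


class StateStore:
    """In-memory sketch of versioned key-value state with optimistic locking."""

    def __init__(self):
        self._records = {}  # key -> (value, version)

    def read(self, key):
        return self._records.get(key, (None, 0))

    def write(self, key, value, expected_version):
        _, current = self._records.get(key, (None, 0))
        if current != expected_version:
            # A concurrent writer got there first: re-read and retry.
            raise VersionConflict(f"expected v{expected_version}, found v{current}")
        self._records[key] = (value, current + 1)
        return current + 1


store = StateStore()
store.write("agent-1", {"task": "summarize Q4", "step": 1}, expected_version=0)
value, version = store.read("agent-1")  # version is now 1
```

A write that presents a stale version fails instead of silently clobbering a concurrent update, which is what makes the pattern safe for agent step state.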
Natural-language text stored as 1024-dimensional vectors in Aurora pgvector. Auto-embedded via Bedrock Titan. Duplicates are merged, not re-inserted.
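The "merged, not re-inserted" behavior can be sketched with cosine similarity: if a new memory's embedding is close enough to an existing one, update that row instead of adding another. The 0.95 threshold and the tiny 2-dimensional vectors here are illustrative assumptions; in production the vectors are 1024-dimensional Titan embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def upsert_memory(memories, new_vec, new_text, threshold=0.95):
    """Merge the new memory into a near-duplicate if one exists, else insert."""
    for m in memories:
        if cosine_similarity(m["vector"], new_vec) >= threshold:
            m["text"] = new_text  # merge: keep one row, refresh its content
            return m
    memories.append({"vector": new_vec, "text": new_text})
    return memories[-1]
```

Near-duplicates collapse into a single row, so repeated observations don't bloat the store or dilute search results.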
Append-only time-series event log. Hot data in DynamoDB, automatically tiered to S3. Full session replay and time-range queries.
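The access pattern of an append-only, timestamp-ordered log with time-range queries can be sketched in a few lines; the DynamoDB-to-S3 tiering is elided here, and this is an illustration rather than Mnemora's storage layout.

```python
from bisect import bisect_left, bisect_right


class EpisodeLog:
    """Append-only event log ordered by timestamp, with range queries."""

    def __init__(self):
        self._timestamps = []
        self._events = []

    def append(self, ts, event):
        if self._timestamps and ts < self._timestamps[-1]:
            raise ValueError("append-only log requires non-decreasing timestamps")
        self._timestamps.append(ts)
        self._events.append(event)

    def range(self, start, end):
        """All events with start <= timestamp <= end, via binary search."""
        lo = bisect_left(self._timestamps, start)
        hi = bisect_right(self._timestamps, end)
        return self._events[lo:hi]
```

Because the log is ordered and never rewritten, a full session replay is just a range query over that session's time window.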
Tool definitions, prompt templates, schemas, and rules stored in Postgres. Version-controlled and queryable by name. Schema is live; SDK methods ship in v0.2.
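Since the SDK methods for procedural memory don't ship until v0.2, here is only a conceptual sketch of a name-addressed, versioned store for templates and schemas; the method names are hypothetical, not the eventual Mnemora API.

```python
class ProceduralStore:
    """Versioned store for prompt templates, tool schemas, and rules,
    queryable by name. Illustrative only; not the v0.2 SDK surface."""

    def __init__(self):
        self._items = {}  # name -> list of versions (index 0 = version 1)

    def put(self, name, body):
        """Store a new version under `name`; returns the version number."""
        versions = self._items.setdefault(name, [])
        versions.append(body)
        return len(versions)

    def get(self, name, version=None):
        """Fetch the latest version by default, or a pinned one."""
        versions = self._items[name]
        return versions[-1] if version is None else versions[version - 1]
```

Version pinning is what makes procedural memory safe to evolve: a running agent can keep the template version it was tested against while newer versions roll out.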
Install with `pip install mnemora-sdk` and you're ready.
```python
from mnemora import MnemoraSync

with MnemoraSync(api_key="mnm_...") as client:
    # Store working-memory state
    client.store_state("agent-1", {"task": "summarize Q4", "step": 1})

    # Semantic memory — auto-embedded server-side
    client.store_memory("agent-1", "User prefers bullet points over prose.")

    # Vector search across all stored memories
    results = client.search_memory("user formatting preferences", agent_id="agent-1")
    for r in results:
        print(r.content, r.similarity_score)

    # Log an episode to the time-series history
    client.store_episode(agent_id="agent-1", session_id="sess-001",
                         type="action", content={"tool": "summarize", "input": "Q4 report"})
```
Concrete data. No hype.
| Feature | Mnemora | Mem0 | Zep | Letta |
|---|---|---|---|---|
| Memory types | 4 (state, semantic, episodic, procedural) | 1 (semantic only) | 2 (semantic + temporal) | 2 (core + archival) |
| Vector search | pgvector 1024d | External DB | Built-in | Built-in |
| LLM required for CRUD | No | Every op | — | Every op |
| Serverless | Yes | — | — | — |
| Self-hostable | Partial | — | — | — |
| Multi-tenant | Yes | — | — | — |
| LangGraph checkpoints | Yes | — | — | — |
| State latency | <10ms | ~500ms | <200ms | ~1s |
Data based on public documentation as of 2025. Subject to change.
Mem0 and Letta call an LLM for every memory operation, adding latency and token cost. Mnemora does direct database CRUD. State reads are sub-10ms. No LLM overhead, no rate limits.
Every component scales to zero when idle. DynamoDB on-demand, Aurora Serverless v2, Lambda, S3. You pay per request. Estimated idle cost: ~$1/month.
Each API key is scoped to a tenant. Each agent gets an isolated namespace. Data is never mixed at the database layer. Built for SaaS products with multiple end-users.
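One common way to guarantee tenant and agent isolation at the database layer is to build every partition key from the tenant and agent namespaces, so records from different tenants can never share a key. This is an illustration of the pattern, not Mnemora's actual key schema; the `#` separator is an assumption.

```python
def namespaced_key(tenant_id: str, agent_id: str, key: str) -> str:
    """Compose a partition key scoped to one tenant and one agent.

    Rejecting the separator character in any component ensures two
    different (tenant, agent, key) triples can never produce the
    same composed key.
    """
    for part in (tenant_id, agent_id, key):
        if "#" in part:
            raise ValueError("'#' is reserved as the namespace separator")
    return f"{tenant_id}#{agent_id}#{key}"
```

With keys composed this way, isolation does not depend on application code remembering to filter by tenant; the storage layer simply never interleaves data.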
Drop in MnemoraCheckpointSaver as your LangGraph checkpointer. Each thread_id maps to a Mnemora agent with optimistic locking to prevent concurrent-write data loss.
Step-by-step tutorial: how to use MnemoraCheckpointSaver to give your LangGraph agents durable, cross-session memory.
The economics of serverless infrastructure for AI workloads — and why stateless-by-default is the wrong architecture choice.
Start free. Scale as you grow. No surprises.
- For exploration and side projects
- For early-stage products
- For production applications
- For high-volume teams
All plans include TLS encryption, AWS-native infrastructure, and the full Python SDK. No credit card required for the free tier.
Start in under 5 minutes. No infrastructure to configure. No servers to manage. Just memory that works.