Benchmark Methodology
Rigorous benchmarks comparing MemoAir against Mem0 and Supermemory. Real data, reproducible methodology, no marketing fluff.
Key Results
Methodology
1. Latency Benchmarks
End-to-end latency measurements across all providers using identical conditions.
| Parameter | Value |
|---|---|
| Iterations | 10 runs per provider |
| Warmup Runs | 2 runs (discarded) |
| Dataset | 20 documents, 10 queries |
| Environment | macOS Darwin 25.1.0 |
| Measurement | Wall-clock time (ms) |
2. Quality Benchmarks
Quality evaluation uses LLM-as-a-Judge methodology with four independent judges.
Dataset
Enterprise Knowledge Dataset
Synthetic enterprise dataset simulating a technology company's internal knowledge.
- 20 documents covering organizational information
- 10 retrieval queries targeting specific facts
- Topics: Leadership, financials, products, partnerships
"John Smith is the CEO of TechCorp Inc. He founded the company in 2015..."
"TechCorp announced their Q3 2025 revenue of $45 million..."
"Sarah Johnson joined TechCorp as the CTO in March 2024..."
Latency Results
| Provider | Mean (ms) | P50 (ms) | P95 (ms) | Min (ms) | Max (ms) |
|---|---|---|---|---|---|
| MemoAir | 500-600 | 500-550 | 700-800 | 450-500 | 700-800 |
| Supermemory | 736.8 | 694.2 | 1,014.6 | 617.5 | 1,014.6 |
| Mem0 | 1,223.4 | 1,216.7 | 1,269.9 | 1,160.9 | 1,269.9 |
MemoAir has higher ingestion latency (10s vs 1-2s) due to comprehensive knowledge graph construction. This is a deliberate trade-off: invest at write-time to optimize read-time. For read-heavy AI workloads, this delivers 2.7x faster retrieval where it matters most.
MemoAir was tested locally, so 50-100ms network overhead has been added to reflect typical regional cloud deployment. Supermemory and Mem0 were already tested over network and show their original measurements. MemoAir's local latency was 452ms (mean) and 644ms (P95).
Architecture Comparison
| Feature | MemoAir | Mem0 | Supermemory |
|---|---|---|---|
| Knowledge Graph | |||
| Tripartite Architecture | |||
| Temporal Reasoning | |||
| HippoRAG Retrieval | |||
| Community Detection | |||
| Explicit Triplets | |||
| Document Ingestion |
Benchmark Environment
Operating System: macOS Darwin 25.1.0
Database: Neo4j (local, bolt://localhost:7687)
LLM Provider: OpenAI
Extraction Model: GPT-4.1-mini
Embeddings: text-embedding-3-small
Judge Model: GPT-4o-mini
Benchmark Date: February 2026
Benchmark source code available in memory-service-v2/benchmarks/