Voice Memory Quickstart
Plug MemoAir into any LLM-driven voice agent in five minutes. Install the Python or TypeScript SDK, grab keys from the dashboard, seed an org index, then call search_memory before each reply and save_response after. The local memory runtime is bundled and auto-launched on first use — no Docker step.
Where to go after this page
- On LiveKit? LiveKit integration: drop-in MemoAirLiveKitAgent.
- On your own framework? Python SDK reference: full method signatures and options for MemoAirVoiceClient.
- Building a TypeScript LiveKit Agent? JS/TypeScript SDK reference: npm packages memoair-voice and memoair-livekit.
Prerequisites
- Python 3.9 or higher.
- A MemoAir account API key (memoair_pk_…), a project ID (proj_…), and an agent ID (agent_…) from dashboard.memoair.space.
Install
Pick your language. Both SDK families use the same local voice-runtime contract and auto-resolve the runtime on first use.

```shell
# Python
pip install "memoair-voice>=0.3.2"

# Node / TypeScript
npm install memoair-voice
```

Using LiveKit? pip install "memoair-livekit>=0.3.2" or npm install memoair-livekit adds the drop-in MemoAirLiveKitAgent on top.
Get keys from the dashboard
Sign in at dashboard.memoair.space and create (or pick) a project + agent. From the dashboard copy three values:
- API key: memoair_pk_…. Treat as a secret. Account-scoped: one per account, shared across every project and agent in your org.
- Project ID: proj_…. Identifies the workspace. Safe to log.
- Agent ID: agent_…. Identifies the voice bot inside the project. Safe to log.
Configure your .env
```
MEMOAIR_API_KEY=memoair_pk_...
MEMOAIR_PROJECT_ID=proj_...
MEMOAIR_AGENT_ID=agent_...
```

user_id is supplied per call, not via env. Resolve it from your transport (LiveKit participant identity, websocket session, Vapi metadata, etc.) and pass it into each search_memory / save_response call so every end-user gets an isolated memory lane.
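A minimal sketch of how an agent process might read and sanity-check these three variables at boot. The `MemoAirConfig` class and `load_memoair_config` helper are hypothetical names and not part of the SDK. The prefix checks come from the key formats listed above. The sketch assumes the variables are already exported into the process environment; a bare `.env` file is not read by Python on its own.

```python
import os
from dataclasses import dataclass, field
from typing import Dict, Mapping


@dataclass(frozen=True)
class MemoAirConfig:
    """Hypothetical container for the three dashboard values."""

    # repr=False keeps the secret out of logs that print the object.
    api_key: str = field(repr=False)
    project_id: str
    agent_id: str


# Variable name -> expected prefix, from the key formats described above.
_EXPECTED_PREFIXES = {
    "MEMOAIR_API_KEY": "memoair_pk_",
    "MEMOAIR_PROJECT_ID": "proj_",
    "MEMOAIR_AGENT_ID": "agent_",
}


def load_memoair_config(env: Mapping[str, str] = os.environ) -> MemoAirConfig:
    """Read the three variables and fail fast on missing or misplaced values."""
    values: Dict[str, str] = {}
    for name, prefix in _EXPECTED_PREFIXES.items():
        value = env.get(name, "").strip()
        if not value:
            raise RuntimeError(f"{name} is not set")
        if not value.startswith(prefix):
            # Name the variable but never echo the value: it may be the secret.
            raise RuntimeError(f"{name} should start with {prefix!r}")
        values[name] = value
    return MemoAirConfig(
        api_key=values["MEMOAIR_API_KEY"],
        project_id=values["MEMOAIR_PROJECT_ID"],
        agent_id=values["MEMOAIR_AGENT_ID"],
    )
```

Printing the resulting object shows the project and agent IDs (safe to log) but omits the API key, which matches the handling guidance for each value.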
Seed a knowledge index
Static knowledge (FAQs, runbooks, profile facts) lives in MemoAir's org index — project-scoped, shared across all users. Pick either path; both land in the same index.
Option A — Dashboard upload (no code)
At dashboard.memoair.space open your project → Knowledge → New index. Name it agent-memory, upload PDFs / paste raw notes, and the dashboard chunks and embeds for you. Best for non-trivial corpora and anything beyond the 100-doc / 1 MB per-call API cap.
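If a larger corpus still has to be seeded from code, one way to stay under the per-call cap is to split the documents into batches and call create_index once per batch, relying on the documented behavior that later calls append to the index. The helper below is a sketch, not an SDK function. It assumes the 1 MB limit is measured on the UTF-8 size of each JSON-serialized document and that 1 MB means 1,000,000 bytes; neither detail is confirmed on this page, so leave headroom.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List

MAX_DOCS_PER_CALL = 100          # from the documented per-call cap
MAX_BYTES_PER_CALL = 1_000_000   # assumption: "1 MB" read as 10**6 bytes


def batch_documents(
    documents: Iterable[Dict[str, Any]],
    max_docs: int = MAX_DOCS_PER_CALL,
    max_bytes: int = MAX_BYTES_PER_CALL,
) -> Iterator[List[Dict[str, Any]]]:
    """Yield lists of documents that each fit within both limits."""
    batch: List[Dict[str, Any]] = []
    batch_bytes = 0
    for doc in documents:
        # Assumed size measure: UTF-8 length of the JSON-encoded document.
        size = len(json.dumps(doc, ensure_ascii=False).encode("utf-8"))
        if size > max_bytes:
            raise ValueError(f"document {doc.get('id')!r} exceeds the byte limit")
        # Close the current batch if adding this document would break a limit.
        if batch and (len(batch) >= max_docs or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += size
    if batch:
        yield batch
```

Each yielded batch can then be passed as the `documents=` argument of a separate create_index call against the same index name.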
Option B — Code-driven seed
```python
import asyncio
import os

from memoair_voice import MemoAirVoiceClient


async def main() -> None:
    # create_index is a cloud-side call; it doesn't need a runtime, so the
    # build script can construct the client without touching user memory at
    # all. aclose() releases the cloud HTTP client cleanly.
    client = MemoAirVoiceClient(
        api_key=os.environ["MEMOAIR_API_KEY"],
        project_id=os.environ["MEMOAIR_PROJECT_ID"],
        agent_id=os.environ["MEMOAIR_AGENT_ID"],
    )
    try:
        # First call CREATES the index; later calls APPEND (or replace by id).
        # id is the stable per-document key; metadata is free-form tags.
        result = await client.create_index(
            "agent-memory",
            documents=[
                {"id": "returns", "text": "Returns accepted within 30 days with a receipt.", "metadata": {"topic": "returns"}},
                {"id": "shipping", "text": "Express shipping is 1-2 days.", "metadata": {"topic": "shipping"}},
            ],
        )
        print(f"Indexed {result.chunk_count} chunks (version={result.version})")
    finally:
        await client.aclose()


asyncio.run(main())
```

Wire MemoAir into your agent
The default pattern: search before each reply (auto-inject as a system message), save after each reply. The example below is framework-agnostic; for the LiveKit drop-in see the LiveKit page.
```python
import asyncio
import os

from memoair_voice import MemoAirVoiceClient


async def your_llm_call(system_prompt: str, user_text: str) -> str:
    # Placeholder so the example runs end to end; swap in your real LLM client.
    return f"(stub reply to: {user_text})"


async def run_turn(
    client: MemoAirVoiceClient,
    user_text: str,
    *,
    participant_identity: str,
) -> str:
    # 1. Recall context locally for THIS end-user. SearchResult.contextText is
    #    a ready-to-splice system-message string (profile + working +
    #    permanent + org composed).
    ctx = await client.search_memory(
        user_text,
        user={"id": participant_identity},  # add name/metadata when available
        timeout_ms=250,
    )
    system_prompt = (
        "You are a helpful assistant.\n\n"
        f"## Memory\n{ctx.contextText}"
        if ctx.contextText
        else "You are a helpful assistant."
    )

    # 2. Call your LLM with system_prompt + user_text. (Stand-in above.)
    assistant_text = await your_llm_call(system_prompt, user_text)

    # 3. Persist the completed turn so memory grows over time. Best-effort;
    #    transient errors are swallowed so audio is never blocked.
    await client.save_response(
        user_text=user_text,
        assistant_text=assistant_text,
        user={"id": participant_identity},
        metadata={"framework": "custom"},
    )
    return assistant_text


async def main() -> None:
    # Construct the client ONCE at boot. Internally the SDK keeps a pool of
    # voice-runtime processes keyed by (project_id, user.id): pass user={...}
    # per call and the pool routes to (or spawns) the right runtime.
    client = MemoAirVoiceClient(
        api_key=os.environ["MEMOAIR_API_KEY"],
        project_id=os.environ["MEMOAIR_PROJECT_ID"],
        agent_id=os.environ["MEMOAIR_AGENT_ID"],
    )
    try:
        reply = await run_turn(
            client,
            "How long do I have to return something?",
            participant_identity="user_42",
        )
        print(reply)
    finally:
        await client.aclose()


asyncio.run(main())
```

One client, many users. The SDK pools voice-runtime processes by (project_id, user.id) behind the scenes, so build MemoAirVoiceClient once and pass user={...} on every call. The pool caps at max_concurrent_users=20; once the cap is reached, LRU eviction drops the least recently used runtimes after flushing their sync outbox. Raise the cap for high-concurrency bridges.
See the multi-workspace SaaS guide for the partner / project-per-customer pattern.
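To make the pooling behavior described above concrete, here is a toy model of an LRU pool keyed by (project_id, user_id) that flushes an entry before evicting it. It illustrates the general technique only; `ToyRuntimePool` and its `flush` callback are invented names, and the SDK's real pool (which manages processes, ports, and in-flight calls) is certainly more involved.

```python
from collections import OrderedDict
from typing import Callable, Dict, List, Tuple

PoolKey = Tuple[str, str]  # (project_id, user_id)


class ToyRuntimePool:
    """Illustrative LRU pool; not the SDK's implementation."""

    def __init__(self, max_concurrent_users: int, flush: Callable[[PoolKey], None]) -> None:
        self._max = max_concurrent_users
        self._flush = flush
        # Insertion order doubles as recency order: oldest first, newest last.
        self._handles: "OrderedDict[PoolKey, Dict[str, PoolKey]]" = OrderedDict()

    def acquire(self, project_id: str, user_id: str) -> Dict[str, PoolKey]:
        key = (project_id, user_id)
        if key in self._handles:
            # Reuse the existing handle and mark it as most recently used.
            self._handles.move_to_end(key)
            return self._handles[key]
        if len(self._handles) >= self._max:
            # At the cap: flush the least recently used entry, then drop it.
            oldest = next(iter(self._handles))
            self._flush(oldest)
            del self._handles[oldest]
        handle = {"key": key}  # stand-in for a spawned runtime process
        self._handles[key] = handle
        return handle

    def keys(self) -> List[PoolKey]:
        return list(self._handles)
```

With a cap of 2, acquiring users a, b, a, c in that order evicts b, because the repeat call for a refreshed its recency.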
Pick your framework
v0.3 ships a first-class LiveKit adapter (MemoAirLiveKitAgent). For Pipecat / Vapi / Retell — and any other framework — use the same MemoAirVoiceClient pattern shown above. The canonical reference for non-LiveKit frameworks is examples/livekit/voice_agents/memoair_voice_custom_agent.py in the repo. First-class adapters for those three frameworks land in v0.4.
| Framework | Package | Status |
|---|---|---|
| LiveKit Agents | pip install memoair-livekit | Available now |
| Pipecat | custom-agent pattern (v0.3) | Custom v0.3 · adapter v0.4 |
| Vapi | custom-agent pattern (v0.3) | Custom v0.3 · adapter v0.4 |
| Retell AI | custom-agent pattern (v0.3) | Custom v0.3 · adapter v0.4 |
Troubleshooting
pool.exhausted
The runtime pool is at max_concurrent_users (20 by default) and every handle is mid-call, so LRU eviction has nothing to drop. Raise max_concurrent_users when constructing MemoAirVoiceClient, or wait for an in-flight call to release.
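One way to "wait for an in-flight call to release" on the application side is to gate turns behind an asyncio semaphore sized at or below the pool cap, so excess turns queue instead of reaching the pool. This is a sketch under assumptions: `TurnLimiter` is a hypothetical helper, and whether a process-level limit lines up exactly with what the pool counts as mid-call is not confirmed here.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class TurnLimiter:
    """Caps how many turns run at once; extra turns wait their slot."""

    def __init__(self, max_in_flight: int) -> None:
        # Create this inside the running event loop (for example in main()),
        # which keeps it safe on Python 3.9 as well as newer versions.
        self._sem = asyncio.Semaphore(max_in_flight)

    async def run(self, turn: Callable[[], Awaitable[T]]) -> T:
        # Blocks here while max_in_flight turns are already active.
        async with self._sem:
            return await turn()
```

A caller would wrap each turn, for example `await limiter.run(lambda: run_turn(client, text, participant_identity=uid))`, with the limit set to the same value passed as max_concurrent_users.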
pool.port_exhausted
All ports inside runtime_port_range (default (7878, 7977)) are bound. Widen the range or lower max_concurrent_users.
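A quick way to reason about this error is to compare the size of the port range with the pool cap and then probe which ports are actually free on the host. The helpers below are a diagnostic sketch using only the standard library. They assume the range bounds are inclusive and that runtimes bind on 127.0.0.1; neither is stated on this page.

```python
import socket
from typing import Tuple


def range_capacity(port_range: Tuple[int, int]) -> int:
    """Number of ports in the range, assuming both bounds are inclusive."""
    low, high = port_range
    return max(0, high - low + 1)


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Try to bind the port; a failed bind means something already holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def count_free_ports(port_range: Tuple[int, int], host: str = "127.0.0.1") -> int:
    """Count ports in the range that can currently be bound on this host."""
    low, high = port_range
    return sum(is_port_free(port, host) for port in range(low, high + 1))
```

Under the inclusive assumption the default range holds 100 ports, so a cap of 20 only hits this error when other processes occupy most of the range; `count_free_ports((7878, 7977))` shows how many remain.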
httpx.ConnectError to a runtime port
The pool failed to bring up a new voice-runtime process. Check stderr for the download/spawn step; behind a strict firewall you may need to pre-cache the binary via runtime_binary=.
runtime.timeout on search_memory
The composer exceeded the 250 ms default. Bump with timeout_ms=500 or check the runtime logs for a degraded lane.
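If a slow recall should never hold up a reply, the agent can also enforce its own deadline and fall back to the base prompt when no context arrives in time. The sketch below uses `asyncio.wait_for` around any awaitable that yields context text. The helper names are hypothetical, and it does not depend on how the SDK itself reports runtime.timeout, which this page does not specify.

```python
import asyncio
from typing import Awaitable, Optional

BASE_PROMPT = "You are a helpful assistant."


async def recall_or_none(search: Awaitable[str], budget_s: float) -> Optional[str]:
    """Await the search, but give up and return None once the budget is spent."""
    try:
        return await asyncio.wait_for(search, timeout=budget_s)
    except asyncio.TimeoutError:
        # Recall missed the deadline: answer without memory rather than stall.
        return None


def build_system_prompt(context_text: Optional[str]) -> str:
    """Splice memory into the prompt only when some context was recalled."""
    if context_text:
        return f"{BASE_PROMPT}\n\n## Memory\n{context_text}"
    return BASE_PROMPT
```

The trade-off is that a timed-out turn answers without memory; logging how often that happens helps decide whether to raise timeout_ms or investigate the runtime.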
bootstrap_failed in runtime logs
MemoAir cloud is unreachable. search_memory still works against the last cached projection; new permanent facts won't sync down until connectivity returns.