Voice Memory Quickstart

Plug MemoAir into any LLM-driven voice agent in five minutes. Install the Python or TypeScript SDK, grab keys from the dashboard, seed an org index, then call search_memory before each reply and save_response after. The local memory runtime is bundled and auto-launched on first use — no Docker step.

Where to go after this page

On LiveKit? LiveKit integration — drop-in MemoAirLiveKitAgent.
On your own framework? Python SDK reference — full method signatures and options for MemoAirVoiceClient.
Building a TypeScript LiveKit Agent? JS/TypeScript SDK reference — npm packages memoair-voice and memoair-livekit.

Prerequisites

Python 3.9 or higher.
A MemoAir account API key (memoair_pk_…), a project ID (proj_…), and an agent ID (agent_…) from dashboard.memoair.space.

Install

Pick your language. Both SDK families use the same local voice-runtime contract and auto-resolve the runtime on first use.

terminal

BASH

# Python
pip install "memoair-voice>=0.3.2"
 
# Node / TypeScript
npm install memoair-voice

Using LiveKit? pip install "memoair-livekit>=0.3.2" or npm install memoair-livekit adds the drop-in MemoAirLiveKitAgent on top.

Get keys from the dashboard

API key — memoair_pk_…. Treat as a secret. Account-scoped: one per account, shared across every project and agent in your org.
Project ID — proj_…. Identifies the workspace. Safe to log.
Agent ID — agent_…. Identifies the voice bot inside the project. Safe to log.

Configure your .env

.env

BASH

MEMOAIR_API_KEY=memoair_pk_...
MEMOAIR_PROJECT_ID=proj_...
MEMOAIR_AGENT_ID=agent_...

The user ID is supplied per call, not via env. Resolve it from your transport (LiveKit participant identity, websocket session, Vapi metadata, etc.) and pass it as user={"id": ...} to each search_memory / save_response call so every end user gets an isolated memory lane.
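A boot-time check on the three variables can catch a missing or swapped value before the first call. The sketch below is not part of the SDK: MemoAirConfig and load_config are hypothetical helper names, and the only facts it relies on are the variable names and ID prefixes listed above. It also masks the API key in its repr, since that is the one value that should stay out of logs.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Prefixes as listed in "Get keys from the dashboard".
_EXPECTED_PREFIXES = {
    "MEMOAIR_API_KEY": "memoair_pk_",
    "MEMOAIR_PROJECT_ID": "proj_",
    "MEMOAIR_AGENT_ID": "agent_",
}


@dataclass(frozen=True, repr=False)
class MemoAirConfig:  # hypothetical helper, not an SDK class
    api_key: str
    project_id: str
    agent_id: str

    def __repr__(self) -> str:
        # The API key is a secret; project and agent IDs are safe to log.
        return (
            "MemoAirConfig(api_key='memoair_pk_***', "
            f"project_id={self.project_id!r}, agent_id={self.agent_id!r})"
        )


def load_config(env: Mapping[str, str] = os.environ) -> MemoAirConfig:
    """Read the three MEMOAIR_* variables and fail fast on obvious mistakes."""
    values = {}
    for name, prefix in _EXPECTED_PREFIXES.items():
        value = env.get(name, "").strip()
        if not value:
            raise RuntimeError(f"{name} is not set")
        if not value.startswith(prefix):
            raise RuntimeError(f"{name} should start with {prefix!r}")
        values[name] = value
    return MemoAirConfig(
        api_key=values["MEMOAIR_API_KEY"],
        project_id=values["MEMOAIR_PROJECT_ID"],
        agent_id=values["MEMOAIR_AGENT_ID"],
    )
```

The returned fields map directly onto the api_key, project_id, and agent_id arguments used when constructing MemoAirVoiceClient.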

Seed a knowledge index

Static knowledge (FAQs, runbooks, profile facts) lives in MemoAir's org index — project-scoped, shared across all users. Pick either path; both land in the same index.

Option A — Dashboard upload (no code)

At dashboard.memoair.space open your project → Knowledge → New index. Name it agent-memory, upload PDFs / paste raw notes, and the dashboard chunks and embeds for you. Best for non-trivial corpora and anything beyond the 100-doc / 1 MB per-call API cap.

Option B — Code-driven seed

build_index.py

PYTHON

import asyncio
import os
from memoair_voice import MemoAirVoiceClient
 
 
async def main() -> None:
    # create_index is a cloud-side call — it doesn't need a runtime, so the
    # build script can construct the client without touching user memory at
    # all. aclose() releases the cloud HTTP client cleanly.
    client = MemoAirVoiceClient(
        api_key=os.environ["MEMOAIR_API_KEY"],
        project_id=os.environ["MEMOAIR_PROJECT_ID"],
        agent_id=os.environ["MEMOAIR_AGENT_ID"],
    )
    try:
        # First call CREATES the index; later calls APPEND (or replace by id).
        # id is the stable per-document key; metadata is free-form tags.
        result = await client.create_index(
            "agent-memory",
            documents=[
                {"id": "returns", "text": "Returns accepted within 30 days with a receipt.", "metadata": {"topic": "returns"}},
                {"id": "shipping", "text": "Express shipping is 1-2 days.", "metadata": {"topic": "shipping"}},
            ],
        )
        print(f"Indexed {result.chunk_count} chunks (version={result.version})")
    finally:
        await client.aclose()
 
 
asyncio.run(main())
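When a code-driven seed grows past the per-call cap mentioned under Option A (100 documents / 1 MB), the documents have to be split across several calls. A minimal sketch of such a splitter follows. batch_documents is a hypothetical helper, and it assumes the 1 MB limit means 10^6 bytes measured on the UTF-8 JSON encoding of each document, which this page does not specify.

```python
import json
from typing import Iterable, Iterator

MAX_DOCS_PER_CALL = 100          # per-call document cap quoted above
MAX_BYTES_PER_CALL = 1_000_000   # assumption: "1 MB" = 10**6 bytes of JSON


def batch_documents(
    documents: Iterable[dict],
    max_docs: int = MAX_DOCS_PER_CALL,
    max_bytes: int = MAX_BYTES_PER_CALL,
) -> Iterator[list[dict]]:
    """Yield lists of documents that each stay under both caps."""
    batch: list[dict] = []
    batch_bytes = 0
    for doc in documents:
        size = len(json.dumps(doc, ensure_ascii=False).encode("utf-8"))
        if size > max_bytes:
            raise ValueError(
                f"document {doc.get('id')!r} alone exceeds {max_bytes} bytes"
            )
        # Close the current batch if adding this document would break a cap.
        if batch and (len(batch) >= max_docs or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += size
    if batch:
        yield batch
```

Each batch could then be passed as documents= to create_index in build_index.py. Per the comment there, calls after the first append to the same index.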

Wire MemoAir into your agent

The default pattern: search before each reply (auto-inject as a system message), save after each reply. The example below is framework-agnostic; for the LiveKit drop-in see the LiveKit page.

agent.py

PYTHON

import asyncio
import os
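from memoair_voice import MemoAirVoiceClient


async def your_llm_call(system_prompt: str, user_text: str) -> str:
    # Placeholder so the example runs end to end; swap in your real LLM client.
    return f"(stub reply to: {user_text})"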
 
 
async def run_turn(
    client: MemoAirVoiceClient,
    user_text: str,
    *,
    participant_identity: str,
) -> str:
    # 1. Recall context locally for THIS end-user. SearchResult.contextText is
    #    a ready-to-splice system-message string (profile + working +
    #    permanent + org composed).
    ctx = await client.search_memory(
        user_text,
        user={"id": participant_identity},  # add name/metadata when available
        timeout_ms=250,
    )
    base_prompt = "You are a helpful assistant."
    system_prompt = (
        f"{base_prompt}\n\n## Memory\n{ctx.contextText}"
        if ctx.contextText
        else base_prompt
    )
 
    # 2. Call your LLM with system_prompt + user_text. (Stand-in below.)
    assistant_text = await your_llm_call(system_prompt, user_text)
 
    # 3. Persist the completed turn so memory grows over time. Best-effort;
    #    transient errors are swallowed so audio is never blocked.
    await client.save_response(
        user_text=user_text,
        assistant_text=assistant_text,
        user={"id": participant_identity},
        metadata={"framework": "custom"},
    )
    return assistant_text
 
 
async def main() -> None:
    # Construct the client ONCE at boot. Internally the SDK keeps a pool of
    # voice-runtime processes keyed by (project_id, user.id). Pass user={...}
    # on each call and the pool routes to (or spawns) the right runtime.
    client = MemoAirVoiceClient(
        api_key=os.environ["MEMOAIR_API_KEY"],
        project_id=os.environ["MEMOAIR_PROJECT_ID"],
        agent_id=os.environ["MEMOAIR_AGENT_ID"],
    )
    try:
        reply = await run_turn(
            client,
            "How long do I have to return something?",
            participant_identity="user_42",
        )
        print(reply)
    finally:
        await client.aclose()
 
 
asyncio.run(main())

One client, many users. The SDK pools voice-runtime processes by (project_id, user.id) behind the scenes, so build MemoAirVoiceClient once and pass user={...} on every call. The pool is capped by max_concurrent_users (default 20). Once the cap is reached, the least recently used runtime is evicted after its sync outbox is flushed. Raise the cap for high-concurrency bridges. See the multi-workspace SaaS guide for the partner / project-per-customer pattern.
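The pooling behaviour described above can be pictured with a toy model. This is an illustrative sketch only, not the SDK's implementation: RuntimePool, RuntimeHandle, and PoolExhausted are hypothetical names, and the eviction rule (skip runtimes that are mid-call, flush before dropping) is inferred from the description here and from the pool.exhausted entry under Troubleshooting.

```python
from collections import OrderedDict
from dataclasses import dataclass


class PoolExhausted(RuntimeError):
    """Stand-in for the pool.exhausted condition."""


@dataclass
class RuntimeHandle:  # hypothetical stand-in for one voice-runtime process
    key: tuple
    in_flight: int = 0
    flushed: bool = False

    def flush_outbox(self) -> None:
        self.flushed = True  # a real runtime would push pending writes here


class RuntimePool:
    def __init__(self, max_concurrent_users: int = 20) -> None:
        self.max_concurrent_users = max_concurrent_users
        self._handles: "OrderedDict[tuple, RuntimeHandle]" = OrderedDict()

    def acquire(self, project_id: str, user_id: str) -> RuntimeHandle:
        key = (project_id, user_id)
        handle = self._handles.get(key)
        if handle is None:
            if len(self._handles) >= self.max_concurrent_users:
                self._evict_one()
            handle = RuntimeHandle(key)
            self._handles[key] = handle
        self._handles.move_to_end(key)  # mark as most recently used
        handle.in_flight += 1
        return handle

    def release(self, handle: RuntimeHandle) -> None:
        handle.in_flight -= 1

    def _evict_one(self) -> None:
        # Walk from least to most recently used; skip handles that are mid-call.
        for key, handle in self._handles.items():
            if handle.in_flight == 0:
                handle.flush_outbox()
                del self._handles[key]
                return
        raise PoolExhausted("every runtime is mid-call; nothing to evict")
```

In this model a larger max_concurrent_users trades memory and ports for fewer evictions, which is the same trade-off the real setting exposes.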

Pick your framework

v0.3 ships a first-class LiveKit adapter (MemoAirLiveKitAgent). For Pipecat / Vapi / Retell — and any other framework — use the same MemoAirVoiceClient pattern shown above. The canonical reference for non-LiveKit frameworks is examples/livekit/voice_agents/memoair_voice_custom_agent.py in the repo. First-class adapters for those three frameworks land in v0.4.

Framework	Package	Status
LiveKit Agents	`pip install memoair-livekit`	Available now
Pipecat	`# custom-agent pattern (v0.3)`	Custom v0.3 · adapter v0.4
Vapi	`# custom-agent pattern (v0.3)`	Custom v0.3 · adapter v0.4
Retell AI	`# custom-agent pattern (v0.3)`	Custom v0.3 · adapter v0.4
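For frameworks without an adapter, one way to reuse the pattern is to wrap whatever coroutine the framework uses to produce a reply. The sketch below is an assumption-laden illustration rather than a shipped adapter: with_memory and the MemoryClient protocol are hypothetical names, the keyword arguments simply mirror the calls in agent.py, and the framework's reply function is assumed to take (system_prompt, user_text).

```python
import asyncio
from typing import Any, Awaitable, Callable, Protocol


class MemoryClient(Protocol):
    # Only the two calls the quickstart uses; keyword names mirror agent.py.
    async def search_memory(
        self, query: str, *, user: dict, timeout_ms: int
    ) -> Any: ...

    async def save_response(
        self, *, user_text: str, assistant_text: str, user: dict, metadata: dict
    ) -> Any: ...


def with_memory(
    client: MemoryClient,
    generate: Callable[[str, str], Awaitable[str]],
    framework: str,
) -> Callable[[str, str], Awaitable[str]]:
    """Wrap a (system_prompt, user_text) -> reply coroutine with recall and save."""
    base = "You are a helpful assistant."

    async def turn(user_id: str, user_text: str) -> str:
        ctx = await client.search_memory(
            user_text, user={"id": user_id}, timeout_ms=250
        )
        memory = getattr(ctx, "contextText", "")
        prompt = f"{base}\n\n## Memory\n{memory}" if memory else base
        reply = await generate(prompt, user_text)
        await client.save_response(
            user_text=user_text,
            assistant_text=reply,
            user={"id": user_id},
            metadata={"framework": framework},
        )
        return reply

    return turn
```

A Pipecat, Vapi, or Retell bridge would call the returned turn(user_id, user_text) from its own per-utterance callback, resolving user_id from that framework's session data.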

Troubleshooting

pool.exhausted

The runtime pool is at max_concurrent_users (20 by default) and every handle is mid-call, so LRU eviction has nothing to drop. Raise max_concurrent_users when constructing MemoAirVoiceClient, or wait for an in-flight call to release.

pool.port_exhausted

All ports inside runtime_port_range (default (7878, 7977)) are bound. Widen the range or lower max_concurrent_users.
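Before widening the range, it can help to see which ports are actually taken. The probe below is a diagnostic sketch using only the standard library. bound_ports is a hypothetical helper, and it assumes the range is inclusive at both ends and that runtimes listen on 127.0.0.1, neither of which is stated on this page.

```python
import socket


def bound_ports(port_range: tuple[int, int], host: str = "127.0.0.1") -> list[int]:
    """Return the ports in the inclusive range that cannot be bound right now."""
    low, high = port_range
    busy = []
    for port in range(low, high + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                busy.append(port)  # something else already holds this port
    return busy
```

Calling bound_ports((7878, 7977)) on the affected host lists the occupied ports, which shows whether other processes are competing for the range or the pool itself has filled it.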

httpx.ConnectError to a runtime port

The pool failed to bring up a new voice-runtime process. Check stderr for errors from the download or spawn step; behind a strict firewall you may need to pre-cache the binary and point the client at it with the runtime_binary option.

runtime.timeout on search_memory

The composer exceeded the 250 ms default. Bump with timeout_ms=500 or check the runtime logs for a degraded lane.
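If a slow lookup should never delay a spoken reply, the caller can also enforce its own budget and continue without memory. This page does not show how the SDK surfaces runtime.timeout, so the sketch below guards on the caller side with asyncio.wait_for instead. recall_or_skip is a hypothetical helper, and search stands for any zero-argument coroutine function that returns the context string.

```python
import asyncio
from typing import Awaitable, Callable


async def recall_or_skip(
    search: Callable[[], Awaitable[str]],
    budget_ms: int = 250,
) -> str:
    """Return recalled context, or "" if the lookup overruns its budget."""
    try:
        return await asyncio.wait_for(search(), timeout=budget_ms / 1000)
    except asyncio.TimeoutError:
        # Degrade to a memory-free reply rather than stall the turn.
        return ""
```

An empty string fits the agent.py pattern, where a falsy context falls back to the plain system prompt.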

bootstrap_failed in runtime logs

MemoAir cloud is unreachable. search_memory still works against the last cached projection; new permanent facts won't sync down until connectivity returns.