Local Runtime HTTP API
The voice runtime exposes a loopback HTTP surface at http://127.0.0.1:7878 by default. The wire contract is frozen at v1; the canonical source is docs/api/voice-sdk-runtime.md.
Endpoints
Endpoint                      Purpose
/v1/runtime/health            Liveness probe + boot-pinned identity
/v1/runtime/session/start     Open one voice session; loads profile + permanent + org
/v1/runtime/search_memory     Per-turn recall. Local-only. Returns composed contextText
/v1/runtime/turn/after        Persist a completed turn; enqueues cloud sync
/v1/runtime/session/end       Flush working brief; push pending outbox
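The endpoint paths above can be collected into a small helper for a hand-rolled client. This is a minimal sketch using only the Python standard library, not the SDK's own implementation; the names `ENDPOINTS`, `runtime_url`, and `post_json` are hypothetical. This page shows POST for session/start, search_memory, and turn/after; the method for session/end is not shown here, so treat its use with `post_json` as an assumption.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:7878"  # default loopback address from this page

# Paths copied from the endpoint table; the dictionary keys are hypothetical.
ENDPOINTS = {
    "health": "/v1/runtime/health",
    "session_start": "/v1/runtime/session/start",
    "search_memory": "/v1/runtime/search_memory",
    "turn_after": "/v1/runtime/turn/after",
    "session_end": "/v1/runtime/session/end",
}


def runtime_url(name: str, base_url: str = BASE_URL) -> str:
    """Return the absolute URL for a named runtime endpoint."""
    return base_url.rstrip("/") + ENDPOINTS[name]


def post_json(name: str, payload: dict, timeout_s: float = 1.0) -> dict:
    """POST a JSON body to a runtime endpoint and decode the JSON reply."""
    request = urllib.request.Request(
        runtime_url(name),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))
```

A turn would then typically call `post_json("search_memory", ...)` before the agent answers and `post_json("turn_after", ...)` once the turn completes, matching the order of the table.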
GET /v1/runtime/health
Response:

    {
      "status": "ok",
      "version": "0.1.0",
      "active_sessions": 1,
      "uptime_ms": 12345,
      "workspaceId": "ws_123",
      "userId": "user_42"
    }

The SDK reads workspaceId and userId to detect identity-mismatched sidecars before any session starts.
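The identity check described here can be approximated outside the SDK. The sketch below is an illustration under stated assumptions, not the SDK's actual logic: `fetch_health` and `identity_mismatches` are hypothetical names, and the comparison is a plain equality check on the two fields shown in the health payload.

```python
import json
import urllib.request


def fetch_health(base_url: str = "http://127.0.0.1:7878", timeout_s: float = 1.0) -> dict:
    """GET /v1/runtime/health and return the decoded JSON body."""
    url = base_url.rstrip("/") + "/v1/runtime/health"
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


def identity_mismatches(health: dict, workspace_id: str, user_id: str) -> list:
    """Return one message per identity field that differs from what the caller expects."""
    expected = {"workspaceId": workspace_id, "userId": user_id}
    return [
        f"{field}: expected {want!r}, runtime reports {health.get(field)!r}"
        for field, want in expected.items()
        if health.get(field) != want
    ]
```

A caller would run `identity_mismatches(fetch_health(), "ws_123", "user_42")` before opening a session and refuse to proceed if the returned list is non-empty.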
POST /v1/runtime/session/start
Request:

    {
      "sessionId": "voice-session-1",
      "workspaceId": "ws_123",
      "userId": "user_42",
      "satham": { "variant": "default", "embedDim": 384 },
      "metadata": { "agentFramework": "livekit" }
    }

Response:

    {
      "sessionId": "voice-session-1",
      "bootstrap": {
        "profileVersion": 17,
        "permanentManifestVersion": 92,
        "orgManifestVersion": 14,
        "syncToken": "stoken_abc",
        "budgets": {
          "contextCharCap": 8000,
          "perLaneCharCap": 2400,
          "topKPermanent": 6,
          "topKOrg": 4,
          "topKWorking": 4
        }
      },
      "paths": {
        "sessionJsonl": "/var/lib/memoair/voice/ws_123/user_42/voice-session-1/session.jsonl",
        "workingBrief": "/var/lib/memoair/voice/ws_123/user_42/voice-session-1/working_brief.json"
      }
    }

POST /v1/runtime/search_memory
The single v1 recall surface. Must complete locally; never calls cloud or a hosted LLM. Default timeout 250 ms.
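A client that calls this endpoint directly might enforce the 250 ms budget on its own side. The sketch below assumes that the timeout can be applied client-side and that falling back to an empty context on timeout is acceptable; neither is confirmed by this page. `build_search_request` and `search_memory` are hypothetical helpers, the field names follow the example payload for this endpoint, and the default `topK` values mirror that example.

```python
import json
import urllib.error
import urllib.request

SEARCH_URL = "http://127.0.0.1:7878/v1/runtime/search_memory"
DEFAULT_TIMEOUT_S = 0.250  # the documented 250 ms default, applied client-side (assumption)


def build_search_request(session_id: str, turn_id: str, query: str, top_k: dict = None) -> dict:
    """Assemble a search_memory request body using the documented field names."""
    return {
        "sessionId": session_id,
        "turnId": turn_id,
        "query": query,
        "intent": "answer_current_user",
        "includeSources": True,
        "topK": top_k or {"permanent": 6, "org": 4, "working": 4},
    }


def search_memory(payload: dict, timeout_s: float = DEFAULT_TIMEOUT_S) -> str:
    """Return contextText, or an empty string if the runtime is slow or unreachable."""
    request = urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout_s) as response:
            body = json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError:
        raise  # an HTTP error status; let the caller inspect the response body
    except OSError:
        return ""  # timeout or connection failure: answer without recalled context
    return body.get("contextText", "")
```

Returning an empty string keeps the voice turn moving when recall misses its budget; a stricter client could raise instead.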
Request:

    {
      "sessionId": "voice-session-1",
      "turnId": "turn-7",
      "query": "what timezone does the user prefer",
      "intent": "answer_current_user",
      "includeSources": true,
      "topK": { "permanent": 6, "org": 4, "working": 4 }
    }

Response:

    {
      "contextText": "## profile\n…\n## working\n…\n## permanent\n…\n## org\n…",
      "profile": { "version": 17, "summary": "Sunit, IST, prefers detailed answers." },
      "working": [
        { "kind": "brief", "text": "Discussing voice SDK design.", "score": 1.0 }
      ],
      "permanent": [
        {
          "memoryId": "mem_abc",
          "matchKey": "user.location.current",
          "text": "User lives in Delhi.",
          "score": 0.82,
          "validFrom": "2026-05-10T00:00:00Z"
        }
      ],
      "org": [
        { "docId": "doc_42", "chunkId": "doc_42:7", "text": "Standup is 10:30 IST.", "score": 0.71 }
      ],
      "trace": {
        "profileMs": 0.1,
        "workingMs": 1.2,
        "permanentMs": 3.4,
        "orgMs": 4.7,
        "totalMs": 8.8,
        "degraded": false
      }
    }

POST /v1/runtime/turn/after
Request:

    {
      "sessionId": "voice-session-1",
      "turnId": "turn-7",
      "userText": "What time is standup?",
      "assistantText": "10:30 IST.",
      "toolCalls": [{ "name": "search_memory", "latencyMs": 8.8, "hits": 3 }],
      "metadata": { "interrupted": false }
    }

Response:

    { "status": "ok", "localEventId": "evt_local_123" }

Error envelope
    { "error": { "code": "runtime.timeout", "message": "...", "details": {} } }

Standard codes: runtime.bad_request, runtime.session_not_found, runtime.bootstrap_failed, runtime.projection_unavailable, runtime.identity_mismatch, runtime.internal.
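A caller can turn this envelope into a typed exception. The sketch below is one possible approach, not part of the SDK: `RuntimeApiError` and `parse_error` are hypothetical names. The set of standard codes is copied from the list above; runtime.timeout appears in the envelope example but not in that list, so this sketch reports it as non-standard.

```python
import json
from typing import Optional

# The six codes listed as standard on this page.
STANDARD_CODES = frozenset({
    "runtime.bad_request",
    "runtime.session_not_found",
    "runtime.bootstrap_failed",
    "runtime.projection_unavailable",
    "runtime.identity_mismatch",
    "runtime.internal",
})


class RuntimeApiError(Exception):
    """Hypothetical exception carrying the fields of the error envelope."""

    def __init__(self, code: str, message: str, details: Optional[dict] = None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.details = details or {}

    @property
    def is_standard(self) -> bool:
        """True when the code is one of the listed standard codes."""
        return self.code in STANDARD_CODES


def parse_error(body: str) -> Optional[RuntimeApiError]:
    """Return a RuntimeApiError if body is an error envelope, otherwise None."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return None
    error = data.get("error") if isinstance(data, dict) else None
    if not isinstance(error, dict):
        return None
    return RuntimeApiError(
        str(error.get("code", "")),
        str(error.get("message", "")),
        error.get("details"),
    )
```

Checking `is_standard` lets a client branch on known codes while still surfacing codes it does not recognize.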
Python SDK surface
Two packages ship in v0.3: memoair-voice (the high-level client, the internal runtime pool, and the low-level wrapper) and memoair-livekit (the drop-in livekit.agents.Agent subclass). For Pipecat, Vapi, Retell, and any other non-LiveKit framework, use MemoAirVoiceClient directly; see the LiveKit Option B walkthrough or the canonical custom-agent reference at examples/livekit/voice_agents/memoair_voice_custom_agent.py.
    # Default: high-level client
    from memoair_voice import (
        MemoAirVoiceClient,
        SearchResult,
        IndexBuildResult,
        MemoAirVoiceMemoryError,
    )

    # LiveKit drop-in (renamed in v0.3; was MemoAirVoiceAgent)
    from memoair_livekit import MemoAirLiveKitAgent

    # Low-level / power users
    from memoair_voice import (
        MemoAirVoiceMemory,
        MemoAirVoiceMemorySync,
        MemoryTool,
        MemoryToolSync,
    )

The shipped contract: construct MemoAirVoiceClient once at boot with api_key="memoair_pk_…", project_id="proj_…", and agent_id="agent_…"; pass the end-user identifier per call to search_memory and save_response. The internal RuntimePool spawns one voice-runtime per (project_id, end-user) pair, bounded by max_concurrent_users (default 20). Cloud calls carry Authorization: Bearer memoair_pk_…, X-Project-Id, X-Agent-Id, and X-User-Id headers.

Method-by-method reference: Python SDK Reference.
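The four cloud headers named above can be assembled as follows. This is an illustrative sketch only: `cloud_headers` is a hypothetical helper rather than an SDK function, and rejecting empty values is an assumption, not documented behavior.

```python
def cloud_headers(api_key: str, project_id: str, agent_id: str, user_id: str) -> dict:
    """Build the four headers this page says every cloud call carries."""
    fields = {
        "api_key": api_key,
        "project_id": project_id,
        "agent_id": agent_id,
        "user_id": user_id,
    }
    # Fail early on a blank identifier instead of sending a half-formed request.
    missing = [name for name, value in fields.items() if not value]
    if missing:
        raise ValueError("missing required values: " + ", ".join(missing))
    return {
        "Authorization": f"Bearer {api_key}",
        "X-Project-Id": project_id,
        "X-Agent-Id": agent_id,
        "X-User-Id": user_id,
    }
```

The first three values are fixed at boot while the end-user identifier changes per call, so a client could cache the first three headers and fill in X-User-Id for each request.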