Vapi × MemoAir Voice Memory
Back a Vapi Custom Knowledge Base with MemoAir's shared org index. Vapi fires a webhook on every user turn; memoair-vapi answers it with documents retrieved from your project's org memory. Retrieval runs locally in the bundled memory runtime: the query is embedded and the org .mv2 snapshot is searched in-process, so lookups stay under 10 ms and never add a round trip to MemoAir cloud during the live call.
How it fits together
- You host the webhook. Run pip install memoair-vapi and start a small FastAPI server (reference app provided). Vapi calls your server; MemoAir does not host it.
- Retrieval is client-side. MemoAirVapiSearch wraps MemoAirVoiceClient, which launches the bundled voice-runtime, pulls the org .mv2 snapshot once, and searches it locally on every turn.
- MemoAir cloud only builds + serves the index. You seed the org index from the dashboard (PDF upload) or via the SDK; the runtime pulls it on bootstrap.
Install
The Python SDK includes the local memory runtime binary — no Docker required. The example webhook server uses FastAPI + uvicorn.

    pip install "memoair-vapi>=0.1.0" "fastapi>=0.110" "uvicorn>=0.27" python-dotenv

memoair-vapi depends on memoair-voice, so the runtime + cloud client are pulled in automatically.
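To confirm that the install pulled in both packages, a quick check with the standard library can list what is present. This is a minimal sketch: it assumes the distribution names match the ones in the pip command, and `installed_versions` is a hypothetical helper, not part of either SDK.

```python
from importlib import metadata


def installed_versions(distributions):
    """Return {distribution name: version string, or None if it is not installed}."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names taken from the pip command; adjust if yours differ.
    wanted = ["memoair-vapi", "memoair-voice", "fastapi", "uvicorn", "python-dotenv"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

If memoair-voice shows as missing, the dependency was not resolved and reinstalling memoair-vapi is the first thing to try.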
Get your API key, project ID, and agent ID
Sign in at dashboard.memoair.space and create (or pick) a project + agent. From the dashboard copy:
- API key: looks like memoair_pk_…. Account-scoped; treat as a secret.
- Project ID: identifies the workspace whose org index you'll search. Safe to log.
- Agent ID: identifies the voice bot inside the project. Safe to log.
Phase 1 of the Vapi integration is org-lane only: every call searches the shared, project-scoped org index. There is no per-caller user_id yet — that arrives with the per-user lanes in a later phase.
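Because the API key is a secret while the project and agent IDs are safe to log, startup logging may want to mask only the key. A minimal sketch follows; `redact_key` is a hypothetical helper, and the only thing it takes from the dashboard is the memoair_pk_ prefix.

```python
def redact_key(api_key: str, visible: int = 4) -> str:
    """Mask an API key for logs, keeping the known prefix and the last few characters."""
    prefix = "memoair_pk_"
    head = prefix if api_key.startswith(prefix) else ""
    body = api_key[len(head):]
    if len(body) <= visible:
        # Too short to reveal anything safely: mask the whole body.
        return head + "*" * len(body)
    return head + "*" * (len(body) - visible) + body[-visible:]
```

A startup log line could then print the project ID and agent ID in full next to `redact_key(api_key)`.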
Environment setup
Create a .env next to your webhook server:

    # MemoAir
    MEMOAIR_API_KEY=memoair_pk_...
    MEMOAIR_PROJECT_ID=proj_...
    MEMOAIR_AGENT_ID=agent_...
    MEMOAIR_SEARCH_TIMEOUT_MS=250

    # Vapi webhook secret: the server.secret you set when registering the
    # Custom Knowledge Base (see "Register the Custom Knowledge Base with Vapi").
    # For local dev you can skip verification instead:
    VAPI_WEBHOOK_SECRET=your-webhook-secret
    # VAPI_SKIP_SIGNATURE_VERIFY=1

Seed the org index
The knowledge Vapi will answer from lives in MemoAir's org index — project-scoped, shared across all callers. Seed it once. Pick either path; both land in the same index.
Option A — Dashboard upload (no code)
At dashboard.memoair.space open your project's Voice → Org index tab, name the index agent-memory, and upload PDFs / text. The dashboard handles extraction, chunking, and embedding. Best for PDFs and larger corpora.
Option B — Code-driven seed
Good for structured facts kept next to your server code. The cloud endpoint is build-or-append: the first call creates the index, later calls add to the same name. Cap per call: 100 docs and 1 MB total.
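The per-call cap means a larger corpus has to be split across several calls. The sketch below shows one way to batch documents. It makes two assumptions that the cap statement does not settle: that "1 MB" can be read conservatively as 1,000,000 bytes, and that size is measured on the UTF-8 JSON form of each document. `batch_docs` and `doc_size` are hypothetical helpers, not SDK functions.

```python
import json
from typing import Iterator

MAX_DOCS_PER_CALL = 100          # cap stated in the text
MAX_BYTES_PER_CALL = 1_000_000   # assumption: "1 MB" read as 10**6 bytes


def doc_size(doc: dict) -> int:
    """Approximate a document's payload size as its UTF-8 JSON length (assumption)."""
    return len(json.dumps(doc, ensure_ascii=False).encode("utf-8"))


def batch_docs(
    docs: list[dict],
    max_docs: int = MAX_DOCS_PER_CALL,
    max_bytes: int = MAX_BYTES_PER_CALL,
) -> Iterator[list[dict]]:
    """Yield consecutive batches that respect both the count cap and the size cap."""
    batch: list[dict] = []
    batch_bytes = 0
    for doc in docs:
        size = doc_size(doc)
        if size > max_bytes:
            raise ValueError(f"document {doc.get('id')!r} alone exceeds {max_bytes} bytes")
        # Close the current batch if adding this document would break either cap.
        if batch and (len(batch) >= max_docs or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += size
    if batch:
        yield batch
```

Since the endpoint is build-or-append, each batch could be passed to create_index in turn under the same index name, in the same way the seed script below sends its single small list.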

    import asyncio
    import os

    from dotenv import load_dotenv
    from memoair_voice import MemoAirVoiceClient

    load_dotenv()

    INDEX_NAME = os.getenv("MEMOAIR_INDEX_NAME", "agent-memory")

    MEMORIES = [
        {
            "id": "return-policy",
            "text": "Returns accepted within 30 days of purchase with a receipt.",
            "metadata": {"kind": "faq", "topic": "returns"},
        },
        {
            "id": "shipping",
            "text": "Standard shipping is 3-5 business days. Express is 1-2 days.",
            "metadata": {"kind": "faq", "topic": "shipping"},
        },
    ]


    async def main() -> None:
        # MemoAirVoiceClient is NOT an async context manager; manage it with
        # try/finally + aclose(). user_id is required for index writes: the
        # backend gates project-API-key callers on an X-User-Id header, which
        # the SDK only stamps when user_id is set. The index stays
        # workspace-scoped and readable by the org-only search at query time.
        client = MemoAirVoiceClient(
            api_key=os.environ["MEMOAIR_API_KEY"],
            project_id=os.environ["MEMOAIR_PROJECT_ID"],
            agent_id=os.environ["MEMOAIR_AGENT_ID"],
            user_id="index-builder",
        )
        try:
            result = await client.create_index(INDEX_NAME, MEMORIES)
            print(f"Indexed {result.chunk_count} chunks (version={result.version})")
        finally:
            await client.aclose()


    if __name__ == "__main__":
        asyncio.run(main())

Save the script as build_index.py and run it:

    python build_index.py

Run the webhook server
MemoAirVapiSearch does the work: handle_request(payload) parses the Vapi webhook, searches the org lane, and returns {"documents": [...]} in Vapi's expected shape. verify_vapi_signature validates the x-vapi-signature header (HMAC-SHA256).

    from __future__ import annotations

    import json
    import os
    from contextlib import asynccontextmanager

    from dotenv import load_dotenv
    from fastapi import FastAPI, Request, Response
    from memoair_vapi import MemoAirVapiSearch, VapiWebhookError, verify_vapi_signature

    load_dotenv()

    WEBHOOK_SECRET = os.getenv("VAPI_WEBHOOK_SECRET", "")
    SKIP_VERIFY = os.getenv("VAPI_SKIP_SIGNATURE_VERIFY", "0").lower() in {"1", "true", "yes"}


    @asynccontextmanager
    async def lifespan(app: FastAPI):
        # One search client per process. It owns the bundled voice-runtime and
        # pulls the org .mv2 snapshot on first use; aclose() tears it down.
        app.state.search = MemoAirVapiSearch(
            api_key=os.environ["MEMOAIR_API_KEY"],
            project_id=os.environ["MEMOAIR_PROJECT_ID"],
            agent_id=os.environ["MEMOAIR_AGENT_ID"],
            search_timeout_ms=int(os.getenv("MEMOAIR_SEARCH_TIMEOUT_MS", "250")),
        )
        try:
            yield
        finally:
            await app.state.search.aclose()


    app = FastAPI(lifespan=lifespan)


    @app.get("/health")
    async def health() -> dict:
        return {"status": "ok"}


    @app.post("/vapi/webhook")
    async def vapi_webhook(request: Request) -> Response:
        raw_body = await request.body()

        # Verify against the raw bytes, before any JSON parsing.
        if not SKIP_VERIFY:
            sig = request.headers.get("x-vapi-signature", "")
            if not verify_vapi_signature(raw_body, sig, WEBHOOK_SECRET):
                return Response(
                    status_code=401,
                    content='{"error":"invalid signature"}',
                    media_type="application/json",
                )

        try:
            payload = json.loads(raw_body or b"{}")
        except json.JSONDecodeError:
            return Response(
                status_code=400,
                content='{"error":"invalid json"}',
                media_type="application/json",
            )

        try:
            # Non-knowledge-base messages are acked with {}. Retrieval failures
            # degrade to {"documents": []} so a live call never breaks.
            result = await request.app.state.search.handle_request(payload)
        except VapiWebhookError:
            return Response(
                status_code=400,
                content='{"error":"malformed payload"}',
                media_type="application/json",
            )

        return Response(status_code=200, content=json.dumps(result), media_type="application/json")

Save it as memoair_vapi_server.py (the module name the uvicorn command below expects). Start it, then expose it publicly:

    uvicorn memoair_vapi_server:app --port 8000

    # In another terminal, expose it (Vapi's cloud must reach your webhook):
    ngrok http 8000   # -> https://<subdomain>.ngrok-free.dev

A complete runnable server + seeder lives in examples/vapi/ in the repo.
Register the Custom Knowledge Base with Vapi
Custom knowledge bases are API-only — Vapi's dashboard does not expose them. Create the KB pointing at your public webhook URL, then attach it to your assistant. When sending the PATCH, include the complete model object — Vapi replaces it wholesale and does not merge nested fields.

    # 1. Create the custom KB -> returns an id
    curl -X POST https://api.vapi.ai/knowledge-base \
      -H "Authorization: Bearer $VAPI_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "provider": "custom-knowledge-base",
        "server": {
          "url": "https://<subdomain>.ngrok-free.dev/vapi/webhook",
          "secret": "your-webhook-secret"
        }
      }'

    # 2. Attach the returned id to your assistant (send the FULL model object)
    curl -X PATCH https://api.vapi.ai/assistant/$ASSISTANT_ID \
      -H "Authorization: Bearer $VAPI_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": {
          "provider": "openai",
          "model": "gpt-4o",
          "knowledgeBaseId": "<KB_ID_FROM_STEP_1>",
          "messages": [
            {
              "role": "system",
              "content": "Answer using the knowledge base context provided."
            }
          ]
        }
      }'

Set server.secret to the same value as VAPI_WEBHOOK_SECRET so signatures verify. For a first smoke test you can set VAPI_SKIP_SIGNATURE_VERIFY=1 and omit the secret.
Test it
Talk to your assistant in the Vapi dashboard (or call its number) and watch requests arrive:

    # Tail the server log: every turn shows POST /vapi/webhook 200
    # Or smoke-test the webhook directly. This request is unsigned, so run the
    # server with VAPI_SKIP_SIGNATURE_VERIFY=1; otherwise it returns 401.
    curl -X POST https://<subdomain>.ngrok-free.dev/vapi/webhook \
      -H 'Content-Type: application/json' \
      -d '{"message":{"type":"knowledge-base-request", "messages":[{"role":"user","content":"What is your return policy?"}]}}'
    # -> {"documents": [{"content": "...", "similarity": 0.9, "uuid": "..."}]}

How retrieval works
- Local embedding + search. On each turn the bundled runtime embeds the query and searches the org .mv2 snapshot in-process, with no per-query call to MemoAir cloud, keeping the hot path fast.
- Vapi's document shape. handle_request maps org hits to {content, similarity, uuid} and returns at most top_k documents.
- Never breaks a call. Non-knowledge-base messages are acked with {}; retrieval failures degrade to {"documents": []} with a 200. Only genuinely malformed payloads return 400.
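The mapping and the fail-open behavior described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the hit fields (`id`, `text`, `score`), the best-first ordering, and the default `top_k` of 5 are all hypothetical.

```python
from typing import Any, Callable


def to_vapi_documents(hits: list[dict[str, Any]], top_k: int = 5) -> dict[str, list[dict[str, Any]]]:
    """Map search hits to {"documents": [{content, similarity, uuid}, ...]}, best first."""
    # Assumed hit shape: {"id": ..., "text": ..., "score": ...}.
    ranked = sorted(hits, key=lambda h: h["score"], reverse=True)[:top_k]
    return {
        "documents": [
            {"content": h["text"], "similarity": h["score"], "uuid": h["id"]}
            for h in ranked
        ]
    }


def safe_documents(search: Callable[[str], list[dict[str, Any]]], query: str, top_k: int = 5) -> dict:
    """Run a search callable; on any failure return an empty result instead of raising."""
    try:
        return to_vapi_documents(search(query), top_k)
    except Exception:
        # Fail open: an empty document list keeps the live call going.
        return {"documents": []}
```

Returning an empty list rather than an error trades answer quality on that turn for call continuity, which matches the behavior the bullets describe.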
Troubleshooting
Search returns empty documents. Expected before the org index is seeded (see "Seed the org index"), or after the runtime evicts an idle session (default 300s); the next turn cold-starts and re-pulls the snapshot. Relevance is carried by the similarity score: off-topic queries still return top_k docs, but with low or negative scores.
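If low-scoring documents are confusing the assistant, one option is to drop them before responding. The sketch below filters a response in Vapi's document shape. `filter_by_similarity` is a hypothetical helper, and the 0.3 threshold is an arbitrary placeholder to tune against your own queries, not a recommended value.

```python
def filter_by_similarity(response: dict, min_similarity: float = 0.3) -> dict:
    """Keep only documents whose similarity reaches the threshold."""
    docs = response.get("documents", [])
    kept = [d for d in docs if d.get("similarity", 0.0) >= min_similarity]
    # Copy the response so the caller's dict is left untouched.
    return {**response, "documents": kept}
```

In the webhook handler this could wrap the result of handle_request before it is serialized. Note that it only makes sense for knowledge-base responses, since other messages are acked with {}.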
401 invalid signature. The server.secret registered with Vapi must match VAPI_WEBHOOK_SECRET. Vapi signs as sha256=<hex>; verify_vapi_signature accepts both that and a bare hex digest.
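To debug a mismatch, it can help to compute the expected digest yourself and compare it with the header Vapi sent. The following is a standard-library sketch of an HMAC-SHA256 check, not the code inside verify_vapi_signature. It assumes the signature covers the raw request body exactly as received.

```python
import hashlib
import hmac


def expected_signature(raw_body: bytes, secret: str) -> str:
    """Hex HMAC-SHA256 of the raw request body, keyed with the webhook secret."""
    return hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()


def signature_matches(raw_body: bytes, header_value: str, secret: str) -> bool:
    """Compare a header value (bare hex or 'sha256=<hex>') with the expected digest."""
    received = header_value.strip()
    if received.startswith("sha256="):
        received = received[len("sha256="):]
    expected = expected_signature(raw_body, secret)
    # Constant-time comparison on bytes avoids leaking how many characters matched.
    return hmac.compare_digest(received.lower().encode("utf-8"), expected.encode("utf-8"))
```

The same digest can be used to sign a manual smoke test by sending it in an x-vapi-signature header. The body must be byte-for-byte identical to what was signed, so re-serialized JSON will not verify.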
Vapi can't reach the webhook. The URL must be public (ngrok or a deployed host) and end in your route (e.g. /vapi/webhook). localhost will not work — Vapi calls from its cloud.
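A quick sanity check on the registered URL can catch the common mistakes before a test call. This is a rough sketch: `webhook_url_problems` is a hypothetical helper, the default route matches the example server, and it only inspects the URL string without probing the network.

```python
import ipaddress
from urllib.parse import urlparse


def webhook_url_problems(url: str, route: str = "/vapi/webhook") -> list[str]:
    """Return a list of reasons the URL is unlikely to be reachable from Vapi's cloud."""
    problems = []
    parsed = urlparse(url)
    host = parsed.hostname or ""
    if parsed.scheme not in ("http", "https"):
        problems.append("URL needs an http(s) scheme")
    if not host:
        problems.append("URL has no host")
    elif host == "localhost":
        problems.append("localhost is not reachable from Vapi's cloud")
    else:
        try:
            ip = ipaddress.ip_address(host)
        except ValueError:
            ip = None  # a DNS name, not a literal IP address
        if ip is not None and (ip.is_loopback or ip.is_private):
            problems.append("host is a loopback or private address")
    if parsed.path.rstrip("/") != route:
        problems.append(f"path does not end in the webhook route {route}")
    return problems
```

An empty list means the URL passes these static checks only; the tunnel or host still has to be up.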
Patch wiped my assistant config. The assistant PATCH replaces the entire model object. GET /assistant/$ASSISTANT_ID first and resend its existing messages / settings alongside knowledgeBaseId.
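The merge step can be done in code so that nothing from the existing model is dropped. A minimal sketch, assuming the GET response is a JSON object with a top-level "model" key (as the PATCH body above suggests); `model_with_knowledge_base` is a hypothetical helper.

```python
import copy


def model_with_knowledge_base(assistant: dict, knowledge_base_id: str) -> dict:
    """Build a PATCH body that keeps the existing model settings and adds the KB id."""
    # Deep-copy so nested fields such as messages are carried over without
    # mutating the object returned by the GET request.
    model = copy.deepcopy(assistant.get("model") or {})
    model["knowledgeBaseId"] = knowledge_base_id
    return {"model": model}
```

Feed it the parsed JSON from GET /assistant/$ASSISTANT_ID and send the returned dict as the PATCH body, so the existing provider, model name, and messages travel along with the new knowledgeBaseId.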