Vapi · Custom Knowledge Base

Vapi × MemoAir Voice Memory

Back a Vapi Custom Knowledge Base with MemoAir's shared org index. Vapi fires a webhook on every user turn; memoair-vapi answers it with documents retrieved from your project's org memory. Retrieval runs locally in the bundled memory runtime — the query is embedded and the org .mv2 snapshot is searched in-process — so lookups stay sub-10ms and never add a network hop to the live call.

How it fits together

  • You host the webhook. pip install memoair-vapi and run a small FastAPI server (reference app provided). Vapi calls your server — MemoAir does not host it.
  • Retrieval is client-side. MemoAirVapiSearch wraps MemoAirVoiceClient, which launches the bundled voice-runtime, pulls the org .mv2 snapshot once, and searches it locally on every turn.
  • MemoAir cloud only builds + serves the index. You seed the org index from the dashboard (PDF upload) or via the SDK; the runtime pulls it on bootstrap.
1

Install

The Python SDK includes the local memory runtime binary — no Docker required. The example webhook server uses FastAPI + uvicorn.

terminal
BASH
pip install "memoair-vapi>=0.1.0" "fastapi>=0.110" "uvicorn>=0.27" python-dotenv

memoair-vapi depends on memoair-voice, so the runtime + cloud client are pulled in automatically.

2

Get your API key, project ID, and agent ID

Sign in at dashboard.memoair.space and create (or pick) a project + agent. From the dashboard copy:

  • API key — looks like memoair_pk_…. Account-scoped; treat as a secret.
  • Project ID — identifies the workspace whose org index you'll search. Safe to log.
  • Agent ID — identifies the voice bot inside the project. Safe to log.

Phase 1 of the Vapi integration is org-lane only: every call searches the shared, project-scoped org index. There is no per-caller user_id yet — that arrives with the per-user lanes in a later phase.

3

Environment setup

Create a .env next to your webhook server:

.env
BASH
# MemoAir
MEMOAIR_API_KEY=memoair_pk_...
MEMOAIR_PROJECT_ID=proj_...
MEMOAIR_AGENT_ID=agent_...
MEMOAIR_SEARCH_TIMEOUT_MS=250
 
# Vapi webhook secret (from the Custom Knowledge Base config in step 6).
# For local dev you can skip verification instead:
VAPI_WEBHOOK_SECRET=your-webhook-secret
# VAPI_SKIP_SIGNATURE_VERIFY=1
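A missing variable otherwise only surfaces as a KeyError when the server boots. As a minimal sketch, a startup check could gather everything in one place. Only the variable names come from the .env above; load_settings and Settings are hypothetical helpers, not part of memoair-vapi.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED = ("MEMOAIR_API_KEY", "MEMOAIR_PROJECT_ID", "MEMOAIR_AGENT_ID")


@dataclass(frozen=True)
class Settings:
    api_key: str
    project_id: str
    agent_id: str
    search_timeout_ms: int
    webhook_secret: str
    skip_signature_verify: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Hypothetical helper: validate the .env values before the server starts."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))

    skip = env.get("VAPI_SKIP_SIGNATURE_VERIFY", "0").lower() in {"1", "true", "yes"}
    secret = env.get("VAPI_WEBHOOK_SECRET", "")
    if not skip and not secret:
        # Verification is on, but there is no secret to verify against.
        raise RuntimeError("set VAPI_WEBHOOK_SECRET or VAPI_SKIP_SIGNATURE_VERIFY=1")

    return Settings(
        api_key=env["MEMOAIR_API_KEY"],
        project_id=env["MEMOAIR_PROJECT_ID"],
        agent_id=env["MEMOAIR_AGENT_ID"],
        search_timeout_ms=int(env.get("MEMOAIR_SEARCH_TIMEOUT_MS", "250")),
        webhook_secret=secret,
        skip_signature_verify=skip,
    )
```

Call load_settings() once after load_dotenv() so a misconfigured deployment fails at startup rather than on the first live call.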
4

Seed the org index

The knowledge Vapi will answer from lives in MemoAir's org index — project-scoped, shared across all callers. Seed it once. Pick either path; both land in the same index.

Option A — Dashboard upload (no code)

At dashboard.memoair.space open your project's Voice Org index tab, name the index agent-memory, and upload PDFs / text. The dashboard handles extraction, chunking, and embedding. Best for PDFs and larger corpora.

Option B — Code-driven seed

Good for structured facts kept next to your server code. The cloud endpoint is build-or-append: the first call creates the index, later calls add to the same name. Cap per call: 100 docs and 1 MB total.

build_index.py
PYTHON
import asyncio
import os
 
from dotenv import load_dotenv
from memoair_voice import MemoAirVoiceClient
 
load_dotenv()
 
INDEX_NAME = os.getenv("MEMOAIR_INDEX_NAME", "agent-memory")
 
MEMORIES = [
    {
        "id": "return-policy",
        "text": "Returns accepted within 30 days of purchase with a receipt.",
        "metadata": {"kind": "faq", "topic": "returns"},
    },
    {
        "id": "shipping",
        "text": "Standard shipping is 3-5 business days. Express is 1-2 days.",
        "metadata": {"kind": "faq", "topic": "shipping"},
    },
]
 
 
async def main() -> None:
    # MemoAirVoiceClient is NOT an async context manager; manage it with
    # try/finally + aclose(). user_id is required for index writes: the
    # backend gates project-API-key callers on an X-User-Id header, which
    # the SDK only stamps when user_id is set. The index stays
    # workspace-scoped and readable by the org-only search at query time.
    client = MemoAirVoiceClient(
        api_key=os.environ["MEMOAIR_API_KEY"],
        project_id=os.environ["MEMOAIR_PROJECT_ID"],
        agent_id=os.environ["MEMOAIR_AGENT_ID"],
        user_id="index-builder",
    )
    try:
        result = await client.create_index(INDEX_NAME, MEMORIES)
        print(f"Indexed {result.chunk_count} chunks (version={result.version})")
    finally:
        await client.aclose()
 
 
if __name__ == "__main__":
    asyncio.run(main())
terminal
BASH
python build_index.py
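A corpus larger than the per-call cap has to be split before it is sent. The sketch below assumes the 1 MB limit is measured on the UTF-8 JSON size of the documents and uses 1,000,000 bytes as a conservative bound. Both are assumptions, so check them against how the backend actually counts. batch_memories is a hypothetical helper, not part of the SDK.

```python
import json
from typing import Dict, Iterator, List

MAX_DOCS_PER_CALL = 100         # cap stated in the text
MAX_BYTES_PER_CALL = 1_000_000  # assumption: 1 MB counted as JSON bytes


def batch_memories(
    memories: List[Dict],
    max_docs: int = MAX_DOCS_PER_CALL,
    max_bytes: int = MAX_BYTES_PER_CALL,
) -> Iterator[List[Dict]]:
    """Yield consecutive batches that stay under both per-call caps."""
    batch: List[Dict] = []
    batch_bytes = 0
    for doc in memories:
        size = len(json.dumps(doc, ensure_ascii=False).encode("utf-8"))
        if size > max_bytes:
            raise ValueError(f"document {doc.get('id')!r} alone exceeds {max_bytes} bytes")
        # Close the current batch before this document would break a cap.
        if batch and (len(batch) >= max_docs or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += size
    if batch:
        yield batch
```

Because the endpoint is build-or-append, each batch can be passed to client.create_index(INDEX_NAME, batch) in turn inside the try block of build_index.py.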
5

Run the webhook server

MemoAirVapiSearch does the work: handle_request(payload) parses the Vapi webhook, searches the org lane, and returns {"documents": [...]} in Vapi's expected shape. verify_vapi_signature validates the x-vapi-signature header (HMAC-SHA256).

memoair_vapi_server.py
PYTHON
from __future__ import annotations
 
import json
import os
from contextlib import asynccontextmanager
 
from dotenv import load_dotenv
from fastapi import FastAPI, Request, Response
from memoair_vapi import MemoAirVapiSearch, VapiWebhookError, verify_vapi_signature
 
load_dotenv()
 
WEBHOOK_SECRET = os.getenv("VAPI_WEBHOOK_SECRET", "")
SKIP_VERIFY = os.getenv("VAPI_SKIP_SIGNATURE_VERIFY", "0").lower() in {"1", "true", "yes"}
 
 
@asynccontextmanager
async def lifespan(app: FastAPI):
    # One search client per process. It owns the bundled voice-runtime and
    # pulls the org .mv2 snapshot on first use; aclose() tears it down.
    app.state.search = MemoAirVapiSearch(
        api_key=os.environ["MEMOAIR_API_KEY"],
        project_id=os.environ["MEMOAIR_PROJECT_ID"],
        agent_id=os.environ["MEMOAIR_AGENT_ID"],
        search_timeout_ms=int(os.getenv("MEMOAIR_SEARCH_TIMEOUT_MS", "250")),
    )
    try:
        yield
    finally:
        await app.state.search.aclose()
 
 
app = FastAPI(lifespan=lifespan)
 
 
@app.get("/health")
async def health() -> dict:
    return {"status": "ok"}
 
 
@app.post("/vapi/webhook")
async def vapi_webhook(request: Request) -> Response:
    raw_body = await request.body()

    if not SKIP_VERIFY:
        sig = request.headers.get("x-vapi-signature", "")
        if not verify_vapi_signature(raw_body, sig, WEBHOOK_SECRET):
            return Response(status_code=401, content='{"error":"invalid signature"}',
                            media_type="application/json")

    try:
        payload = json.loads(raw_body or b"{}")
    except ValueError:
        # Covers json.JSONDecodeError and the UnicodeDecodeError raised for
        # bodies that are not valid UTF-8.
        return Response(status_code=400, content='{"error":"invalid json"}',
                        media_type="application/json")

    try:
        # Non-knowledge-base messages are acked with {}. Retrieval failures
        # degrade to {"documents": []} so a live call never breaks.
        result = await request.app.state.search.handle_request(payload)
    except VapiWebhookError:
        return Response(status_code=400, content='{"error":"malformed payload"}',
                        media_type="application/json")

    return Response(status_code=200, content=json.dumps(result),
                    media_type="application/json")

Start it, then expose it publicly:

terminal
BASH
uvicorn memoair_vapi_server:app --port 8000
# In another terminal, expose it (Vapi's cloud must reach your webhook):
ngrok http 8000 # -> https://<subdomain>.ngrok-free.dev

A complete runnable server + seeder lives in examples/vapi/ in the repo.
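To make the parsing step inside handle_request concrete, here is a rough sketch of pulling the search query out of a Vapi payload. It assumes the query is the most recent user message in a knowledge-base-request, matching the payload shape used in the smoke test. The library's real parsing may differ, and extract_query is a hypothetical name.

```python
from typing import Optional


def extract_query(payload: dict) -> Optional[str]:
    """Return the latest user utterance, or None for non-knowledge-base messages."""
    message = payload.get("message")
    if not isinstance(message, dict):
        raise ValueError("payload has no 'message' object")
    if message.get("type") != "knowledge-base-request":
        return None  # other webhook types are acknowledged, not searched
    # Walk the transcript backwards to find the newest user turn.
    for turn in reversed(message.get("messages") or []):
        if isinstance(turn, dict) and turn.get("role") == "user" and turn.get("content"):
            return str(turn["content"])
    raise ValueError("knowledge-base-request contains no user message")
```

In this sketch a None result corresponds to the {} acknowledgement, and ValueError plays the role that VapiWebhookError plays in the server above.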

6

Register the Custom Knowledge Base with Vapi

Custom knowledge bases are API-only — Vapi's dashboard does not expose them. Create the KB pointing at your public webhook URL, then attach it to your assistant. When sending the PATCH, include the complete model object — Vapi replaces it wholesale and does not merge nested fields.

terminal
BASH
# 1. Create the custom KB -> returns an id
curl -X POST https://api.vapi.ai/knowledge-base \
  -H "Authorization: Bearer $VAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "custom-knowledge-base",
    "server": {
      "url": "https://<subdomain>.ngrok-free.dev/vapi/webhook",
      "secret": "your-webhook-secret"
    }
  }'
 
# 2. Attach the returned id to your assistant (send the FULL model object)
curl -X PATCH https://api.vapi.ai/assistant/$ASSISTANT_ID \
  -H "Authorization: Bearer $VAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": {
      "provider": "openai",
      "model": "gpt-4o",
      "knowledgeBaseId": "<KB_ID_FROM_STEP_1>",
      "messages": [
        { "role": "system", "content": "Answer using the knowledge base context provided." }
      ]
    }
  }'

Set server.secret to the same value as VAPI_WEBHOOK_SECRET so signatures verify. For a first smoke test you can set VAPI_SKIP_SIGNATURE_VERIFY=1 and omit the secret, but re-enable verification before the webhook handles real calls.

7

Test it

Talk to your assistant in the Vapi dashboard (or call its number) and watch requests arrive:

terminal
BASH
# Tail the server log; every turn shows POST /vapi/webhook 200.
# Or smoke-test the webhook directly. This request is unsigned, so run the
# server with VAPI_SKIP_SIGNATURE_VERIFY=1; otherwise expect 401 invalid signature.
curl -X POST https://<subdomain>.ngrok-free.dev/vapi/webhook \
  -H 'Content-Type: application/json' \
  -d '{"message":{"type":"knowledge-base-request",
       "messages":[{"role":"user","content":"What is your return policy?"}]}}'
# -> {"documents": [{"content": "...", "similarity": 0.9, "uuid": "..."}]}
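To exercise the signed path with verification left on, the request needs an x-vapi-signature header. The sketch below assumes the signature is a hex HMAC-SHA256 over the exact raw request body, keyed with the webhook secret and sent as sha256=<hex>. signed_request is a hypothetical test helper, not part of memoair-vapi.

```python
import hashlib
import hmac
import json
from typing import Dict, Tuple


def signed_request(payload: dict, secret: str) -> Tuple[bytes, Dict[str, str]]:
    """Serialise the payload once and sign those exact bytes."""
    raw_body = json.dumps(payload).encode("utf-8")
    digest = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    headers = {
        "Content-Type": "application/json",
        "x-vapi-signature": f"sha256={digest}",
    }
    return raw_body, headers


body, headers = signed_request(
    {"message": {"type": "knowledge-base-request",
                 "messages": [{"role": "user", "content": "What is your return policy?"}]}},
    secret="your-webhook-secret",  # must match VAPI_WEBHOOK_SECRET on the server
)
print(headers["x-vapi-signature"])
```

Send body unchanged with those headers, for example via urllib.request.Request(url, data=body, headers=headers, method="POST"). Re-serialising the JSON after signing would change the bytes and invalidate the digest.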

How retrieval works

  • Local embedding + search. On each turn the bundled runtime embeds the query and searches the org .mv2 snapshot in-process — no per-query call to MemoAir cloud, keeping the hot path fast.
  • Vapi's document shape. handle_request maps org hits to {content, similarity, uuid} and returns at most top_k documents.
  • Never breaks a call. Non-knowledge-base messages are acked with {}; retrieval failures degrade to {"documents": []} with a 200. Only genuinely malformed payloads return 400.
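The response-side behaviour described above can be sketched as follows. This illustrates the contract and is not the library's implementation. The hit keys (text, score, id), the default top_k of 5, and the use of asyncio.wait_for for the timeout are all assumptions.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

Hit = Dict[str, object]


def to_vapi_documents(hits: List[Hit], top_k: int) -> Dict[str, list]:
    """Map search hits to Vapi's {content, similarity, uuid} shape, capped at top_k."""
    docs = [
        {"content": h["text"], "similarity": float(h["score"]), "uuid": str(h["id"])}
        for h in hits[:top_k]
    ]
    return {"documents": docs}


async def answer_turn(
    search: Callable[[str], Awaitable[List[Hit]]],
    query: str,
    top_k: int = 5,
    timeout_ms: int = 250,
) -> Dict[str, list]:
    """Run one search; any failure or timeout degrades to an empty result."""
    try:
        hits = await asyncio.wait_for(search(query), timeout=timeout_ms / 1000)
    except Exception:
        # Includes asyncio.TimeoutError: the live call continues without context.
        return {"documents": []}
    return to_vapi_documents(hits, top_k)
```

Swallowing every exception is deliberate here: during a live call, an answer without retrieved context is better than an HTTP error.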

Troubleshooting

Search returns empty documents. Expected in two cases: before the org index is seeded (step 4), or after the runtime evicts an idle session (default 300s), in which case the next turn cold-starts and re-pulls the snapshot. An empty list is not how irrelevance shows up: off-topic queries still return top_k documents, just with low or negative similarity scores.
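Since off-topic queries still return documents, you may want to drop weak matches before they reach the model. A minimal sketch, assuming the response shape shown in step 7. drop_weak_matches is a hypothetical helper, and the 0.3 cut-off is an arbitrary placeholder to tune against your own data.

```python
from typing import Dict, List


def drop_weak_matches(result: Dict[str, List[dict]], min_similarity: float = 0.3) -> Dict[str, List[dict]]:
    """Keep only documents whose similarity meets the threshold."""
    if "documents" not in result:
        return result  # e.g. the {} acknowledgement for non-knowledge-base messages
    kept = [d for d in result["documents"] if d.get("similarity", 0.0) >= min_similarity]
    return {"documents": kept}
```

In the webhook this would wrap the existing call: result = drop_weak_matches(await request.app.state.search.handle_request(payload)).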

401 invalid signature. The server.secret registered with Vapi must match VAPI_WEBHOOK_SECRET. Vapi signs as sha256=<hex>; verify_vapi_signature accepts both that and a bare hex digest.
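When debugging a 401 it can help to reproduce the check by hand. The function below is a stand-in for verify_vapi_signature and not its actual source. It assumes a hex HMAC-SHA256 over the raw body and accepts both header forms mentioned above. check_signature is a hypothetical name.

```python
import hashlib
import hmac


def check_signature(raw_body: bytes, header_value: str, secret: str) -> bool:
    """Accept 'sha256=<hex>' or a bare hex digest; compare in constant time."""
    if not header_value or not secret:
        return False
    received = header_value.strip().lower()
    if received.startswith("sha256="):
        received = received[len("sha256="):]
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # Compare as bytes so a non-ASCII header cannot raise TypeError.
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))
```

If this returns False for a request Vapi actually sent, the usual causes are a secret mismatch or a body that was parsed and re-serialised before hashing.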

Vapi can't reach the webhook. The URL must be public (ngrok or a deployed host) and end in your route (e.g. /vapi/webhook). localhost will not work — Vapi calls from its cloud.
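A quick pre-flight check on the URL you are about to register can catch the common mistakes. This is only a sketch: webhook_url_problems is a hypothetical helper, and it inspects the URL string without testing whether the host is actually reachable.

```python
import ipaddress
from typing import List
from urllib.parse import urlparse


def webhook_url_problems(url: str, route: str = "/vapi/webhook") -> List[str]:
    """Return human-readable problems; an empty list means the URL looks plausible."""
    problems: List[str] = []
    parsed = urlparse(url)
    host = parsed.hostname or ""

    if parsed.scheme not in {"http", "https"} or not host:
        problems.append("not an absolute http(s) URL")

    if host == "localhost":
        problems.append("localhost is not reachable from Vapi's cloud")
    else:
        try:
            ip = ipaddress.ip_address(host)
        except ValueError:
            ip = None  # a DNS name, not an IP literal
        if ip is not None and (ip.is_loopback or ip.is_private):
            problems.append("host is a loopback or private address")

    if not parsed.path.rstrip("/").endswith(route.rstrip("/")):
        problems.append(f"path does not end in {route}")
    return problems
```

For example, webhook_url_problems("http://localhost:8000/") reports both the localhost host and the missing route.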

Patch wiped my assistant config. The assistant PATCH replaces the entire model object. GET /assistant/$ASSISTANT_ID first and resend its existing messages / settings alongside knowledgeBaseId.
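The GET-then-PATCH step can be reduced to a small merge over the fetched assistant JSON. A minimal sketch, assuming the assistant object carries its model settings under a "model" key as in the PATCH example from step 6. with_knowledge_base is a hypothetical helper, and the HTTP calls themselves are left out.

```python
import copy


def with_knowledge_base(assistant: dict, kb_id: str) -> dict:
    """Build a PATCH body that keeps every existing model field and adds the KB id."""
    model = copy.deepcopy(assistant.get("model") or {})
    model["knowledgeBaseId"] = kb_id
    return {"model": model}


# Stand-in for the JSON returned by GET /assistant/$ASSISTANT_ID.
existing = {
    "id": "asst_example",
    "model": {
        "provider": "openai",
        "model": "gpt-4o",
        "messages": [{"role": "system", "content": "You are a helpful agent."}],
    },
}
patch_body = with_knowledge_base(existing, "<KB_ID_FROM_STEP_1>")
print(sorted(patch_body["model"]))
```

Send patch_body as the JSON body of the PATCH. Provider, model name, and system messages carry over unchanged, with only knowledgeBaseId added or replaced.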