Vapi · Custom Knowledge Base

Vapi × MemoAir Voice Memory

Back a Vapi Custom Knowledge Base with MemoAir's shared org index. Vapi fires a webhook on every user turn; memoair-vapi answers it with documents retrieved from your project's org memory. Retrieval runs locally in the bundled memory runtime — the query is embedded and the org .mv2 snapshot is searched in-process — so lookups stay sub-10ms and never add a network hop to the live call.

How it fits together

You host the webhook. pip install memoair-vapi and run a small FastAPI server (reference app provided). Vapi calls your server — MemoAir does not host it.
Retrieval is client-side. MemoAirVapiSearch wraps MemoAirVoiceClient, which launches the bundled voice-runtime, pulls the org .mv2 snapshot once, and searches it locally on every turn.
MemoAir cloud only builds + serves the index. You seed the org index from the dashboard (PDF upload) or via the SDK; the runtime pulls it on bootstrap.

Install

The Python SDK includes the local memory runtime binary — no Docker required. The example webhook server uses FastAPI + uvicorn.

terminal

BASH

pip install "memoair-vapi>=0.1.0" "fastapi>=0.110" "uvicorn>=0.27" python-dotenv

memoair-vapi depends on memoair-voice, so the runtime + cloud client are pulled in automatically.

Get your API key, project ID, and agent ID

API key — looks like memoair_pk_…. Account-scoped; treat as a secret.
Project ID — identifies the workspace whose org index you'll search. Safe to log.
Agent ID — identifies the voice bot inside the project. Safe to log.

Phase 1 of the Vapi integration is org-lane only: every call searches the shared, project-scoped org index. There is no per-caller user_id yet — that arrives with the per-user lanes in a later phase.

Environment setup

Create a .env next to your webhook server:

.env

BASH

# MemoAir
MEMOAIR_API_KEY=memoair_pk_...
MEMOAIR_PROJECT_ID=proj_...
MEMOAIR_AGENT_ID=agent_...
MEMOAIR_SEARCH_TIMEOUT_MS=250
 
# Vapi webhook secret (the server.secret you set when registering the Custom Knowledge Base with Vapi).
# For local dev you can skip verification instead:
VAPI_WEBHOOK_SECRET=your-webhook-secret
# VAPI_SKIP_SIGNATURE_VERIFY=1
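It can help to fail fast at startup when one of these variables is missing, rather than on the first live call. The following is a minimal sketch using only the standard library; load_settings, mask_secret, and describe are hypothetical helper names, not part of memoair-vapi. It also masks the API key so that only the project and agent IDs, which are safe to log, appear in clear text.

```python
import os

# Variables this guide treats as mandatory.
REQUIRED_VARS = ("MEMOAIR_API_KEY", "MEMOAIR_PROJECT_ID", "MEMOAIR_AGENT_ID")


def mask_secret(value: str, visible: int = 4) -> str:
    """Keep only the last few characters of a secret for log output."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def load_settings(env=None) -> dict:
    """Read settings from a mapping (default: os.environ) and fail fast."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    skip_verify = env.get("VAPI_SKIP_SIGNATURE_VERIFY", "0").lower() in {"1", "true", "yes"}
    secret = env.get("VAPI_WEBHOOK_SECRET", "")
    if not skip_verify and not secret:
        # Without a secret every signed request would be rejected.
        raise RuntimeError("Set VAPI_WEBHOOK_SECRET or VAPI_SKIP_SIGNATURE_VERIFY=1")

    return {
        "api_key": env["MEMOAIR_API_KEY"],
        "project_id": env["MEMOAIR_PROJECT_ID"],
        "agent_id": env["MEMOAIR_AGENT_ID"],
        "search_timeout_ms": int(env.get("MEMOAIR_SEARCH_TIMEOUT_MS", "250")),
        "webhook_secret": secret,
        "skip_verify": skip_verify,
    }


def describe(settings: dict) -> str:
    """Log-safe summary: IDs in clear text, API key masked."""
    return (
        f"project={settings['project_id']} agent={settings['agent_id']} "
        f"api_key={mask_secret(settings['api_key'])}"
    )
```

Calling load_settings() after load_dotenv() and logging describe(...) once at startup is usually enough to catch a misnamed variable.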

Seed the org index

The knowledge Vapi will answer from lives in MemoAir's org index — project-scoped, shared across all callers. Seed it once. Pick either path; both land in the same index.

Option A — Dashboard upload (no code)

At dashboard.memoair.space open your project's Voice → Org index tab, name the index agent-memory, and upload PDFs / text. The dashboard handles extraction, chunking, and embedding. Best for PDFs and larger corpora.

Option B — Code-driven seed

Good for structured facts kept next to your server code. The cloud endpoint is build-or-append: the first call creates the index, later calls add to the same name. Cap per call: 100 docs and 1 MB total.

build_index.py

PYTHON

import asyncio
import os
 
from dotenv import load_dotenv
from memoair_voice import MemoAirVoiceClient
 
load_dotenv()
 
INDEX_NAME = os.getenv("MEMOAIR_INDEX_NAME", "agent-memory")
 
MEMORIES = [
    {
        "id": "return-policy",
        "text": "Returns accepted within 30 days of purchase with a receipt.",
        "metadata": {"kind": "faq", "topic": "returns"},
    },
    {
        "id": "shipping",
        "text": "Standard shipping is 3-5 business days. Express is 1-2 days.",
        "metadata": {"kind": "faq", "topic": "shipping"},
    },
]
 
 
async def main() -> None:
    # MemoAirVoiceClient is NOT an async context manager — manage it with
    # try/finally + aclose(). user_id is required for index writes: the
    # backend gates project-API-key callers on an X-User-Id header, which
    # the SDK only stamps when user_id is set. The index stays
    # workspace-scoped and readable by the org-only search at query time.
    client = MemoAirVoiceClient(
        api_key=os.environ["MEMOAIR_API_KEY"],
        project_id=os.environ["MEMOAIR_PROJECT_ID"],
        agent_id=os.environ["MEMOAIR_AGENT_ID"],
        user_id="index-builder",
    )
    try:
        result = await client.create_index(INDEX_NAME, MEMORIES)
        print(f"Indexed {result.chunk_count} chunks (version={result.version})")
    finally:
        await client.aclose()
 
 
if __name__ == "__main__":
    asyncio.run(main())

terminal

BASH

python build_index.py
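For corpora larger than the per-call cap described above (100 docs and 1 MB total), the documents need to be split before seeding. The sketch below shows one way to do that. How the backend measures the 1 MB limit is not stated here, so the helper assumes it is the UTF-8 size of each JSON-serialized document summed per call, and takes 1 MB to mean 1,000,000 bytes; batch_docs and doc_size are hypothetical names.

```python
import json
from typing import Iterable, Iterator, List

MAX_DOCS_PER_CALL = 100          # cap stated in this guide
MAX_BYTES_PER_CALL = 1_000_000   # "1 MB"; assumed to mean 10**6 bytes of JSON


def doc_size(doc: dict) -> int:
    """Approximate payload size of one document (assumption: JSON as UTF-8)."""
    return len(json.dumps(doc, ensure_ascii=False).encode("utf-8"))


def batch_docs(
    docs: Iterable[dict],
    max_docs: int = MAX_DOCS_PER_CALL,
    max_bytes: int = MAX_BYTES_PER_CALL,
) -> Iterator[List[dict]]:
    """Yield consecutive batches that respect both the count and the size cap."""
    batch: List[dict] = []
    batch_bytes = 0
    for doc in docs:
        size = doc_size(doc)
        if size > max_bytes:
            raise ValueError(f"Document {doc.get('id')!r} alone exceeds {max_bytes} bytes")
        # Close the current batch if adding this document would break a cap.
        if batch and (len(batch) >= max_docs or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += size
    if batch:
        yield batch
```

Because the endpoint is build-or-append, each batch can be passed to client.create_index(INDEX_NAME, batch) in turn inside the same try/finally block used in build_index.py.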

Run the webhook server

MemoAirVapiSearch does the work: handle_request(payload) parses the Vapi webhook, searches the org lane, and returns {"documents": [...]} in Vapi's expected shape. verify_vapi_signature validates the x-vapi-signature header (HMAC-SHA256).

memoair_vapi_server.py

PYTHON

from __future__ import annotations
 
import json
import os
from contextlib import asynccontextmanager
 
from dotenv import load_dotenv
from fastapi import FastAPI, Request, Response
from memoair_vapi import MemoAirVapiSearch, VapiWebhookError, verify_vapi_signature
 
load_dotenv()
 
WEBHOOK_SECRET = os.getenv("VAPI_WEBHOOK_SECRET", "")
SKIP_VERIFY = os.getenv("VAPI_SKIP_SIGNATURE_VERIFY", "0").lower() in {"1", "true", "yes"}
 
 
@asynccontextmanager
async def lifespan(app: FastAPI):
    # One search client per process. It owns the bundled voice-runtime and
    # pulls the org .mv2 snapshot on first use; aclose() tears it down.
    app.state.search = MemoAirVapiSearch(
        api_key=os.environ["MEMOAIR_API_KEY"],
        project_id=os.environ["MEMOAIR_PROJECT_ID"],
        agent_id=os.environ["MEMOAIR_AGENT_ID"],
        search_timeout_ms=int(os.getenv("MEMOAIR_SEARCH_TIMEOUT_MS", "250")),
    )
    try:
        yield
    finally:
        await app.state.search.aclose()
 
 
app = FastAPI(lifespan=lifespan)
 
 
@app.get("/health")
async def health() -> dict:
    return {"status": "ok"}
 
 
@app.post("/vapi/webhook")
async def vapi_webhook(request: Request) -> Response:
    raw_body = await request.body()
 
    if not SKIP_VERIFY:
        sig = request.headers.get("x-vapi-signature", "")
        if not verify_vapi_signature(raw_body, sig, WEBHOOK_SECRET):
            return Response(status_code=401, content='{"error":"invalid signature"}',
                            media_type="application/json")
 
    try:
        payload = json.loads(raw_body or b"{}")
    except json.JSONDecodeError:
        return Response(status_code=400, content='{"error":"invalid json"}',
                        media_type="application/json")
 
    try:
        # Non-knowledge-base messages are acked with {}. Retrieval failures
        # degrade to {"documents": []} so a live call never breaks.
        result = await request.app.state.search.handle_request(payload)
    except VapiWebhookError:
        return Response(status_code=400, content='{"error":"malformed payload"}',
                        media_type="application/json")
 
    return Response(status_code=200, content=json.dumps(result),
                    media_type="application/json")

Start it, then expose it publicly:

terminal

BASH

uvicorn memoair_vapi_server:app --port 8000
# In another terminal, expose it (Vapi's cloud must reach your webhook):
ngrok http 8000   # -> https://<subdomain>.ngrok-free.dev

A complete runnable server + seeder lives in examples/vapi/ in the repo.

Register the Custom Knowledge Base with Vapi

Custom knowledge bases are API-only — Vapi's dashboard does not expose them. Create the KB pointing at your public webhook URL, then attach it to your assistant. When sending the PATCH, include the complete model object — Vapi replaces it wholesale and does not merge nested fields.

terminal

BASH

# 1. Create the custom KB -> returns an id
curl -X POST https://api.vapi.ai/knowledge-base \
  -H "Authorization: Bearer $VAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "custom-knowledge-base",
    "server": {
      "url": "https://<subdomain>.ngrok-free.dev/vapi/webhook",
      "secret": "your-webhook-secret"
    }
  }'
 
# 2. Attach the returned id to your assistant (send the FULL model object)
curl -X PATCH https://api.vapi.ai/assistant/$ASSISTANT_ID \
  -H "Authorization: Bearer $VAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": {
      "provider": "openai",
      "model": "gpt-4o",
      "knowledgeBaseId": "<KB_ID_FROM_STEP_1>",
      "messages": [
        { "role": "system", "content": "Answer using the knowledge base context provided." }
      ]
    }
  }'

Set server.secret to the same value as VAPI_WEBHOOK_SECRET so signatures verify. For a first smoke test you can set VAPI_SKIP_SIGNATURE_VERIFY=1 and omit the secret.

Test it

Talk to your assistant in the Vapi dashboard (or call its number) and watch requests arrive:

terminal

BASH

# Tail the server log: every turn shows POST /vapi/webhook 200
# Or smoke-test the webhook directly. This request is unsigned, so run the
# server with VAPI_SKIP_SIGNATURE_VERIFY=1 or expect a 401:
curl -X POST https://<subdomain>.ngrok-free.dev/vapi/webhook \
  -H 'Content-Type: application/json' \
  -d '{"message":{"type":"knowledge-base-request",
       "messages":[{"role":"user","content":"What is your return policy?"}]}}'
# -> {"documents": [{"content": "...", "similarity": 0.9, "uuid": "..."}]}
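The request body in the smoke test above has a simple shape: a message of type knowledge-base-request carrying the conversation so far. As an illustration of what a handler has to pull out of it, the sketch below extracts the most recent user turn. This is not the parsing code inside handle_request, which is not shown in this guide; latest_user_query is a hypothetical helper based only on the payload shown above.

```python
from typing import Optional


def latest_user_query(payload: dict) -> Optional[str]:
    """Return the newest user message, or None for other webhook types."""
    message = payload.get("message") if isinstance(payload, dict) else None
    if not isinstance(message, dict):
        return None
    if message.get("type") != "knowledge-base-request":
        return None  # other message types carry no query to search for
    # Walk the conversation backwards so the latest user turn wins.
    for turn in reversed(message.get("messages") or []):
        if isinstance(turn, dict) and turn.get("role") == "user" and turn.get("content"):
            return turn["content"]
    return None
```

Returning None for anything that is not a knowledge-base request mirrors the behaviour described in this guide, where such messages are acknowledged with an empty object rather than searched.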

How retrieval works

• Local embedding + search. On each turn the bundled runtime embeds the query and searches the org .mv2 snapshot in-process. There is no per-query call to MemoAir cloud, which keeps the hot path fast.
• Vapi's document shape. handle_request maps org hits to {content, similarity, uuid} and returns at most top_k documents.
• Never breaks a call. Non-knowledge-base messages are acked with {}; retrieval failures degrade to {"documents": []} with a 200. Only genuinely malformed payloads return 400.
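The three behaviours above can be sketched in a few lines. This is an illustrative stand-in, not the implementation of handle_request: the hit fields (id, text, score), the default top_k of 3, and the function names are all assumptions made for the example.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional


def to_vapi_documents(hits: List[dict], top_k: int = 3) -> dict:
    """Map search hits to Vapi's {content, similarity, uuid} document shape."""
    ranked = sorted(hits, key=lambda h: h["score"], reverse=True)[:top_k]
    return {
        "documents": [
            {"content": h["text"], "similarity": h["score"], "uuid": h["id"]}
            for h in ranked
        ]
    }


async def answer_turn(
    query: Optional[str],
    search: Callable[[str], Awaitable[List[dict]]],
    top_k: int = 3,
) -> dict:
    """Ack non-KB messages with {}, degrade search failures to no documents."""
    if query is None:
        return {}  # not a knowledge-base request: nothing to retrieve
    try:
        hits = await search(query)
    except Exception:
        # A failed lookup must not break the live call.
        return {"documents": []}
    return to_vapi_documents(hits, top_k)
```

Catching every exception is deliberate in this sketch: on a live call an empty result is preferable to an error response, and the failure can still be logged inside the except branch.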

Troubleshooting

Search returns empty documents. Expected before the org index is seeded (see Seed the org index), or after the runtime evicts an idle session (default 300s); the next turn cold-starts and re-pulls the snapshot. Relevance is carried by the similarity score: off-topic queries still return top_k documents, but with low or negative scores.
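If low-scoring documents for off-topic queries turn out to distract the model, one option is to drop them before responding. The sketch below filters the result returned by handle_request on its similarity field. The threshold of 0.3 is an arbitrary placeholder, not a recommended value, and drop_weak_matches is a hypothetical helper; a sensible cutoff depends on your corpus.

```python
def drop_weak_matches(result: dict, min_similarity: float = 0.3) -> dict:
    """Remove documents whose similarity falls below the chosen threshold."""
    docs = result.get("documents")
    if docs is None:
        return result  # e.g. the {} ack for non-knowledge-base messages
    kept = [d for d in docs if d.get("similarity", 0.0) >= min_similarity]
    return {**result, "documents": kept}
```

In the webhook route this would wrap the value returned by handle_request just before it is serialized with json.dumps.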

401 invalid signature. The server.secret registered with Vapi must match VAPI_WEBHOOK_SECRET. Vapi signs as sha256=<hex>; verify_vapi_signature accepts both that and a bare hex digest.
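When debugging a 401 it can help to reproduce the check by hand. The sketch below computes and verifies an HMAC-SHA256 signature with the standard library, accepting both the sha256=<hex> form and a bare hex digest. It is a stand-in for illustration, not the source of verify_vapi_signature, and it assumes the signature covers the exact raw request bytes; sign_body and signature_matches are hypothetical names.

```python
import hashlib
import hmac

PREFIX = "sha256="


def sign_body(raw_body: bytes, secret: str) -> str:
    """Return the signature header value for a raw request body."""
    digest = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    return PREFIX + digest


def signature_matches(raw_body: bytes, header: str, secret: str) -> bool:
    """Check a signature header in either 'sha256=<hex>' or bare hex form."""
    if not header or not secret:
        return False
    received = header.strip()
    if received.startswith(PREFIX):
        received = received[len(PREFIX):]
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking where the two values first differ.
    return hmac.compare_digest(expected.encode("ascii"), received.lower().encode("utf-8"))
```

A common cause of mismatches is verifying against re-serialized JSON instead of the bytes that arrived on the wire, which is why the webhook route reads request.body() before parsing.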

Vapi can't reach the webhook. The URL must be public (ngrok or a deployed host) and end in your route (e.g. /vapi/webhook). localhost will not work — Vapi calls from its cloud.

Patch wiped my assistant config. The assistant PATCH replaces the entire model object. GET /assistant/$ASSISTANT_ID first and resend its existing messages / settings alongside knowledgeBaseId.
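The fetch-then-resend step can be scripted so the existing model settings are never retyped by hand. The sketch below uses only the standard library and the two assistant endpoints named in this guide. It assumes the GET response contains a model object in the same shape the PATCH accepts; merged_model and attach_kb are hypothetical helper names.

```python
import copy
import json
import urllib.request

VAPI_BASE = "https://api.vapi.ai"


def merged_model(assistant: dict, kb_id: str) -> dict:
    """Build a PATCH body that keeps the current model and adds the KB id."""
    model = copy.deepcopy(assistant.get("model") or {})
    model["knowledgeBaseId"] = kb_id
    return {"model": model}


def _call(method: str, path: str, api_key: str, body=None) -> dict:
    """Send one JSON request to the Vapi API and decode the JSON response."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    request = urllib.request.Request(
        VAPI_BASE + path,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def attach_kb(assistant_id: str, kb_id: str, api_key: str) -> dict:
    """GET the assistant, then PATCH its full model with knowledgeBaseId set."""
    current = _call("GET", f"/assistant/{assistant_id}", api_key)
    return _call("PATCH", f"/assistant/{assistant_id}", api_key, merged_model(current, kb_id))
```

Keeping merged_model separate from the network calls makes it easy to inspect the body that would be sent before anything is changed on the assistant.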