
Retell × MemoAir Voice Memory

Give a Retell agent memory through custom-function webhooks. Your bridge server exposes search_memory and record_turn endpoints; each handler calls the public MemoAirVoiceClient surface with a per-call user={id,name,metadata}.

v0.3 status — use the custom-agent pattern

A first-class Retell adapter (memoair-retell) is on the roadmap for v0.4. For Retell today, expose your own FastAPI webhook routes and call MemoAirVoiceClient.search_memory + MemoAirVoiceClient.save_response from inside the handlers — see the LiveKit Option B walkthrough and the canonical custom-agent reference at examples/livekit/voice_agents/memoair_voice_custom_agent.py for the same shape.

Install

terminal
BASH
pip install "memoair-voice>=0.3.1" "fastapi>=0.110" "uvicorn>=0.27" python-dotenv

Environment

.env
BASH
MEMOAIR_API_KEY=memoair_pk_...
MEMOAIR_PROJECT_ID=proj_...
MEMOAIR_AGENT_ID=agent_...
RETELL_SIGNING_SECRET=...
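
RETELL_SIGNING_SECRET is listed here, but the bridge code on this page never reads it. Presumably it is meant for checking that incoming requests really come from Retell. The sketch below shows the general shape of such a check with a generic hex-encoded HMAC-SHA256 over the raw request body. That scheme is an assumption, not Retell's documented format, and verify_signature is a hypothetical helper name; confirm the real header name and signing scheme in Retell's webhook documentation before relying on it.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, signature_hex: str, secret: str) -> bool:
    """Return True if signature_hex is the HMAC-SHA256 of raw_body under secret.

    ASSUMPTION: a hex-encoded HMAC-SHA256 over the raw body. Retell's actual
    scheme may differ; treat this as a placeholder for the real check.
    """
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # compare_digest takes constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected.encode("ascii"), signature_hex.encode("utf-8"))
```

In a FastAPI handler this would run against the raw bytes of the request body, before JSON parsing, with the handler rejecting the request (for example with a 401) when the check fails.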

The MemoAir bits — three calls

Construct one client at boot, then call search_memory from your POST /retell/search_memory handler and save_response from your POST /retell/record_turn handler. Resolve user.id from the call payload (a customer ID in metadata, a phone-number hash, etc.) per request.
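
The bridge below falls back to the raw from_number when no customer ID is present. If you would rather not use raw phone numbers as user IDs, a keyed hash is one option. This is a minimal sketch: hashed_caller_id, the salt argument, and the tel_ prefix are all hypothetical names chosen for illustration, not part of memoair-voice.

```python
import hashlib
import hmac
import re


def hashed_caller_id(phone_number: str, salt: str) -> str:
    """Derive a stable user id from a phone number using a keyed hash."""
    # Keep ASCII digits only so "+1 (415) 555-0100" and "+14155550100" match.
    digits = re.sub(r"[^0-9]", "", phone_number)
    if not digits:
        return "anonymous"
    # Phone numbers are easy to enumerate, so key the hash with a secret salt
    # rather than hashing the bare number.
    digest = hmac.new(salt.encode("utf-8"), digits.encode("ascii"), hashlib.sha256)
    return "tel_" + digest.hexdigest()[:32]
```

The salt must stay constant across deployments, otherwise returning callers map to new IDs and lose their history. It could live next to the other secrets in .env.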

retell_bridge.py
PYTHON
import os
from typing import Any

from dotenv import load_dotenv
from fastapi import FastAPI
from memoair_voice import MemoAirVoiceClient

load_dotenv()

app = FastAPI(title="MemoAir Retell bridge")

# ONE client per bridge process. Pool fans out across concurrent callers.
client = MemoAirVoiceClient(
    api_key=os.environ["MEMOAIR_API_KEY"],
    project_id=os.environ["MEMOAIR_PROJECT_ID"],
    agent_id=os.environ["MEMOAIR_AGENT_ID"],
)


def user_id_from_call(call: dict[str, Any]) -> str:
    # Prefer an explicit customer ID, then fall back to the phone numbers.
    metadata = call.get("metadata") or {}
    return (
        metadata.get("memoair_user_id")
        or call.get("from_number")
        or call.get("to_number")
        or "anonymous"
    )


@app.post("/retell/search_memory")
async def search_memory(payload: dict[str, Any]) -> dict[str, str]:
    caller_id = user_id_from_call(payload.get("call") or {})
    # "args" may be absent or null, so guard it the same way as "call".
    query = (payload.get("args") or {}).get("query", "")
    ctx = await client.search_memory(
        query, user={"id": caller_id}, timeout_ms=250,
    )
    # Retell expects a string in the function-tool result envelope.
    return {"contextText": ctx.contextText}


@app.post("/retell/record_turn")
async def record_turn(payload: dict[str, Any]) -> dict[str, str]:
    caller_id = user_id_from_call(payload.get("call") or {})
    args = payload.get("args") or {}
    await client.save_response(
        user_text=args.get("user_text", ""),
        assistant_text=args.get("assistant_text", ""),
        user={"id": caller_id},
    )
    return {"status": "ok"}


@app.get("/health")
def health() -> dict[str, str]:
    return {"status": "ok"}
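
On a live call, a slow or failing memory lookup should probably not stall the agent. This page does not say how search_memory behaves once timeout_ms elapses, so the sketch below adds an outer guard that fails open with an empty string. context_or_empty and budget_s are hypothetical names, and the wrapper takes any zero-argument coroutine function, so it makes no assumptions about the client's API.

```python
import asyncio
from typing import Awaitable, Callable


async def context_or_empty(
    lookup: Callable[[], Awaitable[str]], budget_s: float = 0.5
) -> str:
    """Await lookup(); return "" if it exceeds budget_s or raises."""
    try:
        return await asyncio.wait_for(lookup(), timeout=budget_s)
    except Exception:
        # Covers the timeout and client errors alike: the call continues
        # without memory instead of failing the function invocation.
        return ""
```

In the search_memory handler, the lookup would be a small coroutine function that awaits client.search_memory and returns ctx.contextText. Logging the swallowed exception is worth adding in practice, so that outages do not go unnoticed.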

Configure Retell

Add these custom functions / webhooks on your Retell agent:

  • search_memory POST https://your-server.com/retell/search_memory
  • record_turn POST https://your-server.com/retell/record_turn

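The search_memory function should declare one required string parameter, query; record_turn takes user_text and assistant_text, matching the handlers above. The search_memory response returns contextText, which Retell feeds back to the model as a system-message slot.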
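
The parameter shapes implied by the handlers can be written down as JSON Schema objects. This sketch assumes Retell's custom-function configuration accepts JSON Schema style parameter definitions; check the Retell dashboard or API reference for the exact field names. The constants and the missing_required helper are hypothetical and exist only for illustration.

```python
import json
from typing import Any

SEARCH_MEMORY_PARAMS: dict[str, Any] = {
    "type": "object",
    "properties": {
        "query": {
            "type": "string",
            "description": "What to look up in the caller's memory.",
        },
    },
    "required": ["query"],
}

RECORD_TURN_PARAMS: dict[str, Any] = {
    "type": "object",
    "properties": {
        "user_text": {"type": "string", "description": "What the caller said."},
        "assistant_text": {"type": "string", "description": "What the agent replied."},
    },
    "required": ["user_text", "assistant_text"],
}


def missing_required(schema: dict[str, Any], args: dict[str, Any]) -> list[str]:
    """List required parameter names that are absent from args."""
    return [name for name in schema["required"] if name not in args]


if __name__ == "__main__":
    # Print the schema as JSON, ready to paste into a function definition.
    print(json.dumps(SEARCH_MEMORY_PARAMS, indent=2))
```

The same missing_required check could run at the top of each handler to reject malformed calls early, rather than silently recording empty strings.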

Production shape — many concurrent callers

A Retell bridge usually serves many concurrent callers from one process. The shared MemoAirVoiceClient uses an internal RuntimePool that spawns one voice-runtime process per (project_id, user.id), capped by max_concurrent_users. Bump the cap to match the concurrency target of your Retell deployment.

bridge_at_scale.py
PYTHON
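import os

from memoair_voice import MemoAirVoiceClient

client = MemoAirVoiceClient(
    api_key=os.environ["MEMOAIR_API_KEY"],
    project_id=os.environ["MEMOAIR_PROJECT_ID"],
    agent_id=os.environ["MEMOAIR_AGENT_ID"],
    max_concurrent_users=64,   # cap on simultaneous per-user runtimes
    runtime_idle_ttl_s=300,    # idle TTL for a runtime, in seconds
)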
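
To reason about how these two settings interact, it can help to picture the pool as a keyed table of slots with a cap and an idle timeout. The toy model below is not the library's RuntimePool. It assumes that idle slots are reaped after the TTL and that a new key is refused once the cap is reached; this page does not say what the real pool does at the cap (it may queue or evict instead).

```python
import time
from typing import Callable


class ToyRuntimePool:
    """Illustrative model only; NOT memoair-voice's RuntimePool."""

    def __init__(
        self,
        max_concurrent_users: int,
        idle_ttl_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_concurrent_users = max_concurrent_users
        self.idle_ttl_s = idle_ttl_s
        self._clock = clock
        # One slot per (project_id, user_id), storing its last-use time.
        self._last_used: dict[tuple[str, str], float] = {}

    def acquire(self, project_id: str, user_id: str) -> bool:
        """Touch the slot for this key; return False if the pool is full."""
        now = self._clock()
        # Reap slots that have been idle for longer than the TTL.
        for key, seen in list(self._last_used.items()):
            if now - seen > self.idle_ttl_s:
                del self._last_used[key]
        key = (project_id, user_id)
        is_new = key not in self._last_used
        if is_new and len(self._last_used) >= self.max_concurrent_users:
            return False  # ASSUMPTION: refuse at the cap
        self._last_used[key] = now
        return True

    def __len__(self) -> int:
        return len(self._last_used)
```

Under this model, the cap needs to cover peak simultaneous callers plus those who hung up within the last runtime_idle_ttl_s seconds, since their slots linger until reaped.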

See the multi-workspace SaaS guide for the partner / project-per-customer pattern when one Retell bridge fronts multiple customer workspaces.

Advanced

Need custom turn pairing, lane gating, or LLM-decided memory tools? See the advanced tool surface.