Home/Documentation
Pipecat

Pipecat × MemoAir Voice Memory

Add persistent voice memory to a Pipecat pipeline. The pattern is the same as for any non-LiveKit framework: construct one MemoAirVoiceClient at boot, call search_memory before the LLM stage, and save_response after the assistant stage — all with a per-call user={id,name,metadata}.

v0.3 status — use the custom-agent pattern

A first-class Pipecat adapter (memoair-pipecat) is on the roadmap for v0.4. For Pipecat today, integrate via the public MemoAirVoiceClient surface — the same pattern documented as Option B on the LiveKit page and demoed end-to-end at examples/livekit/voice_agents/memoair_voice_custom_agent.py. Strip the LiveKit imports from that example and the only MemoAir surface left is the three calls below.

Install

terminal
BASH
pip install "memoair-voice>=0.3.1" python-dotenv \
"pipecat-ai[deepgram,cartesia,openai,silero,daily]>=0.0.55"

Environment

.env
BASH
MEMOAIR_API_KEY=memoair_pk_...
MEMOAIR_PROJECT_ID=proj_...
MEMOAIR_AGENT_ID=agent_...
OPENAI_API_KEY=sk_...
DEEPGRAM_API_KEY=...
CARTESIA_API_KEY=...

Resolve user.id from your room, websocket, or app auth at runtime and pass it on every search_memory / save_response call. Do not put a shared user ID in your env.

The MemoAir bits — three calls

MemoAir does not need a Pipecat-specific frame processor today. You wrap the LLM stage with three calls — construct, search_memory, save_response — and inject the recalled context into the LLM context aggregator the same way Pipecat memory services do (user context → memory → LLM → TTS).

pipecat_memoair.py
PYTHON
import os
from dotenv import load_dotenv
from memoair_voice import MemoAirVoiceClient
 
load_dotenv()
 
# 1. ONE client per bot process, constructed at boot.
client = MemoAirVoiceClient(
api_key=os.environ["MEMOAIR_API_KEY"],
project_id=os.environ["MEMOAIR_PROJECT_ID"],
agent_id=os.environ["MEMOAIR_AGENT_ID"],
)
 
 
async def on_user_text(*, user_text: str, caller_id: str) -> str:
# 2. Recall before the LLM stage. Returns a system-message-ready
# contextText covering profile + working + permanent + org.
ctx = await client.search_memory(
user_text,
user={"id": caller_id},
timeout_ms=250,
)
return ctx.contextText # splice into your OpenAILLMContext system slot
 
 
async def on_assistant_text(*, user_text: str, assistant_text: str, caller_id: str) -> None:
# 3. Persist the completed turn so memory grows over time.
await client.save_response(
user_text=user_text,
assistant_text=assistant_text,
user={"id": caller_id},
)

Wire on_user_text into your OpenAILLMContext system message right before the LLM processor consumes the frame; wire on_assistant_text into the post-assistant aggregator. Replace OpenAILLMContext with the equivalent for your provider — the surface is provider-agnostic.

Production shape — many concurrent callers

The single client process can serve many concurrent callers. MemoAirVoiceClient owns an internal RuntimePool that spawns one voice-runtime per (project_id, user.id), capped by max_concurrent_users. No sidecar to provision; no per-call cold-start.

bridge_at_scale.py
PYTHON
client = MemoAirVoiceClient(
api_key=os.environ["MEMOAIR_API_KEY"],
project_id=os.environ["MEMOAIR_PROJECT_ID"],
agent_id=os.environ["MEMOAIR_AGENT_ID"],
max_concurrent_users=64,
runtime_idle_ttl_s=300,
)

See the multi-workspace SaaS guide for the partner / project-per-customer pattern when one Pipecat bridge serves multiple customer workspaces.

Advanced

Need lane-level control or an LLM-decided search_memory tool? See the advanced tool surface.