Pipecat × MemoAir Voice Memory
Add persistent voice memory to a Pipecat pipeline. The pattern is the same as for any non-LiveKit framework: construct one MemoAirVoiceClient at boot, call search_memory before the LLM stage, and save_response after the assistant stage — all with a per-call user={id,name,metadata}.
v0.3 status — use the custom-agent pattern
A first-class Pipecat adapter (memoair-pipecat) is on the roadmap for v0.4. For Pipecat today, integrate via the public MemoAirVoiceClient surface — the same pattern documented as Option B on the LiveKit page and demoed end-to-end at examples/livekit/voice_agents/memoair_voice_custom_agent.py. Strip the LiveKit imports from that example and the only MemoAir surface left is the three calls below.
Install
pip install "memoair-voice>=0.3.1" python-dotenv \ "pipecat-ai[deepgram,cartesia,openai,silero,daily]>=0.0.55"Environment
MEMOAIR_API_KEY=memoair_pk_...MEMOAIR_PROJECT_ID=proj_...MEMOAIR_AGENT_ID=agent_...OPENAI_API_KEY=sk_...DEEPGRAM_API_KEY=...CARTESIA_API_KEY=...Resolve user.id from your room, websocket, or app auth at runtime and pass it on every search_memory / save_response call. Do not put a shared user ID in your env.
The MemoAir bits — three calls
MemoAir does not need a Pipecat-specific frame processor today. You wrap the LLM stage with three calls — construct, search_memory, save_response — and inject the recalled context into the LLM context aggregator the same way Pipecat memory services do (user context → memory → LLM → TTS).
import osfrom dotenv import load_dotenvfrom memoair_voice import MemoAirVoiceClient load_dotenv() # 1. ONE client per bot process, constructed at boot.client = MemoAirVoiceClient( api_key=os.environ["MEMOAIR_API_KEY"], project_id=os.environ["MEMOAIR_PROJECT_ID"], agent_id=os.environ["MEMOAIR_AGENT_ID"],) async def on_user_text(*, user_text: str, caller_id: str) -> str: # 2. Recall before the LLM stage. Returns a system-message-ready # contextText covering profile + working + permanent + org. ctx = await client.search_memory( user_text, user={"id": caller_id}, timeout_ms=250, ) return ctx.contextText # splice into your OpenAILLMContext system slot async def on_assistant_text(*, user_text: str, assistant_text: str, caller_id: str) -> None: # 3. Persist the completed turn so memory grows over time. await client.save_response( user_text=user_text, assistant_text=assistant_text, user={"id": caller_id}, )Wire on_user_text into your OpenAILLMContext system message right before the LLM processor consumes the frame; wire on_assistant_text into the post-assistant aggregator. Replace OpenAILLMContext with the equivalent for your provider — the surface is provider-agnostic.
Production shape — many concurrent callers
The single client process can serve many concurrent callers. MemoAirVoiceClient owns an internal RuntimePool that spawns one voice-runtime per (project_id, user.id), capped by max_concurrent_users. No sidecar to provision; no per-call cold-start.
client = MemoAirVoiceClient( api_key=os.environ["MEMOAIR_API_KEY"], project_id=os.environ["MEMOAIR_PROJECT_ID"], agent_id=os.environ["MEMOAIR_AGENT_ID"], max_concurrent_users=64, runtime_idle_ttl_s=300,)See the multi-workspace SaaS guide for the partner / project-per-customer pattern when one Pipecat bridge serves multiple customer workspaces.
Advanced
search_memory tool? See the advanced tool surface.