LangGraph × MemoAir Voice Memory
Use LangGraph as your voice agent's brain while MemoAir voice memory rides the same local voice-runtime hot path the LiveKit drop-in uses — recalled before every answer, persisted after every turn. The memoair-langchain package ships a prebuilt graph, a BaseTool, and node factories. LangGraph is the reasoning layer; pair it with a transport (LiveKit) for audio.
Where memory sits in the graph
START -> memoair_search -> llm -> memoair_persist -> END │ │ │ search_memory (≤250ms, local runtime) │ │ └ recalled context injected as a LEADING system message │ └ after_turn -> async cloud consolidationThe search node recalls the four lanes (profile, working, permanent, org) for the latest user turn; the LLM node sees that context before the user message; the persist node writes exactly one completed turn back (even if the graph did several internal hops).
Install
pip install "memoair-langchain>=0.1.0" langchain-openai # add a transport for audio (LiveKit shown here):pip install "livekit-agents[deepgram,openai,silero,turn-detector]" \ livekit-plugins-langchain python-dotenvmemoair-langchain depends on memoair-voice (the bundled local runtime) plus langgraph and langchain-core. No Docker.
Get your API key, project ID, and agent ID
Sign in at dashboard.memoair.space and copy your memoair_pk_… API key, proj_… project ID, and agent_… agent ID (My Agents → Create Agent). Per-user isolation uses the user_id you pass — use the caller's participant identity.
Prebuilt voice graph (text-first)
MemoAirGraphMemory owns the voice session and builds the compiled graph. The runtime auto-starts on the first start().
import asynciofrom langchain_openai import ChatOpenAIfrom langchain_core.messages import HumanMessagefrom memoair_langchain import MemoAirGraphMemory async def main() -> None: # One MemoAirGraphMemory == one voice session == one LangGraph thread. async with MemoAirGraphMemory( api_key="memoair_pk_...", project_id="proj_xxx", agent_id="agent_xxx", user_id="caller_phone_hash", ) as memory: # search -> llm -> persist, compiled. Hand to any LangGraph runner. graph = memory.build_graph(model=ChatOpenAI(model="gpt-4o-mini")) state = await graph.ainvoke( {"messages": [HumanMessage(content="when can I come in?")]} ) print(state["messages"][-1].content) if __name__ == "__main__": asyncio.run(main())LangGraph brain + LiveKit transport
LiveKit handles audio/STT/TTS/turn-taking; the graph is the LLM via livekit.plugins.langchain.LLMAdapter(graph). MemoAir memory is wired inside the graph, so recall + persist work with no extra glue.
import osfrom dotenv import load_dotenvfrom langchain_openai import ChatOpenAIfrom livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, clifrom livekit.plugins import deepgram, openai, silerofrom livekit.plugins import langchain as livekit_langchainfrom livekit.plugins.turn_detector.english import EnglishModel from memoair_langchain import MemoAirGraphMemory load_dotenv() async def entrypoint(ctx: JobContext) -> None: await ctx.connect() raw = ctx.room.local_participant.identity user_id = raw if isinstance(raw, str) and raw.strip() else "console-user" memory = MemoAirGraphMemory( api_key=os.environ["MEMOAIR_API_KEY"], project_id=os.environ["MEMOAIR_PROJECT_ID"], agent_id=os.environ["MEMOAIR_AGENT_ID"], user_id=user_id, ) await memory.start() # The graph IS the LLM. graph = memory.build_graph(model=ChatOpenAI(model="gpt-4o-mini")) session = AgentSession( stt=deepgram.STT(), llm=livekit_langchain.LLMAdapter(graph=graph), tts=openai.TTS(), turn_detection=EnglishModel(), vad=silero.VAD.load(), ) # Mandatory: close the session on shutdown so /v1/runtime/session/end fires # and the cloud consolidates the permanent lane. (Same gotcha as the # LiveKit drop-in — without it, cross-session memory is silently lost.) async def _close_memory() -> None: await memory.end(reason="worker_shutdown") await memory.aclose() ctx.add_shutdown_callback(_close_memory) await session.start( agent=Agent(instructions="You are a helpful voice assistant."), room=ctx.room, ) if __name__ == "__main__": cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))Reference example in the repo: examples/langchain/voice_agents/memoair_langgraph_agent.py.
Or use it as a tool
For function-calling agents where the LLM decides when to recall, use the async BaseTool in a ToolNode / create_react_agent.
from memoair_langchain import MemoAirVoiceSearchTool tool = memory.search_tool() # from MemoAirGraphMemory# or: MemoAirVoiceSearchTool(memory=voice_memory)# pass to create_react_agent(tools=[tool, ...]) — async onlyDesign notes
- •Recall before reply. The search node injects recalled context as a leading system message — no tool round-trip, no extra latency on the hot path.
- •One persist per user turn. The persist node sits at the terminal node, so multi-step graphs still write a single completed turn.
- •No checkpointer backing. MemoAir surfaces working/permanent/profile/org as retrieved context, not as LangGraph graph state. Use any standard LangGraph checkpointer for the graph's own control state — it's orthogonal.