Home/Documentation
LangChain · LangGraph

LangGraph × MemoAir Voice Memory

Use LangGraph as your voice agent's brain while MemoAir voice memory rides the same local voice-runtime hot path the LiveKit drop-in uses — recalled before every answer, persisted after every turn. The memoair-langchain package ships a prebuilt graph, a BaseTool, and node factories. LangGraph is the reasoning layer; pair it with a transport (LiveKit) for audio.

Where memory sits in the graph

graph
TEXT
START -> memoair_search -> llm -> memoair_persist -> END
│ │
│ search_memory (≤250ms, local runtime)
│ │
└ recalled context injected as a LEADING system message
└ after_turn -> async cloud consolidation

The search node recalls the four lanes (profile, working, permanent, org) for the latest user turn; the LLM node sees that context before the user message; the persist node writes exactly one completed turn back (even if the graph did several internal hops).

1

Install

terminal
BASH
pip install "memoair-langchain>=0.1.0" langchain-openai
 
# add a transport for audio (LiveKit shown here):
pip install "livekit-agents[deepgram,openai,silero,turn-detector]" \
livekit-plugins-langchain python-dotenv

memoair-langchain depends on memoair-voice (the bundled local runtime) plus langgraph and langchain-core. No Docker.

2

Get your API key, project ID, and agent ID

Sign in at dashboard.memoair.space and copy your memoair_pk_… API key, proj_… project ID, and agent_… agent ID (My Agents → Create Agent). Per-user isolation uses the user_id you pass — use the caller's participant identity.

3

Prebuilt voice graph (text-first)

MemoAirGraphMemory owns the voice session and builds the compiled graph. The runtime auto-starts on the first start().

quickstart.py
PYTHON
import asyncio
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage
from memoair_langchain import MemoAirGraphMemory
 
 
async def main() -> None:
# One MemoAirGraphMemory == one voice session == one LangGraph thread.
async with MemoAirGraphMemory(
api_key="memoair_pk_...",
project_id="proj_xxx",
agent_id="agent_xxx",
user_id="caller_phone_hash",
) as memory:
# search -> llm -> persist, compiled. Hand to any LangGraph runner.
graph = memory.build_graph(model=ChatOpenAI(model="gpt-4o-mini"))
state = await graph.ainvoke(
{"messages": [HumanMessage(content="when can I come in?")]}
)
print(state["messages"][-1].content)
 
 
if __name__ == "__main__":
asyncio.run(main())
4

LangGraph brain + LiveKit transport

LiveKit handles audio/STT/TTS/turn-taking; the graph is the LLM via livekit.plugins.langchain.LLMAdapter(graph). MemoAir memory is wired inside the graph, so recall + persist work with no extra glue.

agent.py
PYTHON
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli
from livekit.plugins import deepgram, openai, silero
from livekit.plugins import langchain as livekit_langchain
from livekit.plugins.turn_detector.english import EnglishModel
 
from memoair_langchain import MemoAirGraphMemory
 
load_dotenv()
 
 
async def entrypoint(ctx: JobContext) -> None:
await ctx.connect()
 
raw = ctx.room.local_participant.identity
user_id = raw if isinstance(raw, str) and raw.strip() else "console-user"
 
memory = MemoAirGraphMemory(
api_key=os.environ["MEMOAIR_API_KEY"],
project_id=os.environ["MEMOAIR_PROJECT_ID"],
agent_id=os.environ["MEMOAIR_AGENT_ID"],
user_id=user_id,
)
await memory.start()
 
# The graph IS the LLM.
graph = memory.build_graph(model=ChatOpenAI(model="gpt-4o-mini"))
 
session = AgentSession(
stt=deepgram.STT(),
llm=livekit_langchain.LLMAdapter(graph=graph),
tts=openai.TTS(),
turn_detection=EnglishModel(),
vad=silero.VAD.load(),
)
 
# Mandatory: close the session on shutdown so /v1/runtime/session/end fires
# and the cloud consolidates the permanent lane. (Same gotcha as the
# LiveKit drop-in — without it, cross-session memory is silently lost.)
async def _close_memory() -> None:
await memory.end(reason="worker_shutdown")
await memory.aclose()
 
ctx.add_shutdown_callback(_close_memory)
 
await session.start(
agent=Agent(instructions="You are a helpful voice assistant."),
room=ctx.room,
)
 
 
if __name__ == "__main__":
cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))

Reference example in the repo: examples/langchain/voice_agents/memoair_langgraph_agent.py.

5

Or use it as a tool

For function-calling agents where the LLM decides when to recall, use the async BaseTool in a ToolNode / create_react_agent.

tool.py
PYTHON
from memoair_langchain import MemoAirVoiceSearchTool
 
tool = memory.search_tool() # from MemoAirGraphMemory
# or: MemoAirVoiceSearchTool(memory=voice_memory)
# pass to create_react_agent(tools=[tool, ...]) — async only

Design notes

  • Recall before reply. The search node injects recalled context as a leading system message — no tool round-trip, no extra latency on the hot path.
  • One persist per user turn. The persist node sits at the terminal node, so multi-step graphs still write a single completed turn.
  • No checkpointer backing. MemoAir surfaces working/permanent/profile/org as retrieved context, not as LangGraph graph state. Use any standard LangGraph checkpointer for the graph's own control state — it's orthogonal.