LangGraph × MemoAir Voice Memory

Use LangGraph as your voice agent's brain while MemoAir voice memory rides the same local voice-runtime hot path the LiveKit drop-in uses — recalled before every answer, persisted after every turn. The memoair-langchain package ships a prebuilt graph, a BaseTool, and node factories. LangGraph is the reasoning layer; pair it with a transport (LiveKit) for audio.

Where memory sits in the graph

graph

TEXT

START -> memoair_search -> llm -> memoair_persist -> END
          │                       │
          │ search_memory (≤250ms, local runtime)
          │                       │
          └ recalled context injected as a LEADING system message
                                  │
                                  └ after_turn -> async cloud consolidation

The search node recalls the four lanes (profile, working, permanent, org) for the latest user turn; the LLM node sees that context before the user message; the persist node writes exactly one completed turn back (even if the graph did several internal hops).

Install

terminal

BASH

pip install "memoair-langchain>=0.1.0" langchain-openai
 
# add a transport for audio (LiveKit shown here):
pip install "livekit-agents[deepgram,openai,silero,turn-detector]" \
  livekit-plugins-langchain python-dotenv

memoair-langchain depends on memoair-voice (the bundled local runtime) plus langgraph and langchain-core. No Docker.

Get your API key, project ID, and agent ID

Sign in at dashboard.memoair.space and copy your memoair_pk_… API key, proj_… project ID, and agent_… agent ID (My Agents → Create Agent). Per-user isolation uses the user_id you pass — use the caller's participant identity.

Prebuilt voice graph (text-first)

MemoAirGraphMemory owns the voice session and builds the compiled graph. The runtime auto-starts on the first start().

quickstart.py

PYTHON

import asyncio
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage
from memoair_langchain import MemoAirGraphMemory
 
 
async def main() -> None:
    # One MemoAirGraphMemory == one voice session == one LangGraph thread.
    async with MemoAirGraphMemory(
        api_key="memoair_pk_...",
        project_id="proj_xxx",
        agent_id="agent_xxx",
        user_id="caller_phone_hash",
    ) as memory:
        # search -> llm -> persist, compiled. Hand to any LangGraph runner.
        graph = memory.build_graph(model=ChatOpenAI(model="gpt-4o-mini"))
        state = await graph.ainvoke(
            {"messages": [HumanMessage(content="when can I come in?")]}
        )
        print(state["messages"][-1].content)
 
 
if __name__ == "__main__":
    asyncio.run(main())

LangGraph brain + LiveKit transport

LiveKit handles audio/STT/TTS/turn-taking; the graph is the LLM via livekit.plugins.langchain.LLMAdapter(graph). MemoAir memory is wired inside the graph, so recall + persist work with no extra glue.

agent.py

PYTHON

import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli
from livekit.plugins import deepgram, openai, silero
from livekit.plugins import langchain as livekit_langchain
from livekit.plugins.turn_detector.english import EnglishModel
 
from memoair_langchain import MemoAirGraphMemory
 
load_dotenv()
 
 
async def entrypoint(ctx: JobContext) -> None:
    await ctx.connect()
 
    raw = ctx.room.local_participant.identity
    user_id = raw if isinstance(raw, str) and raw.strip() else "console-user"
 
    memory = MemoAirGraphMemory(
        api_key=os.environ["MEMOAIR_API_KEY"],
        project_id=os.environ["MEMOAIR_PROJECT_ID"],
        agent_id=os.environ["MEMOAIR_AGENT_ID"],
        user_id=user_id,
    )
    await memory.start()
 
    # The graph IS the LLM.
    graph = memory.build_graph(model=ChatOpenAI(model="gpt-4o-mini"))
 
    session = AgentSession(
        stt=deepgram.STT(),
        llm=livekit_langchain.LLMAdapter(graph=graph),
        tts=openai.TTS(),
        turn_detection=EnglishModel(),
        vad=silero.VAD.load(),
    )
 
    # Mandatory: close the session on shutdown so /v1/runtime/session/end fires
    # and the cloud consolidates the permanent lane. (Same gotcha as the
    # LiveKit drop-in — without it, cross-session memory is silently lost.)
    async def _close_memory() -> None:
        await memory.end(reason="worker_shutdown")
        await memory.aclose()
 
    ctx.add_shutdown_callback(_close_memory)
 
    await session.start(
        agent=Agent(instructions="You are a helpful voice assistant."),
        room=ctx.room,
    )
 
 
if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))

Reference example in the repo: examples/langchain/voice_agents/memoair_langgraph_agent.py.

Or use it as a tool

For function-calling agents where the LLM decides when to recall, use the async BaseTool in a ToolNode / create_react_agent.

tool.py

PYTHON

from memoair_langchain import MemoAirVoiceSearchTool
 
tool = memory.search_tool()          # from MemoAirGraphMemory
# or: MemoAirVoiceSearchTool(memory=voice_memory)
# pass to create_react_agent(tools=[tool, ...]) — async only

Design notes

•Recall before reply. The search node injects recalled context as a leading system message — no tool round-trip, no extra latency on the hot path.
•One persist per user turn. The persist node sits at the terminal node, so multi-step graphs still write a single completed turn.
•No checkpointer backing. MemoAir surfaces working/permanent/profile/org as retrieved context, not as LangGraph graph state. Use any standard LangGraph checkpointer for the graph's own control state — it's orthogonal.