Connecting OpenClaw
OpenClaw is an open-source AI agent framework that gives your voice assistant superpowers — web search, calendar management, task tracking, long-term memory, and more. VoiceClaw acts as a thin voice layer on top of OpenClaw: you bring the agent, VoiceClaw gives it a voice.
When the voice model needs help (web lookups, scheduling, memory recall), it calls the `ask_brain` tool. The relay server forwards that query to your OpenClaw instance via the standard `/v1/chat/completions` endpoint with SSE streaming.
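That forwarding step is an ordinary chat-completions request with streaming enabled. The helper below is a minimal sketch of how the relay might build it; the `Authorization` header name, the payload fields, and the function itself are illustrative assumptions, not a documented contract:

```typescript
// Hypothetical sketch of the ask_brain forwarding request.
// gatewayUrl and token correspond to the BRAIN_GATEWAY_URL and
// BRAIN_GATEWAY_AUTH_TOKEN settings used later in this guide.
interface BrainRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildBrainRequest(
  query: string,
  gatewayUrl: string,
  token: string
): BrainRequest {
  return {
    url: `${gatewayUrl}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`, // header name is an assumption
      },
      // stream: true asks OpenClaw to answer over SSE
      body: JSON.stringify({
        stream: true,
        messages: [{ role: "user", content: query }],
      }),
    },
  };
}
```

The relay would then pass this to `fetch(req.url, req.init)` and read the response body as an SSE stream.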
Prerequisites
- Docker or Node.js 20+
- OpenClaw installed and running
- VoiceClaw relay server cloned and ready
1. Install and run OpenClaw

   Follow the OpenClaw README to get it running locally. By default it starts a gateway on port 18789.

   ```shell
   # Example with Docker
   docker run -d -p 18789:18789 openclaw/openclaw
   ```

   Once running, find your auth token:

   ```shell
   cat ~/.openclaw/openclaw.json | grep token
   ```
2. Configure the VoiceClaw relay server

   In your relay server `.env` file, point at your OpenClaw instance:

   ```shell
   BRAIN_GATEWAY_URL=http://localhost:18789
   BRAIN_GATEWAY_AUTH_TOKEN=<your-openclaw-token>
   ```
3. Start the relay server

   ```shell
   cd relay-server
   yarn dev
   ```

   You should see a log line confirming the brain agent is reachable when you make your first voice query that triggers `ask_brain`.
4. Connect from a client

   Open the desktop or mobile app, make sure `brainAgent` is set to `"enabled"` in the session config, and start talking. Ask something like “What’s on my calendar today?” The voice model will call `ask_brain`, the relay will hit OpenClaw, and you will hear the answer spoken back.
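For reference, the relevant part of the session config might look like the fragment below. The `brainAgent` key and its `"enabled"` value come from this guide; the surrounding shape is an illustrative assumption:

```json
{
  "brainAgent": "enabled"
}
```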
System prompt for your OpenClaw agent
Paste this into your OpenClaw agent’s system instructions so it knows how to behave as a voice backend:
```
You are the brain behind a voice assistant called VoiceClaw.

You receive questions via the ask_brain tool when the realtime voice model
needs help -- web search, calendar lookups, task management, memory recall,
file operations, or any knowledge beyond basic conversation.

Guidelines:
- Respond concisely. Your answers will be spoken aloud, not read on a screen.
- Keep responses to 2-3 sentences when possible. The user is listening.
- Skip formatting (no markdown, no bullet lists, no headers). Plain text only.
- Lead with the answer, then add context if needed.
- If you performed an action (created a task, added a calendar event), confirm it in one sentence.
- If you do not know something, say so briefly rather than guessing.
- You have access to tools like web search, calendar, memory, and file operations. Use them freely -- you are the capable backend, the voice model is just the interface.
```

How it works under the hood
1. The voice model (Gemini or OpenAI) decides it needs help and calls the `ask_brain` tool with a query.
2. The relay immediately returns `{"status": "searching"}` so the voice model can say something like “Let me check on that…” while the brain works.
3. The relay sends a `POST /v1/chat/completions` request to OpenClaw with SSE streaming enabled.
4. As OpenClaw works, it emits `step_complete` events that the relay forwards to the client as `tool.progress` messages (useful for showing live search status in the UI).
5. When the final answer arrives, the relay injects it back into the voice conversation via `injectContext()`, and the AI speaks the result naturally.
6. On disconnect, the full conversation transcript is synced to OpenClaw for long-term memory.
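The middle steps of that flow amount to routing each SSE `data:` line to the right destination. The sketch below shows one plausible shape of that routing; the event names (`step_complete`) and message type (`tool.progress`) come from the list above, while the payload field names and the `final_answer` event are assumptions for illustration:

```typescript
// Hypothetical relay-side routing for one line of the SSE stream
// from OpenClaw. Payload field names are assumed, not documented.
type ClientMessage =
  | { type: "tool.progress"; step: string } // forwarded step_complete
  | { type: "final"; text: string }         // answer handed to injectContext()
  | null;                                    // ignorable line

function routeSseEvent(line: string): ClientMessage {
  if (!line.startsWith("data: ")) return null; // comments, event: lines, etc.
  const payload = line.slice("data: ".length);
  if (payload === "[DONE]") return null;       // common SSE stream terminator
  const event = JSON.parse(payload);
  if (event.type === "step_complete") {
    return { type: "tool.progress", step: event.step ?? "" };
  }
  if (event.type === "final_answer") {
    return { type: "final", text: event.text ?? "" };
  }
  return null;
}
```

A `"tool.progress"` result would be pushed to the client for live status, while a `"final"` result would be fed to `injectContext()` so the voice model can speak it.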