Real-time voice conversations
Speak naturally and hear AI responses with low latency. Full-duplex audio with barge-in support — interrupt the AI mid-sentence just like a real conversation.
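At its core, barge-in means dropping any queued AI audio the moment the user starts speaking, so playback cuts off instantly. A minimal TypeScript sketch; the queue class and the idea of wiring it to a speech-start event are illustrative assumptions, not the apps' actual playback code:

```typescript
// Sketch of barge-in handling: queued AI audio chunks are flushed
// when the user starts speaking, so nothing stale keeps playing.
class PlaybackQueue {
  private chunks: Uint8Array[] = [];

  enqueue(chunk: Uint8Array): void {
    this.chunks.push(chunk);
  }

  get length(): number {
    return this.chunks.length;
  }

  // Call this on a user speech-start event (the barge-in moment).
  flush(): void {
    this.chunks = [];
  }
}
```

The key design point is that barge-in is a client-side concern: the provider keeps streaming, but the client discards audio it has not yet played.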
Multi-provider support
Switch between Gemini and OpenAI models without changing client code. The relay server handles protocol translation transparently.
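Provider switching can be pictured as a single configuration message sent at session start. The TypeScript sketch below assumes hypothetical `session.config` field names and placeholder model ids; the real relay protocol's wire format may differ:

```typescript
// Hypothetical session.config message shape for the relay protocol.
type Provider = "gemini" | "openai";

interface SessionConfig {
  type: "session.config";
  provider: Provider;
  model: string; // placeholder ids below; use real model names in practice
}

function buildSessionConfig(provider: Provider, model: string): SessionConfig {
  return { type: "session.config", provider, model };
}

// Swapping providers changes only this one message; the rest of the
// client code (audio.append out, audio.delta in) stays identical.
const gemini = buildSessionConfig("gemini", "gemini-live-model");
const openai = buildSessionConfig("openai", "openai-realtime-model");
```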
Brain agent
An async tool-calling agent that gives the voice AI access to web search, calendars, tasks, memory, and more — powered by any OpenAI-compatible agent.
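An async tool call can be sketched as a dispatcher that emits a progress event while the tool runs and a result event when it finishes. The event names echo the `tool.call` / `tool.progress` messages in the relay protocol; the registry, callback shape, and `tool.result` name are illustrative assumptions:

```typescript
// Sketch of async tool dispatch in a brain-agent style loop.
type RelayEvent = { type: string; callId: string; payload?: unknown };
type Emit = (event: RelayEvent) => void;
type Tool = (args: Record<string, unknown>) => Promise<unknown>;

const tools: Record<string, Tool> = {
  // Placeholder tool; real tools would hit web search, calendars, etc.
  async web_search({ query }) {
    return `results for ${String(query)}`;
  },
};

async function handleToolCall(
  name: string,
  callId: string,
  args: Record<string, unknown>,
  emit: Emit
): Promise<void> {
  // Progress first, so the voice side can tell the user work is underway.
  emit({ type: "tool.progress", callId, payload: { status: "running" } });
  const result = await tools[name](args);
  emit({ type: "tool.result", callId, payload: result });
}
```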
Screen sharing
Share your screen on desktop so the AI can see what you see. JPEG frames streamed at 1 FPS with context window compression.
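Streaming at 1 FPS amounts to a simple throttle in front of the frame sender. The `frame.append` message name matches the protocol diagram below; the throttle helper itself is an illustrative assumption:

```typescript
// Sketch of a 1 FPS throttle for screen-share frames.
class FrameThrottle {
  private lastSentMs = -Infinity;

  constructor(private intervalMs = 1000) {} // 1000 ms => 1 FPS

  // Returns true if enough time has passed to send another frame.
  shouldSend(nowMs: number): boolean {
    if (nowMs - this.lastSentMs < this.intervalMs) return false;
    this.lastSentMs = nowMs;
    return true;
  }
}

// Assumed frame.append payload shape (base64 JPEG).
function frameMessage(jpegBase64: string) {
  return { type: "frame.append", mimeType: "image/jpeg", data: jpegBase64 };
}
```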
Session resumption
Gemini sessions survive network drops and transparently reconnect. OpenAI sessions rotate with transcript summaries to maintain context.
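Transparent reconnection usually pairs a stored resumption handle with exponential backoff. A minimal sketch of the backoff side; the function name and delay constants are assumptions for illustration:

```typescript
// Capped exponential backoff for reconnect attempts after a network drop.
// attempt 0 -> 500 ms, 1 -> 1 s, 2 -> 2 s, ... capped at 10 s.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 10_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}
```

On reconnect, a Gemini-style client would resend its stored resumption handle so the provider restores the session; an OpenAI-style client would instead open a fresh session seeded with the transcript summary.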
Conversation history
Local SQLite storage on both mobile and desktop with full transcript search and conversation continuity.
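A plausible shape for the local store is a conversations table, a transcripts table, and an FTS5 virtual table backing transcript search. The table and column names below are assumptions, not the apps' actual schema:

```typescript
// Hypothetical SQLite schema for local conversation history.
const conversationSchema = `
CREATE TABLE IF NOT EXISTS conversations (
  id INTEGER PRIMARY KEY,
  started_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS transcripts (
  id INTEGER PRIMARY KEY,
  conversation_id INTEGER NOT NULL REFERENCES conversations(id),
  role TEXT NOT NULL,     -- "user" or "assistant"
  text TEXT NOT NULL,
  created_at TEXT NOT NULL
);
-- Full-text index backing transcript search.
CREATE VIRTUAL TABLE IF NOT EXISTS transcripts_fts
  USING fts5(text, content=transcripts, content_rowid=id);
`;
```

Keeping the same schema on mobile and desktop is what makes conversation continuity straightforward: either client can replay the other's history.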
Relay Server
TypeScript / Node.js WebSocket relay that translates between clients and AI providers. Handles brain agent calls, session management, and observability via Langfuse.
Desktop App
Electron + React + Tailwind macOS voice assistant with screen sharing capabilities.
Mobile App
React Native / Expo iOS voice assistant with native audio I/O and conversation history.
Clone and install
```
git clone https://github.com/yagudaev/voiceclaw.git
cd voiceclaw
yarn install
```

Start the relay server
```
cd relay-server
cp .env.example .env  # add your API keys
yarn dev
```

Start a client
Desktop:

```
cd desktop
yarn dev
```

Mobile:

```
cd mobile
yarn dev
```

```
+-----------+       WebSocket        +---------------+       WebSocket        +----------------+
|           | ---session.config---> |               | ---provider setup---> |                |
|  Client   | ---audio.append-----> | Relay Server  | ---audio stream-----> |  AI Provider   |
|  (mobile  | <---audio.delta------ |               | <---model audio------ |  (Gemini /     |
|   or      | <---transcript.delta- | - protocol    | <---transcription---- |   OpenAI)      |
|  desktop) | ---frame.append-----> |   translate   | ---video frames-----> |                |
|           | <---tool.call-------- | - brain agent |                       +----------------+
|           | <---tool.progress---- | - tracing     |
+-----------+                       +---------------+
```

The relay server sits between clients and AI providers. It normalizes the different provider protocols into a single, clean WebSocket API. Clients never talk directly to Gemini or OpenAI — they speak the relay protocol, and the relay handles the translation.
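The translation step can be sketched as a pure function: one relay-protocol message in, one provider-specific message out. The Gemini and OpenAI payload shapes below are rough approximations for illustration, not the exact provider wire formats:

```typescript
// Sketch of the relay's per-message protocol translation.
type ClientMsg =
  | { type: "audio.append"; data: string }  // base64 PCM audio
  | { type: "frame.append"; data: string }; // base64 JPEG frame

function toProvider(
  msg: ClientMsg,
  provider: "gemini" | "openai"
): Record<string, unknown> {
  if (provider === "gemini") {
    // Gemini Live accepts interleaved realtime media chunks (approximate shape).
    const mimeType = msg.type === "audio.append" ? "audio/pcm" : "image/jpeg";
    return { realtimeInput: { mediaChunks: [{ mimeType, data: msg.data }] } };
  }
  // OpenAI Realtime appends audio to an input buffer (approximate shape).
  return { type: "input_audio_buffer.append", audio: msg.data };
}
```

Because translation is isolated here, adding a new provider means adding one more branch, with no client changes.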