Desktop App
The desktop app is a macOS Electron application that provides a voice assistant interface with screen sharing capabilities. It connects to the relay server using the same WebSocket protocol as the mobile app.
Features
- Voice conversations — full-duplex audio with barge-in support via Web Audio API
- Screen sharing — share any window or your entire screen with the AI (Gemini only)
- Conversation history — local SQLite database stores all conversations with searchable transcripts
- Audio device selection — choose input and output devices
- Volume control — adjustable playback volume
- Dark/light theme — follows system preference or manual toggle
- Tab navigation — Chat, History, and Settings pages
- Auto-reconnect — reconnects up to 3 times on unexpected disconnects
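The auto-reconnect limit can be pictured as a small attempt counter. This is a minimal sketch, not the app's actual code: the `ReconnectPolicy` class, its method names, and the choice to reset the counter after a successful connection are all assumptions made for illustration. Only the limit of 3 comes from the list above.

```typescript
// Limit taken from the feature list; everything else here is hypothetical.
const MAX_RECONNECT_ATTEMPTS = 3;

class ReconnectPolicy {
  private attempts = 0;
  private readonly maxAttempts: number;

  constructor(maxAttempts: number = MAX_RECONNECT_ATTEMPTS) {
    this.maxAttempts = maxAttempts;
  }

  /** Call when the socket closes. Returns true if another attempt should be made. */
  shouldReconnect(userInitiated: boolean): boolean {
    // A disconnect the user asked for is not "unexpected", so never retry it.
    if (userInitiated) return false;
    if (this.attempts >= this.maxAttempts) return false;
    this.attempts += 1;
    return true;
  }

  /** Call after a successful connection (assumed behavior). */
  reset(): void {
    this.attempts = 0;
  }
}
```

A WebSocket `close` handler could call `shouldReconnect(false)` and open a new connection only when it returns `true`.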
Tech Stack
- Electron 35 — desktop runtime
- React 19 — UI framework
- Tailwind CSS 3 — styling
- electron-vite — build tooling (Vite for renderer, esbuild for main)
- better-sqlite3 — local conversation storage
- lucide-react — icons
- Web Audio API — microphone capture and audio playback
Building from Source
    cd desktop
    yarn install
    yarn dev        # start in development mode (hot reload)
    yarn build      # compile renderer + main process
    yarn typecheck  # run TypeScript type checking
Screen Sharing
Screen sharing lets the AI see your screen content in real time. This uses Electron’s desktopCapturer API.
How it works:
- Click the screen share button in the chat interface
- Pick a screen source from the available windows/displays
- The ScreenCapture class captures frames at 1 FPS
- Frames are resized to max 768px and exported as JPEG (70% quality)
- Sent to the relay server as frame.append messages
- The relay forwards frames to the AI provider
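The resize and send steps above can be sketched as two pure functions. This is an illustration under stated assumptions, not the ScreenCapture implementation: it assumes "max 768px" refers to the longest side, and the `FrameAppendMessage` fields other than the `frame.append` type are hypothetical, since the relay's payload format is not described here.

```typescript
// Values taken from the steps above.
const MAX_DIMENSION = 768;        // assumed to bound the longest side
const JPEG_QUALITY = 0.7;         // 70% quality
const CAPTURE_INTERVAL_MS = 1000; // 1 FPS

interface Size {
  width: number;
  height: number;
}

/** Scale a frame down so its longest side is at most maxDim, keeping aspect ratio. */
function fitWithin(src: Size, maxDim: number = MAX_DIMENSION): Size {
  const longest = Math.max(src.width, src.height);
  if (longest <= maxDim) return { width: src.width, height: src.height };
  const scale = maxDim / longest;
  return {
    width: Math.max(1, Math.round(src.width * scale)),
    height: Math.max(1, Math.round(src.height * scale)),
  };
}

// Hypothetical message shape; only the "frame.append" type name comes from the text.
interface FrameAppendMessage {
  type: "frame.append";
  data: string;            // base64-encoded JPEG bytes
  mimeType: "image/jpeg";
}

/** Wrap an already-encoded JPEG frame in a message for the relay. */
function buildFrameAppend(base64Jpeg: string): FrameAppendMessage {
  return { type: "frame.append", data: base64Jpeg, mimeType: "image/jpeg" };
}
```

In a renderer process, one way to produce the JPEG is to draw the frame onto a canvas sized with `fitWithin` and export it with `canvas.toDataURL("image/jpeg", JPEG_QUALITY)`, repeating every `CAPTURE_INTERVAL_MS` milliseconds.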
Audio Engine
The desktop app uses the Web Audio API for audio:
- Capture: getUserMedia at 24kHz mono with echo cancellation and noise suppression
- Playback: AudioBufferSourceNode chain with a GainNode for volume control
- Format: PCM16 at 24kHz, base64 encoded (matching the relay protocol)
- Frame size: 2400 samples (100ms chunks)
- Input level: RMS computed per capture frame for the level meter UI
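The format, frame size, and level-meter items above come down to a little arithmetic per capture frame. The sketch below shows one plausible way to do it. The function names are hypothetical, and little-endian byte order is an assumption, since the text only says PCM16.

```typescript
// Constants taken from the list above.
const SAMPLE_RATE = 24000;
const FRAME_SAMPLES = 2400; // 2400 / 24000 = 0.1 s, i.e. 100 ms per chunk

/** Convert Float32 samples in [-1, 1] to signed 16-bit PCM (little-endian assumed). */
function floatToPcm16(samples: Float32Array): Uint8Array {
  const bytes = new Uint8Array(samples.length * 2);
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range input cannot overflow the 16-bit range.
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(i * 2, Math.round(s < 0 ? s * 0x8000 : s * 0x7fff), true);
  }
  return bytes;
}

/** Base64-encode raw bytes. btoa is a global in browsers and in Node 16+. */
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

/** Root-mean-square level of one frame: 0 for silence, 1 for a full-scale square wave. */
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < samples.length; i++) {
    sum += samples[i] * samples[i];
  }
  return Math.sqrt(sum / samples.length);
}
```

One 100ms frame of 2400 samples becomes 4800 bytes of PCM16, which base64 expands to 6400 characters before it goes on the wire.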
Settings
Configurable via the Settings page:
- Server URL — relay server WebSocket address
- API Key — relay server authentication key
- Provider — Gemini or OpenAI
- Model — provider-specific model identifier
- Voice — voice selection
- Brain Agent — enable/disable the brain agent
- Audio devices — input and output device selection
- Volume — playback volume slider
- Theme — dark, light, or system
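As a rough picture of how these settings might be typed in the renderer, here is a hypothetical `AppSettings` shape with two small validators. None of the field names come from the app's source, and the 0 to 1 volume range is an assumption chosen to match a GainNode gain value.

```typescript
type Provider = "gemini" | "openai";
type Theme = "dark" | "light" | "system";

// Hypothetical field names mirroring the Settings page entries above.
interface AppSettings {
  serverUrl: string;             // relay server WebSocket address
  apiKey: string;                // relay server authentication key
  provider: Provider;
  model: string;                 // provider-specific model identifier
  voice: string;
  brainAgentEnabled: boolean;
  inputDeviceId: string | null;  // null means the system default (assumed)
  outputDeviceId: string | null;
  volume: number;                // assumed range 0..1
  theme: Theme;
}

/** Accept only ws:// and wss:// addresses for the relay server. */
function isValidServerUrl(value: string): boolean {
  try {
    const url = new URL(value);
    return url.protocol === "ws:" || url.protocol === "wss:";
  } catch {
    return false; // not parseable as a URL at all
  }
}

/** Keep the volume inside the assumed 0..1 range; non-finite input becomes 0. */
function clampVolume(value: number): number {
  if (!Number.isFinite(value)) return 0;
  return Math.min(1, Math.max(0, value));
}
```

Validating the server URL before connecting gives the Settings page a chance to show an error instead of failing at connect time.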