AI Voice Assistant

Reachy AnyVoice

Your robot. Your models. Your rules. Zero cloud.

Ollama OpenAI OpenRouter faster-whisper 300+ Models

How It Works

Architecture: Mic → Brain → Speaker

Mic → faster-whisper (local STT) → Any LLM (Ollama / OpenAI / OpenRouter) → TTS → Robot Speaker + Emotions

Reachy Mini on desk

Any Model. Any Provider.

Run Qwen3 32B, Llama 3.1, Gemma, or any of 300+ models through a single app. Switch between local Ollama, OpenAI, and OpenRouter with one API call.

curl -X POST http://localhost:8042/config \ -d '{"provider":"ollama","model":"qwen3:32b"}'

Powered by the same universal model routing from AnyModel.

🔒

100% Private

Everything runs on your machine. No audio leaves your network. No API keys needed for local models.

🔄

Hot-Swap Models

Switch between Qwen3, Llama, Gemma, GPT-4o, or any OpenRouter model via the web UI at :8042.

Apple Silicon Optimized

Metal GPU acceleration. Qwen3 32B at ~30 tok/s on M4 Max. Sub-3s responses with Llama 3.1 8B.

🗣️

Wake Word

Say "Reachy" to activate. Ignores background noise, TV, music, and ambient speech.

😀

81 Robot Emotions

The LLM triggers expressive movements — happy, curious, surprised, thinking — automatically.

🌐

Web Search Built-in

Ask about news, weather, facts — the robot searches the internet and speaks results back.

Tested Models

ModelProviderLatencyQualityCost
Qwen3 32BOllama (local)~5s*ExcellentFree
Llama 3.1 8BOllama (local)~3sGoodFree
GPT-4o MiniOpenAI~2sGreat$0.15/1M tok
Llama 3.1 70BOpenRouter~2sExcellent$0.06/1M tok
Qwen3 Coder 30BOllama (local)~5s*GreatFree
Gemma 3 27BOllama (local)~4sGreatFree

* With num_ctx: 4096 optimization. Benchmarked on Apple M4 Max 128GB. Pull any model: ollama pull MODEL

Quick Start

1

Install Ollama + a model

brew install ollama && ollama pull llama3.1:8b
2

Install from Reachy Mini Control

Find reachy_local_voice in the marketplace and click Install. Or install via the daemon API.

3

Talk to your robot

Click Start. Say "Reachy, hello!" — it runs entirely on your machine.

Configuration

OLLAMA_MODEL=qwen3:32b # Any Ollama model LLM_PROVIDER=ollama # ollama | openai | openrouter OLLAMA_URL=http://localhost:11434 WHISPER_MODEL=base # tiny | base | small | medium | large-v3 MACOS_VOICE=Samantha # macOS TTS voice name WAKE_WORD=reachy # Activation word ENABLE_WAKE_WORD=true # true | false SILENCE_THRESHOLD=0.015 # Mic sensitivity