Frequently Asked Questions

Everything you need to know about MojoVoice — privacy-first voice dictation for developers.

Getting Started

What is MojoVoice?

MojoVoice is a privacy-first voice dictation tool that runs 100% locally on your machine. It uses GPU acceleration for sub-second transcription and works completely offline with zero telemetry or cloud dependencies.

How do I install MojoVoice?

Download MojoVoice from the GitHub releases page. On Linux with CUDA, extract the archive and run the binary. On macOS, download the .dmg installer. The desktop app is available as an AppImage (Linux) and a .dmg (macOS). See the documentation for detailed installation steps.

Does MojoVoice work offline?

Yes, MojoVoice works completely offline. All speech recognition runs locally on your machine, on the GPU when one is available, with zero cloud dependencies. Your voice data never leaves your machine.

Is MojoVoice free and open source?

Yes, MojoVoice is 100% free and open source under the MIT license. You can audit the code, contribute features, and customize it for your needs on GitHub.

Platform & Compatibility

Which platforms does MojoVoice support?

MojoVoice supports Linux (Wayland and X11) and macOS (Apple Silicon and Intel). Windows support is planned for v1.0. GPU acceleration works with CUDA on Linux and Metal on macOS.

Do I need a GPU to use MojoVoice?

While a GPU provides the best performance (sub-second transcription), MojoVoice also works on CPU-only systems. NVIDIA GPUs with CUDA and Apple Silicon Macs with Metal provide optimal speed.

What are the system requirements?

MojoVoice requires Linux (Wayland or X11) or macOS. GPU acceleration needs an NVIDIA GPU with CUDA on Linux, or an Apple Silicon or Intel Mac with Metal. Plan for at least 4 GB of RAM for smaller models; 8 GB or more is recommended for larger models.

Can I use MojoVoice with my IDE or terminal?

Yes, MojoVoice integrates seamlessly with any application. Press your configured hotkey, speak, and transcribed text appears at your cursor in your IDE, terminal, browser, or any text input.

Features & Capabilities

How accurate is MojoVoice for technical terms?

MojoVoice uses OpenAI Whisper models, which are trained on a large and varied corpus of transcribed audio. It accurately transcribes technical terms such as Kubernetes and GraphQL, identifiers spoken in camelCase or snake_case, and complex CLI commands that other tools often miss.

Which Whisper models can I use?

MojoVoice includes 31 pre-configured Whisper models ranging from tiny (75MB, fastest) to large-v3-turbo (1.5GB, most accurate). You can switch models dynamically to balance speed, accuracy, and VRAM usage for your hardware.
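
As a rough illustration of that trade-off, the sketch below picks the largest model that fits in the available VRAM. It is not MojoVoice's own selection logic: the `ModelInfo` type, the `pick_model` function, and the 1.5x memory headroom are assumptions made for this example, and the catalog lists only the two sizes quoted above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelInfo:
    name: str
    size_mb: int  # approximate model file size


# Only the two sizes quoted in this FAQ; a real catalog would list more.
CATALOG = [
    ModelInfo("tiny", 75),
    ModelInfo("large-v3-turbo", 1500),
]


def pick_model(free_vram_mb: int, headroom: float = 1.5) -> Optional[ModelInfo]:
    """Return the largest model whose size times `headroom` fits in free VRAM.

    `headroom` is a guess at how much more memory inference needs than the
    model file itself; tune it for your hardware.
    """
    fitting = [m for m in CATALOG if m.size_mb * headroom <= free_vram_mb]
    return max(fitting, key=lambda m: m.size_mb) if fitting else None
```

With these numbers, `pick_model(8000)` selects large-v3-turbo, `pick_model(1000)` falls back to tiny, and `pick_model(50)` returns `None`.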

Does MojoVoice support multiple languages?

Yes, Whisper models support 99+ languages. MojoVoice inherits this multilingual capability and can transcribe in any language supported by the Whisper model you've selected.

Does MojoVoice have a desktop app?

Yes, MojoVoice includes a native desktop app with a glassmorphic UI for model management, transcription history, audio device selection, and real-time status monitoring. Available as AppImage, .deb, and .dmg.

Can I customize hotkeys and settings?

Yes, MojoVoice offers full customization of hotkeys, audio device selection, model preferences, and performance settings. Tailor every aspect to match your workflow and hardware capabilities.

Performance

How fast is voice transcription?

With GPU acceleration, MojoVoice achieves sub-second transcription latency. On an NVIDIA GPU with CUDA or Apple Silicon with Metal, typical 3-5 second voice clips transcribe in under 1 second, maintaining your flow state.
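
One common way to express these numbers is the real-time factor (RTF): processing time divided by audio duration, where anything below 1.0 is faster than real time. The sketch below is a generic helper, not part of MojoVoice. The `transcribe` argument is a hypothetical stand-in for whatever function runs the model on your setup.

```python
import time
from typing import Any, Callable


def real_time_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Processing time divided by audio duration; lower is faster."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(transcribe: Callable[[Any], Any], audio: Any, audio_seconds: float) -> float:
    """Time one call to `transcribe` (hypothetical) and return its RTF."""
    start = time.perf_counter()
    transcribe(audio)
    elapsed = time.perf_counter() - start
    return real_time_factor(audio_seconds, elapsed)
```

By this measure, a 5-second clip transcribed in 1 second has an RTF of 0.2, and a 3-second clip transcribed in 1 second has an RTF of about 0.33.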

How does the mojo-audio engine improve performance?

MojoVoice uses a custom mojo-audio engine written in Mojo for mel-spectrogram processing. This provides 10x faster performance compared to traditional implementations while maintaining 99.9% accuracy for Whisper inference.
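
For readers unfamiliar with this step, here is a minimal NumPy sketch of what a log-mel spectrogram front-end computes. It is a simplified reference, not the mojo-audio implementation and not Whisper's exact preprocessing: the HTK-style mel scale and the lack of frame centering are simplifying assumptions, while the 16 kHz sample rate, 400-sample window, 160-sample hop, and 80 mel bands follow Whisper's published defaults.

```python
import numpy as np

SAMPLE_RATE = 16_000  # Hz, the rate Whisper expects
N_FFT = 400           # 25 ms analysis window
HOP_LENGTH = 160      # 10 ms hop between frames
N_MELS = 80           # mel bands


def hz_to_mel(hz):
    # HTK-style mel scale (an assumption for this sketch).
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=np.float64) / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=np.float64) / 2595.0) - 1.0)


def mel_filterbank(n_mels=N_MELS, n_fft=N_FFT, sample_rate=SAMPLE_RATE):
    """Triangular filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sample_rate / 2, n_fft // 2 + 1)
    # n_mels + 2 edge points, evenly spaced on the mel scale.
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_mels + 2))
    filters = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        left, center, right = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        filters[i] = np.maximum(0.0, np.minimum(rising, falling))
    return filters


def log_mel_spectrogram(audio):
    """Return a (N_MELS, n_frames) log-mel spectrogram for mono float audio."""
    audio = np.asarray(audio, dtype=np.float64)
    if audio.size < N_FFT:
        audio = np.pad(audio, (0, N_FFT - audio.size))
    n_frames = 1 + (audio.size - N_FFT) // HOP_LENGTH
    window = np.hanning(N_FFT)
    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack(
        [audio[i * HOP_LENGTH : i * HOP_LENGTH + N_FFT] for i in range(n_frames)]
    ) * window
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_frames, N_FFT // 2 + 1)
    mel = mel_filterbank() @ power.T                   # (N_MELS, n_frames)
    return np.log10(np.maximum(mel, 1e-10))            # floor avoids log(0)
```

Passing one second of 16 kHz audio to `log_mel_spectrogram` yields an array of shape (80, 98). The per-frame FFT and the filterbank multiply are the hot path that an optimized engine would target.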

Privacy & Security

Is my voice data sent to any servers?

No, absolutely not. MojoVoice processes all audio 100% locally on your machine. There is zero telemetry, zero cloud connectivity, and zero data collection. Your privacy is guaranteed by design.

How does MojoVoice compare to cloud services like Google Speech?

Unlike cloud services, MojoVoice processes everything locally: there is no network round-trip latency, no data leaves your machine, and it works offline. It's also specifically optimized for developer workflows and technical vocabulary.

Comparison & Support

What's the difference between MojoVoice and Talon Voice?

MojoVoice focuses on GPU-accelerated transcription with 31 Whisper models and a visual desktop app, while Talon Voice is primarily a voice command system. MojoVoice is MIT-licensed open source with zero cloud dependencies.

Can I use MojoVoice for RSI or accessibility?

Yes, MojoVoice is excellent for repetitive strain injury (RSI) prevention and accessibility. Voice dictation reduces keyboard usage and enables hands-free development workflows for developers with mobility challenges.

Where can I get support or report issues?

Join GitHub Discussions for community support, report bugs via GitHub Issues, or check the documentation at mojovoice.ai. The project is actively maintained with regular updates and responsive issue handling.

Still have questions?

Join our community and get help from developers using MojoVoice every day.