🎙
RP AIA · an area of the work

Dictation

Push-to-talk speech → text, offline and private. Wispr Flow, replaced.

LIVE

The need it answers

Typing is slow — and the fastest dictation tools are cloud-bound, sending your voice, your half-formed thoughts, and your jargon to someone else's server. Dictation makes speaking to your machine effortless and completely private: push to talk, get clean text in any app, fully on-device — a Wispr Flow you actually own.

What it is

A privacy-first voice-to-text tool. Push-to-talk only — no always-on mic. A local faster-whisper engine transcribes, a vocabulary layer corrects your jargon, and the text auto-pastes into any app. It can read text back aloud, fully offline. Every dictation is saved to a personal repository.
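The flow described above (transcribe, correct, paste, save) can be sketched as a small pipeline. This is a minimal illustration, not the tool's actual code: the function names, the JSONL log format, and the injected `correct` and `paste` callables are all assumptions. Only the `WhisperModel(...).transcribe(...)` call comes from the faster-whisper library itself.

```python
import json
import time
from pathlib import Path
from typing import Callable


def transcribe_with_faster_whisper(audio_path: str, model_size: str = "base") -> str:
    """Transcribe one recorded clip locally with faster-whisper."""
    # Imported lazily so the rest of the sketch runs without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(audio_path)
    return " ".join(segment.text.strip() for segment in segments)


def dictate(
    audio_path: str,
    transcribe: Callable[[str], str],
    correct: Callable[[str], str],
    paste: Callable[[str], None],
    repository: Path,
) -> str:
    """Run one push-to-talk clip through transcribe -> correct -> paste -> save."""
    raw = transcribe(audio_path)
    text = correct(raw)
    paste(text)
    # Hypothetical storage: an append-only JSONL file standing in for the
    # "personal repository" mentioned in the text.
    record = {"time": time.time(), "raw": raw, "text": text}
    with repository.open("a", encoding="utf-8") as log:
        log.write(json.dumps(record, ensure_ascii=False) + "\n")
    return text
```

Passing the stages in as callables keeps the sketch testable without a microphone or a model download. In real use, `transcribe` would be `transcribe_with_faster_whisper` and `paste` would be whatever clipboard or keystroke mechanism the platform offers.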

The evolution
How it was distilled, and what shaped it

🌱 Seed
Push-to-talk speech → text, fully on-device.
← shaped by typing is slow and cloud dictation leaks your words.
🛤 Path
Built a local faster-whisper engine + push-to-talk hotkey + auto-paste into any app.
← shaped by privacy-first — no always-on mic, nothing leaves the machine.
🔀 Pivot
From raw transcription to a vocabulary corrector that learns your jargon (VANI, LIPI, 4M SAI…) so the words come out right.
← shaped by raw transcripts mangle domain terms; fidelity matters more than raw speed.
💎 Crystal
STT → vocabulary correction → auto-clipboard → saved to a personal repository, with offline read-aloud. A working Wispr-Flow replacement.
← shaped by a complete daily-driver, not a demo.
⭐ Principle
Speak naturally, get clean text anywhere, privately — routing correction depth by need.
← shaped by voice as the natural, private way in.
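The vocabulary correction step from the Pivot can be illustrated with fuzzy matching against a list of known terms. This is a hedged sketch using Python's standard `difflib`. The source does not say how the real corrector works or learns, so the matching strategy, the `cutoff` value, and the `correct_vocabulary` name are assumptions. Only the single-word terms VANI and LIPI are included here; multi-word jargon would need phrase-level matching.

```python
import difflib
import re

# Canonical jargon terms taken from the text; the matching rules are assumptions.
VOCABULARY = ["VANI", "LIPI"]


def correct_vocabulary(text: str, vocabulary=VOCABULARY, cutoff: float = 0.75) -> str:
    """Replace words that closely resemble a known term with its canonical form."""
    lookup = {term.lower(): term for term in vocabulary}

    def fix(match: re.Match) -> str:
        word = match.group(0)
        # get_close_matches returns the best candidates scoring at least `cutoff`.
        close = difflib.get_close_matches(word.lower(), list(lookup), n=1, cutoff=cutoff)
        return lookup[close[0]] if close else word

    # Visit each alphanumeric token and leave spacing and punctuation untouched.
    return re.sub(r"[A-Za-z0-9]+", fix, text)


print(correct_vocabulary("open vanni and lipi"))
```

A purely string-based matcher like this will also rewrite ordinary words that happen to resemble a term (for example "van"), so a practical version would likely need context or a per-user confirmation step.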

Where we stand today
Built & working

What's next
On the path

★ the moonshot

Ensembled multi-model transcription with on-demand arbitration — routing error-correction depth by need: noise to the ensemble, jargon to vocabulary, long-form to a relay pool.
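One way to picture "routing correction depth by need" is a small router that inspects a clip and picks which correction stages to apply. Since this is a roadmap item, everything below is hypothetical: the `Clip` fields, the thresholds, and the route names are illustrative assumptions, not a description of an existing implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Clip:
    duration_s: float   # length of the recording in seconds
    noise_level: float  # hypothetical score, 0.0 (clean) to 1.0 (very noisy)
    has_jargon: bool    # hypothetical flag, e.g. set when known terms are likely


def route(clip: Clip, noise_threshold: float = 0.5, long_form_s: float = 60.0) -> List[str]:
    """Pick correction stages for a clip; thresholds are illustrative only."""
    routes = []
    if clip.noise_level >= noise_threshold:
        routes.append("ensemble")      # noisy audio goes to the multi-model ensemble
    if clip.has_jargon:
        routes.append("vocabulary")    # jargon goes to the vocabulary layer
    if clip.duration_s >= long_form_s:
        routes.append("relay_pool")    # long-form audio goes to the relay pool
    # Clean, short, jargon-free clips need only the default single-model pass.
    return routes or ["single_model"]


print(route(Clip(duration_s=5.0, noise_level=0.1, has_jargon=False)))
print(route(Clip(duration_s=120.0, noise_level=0.8, has_jargon=True)))
```

Returning a list rather than a single route lets one clip receive several kinds of correction when it is, say, both noisy and full of jargon.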
