Bug#1051812: RFP: vosk-api -- Offline speech recognition API

Otto Kekäläinen Wed, 25 Mar 2026 05:27:13 -0700

Hi,

I see that this RFP from 2023 was never picked up, but the need for a
good voice-to-text, voice input or dictation tool in Debian is still
there.


It seems Mozilla DeepSpeech stopped development back in 2021, and
its successor, https://github.com/coqui-ai/TTS, also stopped
development 2 years ago. Now, with the rise of LLMs and various
related technologies, there is a range of options both for the speech
recognition part and for the UI integration part.

A few years back users had to choose between legacy speech-to-text
systems with high error rates and newer LLM-based models that were
heavy to run, mainly Whisper and Parakeet. Now, in 2026, we seem to
have lightweight LLM-based options that look superior, e.g.
https://github.com/kyutai-labs/pocket-tts and
https://github.com/SYSTRAN/faster-whisper

Voxtype seems to have a nice overview at https://voxtype.io/compare/
Voxtype itself also seems nice, and apparently it is packaged in Arch and NixOS.

However, programs like Voxtype, Vocalinux
(https://github.com/jatinkrmalik/vocalinux), Handy
(https://github.com/cjpais/Handy) or Blurt
(https://extensions.gnome.org/extension/6742/blurt/) are apps or
extensions. An entirely different approach would be to hook into the
IBus input framework, as done in e.g.
https://github.com/PhilippeRo/IBus-Speech-To-Text and
https://github.com/GitJuhb/voice-typing-linux. Fedora seems to have
included ibus-speech-to-text (see
https://fedoraproject.org/wiki/Changes/ibus-speech-to-text).

Perhaps we should start a discussion with the Debian Input Method Team
<[email protected]> to see if they have any
opinions on this.
