It has become commonplace to yell out commands to a little box and have it answer you. However, voice input for the desktop has never really gone mainstream. This is particularly slow for Linux users whose options are shockingly limited, although decent speech support is baked into recent versions of Windows and OS X Yosemite and beyond.
There are four well-known open speech recognition engines: CMU Sphinx, Julius, Kaldi, and the recent release of Mozilla’s DeepSpeech (part of their Common Voice initiative). The trick for Linux users is successfully setting them up and using them in applications. [Michael Sheldon] aims to fix that — at least for DeepSpeech. He’s created an IBus plugin that lets DeepSpeech work with nearly any X application. He’s also provided PPAs that should make it easy to install for Ubuntu or related distributions.
You can see in the video below that it works, although [Michael] admits it is just a starting point. However, the great thing about Open Source is that armed with a working set up, it should be easy for others to contribute and build on the work he’s started.
IBus is one of those pieces of Linux that you don’t think about very often. It abstracts input devices from programs, mainly to accommodate input methods that don’t lend themselves to an alphanumeric keyboard. Usually this is Japanese, Chinese, Korean, and other non-Latin languages. However, there’s no reason IBus can’t handle voice, too.
Oddly enough, the most common way you will see Linux computers handle speech input is to bundle it up and send it to someone like Google for translation despite there being plenty of horsepower to handle things locally. If you aren’t too picky about flexibility, even an Arduino can do it. With all the recent tools aimed at neural networks, the speech recognition algorithms aren’t as big a problem as finding a sufficiently broad training database and then integrating the data with other applications. This IBus plugin takes care of that last problem.