On Mon, Aug 31, 2020 at 3:16 AM Christian Gollwitzer <aurio...@gmx.de> wrote:
>
> On 30.08.20 at 17:25, MRAB wrote:
> > On 2020-08-30 07:23, Muskan Sanghai wrote:
> >> On Sunday, August 30, 2020 at 11:46:15 AM UTC+5:30, Chris Angelico wrote:
> >>> I recommend looking into CMU Sphinx then. I've used that from Python.
> >>> The results are highly entertaining.
> >>> ChrisA
> >> Okay, I will try it, thank you.
> >>
> > Speech recognition works best when there's a single voice, speaking
> > clearly, with little or no background noise. Movies tend not to be like
> > that.
> >
> > Which is why the results are "highly entertaining"...
> >
> Well, with enough effort it is possible to build a system that is more
> useful than "entertaining". Google did that: English YouTube videos can
> be annotated with subtitles from speech recognition. For example, try
> this video:
> https://www.youtube.com/watch?v=lYVLpC_8SQE
>
> Go to the settings (the little gear icon in the nav bar) and
> switch on subtitles, English autogenerated. You'll see a word-by-word
> transcription of the speech, and most of it is accurate.
>
> There are strong arguments that anything one can build with open source
> tools will be inferior. 1) They probably have a bunch of highly
> qualified AI experts working on this thing. 2) They have an enormous
> corpus of training data. Many videos already have user-provided
> subtitles. They can feed all of this into the training.
>
> I'm waiting to be disproven on this point ;)
>

The OP doesn't want to use Google's services for this. That doesn't
disprove your point, but....... :)

ChrisA