Re: Intent to ship: Web Speech API - Speech Recognition with Pocketsphinx

Andre Natal Sun, 09 Nov 2014 05:15:28 -0800

Sorry, I forgot the links:

2 - Speechrtc offline on Firefox OS (Peak): http://youtu.be/FXKXhrRDEb8


3 - Continuous speech recognition on android with poc…:
http://youtu.be/3lTtCFaQF2A
 On Nov 9, 2014 11:12 AM, "Andre Natal" <ana...@gmail.com> wrote:

> Hi Marco.
>
> SpeechRTC was my first attempt with the platform. In early 2013 I did not
> have enough knowledge of Gecko internals, and b2g itself was at a very early
> stage (in the very beginning, Steven Lee needed to send me patches to make
> getUserMedia (gUM) work properly), so the fastest path was to capture the
> audio and stream it online. The great part is that Opus is pretty efficient,
> and Node.js plus a speech server wrapping pocketsphinx made the whole round
> trip really fast.
>
> But I knew that was not ideal for command-and-control / grammar-based
> recognition, so I started to research a direct port of pocketsphinx using
> emscripten. It did work, but three reasons made me move to a full C++ version:
>
> 1) the whole Speech API frontend in Gecko was ready to roll, only waiting
> for a backend, and this, as we know, is built in C++;
>
> 2) my tests ran very well, but on the Peak [2], for example, it performed
> slower than on low-end devices running Android [3];
>
> 3) with emscripten, loading the model during the decoder's creation at each
> reload ended up very slow, and I couldn't figure out how to keep the decoder
> instance alive between tabs and reloads, while in C++ this happens only once,
> due to Gecko's architecture.
> On Oct 31, 2014 12:27 AM, "Marco Chen" <mc...@mozilla.com> wrote:
>
>> Hi Andre,
>>
>> This is nice work, and I look forward to voice recognition on B2G.
>>
>> Besides the final result, I am also interested in the reasons you
>> migrated from SpeechRTC -> emscripten -> Web Speech API.
>> Could you also share which factors triggered these transitions? That
>> could be a lesson learned for us.
>>
>> e.g. SpeechRTC -> voice recognition can't be performed locally.
>>      emscripten -> performance issue? or license issue? or ?
>>
>> Thanks,
>> Sincerely yours.
>>
>> ------------------------------
>> *From: *"Andre Natal" <ana...@gmail.com>
>> *To: *dev-platform@lists.mozilla.org, "Sandip Kamat" <ska...@mozilla.com>,
>> "Olli.Pettay" <opet...@mozilla.com>
>> *Sent: *Friday, October 31, 2014 7:18:06 AM
>> *Subject: *Intent to ship: Web Speech API - Speech Recognition with
>> Pocketsphinx
>>
>> I've been researching speech recognition in Firefox for two years: first
>> SpeechRTC, then emscripten, and now the Web Speech API with CMU pocketsphinx
>> [1] embedded in Gecko's C++ layer, a project that I had the luck to develop
>> for Google Summer of Code with the mentoring of Olli Pettay, Guilherme
>> Gonçalves, Steven Lee, Randell Jesup and others, and with the management of
>> Sandip Kamat.
>>
>> The implementation already works in B2G, Fennec and all Firefox desktop
>> versions, and the first language supported will be English. The API and
>> implementation conform to the W3C standard [2]. The preference to
>> enable it is: media.webspeech.service.default = pocketsphinx
>>
>> The patches required to achieve this are:
>>
>>  - Import pocketsphinx sources in Gecko. Bug 1051146 [3]
>>  - Embed English models. Bug 1065911 [4]
>>  - Change SpeechGrammarList to store grammars inside SpeechGrammar
>>    objects. Bug 1088336 [5]
>>  - Creation of a SpeechRecognitionService for Pocketsphinx. Bug 1051148
>>    [6]
>>
>>
>> Also, other important features that we don't have patches for yet:
>>  - Relax the VAD strategy to be less strict and avoid stopping in the
>>    middle of speech when speaking low-volume phonemes [7]
>>  - Integrate or develop a grapheme-to-phoneme algorithm to generate
>>    pronunciations in real time when compiling grammars [8]
>>  - Include and build models for other languages [9]
>>  - Continuous and wordspotting recognition [10]
>>
>> The WIP repo is here [11], and this Air Mozilla video [12] plus this wiki
>> [13] have more detailed info.
>>
>> In this comment you can see the CPU usage on a Flame device while
>> recognition is happening [14].
>>
>> I look forward to your comments.
>>
>> Thanks,
>>
>> Andre Natal
>>
>> [1] http://cmusphinx.sourceforge.net/
>> [2] https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
>> [3] https://bugzilla.mozilla.org/show_bug.cgi?id=1051146
>> [4] https://bugzilla.mozilla.org/show_bug.cgi?id=1065911
>> [5] https://bugzilla.mozilla.org/show_bug.cgi?id=1088336
>> [6] https://bugzilla.mozilla.org/show_bug.cgi?id=1051148
>> [7] https://bugzilla.mozilla.org/show_bug.cgi?id=1051604
>> [8] https://bugzilla.mozilla.org/show_bug.cgi?id=1051554
>> [9] https://bugzilla.mozilla.org/show_bug.cgi?id=1065904 and
>> https://bugzilla.mozilla.org/show_bug.cgi?id=1051607
>> [10] https://bugzilla.mozilla.org/show_bug.cgi?id=967896
>> [11] https://github.com/andrenatal/gecko-dev
>> [12] https://air.mozilla.org/mozilla-weekly-project-meeting-20141027/
>> (Jump to 12:00)
>> [13] https://wiki.mozilla.org/SpeechRTC_-_Speech_enabling_the_open_web
>> [14] https://bugzilla.mozilla.org/show_bug.cgi?id=1051148#c14
>> _______________________________________________
>> dev-platform mailing list
>> dev-platform@lists.mozilla.org
>> https://lists.mozilla.org/listinfo/dev-platform
>>
>>
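For readers who want to try the API described in the announcement quoted
above, here is a minimal TypeScript sketch of a grammar-based recognition
request. It is an illustration under assumptions, not code from the patches:
buildCommandGrammar and recognizeCommand are hypothetical helper names, the
JSGF grammar format and the "en-US" language tag are assumed choices, and the
constructors are looked up at runtime (with a webkit-prefixed fallback)
because not every engine or build exposes them. In Gecko, the preference
mentioned in the thread (media.webspeech.service.default = pocketsphinx)
would presumably need to be set first.

```typescript
// Hypothetical helper: builds a one-rule JSGF grammar from command phrases.
function buildCommandGrammar(grammarName: string, phrases: string[]): string {
  const alternatives = phrases
    .map((p) => p.trim().toLowerCase())
    .filter((p) => p.length > 0)
    .join(" | ");
  if (alternatives.length === 0) {
    throw new Error("at least one non-empty phrase is required");
  }
  return `#JSGF V1.0; grammar ${grammarName}; public <command> = ${alternatives} ;`;
}

// Hypothetical helper: starts one recognition pass restricted to the given
// phrases. Returns false when the Web Speech API is not exposed at all.
function recognizeCommand(
  phrases: string[],
  onTranscript: (text: string) => void
): boolean {
  const g = globalThis as any;
  const Recognition = g.SpeechRecognition || g.webkitSpeechRecognition;
  const GrammarList = g.SpeechGrammarList || g.webkitSpeechGrammarList;
  if (!Recognition || !GrammarList) {
    return false;
  }

  // Attach the grammar with weight 1, then hand the list to the recognizer.
  const grammars = new GrammarList();
  grammars.addFromString(buildCommandGrammar("commands", phrases), 1);

  const recognition = new Recognition();
  recognition.grammars = grammars;
  recognition.lang = "en-US"; // assumed: English is the first supported language
  recognition.onresult = (event: any) => {
    // First alternative of the first result.
    onTranscript(event.results[0][0].transcript);
  };
  recognition.start();
  return true;
}
```

Calling recognizeCommand(["open", "close"], handler) from a page would then
deliver the recognized phrase to handler, or return false where the API is
unavailable. Keeping the grammar builder separate from the recognizer setup
makes the string it produces easy to check without a microphone.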
