Re: Intent to ship: Web Speech API - Speech Recognition with Pocketsphinx

Marco Chen Thu, 30 Oct 2014 19:29:11 -0700

Hi Andre, 

This is nice work, and I look forward to voice recognition on B2G.


Besides the final result, I am also interested in why you migrated 
from SpeechRTC -> emscripten -> Web Speech API. 
Could you also share what factors triggered these transitions? That 
could be a lesson learned for us. 

For example: SpeechRTC -> voice recognition can't be performed locally. 
emscripten -> performance issue? License issue? Something else? 

Thanks, 
Sincerely yours. 

----- Original Message -----

From: "Andre Natal" <ana...@gmail.com> 
To: dev-platform@lists.mozilla.org, "Sandip Kamat" <ska...@mozilla.com>, 
"Olli.Pettay" <opet...@mozilla.com> 
Sent: Friday, October 31, 2014 7:18:06 AM 
Subject: Intent to ship: Web Speech API - Speech Recognition with Pocketsphinx 

I've been researching speech recognition in Firefox for two years: first 
SpeechRTC, then emscripten, and now the Web Speech API with CMU pocketsphinx 
[1] embedded in the Gecko C++ layer, a project that I had the luck to develop 
for Google Summer of Code under the mentoring of Olli Pettay, Guilherme 
Gonçalves, Steven Lee, Randell Jesup and others, with management by 
Sandip Kamat. 

The implementation already works in B2G, Fennec and all Firefox desktop 
versions, and the first supported language will be English. The API and 
implementation conform to the W3C standard [2]. The preference to 
enable it is: media.webspeech.service.default = pocketsphinx 
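
For readers who have not used the API, a page might drive recognition 
roughly as sketched below. This is only an illustration based on the W3C 
draft [2], not code from the patches: the helper names buildJsgfGrammar and 
startRecognition are hypothetical, and the assumption that the grammar is 
passed as a JSGF string is mine. 

```typescript
// Hypothetical helper: builds a one-rule JSGF grammar that accepts
// exactly one of the given words.
function buildJsgfGrammar(name: string, words: string[]): string {
  return `#JSGF V1.0; grammar ${name}; public <${name}> = ${words.join(" | ")} ;`;
}

// Hypothetical helper: starts a single recognition and reports the
// top transcript. Assumes a browser that exposes the Web Speech API.
function startRecognition(grammar: string, onResult: (text: string) => void): void {
  const w = globalThis as any;
  // Some engines expose only the webkit-prefixed constructors.
  const Recognition = w.SpeechRecognition ?? w.webkitSpeechRecognition;
  const GrammarList = w.SpeechGrammarList ?? w.webkitSpeechGrammarList;
  if (!Recognition || !GrammarList) {
    throw new Error("Web Speech API recognition is not available here");
  }

  const grammars = new GrammarList();
  grammars.addFromString(grammar, 1); // weight 1

  const recognition = new Recognition();
  recognition.grammars = grammars;
  recognition.lang = "en-US";
  recognition.onresult = (event: any) => {
    // First alternative of the first result.
    onResult(event.results[0][0].transcript);
  };
  recognition.onerror = (event: any) => {
    console.error("recognition error:", event.error);
  };
  recognition.start();
}

// Usage in a page (requires microphone permission):
// startRecognition(buildJsgfGrammar("colors", ["red", "green", "blue"]),
//                  (text) => console.log("heard:", text));
```

Keeping the grammar builder separate from the recognition call means the 
grammar string can be inspected without a microphone or a browser. 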

The patches required to achieve this are: 

- Import pocketsphinx sources in Gecko. Bug 1051146 [3] 
- Embed English models. Bug 1065911 [4] 
- Change SpeechGrammarList to store grammars inside SpeechGrammar objects. 
Bug 1088336 [5] 
- Creation of a SpeechRecognitionService for Pocketsphinx. Bug 1051148 [6] 


Other important features that don't have patches yet: 
- Relax the VAD strategy to be less strict and avoid stopping in the middle 
of speech when low-volume phonemes are spoken [7] 
- Integrate or develop a grapheme-to-phoneme algorithm for real-time 
generation when compiling grammars [8] 
- Include and build models for other languages [9] 
- Continuous and word-spotting recognition [10] 

The WIP repo is here [11]; this Air Mozilla video [12] and this wiki [13] 
have more detailed info. 

In this comment you can see the CPU usage on a Flame while recognition is 
happening [14]. 

I look forward to hearing your comments. 

Thanks, 

Andre Natal 

[1] http://cmusphinx.sourceforge.net/ 
[2] https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html 
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=1051146 
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=1065911 
[5] https://bugzilla.mozilla.org/show_bug.cgi?id=1088336 
[6] https://bugzilla.mozilla.org/show_bug.cgi?id=1051148 
[7] https://bugzilla.mozilla.org/show_bug.cgi?id=1051604 
[8] https://bugzilla.mozilla.org/show_bug.cgi?id=1051554 
[9] https://bugzilla.mozilla.org/show_bug.cgi?id=1065904 and 
https://bugzilla.mozilla.org/show_bug.cgi?id=1051607 
[10] https://bugzilla.mozilla.org/show_bug.cgi?id=967896 
[11] https://github.com/andrenatal/gecko-dev 
[12] https://air.mozilla.org/mozilla-weekly-project-meeting-20141027/ (Jump 
to 12:00) 
[13] https://wiki.mozilla.org/SpeechRTC_-_Speech_enabling_the_open_web 
[14] https://bugzilla.mozilla.org/show_bug.cgi?id=1051148#c14 
_______________________________________________ 
dev-platform mailing list 
dev-platform@lists.mozilla.org 
https://lists.mozilla.org/listinfo/dev-platform 
