Thursday, August 27, 2009
Friday, August 7, 2009
SIMON: things to improve
Measuring noise levels
Mass training of words
Quickstart and wizards
Simon quickstart package (with Julius quickstart package)
SIMON vs. Windows 7 speech recognition
SIMON
Distribution: OSS, cross platform
Start year: 2007
Ease of use: quite hard. There is no demo or quickstart package (for example one bundled with VoxForge models, at least for English). You should read the manual thoroughly and not expect quick results. If you only want to test the core engine, download the Julius quickstart package from http://www.voxforge.org/
Flexibility: Custom commands (programs, places, text macros, etc.)
Accuracy: after a week of use it is still not sufficient (false positives, activation on wrong words or on noise)
TIP: If the recognition confidence is below a threshold (e.g. 85%), SIMON should drop the input and do nothing
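The tip above can be sketched in a few lines. This is only an illustration, not SIMON's actual code: the function name `accept_result`, the constant `MIN_CONFIDENCE` and the idea that the recogniser reports a confidence score between 0 and 1 are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical threshold, taken from the 85% figure in the tip.
MIN_CONFIDENCE = 0.85


def accept_result(word: str, confidence: float,
                  threshold: float = MIN_CONFIDENCE) -> Optional[str]:
    """Return the recognised word, or None if the result should be dropped.

    `confidence` is assumed to be a score in the range 0.0 to 1.0.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    # Below the threshold we do nothing instead of running a wrong command.
    return word if confidence >= threshold else None


if __name__ == "__main__":
    for word, score in [("firefox", 0.92), ("firefox", 0.60)]:
        result = accept_result(word, score)
        print(word, score, "->", result if result else "ignored")
```

Dropping a low-confidence result costs a repeated command, but a false activation on noise can trigger the wrong program, so erring on the side of doing nothing seems the safer default.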
Windows 7
Speech recognition in Windows 7 (trained for 20 minutes by reading specific text aloud, plus the completed tutorial, which also trains the engine)
Distribution: Closed source, Windows integrated application
Start year: 1993
Ease of use: easy
Flexibility: built-in commands, Windows Speech Recognition Macros recommended
http://www.microsoft.com/downloads/details.aspx?FamilyID=fad62198-220c-4717-b044-829ae4f7c125&displaylang=en
Accuracy: about 80% (without rare words and if you speak 4-5 words at a time)
http://en.wikipedia.org/wiki/Windows_Speech_Recognition
Notable incident
The use of Windows Speech Recognition during a demonstration of Windows Vista at a Microsoft Financial Analyst Meeting on July 27, 2006, resulted in a well-publicized and embarrassing incident. The software failed to function correctly initially, resulting in an unintended output of "Dear aunt, let's set so double the killer delete select all".[5] [6] [7] A developer with Vista's speech recognition team later explained that Windows Speech Recognition's failure to function properly during the demonstration was the result of a bug in the volume control feature, which caused the application to pick up extra noise that affected its performance.[8] [9] The software bug was fixed by Microsoft prior to the release of Vista to the general public.[8]
Testing SIMON speech recognition software
Why do we need speech recognition? For the majority it is just a convenience, but for many handicapped people it is the only way to interact with their environment, or at least it can make their life much easier. A worthy goal to fight for.
My project goal: to create a speaker (in)dependent engine that can recognise up to 50 Hungarian words with high accuracy.
Possible sources of errors?
1. Is the recognition rate worse if one word is recorded several times in one file?
2. There is no indication of sound input! I must keep Audacity running at all times. A small stay-on-top window would help: it could show the current input level (like the Windows one), an on/off switch and the recognised word.
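The input level that such a window would display can be estimated from the raw samples. Below is a minimal sketch, assuming 16-bit signed PCM samples are already available as a list of integers; the function name `rms_dbfs` and the `full_scale` parameter are made up for this example, and capturing the audio itself is left out.

```python
import math
from typing import Sequence


def rms_dbfs(samples: Sequence[int], full_scale: int = 32768) -> float:
    """Return the RMS level of the samples in dB relative to full scale.

    0 dBFS is the loudest possible signal; silence gives minus infinity.
    Assumes 16-bit signed PCM unless another full_scale is passed.
    """
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 20 * math.log10(math.sqrt(mean_square) / full_scale)


if __name__ == "__main__":
    quiet_room = [12, -9, 15, -11, 8, -14]         # made-up background noise
    speech = [9000, -12000, 15000, -8000, 11000]   # made-up speech samples
    print("noise floor: %.1f dBFS" % rms_dbfs(quiet_room))
    print("speech:      %.1f dBFS" % rms_dbfs(speech))
```

Comparing the level measured in silence with the level measured while speaking gives a rough idea of how much headroom the microphone leaves above the noise.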
How to avoid common sources of errors?
* Use external sound card or USB microphone to avoid white noise - OK!
* Use a good-quality microphone (I used a Logitech S 7500 webcam; it has echo cancellation and gives an almost noise-free recording in silence)
Project homepage: http://simon-listens.org/
Project Wiki: http://www.cyber-byte.at/wiki/index.php/Main_Page
https://sourceforge.net/projects/speech2text/
Testing Simon: http://spirit.blau.in/simon
Simon blog: http://simon-listens.blogspot.com/