Voice recognition


Darkmax
06-16-2010, 02:13 PM
I was just wondering if anyone has tried a voice recognition library in Vizard, because I would like to make an application that I can control with my voice.

I found another thread about this, but it was a little old and said that the library it used no longer works.

So if someone knows an approach to do this, I'm all ears :)

farshizzo
06-16-2010, 02:53 PM
You can use the 3rd party pywin32 library along with the Microsoft Speech SDK to perform voice recognition in Python.

1) Download and install the pywin32 (http://kb.worldviz.com/articles/791) library.

2) Download and install the Microsoft Speech SDK (http://www.microsoft.com/downloads/details.aspx?FamilyId=5E86EC97-40A7-453F-B0EE-6583171B4530&displaylang=en).

3) Run the makepy script and select the Microsoft Speech Object Library from the list and click OK. The makepy script should be located in your "[Vizard]\bin\lib\site-packages\win32com\client" folder.

4) You should now be able to run the following sample script, which lets you change the background color by saying the name of a color:

"""Sample code for using the Microsoft Speech SDK 5.1 via COM in Python.
Requires that the SDK be installed (it's a free download from
http://www.microsoft.com/downloads/details.aspx?FamilyId=5E86EC97-40A7-453F-B0EE-6583171B4530&displaylang=en)
and that MakePy has been used on it (in PythonWin,
select Tools | COM MakePy Utility | Microsoft Speech Object Library 5.1).

After running this, saying one of the color names in VOICE_COLORS should
display 'You said: Red' etc. on the console and change the background color.
The recognition can be a bit shaky at first until you've trained it (via the
Speech entry in the Windows Control Panel)."""

# viz must be imported before VOICE_COLORS, which uses the viz color constants
import viz
from win32com.client import constants
import win32com.client
import pythoncom

# Spoken color names mapped to Vizard color constants
VOICE_COLORS = {
    "Red": viz.RED,
    "Green": viz.GREEN,
    "Blue": viz.BLUE,
    "Yellow": viz.YELLOW,
    "White": viz.WHITE,
    "Black": viz.BLACK,
    "Purple": viz.PURPLE,
    "Orange": viz.ORANGE,
}
class SpeechRecognition:
    """Initialize the speech recognition with the passed in list of words"""

    def __init__(self, wordsToAdd):
        # For speech recognition - first create a listener
        self.listener = win32com.client.Dispatch("SAPI.SpSharedRecognizer")
        # Then a recognition context
        self.context = self.listener.CreateRecoContext()
        # which has an associated grammar
        self.grammar = self.context.CreateGrammar()
        # Do not allow free word recognition - only command and control
        # recognizing the words in the grammar only
        self.grammar.DictationSetState(0)
        # Create a new rule for the grammar, that is top level (so it begins
        # a recognition) and dynamic (ie we can change it at runtime)
        self.wordsRule = self.grammar.Rules.Add("wordsRule", constants.SRATopLevel + constants.SRADynamic, 0)
        # Clear the rule (not necessary first time, but if we're changing it
        # dynamically then it's useful)
        self.wordsRule.Clear()
        # And go through the list of words, adding each to the rule
        for word in wordsToAdd:
            self.wordsRule.InitialState.AddWordTransition(None, word)
        # Commit the new rule, then set the wordsRule to be active
        self.grammar.Rules.Commit()
        self.grammar.CmdSetRuleState("wordsRule", 1)
        # Commit the changes to the grammar
        self.grammar.Rules.Commit()
        # And add an event handler that's called back when recognition occurs
        self.eventHandler = ContextEvents(self.context)


class ContextEvents(win32com.client.getevents("SAPI.SpSharedRecoContext")):
    """The callback class that handles the events raised by the speech object.
    See "Automation | SpSharedRecoContext (Events)" in the MS Speech SDK
    online help for documentation of the other events supported."""

    def OnRecognition(self, StreamNumber, StreamPosition, RecognitionType, Result):
        """Called when a word/phrase is successfully recognized,
        ie it is found in a currently open grammar with a sufficiently high
        confidence."""
        newResult = win32com.client.Dispatch(Result)
        # Fetch the recognized text once and reuse it
        text = newResult.PhraseInfo.GetText()
        print("You said: " + text)
        viz.clearcolor(VOICE_COLORS[text])

if __name__ == '__main__':
    import viz
    import vizact

    # Register every key of VOICE_COLORS as a recognizable phrase
    speechReco = SpeechRecognition(list(VOICE_COLORS))
    viz.go()

    # Pump pending COM messages every frame so recognition events get delivered
    vizact.ontimer(0, pythoncom.PumpWaitingMessages)

    speaker = win32com.client.Dispatch("SAPI.SpVoice")
    speaker.Speak('This script shows how to use the Microsoft Speech SDK with Vizard')
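
One caveat with the handler above: it indexes VOICE_COLORS directly, so a recognized phrase that is not a key of the dictionary would raise a KeyError. A minimal sketch of a more forgiving lookup follows. The helper name lookup_color and the RGB tuples are assumptions made for illustration so the snippet runs outside Vizard; in the real script the values would be viz.RED, viz.GREEN, and so on.

```python
# Stand-in RGB triples so the sketch runs outside Vizard (assumption);
# in the actual script these would be viz.RED, viz.GREEN, viz.BLUE.
COLORS = {
    "Red": (1.0, 0.0, 0.0),
    "Green": (0.0, 1.0, 0.0),
    "Blue": (0.0, 0.0, 1.0),
}


def lookup_color(phrase, table=COLORS):
    """Return the color for a recognized phrase, or None if it is unknown.

    The comparison ignores case and surrounding whitespace.
    lookup_color is a hypothetical helper, not part of Vizard or pywin32.
    """
    if phrase is None:
        return None
    wanted = phrase.strip().lower()
    for name, color in table.items():
        if name.lower() == wanted:
            return color
    return None
```

Inside OnRecognition, the result could then be checked before use, for example by calling viz.clearcolor only when lookup_color(text) returns something other than None.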

Darkmax
06-16-2010, 04:40 PM
Wow, thank you. Just one question: where the code says "wordsRule", is that like a keyword that starts recognition of the next word I say, or what is it?

I tried it and it works great, but what if I want it to recognize a command only when a keyword is said first?

For example:
If I say "vizard green" (keyword, then command), the background will turn green, but if I just say "green", nothing will happen.

Darkmax
06-16-2010, 05:03 PM
lol, I think I just need to add the keyword before each word, like this:


VOICE_COLORS = {
    "vizard Red": viz.RED,
    "vizard Green": viz.GREEN,
    "vizard Blue": viz.BLUE,
    "vizard Yellow": viz.YELLOW,
    "vizard White": viz.WHITE,
    "vizard Black": viz.BLACK,
    "vizard Purple": viz.PURPLE,
    "vizard Orange": viz.ORANGE,
}
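
As far as the sample script shows, "wordsRule" is only the name given to the grammar rule, not a spoken keyword, so prefixing each phrase as above looks like a reasonable way to require a keyword. To avoid retyping the prefix for every entry, the prefixed dictionary could be generated from the plain one. This is a rough sketch: with_keyword is a hypothetical helper, and the RGB tuples stand in for the viz color constants so the snippet runs outside Vizard.

```python
def with_keyword(keyword, commands):
    """Return a new dict whose keys are '<keyword> <phrase>'.

    The original dict is left unchanged. with_keyword is a hypothetical
    helper written for this example.
    """
    return dict(("%s %s" % (keyword, phrase), value)
                for phrase, value in commands.items())


# Stand-in RGB triples (assumption); use viz.RED, viz.GREEN, ... in Vizard.
BASE_COLORS = {
    "Red": (1.0, 0.0, 0.0),
    "Green": (0.0, 1.0, 0.0),
}

# Same shape as the hand-written dictionary, built from the plain names
VOICE_COLORS = with_keyword("vizard", BASE_COLORS)
```

The resulting keys can be passed to SpeechRecognition exactly as before, and changing the keyword later only means changing one argument.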