Wednesday, February 14, 2007

C# starter code for SAPI 5.1 speech recognition from microphone

Here are the steps for getting a bare-bones C# program up and running to recognize speech input from the microphone using SAPI 5.1.

Caveat: SAPI 5.1 does not work in a C# console application out of the box, because the Automation API depends on the Windows message pump, so you have to create a Form-based application.
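The real requirement is a running message loop rather than a visible Form, so if you would rather stay in a console project, a workaround along these lines should work (an untested sketch on my part, not part of the steps below; it still needs references to System.Windows.Forms and the speech library):

using System;
using System.Windows.Forms;   // only for Application.Run's message loop
using SpeechLib;

class ConsoleSapi
{
    [STAThread]
    static void Main()
    {
        SpSharedRecoContext context = new SpSharedRecoContext();
        context.Recognition +=
            new _ISpeechRecoContextEvents_RecognitionEventHandler(OnRecognition);

        ISpeechRecoGrammar grammar = context.CreateGrammar(0);
        grammar.DictationLoad("", SpeechLoadOption.SLOStatic);
        grammar.DictationSetState(SpeechRuleState.SGDSActive);

        Application.Run();   // pump messages so SAPI can deliver events
    }

    static void OnRecognition(int StreamNumber, object StreamPosition,
        SpeechRecognitionType RecognitionType, ISpeechRecoResult Result)
    {
        Console.WriteLine(Result.PhraseInfo.GetText(0, -1, true));
    }
}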
  1. Create a new C# Windows Application in Visual Studio 2005.
  2. In the Solution Explorer, right click on References under your project node, and select Add Reference....
  3. Click on the COM tab, select Microsoft Speech Object Library (verify it's version 5.0), and click OK.
  4. Double-click Form1.cs and add a TextBox control: set its Multiline behavior property to True, change its Name design property to "result_textBox", and resize the control on the Form to a comfortable size (this is where the recognized text will be displayed).
  5. Change your Form1.cs code to the following:
using System.Windows.Forms;
using SpeechLib;

namespace SimpleSAPI
{
    public partial class Form1 : Form
    {
        // whether to use the command-and-control grammar or the dictation grammar
        bool commandAndControl = false;
        ISpeechRecoContext recoContext;
        ISpeechRecoGrammar grammar;

        public Form1()
        {
            InitializeComponent();
        }

        protected override void OnLoad(System.EventArgs e)
        {
            base.OnLoad(e);

            /****** BEGIN: set up recognition context *****/
            result_textBox.AppendText("Dictation mode\n");

            // create the shared recognition context and subscribe to its
            // Recognition event, which fires whenever speech is recognized
            recoContext = new SpeechLib.SpSharedRecoContext();
            ((SpSharedRecoContext)recoContext).Recognition +=
                new _ISpeechRecoContextEvents_RecognitionEventHandler(RecoContext_Recognition);
            /****** END: set up recognition context *****/

            // set up the grammar
            grammar = recoContext.CreateGrammar(0);

            // load the dictation grammar, initially inactive
            grammar.DictationLoad("", SpeechLoadOption.SLOStatic);
            grammar.DictationSetState(SpeechRuleState.SGDSInactive);

            // load the command-and-control grammar, initially inactive
            // (grammar.xml must sit next to the executable, or this call throws)
            grammar.CmdLoadFromFile(@"grammar.xml", SpeechLoadOption.SLOStatic);
            // rule ID 0 applies the state change to all top-level rules
            grammar.CmdSetRuleIdState(0, SpeechRuleState.SGDSInactive);

            // activate one of the grammars if we don't want both at the same time
            if (commandAndControl)
                grammar.CmdSetRuleIdState(0, SpeechRuleState.SGDSActive);
            else
                grammar.DictationSetState(SpeechRuleState.SGDSActive);
        }

        protected override void OnClosing(System.ComponentModel.CancelEventArgs e)
        {
            // stop delivering recognition events before the form goes away
            recoContext.State = SpeechRecoContextState.SRCS_Disabled;
            base.OnClosing(e);
        }

        void RecoContext_Recognition(int StreamNumber, object StreamPosition,
            SpeechRecognitionType RecognitionType, ISpeechRecoResult Result)
        {
            // GetText(0, -1, true) returns the full recognized phrase
            result_textBox.AppendText(Result.PhraseInfo.GetText(0, -1, true) + "\n");
        }
    }
}
Note that the dictation grammar and the command-and-control grammar can both be active at the same time within the same speech recognition context.
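For example, to listen with both grammars at once, activate both after loading them instead of choosing one (a minimal sketch reusing the grammar field from the code above):

// activate both grammars in the same recognition context;
// rule ID 0 applies the state change to all top-level C&C rules
grammar.DictationSetState(SpeechRuleState.SGDSActive);
grammar.CmdSetRuleIdState(0, SpeechRuleState.SGDSActive);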

To use a command-and-control grammar, set commandAndControl to true, and save the following file as grammar.xml in the same directory as the executable:
<GRAMMAR LANGID="409">
  <RULE NAME="toplevel" TOPLEVEL="ACTIVE">
    <L>
      <P>selected</P>
      <P>interface</P>
      <P>default</P>
    </L>
  </RULE>
</GRAMMAR>

Note that LANGID is set to 409: that is the hexadecimal language ID for U.S. English (0x409 corresponds to LCID 1033).
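If both grammars are active, you will probably want to tell command results apart from dictation results in the Recognition handler. Here is a sketch of one way to do it, assuming the grammar.xml above (the rule-name check is my assumption; dictation results report an empty rule name):

void RecoContext_Recognition(int StreamNumber, object StreamPosition,
    SpeechRecognitionType RecognitionType, ISpeechRecoResult Result)
{
    string text = Result.PhraseInfo.GetText(0, -1, true);
    // for a command-and-control result, Rule.Name is the matching
    // top-level rule ("toplevel" in the grammar above)
    if (Result.PhraseInfo.Rule.Name == "toplevel")
        result_textBox.AppendText("command: " + text + "\n");
    else
        result_textBox.AppendText(text + "\n");
}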
