Monday, February 19, 2007

C# starter code for SAPI 5.3 speech recognition from microphone under WPF

Despite the indications here and here to the contrary, it appears to be possible to create a SAPI 5.3 application that runs on Windows XP. Here are the steps for getting a bare-bones C# program (a .NET 3.0 Windows Presentation Foundation Windows application) up and running to recognize speech input from the microphone using SAPI 5.3.

Caveat: When using SAPI 5.3 on Windows XP in a WPF Windows application, it appears that you cannot use the more complex SpeechRecognitionEngine class and have to fall back on the SpeechRecognizer class. One limitation this entails is that you cannot specify an audio file as input to the recognizer.
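For comparison, here is a rough sketch of what file input looks like on a system where SpeechRecognitionEngine does work (which, per the caveat above, is apparently not the XP setup described here). The class name WaveFileDictation and the file name input.wav are hypothetical placeholders, not part of the original program.

```csharp
using System;
using System.Speech.Recognition;

// hypothetical console program, shown only to illustrate the file-input
// capability that SpeechRecognizer lacks
class WaveFileDictation
{
    static void Main()
    {
        // in-process engine; unlike SpeechRecognizer it lets you choose the input
        using (SpeechRecognitionEngine engine = new SpeechRecognitionEngine())
        {
            engine.LoadGrammar(new DictationGrammar());

            // hypothetical file name; replace with a real recording
            engine.SetInputToWaveFile(@"input.wav");

            // Recognize() runs one synchronous recognition and returns
            // null when nothing was recognized
            RecognitionResult result = engine.Recognize();
            Console.WriteLine(result != null ? result.Text : "(nothing recognized)");
        }
    }
}
```

SpeechRecognizer has no counterpart to SetInputToWaveFile, which is why the program in the steps below reads from the microphone only.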
  1. Install .NET Framework 3.0 from here.
  2. (Optional) If you want the additional tools for Visual Studio 2005 to facilitate development using .NET Framework 3.0, install the following two components:
    1. Microsoft® Windows® Software Development Kit for Windows Vista™ and .NET Framework 3.0 Runtime Components (only the Documentation is needed for installing the next component)
    2. Visual Studio 2005 extensions for .NET Framework 3.0 (WCF & WPF), November 2006 CTP (provides support for visually editing XAML files)
  3. Create a new C# Windows Application (WPF) in Visual Studio 2005.
  4. In the Solution Explorer, right click on References under your project node, and select Add Reference....
  5. In the .NET tab, select System.Speech (verify it's version 3.0.0.0), and click OK.
  6. Double click Window1.xaml (if you didn't do Step 2 above, then you'll have to right click on it, select Open with..., and choose XML editor), and add the following snippet inside the <Grid> </Grid> element:
    <ScrollViewer>
      <TextBox x:Name="result_textBox" TextWrapping="WrapWithOverflow"
               ScrollViewer.CanContentScroll="True"></TextBox>
    </ScrollViewer>
  7. Change your Window1.xaml.cs code to the following:
using System;
using System.Speech;
using System.Speech.Recognition;

namespace SimpleSAPI_5_3
{
    public partial class Window1 : System.Windows.Window
    {
        // whether to use the command and control grammar or the dictation grammar
        bool commandAndControl = false;
        SpeechRecognizer _speechRecognizer;

        public Window1()
        {
            InitializeComponent();

            // set up the recognizer
            _speechRecognizer = new SpeechRecognizer();
            _speechRecognizer.Enabled = false;
            _speechRecognizer.SpeechRecognized +=
                new EventHandler<SpeechRecognizedEventArgs>(
                    _speechRecognizer_SpeechRecognized);

            // activate one of the grammars if we don't want both at the same time
            if (commandAndControl)
            {
                // set up the command and control grammar; creating it only in
                // this branch means grammar.xml is not needed for dictation
                Grammar commandGrammar = new Grammar(@"grammar.xml");
                commandGrammar.Name = "main command grammar";
                commandGrammar.Enabled = true;
                _speechRecognizer.LoadGrammar(commandGrammar);
            }
            else
            {
                // set up the dictation grammar
                DictationGrammar dictationGrammar = new DictationGrammar();
                dictationGrammar.Name = "dictation";
                dictationGrammar.Enabled = true;
                _speechRecognizer.LoadGrammar(dictationGrammar);
            }
        }

        protected override void OnClosing(System.ComponentModel.CancelEventArgs e)
        {
            // release the recognizer before the window goes away
            _speechRecognizer.UnloadAllGrammars();
            _speechRecognizer.Dispose();

            // let the base class raise the Closing event as usual
            base.OnClosing(e);
        }

        // called for every recognized phrase; appends its text to the text box
        void _speechRecognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
        {
            result_textBox.AppendText(e.Result.Text + "\n");
        }
    }
}
Note that the dictation grammar and the command-and-control grammar can both be active at the same time; to do that, call LoadGrammar once for each of them instead of choosing between them.
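If both grammars are loaded, it can help to know which one produced a given result. The following is a minimal sketch of one way to do that, using the Grammar property of the recognition result. GrammarHelper and its two methods are hypothetical names introduced here for illustration; the grammar names and the grammar.xml path are the ones used in the listing above.

```csharp
using System.Speech.Recognition;

// hypothetical helper, not part of the original program
static class GrammarHelper
{
    // loads both grammars into the same recognizer so that either can match
    public static void LoadBoth(SpeechRecognizer recognizer)
    {
        DictationGrammar dictationGrammar = new DictationGrammar();
        dictationGrammar.Name = "dictation";

        // assumes grammar.xml sits next to the executable
        Grammar commandGrammar = new Grammar(@"grammar.xml");
        commandGrammar.Name = "main command grammar";

        recognizer.LoadGrammar(dictationGrammar);
        recognizer.LoadGrammar(commandGrammar);
    }

    // prefixes the recognized text with the name of the grammar that matched
    public static string Describe(RecognitionResult result)
    {
        return "[" + result.Grammar.Name + "] " + result.Text;
    }
}
```

In Window1, the constructor would call GrammarHelper.LoadBoth(_speechRecognizer) in place of the if/else, and the event handler would append GrammarHelper.Describe(e.Result) + "\n" to result_textBox.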

SAPI 5.3 uses W3C's Speech Recognition Grammar Specification (SRGS) Version 1.0 for its grammar files (see here for the grammar specification). To use a command-and-control grammar, set commandAndControl to true, and save the following file as grammar.xml in the same directory as the executable:
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE grammar PUBLIC "-//W3C//DTD GRAMMAR 1.0//EN"
  "http://www.w3.org/TR/speech-grammar/grammar.dtd">
<!-- the default grammar language is US English -->
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.w3.org/2001/06/grammar
                             http://www.w3.org/TR/speech-grammar/grammar.xsd"
         xml:lang="en-US" version="1.0" root="command">
  <rule id="command" scope="public">
    <one-of>
      <item>selected</item>
      <item>interface</item>
      <item>default</item>
    </one-of>
  </rule>
</grammar>
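As an alternative to a grammar file, System.Speech can also build a grammar in code. The sketch below constructs what should be an equivalent three-word command grammar with the Choices and GrammarBuilder classes. CommandGrammarFactory is a hypothetical name, and this is not taken from the original program. One assumption to check: GrammarBuilder uses the current thread's culture by default, so this only matches the en-US file above when the application runs under that culture.

```csharp
using System.Speech.Recognition;

// hypothetical factory, shown as an alternative to loading grammar.xml
static class CommandGrammarFactory
{
    public static Grammar Create()
    {
        // the same three alternatives as the <one-of> element in grammar.xml
        Choices commands = new Choices("selected", "interface", "default");

        // a builder containing just that one set of alternatives
        GrammarBuilder builder = new GrammarBuilder(commands);

        Grammar grammar = new Grammar(builder);
        grammar.Name = "main command grammar";
        return grammar;
    }
}
```

The result can be passed to _speechRecognizer.LoadGrammar in place of new Grammar(@"grammar.xml").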

1 comment:

Unknown said...

result_textBox.AppendText(e.Result.Text + "\n"); works only for the dictation grammar! Can you please show how to retrieve words from the XML file using SAPI 5.3?
Thanks in Advance.