Tuesday, February 13, 2007

Understanding SAPI 5.1

The SAPI SDK 5.1 comes with a documentation file in Windows Help format (or you can download just the documentation from here), but it is not easy to navigate.

Some good places to start in the Contents tab (after opening Start -> All Programs -> Microsoft Speech SDK 5.1 -> Microsoft Speech SDK 5.1 Help) are:
  • Automation -> Sp[Shared/InProc]Recognizer:
    Description of the interface to the underlying speech recognition engine and its two types (shared versus in-process).
  • Automation -> Sp[Shared/InProc]RecoContext:
    A nice description of what "Recognition Contexts" are, and how one should create as many of them as appropriate for the application.
  • Automation -> ISpeechPhraseRule -> Code Example:
    Lists the properties that can be queried on a phrase rule that was recognized, including rule name and confidence values.
  • Automation -> ISpeechPhraseElement -> Code Example:
    Lists the properties that can be queried on a phrase element, including confidence values.
  • Automation -> ISpeechPhraseProperty -> Confidence:
    Example of how a confidence value can be extracted along with its corresponding property name.
  • Automation -> ISpeechAlternate:
    A way to get at a list of alternate phrase candidates for dictation mode recognition.
  • Automation -> Sp[Shared/InProc]RecoContext (Events):
    The list of events that the recognition context can receive, and that clients can therefore listen for.
  • Application-Level Interfaces -> Grammar Compiler Interfaces -> Text Grammar Format:
    Description of the context-free grammar format used for command and control (as opposed to dictation) recognition.
  • White Papers -> SAPI 5.0 SR Properties White Paper:
    The list of recognition engine properties, including the confidence thresholds, that can be queried with the GetPropertyNumber method and set with the SetPropertyNumber method of the Sp[Shared/InProc]Recognizer class.
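
To give a feel for what the Text Grammar Format topic covers, here is a minimal sketch of a command-and-control grammar, inspected with Python's standard XML parser. The grammar itself is hypothetical (the rule name "PlayCard" and its phrases are made up), and the element and attribute names (GRAMMAR, RULE, P, L, O, NAME, TOPLEVEL, LANGID) are written from memory of the SAPI 5 format, so check them against the help topic before relying on them. The Python helpers are illustration only and are not part of SAPI.

```python
import xml.etree.ElementTree as ET

# Hypothetical grammar in the SAPI 5 text grammar format (assumed tag names):
#   P = required phrase, L = list of alternatives, O = optional phrase.
GRAMMAR_XML = """\
<GRAMMAR LANGID="409">
  <RULE NAME="PlayCard" TOPLEVEL="ACTIVE">
    <P>play the</P>
    <L>
      <P>ace</P>
      <P>king</P>
      <P>queen</P>
    </L>
    <O>please</O>
  </RULE>
</GRAMMAR>
"""


def top_level_rules(xml_text):
    """Return the names of rules marked TOPLEVEL="ACTIVE"."""
    root = ET.fromstring(xml_text)
    return [
        rule.get("NAME")
        for rule in root.iter("RULE")
        if rule.get("TOPLEVEL") == "ACTIVE"
    ]


def list_choices(xml_text, rule_name):
    """Return the alternatives found in the <L> lists directly under a rule."""
    root = ET.fromstring(xml_text)
    for rule in root.iter("RULE"):
        if rule.get("NAME") == rule_name:
            return [
                phrase.text.strip()
                for alternatives in rule.findall("L")
                for phrase in alternatives.findall("P")
            ]
    return []  # no rule with that name


if __name__ == "__main__":
    print(top_level_rules(GRAMMAR_XML))
    print(list_choices(GRAMMAR_XML, "PlayCard"))
```

A grammar like this would accept utterances such as "play the ace" or "play the queen please"; the rule name is what later shows up on the recognized phrase rule described in the ISpeechPhraseRule topic.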
