Voice objects can be played from or recorded to memory buffers, voice file descriptors (which are very similar to system file descriptors), or files. Speech is stored in standard file systems. To support applications and packages that rely on storing phrases in the file system, the IRAPI supports mapping talkfile and phrase numbers to files and vice versa.
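The two-way mapping between talkfile/phrase numbers and files can be pictured with a small Python model. This is only a sketch: the directory layout, the root path, and the names phrase_to_path and path_to_phrase are assumptions made for illustration and are not taken from the IRAPI.

```python
import re
from pathlib import PurePosixPath

# Hypothetical root directory and naming scheme, assumed for this sketch only.
SPEECH_ROOT = PurePosixPath("/speech")
_NAME = re.compile(r"^talkfile_(\d+)/phrase_(\d+)$")


def phrase_to_path(talkfile: int, phrase: int) -> PurePosixPath:
    """Map a (talkfile, phrase) pair to the file assumed to hold that phrase."""
    return SPEECH_ROOT / f"talkfile_{talkfile}" / f"phrase_{phrase}"


def path_to_phrase(path) -> tuple:
    """Reverse mapping: recover (talkfile, phrase) from a file path.

    Raises ValueError if the path is outside SPEECH_ROOT or does not
    follow the assumed naming scheme.
    """
    relative = PurePosixPath(path).relative_to(SPEECH_ROOT)
    match = _NAME.match(relative.as_posix())
    if match is None:
        raise ValueError(f"not a phrase file: {path}")
    return int(match.group(1)), int(match.group(2))
```

The point of the model is that the mapping is reversible: converting a phrase to a path and back returns the original talkfile and phrase numbers.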
Speech input and output are under the complete control of the application. They can be stopped explicitly by the application or implicitly by an interrupt. During play and coding, the IRAPI can notify the application of the progress of the action via events. Voice file descriptors can be opened, closed, positioned, and converted to file descriptors. Applications can query speech files to determine the coding algorithm and convert speech files from one algorithm to another. Internal components of the IRAPI are responsible for managing the real-time interface between the file system and speech resources. In most instances, the platform reduces the chance of gaps between speech requests by queuing speech files for continuous play. The IRAPI can speak numbers and characters with correct inflections, using custom speech provided by the application developer. Applications have a similar interface and level of control over TTS activities.
Telephony
The IRAPI provides basic telephony for a variety of signaling interfaces. Applications can answer incoming calls, place outbound calls, query and set per-call information such as Automatic Number Identification (ANI) and Dialed Number Identification Service (DNIS), dial dual-tone multifrequency (DTMF) digits, flash, and hang up. The IRAPI handles the specifics of the telephony type on behalf of the application. If a telephony action is not supported by the telephony type assigned to a channel, the library reports that the operation is unsupported.
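The "unsupported operation" behaviour can be modelled as a capability lookup keyed by telephony type. The sketch below is illustrative only: the type names, the action names, and the Result values are hypothetical and do not correspond to actual IRAPI identifiers or return codes.

```python
from enum import Enum


class Result(Enum):
    OK = 0
    UNSUPPORTED = 1


# Hypothetical capability table: which actions each telephony type allows.
SUPPORTED_ACTIONS = {
    "type_a": {"answer", "dial", "flash", "hangup"},
    "type_b": {"answer", "dial", "hangup"},  # assumed to lack flash
}


def perform(telephony_type: str, action: str) -> Result:
    """Report UNSUPPORTED instead of attempting an action the type lacks."""
    if action not in SUPPORTED_ACTIONS.get(telephony_type, set()):
        return Result.UNSUPPORTED
    # A real library would carry out the signaling for this type here.
    return Result.OK
```

With this shape, the application issues the same call regardless of the telephony type and only has to check for the unsupported result.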
Input queue and speech recognition
Touch tones are collected in a unified input queue that can be manipulated in a variety of ways. The same input queue is used for touch tone and speech recognition input. The IRAPI supports a flexible built-in mechanism for editing input digits, delimiting sequences of digits, timing user responses for the first and subsequent touch-tone digits, and alerting the application when certain input criteria are reached.
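The interplay of delimiters, digit limits, and first-digit and interdigit timers can be sketched as follows. This is a simplified Python model, not the IRAPI input queue: the function collect, its parameters, and the reason strings are assumptions, and key presses are supplied as pre-recorded (time, key) pairs so that the behaviour is deterministic.

```python
from dataclasses import dataclass


@dataclass
class CollectResult:
    digits: str
    reason: str  # "delimiter", "max_digits", "first_timeout", "interdigit_timeout"


def collect(events, *, max_digits=4, delimiter="#",
            first_timeout=5.0, interdigit_timeout=2.0) -> CollectResult:
    """Collect digits from (time_in_seconds, key) pairs, in time order.

    Times are measured from the start of the prompt. Collection ends at the
    delimiter, at max_digits, or when the caller is too slow.
    """
    digits = []
    last_time = 0.0
    for when, key in events:
        # The first digit gets its own, usually longer, timer.
        limit = interdigit_timeout if digits else first_timeout
        if when - last_time > limit:
            reason = "interdigit_timeout" if digits else "first_timeout"
            return CollectResult("".join(digits), reason)
        last_time = when
        if key == delimiter:
            return CollectResult("".join(digits), "delimiter")
        digits.append(key)
        if len(digits) == max_digits:
            return CollectResult("".join(digits), "max_digits")
    # No further key presses: the applicable timer eventually expires.
    reason = "interdigit_timeout" if digits else "first_timeout"
    return CollectResult("".join(digits), reason)
```

For example, collect([(1.0, "1"), (1.5, "2"), (2.0, "#")]) ends on the delimiter with the digits "12", while a first key press arriving after six seconds ends with no digits and a first-digit timeout.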
Applications have complete control over speech recognition. Recognized strings are returned via the input queue and therefore have access to all of the input queue features. In addition, applications can use echo cancellation to improve recognizer accuracy when speech recognition is required during voice play. Applications can control the interruption of speech after receiving input.
Timeslot management
The IRAPI provides functions for managing the H.110 bus and network interface connections to the bus. Applications running on several channels can bridge their H.110 bus time slots together in a variety of ways. An application can monitor an arbitrary channel, which lets it listen to all input and output on that channel. Applications can also allocate time slots and start activities on them.
Channel ownership
The IRAPI uses a default owner to internally manage channel ownership. The default owner is an application that is notified when a channel is freed and no pending request can be satisfied by that channel. Typically, the default owner is the process that is responsible for listening for new calls and dispatching applications in response to them. Any process can become the default owner for a channel.
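The default-owner rule can be summarized in a few lines of Python. This is a minimal model under stated assumptions: the Channel class and its method names are hypothetical, and pending requests are assumed to be served in first-come, first-served order, which the text above does not specify.

```python
from collections import deque


class Channel:
    """Toy model of one channel's ownership (all names are hypothetical)."""

    def __init__(self, default_owner: str):
        self.default_owner = default_owner
        self.owner = default_owner
        self.pending = deque()  # applications waiting for this channel

    def request(self, app: str) -> None:
        """Record that an application is waiting for this channel."""
        self.pending.append(app)

    def free(self) -> str:
        """Free the channel and return its new owner.

        A pending request is satisfied first; the channel falls back to
        the default owner only when nobody is waiting.
        """
        self.owner = self.pending.popleft() if self.pending else self.default_owner
        return self.owner
```

In this model, the default owner regains the channel only after every waiting application has been served.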
Applications can negotiate to acquire specific channels or a channel from a group of channels. As with resources, applications can choose not to wait, to wait for a fixed period of time, or to wait indefinitely for a channel.
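The three waiting modes can be illustrated with a small Python pool built on the standard queue module. The ChannelPool class and its wait parameter are assumptions made for this sketch; it models acquiring any channel from a group and does not cover requests for one specific channel.

```python
import queue


class ChannelPool:
    """Toy model of a group of channels (hypothetical names throughout)."""

    def __init__(self, channels):
        self._free = queue.Queue()
        for channel in channels:
            self._free.put(channel)

    def acquire(self, wait=None):
        """Return a free channel, or None if none became available.

        wait=0     do not wait
        wait=N     wait up to N seconds
        wait=None  wait indefinitely
        """
        try:
            if wait == 0:
                return self._free.get_nowait()
            return self._free.get(timeout=wait)  # timeout=None blocks forever
        except queue.Empty:
            return None

    def release(self, channel) -> None:
        """Return a channel to the group."""
        self._free.put(channel)
```

A caller that cannot afford to block passes wait=0 and handles the None result; a caller that must have a channel passes wait=None and accepts that the call may not return until another application releases one.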
Types of IRAPI processes
IRAPI applications can be processes that start and initialize themselves before they are actually needed by any caller (called permanent processes) or they can be dynamically created only when needed (called transient processes). Any number of applications of each type can be configured or be actively running on any system.
The IRAPI includes a family of functions that allow applications of any type to invoke one another. These functions are modeled on exec(2) and allow one application to replace another from the caller's point of view. This interface is flexible enough to allow IRAPI applications to pass control to transaction state machine (TSM) based applications. When processes invoke one another, they can pass information to the invoked process. This facility supports both standard information, such as ANI and DNIS, and user-definable information.
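The exec-like handoff, in which one application replaces another while call information travels with the call, can be modelled as a simple dispatch loop. The sketch below is not the IRAPI interface: the function run_call, the application names, and the shape of the call record are all hypothetical, and applications are plain Python functions rather than processes.

```python
def run_call(apps, start, call):
    """Run applications one at a time until one of them ends the call.

    Each application receives the shared call record and returns the name
    of the application that should replace it, or None to finish. Returns
    the list of application names in the order they ran.
    """
    trace = []
    current = start
    while current is not None:
        trace.append(current)
        current = apps[current](call)
    return trace


def greeter(call):
    # Attach user-definable information before handing off.
    call["user"]["language"] = "en"
    return "account_menu"


def account_menu(call):
    # Standard fields (ANI, DNIS) and user data arrive unchanged.
    call["user"]["seen_by_menu"] = (call["ani"], call["user"]["language"])
    return None


# Hypothetical call record with placeholder numbers.
call = {"ani": "5550100", "dnis": "5550199", "user": {}}
order = run_call({"greeter": greeter, "account_menu": account_menu}, "greeter", call)
```

Only one application is active at a time, so from the caller's point of view the second application simply takes over where the first left off.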