IRIS Digital Media Programming Guide
(document number: 007-1799-040 / published: 1994-11-14)
Chapter 6. Programming with the Audio Library
The Audio Library (AL) provides a uniform application programming interface (API) for audio input to and output from Silicon Graphics workstations that feature high-quality digital audio systems.
The AL comprises routines that provide these basic capabilities:
creating digital audio input and output connections
reading and writing digital audio data
querying and controlling digital audio data attributes
querying and controlling the configuration of the audio system
handling errors
This section discusses the basic concepts and data structures used by the AL, with particular attention devoted to the programming model, sample data formats, error handling, and programming concepts.
Features of the AL include:
Binary compatibility—AL programs written on one Silicon Graphics workstation equipped with an audio system are executable on other audio-equipped workstations across the product line.
Shared audio resources—more than one audio application can be active at a time, and multiple programs can have input and output streams open concurrently.
Real-time performance—a special group of AL functions useful specifically for writing real-time code.
Audio Library Programming Model
The AL programming model has two basic objects:
| Audio device | | The audio hardware used by the AL, which is shared among audio applications. The audio device contains settings pertaining to the configuration of both the internal audio system and the external electrical connections.
| | ALport | | A one-way (input or output) audio data connection between an application program and the host audio system. An ALport contains:
an audio sample queue, which stores audio samples awaiting input or output
settings pertaining to the attributes of the digital audio data it transports
Some of the settings of an ALport are static; they cannot be changed once the ALport has been opened. Other settings are dynamic; they can be changed while an ALport is open.
| | ALconfig | | An opaque data structure for configuring these settings of an ALport:
audio device (static setting)
size of the audio sample queue (static setting)
number of channels (static setting)
format of the sample data (dynamic setting)
width of the sample data (dynamic setting)
range of floating point sample data (dynamic setting)
|
Digital Audio Data Representation
The digital representation of an audio signal is generated by periodically sampling the amplitude (voltage) of the audio signal. The samples represent periodic “snapshots” of the signal amplitude. The Nyquist Theorem provides a way of determining the minimum sampling frequency required to accurately represent the information (in a given bandwidth) contained in an analog signal. Typically, digital audio information is sampled at a frequency that is at least double the highest interesting analog audio frequency. See The Art of Digital Audio or a similar reference on digital audio for more information.
Digital Audio Sample Rates
The sample rate is the frequency at which samples are taken from the analog signal. Sample rates are measured in hertz (Hz). A sample rate of 1 Hz is equal to one sample per second. For example, when a mono analog audio signal is digitized at a 48 kilohertz (kHz) sample rate, 48,000 digital samples are generated for every second of the signal.
To understand how the sample rate relates to sound quality, consider the fact that a telephone transmits voice-quality audio in a frequency range of about 320 Hz to 3.2 kHz. This frequency range can be represented accurately with a sample rate of 6.4 kHz. The range of human hearing, however, extends up to approximately 18–20 kHz, requiring a sample rate of at least 40 kHz.
The sample rate used for music-quality audio, such as the digital data stored on audio CDs, is 44.1 kHz. A 44.1 kHz digital signal can theoretically represent audio frequencies from 0 kHz to 22.05 kHz, which adequately represents sounds within the range of normal human hearing. The most common sample rates used for DATs are 44.1 kHz and 48 kHz. Higher sample rates result in higher-quality digital signals; however, the higher the sample rate, the greater the signal storage requirement.
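The arithmetic behind these figures can be sketched in plain C. This is an illustration only: it does not call the AL, and the function names min_sample_rate_hz() and bytes_per_second() are hypothetical helpers introduced here, not part of any library.

```c
#include <assert.h>

/* Minimum sample rate (Hz) needed to represent frequencies up to
 * max_freq_hz, using the "at least double" rule described in the text. */
double min_sample_rate_hz(double max_freq_hz)
{
    assert(max_freq_hz >= 0.0);
    return 2.0 * max_freq_hz;
}

/* Storage consumed by one second of audio, in bytes.
 * bytes_per_sample is the size of one stored sample (for example, 2 for
 * 16-bit data). */
long bytes_per_second(long sample_rate_hz, int channels, int bytes_per_sample)
{
    assert(sample_rate_hz > 0 && channels > 0 && bytes_per_sample > 0);
    return sample_rate_hz * channels * bytes_per_sample;
}
```

For instance, bytes_per_second(44100, 2, 2) gives 176,400 bytes for each second of 16-bit stereo data at the CD sample rate, which shows how quickly storage grows with the sample rate.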
Digital Audio Sample Frames
A sample frame is a set of audio samples that are coincident in time. A sample frame for mono data is a single sample. A sample frame for stereo data consists of a left-right sample pair. A sample frame for 4-channel data has two left-right sample pairs (L1, R1, L2, R2).
Stereo samples are interleaved; left-channel samples alternate with right-channel samples. 4-channel samples are also interleaved, but each frame has two left-right sample pairs.
Figure 6-1 shows the relationship between the number of channels and the frame size of audio sample data.
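The interleaved layout can be illustrated with a short C sketch. It assumes 16-bit samples held in short arrays and uses two hypothetical helpers, sample_index() and interleave_stereo(), that are not AL routines.

```c
#include <assert.h>
#include <stddef.h>

/* Index of one sample inside an interleaved buffer.
 * frame is the sample frame number; channel is the 0-based position
 * within the frame (for 4-channel data: 0 = L1, 1 = R1, 2 = L2, 3 = R2). */
size_t sample_index(size_t frame, int channel, int channels)
{
    assert(channels == 1 || channels == 2 || channels == 4);
    assert(channel >= 0 && channel < channels);
    return frame * (size_t)channels + (size_t)channel;
}

/* Interleave separate left and right buffers into stereo sample frames. */
void interleave_stereo(const short *left, const short *right,
                       short *out, size_t frames)
{
    size_t i;
    for (i = 0; i < frames; i++) {
        out[sample_index(i, 0, 2)] = left[i];
        out[sample_index(i, 1, 2)] = right[i];
    }
}
```

A buffer that holds N sample frames therefore needs N, 2N, or 4N sample locations for mono, stereo, or 4-channel data, respectively.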
Digital Audio Sample Formats
The AL uses a digital data format called linear pulse code modulation (PCM) (see the audio references for a definition of this term) to represent digital audio samples.
The formats supported by the AL and the audio system are:
8-bit and 16-bit signed integer
24-bit signed, right-justified within a 32-bit integer
32-bit and 64-bit floating point
 | Note: The audio hardware supports 16-bit I/O for analog data and 24-bit I/O for AES/EBU digital data.
|
For floating point data, the application program specifies the desired range of values for the samples; for example, from -1.0 to 1.0.
Digital Audio Input and Output Sample Resolutions
The native data format used by the audio hardware is 24-bit two's complement integers. The audio hardware sign-extends each 24-bit quantity into a 32-bit word before delivering the samples to the Audio Library.
Audio input samples delivered to the Audio Library from the Indigo, Indigo2, and Indy audio hardware have different levels of resolution, depending on the input source that is currently active; the AL provides samples to the application at the desired resolution. You can also write your own conversion routine if desired.
Microphone/line-level input samples come from analog-to-digital (A/D) converters, which have 16-bit resolution. These samples are treated as 24-bit samples with 0's in the low 8 bits.
AES/EBU digital input samples have either 20-bit or 24-bit resolution, depending on the device that is connected to the digital input; for the 20-bit case (the most common), samples are treated as 24-bit samples, with 0's in the least significant 4 bits. The AL passes these samples through to the application if 24-bit two's complement is specified. If two's complement with 8-bit or 16-bit resolution is specified, the AL right-shifts the samples so that they will fit into a smaller word size. For floating point data, the AL converts from the 24-bit format to floating point, using a scale factor specified by the application to map the peak integer values to peak float values.
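The input conversions described above can be modeled in portable C. This is a sketch of the arithmetic, not the AL's implementation: the function names are hypothetical, and the choice of 2^23 as the integer peak that maps to the floating point maximum is an assumption, since the text does not give the exact scale factor.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic right shift that does not depend on how the compiler
 * treats right shifts of negative values. */
static int32_t shift_right(int32_t x, int bits)
{
    return (x >= 0) ? (x >> bits) : ~((~x) >> bits);
}

/* Narrow a sign-extended 24-bit sample to 16 or 8 bits by discarding
 * the least significant bits. */
int16_t sample24_to_16(int32_t s24) { return (int16_t)shift_right(s24, 8); }
int8_t  sample24_to_8(int32_t s24)  { return (int8_t)shift_right(s24, 16); }

/* Map a 24-bit sample to floating point so that integer full scale
 * (assumed here to be 2^23) corresponds to float_max. */
double sample24_to_float(int32_t s24, double float_max)
{
    assert(float_max > 0.0);
    return (double)s24 * (float_max / 8388608.0);
}
```

A 16-bit microphone sample that arrives as 0x123400 (zeros in the low 8 bits) comes back as 0x1234 from sample24_to_16(), so no information is lost for analog input read through a 16-bit port.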
For audio output, the AL delivers samples to the audio hardware as 24-bit quantities sign-extended to fill 32-bit words. The actual resolution of the samples from a given output port depends on the application program connected to the port. For example, an application may open a 16-bit output port, in which case the 24-bit samples arriving at the audio processor will contain 0's in their least significant 8 bits.
The Audio Library is responsible for converting between the output sample format specified by an application and the 24-bit native format of the audio hardware. For 8-bit or 16-bit integer samples, this conversion is accomplished by left-shifting each sample written to the output port by 16 bits and 8 bits, respectively. For 32-bit or 64-bit floating point samples, this conversion is accomplished by rescaling each sample from the range of floating point values that is specified by the application to the full 24-bit range and then rounding the sample to the nearest integer value.
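The output conversions can be sketched the same way. The code below is an illustration in plain C with hypothetical function names; the exact rounding and limiting behavior of the AL is not specified in the text, so rounding half away from zero and clamping to the 24-bit range are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Widen integer samples to the 24-bit native range. Multiplying by a
 * power of two has the same effect as the left shift described in the
 * text, without shifting negative values. */
int32_t sample16_to_24(int16_t s) { return (int32_t)s * 256; }
int32_t sample8_to_24(int8_t s)   { return (int32_t)s * 65536; }

/* Rescale a floating point sample from [-float_max, float_max] to the
 * 24-bit range, limit it to representable values, and round to the
 * nearest integer. */
int32_t float_to_sample24(double x, double float_max)
{
    double scaled;

    assert(float_max > 0.0);
    scaled = x * (8388608.0 / float_max);
    if (scaled > 8388607.0)
        return 8388607;               /* limit positive peaks */
    if (scaled < -8388608.0)
        return -8388608;              /* limit negative peaks */
    return (int32_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}
```

Note that sample16_to_24() always produces zeros in the least significant 8 bits, which matches the description of 16-bit output ports above.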
Handling Audio Library Errors
This section describes techniques for error handling in AL applications.
When the AL encounters an error, it:
Checks to see whether an error handler is set, and if so, calls the specified routine.
Sets an error code, and returns a failure from the function call.
The default error handler prints a message to stderr. Although these error messages may be helpful for debugging during the development phase, you should turn off the default error handler in a finished application and provide more effective error handling by using the IRIX oserror(3C) library routine to retrieve the error codes.
To turn off the default error handler, call ALseterrorhandler(). Its function prototype is:
ALerrfunc ALseterrorhandler ( ALerrfunc efunc )
|
where:
| efunc | | is a pointer to an alternate error-handling routine of type ALerrfunc that is declared as:
void errorfunc ( long arg1, const char* arg2, [args] )
|
|
Substituting zero for efunc turns off the error handler.
Most AL routines set error codes when they fail. Throughout this guide, the return values and relevant error codes are listed along with the description of each routine. You can retrieve these error codes by calling oserror(3C). Based on these error codes, programs can adapt or recover, alert the user by displaying a dialog box, or print information to the shell window from which the application was launched.
Audio Library Application Programming Concepts
Typically, your AL program must:
initialize data structures
set up buffers for passing data between your application and the CPU
query for available features
configure and open audio connections
pass data to and from the ALport and operate on that data
process errors
close audio connections
free system resources
The sections that follow explain these concepts in detail.
Initializing an Audio Library Application
To enable audio input and output, your application must create and configure the required audio I/O connections. This section describes how to set up and use the AL data structures that provide audio I/O capability.
The AL provides an opaque data structure called an ALport for audio I/O connections. An ALport provides a one-way (input or output) mono, stereo, or 4-channel audio data connection between an application program and the host audio system. More than one ALport can be opened by the same application; the number of ALports that can be active at the same time depends on the hardware and software configurations you are using.
An ALport consists of a sample queue and static and dynamic state information. For audio input, the hardware places audio samples in an input port's queue at a constant rate, and your application program reads the samples from the queue. Similarly, for audio output, your application writes audio samples to an output port's queue, and the audio hardware removes the samples from the queue. A minimum of two ALports are necessary to provide input and output capability for an audio application.
Using ALconfig Structures to Configure ALports
You can open an ALport with the default configuration or you can customize an ALconfig for configuring an ALport suited to your application needs.
The default ALconfig has:
These settings provide an ALport that is compatible with CD- and DAT-quality data, but if your application requires different settings, you must create an ALconfig with the proper settings before opening a port. The device, channel, and queue-size settings for an ALport are static—they cannot be changed after the port has been opened.
The steps involved in configuring and opening an ALport are listed below, followed by a sample code fragment that illustrates each of these steps. The sample program is followed by subsections that describe these concepts more fully and explain the use of each routine listed here.
Turn off the default error handler by passing a 0 to ALseterrorhandler().
If the default ALconfig settings are satisfactory, you can simply open a default ALport by using 0 for the configuration in the ALopenport() routine; otherwise, create a new ALconfig by calling ALnewconfig().
If nondefault values are needed for any of the ALconfig settings, set the desired values as follows:
Call ALsetchannels() to change the number of channels.
Call ALsetqueuesize() to change the sample queue size.
Call ALsetsampfmt() to change the sample data format.
Call ALsetwidth() to change the sample data width.
Call ALsetfloatmax() to set the maximum amplitude of floating point data (not necessary for integer data formats).
Open an ALport by passing the ALconfig to the ALopenport() routine.
Create additional ALports with the same settings by using the same ALconfig to open as many ports as are needed.
Example 6-1 demonstrates how to configure and open an output ALport that accepts floating point mono samples.
Example 6-1. Configuring and Opening an ALport
#include <audio.h>
#include <errno.h>     /* declares oserror() on IRIX */
#include <stdio.h>
#include <stdlib.h>

ALconfig audioconfig;
ALport audioport;
int err;

void audioinit(void) /* Configure an audio port */
{
    ALseterrorhandler(0);                         /* turn off the default error handler */
    audioconfig = ALnewconfig();
    ALsetsampfmt(audioconfig, AL_SAMPFMT_FLOAT);  /* floating point samples */
    ALsetfloatmax(audioconfig, 10.0);             /* sample range is [-10.0, 10.0] */
    ALsetqueuesize(audioconfig, 44100);
    ALsetchannels(audioconfig, AL_MONO);
    audioport = ALopenport("surreal", "w", audioconfig);
    if (audioport == (ALport) 0) {
        err = oserror();
        if (err == AL_BAD_NO_PORTS) {
            fprintf(stderr, " System is out of audio ports\n");
        } else if (err == AL_BAD_DEVICE_ACCESS) {
            fprintf(stderr, " Couldn't access audio device\n");
        } else if (err == AL_BAD_OUT_OF_MEM) {
            fprintf(stderr, " Out of memory\n");
        }
        exit(1);
    }
}
|
The sections that follow explain how to use ALconfigs in greater detail.
To create a new ALconfig structure that is initialized to the default settings, call ALnewconfig(). Its function prototype is:
ALconfig ALnewconfig ( void )
|
The ALconfig that is returned can be used to open a default ALport, or you can modify its settings to create the configuration you need. In Example 6-1, the channel, queue size, sample format, and floating point data range settings of an ALconfig named audioconfig are changed.
ALnewconfig() returns an ALconfig structure upon successful completion; otherwise, it returns 0 and sets an error code that you can retrieve by calling oserror(3C). Possible errors include:
Setting and Getting the Number of Channels for an ALconfig
An ALport can be configured for one, two, or four audio channels. The channel setting remains in effect as long as the port is open.
To set the number of channels for an ALconfig structure, call ALsetchannels(). Its function prototype is:
int ALsetchannels ( ALconfig config, long channels )
|
where:
| config | | is the ALconfig for which you want to set the channels
| | channels | | is the number of channels to configure: 1, 2, or 4
|
Any ALport that you open with this config will have the number of channels that you set in channels.
ALsetchannels() returns 0 upon successful completion; otherwise, it returns -1 and sets an error code that you can retrieve by calling oserror(3C). Possible errors include:
To retrieve the channel setting of a given ALconfig structure, call ALgetchannels(). Its function prototype is:
long ALgetchannels ( ALconfig config )
|
where:
| config | | is the ALconfig structure being queried
|
ALgetchannels() returns the channel setting of config, upon successful completion; otherwise, it returns -1 and sets an error code that you can retrieve by calling oserror(3C). Possible errors include:
Setting and Getting the Sample Queue Size for an ALconfig
Selecting the proper size for the sample queue is very important, because continuous sound output depends on the ability of the application to fill the queue at least as fast as the hardware empties it. For example, if the queue is too small, the application may take too long to supply new samples, resulting in audible breaks that sound like pops or clicks. The size of the queue determines the maximum delay that can be tolerated while waiting for the application to get more samples at the given sample rate. To determine how much space to allocate for the sample queue, consider the data type and rate. For example, the default queue size of 100,000 samples provides buffer space for slightly more than one second of 48 kHz stereo audio data, and a little more than three seconds of 32 kHz mono data. To better understand these phenomena, see Figure 6-2 for an illustration of a sample queue and read the associated discussion.
 | Tip: The main point to be concerned about is how full to keep the queue, regardless of its size. If the queue is full, more time passes before samples are played. The ideal situation is to keep enough samples in the queue to allow for the longest possible delay that will be experienced in retrieving the next batch of samples. See “Real-time Programming Techniques for Audio” for an explanation of how to set the fill threshold for a queue.
|
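The relationship between queue size and tolerable delay can be worked through with a small C sketch. The helpers queue_seconds() and queue_samples_for_delay() are hypothetical names used for illustration; they perform only the arithmetic and do not call the AL.

```c
#include <assert.h>

/* Seconds of audio that a queue of queue_samples sample locations can
 * hold. Each second of audio consumes sample_rate_hz * channels samples. */
double queue_seconds(long queue_samples, long sample_rate_hz, int channels)
{
    assert(queue_samples >= 0 && sample_rate_hz > 0 && channels > 0);
    return (double)queue_samples / ((double)sample_rate_hz * channels);
}

/* Number of samples needed to cover a worst-case delay of delay_seconds
 * between refills, rounded up to a whole number of sample frames. */
long queue_samples_for_delay(double delay_seconds, long sample_rate_hz,
                             int channels)
{
    double exact_frames;
    long frames;

    assert(delay_seconds >= 0.0 && sample_rate_hz > 0 && channels > 0);
    exact_frames = delay_seconds * (double)sample_rate_hz;
    frames = (long)exact_frames;
    if ((double)frames < exact_frames)
        frames++;                     /* round up to a whole frame */
    return frames * channels;
}
```

With the default queue size, queue_seconds(100000, 32000, 1) evaluates to 3.125 seconds, which agrees with the "a little more than three seconds" figure given for 32 kHz mono data.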
The noninclusive values for minimum and maximum allowable queue sizes for ALports on Indigo, Indigo2, and Indy workstations are listed in Table 6-1.
Table 6-1. Minimum and Maximum Allowable Queue Sizes for ALports
| ALport Type | Minimum Size | Maximum Size |
|---|---|---|
| Mono | 510 | 131,069 |
| Stereo | 1019 | 262,139 |
| 4-channel on Indigo | 2038 | 524,278 |
| 4-channel on Indigo2 or Indy | 1019 | 262,139 |
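An application might check a requested queue size against the limits in Table 6-1 before building an ALconfig. The following is a minimal sketch in plain C; queue_size_is_valid() and its on_indigo flag are hypothetical and are not part of the AL, and the limits are copied from the table rather than queried from the system.

```c
#include <assert.h>

/* Returns 1 if size lies strictly between the Table 6-1 limits for the
 * given channel count, and 0 otherwise. The limits are noninclusive.
 * on_indigo selects the Indigo row for 4-channel ports and is ignored
 * for mono and stereo ports. */
int queue_size_is_valid(long size, int channels, int on_indigo)
{
    long min_size, max_size;

    assert(channels == 1 || channels == 2 || channels == 4);
    if (channels == 1) {
        min_size = 510;
        max_size = 131069;
    } else if (channels == 2 || !on_indigo) {
        min_size = 1019;
        max_size = 262139;
    } else {
        min_size = 2038;
        max_size = 524278;
    }
    return size > min_size && size < max_size;
}
```

Both the default queue size of 100,000 for a stereo port and the 44,100-sample mono queue used in Example 6-1 pass this check.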
To specify an ALconfig with a sample queue size other than the default for an ALport, call ALsetqueuesize(). Its function prototype is:
int ALsetqueuesize ( ALconfig config, const long size )
|
where:
| config | | is the ALconfig structure for which you want to change the sample queue size
| | size | | is the number of sample locations to allocate for the queue
|
Any ALport that you open with this config will have a queue size of size.
ALsetqueuesize() returns 0 upon successful completion; otherwise, it returns -1 and sets an error code that you can retrieve by calling oserror(3C). Possible errors include:
To retrieve the size of the sample queue in a given ALconfig structure, call ALgetqueuesize(). Its function prototype is:
long ALgetqueuesize ( ALconfig config )
|
where:
| config | | is the ALconfig structure being queried
|
ALgetqueuesize() returns the queuesize of config upon successful completion; otherwise, it returns -1 and sets an error code that you can retrieve by calling oserror(3C). Possible errors include:
Setting and Getting the Sample Data Format for an ALconfig
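The AL allows you to choose between three sample formats: two's complement integer (AL_SAMPFMT_TWOSCOMP), 32-bit floating point (AL_SAMPFMT_FLOAT), and 64-bit floating point (AL_SAMPFMT_DOUBLE).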
To set the sample format type of a given ALconfig structure, call ALsetsampfmt(). Its function prototype is:
int ALsetsampfmt ( ALconfig config, long sampleformat )
|
where:
| config | | is the ALconfig structure for which you want to change the sample format
| | sampleformat | | must be one of three symbolic constants: AL_SAMPFMT_TWOSCOMP, AL_SAMPFMT_FLOAT, or AL_SAMPFMT_DOUBLE
|
Any ALport that you open with this config will have a sample format of sampleformat.
ALsetsampfmt() returns 0 upon successful completion; otherwise, it returns -1 and sets an error code that you can retrieve by calling oserror(3C). Possible errors include:
To retrieve the sample format of a given ALconfig structure, call ALgetsampfmt(). Its function prototype is:
long ALgetsampfmt ( ALconfig config )
|
where:
| config | | is the ALconfig structure being queried
|
ALgetsampfmt() returns the sampleformat setting of config upon successful completion; otherwise, it returns -1 and sets an error code that you can retrieve by calling oserror(3C). Possible errors include:
Setting and Getting the Integer Sample Width for an ALconfig
The sample width represents the degree of precision to which the full-scale range of an audio signal can be sampled. You can specify the width of two's complement integer sample data, but you can't specify the width of floating point samples. Thus, setting the sample width has no effect when the sample format is AL_SAMPFMT_FLOAT or AL_SAMPFMT_DOUBLE; however, the width setting does have an effect if the sample format is subsequently changed to AL_SAMPFMT_TWOSCOMP.
The sample width also determines which data type the AL uses when reading and writing samples. The sample widths available for two's complement data, and their associated resolutions and data types, are:
| 8-bit samples | | representing a total of 2^8 quantized signal values. The AL treats 8-bit samples as packed, signed characters (chars).
| | 16-bit samples | | representing a total of 2^16 quantized signal values. The AL treats 16-bit samples as packed, signed short integers (shorts).
| | 24-bit samples | | representing a total of 2^24 quantized signal values. The AL treats 24-bit samples as right-justified, sign-extended, signed 32-bit integers (longs).
|
For all sample widths, sample values map linearly to intermediate signal amplitudes.
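The number of quantized values and the resulting sample range for each width follow directly from two's complement arithmetic. The sketch below uses hypothetical helper names and plain bit widths (8, 16, 24) rather than the AL's symbolic width constants.

```c
#include <assert.h>

/* Number of distinct values for a two's complement sample width. */
long quantization_levels(int width_bits)
{
    assert(width_bits == 8 || width_bits == 16 || width_bits == 24);
    return 1L << width_bits;
}

/* Smallest and largest sample values for that width. */
long sample_min(int width_bits)
{
    return -(quantization_levels(width_bits) / 2);
}

long sample_max(int width_bits)
{
    return quantization_levels(width_bits) / 2 - 1;
}
```

For 16-bit data this yields 65,536 levels spanning -32,768 to 32,767, which is the range of a signed short.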
To specify the sample width setting of two's complement data for an ALconfig structure, call ALsetwidth(). Its function prototype is:
int ALsetwidth ( ALconfig config, long samplesize )
|
where:
| config | | is the ALconfig structure for which you want to change the sample width
| | samplesize | | is a symbolic constant denoting the sample width
|
Any ALport that you open with this config will have a sample width of samplesize.
ALsetwidth() returns 0 upon successful completion; otherwise it returns -1 and sets an error code that you can retrieve by calling oserror(3C). Possible errors include:
To retrieve the current sample width setting of an ALconfig structure, call ALgetwidth(). Its function prototype is:
long ALgetwidth ( ALconfig config )
|
where:
| config | | is the ALconfig structure being queried
|
ALgetwidth() returns the samplesize of config upon successful completion; otherwise, it returns -1 and sets an error code that you can retrieve by calling oserror(3C). Possible errors include:
Getting and Setting the Floating Point Data Range
If you configure an ALport to use floating point data (a sample format of either AL_SAMPFMT_FLOAT or AL_SAMPFMT_DOUBLE), you need to define a maximum value in order to set the upper and lower bounds of the samples that pass through that port. Setting the floating point maximum value specifies a symmetrical range that is centered about zero.
 | Tip: To have more control over the scaling, you can program your application to perform its own floating point-to-integer conversion before passing samples through the ALport.
|
To set the maximum value of floating point data, call ALsetfloatmax(). Its function prototype is:
int ALsetfloatmax ( ALconfig config, double maximum_value )
|
where:
Samples read into any ALport that you open with this config are scaled to the range [-maximum_value, maximum_value]. Samples output from this ALport should be in the range [-maximum_value, maximum_value] to avoid limiting. The default maximum value is 1.0.
 | Note: The number of quantization steps that can be represented by floating point samples is a function of the value of maximum_value. If maximum_value is too small, you will not be able to represent 2^16 evenly spaced amplitude levels.
|
ALsetfloatmax() has no effect when the sample format is AL_SAMPFMT_TWOSCOMP; however, the maximum_value setting takes effect if the sample format is subsequently changed to AL_SAMPFMT_FLOAT or AL_SAMPFMT_DOUBLE.
ALsetfloatmax() returns 0 upon successful completion; otherwise, it returns -1 and sets an error code that you can retrieve by calling oserror(3C). Possible errors include:
To retrieve the floating point maximum value, call ALgetfloatmax(). Its function prototype is:
double ALgetfloatmax ( ALconfig config )
|
where:
| config | | is the ALconfig structure being queried
|
ALgetfloatmax() returns the maximum_value of config upon successful completion; otherwise, it returns 0 and sets an error code that you can retrieve by calling oserror(3C). Possible errors include:
Retrieving the Setup of an Existing ALport
You can retrieve an ALconfig whose settings match those of an existing ALport. This is an easy way to create an ALconfig to use for changing the dynamic settings of an ALport, as described next in “Modifying the Audio Data Attributes of an Open ALport”.
To retrieve a new ALconfig structure that is a clone of an existing ALconfig structure already in use by an existing audio port, call ALgetconfig(). Its function prototype is:
ALconfig ALgetconfig ( ALport port )
|
where:
| port | | is the audio port whose ALconfig structure is being cloned
|
You should call ALfreeconfig() to deallocate the ALconfig when it is no longer needed.
ALgetconfig() returns an ALconfig structure upon successful completion; otherwise, it returns 0 and sets an error code that you can retrieve by calling oserror(3C). Possible errors include:
Modifying the Audio Data Attributes of an Open ALport
In general, you don't change the settings for an ALport while it is open, but sometimes you might need to modify the audio data attributes of an ALport while it is open. For example, to create continuous output from multiple sound files that have different sample widths, such as 8-bit and 16-bit data, an application might need to change the sample width of the output port to match the output data, without closing and reopening the port, in order to prevent interruptions in the output.
To change the data attributes of an ALport instantaneously, use ALsetsampfmt(), ALsetfloatmax(), and ALsetwidth() as needed to define the settings of an ALconfig, which you then pass to the ALsetconfig() routine. The only settings that can be modified with this method are the sample format, the sample width, and the maximum floating point value. You can't use this method to change the audio device, the queue size, or the number of channels in an ALport.
ALsetconfig() changes an audio port's ALconfig structure to match that of a given ALconfig. Its function prototype is:
int ALsetconfig ( ALport port, ALconfig config )
|
where:
| port | | is the audio port for which you want to change the ALconfig settings
| | config | | is the ALconfig from which the new settings are copied
|
ALsetconfig() returns 0 upon successful completion; otherwise, it returns -1 and sets an error code that you can retrieve by calling oserror(3C). Possible errors include:
Freeing Resources Associated with an ALconfig
To minimize memory consumption, you should free the memory associated with an ALconfig that is no longer needed. An ALconfig is no longer needed if the application is not going to open any more ports with it.
To deallocate an ALconfig structure, call ALfreeconfig(). Its function prototype is:
int ALfreeconfig ( ALconfig config )
|
where:
| config | | is the ALconfig to deallocate. Freeing an ALconfig structure does not affect any port(s) that have already been opened using that ALconfig
|
ALfreeconfig() returns 0 on successful completion; otherwise, it returns -1 and sets an error code that you can retrieve by calling oserror(3C). Possible errors include:
Opening and Closing Audio Ports
An ALport provides a one-way (input or output) mono, stereo, or 4-channel audio data connection between an application program and the host audio system. More than one ALport can be opened by the same application; the number of ALports that can be active at the same time depends on the hardware and software configurations you are using. Open ALports use CPU resources, so be sure to close an ALport when I/O is completed and free the ALconfig when it is no longer needed.
Audio ports are opened and closed by using ALopenport() and ALcloseport(), respectively. Unless you plan to use the default port configuration, you should set up an ALconfig structure by using ALnewconfig() and then use the routines for setting ALconfig fields, such as ALsetchannels(), ALsetqueuesize(), and ALsetwidth() before calling ALopenport().
To allocate and initialize an ALport structure, call ALopenport(). Its function prototype is:
ALport ALopenport ( char *name, char *direction,
ALconfig config )
|
where:
| name | | is an ASCII string used to identify the port for humans (much like a window title in a graphics program). The name is limited to 20 characters and should be both descriptive and unique, such as an acronym for your company name or the application name, followed by the purpose of the port
| | direction | | specifies whether the port is for input or output: "r" opens an input port and "w" opens an output port, as in the examples in this chapter
| | config | | is an ALconfig that you have previously defined or is null (0) for the default configuration.
|
Upon successful completion, ALopenport() returns an ALport structure for the named port; otherwise, it returns a null-valued ALport, and sets an error code that you can retrieve by calling oserror(3C). Possible errors include:
ALcloseport() closes and deallocates an audio port—any samples remaining in the port will not be output. Its function prototype is:
int ALcloseport ( ALport port )
|
where:
| port | | is the ALport you want to close
|
Example 6-2 opens an input port and an output port and then closes them.
Example 6-2. Opening Input and Output ALports
input_port = ALopenport("waycoolinput", "r", 0);
if (input_port == (ALport) 0) {
    err = oserror();
    if (err == AL_BAD_NO_PORTS) {
        fprintf(stderr, " System is out of audio ports\n");
    } else if (err == AL_BAD_DEVICE_ACCESS) {
        fprintf(stderr, " Couldn't access audio device\n");
    } else if (err == AL_BAD_OUT_OF_MEM) {
        fprintf(stderr, " Out of memory: port open failed\n");
    }
    exit(1);
}
...
output_port = ALopenport("killeroutput", "w", 0);
if (output_port == (ALport) 0) {   /* check the port that was just opened */
    err = oserror();
    if (err == AL_BAD_NO_PORTS) {
        fprintf(stderr, " System is out of audio ports\n");
    } else if (err == AL_BAD_DEVICE_ACCESS) {
        fprintf(stderr, " Couldn't access audio device\n");
    } else if (err == AL_BAD_OUT_OF_MEM) {
        fprintf(stderr, " Out of memory: port open failed\n");
    }
    exit(1);
}
...
ALcloseport(input_port);
ALcloseport(output_port);
|
Reading and Writing Audio Data
This section explains how an audio application reads and writes audio samples to and from ALports.
Using Audio Sample Queues
Audio samples are placed in the sample queue of an ALport to await input or output (see Figure 6-2). The audio system uses one end of the sample queue; the audio application uses the other end.
During audio input (left side of Figure 6-2), the audio hardware continuously writes audio samples to the tail of the input queue at the selected input rate, for example, 44,100 sample pairs per second for 44.1 kHz stereo data. If the application can't read the samples from the head of the input queue at least as fast as the hardware writes them, the queue fills up and some incoming sample data is irretrievably lost.
During audio output (right side of Figure 6-2), the application writes audio samples to the tail of the queue. The audio hardware continuously reads samples from the head of the output queue at the selected output rate, for example, 44,100 sample pairs per second for 44.1 kHz stereo data, and sends them to the outputs. If the application can't put samples in the queue as fast as the hardware removes them, the queue empties, causing the hardware to send 0-valued samples to the outputs (until more data is available), which are perceived as pops or breaks in the sound.
For example, if an application opens a stereo output port with a queue size of 100,000, and the output sample rate is set to 48 kHz, the hardware removes 2 × 48,000 = 96,000 samples from the queue every second. Because the queue holds only about one second of stereo data at that rate, the application must supply samples at an average rate of at least 96,000 per second. If the application fails to supply data at this rate, an audible break occurs in the audio output.
On the other hand, if an application tries to put 40,000 samples into a queue that already contains 70,000 samples, there isn't enough space in the queue to store all the new samples, and the program will block (wait) until enough of the existing samples have been removed to allow for all 40,000 new samples to be put in the queue. The AL routines for reading and writing block; they do not return until the input or output is complete.
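How long such a blocking write waits can be estimated with simple arithmetic. The sketch below is plain C with hypothetical helper names; it assumes the 100,000-sample stereo queue and 48 kHz rate from the preceding example, and it assumes the hardware drains the queue at exactly the nominal rate.

```c
#include <assert.h>

/* Number of samples that must drain from an output queue before a
 * write of n samples can complete. Zero means the write fits now. */
long samples_to_drain(long queue_size, long filled, long n)
{
    long fillable;

    assert(queue_size > 0 && filled >= 0 && filled <= queue_size && n >= 0);
    fillable = queue_size - filled;
    return (n > fillable) ? n - fillable : 0;
}

/* Approximate time a blocking write waits, given that the hardware
 * removes sample_rate_hz * channels samples per second. */
double blocking_seconds(long queue_size, long filled, long n,
                        long sample_rate_hz, int channels)
{
    assert(sample_rate_hz > 0 && channels > 0);
    return (double)samples_to_drain(queue_size, filled, n)
           / ((double)sample_rate_hz * channels);
}
```

Writing 40,000 samples into a 100,000-sample queue that already holds 70,000 requires 10,000 samples to drain first, so at 48 kHz stereo the call would block for roughly a tenth of a second.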
Figure 6-2 shows how input and output ports use audio sample queues.
Monitoring the Audio Sample Queue Status to Provide Nonblocking I/O
This section explains how to use the AL routines for monitoring the status of an ALport's sample queue.
The AL maintains the following status information about the queue:
| filled | | the number of queue locations that contain valid data
| | fillable | | the number of empty locations in the queue
|
The sum of the empty locations and the full locations is the total size of the queue:
filled + fillable = queuesize
Checking the filled and fillable statuses before reading and writing prevents blocking and helps prevent overflow and underflow errors.
ALgetfillable() and ALgetfilled() provide instantaneous information on the state of an audio port's queue.
To prevent blocking during output, you can determine how many samples will fit into the queue by calling ALgetfillable() before writing any samples, and then write only that many samples to the queue.
To get the number of empty queue locations in a given ALport, call ALgetfillable(). Its function prototype is:
long ALgetfillable ( ALport port )
|
where:
| port | | is the audio port whose queue is being examined
|
The value returned indicates how many samples can still be written without blocking.
To prevent blocking during input, you can determine how many samples are in the queue by calling ALgetfilled() before reading any samples, then read only that many samples from the queue. You can also periodically check ALgetfilled() to find out whether all of your output data has drained before you shut down a port by calling ALcloseport().
To find out how many queue locations in a given audio port currently have valid samples in them at a given instant, call ALgetfilled(). Its function prototype is:
long ALgetfilled ( ALport port )
|
where:
| port | | is the audio port whose queue is being examined
|
The value returned indicates how many samples can still be read without blocking if port is an input port or how many samples have yet to be played if it is an output port.
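The nonblocking strategy described above can be sketched as a small helper. This is only an illustration under stated assumptions: nonblocking_count() is a hypothetical name, not an AL routine, and its available argument is assumed to hold the value your application obtained from ALgetfillable() (before writing) or ALgetfilled() (before reading). The result is also rounded down to a whole number of frames.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of the AL): largest transfer that
 * will not block, rounded down to a whole number of sample frames.
 *
 * available - value reported by ALgetfillable() for an output port,
 *             or by ALgetfilled() for an input port
 * wanted    - number of samples the application would like to transfer
 * channels  - samples per frame (1, 2, or 4)
 */
long nonblocking_count(long available, long wanted, long channels)
{
    long n;

    assert(available >= 0 && wanted >= 0 && channels > 0);

    /* never ask for more than the queue can accept or supply right now */
    n = (wanted < available) ? wanted : available;

    /* drop any partial frame at the end of the transfer */
    return n - (n % channels);
}
```

With the earlier example of a 100,000-sample queue that already contains 70,000 samples, 30,000 locations are fillable, so a request to write 40,000 stereo samples would be trimmed to 30,000 and the remainder written later.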
More Methods for Working with Queues
Besides using these routines, you can use ALgetstatus() to check for underflow and overflow errors, as described in “Detecting Errors in the Audio Stream”.
“Real-time Programming Techniques for Audio” discusses how to use several other routines that allow an application to view and modify the dynamic state of an audio port. These routines are most useful in developing real-time audio applications.
Reading and Writing Samples
Audio input is accomplished by reading audio data samples from an input ALport's sample queue. Similarly, audio output is accomplished by writing audio data samples to an output ALport's sample queue.
ALreadsamps() and ALwritesamps() provide mechanisms for transferring audio samples to and from sample queues. They are blocking routines, which means that a program will halt execution within the ALreadsamps() or ALwritesamps() call until the request to read or write samples can be completed.
Reading Samples from an Input ALport
ALreadsamps() reads a specified number of samples from an input port to a sample data buffer, blocking until the requested number of samples have been read from the port. Its function prototype is:
int ALreadsamps( const ALport port, void *samples,
const long samplecount )
|
where:
| port | | is an audio port configured for input
| | samples | | is a pointer to a buffer into which you want to transfer the samples read from input. samples is treated as one of the following types, depending on the configuration of the ALport:
| | samplecount | | is the number of samples to read
|
To prevent blocking, samplecount must not exceed the return value of ALgetfilled().
 | Note: When the application is reading samples from an ALport that has channels set to 4, samplecount must be an integer multiple of the frame size, or an error will be returned and no samples will be transferred.
|
When 4-channel data is input on systems that do not support 4 line-level electrical connections, that is, when setting AL_CHANNEL_MODE to AL_4CHANNEL is not possible, ALreadsamps() will provide 4 samples per frame, but the second pair of samples will be set to 0.
Table 6-2 shows the input conversions that are applied when reading mono, stereo, and 4-channel input in stereo mode (default) and in 4-channel mode hardware configurations. Each entry in the table represents a sample frame.
Table 6-2. Input Conversions for ALreadsamps()
| Input | Hardware Configuration: Indigo, and Indigo2 or Indy in Stereo Mode | Hardware Configuration: Indigo2 or Indy in 4-channel Mode |
|---|---|---|
| Frame at physical inputs | (L1, R1) | (L1, R1, L2, R2) |
| Frame as read by a mono port | (L1 + R1) /2 | (Clip (L1 + L2), Clip (R1 + R2)) /2 |
| Frame as read by a stereo port | (L1, R1) | (Clip (L1 + L2), Clip (R1 + R2)) |
| Frame as read by a 4-channel port | (L1, R1, 0, 0) | (L1, R1, L2, R2) |
 | Note: If the summed signal is greater than the maximum allowed by the audio system, it is clipped (limited) to that maximum, as indicated by the Clip function.
|
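As an illustration of the Clip function, the following sketch reproduces the Table 6-2 entry for a stereo port reading from hardware in 4-channel mode. It assumes 16-bit two's complement samples and a frame laid out as (L1, R1, L2, R2). The names clip16() and stereo_from_4channel() are hypothetical, and the actual limit applied by the audio system depends on the sample format in use.

```c
#include <assert.h>

/* Hypothetical helper: limit a summed value to the 16-bit sample range. */
short clip16(long sum)
{
    if (sum > 32767)
        return 32767;
    if (sum < -32768)
        return -32768;
    return (short)sum;
}

/*
 * Hypothetical helper: stereo frame as read in 4-channel hardware mode,
 * that is, (Clip(L1 + L2), Clip(R1 + R2)).
 * in  - one 4-channel frame in the order (L1, R1, L2, R2)
 * out - receives the resulting stereo frame (L, R)
 */
void stereo_from_4channel(const short in[4], short out[2])
{
    assert(in != 0 && out != 0);

    /* sum in a wider type so the addition itself cannot overflow */
    out[0] = clip16((long)in[0] + (long)in[2]);
    out[1] = clip16((long)in[1] + (long)in[3]);
}
```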
Writing Samples to an Output ALport
Samples placed in an output queue are played by the audio hardware after a specific amount of time, which is equal to the number of samples that were present in the queue before the new samples were written, divided by the (sample rate × number of channels) settings of the ALport.
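The delay described above is simple arithmetic, sketched here under the assumption that the queue length is expressed in samples (not frames) and that the port keeps up with the hardware. The function name output_delay_seconds() is hypothetical and is not an AL routine. The queued_samples argument could, for example, be the value returned by ALgetfilled() just before the write.

```c
#include <assert.h>

/*
 * Hypothetical helper: time, in seconds, before newly written samples
 * reach the outputs.
 *
 * queued_samples - samples already in the queue before the write
 * rate_hz        - output sample rate in Hz (frames per second)
 * channels       - samples per frame (1, 2, or 4)
 */
double output_delay_seconds(long queued_samples, long rate_hz, long channels)
{
    assert(queued_samples >= 0 && rate_hz > 0 && channels > 0);

    return (double)queued_samples / ((double)rate_hz * (double)channels);
}
```

For instance, 70,000 samples queued on a 44.1 kHz stereo port give 70,000 / 88,200, or roughly 0.79 seconds of delay.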
ALwritesamps() writes a specified number of samples to an output port from a sample data buffer, blocking until the requested number of samples have been written to the port. Its function prototype is:
int ALwritesamps ( ALport port, void *samples,
long samplecount )
|
where:
| port | | is an audio port configured for output
| | samples | | is a pointer to a buffer from which you want to transfer the samples to the audio port
| | samplecount | | is the number of samples you want to write
|
 | Note: When the application is writing samples to an ALport that has channels set to 4, samplecount must be an integer multiple of the frame size, or an error will be returned and no samples will be transferred.
|
Table 6-3 shows the output conversions that are applied when writing mono, stereo, and 4-channel data to stereo mode (default) and 4-channel mode hardware configurations.
Table 6-3. Output Conversions for ALwritesamps()
| Output | Frame as Written into Port | Hardware Configuration: Indigo, and Indigo2 or Indy in Stereo Mode | Hardware Configuration: Indigo2 or Indy in 4-channel Mode |
|---|---|---|---|
| Mono Port | (L1) | (L1, L1) | (L1, L1, 0, 0) |
| Stereo Port | (L1, R1) | (L1, R1) | (L1, R1, 0, 0) |
| 4-channel Port | (L1, R1, L2, R2) | (Clip (L1 + L2), Clip (R1 + R2)) | (L1, R1, L2, R2) |
Detecting Errors in the Audio Stream
Errors in an input or output audio stream may occur if an application is unable to read samples from or write samples to a queue fast enough to satisfy the demands of the real-time hardware.
This section explains how to use two AL routines that let you identify errors and define custom error-reporting functions.
If a program cannot provide samples to an output port fast enough to keep up with the hardware, an audible break in the output may be heard. Likewise, if an application does not read input samples as fast as the hardware puts them in the queue, some samples will be lost.
The audio system detects such discontinuities in audio sample streams, and information concerning these breaks can be gathered by the application. This information can be used to dynamically tune the application execution, to increase the priority of a process, or merely to alert the user to errors.
ALgetstatus() provides access to information regarding the most recent error in the audio stream associated with a port. Its function prototype is:
int ALgetstatus ( ALport port, long *PVbuffer,
long bufferlength )
|
where:
| port | | is the audio port being queried
| | PVbuffer | | is an array of longs, the even elements of which should contain the error parameters you want to read
| | bufferlength | | is the number of elements in the PVbuffer array
|
The odd element directly following each parameter will then be written with the current values associated with each corresponding parameter.
ALgetstatus() lets you determine the number of errors associated with the stream, the type of the last error, the length of the last error, and the location of the error relative to the total lifetime of the port.
The location of the error marks the point in the port's lifetime, that is, the time since the port was opened, at which the error was detected. This value is a 48-bit number representing the number of sample frames sent through the port. The value is generated by concatenating the least significant 24 bits of the values associated with AL_ERROR_LOCATION_LSP and AL_ERROR_LOCATION_MSP.
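The concatenation can be sketched as follows. The helper name error_location() is hypothetical. Its two arguments are assumed to be the values that ALgetstatus() wrote into the PVbuffer for AL_ERROR_LOCATION_MSP and AL_ERROR_LOCATION_LSP, and only the least significant 24 bits of each are used.

```c
#include <assert.h>

/*
 * Hypothetical helper: build the 48-bit error location (in sample
 * frames) from the two 24-bit portions reported by ALgetstatus().
 *
 * msp - value returned for AL_ERROR_LOCATION_MSP
 * lsp - value returned for AL_ERROR_LOCATION_LSP
 */
unsigned long long error_location(long msp, long lsp)
{
    /* keep only the valid low 24 bits of each portion */
    unsigned long long hi = (unsigned long long)msp & 0xFFFFFFULL;
    unsigned long long lo = (unsigned long long)lsp & 0xFFFFFFULL;
    unsigned long long location = (hi << 24) | lo;

    /* the result never needs more than 48 bits */
    assert((location >> 48) == 0);
    return location;
}
```

For example, an MSP value of 0x12 and an LSP value of 0x345678 combine to the frame location 0x12345678.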
Table 6-4 lists and describes the error parameters.
Table 6-4. Error Parameters for ALgetstatus()
Error Parameter
| Description
|
|---|
AL_ERROR_LENGTH
| Current length in sample frames of the current
error. Consecutive values of this variable may
differ if the current error has not completed. Only
the least significant 24 bits of this variable are
valid.
| AL_ERROR_LOCATION_LSP
| Least significant portion (LSP) of the location of
the beginning of the current error. Only the least
significant 24 bits of this variable are valid.
| AL_ERROR_LOCATION_MSP
| Most significant portion of the location of the
beginning of the current error (in sample frames).
Only the least significant 24 bits of this variable
are valid.
| AL_ERROR_NUMBER
| Number of errors associated with the port since it
was opened.
| AL_ERROR_TYPE
| Type of error that has most recently occurred on
the port. Supported types are
AL_ERROR_INPUT_OVERFLOW and
AL_ERROR_OUTPUT_UNDERFLOW.
|
Querying and Controlling the Global Audio Device State
This section explains how to use the AL routines for querying and modifying the global audio device state. Your application should query for the availability of special audio features because different workstations have different capabilities, and because programming in this way makes it easy to update your application when new features are added.
Because the audio device is a shared resource, it is especially important to query whether other audio applications are running, so that your application does not inadvertently change a setting upon which another application relies. If no other audio applications are running, your program can use the AL routines described in this section to modify the settings of the state variables, but an application should always verify that it is the only audio application in use before changing any system-wide settings.
There is a core set of parameters that exists on every system and special parameters for capabilities such as 4-channel mode and stereo mic mode that don't exist on all configurations. To query for the availability of a noncore parameter, you have to query for both its existence and whether it supports the settings that you require. It is not necessary to query for the existence of core parameters.
Table 6-5 lists the core set of global parameters, describes their roles, and provides valid values.
Table 6-5. Core Global Parameters for AL_DEFAULT_DEVICE
Global Parameter
| Description and Valid Values
|
|---|
AL_INPUT_SOURCE
| Selects the active input source:
AL_INPUT_LINE—line-level input jack
AL_INPUT_MIC—microphone input jack
AL_INPUT_DIGITAL—serial digital input jack
| AL_LEFT_INPUT_ATTEN
| Controls the left input attenuation level for both the line-in level and the microphone level.
Range = 0-255, 0 = no attenuation, 255 = maximum attenuation.
| AL_RIGHT_INPUT_ATTEN
| Controls the right input attenuation level for both the line-in level and the microphone level.
Range = 0-255, 0 = no attenuation, 255 = maximum attenuation.
| AL_INPUT_RATE
| Indicates the sample rate at the analog (line or microphone) inputs. A positive value indicates
a specific sampling rate in Hz. The AL rounds unsupported values to the nearest supported
value.
A negative value indicates a logical value, including AL_RATE_AES_1, meaning to match the
analog sampling rate to the rate at which data is arriving at the digital input.
Note that AL_INPUT_RATE does not apply when the digital input jack is in use. The digital
input data stream has its own sample rate, which is determined strictly by the device
generating the digital data.
| AL_OUTPUT_RATE
| Indicates the sample rate at the analog and digital outputs. A positive value indicates a specific
sampling rate in Hz. The AL rounds unsupported values to the nearest supported value.
A negative value indicates a logical value, such as AL_RATE_INPUTRATE, meaning to match
the output sample rate to the rate used by the currently active input, or AL_RATE_AES_1,
meaning to match the output sample rate to the rate at which samples are arriving at the digital
input.
| AL_LEFT_SPEAKER_GAIN
| Controls the left speaker and headphone volume levels; does not affect line-level and digital
outputs. Range = 0-255, 0 = no gain, 255 = maximum gain. Zero gain does not necessarily mean
zero volume.
| AL_RIGHT_SPEAKER_GAIN
| Controls the right speaker and headphone volume levels; does not affect line-level and digital
outputs. Range = 0-255, 0 = no gain, 255 = maximum gain. Zero gain does not necessarily mean
zero volume.
| AL_INPUT_COUNT
| Read-only value that indicates the number of system-wide open input ALports.
| AL_OUTPUT_COUNT
| Read-only value that indicates the number of system-wide open output ALports.
| AL_UNUSED_COUNT
| Read-only value that indicates the number of system-wide unopened ALports.
| AL_MONITOR_CTL
| Controls monitoring. When monitoring is enabled, audio input is passed through to the
output. Input and output sample rates must be precisely matched to prevent distortion.
AL_MONITOR_ON enables monitoring; AL_MONITOR_OFF disables monitoring.
| AL_SPEAKER_MUTE_CTL
| AL_SPEAKER_MUTE_ON mutes speaker and headphones; AL_SPEAKER_MUTE_OFF
unmutes speaker and headphones. Any change to AL_LEFT_SPEAKER_GAIN or
AL_RIGHT_SPEAKER_GAIN shuts off speaker muting.
|
Table 6-6 lists and describes special parameters that are available on some systems. You should query for the existence of these parameters and whether they support the required values before using them.
Table 6-6. Special Global Parameters for System-Dependent Audio Capabilities
Global Parameter
| Description and Valid Values
|
|---|
AL_CHANNEL_MODE
| Configures the audio hardware. AL_STEREO
configures the hardware for stereo audio;
AL_4CHANNEL configures the hardware for 4-channel
audio on systems that support it.
| AL_MIC_MODE
| Selects the microphone mode. AL_MONO selects the
mono microphone; AL_STEREO selects stereo mic
input on systems that support it.
| AL_LEFT2_INPUT_ATTEN
| Controls the attenuation for the L2 line-level or mic-level
input.
| AL_RIGHT2_INPUT_ATTEN
| Controls the attenuation for the R2 line-level or mic-level
input.
| AL_LEFT_MONITOR_ATTEN
| Controls the attenuation for the left half of the
monitor signal. Range = 0-255, 0 = no attenuation,
255 = maximum attenuation.
| AL_RIGHT_MONITOR_ATTEN
| Controls the attenuation for the right half of the
monitor signal. Range = 0-255, 0 = no attenuation,
255 = maximum attenuation.
| AL_DIGITAL_INPUT_RATE
| Read-only value; sample rate at which data is
arriving at the digital input. The rate is that signified
by the nonaudio bits of the incoming digital signal; it
is not actually measured. A positive value indicates a
specific sampling rate in Hz.
A negative value indicates a logical value, including
AL_RATE_UNDEFINED, meaning that the audio
system could not determine the digital input data
rate, or the device generating the digital data has
marked the data as having an indeterminate rate.
Note that the digital input data stream contains its
own clock signal; thus, its notion of a given rate will
differ slightly from an internally generated version of
the same rate.
|
Techniques for Working with Global Parameters
The AL routines for working with parameters are:
All of these routines expect a device argument of type long, representing the particular audio device whose state you want to know or change. The only currently supported device is AL_DEFAULT_DEVICE.
Several of these routines expect parameter-value buffer (PVbuffer) arguments. A PVbuffer is simply an array of long integers, where the integers are logically organized as pairs of elements. The first element of each pair is a parameter constant defined in the include file audio.h. The second element of each pair stores a value associated with the parameter. The second location can be used to pass a value for a parameter into a routine or to return a value for a given parameter from a routine.
 | Tip: You don't have to pass an array containing all of the possible parameters; create an array that contains only the values of interest.
|
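The pair layout of a PVbuffer can be illustrated with a small lookup routine. This is a sketch only: pv_find() is a hypothetical helper, not an AL routine, and it works on any array of longs that follows the parameter/value convention described above.

```c
#include <assert.h>

/*
 * Hypothetical helper: search a parameter-value buffer for param.
 * Even entries (0, 2, 4, ...) hold parameter constants; the odd entry
 * that follows each one holds the associated value.
 *
 * Returns 1 and stores the value in *value if param is present,
 * otherwise returns 0 and leaves *value unchanged.
 */
int pv_find(const long *pvbuf, long bufferlength, long param, long *value)
{
    long i;

    assert(pvbuf != 0 && value != 0);
    assert(bufferlength >= 0 && bufferlength % 2 == 0);

    for (i = 0; i + 1 < bufferlength; i += 2) {
        if (pvbuf[i] == param) {
            *value = pvbuf[i + 1];
            return 1;
        }
    }
    return 0;
}
```

In a real program the even entries would be constants from audio.h, such as AL_OUTPUT_RATE, and the buffer would typically have been filled by ALgetparams().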
Some methods for using these routines are:
If you need a complete list of all available parameters, call ALqueryparams(). To be certain that you have a large enough buffer to contain the parameter-value pairs, you can pass a zero in place of the buffer, then call malloc() to allocate a buffer the size of the returned value.
If you are interested only in certain values, create an array that is twice the size of the number of parameters you are querying, and fill the even locations with the parameters of interest, then:
call ALgetparams() to determine the current settings of the state variables.
fill in the odd entries with the values that you want to change, and then call ALsetparams() to change the values.
Some parameters might exist but might not allow the needed settings, so call ALgetminmax() to get the parameter bounds and check to be sure that the values you want to use exist.
Getting a List of Available Parameters
ALqueryparams() asks the audio device to supply a list of descriptors and corresponding descriptions for all the currently available global state variables. Its function prototype is:
long ALqueryparams ( long device, long *PVbuffer,
long bufferlength )
|
where:
| device | | is the audio device (AL_DEFAULT_DEVICE)
| | PVbuffer | | is an array of longs, into which ALqueryparams() writes a descriptor/description pair for each state variable associated with device. The even (0, 2, 4, …) entries receive the descriptors. The odd entries (1, 3, 5, …) receive one of two description values (negative values indicate read-only parameters):
| | bufferlength | | is the number of elements in the PVbuffer array
|
ALqueryparams() returns a long value representing the buffer size necessary to hold all parameters and their values. If your PVbuffer is of smaller dimensions than this value, you have not received a complete set of descriptor/description pairs for device. See Table 6-5 for a list of currently supported core global parameters. See Table 6-6 for a list of special global parameters that are not supported on all systems.
ALsetparams() lets you modify the values of many of these global parameters, though you should take care in using this function. See the description of ALsetparams() at the end of this section for details.
Getting the Bounds of Global Parameters
ALgetminmax() obtains maximum and minimum values for a given global parameter. Its function prototype is:
int ALgetminmax( long device, long param, long *minparam,
long *maxparam )
|
where:
| device
| | is the audio device (AL_DEFAULT_DEVICE)
| | param | | is the parameter whose range you want to know
| | minparam
| | is a pointer to a variable into which the minimum value will be written
| | maxparam | | is a pointer to a variable into which the maximum value will be written
|
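Once the bounds are known, checking a requested setting is a simple comparison. The sketch below assumes that minparam and maxparam hold the values written by a successful ALgetminmax() call. The name value_in_bounds() is hypothetical. Lying within the bounds does not by itself guarantee that every intermediate value is supported.

```c
#include <assert.h>

/*
 * Hypothetical helper: report whether a requested setting lies within
 * the bounds obtained from ALgetminmax().
 * Returns 1 if minparam <= wanted <= maxparam, otherwise 0.
 */
int value_in_bounds(long wanted, long minparam, long maxparam)
{
    assert(minparam <= maxparam);

    return wanted >= minparam && wanted <= maxparam;
}
```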
Getting the Defaults of Global Parameters
ALgetdefault() returns the default value for a given audio hardware device state parameter. Its function prototype is:
long ALgetdefault ( long device, long parameter )
|
where:
| device | | is the audio device (AL_DEFAULT_DEVICE)
| | parameter | | is the parameter whose default value you want to obtain
|
Getting the Names Corresponding to the Global Parameters
ALgetname() returns a pointer to a null-terminated string that can be used to label an audio hardware device state parameter. Treat this string as a read-only string. Its function prototype is:
char* ALgetname ( long device, long parameter )
|
where:
| device | | is the audio device (AL_DEFAULT_DEVICE)
| | parameter | | is the parameter whose name you want to know
|
Table 6-7 lists the global parameter name strings.
Table 6-7. Global Parameter Name Strings
Global Parameter
| Name String
|
|---|
AL_INPUT_SOURCE
| "Line/MIC/AES"
| AL_LEFT_INPUT_ATTEN
| "Left Input Atten"
| AL_RIGHT_INPUT_ATTEN
| "Right Input Atten"
| AL_INPUT_RATE
| "Input Rate"
| AL_OUTPUT_RATE
| "Output Rate"
| AL_LEFT_SPEAKER_GAIN
| "Left Output Gain"
| AL_RIGHT_SPEAKER_GAIN
| "Right Output Gain"
| AL_INPUT_COUNT
| "Input Count"
| AL_OUTPUT_COUNT
| "Output Count"
| AL_UNUSED_COUNT
| "Unused Count"
| AL_MONITOR_CTL
| "Monitor Control"
| AL_LEFT_MONITOR_ATTEN
| "Left Monitor Atten"
| AL_RIGHT_MONITOR_ATTEN
| "Right Monitor Atten"
| AL_SPEAKER_MUTE_CTL
| "Speaker Mute Control"
| AL_MIC_MODE
| "Microphone Mode"
| AL_CHANNEL_MODE
| "System Channel Mode"
| AL_DIGITAL_INPUT_RATE
| "Digital Input Rate"
|
Getting Current Parameter Settings
ALgetparams() gets the current value(s) of the device parameters referenced in the PVbuffer. Its function prototype is:
int ALgetparams ( long device, long *PVbuffer,
long bufferlength )
|
where:
| device | | is the audio device (AL_DEFAULT_DEVICE)
| | PVbuffer | | is an array of pairs of longs, the even (0, 2, 4, …) entries of which should contain the global parameters whose values you want to obtain
| | bufferlength | | is the number of elements in the PVbuffer array
|
ALgetparams() fills the odd (1, 3, 5, …) entries in the PVbuffer array with the current values associated with each corresponding parameter.
See Table 6-5 for a description of the currently supported core global parameters. See Table 6-6 for a list of special global parameters that are not supported on all systems.
Modifying the Values of the Global Parameters
ALsetparams() sets the current value(s) of one or more audio hardware device parameters. Its function prototype is:
int ALsetparams ( long device, long *PVbuffer,
long bufferlength )
|
where:
| device | | is the audio device (AL_DEFAULT_DEVICE)
| | PVbuffer | | is an array of pairs of longs, the even (0, 2, 4, …) entries of which should contain the global parameters whose values you want to change to the corresponding values listed in the odd (1, 3, 5, …) entries.
| | bufferlength | | is the number of elements in the PVbuffer array
|
See Table 6-5 for a description of the currently supported core global parameters. See Table 6-6 for a list of special global parameters that are not supported on all systems.
When an application program modifies a global state parameter such as the output sample rate, it may affect other processes on the system that are also engaged in audio processing. For example, if one application is playing a 44.1 kHz recording through an output port, and a second application changes the global output sample rate from 44.1 kHz to 16 kHz, the output of the original application will be distorted.
Sample Code for Querying Features and Values
This section provides sample code fragments that demonstrate the proper methods to use when querying for certain attributes.
Determining Whether Other Audio Applications Are Running
To determine whether other audio applications are running, query the system for open input or output ports. To determine the total number of ports available on your system, add the values returned for AL_INPUT_COUNT, AL_OUTPUT_COUNT, and AL_UNUSED_COUNT.
Example 6-3 demonstrates querying for other active audio output.
Example 6-3. Querying for the Existence of Other Audio Processes
/*
* 'Nonrude' behavior is defined as follows: before modifying global values, first check
* to see whether any other output ports are currently active; if any other processes have
* open output ports, don't modify anything.
*/
...
rude = 0;
...
/*
* Need to determine whether audio is in use. If not, then we
* can just go ahead and be "rude."
*/
pvbuf[0] = AL_OUTPUT_COUNT;
pvbuf[2] = AL_MONITOR_CTL;
if (ALgetparams(AL_DEFAULT_DEVICE, pvbuf, 4) < 0) {
    if (oserror() == AL_BAD_DEVICE_ACCESS) {
        /* no program name is available in this fragment, so none is printed */
        fprintf(stderr, "Can't play -- could not access audio hardware.\n");
        return -1;
    }
}
if ((pvbuf[1] == 0) && (pvbuf[3] == AL_MONITOR_OFF)) {
    rude = 1;
}
|
Determining the Input and Output Rates
Querying the system for an input or output rate must be done carefully in order to obtain a valid result. Example 6-4 contains two routines, get_input_rate() and get_output_rate(), each of which returns either a rate in Hz or AL_RATE_UNDEFINED if the rate cannot be determined. A minimal main() program calls the routines. See ratequery.c in /usr/people/4Dgifts/examples/dmedia/audio for another example of rate querying.
Example 6-4. Querying for Input and Output Rates
#include <audio.h>
...
/*
* These routines expect to be run with the AL error handler shut off.
* (call ALseterrorhandler(0)).
*/
...
int
get_input_rate()
{
    long buf[6];
...
    buf[0] = AL_INPUT_RATE;
    buf[2] = AL_INPUT_SOURCE;
    buf[4] = AL_DIGITAL_INPUT_RATE;
    ALgetparams(AL_DEFAULT_DEVICE,buf,6);
...
    if (buf[1] == AL_RATE_AES_1 || buf[3] == AL_INPUT_DIGITAL) {
        /*
         * We are clocked off of the digital input. Find the
         * real input rate, if we can.
         */
        if (ALgetdefault(AL_DEFAULT_DEVICE,AL_DIGITAL_INPUT_RATE) >= 0) {
            return buf[5];
        }
    }
    else if (buf[1] > 0) {
        /*
         * Input rate is in Hz and we're using an analog input -- return rate.
         */
        return buf[1];
    }
    return AL_RATE_UNDEFINED;
}
...
int
get_output_rate()
{
    long buf[4];
    buf[0] = AL_OUTPUT_RATE;
    buf[2] = AL_DIGITAL_INPUT_RATE;
    ALgetparams(AL_DEFAULT_DEVICE,buf,4);
    if (buf[1] > 0) {
        /*
         * Output rate is in Hz -- return it.
         */
        return buf[1];
    }
    else {
        /*
         * Output rate is a logical rate -- track down what it means.
         */
        if (buf[1] == AL_RATE_AES_1) {
            /*
             * We are clocked off of the digital input. Find the
             * real input rate, if we can. If we can't, return AL_RATE_UNDEFINED
             */
            if (ALgetdefault(AL_DEFAULT_DEVICE,AL_DIGITAL_INPUT_RATE) >= 0) {
                return buf[3];
            }
        }
        else if (buf[1] == AL_RATE_INPUTRATE) {
            return get_input_rate();
        }
        return AL_RATE_UNDEFINED;
    }
}
...
main()
{
    int x;
    ALseterrorhandler(0);
    x = get_output_rate();
    if (x == AL_RATE_UNDEFINED) {
        printf("can't get output rate\n");
    }
    else {
        printf("output rate = %d\n",x);
    }
    x = get_input_rate();
    if (x == AL_RATE_UNDEFINED) {
        printf("can't get input rate\n");
    }
    else {
        printf("input rate = %d\n",x);
    }
}
|
Determining Whether 4-channel Capability Exists
Although you can open a 4-channel ALport on any system, you cannot change the system's electrical configurations if it does not support 4-channel mode.
To determine whether a system has 4-channel capability, use ALgetminmax(), then verify that the maximum value is 4.
Example 6-5 demonstrates how to query for 4-channel hardware capability.
Example 6-5. Querying for 4-channel Capability
/*
* Query to see if we are on a machine with 4-channel
* HW capability. If so,switch into 4-channel mode.
* If AL_CHANNEL_MODE both exists (ALgetminmax doesn't
* fail) AND has a maximum of 4,then we're OK.
*
* If we wanted to be really nice, we could check,
* by querying AL_INPUT_COUNT and AL_OUTPUT_COUNT, to
* see if any other apps were doing audio. If so, we
* might not want to switch to 4-channel mode, lest
* we introduce artifacts into their audio streams.
*/
if (ALgetminmax(AL_DEFAULT_DEVICE, AL_CHANNEL_MODE,
                &min, &max) >= 0 && max == 4) {
    long buf[2];
    buf[0] = AL_CHANNEL_MODE;
    buf[1] = 4;
    ALsetparams(AL_DEFAULT_DEVICE, buf, 2);
}
/*
* Even if we don't have 4-channel HW capability,
* the AL will let us use a 4-channel buffer, so
* we can continue at this point without regard to
* HW type.
*/
|
Audio Library Synchronization Facilities
The AL provides two different facilities for synchronization:
The AL allows multiple audio ports (ALports) to be synchronized in a sample-accurate manner by using the absolute sample frame count.
The AL allows audio data to be related to other media on a common time line by using the unadjusted system time (UST).
The AL provides a method of determining the absolute sample count of the current sample frame under program control (that is, the sample frame which can be read/written with a call to the Audio Library) and a method of relating UST values to the count of samples which have entered or exited the audio device.
As mentioned in Chapter 2, “Programming with the Digital Media Library,” the digital media libraries provide a single time line, UST, through which media may be related. This time value is the number of nanoseconds since the operating system was started. As an absolute time value, UST is not particularly useful. However, it is extremely useful for relating different media types and for evaluating the relative timing of events.
Absolute sample frame count is the basis for timing within the AL. Whenever audio is input or output on a device, a count is kept of the sample frames elapsed. This sample frame count is the absolute number of sampling periods elapsed since input or output started. If the audio sample rate is set to 44.1 kHz, the sample frame count advances at the nominal rate of 44,100 counts per second, regardless of the channel setting for the port (see ALsetchannels() for more details on setting the number of channels for a port).
The sample frame count increases regardless of whether an application is reading or writing audio samples using the ALreadsamps() or ALwritesamps() function calls, respectively. As long as an audio port (ALport) is open, the sample frame count advances.
The AL function ALgetframenumber() provides a way for an application to query the absolute sample frame count associated with the current sample frame to be written (in the case of an output port) or read (in the case of an input port).
The function prototype for ALgetframenumber() is:
int ALgetframenumber(const ALport port,
unsigned long long *framenum);
|
where:
| port | | is the audio port of interest
| | framenum | | is a pointer to a 64-bit number in which to hold the resultant frame count value
|
If ALgetframenumber() succeeds, 0 is returned; otherwise a -1 is returned.
Since the sample frame count is an absolute value of sample frames entering or exiting an audio device, two audio ports (ALports) can be synchronized by reading/writing samples at the identical sample frame count. This “port-to-port” synchronization is guaranteed to be sample accurate.
In general, ALgetframenumber() does not return equal values for the sample frame count for different ports. In order to synchronize two audio ports, you will need to make the sample frame count of the two ports match by reading/writing samples from/to one of the sample queues. Example 6-6 demonstrates synchronizing two audio ports.
 | Note: The absolute sample frame count is valid only if the port in question does not overflow (in the case of input) or underflow (in the case of output). When your port underflows or overflows, the value of the sample frame count continuously changes, and you cannot reliably place samples in the queue at a desired location. In order to reestablish a valid value for sample frame count (and hence synchronization) your application must recover from the underflow or overflow (read or write samples as appropriate) and then query for the value of sample frame count again.
|
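The amount of data needed to bring a lagging port into line follows directly from the difference between the two frame counts. The sketch below assumes that both counts were obtained with ALgetframenumber() while neither port was in an underflow or overflow state. The name catchup_samples() is hypothetical. Because the counts are in sample frames, the difference is multiplied by the number of channels to obtain a sample count suitable for ALwritesamps() or ALreadsamps().

```c
#include <assert.h>

/*
 * Hypothetical helper: number of samples to write to (or read from)
 * the port whose frame count is behind, so that both ports reach the
 * same sample frame count.
 *
 * count_ahead  - frame count of the port that is further along
 * count_behind - frame count of the port that must catch up
 * channels     - samples per frame on the lagging port
 */
long catchup_samples(unsigned long long count_ahead,
                     unsigned long long count_behind, long channels)
{
    assert(count_ahead >= count_behind && channels > 0);

    /* the frame difference is assumed to be small enough to fit in a long */
    return (long)(count_ahead - count_behind) * channels;
}
```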
Figure 6-3 shows the relationship of the sample frame count returned by ALgetframenumber() to sample frames in the queue associated with an input or output audio port (ALport).
In Example 6-6, the first two ALwritesamps() calls are used to bring the audio ports out of an underflow condition. This ensures that subsequent calls to ALgetframenumber() will result in valid sample frame counts.
Example 6-6. Synchronizing Audio Between Two Output Ports: align.c
/* align.c - synchronize audio of two output audio ports */
#include <stdio.h>
#include <stdlib.h>
#include <dmedia/audio.h>

int main(void)
{
    ALport port_1, port_2;
    short buf_1[10000], buf_2[10000];
    short zilch[10000];
    unsigned long long count_1, count_2, delta_count;
    int i;

    /* get two output ports with default configurations */
    port_1 = ALopenport("port_1", "w", NULL);
    port_2 = ALopenport("port_2", "w", NULL);
    if (port_1 == NULL || port_2 == NULL) {
        printf("oops...no audio ports\n");
        exit(-1);
    }

    /* set up the output sample buffers */
    for (i = 0; i < 10000; i++) {
        buf_1[i] = i;
        buf_2[i] = -i;
        zilch[i] = 0;
    }

    /* bring the output ports out of underflow state */
    ALwritesamps(port_1, zilch, 10000);
    ALwritesamps(port_2, zilch, 5000);
    ALgetframenumber(port_1, &count_1);
    ALgetframenumber(port_2, &count_2);

    /* count_1 should be > count_2 at this point */
    delta_count = count_1 - count_2;
    printf("frame count difference = %llu\n", delta_count);

    /* write delta_count frames of zeroes to port_2
       (2 samples per frame) */
    ALwritesamps(port_2, zilch, delta_count*2);
    ALgetframenumber(port_1, &count_1);
    ALgetframenumber(port_2, &count_2);
    delta_count = count_1 - count_2;
    printf("frame count difference = %llu\n", delta_count);

    /* keep both ports fed; runs until the process is killed */
    while (1) {
        ALwritesamps(port_1, buf_1, 10000);
        ALwritesamps(port_2, buf_2, 10000);
        ALgetframenumber(port_1, &count_1);
        ALgetframenumber(port_2, &count_2);
        if (count_1 != count_2) {
            printf("lost synchronization of audio port.\n");
        }
    }

    /* not reached */
    ALcloseport(port_1);
    ALcloseport(port_2);
    return 0;
}
Relating Audio Sample Frame Count to UST
The IRIS digital media libraries provide a time line called unadjusted system time (UST) for relating media to one another. The UST is a 64-bit count of the number of nanoseconds elapsed since the workstation operating system was started.
The AL provides a way for application programs to relate the number of audio sample frames input to or output from a device to UST values, by providing a pair of values (UST, sample frame count) simultaneously. The UST value is the time when the samples in the frame entered the audio device (in the case of input) or exited the audio device (in the case of output). That is, the UST is the time at which the samples physically “hit the jacks.” The audio system software accounts for any latency in hardware and intermediate buffering.
The AL function ALgetframetime() provides both UST and sample frame count for an audio port (ALport) to an application. The function prototype for ALgetframetime() is:
int ALgetframetime(const ALport port,
                   unsigned long long *fnum,
                   unsigned long long *ustime);
where:
| port | is the audio port of interest |
| fnum | is a pointer to a 64-bit number to hold the value of sample frame count |
| ustime | is a pointer to a 64-bit number to hold the value of UST |
If ALgetframetime() succeeds, it returns 0 to the application; otherwise, it returns a -1 and sets an error number which can be retrieved with oserror(3C).
When an application program calls the ALgetframetime() function, the AL provides the most recent pair of (UST, sample frame count) that it has calculated. In general, the value of sample frame count returned by ALgetframetime() is not the same as the sample frame count value returned by ALgetframenumber(). However, a UST value corresponding to the sample frame count returned by ALgetframenumber() can be calculated from (UST, sample frame count) pairs.
Example 6-7 demonstrates calculating the UST value for the next sample to be read from an input port.
Example 6-7. Calculating UST
/* getust.c - get ustime for first sample in input port */
#include <stdio.h>
#include <stdlib.h>
#include <audio.h>

int main(void)
{
    ALport port;
    long long count_1, count_2, ustime_1, ustime_2;
    double nrate;

    nrate = 1e+9/44100.0; /* nanosecs per sample frame @ 44.1 kHz */
    port = ALopenport("my_input", "r", NULL);
    if (port == NULL) exit(-1);

    /* count_2: frame number of the next sample frame to be read */
    ALgetframenumber(port, (unsigned long long*)&count_2);
    /* (ustime_1, count_1): most recent pair calculated by the AL */
    ALgetframetime(port, (unsigned long long*)&count_1,
                   (unsigned long long*)&ustime_1);

    /* extrapolate from the pair to count_2 at the nominal rate */
    ustime_2 = ustime_1 - (count_1 - count_2)*nrate;

    /* ustime_2 corresponds to the first sample frame in port */
    printf("ust(1) = %lld msc(1) = %lld\n",
           ustime_1, count_1);
    printf("ust(2) = %lld msc(2) = %lld\n",
           ustime_2, count_2);
    ALcloseport(port);
    return 0;
}
This example code could have calculated the sample frame rate from multiple (UST, sample frame count) pairs and used that value instead of calculating it from the nominal audio frame rate.
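A minimal sketch of that alternative follows. The two helpers are hypothetical and not part of the AL; they assume that each (UST, sample frame count) pair was obtained from a separate ALgetframetime() call and that UST values are in nanoseconds. Because two calls may return the same pair, the rate helper returns 0.0 when no rate can be derived.

```c
/*
 * Hypothetical helpers (not AL functions). Each (ust, frame) pair is
 * assumed to come from a separate ALgetframetime() call, with ust in
 * nanoseconds.
 */

/* Measured nanoseconds per sample frame, or 0.0 if the two pairs
   have the same frame count and no rate can be derived. */
double ns_per_frame(long long ust_a, long long frame_a,
                    long long ust_b, long long frame_b)
{
    if (frame_b == frame_a)
        return 0.0;
    return (double)(ust_b - ust_a) / (double)(frame_b - frame_a);
}

/* UST for target_frame, extrapolated from one reference pair at the
   given rate (nanoseconds per sample frame). */
long long ust_for_frame(long long ref_ust, long long ref_frame,
                        long long target_frame, double rate_ns)
{
    double offset = (double)(target_frame - ref_frame) * rate_ns;
    return ref_ust + (long long)offset;
}
```

For instance, pairs taken 50000 frames and one second apart give a rate of 20000 nanoseconds per frame, and a frame 1000 frames before the reference pair maps to a UST 20 milliseconds earlier. A caller that receives 0.0 could fall back to the nominal rate.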
Note: The sample frame value returned by ALgetframenumber() is valid only if the port does not overflow or underflow. In the case of underflow or overflow, the (UST, sample frame count) pair returned by ALgetframetime() will continue to be valid (though you may wish to request a new, more recent, pair). Note, however, that two back-to-back invocations of ALgetframetime() are not guaranteed to result in unique (UST, sample frame count) pairs.
For a more involved use of UST and sample frame count, see recordmidi.c++ in /usr/people/4Dgifts/examples/dmedia/midi/syncrecord. This code demonstrates synchronization of audio and MIDI using the UST to relate the two streams of data and is discussed further in “Hands-On MIDI and Audio Synchronization Experience” in Chapter 10.
Real-time Programming Techniques for Audio
The Audio Library provides several routines that modify or return information about the dynamic state of an audio port. These routines, together with the select() or poll() IRIX system calls, make it possible to write applications that can multiplex audio processing tasks with other processing such as user interfaces, audio signal processing, or graphics. Other IRIX system calls, such as prctl(), schedctl(), and sproc(), also help audio applications to achieve efficient real-time performance. This section discusses these routines and techniques for using them effectively. See the online book, Topics in IRIX Programming, for a description of the IRIX real-time programming facilities.
Multiplexing Synchronous I/O
The select() system call makes it possible for an application to multiplex synchronous I/O tasks. An application passes select() three (optionally null) lists of file descriptors, along with an optional timeout parameter. select() blocks until one or more of the following conditions occur:
one or more of the file descriptors in the “read list” are ready for reading
one or more of the file descriptors in the “write list” are ready for writing
an exceptional condition is pending for one of the file descriptors in the “exception list”
a timeout occurs (if specified)
When select() returns, it replaces the original file descriptor lists with subsets containing the file descriptors for which requested events have occurred. See the select(2) man page for details.
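The calling pattern described above can be sketched with a single descriptor in the read list and a timeout. This is an illustration only: the helper name `wait_readable` is hypothetical, it assumes a POSIX system that provides <sys/select.h>, and it works on any file descriptor rather than on an audio port specifically.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: block until fd is ready for reading or until
 * timeout_ms milliseconds have elapsed.
 * Returns 1 if fd is readable, 0 on timeout, -1 on error.
 */
int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int n;

    /* build a read list containing only fd */
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* write list and exception list are null */
    n = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;

    /* select() has replaced readfds with the ready subset */
    return (n > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```

Because select() overwrites the descriptor lists, the helper rebuilds the read list on every call instead of reusing it.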
The AL provides a mechanism to control the behavior of select() such that you can wake a process before an output queue runs out of samples or before an input sample queue overflows. The functions described in this section control the behavior of select().
Getting a File Descriptor for an ALport
ALgetfd() returns an IRIX file descriptor for a port that may be used with the select() call. Its function prototype is:
int ALgetfd ( ALport port )
where:
| port | is the audio port whose file descriptor you want. This descriptor can then be used to construct the arguments for a call to select() or poll(). |
When using select(), an input port's file descriptor is used in a read fdset and an output port's file descriptor is used in a write fdset.
When using poll(), an input port's file descriptor is used with the POLLIN event flag and an output port's file descriptor is used with the POLLOUT event flag.
An application uses select() or poll() to give up control of the CPU until the audio port is emptied or filled past a previously set fill point (see the description of ALsetfillpoint() below).
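The mapping from port direction to poll() event flag can be sketched as follows. The helper `wait_for_port_fd` and its `is_input` parameter are hypothetical; the descriptor is assumed to be one returned by ALgetfd(), although the code itself uses only the standard poll() interface and accepts any file descriptor.

```c
#include <poll.h>

/*
 * Hypothetical helper: wait on one descriptor with poll().
 * is_input selects POLLIN (input port) or POLLOUT (output port).
 * Returns 1 if the requested event occurred, 0 on timeout,
 * -1 on error.
 */
int wait_for_port_fd(int fd, int is_input, int timeout_ms)
{
    struct pollfd pfd;
    int n;

    pfd.fd = fd;
    pfd.events = is_input ? POLLIN : POLLOUT;
    pfd.revents = 0;

    n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;

    /* report only the event that was asked for */
    return (n > 0 && (pfd.revents & pfd.events)) ? 1 : 0;
}
```

An application with several ports would instead fill an array of pollfd structures, one per port, and make a single poll() call.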
Setting and Getting the Fill Point for a Queue
ALsetfillpoint() allows an application to set a threshold level for an input or output port that controls the behavior of the select() function. Its function prototype is:
int ALsetfillpoint ( ALport port, long fillpoint )
where:
| port | is the audio port whose fill point you want to set |
| fillpoint | is the fill point value, in number of samples |
For an input port, the fill point is the number of locations in the sample queue that must be filled in order to trigger a return from select(). For an output port, the fill point is the number of locations that must be free in order to wake up from select().
When used in conjunction with ALgetfd() and select() or poll(), ALsetfillpoint() lets you programmatically relinquish control from an audio application to other processes.
Note: ALreadsamps() and ALwritesamps() may alter the fill point, so you should (re)set it just before you call select() or poll().
ALgetfillpoint() returns the current fill point of a port. Its function prototype is:
long ALgetfillpoint ( ALport port )
where:
| port | is the audio port being queried |
Figure 6-4 shows how the relationship between the number of samples and the fill point affects the behavior of the select() call during input and output.
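The fill point rule for the two directions can also be written as a small predicate. This is a model for illustration, not an AL function: the name `port_ready` and its parameters are hypothetical, and it assumes that "must be filled" and "must be free" mean "at least the fill point".

```c
/*
 * Hypothetical model of the fill point rule (not an AL function).
 * filled     - samples currently in the port's queue
 * queue_size - total capacity of the queue, in samples
 * fillpoint  - value passed to ALsetfillpoint(), in samples
 * Returns 1 if select() would report the port as ready, else 0.
 * Assumes "at least fillpoint" semantics.
 */
int port_ready(int is_input, long filled, long queue_size,
               long fillpoint)
{
    if (is_input) {
        /* input: enough samples are waiting to be read */
        return filled >= fillpoint;
    }
    /* output: enough locations are free to be written */
    return (queue_size - filled) >= fillpoint;
}
```

With a 100000-sample queue and a fill point of 5000, an input port becomes ready once it holds 5000 samples, and an output port becomes ready once it has drained to 95000 samples or fewer.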
Using Scheduling Control to Give Audio High Priority
IRIX provides control of process scheduling through the schedctl() function, which allows a program to change its execution priority. To maintain real-time audio processing, an application may need to be placed at a high priority relative to other jobs in the system. See the schedctl(2) man page for more information on usage. See “Using Shared Arenas and Semaphores” for an example program that demonstrates how to use schedctl().
Preventing Memory Swapout
prctl() is an IRIX function that gives you control of certain attributes of a process. By using the PR_RESIDENT argument, you can make your audio process immune to kernel memory swapout, thus helping to ensure uninterrupted audio input and output. See the prctl(2) man page for more details.
You can also use mpin() or plock() to lock user pages into memory. See the man pages for those functions for more information.
Creating Multiple Process Threads
The sproc() system call lets you split a process into two threads. sproc() is an IRIX system call similar to fork(), except that it makes use of shared memory. The shared memory features of sproc() allow sharing of data, file descriptors, and address space between the two process threads. When using sproc() in an application with audio, you can create one thread that services audio and another thread that handles the user interface. Using sproc() permits the use of graphical user interfaces without interrupting the audio data stream. See “Using Shared Arenas and Semaphores” for an example program that demonstrates how to use sproc() in conjunction with an IRIS IM menu (IRIS IM is Silicon Graphics' port of the industry-standard OSF/Motif).
Using Shared Arenas and Semaphores
Another real-time programming technique is to use an IRIX shared arena. In essence, a shared arena is a memory-mapped file that you can access just like regular memory.
This section provides some hints for working with shared arenas; more information is available in Topics in IRIX Programming.
Shared arenas allow:
shared memory between unrelated processes
shared synchronization tools: locks for controlling access, and semaphores for process communication
Create a shared arena by calling usinit(). (The “us” prefix stands for user space.) The first process that calls usinit() creates an arena with the given file name; subsequent calls to usinit() invoking the same file name attach to the existing arena.
Using shared memory can create data dependency situations, such as different processes writing to the same memory location at the same time, or one process trying to read from a memory location before another has finished writing to that location. Areas where a potential data dependency exists are called critical regions.
Critical regions can be protected with locks, which keep trying until access is gained, or semaphores, which sleep until access is available. Semaphores can be used to allow multiple processes into a critical region at the same time. Processes waiting on a semaphore are queued on a first-come, first-served basis. To acquire (decrement) a semaphore, call uspsema(); to release (increment) it, call usvsema(). When uspsema() causes the semaphore count to go negative, the process blocks until some other process calls usvsema().
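The counting rule can be illustrated with a toy model. This is not the IRIX semaphore API and it performs no real blocking or locking; the type `sema_model` and the functions `model_p` and `model_v` are hypothetical names that only track the count and report what would happen to the caller.

```c
/*
 * Toy model of the counting rule (not the IRIX API, and not safe
 * for real concurrent use: it only tracks the count).
 */
typedef struct {
    int count;   /* > 0: free slots; < 0: processes that would wait */
} sema_model;

/* Acquire: decrement; returns 1 if the caller would block. */
int model_p(sema_model *s)
{
    s->count--;
    return s->count < 0;
}

/* Release: increment; returns 1 if a waiting process would wake. */
int model_v(sema_model *s)
{
    s->count++;
    return s->count <= 0;
}
```

Starting the count at 2 lets two callers into the critical region. The third acquire drives the count to -1 and would block until a release brings it back to 0.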
The motifexample.c program in /usr/people/4Dgifts/examples/dmedia/audio demonstrates the Audio Library programming concepts presented in this chapter and some Audio File programming concepts that are discussed in Chapter 7, “Programming with the Audio File Library.”
Several real-time programming techniques are used in motifexample.c:
The sproc() system call creates two separate threads: a user interface thread, and an audio thread. The PR_SALL argument specifies the sharing of all data. Everything that pertains to handling audio is kept in the separate audio process.
Scheduling control gives the audio process high-priority, nondegrading scheduling.
Memory swapout is prevented by using mpin() to lock samples in memory.
A shared memory arena is used to share data.
Semaphores provide interprocess communication for handling commands from the application.
Polling is used to monitor two kinds of events: commands from the application and the need for more samples in the queue.