Audio Renderer API

Matthew Leditschke
(Matthew.Leditschke@cselt.it)

Aim

This document presents a simple AudioRenderer API which will be available on a number of platforms.

In contrast to the AudioRenderer used in the IM1 player shown at Stockholm, this AudioRenderer runs in its own thread, asking audio nodes for data when required. This design should address some of the problems with the previous AudioRenderer, such as repeated sounds when the processor load increases, and should also enable smaller buffers to be used.

This API does not yet cover all of the required functionality for playing audio in an MPEG-4 terminal. It is hoped that this basic API can be successfully extended to handle the required functionality.

Overview

AudioSourceProxy nodes register themselves with the AudioRenderer by calling its AddSourceNode() method. It is anticipated that this will happen when the node's PreRender() method is first called. This call also tells the AudioRenderer when to start playing the audio, and gives the format of the audio (sample rate, bits per sample and the number of channels).

Currently an audio sample consists of one mono sample for each of the channels being played, interleaved one after the other. Each mono sample is assumed to be an unsigned char for 8 bits/sample, and a signed short integer for 16 bits/sample (using the machine's native byte order). [This may need to be extended to handle different byte orderings.]
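
As an illustration, the sketch below walks through an interleaved buffer of 16 bits/sample audio. The function and variable names are illustrative only and are not part of the API:

    // Illustrative only: walking an interleaved 16 bits/sample buffer.
    // One audio sample holds one mono sample per channel, so it occupies
    // nNumChannels * sizeof(short) bytes.
    typedef unsigned long  DWORD;
    typedef unsigned short WORD;

    void WalkInterleavedBuffer(const short* pData, DWORD nAudioSamples,
                               WORD nNumChannels)
    {
        for (DWORD i = 0; i < nAudioSamples; i++)
        {
            const short* pSample = pData + i * nNumChannels;
            for (WORD ch = 0; ch < nNumChannels; ch++)
            {
                short monoSample = pSample[ch];  // channel ch of sample i
                (void)monoSample;                // ... mix or copy here ...
            }
        }
    }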

The AudioRenderer runs in a loop, updating the contents of its output buffers every N milliseconds. To get more data from a particular node, it calls the node's GetData() method. It then calls DataUsed() to signal that the data returned by GetData() has been finished with.
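
The sketch below shows one possible shape for this loop. Apart from GetData() and DataUsed() (called inside FillFromNode(), which is sketched under GetData() later in this document), all names here are assumptions rather than part of the API:

    // Sketch of the renderer thread's main loop.  The helper functions
    // declared here are assumptions, not part of the API.
    #include <vector>

    typedef unsigned long DWORD;
    typedef unsigned char BYTE;
    typedef BYTE* LPBYTE;

    class AudioSourceProxy;

    void FillFromNode(AudioSourceProxy* pNode, LPBYTE pOut, DWORD nBytes);
    void MixIntoOutputBuffer(const BYTE* pData, DWORD nBytes);
    void WaitMilliseconds(DWORD nMs);

    void RendererThreadLoop(std::vector<AudioSourceProxy*>& rNodes,
                            DWORD nBytesPerUpdate, DWORD nUpdatePeriodMs,
                            volatile bool& rTerminated)
    {
        std::vector<BYTE> nodeData(nBytesPerUpdate);
        while (!rTerminated)
        {
            for (size_t i = 0; i < rNodes.size(); i++)
            {
                // Fetch this node's contribution, then mix it into the
                // output buffers being played by the platform.
                FillFromNode(rNodes[i], &nodeData[0], nBytesPerUpdate);
                MixIntoOutputBuffer(&nodeData[0], nBytesPerUpdate);
            }
            WaitMilliseconds(nUpdatePeriodMs);
        }
    }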

Implementations available

The AudioRenderer is defined in terms of an abstract base class which defines the interfaces, with platform-specific implementation classes derived from it. Currently the following AudioRenderers are available:
class AudioRendererDS
Uses DirectSound under Windows 95/NT.
class AudioRendererSgi
For Silicon Graphics computers.
Consult the documentation in each directory for information on the specifics of each implementation.
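
Based on the method descriptions later in this document, the abstract base class might look roughly as follows. This is a reconstruction (whether the methods are pure virtual is an assumption here); the exact declaration should be taken from the source:

    // Sketch of the abstract base class, reconstructed from the method
    // descriptions below; consult the source for the exact form.
    typedef unsigned long  DWORD;
    typedef unsigned short WORD;

    class AudioSourceProxy;

    class AudioRenderer
    {
    public:
        virtual ~AudioRenderer() {}
        virtual void Init() = 0;
        virtual void Terminate() = 0;
        virtual void Start() = 0;
        virtual void AddSourceNode(AudioSourceProxy* pSourceNode,
                                   DWORD startTime, DWORD nSamplesPerSec,
                                   DWORD nBitsPerSample, WORD nNumChannels) = 0;
    };

    // Platform-specific implementations then derive from the base class:
    // class AudioRendererDS  : public AudioRenderer { /* DirectSound */ };
    // class AudioRendererSgi : public AudioRenderer { /* SGI audio   */ };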

Audio Renderer Methods

The audio renderer exposes the following methods. Some of these are in accordance with the IM1 API document; others are used by audio source proxy nodes to have their associated audio data played.

    virtual void Init()
Initialise the renderer. This will be called once for every file that is opened.

    virtual void Terminate()
Stops rendering and performs cleanups, including terminating the audio renderer thread. This will be called once for every file that is closed.

    virtual void Start()
This function is called once all of the audio nodes in the scene are known to have registered themselves with the audio renderer. It is anticipated that this will be called after the scene tree is first rendered, during which the audio source nodes will have registered themselves. This method also starts the main system clock running.

    virtual void AddSourceNode(AudioSourceProxy* pSourceNode,
                               DWORD startTime, DWORD nSamplesPerSec,
                               DWORD nBitsPerSample, WORD nNumChannels)
Add the given AudioSource node to the list of audio source nodes to be played. If the node is already in the list, it is left there unchanged. The time at which the audio is to start playing is also given, along with the audio data format.

It is intended that this method will be called when PreRender() is first called on the proxy node.
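
For illustration, a proxy node's first PreRender() call might register the node along these lines; the member names and the format values used here are assumptions, not part of the API:

    typedef unsigned long  DWORD;
    typedef unsigned short WORD;

    class AudioSourceProxy;

    class AudioRenderer
    {
    public:
        virtual void AddSourceNode(AudioSourceProxy* pSourceNode,
                                   DWORD startTime, DWORD nSamplesPerSec,
                                   DWORD nBitsPerSample, WORD nNumChannels) = 0;
    };

    // All member names below are assumptions, not part of the API.
    class AudioSourceProxy
    {
    public:
        void PreRender();
    private:
        AudioRenderer* m_pRenderer;   // renderer to register with
        DWORD          m_startTime;   // when this audio should start
        bool           m_bRegistered; // set once AddSourceNode() is called
    };

    void AudioSourceProxy::PreRender()
    {
        if (!m_bRegistered)
        {
            // e.g. 44.1 kHz, 16 bits/sample, stereo (example values only)
            m_pRenderer->AddSourceNode(this, m_startTime, 44100, 16, 2);
            m_bRegistered = true;
        }
        // ... remainder of PreRender() ...
    }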

Audio Source Proxy Node Methods

The audio renderer also requires two functions to be present in the audio source proxy node. These functions are used to fetch data to be played and to release it once played, and are as follows:
    void GetData(DWORD nBytesRequired, LPBYTE& rpData, DWORD& rnBytesGiven)
This function is used by the audio renderer to get more data to play. The audio renderer will ask for a given number of bytes (which will be a multiple of the number of bytes required for a single sample), and the audio source node will return a pointer to some data and the number of bytes of data pointed to.

The number of bytes returned can be less than the number asked for. If fewer bytes are returned than were requested, the audio renderer will ask for more. If zero bytes are returned, the audio renderer will assume that no data is available and will insert silence.
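
One way the renderer could honour this contract when filling its output buffers is sketched below (this is the FillFromNode() helper assumed in the loop sketch earlier). Note that the silence value written here is only correct for signed 16 bits/sample data:

    // Sketch of filling nBytesWanted bytes of output from one node,
    // honouring the partial-read and silence rules described above.
    #include <cstring>

    typedef unsigned long DWORD;
    typedef unsigned char BYTE;
    typedef BYTE* LPBYTE;

    class AudioSourceProxy
    {
    public:
        void GetData(DWORD nBytesRequired, LPBYTE& rpData, DWORD& rnBytesGiven);
        void DataUsed(LPBYTE pData, DWORD nBytesGiven);
    };

    void FillFromNode(AudioSourceProxy* pNode, LPBYTE pOut, DWORD nBytesWanted)
    {
        DWORD nFilled = 0;
        while (nFilled < nBytesWanted)
        {
            LPBYTE pData = 0;
            DWORD nGiven = 0;
            pNode->GetData(nBytesWanted - nFilled, pData, nGiven);
            if (nGiven == 0)
            {
                // No data available: insert silence for the remainder.
                // Zero is silence for signed 16-bit samples; unsigned
                // 8-bit samples would use the mid-point value 0x80.
                std::memset(pOut + nFilled, 0, nBytesWanted - nFilled);
                break;
            }
            std::memcpy(pOut + nFilled, pData, nGiven);
            pNode->DataUsed(pData, nGiven);   // finished with this data
            nFilled += nGiven;
        }
    }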

    void DataUsed(LPBYTE pData, DWORD nBytesGiven)
This function is used by the audio renderer to indicate that it has finished with some data given to it by GetData(). The parameters given to this function will be the same as those returned by GetData().
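
From the node's side, one simple way to satisfy the GetData()/DataUsed() pair is sketched below. The internal buffering scheme and member names are assumptions; a real node would refill its buffer from the associated decoder:

    // Sketch of a proxy node satisfying the GetData()/DataUsed() contract
    // from a single internal buffer.  All member names are assumptions.
    typedef unsigned long DWORD;
    typedef unsigned char BYTE;
    typedef BYTE* LPBYTE;

    class AudioSourceProxy
    {
    public:
        AudioSourceProxy() : m_nReadPos(0), m_nValidBytes(0) {}
        void GetData(DWORD nBytesRequired, LPBYTE& rpData, DWORD& rnBytesGiven);
        void DataUsed(LPBYTE pData, DWORD nBytesGiven);
    private:
        BYTE  m_buffer[4096];   // assumed internal buffer of decoded audio
        DWORD m_nReadPos;       // next unread byte in m_buffer
        DWORD m_nValidBytes;    // bytes of decoded data in m_buffer
    };

    void AudioSourceProxy::GetData(DWORD nBytesRequired,
                                   LPBYTE& rpData, DWORD& rnBytesGiven)
    {
        DWORD nAvailable = m_nValidBytes - m_nReadPos;
        rnBytesGiven = (nBytesRequired < nAvailable) ? nBytesRequired
                                                     : nAvailable;
        rpData = (rnBytesGiven > 0) ? m_buffer + m_nReadPos : 0;
        // Returning zero bytes tells the renderer to insert silence.
    }

    void AudioSourceProxy::DataUsed(LPBYTE pData, DWORD nBytesGiven)
    {
        // The renderer has finished with this data; advance past it.
        m_nReadPos += nBytesGiven;
        (void)pData;
    }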