carb::audio::VoiceParams

Defined in carb/audio/IAudioPlayback.h

Structs

struct VoiceParams

voice parameters block.

This can potentially contain all of a voice’s parameters and their current values. This is used to both set and retrieve one or more of a voice’s parameters in a single call. The fVoiceParam* flags that are passed to setVoiceParameters() or getVoiceParameters() determine which values in this block are guaranteed to be valid.
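The flag-selects-field pattern described above can be sketched in isolation. The snippet below is only an illustration: the `sketch` namespace, the cut-down `VoiceParams` struct, the bit values of the flags, and the `applySelected()` helper are all stand-ins invented for this example, not the definitions from carb/audio/IAudioPlayback.h.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch
{
// Stand-in types for illustration only. The real definitions live in
// carb/audio/IAudioPlayback.h, and the bit values below are made up.
using VoiceParamFlags = std::uint32_t;
constexpr VoiceParamFlags fVoiceParamVolume = 1u << 0;
constexpr VoiceParamFlags fVoiceParamFrequencyRatio = 1u << 1;

// Cut-down parameter block holding just two of the members.
struct VoiceParams
{
    float volume = 1.0f;
    float frequencyRatio = 1.0f;
};

// Copies only the members selected by `flags` from `src` into `dst`.
// Members whose flag is not set are left untouched in `dst`.
inline void applySelected(VoiceParams& dst, VoiceParamFlags flags, const VoiceParams& src)
{
    // This sketch only understands the two flags declared above.
    assert((flags & ~(fVoiceParamVolume | fVoiceParamFrequencyRatio)) == 0);

    if (flags & fVoiceParamVolume)
        dst.volume = src.volume;
    if (flags & fVoiceParamFrequencyRatio)
        dst.frequencyRatio = src.frequencyRatio;
}
} // namespace sketch
```

In this sketch, a caller fills in only the members it cares about and passes the matching flags. Every other member of the block is ignored, which mirrors how the flags passed to setVoiceParameters() or getVoiceParameters() decide which values are valid.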

Public Members

PlaybackModeFlags playbackMode

flags to indicate how a sound is to be played back.

This value is valid only when the fVoiceParamPlaybackMode, fVoiceParamMute, or fVoiceParamPause flags are used. This controls whether the sound is played as a spatial or non-spatial sound and how the emitter’s attributes will be interpreted (i.e. either as world coordinates or as listener relative).

float volume

the volume level for the voice.

This is valid when the fVoiceParamVolume flag is used. This should be 0.0 for silence or 1.0 for normal volume. A negative value may be used to invert the signal. A value greater than 1.0 will amplify the signal. The volume level can be interpreted as a linear scale where a value of 0.5 is half volume and 2.0 is double volume. Any volume values in decibels must first be converted to a linear volume scale before setting this value. The default value is 1.0.
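As a minimal sketch of the decibel conversion mentioned above, the helper below maps a gain in decibels to a linear factor. The function name `decibelsToLinearVolume` is hypothetical, and the code assumes the usual amplitude convention (20 * log10), which the text does not spell out.

```cpp
#include <cassert>
#include <cmath>

// Converts a gain in decibels to a linear factor that could be stored in
// VoiceParams::volume. Assumes the amplitude convention: dB = 20 * log10(linear).
inline float decibelsToLinearVolume(float decibels)
{
    assert(std::isfinite(decibels));
    return std::pow(10.0f, decibels / 20.0f);
}
```

Under that assumption, 0 dB maps to 1.0 (normal volume), about -6 dB maps to roughly 0.5, and +20 dB maps to 10.0.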

struct carb::audio::VoiceParams::VoiceParamBalance balance

Non-spatial sound positioning parameters.

float frequencyRatio

the frequency ratio for a voice.

This is valid when the fVoiceParamFrequencyRatio flag is used. This will be 1.0 to play back a sound at its normal rate, a value less than 1.0 to lower the pitch and play it back more slowly, and a value higher than 1.0 to increase the pitch and play it back faster. For example, a frequency ratio of 0.5 will play back at half the pitch (i.e. lower frequency, takes twice the time to play versus normal), and a frequency ratio of 2.0 will play back at double the pitch (i.e. higher frequency, takes half the time to play versus normal). The default value is 1.0.

On some platforms, the frequency ratio may be silently clamped to an acceptable range internally. For example, a value of 0.0 is not allowed. This will be clamped to the minimum supported value instead.

Note that even though the frequency ratio can be set to any value in the range from 1/1024 to 1024, this very large range should only be used in cases where it is well known that the particular sound being operated on will still sound valid after the change. In the real world, some of these extreme frequency ratios may make sense, but in the digital world, extreme frequency ratios can result in audio corruption or even silence. This happens because the new frequency falls outside of the range that is faithfully representable by either the audio device or the sound data itself. For example, a 4 kHz tone being played at a frequency ratio larger than 6.0 will be above the maximum representable frequency (24 kHz, half the sample rate) for a 48 kHz device or sound file. This case will result in a form of corruption known as aliasing, where the frequency components above the maximum representable frequency will become audio artifacts. Similarly, an 800 Hz tone being played at a frequency ratio smaller than 1/40 will be inaudible because it falls below the roughly 20 Hz lower limit of the human ear.

In general, most use cases will find that the frequency ratio range of [0.1, 10] is more than sufficient for their needs. Further, for many cases, the range from [0.2, 4] would suffice. Care should be taken to appropriately cap the used range for this value.
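The aliasing limit and the suggested capping can be sketched as follows. Both helper names are hypothetical, and the default limits are simply the [0.1, 10] range suggested above. An application would pick its own limits, and the engine may apply further clamping internally.

```cpp
#include <algorithm>
#include <cassert>

// Largest frequency ratio that keeps a tone at or below the Nyquist
// frequency (half the sample rate) of the device or sound data.
inline float maxRatioBeforeAliasing(float toneHz, float sampleRateHz)
{
    assert(toneHz > 0.0f && sampleRateHz > 0.0f);
    return (sampleRateHz * 0.5f) / toneHz;
}

// Caps a requested frequency ratio to an application-chosen range before it
// is stored in VoiceParams::frequencyRatio.
inline float capFrequencyRatio(float requested, float minRatio = 0.1f, float maxRatio = 10.0f)
{
    assert(minRatio > 0.0f && minRatio <= maxRatio);
    return std::max(minRatio, std::min(requested, maxRatio));
}
```

For the 4 kHz tone on a 48 kHz device from the example above, `maxRatioBeforeAliasing(4000.0f, 48000.0f)` evaluates to 6.0, which matches the limit quoted in the text.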

int32_t priority

the playback priority of this voice.

This is valid when the fVoiceParamPriority flag is used. This is an arbitrary value whose scale is defined by the host app. A value of 0 is the default priority. Negative values indicate lower priorities and positive values indicate higher priorities. This priority value helps to determine which voices are the most important to be audible at any given time. When all buses are busy, this value will be used to compare against other playing voices to see if it should steal a bus from another lower priority sound or if it can wait until another bus finishes first. Higher priority sounds will be ensured a bus to play on over lower priority sounds. If multiple sounds have the same priority levels, the louder sound(s) will take priority. When a higher priority sound is queued, it will try to steal a bus from the quietest sound with lower or equal priority.
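The bus-stealing rule in the last sentence can be illustrated with a small selection routine. This is only a sketch of the rule as worded above, not the engine's implementation: the `PlayingVoice` record and `pickBusToSteal()` are hypothetical, and the absolute value of the volume is used as a stand-in for loudness.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bookkeeping record for a voice that currently owns a bus.
struct PlayingVoice
{
    std::int32_t priority;
    float volume;
};

// Returns the index of the quietest playing voice whose priority is lower
// than or equal to `incomingPriority`, or voices.size() if no voice qualifies.
inline std::size_t pickBusToSteal(const std::vector<PlayingVoice>& voices, std::int32_t incomingPriority)
{
    std::size_t victim = voices.size();
    for (std::size_t i = 0; i < voices.size(); ++i)
    {
        // Voices with a strictly higher priority are never stolen from.
        if (voices[i].priority > incomingPriority)
            continue;
        if (victim == voices.size() || std::fabs(voices[i].volume) < std::fabs(voices[victim].volume))
            victim = i;
    }
    return victim;
}
```

A return value equal to `voices.size()` means every playing voice outranks the incoming one, so in this sketch the incoming voice would have to wait for a bus to become free.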

float spatialMixLevel

the spatial mix level.

This is valid when the fVoiceParamSpatialMixLevel flag is used. This controls the mix between the results of a voice’s spatial sound calculations and its non-spatial calculations. When this is set to 1.0, only the spatial sound calculations will affect the voice’s playback. This is the default state. When set to 0.0, only the non-spatial sound calculations will affect the voice’s playback. When set to a value between 0.0 and 1.0, the results of the spatial and non-spatial sound calculations will be mixed, weighted according to this value. This value will be ignored if fPlaybackModeSpatial is not set. The default value is 1.0. Values above 1.0 will be treated as 1.0. Values below 0.0 will be treated as 0.0.

fPlaybackModeSpatialMixLevelMatrix affects the non-spatial mixing behavior of this parameter for multi-channel voices. By default, a multi-channel spatial voice’s non-spatial component will treat each channel as a separate mono voice. With the fPlaybackModeSpatialMixLevelMatrix flag set, the non-spatial component will be set with the specified output matrix or the default output matrix.
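To make the weighting concrete, the sketch below clamps the mix level as described and then blends two sample values. The linear crossfade is an assumption, since the text only says the results are mixed according to this value, and both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Clamps the mix level the way the text describes: values above 1.0 are
// treated as 1.0 and values below 0.0 are treated as 0.0.
inline float clampSpatialMixLevel(float level)
{
    return std::max(0.0f, std::min(level, 1.0f));
}

// Blends one spatial and one non-spatial sample value.
// Assumption: a plain linear crossfade weighted by the clamped mix level.
inline float blendSpatial(float spatialSample, float nonSpatialSample, float spatialMixLevel)
{
    const float w = clampSpatialMixLevel(spatialMixLevel);
    return w * spatialSample + (1.0f - w) * nonSpatialSample;
}
```

With a mix level of 1.0 the result is purely the spatial value, with 0.0 it is purely the non-spatial value, and out-of-range levels behave like the nearest end of the range.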

float dopplerScale

the Doppler scale value.

This is valid when the fVoiceParamDopplerScale flag is used. This allows the result of internal Doppler calculations to be scaled to emulate a time warping effect. This should be near 0.0 to greatly reduce the effect of the Doppler calculations, and up to 5.0 to exaggerate the Doppler effect. A value of 1.0 will leave the calculated Doppler factors unmodified. The default value is 1.0.

struct carb::audio::VoiceParams::VoiceParamOcclusion occlusion

Occlusion factors for a voice.

EmitterAttributes emitter

the attributes of the emitter related to this voice.

This is only valid when the fVoiceParamEmitter flag is used. This includes the emitter’s position, orientation, velocity, cone, and rolloff curves. The default values for these attributes are noted in the EmitterAttributes object. This will be ignored for non-spatial sounds.

const float *matrix

the channel mixing matrix to use for this Voice.

The rows of this matrix represent each output channel from this Voice and the columns of this matrix represent the input channels of this Voice (i.e. this is an inputChannels x outputChannels matrix). The output channel count will always be the number of audio channels set on the Context. Each cell in the matrix should be a value from 0.0 to 1.0 that specifies the volume at which the corresponding input channel is mixed into the corresponding output channel. Setting negative values will invert the signal. Setting values above 1.0 will amplify the signal past unity gain when being mixed.

This setting is mutually exclusive with balance; setting one will disable the other. This setting is only available for spatial sounds if fPlaybackModeSpatialMixLevelMatrix is set in the playback mode parameter. Multi-channel spatial audio is interpreted as multiple emitters existing at the same point in space, so a purely spatial voice cannot have an output matrix specified.

Setting this to nullptr will reset the matrix to the default for the given channel count. The following table shows the speaker modes that are used for the default output matrices. Voices with a speaker mode that is not in the following table will use the default output matrix for the speaker mode in the following table that has the same number of channels. If there is no default matrix for the channel count of the Voice, the output matrix will have 1.0 in any cell (i, j) where i == j and 0.0 in all other cells.

Channels    Speaker Mode

1           kSpeakerModeMono
2           kSpeakerModeStereo
3           kSpeakerModeTwoPointOne
4           kSpeakerModeQuad
5           kSpeakerModeFourPointOne
6           kSpeakerModeFivePointOne
7           kSpeakerModeSixPointOne
8           kSpeakerModeSevenPointOne
10          kSpeakerModeNinePointOne
12          kSpeakerModeSevenPointOnePointFour
14          kSpeakerModeNinePointOnePointFour
16          kSpeakerModeNinePointOnePointSix

It is recommended to explicitly set an output matrix on a non-spatial Voice if the Voice or the Context has a speaker layout that is not found in the above table.
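The fallback matrix described above (1.0 where i == j, 0.0 elsewhere) and the way a matrix is applied to one frame of audio can be sketched as follows. The storage layout is an assumption made for this example: row-major, with one row per output channel and one column per input channel. Check the header before relying on it. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds the fallback matrix: 1.0 in every cell where the row index equals
// the column index, 0.0 everywhere else.
// Assumed layout: row-major, rows = output channels, columns = input channels.
inline std::vector<float> makeFallbackMatrix(std::size_t inputChannels, std::size_t outputChannels)
{
    std::vector<float> matrix(inputChannels * outputChannels, 0.0f);
    for (std::size_t i = 0; i < inputChannels && i < outputChannels; ++i)
        matrix[i * inputChannels + i] = 1.0f;
    return matrix;
}

// Mixes one frame of input samples into one frame of output samples by
// multiplying the matrix with the input vector.
inline std::vector<float> mixFrame(const std::vector<float>& matrix,
                                   const std::vector<float>& input,
                                   std::size_t outputChannels)
{
    assert(matrix.size() == input.size() * outputChannels);
    std::vector<float> output(outputChannels, 0.0f);
    for (std::size_t out = 0; out < outputChannels; ++out)
        for (std::size_t in = 0; in < input.size(); ++in)
            output[out] += matrix[out * input.size() + in] * input[in];
    return output;
}
```

Under the assumed layout, a custom matrix that sends a mono voice at half volume to both channels of a stereo context would be the two-row, one-column array `{ 0.5f, 0.5f }`, and its `data()` pointer is what would be assigned to the `matrix` member.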

void *ext = nullptr

reserved for future expansion.

This must be set to nullptr.

struct VoiceParamBalance

non-spatial sound positioning parameters.

These provide pan and fade values for the voice to give the impression that the sound is located closer to one of the quadrants of the acoustic space versus the others. These values are ignored for spatial sounds.

Public Members

float pan

sets the non-spatial panning value for a voice.

This value is valid when the fVoiceParamBalance flag is used. This is 0.0 to have the sound “centered” in all speakers. This is -1.0 to have the sound balanced to the left side. This is 1.0 to have the sound balanced to the right side. The way the sound is balanced depends on the number of channels. For example, a mono sound will be balanced between the left and right sides according to the panning value, but a stereo sound will just have the left or right channels’ volumes turned down according to the panning value. This value is ignored for spatial sounds. The default value is 0.0.

Note that panning on non-spatial sounds should only be used for mono or stereo sounds. When it is applied to sounds with more channels, the results are often undefined or may sound odd.
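For the stereo case, where panning turns down the volume of the opposite channel, a minimal sketch might look like the following. The linear attenuation curve is an assumption, since the actual curve is not documented here, and the `StereoGains` struct and `stereoPanGains()` function are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Per-channel gain factors for a stereo voice.
struct StereoGains
{
    float left;
    float right;
};

// Computes channel gains for a pan value in [-1.0, 1.0].
// Assumption: linear attenuation. Panning right turns the left channel down,
// panning left turns the right channel down, and 0.0 leaves both at full gain.
inline StereoGains stereoPanGains(float pan)
{
    const float p = std::max(-1.0f, std::min(pan, 1.0f));
    StereoGains gains;
    gains.left = (p > 0.0f) ? 1.0f - p : 1.0f;
    gains.right = (p < 0.0f) ? 1.0f + p : 1.0f;
    return gains;
}
```

In this sketch a pan of 0.25 leaves the right channel at full gain and reduces the left channel to 0.75.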

float fade

sets the non-spatial fade value for a voice.

This value is valid when the fVoiceParamBalance flag is used. This is 0.0 to have the sound “centered” in all speakers. This is -1.0 to have the sound balanced to the back side. This is 1.0 to have the sound balanced to the front side. The way the sound is balanced depends on the number of channels. For example, a mono sound will be balanced between the front and back speakers according to the fade value, but a 5.1 sound will just have the front or back channels’ volumes turned down according to the fade value. This value is ignored for spatial sounds. The default value is 0.0.

Note that fade on non-spatial sounds should only be used for mono or stereo sounds. When it is applied to sounds with more channels, the results are often undefined or may sound odd.

struct VoiceParamOcclusion

the occlusion factors for a voice.

This is valid when the fVoiceParamOcclusionFactor flag is used. These values control automatic low pass filters that get applied to the spatial sounds to simulate object occlusion between the emitter and listener positions.

Public Members

float direct

the occlusion factor for the direct path of the sound.

This is the path directly from the emitter to the listener. This factor describes how occluded the sound’s path actually is. A value of 1.0 means that the sound is fully occluded by an object between the voice and the listener. A value of 0.0 means that the sound is not occluded by any object at all. This defaults to 0.0. This factor is multiplied by EntityCone::lowPassFilter if a cone with a lowPassFilter value other than 1.0 is specified. Setting this to a value outside of [0.0, 1.0] will result in an undefined low pass filter value being used.
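Because values outside of [0.0, 1.0] lead to an undefined filter value, a caller may want to sanitize a computed occlusion factor before storing it. The helper below is a hypothetical sketch of that. Mapping NaN to 0.0 (the documented default, meaning not occluded) is a choice made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Forces an occlusion factor into the documented [0.0, 1.0] range.
// NaN is mapped to 0.0 (not occluded), which is a choice made for this sketch.
inline float sanitizeOcclusionFactor(float factor)
{
    if (std::isnan(factor))
        return 0.0f;
    return std::max(0.0f, std::min(factor, 1.0f));
}
```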

float reverb

the occlusion factor for the reverb path of the sound.

This is the path taken for sounds reflecting back to the listener after hitting a wall or other object. A value of 1.0 means that the sound is fully occluded by an object between the listener and the object that the sound reflected off of. A value of 0.0 means that the sound is not occluded by any object at all. This defaults to 1.0.