USD Schemas#

omniAudioSchema#

OmniSound#

The Sound primitive defines the parameters for spatial or non-spatial audio playback from an audio asset.

The spatial effects are calculated based on the position given by the UsdGeomXformable base class, as well as some of the additional properties added by this schema. These prims should be attached to the object that’s emitting the audio, if applicable, so that the sound will not have to duplicate the animation of that object. The startTime and endTime attributes respect layer offsets and time scaling specified by layers. The time scale specified by layers does not affect audio playback speed; if an alteration to the playback speed is desired, there is a timeScale parameter that can be adjusted. Setting a negative time scale on the layer will not result in the audio being played in reverse.

filePath#

The file path to the sound that will be played. Omniverse Kit supports the following formats:

Vorbis (.ogg/.oga)
Opus (.ogg/.oga)
FLAC (.flac or .ogg/.oga)
MP3 (.mp3). It is recommended to use Vorbis or Opus instead of MP3 where possible.
WAVE (.wav/.wave). Supported WAVE formats:
    8 bit unsigned PCM
    16, 24, 32 bit signed PCM
    32 bit float PCM

Omniverse Kit supports up to 64 channels in sound assets.
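
As a quick illustration of the format list above, the following sketch checks whether a path ends in one of the listed extensions. The names `SUPPORTED_EXTENSIONS`, `MAX_CHANNELS`, and `has_supported_extension` are hypothetical helpers, not part of the schema or of Omniverse Kit, and an extension match does not guarantee that the file contents are decodable.

```python
from pathlib import PurePosixPath

# Extensions copied from the format list above. The extension is only a hint:
# an .ogg/.oga container may hold Vorbis, Opus, or FLAC data.
SUPPORTED_EXTENSIONS = {".ogg", ".oga", ".flac", ".mp3", ".wav", ".wave"}

# Channel limit stated above for sound assets.
MAX_CHANNELS = 64


def has_supported_extension(file_path: str) -> bool:
    """Return True if the path ends in one of the listed extensions (case-insensitive)."""
    return PurePosixPath(file_path).suffix.lower() in SUPPORTED_EXTENSIONS
```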

auralMode#

This chooses whether the sound has spatial effects, such as distance attenuation, directionality, Doppler effects, etc. This must be one of the following values:

nonSpatial: The sound will play with no additional effects inherited from its position and velocity in 3D space. The intended usage of this is for non-spatial effects, such as music, narration and some types of ambient sound. Multi-channel audio will be played directly through to the audio device, if possible; otherwise it will be downmixed as determined appropriate by the application.
spatial: The sound will play with some effects due to its position and velocity in 3D space. Mono sounds are recommended for spatial audio as multiple channels will be effectively playing from the same point in 3D space.

enableDoppler#

Choose whether the Doppler effect is applied to this sound. The Doppler effect alters the pitch of a sound based on its velocity relative to the listener. When the listener and a sound are moving toward each other, the sound will be played at a greater speed, and when they are moving away from each other, the sound will be played at a lower speed. Enabling the Doppler effect will increase the CPU cost of audio processing. This value is ignored if auralMode is set to “nonSpatial”. This must be one of the following values:

default: The state of this is inherited from the active Listener in the scene.
on: The Doppler effect is applied.
off: The Doppler effect is not applied.
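
To make the direction of the pitch change concrete, here is a minimal sketch using the textbook Doppler formula. This is an assumption for illustration only: the schema does not state which formula Omniverse Kit uses, and `doppler_pitch_factor` is a hypothetical helper.

```python
def doppler_pitch_factor(
    speed_of_sound: float,
    listener_speed_toward: float,
    source_speed_toward: float,
) -> float:
    """Textbook Doppler ratio (assumed, not confirmed as Kit's implementation).

    Both speeds are measured along the line between listener and sound;
    positive values mean "moving toward the other party".
    """
    if source_speed_toward >= speed_of_sound:
        raise ValueError("source speed must be below the speed of sound")
    return (speed_of_sound + listener_speed_toward) / (
        speed_of_sound - source_speed_toward
    )
```

A factor above 1.0 corresponds to the sound being played at a greater speed (approaching), and a factor below 1.0 to a lower speed (receding).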

enableDistanceDelay#

Choose whether a distance delay effect is applied to this sound. The distance delay will cause the start time of the sound to be delayed in proportion to the distance between the sound and the listener. The delay is calculated based on the current speed of sound; for example, if the listener and sound are separated by 340 meters and the speed of sound is 340 m/s, the delay will be 1 second. Enabling this will increase the CPU cost of audio processing. This value is ignored if auralMode is set to “nonSpatial”. This must be one of the following values:

default: The state of this is inherited from the active Listener in the scene.
on: The distance delay effect is applied.
off: The distance delay effect is not applied.
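
The delay calculation described above amounts to distance divided by the speed of sound. A minimal sketch, where `distance_delay_seconds` is a hypothetical helper and the 340 m/s default is simply the value from the example above:

```python
def distance_delay_seconds(distance: float, speed_of_sound: float = 340.0) -> float:
    """Return the start delay in seconds for a sound `distance` meters away."""
    if speed_of_sound <= 0.0:
        raise ValueError("speed_of_sound must be positive")
    return distance / speed_of_sound
```

With a separation of 340 meters and a speed of sound of 340 m/s, this reproduces the 1 second delay from the example.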

enableInterauralDelay#

Choose whether an interaural delay effect is applied to this sound. This causes the sound to arrive slightly later at one ear than the other, based on the sound’s angle relative to the listener’s front. This simulates the real-world delay between the right and left ear when audio arrives at an angle to the head, which should improve the quality of the directional audio effect. Omniverse Kit will only apply this effect to mono spatial sounds when the speaker configuration is stereo. Note that this effect may cause slight audio distortion when an emitter’s relative angle to the listener changes; this is only obvious when playing pure sine waves while the emitter’s relative angle to the listener changes rapidly. Enabling this will increase the CPU cost of audio processing. This value is ignored if auralMode is set to “nonSpatial”. This must be one of the following values:

default: The state of this is inherited from the active Listener in the scene.
on: The interaural delay effect is applied.
off: The interaural delay effect is not applied.

loopCount#

The number of additional loops of this sound that will be played. If this is 0, the sound will play once; if this is 1, the sound will play twice; and so on. Negative values will cause the sound to loop infinitely.
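
The mapping from loopCount to the number of times the sound is heard can be sketched as follows. `total_plays` is a hypothetical helper, and representing "infinite" as `math.inf` is a choice made for this illustration.

```python
import math


def total_plays(loop_count: int) -> float:
    """Return how many times the sound plays for a given loopCount.

    loopCount counts *additional* loops, so 0 plays once and 1 plays twice.
    Negative values loop forever, represented here as math.inf.
    """
    if loop_count < 0:
        return math.inf
    return loop_count + 1
```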

mediaOffsetStart#

The time offset to start playing the sound asset at. This is a time value measured in seconds. For example, setting this to 0.2 will skip the first 0.2 seconds of the sound asset being played. Setting this to a value less than 0 or greater than the length of the sound asset will cause the sound to not be played. Omniverse Kit will automatically apply a 20ms fade-in effect on the sound if this value is not set to 0, so that a pop will not occur if the offset does not correspond to a zero-crossing in the sound asset.

mediaOffsetEnd#

The time offset in the sound asset at which the sound will finish playing. This is a time value measured in seconds. The time value is relative to the start of the asset. For example, a sound with mediaOffsetStart 2 and mediaOffsetEnd 12 will play for 10 seconds. This value is ignored if it is less than or equal to mediaOffsetStart.
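
The rules for the two media offsets can be combined into one small function. This is a sketch under stated assumptions: `played_span` is a hypothetical helper, and clamping an end offset that lies past the end of the asset is an assumption, since the text above does not say how that case is handled.

```python
from typing import Optional, Tuple


def played_span(
    asset_length: float,
    media_offset_start: float = 0.0,
    media_offset_end: float = 0.0,
) -> Optional[Tuple[float, float]]:
    """Return the (start, end) region of the asset that plays, in seconds.

    Returns None when the sound would not be played at all.
    """
    # A start offset below 0 or past the end of the asset means no playback.
    if media_offset_start < 0.0 or media_offset_start > asset_length:
        return None
    # The end offset is ignored when it is <= the start offset.
    end = asset_length
    if media_offset_end > media_offset_start:
        # Assumption: an end offset beyond the asset is clamped to its length.
        end = min(media_offset_end, asset_length)
    return (media_offset_start, end)
```

For a 20 second asset, a start offset of 2 and an end offset of 12 yield the span (2.0, 12.0), which is the 10 seconds of playback from the example above.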

startTime#

The time in the animation timeline when this sound will be played. If this is negative, the sound will never be played (this functionality may be useful for sounds intended to be triggered from scripts). This can be converted to seconds by dividing by the timeCodesPerSecond value of the stage that this is within. Layer offsets will be applied because this is a timecode.

endTime#

The time in the animation timeline when this sound will end. If this value is less than or equal to startTime, then the sound will play until it naturally ends (or the animation timeline ends). timeScale has no effect on the end time, since it’s relative to the timeline. This can be converted to seconds by dividing by the timeCodesPerSecond value of the stage that this is within. Layer offsets will be applied because this is a timecode.
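
The conversion from timecodes to seconds mentioned for startTime and endTime is a single division. In the sketch below, `timecode_to_seconds` is a hypothetical helper and the value 24 is only an example; in a real scene the value would come from the stage (for instance via `Usd.Stage.GetTimeCodesPerSecond()` in the USD Python bindings). Layer offsets are not modeled here.

```python
def timecode_to_seconds(time_code: float, time_codes_per_second: float) -> float:
    """Convert a timeline timecode (e.g. startTime or endTime) to seconds."""
    if time_codes_per_second <= 0.0:
        raise ValueError("time_codes_per_second must be positive")
    return time_code / time_codes_per_second


# Example: a startTime of 48 on a stage with 24 timecodes per second.
start_seconds = timecode_to_seconds(48.0, 24.0)
```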

priority#

The priority of the sound. In Omniverse Kit, there is a limit on the number of sounds that can be played simultaneously. In a very busy scene which exceeds this number of sounds, some sounds will stop being played to the audio device (this is referred to as the sound being virtual). Priority can be used to specify that some sounds are not very important so they should be made virtual first and that some sounds are important so they should be made virtual last. A larger value corresponds to a higher priority and a lesser value corresponds to a lower priority; negative values are allowed. Priority values have no meaning by themselves; their only usage is to specify their priority relative to other sounds. For example, 2 sounds with 0 priority and 1 sound with 1 priority will function the same as 2 sounds with -100 priority and 1 sound with 100 priority.
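
The point that only the relative order of priorities matters can be illustrated with a small ranking function. This is a sketch, not Omniverse Kit's actual voice management: `split_audible_virtual` is a hypothetical helper, and breaking ties by insertion order is an assumption.

```python
from typing import Dict, List, Tuple


def split_audible_virtual(
    priorities: Dict[str, int], max_playing: int
) -> Tuple[List[str], List[str]]:
    """Split sound names into (audible, virtual) given a playback limit.

    Higher priority values are kept audible first. Ties keep insertion
    order, which is an assumption made for this sketch.
    """
    ranked = sorted(priorities, key=lambda name: priorities[name], reverse=True)
    return ranked[:max_playing], ranked[max_playing:]


# The two priority assignments from the example above behave identically.
small = split_audible_virtual({"a": 0, "b": 0, "c": 1}, max_playing=1)
large = split_audible_virtual({"a": -100, "b": -100, "c": 100}, max_playing=1)
```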

attenuationType#

The curve that the distance attenuation will follow. This is only used when auralMode is ‘spatial’. These are the curves that the tokens calculate:

inverse: (attenuationRange.y * K) / (1.0 + distance - attenuationRange.x)
linear: (distance - attenuationRange.x) / (attenuationRange.y - attenuationRange.x)
linearSquare: ((distance - attenuationRange.x) / (attenuationRange.y - attenuationRange.x))^2

The inverse rolloff curve is implemented as a polynomial approximation of the equation above to ensure that volume is attenuated to 0.0 when distance reaches attenuationRange.y. The constant K in the inverse calculation exists to specify that attenuationRange.y is not the 0.5 volume point on the curve.

attenuationRange#

The range at which distance attenuation will occur. This is only enabled when auralMode is ‘spatial’. When the listener’s distance from the sound is less than attenuationRange.x, the sound will be played at full volume. When the listener’s distance from the sound is between attenuationRange.x and attenuationRange.y, the sound will ramp down to 0.0 volume as the distance grows. When the listener’s distance from the sound is past attenuationRange.y, the sound will be silent. attenuationRange.y must be greater than attenuationRange.x.
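
The three distance regions described above can be sketched for the linear curve. Two assumptions are made here and should be read as such: `linear_attenuated_volume` is a hypothetical helper, and the resulting volume is taken to be one minus the linear expression listed under attenuationType, since that expression grows from 0 to 1 while the volume is described as ramping down to 0.0.

```python
def linear_attenuated_volume(
    distance: float, range_min: float, range_max: float
) -> float:
    """Volume multiplier for the 'linear' curve (sketch, see assumptions above).

    range_min and range_max stand for attenuationRange.x and attenuationRange.y.
    """
    if range_max <= range_min:
        raise ValueError("attenuationRange.y must be greater than attenuationRange.x")
    if distance <= range_min:
        return 1.0  # closer than attenuationRange.x: full volume
    if distance >= range_max:
        return 0.0  # past attenuationRange.y: silent
    # Fraction of the way through the attenuation range, in (0, 1).
    t = (distance - range_min) / (range_max - range_min)
    return 1.0 - t  # assumption: volume falls as the linear expression rises
```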

gain#

The volume of the sound. This is a unitless linear volume scale where 0.0 is silence and 1.0 is full volume. Setting gain above 1.0 will amplify the signal but potentially result in some distortion during playback. Negative volumes will invert the signal.
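
The effect of gain on a signal can be sketched on a list of normalized samples. `apply_gain` is a hypothetical helper, and hard clipping to [-1.0, 1.0] is an assumption used here to show one way that amplification above 1.0 can distort; the text above does not say how Omniverse Kit handles out-of-range samples.

```python
from typing import List


def apply_gain(samples: List[float], gain: float) -> List[float]:
    """Scale samples by a linear gain; a negative gain inverts the signal."""
    scaled = [sample * gain for sample in samples]
    # Assumption: samples outside [-1.0, 1.0] are hard-clipped, which distorts.
    return [max(-1.0, min(1.0, sample)) for sample in scaled]
```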

timeScale#

The rate at which the sound is played relative to realtime. For example, setting this to 0.5 will play the sound at half speed and setting this to 2.0 will play the sound at double speed. Altering the playback speed of a sound will affect the pitch of the sound. This does not affect the timing of the distance delay effect. The limits of this setting under Omniverse Kit are [1/1024, 1024]. Omniverse Kit does not perform any form of antialiasing filter when applying this effect, so increasing this setting excessively will cause aliasing in the resulting audio.
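
The relationship between timeScale and playback duration can be sketched as follows. `clamp_time_scale` and `playback_seconds` are hypothetical helpers, and clamping out-of-range values (rather than rejecting them) is an assumption; only the limits themselves come from the text above.

```python
MIN_TIME_SCALE = 1.0 / 1024.0  # lower limit stated above
MAX_TIME_SCALE = 1024.0        # upper limit stated above


def clamp_time_scale(time_scale: float) -> float:
    """Clamp timeScale into the documented range (clamping is an assumption)."""
    return min(max(time_scale, MIN_TIME_SCALE), MAX_TIME_SCALE)


def playback_seconds(asset_seconds: float, time_scale: float) -> float:
    """Return how long the asset takes to play at the given timeScale."""
    return asset_seconds / clamp_time_scale(time_scale)
```

A 10 second asset takes 20 seconds at a timeScale of 0.5 and 5 seconds at 2.0.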

coneAngles#

The angles used for the directional cone of a spatial sound’s audio. This is used to simulate sound sources that are directional, so the listener will hear the sound normally when standing in front of the sound source, but the listener will hear a quieter and muffled version of the sound when standing behind or at a wide angle from the sound source.

This is only enabled when auralMode is ‘spatial’. A cone defines the angle from the forward vector where the sound’s volume begins to attenuate. The cone has two angles: the inner angle (coneAngles.x) and the outer angle (coneAngles.y). The inner angle must be less than or equal to the outer angle. An omnidirectional emitter should have these set to (180.0, 180.0). When the listener is within the inner angle of this cone, the sound’s volume will be at coneVolumes.x and the sound’s low pass filter effect will be at coneLowPassFilter.x. Between the inner angle and the outer angle of the cone, the volume will ramp down to coneVolumes.y and the sound’s low pass filter effect will ramp to coneLowPassFilter.y. When the listener is past the outer angle of the cone, the volume will be at coneVolumes.y and the sound’s low pass filter effect will be at coneLowPassFilter.y. The cone angles are specified as degrees from the forward vector of the sound; these are clamped to the range of [0, 180].

coneVolumes#

The volume in the regions of the cone. This is a volume modifier for cone calculations; it is multiplied by the gain. coneVolumes.x is the volume for the inner cone. coneVolumes.y is the volume for the outer cone.

coneLowPassFilter#

The low pass filter effect that is applied onto the cone. This is a unitless range from 0.0 (disabled) to 1.0 (maximum filtering). An increase in low pass filtering will result in the sound being more muffled sounding, which can simulate the effect of hearing the sound mostly through reverberation. coneLowPassFilter.x is the low pass filter value for the inner cone. coneLowPassFilter.y is the low pass filter value for the outer cone.
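
The interaction of coneAngles with coneVolumes and coneLowPassFilter can be sketched with one interpolation function. `cone_value` is a hypothetical helper, and the linear ramp between the inner and outer angles is an assumption; the schema only says the values ramp between the two.

```python
from typing import Tuple


def cone_value(
    angle_deg: float,
    cone_angles: Tuple[float, float],
    inner_outer: Tuple[float, float],
) -> float:
    """Evaluate a cone parameter at an angle from the sound's forward vector.

    inner_outer is either coneVolumes or coneLowPassFilter as (x, y).
    """
    # Cone angles are clamped to [0, 180] degrees, as stated above.
    inner = min(max(cone_angles[0], 0.0), 180.0)
    outer = min(max(cone_angles[1], 0.0), 180.0)
    inner_value, outer_value = inner_outer
    if angle_deg <= inner:
        return inner_value
    if angle_deg >= outer:
        return outer_value
    # Assumption: linear ramp between the inner and outer angles.
    t = (angle_deg - inner) / (outer - inner)
    return inner_value + t * (outer_value - inner_value)


# The cone volume multiplies the gain; halfway through the ramp with gain 0.5:
final_volume = 0.5 * cone_value(67.5, (45.0, 90.0), (1.0, 0.0))
```

With coneAngles set to (180.0, 180.0), every angle falls within the inner cone, which matches the omnidirectional case.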

Sound#

Deprecated. Use OmniSound instead.

OmniListener#

The Listener primitive defines the parameters for audio played to the audio device by the application viewing the USD scene. The most important role for the Listener is to have a position in 3D space so that the spatial effects can be applied correctly. This is separate from the camera because in some cases the camera does not play this role suitably. An example of a case where a listener detached from the camera is ideal would be a third person game; attaching the listener to the camera would cause undesirable effects, such as a Doppler shift when the camera zooms in and attenuation of sounds that are near the character being played.

orientationFromView#

This specifies whether the Listener’s orientation should be taken from the current camera, rather than the xform of this Listener. It may be desirable to have the spatial audio listener’s position separate from the camera (i.e., on a third person character or object), but have the orientation still come from the camera; having the orientation also attached to the world object may be disorienting to the user/viewer. An extreme example where this would be needed is a third person game where the player character is a marble that rolls around.

coneAngles#

The angles used for the directional cone of a listener’s hearing. A cone defines the angle from the forward vector where the listener’s volume begins to attenuate. The cone has two angles: the inner angle (coneAngles.x) and the outer angle (coneAngles.y). The inner angle must be less than or equal to the outer angle. An omnidirectional listener should have these set to (180.0, 180.0). When the listener is within the inner angle of this cone, the listener’s volume will be at coneVolumes.x and the listener’s low pass filter effect will be at coneLowPassFilter.x. Between the inner angle and the outer angle of the cone, the volume will ramp down to coneVolumes.y and the listener’s low pass filter effect will ramp to coneLowPassFilter.y. When the listener is past the outer angle of the cone, the volume will be at coneVolumes.y and the listener’s low pass filter effect will be at coneLowPassFilter.y. The cone angles are specified as degrees from the forward vector of the listener; these are clamped to the range of [0, 180].

coneVolumes#

The volume in the regions of the cone. coneVolumes.x is the volume for the inner cone. coneVolumes.y is the volume for the outer cone.

coneLowPassFilter#

The low pass filter effect that is applied onto the cone. This is a unitless range from 0.0 (disabled) to 1.0 (maximum filtering). An increase in low pass filtering will result in the audio being more muffled sounding, which can simulate the effect of hearing sounds mostly through reverberation.

Listener#

Deprecated. Use OmniListener instead.