Adjust Performance#

Performance is critical to a great-looking and great-feeling experience. Our SDKs take advantage of reprojection to ensure a comfortable immersive experience when latency is introduced. However, we're not immune to latency, and we need to keep the full round trip of a frame from client to server under 80 ms. In this section, we'll look at how to diagnose the three common areas where we need to extract performance:

Omniverse RTX Performance#

This section provides notes on using the profiler and adjusting common settings to maximize performance.

Profiler#

Kit includes a general purpose profiler tool that measures the CPU and GPU performance of your stage.

To open the profiler:

  1. Navigate to the Window menu, or press F8.

  2. Enable the GPU Profiler.

  3. Sort by Time to see where the GPU time of your scene render is being spent.

Looking at the profiler can give you an indication of whether tasks on the CPU or the GPU are costing you the most time; for example, animation will consume CPU resources.

_images/profiler-cpu.png

On a more complex scene, the profiler can give you insight into which areas of the renderer have the highest cost, such as reflections and translucency.

_images/profiler-trans-refl.png

Adjusting Render Settings#

You can adjust the quality and thus performance of your scene by adjusting settings for reflections, translucency, and lighting.

Reflections can be tuned by increasing or decreasing the number of bounces, and by adjusting the roughness threshold at which the reflection cache kicks in. Tune these settings for the reflection quality you require, but try to avoid unnecessary bounces.

_images/refl-settings.png

Translucency can be tuned by reducing the refraction depth to the minimum needed for your assets to work as intended. For example, to see through a windshield and out the back of a car, you may need 3 refraction bounces if the glass is single-sided, and 5 if it is double-sided.

_images/trans-settings.png

Additional settings like fractional cutout opacity and rough translucency sampling can add cost to your render time; use them only when needed.

For lighting, we recommend sticking to a single sample, and only using indirect diffuse lighting when needed.

Sometimes you’ll want different render settings between Portal and Volume contexts; for example, you may need indirect lighting in portal mode but not in volume mode. You can use ActionGraph and the Session State to adjust specific render settings with Python using any logic you wish, such as the context mode or even a variant selection.

Here is an example of how we set render settings via Python in an ActionGraph Script node.

def compute(db: og.Database):
    # `db` and `og` are provided by the Script node.
    import carb
    # Matte object and shadow catcher settings used when compositing over passthrough.
    carb.settings.get_settings().set("/rtx/matteObject/enabled", True)
    carb.settings.get_settings().set("/rtx/post/matteObject/enableShadowCatcher", 1)
    # Disable the post histogram and zero out the background alpha.
    carb.settings.get_settings().set("/rtx/post/histogram/enabled", 0)
    carb.settings.get_settings().set("/rtx/post/backgroundZeroAlpha/enabled", 1)
    # Treat translucency as opacity and enable fractional cutout opacity.
    carb.settings.get_settings().set("/rtx/material/translucencyAsOpacity", 1)
    carb.settings.get_settings().set("/rtx/raytracing/fractionalCutoutOpacity", 1)
    return True
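
The example above applies a fixed set of values. If you want different settings per context, the same pattern can branch on whatever value drives your graph. Below is a minimal sketch; the context string and how it reaches the script (for example, via a Session State attribute feeding the node) are assumptions for illustration, and the setting path comes from the formatting steps below.

import carb

def apply_context_settings(context: str):
    # `context` is a hypothetical value, e.g. driven by Session State or a variant.
    settings = carb.settings.get_settings()
    if context == "portal":
        # Portal mode: allow more indirect diffuse bounces for the full environment.
        settings.set("/rtx/indirectDiffuse/maxBounces", 2)
    else:
        # Tabletop/volume mode: keep indirect lighting as cheap as possible.
        settings.set("/rtx/indirectDiffuse/maxBounces", 1)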

Here’s how to format these settings:

  1. Identify the setting from the list of RTX Real-Time parameters.

  2. Grab the setting, for example: /rtx/indirectDiffuse/maxBounces

  3. Referencing the value type from the above guide, format it into a carb settings call like so:

    carb.settings.get_settings().set("/rtx/indirectDiffuse/maxBounces", 2)
    

Network and Internet Performance#

CloudXR requires the following ports to be open:

  • UDP: 47998, 47999, 48000, 48002, 48005, 48008, 48012

  • TCP: 48010

Additionally, the following addresses are used to authenticate with GDN, so ensure no firewall rules exist that might block the following URLs:

  • Captcha Endpoint: api-prod.nvidia.com/gfn-als-app/api/captcha/img

  • Nonce Endpoint: api-prod.nvidia.com/gfn-als-app/api/als/nonce

  • Account Linking: als.geforcenow.com

The sample client includes a heads-up display (HUD) that shows the real-time performance of your client and network. In the Purse configurator, you can enable and configure it from the UI here:

_images/hud.png _images/hud2.png _images/hud3.png _images/hud4.png

Then you can use the ellipsis menu to see the HUD:

_images/hud5.png

The top graph gives an indication of the client and server performance. The Pose rate is the client’s main-loop frequency, and should be very close to the device’s refresh rate (90-100 Hz, depending on region). The Frame submission to RK (RealityKit) is the number of rendered frames per second actually being displayed on the client, which can change depending on network performance, client performance, and server render performance.

The second graph indicates various round-trip latency values. The pose-to-frame receive (blue) includes network delays and some server and client processing time. The green and red lines (frame dequeue and frame submit) must be relatively flat for smooth frame presentation. If you’re experiencing large latency spikes in these graphs, the first step is to troubleshoot network issues. If you’re working on a busy network, such as in an office, consider a dedicated router set to a free channel on the 5 GHz spectrum with at least 40 MHz of bandwidth. In an optimal environment, you should be able to achieve an average latency of 60 ms to 70 ms.

When developing your own experience, consider keeping the HUD accessible from your UI so you can troubleshoot performance issues during development.

If the HUD isn’t readily available in your custom app, you can still get a lot of important information by filtering the Xcode logs. When the user experience isn’t quite what it should be, several stats in these logs can help you identify which part of the pipeline is having issues. Use the following log filters to isolate each of the following stats (the example log snippets below are from running the Purse sample in the simulator over Ethernet, not Wi-Fi):

Pose rate#

rr - [pose rate

This stat measures the rate at which poses are sent from the Vision Pro client to the Omniverse render server. In the simulator this should be 60 Hz; on the Vision Pro device it should be at, or at least close to, 90 Hz. If this rate is significantly below these targets, your sample app is likely doing a lot of work on the main thread, which can limit the rate at which RealityKit allows your application to submit new poses and update the scene itself.

RR - [Pose Rate Logger]: 60.00 Hz - [mean 16.67 ms, min 15.82 ms, max 17.66 ms, diff 1.83 ms]
RR - [Pose Rate Logger]: 60.00 Hz - [mean 16.67 ms, min 16.13 ms, max 17.13 ms, diff 1.00 ms]
RR - [Pose Rate Logger]: 60.00 Hz - [mean 16.67 ms, min 16.32 ms, max 17.09 ms, diff 0.77 ms]

Complete Frame#

rr - [complete frame

This stat measures the rate at which frames are coming back from the Omniverse render server to the Apple Vision Pro client. Ideally this should match the frame rate seen in your stage inside Omniverse. In the simulator this can often be as high as 60 Hz for simple scenes, or 30 Hz for more complex scenes. On the device it is possible to hit 90 Hz for trivial scenes, but for production scenes your target should be 45 Hz to balance visual fidelity with the best user experience. Dipping down to 30 Hz intermittently is OK, but ideally you’ll want to be hitting 45 Hz. If there is a large discrepancy between the Complete Frame rate and the frame rate observed inside Omniverse, frames may be getting dropped due to poor network bandwidth.

RR - [Complete Frame]: 60.00 Hz - [mean 16.67 ms, min 16.36 ms, max 17.01 ms, diff 0.66 ms]
RR - [Complete Frame]: 60.00 Hz - [mean 16.67 ms, min 16.33 ms, max 17.11 ms, diff 0.78 ms]
RR - [Complete Frame]: 59.49 Hz - [mean 16.81 ms, min 16.21 ms, max 33.22 ms, diff 17.01 ms]

Pose to Present Latency#

rr - [pose to present

This stat tells you the latency from initial pose capture to presenting RealityKit with the updated frame information. On a local network you would hope to be under 100 ms, although on GDN this will likely be closer to 200 ms. In general, the lower the pose-to-present latency, the fewer streaking artifacts you’ll notice between overlapping geometry at different depths. Client-side pose prediction does a good job of minimizing the negative effects of long round-trip latencies most of the time, but there are still times when its prediction isn’t correct, which is why minimizing the round-trip time is always desirable when possible.

RR - [Pose to Present Latency]: [mean 71.59 ms, min 70.35 ms, max 73.41 ms, diff 3.06 ms]
RR - [Pose to Present Latency]: [mean 71.72 ms, min 71.09 ms, max 73.35 ms, diff 2.26 ms]
RR - [Pose to Present Latency]: [mean 71.64 ms, min 70.11 ms, max 73.12 ms, diff 3.01 ms]
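
If you want to trend these numbers over a session rather than eyeball individual lines, the log format above is regular enough to parse. Below is a small convenience sketch (not part of the SDK) that pulls the rate and mean values out of a saved copy of the filtered Xcode log:

import re

# Matches the RR log lines shown above, e.g.
#   RR - [Pose Rate Logger]: 60.00 Hz - [mean 16.67 ms, min 15.82 ms, max 17.66 ms, diff 1.83 ms]
#   RR - [Pose to Present Latency]: [mean 71.59 ms, min 70.35 ms, max 73.41 ms, diff 3.06 ms]
LINE = re.compile(
    r"RR - \[(?P<stat>[^\]]+)\]:"      # stat name, e.g. "Complete Frame"
    r"(?:\s+(?P<hz>[\d.]+) Hz -)?"     # optional rate (latency lines have none)
    r"\s+\[mean (?P<mean>[\d.]+) ms"   # mean value in milliseconds
)

def parse_rr_stats(log_text: str) -> dict:
    stats = {}
    for match in LINE.finditer(log_text):
        hz = match.group("hz")
        stats.setdefault(match.group("stat"), []).append({
            "hz": float(hz) if hz else None,
            "mean_ms": float(match.group("mean")),
        })
    return stats

# Example: stats = parse_rr_stats(open("rr_log.txt").read())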

visionOS Client Performance#

If the client is not running smoothly (gestures are not smooth, the 3D model appears jittery), it could be due to SwiftUI updates, which can affect client performance.

The primary indicator of a client performance issue is the pose rate being lower than the device’s refresh rate (90 Hz/100 Hz, depending on location).

We have typically observed this when performing frequent SwiftUI updates. UI updates should only happen on user-interaction events, and any UI update logic should be quiescent when the user is just viewing the content.

Lighting & Content Best Practices#

HDRI#

Digital Twins in Volume mode look best when they have accurate lighting for the space they’re in. Apple does not provide the environment map generated by the Vision Pro for use in Omniverse. Sébastien Lagarde’s definitive guide “An Artist-Friendly Workflow for Panoramic HDRI” (SIGGRAPH 2016) covers capturing high-quality HDRIs. Alternatively, a device like a Ricoh Theta Z1 with its HDRI Capture Plugin can let you quickly capture HDRIs for your experience.

We recommend tuning the intensity and color of your HDRI while wearing the device to ensure the asset looks believable.

If you view an asset in volume mode in a space that is not similar to the HDRI provided, you may create a high amount of contrast between the passthrough video feed and the HDRI; for example, if the floor of your HDRI is black but you’re viewing the asset in a room with a white floor. This contrast can appear as an edge, or matte line, around your asset due to the nature of alpha compositing. We recommend having a wide variety of generic HDRIs with varying amounts of contrast so that you can achieve great visuals in a variety of environments.
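
If you keep a library of HDRIs, swapping the one in use is a small edit to the dome light. Here is a minimal sketch using the USD Python API; the stage path, dome light path, and texture file name are hypothetical examples:

from pxr import Usd, UsdLux

# Hypothetical paths; adjust to your stage and lighting setup.
stage = Usd.Stage.Open("purse.usd")
dome = UsdLux.DomeLight.Get(stage, "/World/Lights/DomeLight")
dome.CreateTextureFileAttr().Set("./hdri/studio_neutral.exr")
# Tune intensity (and color) to what looks believable in the device.
dome.CreateIntensityAttr().Set(1.0)
stage.GetRootLayer().Save()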

Glass#

Compositing glass in a mixed reality context is a difficult computer graphics problem. We send depth, alpha, and color to the device and attempt to let glass maintain its reflective and refractive properties while maintaining its sense of depth. Multiple layers of glass compound these difficulties, and can result in streaking artifacts on the device. We want you to take full advantage of ray tracing’s ability to refract complex objects, so here are some suggestions for optimizing glass assets.

Thick Glass#

Thick glass denotes geometry that has multiple sides: a front pane and a rear pane. This is how glass is physically modeled in the real world, and so CAD data will commonly represent glass surfaces this way. Thick glass involves a ray entering, passing through, and exiting the geometry to reveal what is behind the glass: three bounces in total. Keep this in mind when addressing performance; the maximum number of refraction bounces in your render settings can greatly alter the performance of your stage, with fewer bounces increasing performance.

_images/thick-glass.png

If you have thick glass but not enough bounces, the glass may render black; or, if there aren’t enough bounces for the ray to exit the glass, you’ll see an unexpected amount of distortion, as in the image below:

_images/thick-glass-bounces.png
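
If you hit either of these artifacts, raise the refraction depth in the render settings to the minimum value that resolves them. As with the other render settings, this can be scripted. A minimal sketch, assuming the setting path is /rtx/translucency/maxRefractionBounces (verify the exact path against the RTX Real-Time parameter list, as described earlier):

import carb

# Assumed setting path; confirm it against the RTX Real-Time parameter list.
# 5 bounces covers the double-sided (thick) glass case described above.
carb.settings.get_settings().set("/rtx/translucency/maxRefractionBounces", 5)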

Thin Walled#

One way to reduce the rendering cost of thick and thin glass objects is to enable Thin Walled; on the OmniGlass material it’s in the Refraction section.

_images/thin-walled-settings.png

This treats the prim as single-sided with no refractive properties: rays pass straight through. For thick glass this removes some interesting internal refraction, but it gains performance and requires fewer refraction bounces.

_images/thin-walled-glass.png

For single-sided glass, this setting is required, as the renderer otherwise expects another surface for the ray to pass through. If left unchecked, you’ll see an unexpected amount of distortion in the glass.

_images/glass-distorted.png
_images/glass-correct.png
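
If you prefer to flip this option programmatically rather than in the Property panel, here is a hedged sketch using the USD Python API. The stage path, shader path, and the input name thin_walled are assumptions for illustration; check the shader’s inputs on your material for the exact name.

from pxr import Usd, UsdShade, Sdf

# Hypothetical paths and input name; adjust to your stage and material.
stage = Usd.Stage.Open("purse.usd")
shader = UsdShade.Shader.Get(stage, "/World/Looks/OmniGlass/Shader")
shader.CreateInput("thin_walled", Sdf.ValueTypeNames.Bool).Set(True)
stage.GetRootLayer().Save()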

OmniGlass#

While any PBR material can be used to create glass, we recommend the OmniGlass material when possible, as it contains the minimum number of material layers required for glass.

_images/omni-glass-menu.png

Placement#

When we view an object in tabletop mode on the device, we create an anchor on the device first, and then spawn the view from the stream. In Omniverse the coordinate systems are aligned at the origin, meaning (0, 0, 0) of the anchor is (0, 0, 0) of Omniverse. Any object you want to view in tabletop mode should be placed in the stage at (0, 0, 0), either by hand or programmatically, in order for it to appear in the correct location.
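
For the programmatic case, here is a minimal sketch using the USD Python API; the stage and prim paths are hypothetical examples:

from pxr import Usd, UsdGeom, Gf

# Hypothetical paths; adjust to your stage and asset.
stage = Usd.Stage.Open("purse.usd")
prim = stage.GetPrimAtPath("/World/Asset")
# Place the asset at the origin so it lines up with the tabletop anchor.
UsdGeom.XformCommonAPI(prim).SetTranslate(Gf.Vec3d(0.0, 0.0, 0.0))
stage.GetRootLayer().Save()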

USD Structure#

USD composition arcs like variants, payloads, and references can help you make your application more extensible and repeatable. Client code that relies on USD prims having specific names and locations can easily become too tightly coupled to the data. In the Purse USD stage there are some examples of this:

Context#

We have everything we need for the portal and tabletop context modes added to the stage as references:

_images/stage-portal.png

We then have a variantSet on the context prim that toggles visibility between the context_portal and context_tabletop prims. These prims are references to other USD stages, allowing us to compose all the different assets for each context separately from our root stage. What’s useful about this structure is that we can swap out the reference behind either context_portal or context_tabletop, and our scene logic will still function when changing between tabletop and portal mode.
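
Switching that variant from Python follows the standard USD API. A minimal sketch; the prim path, variant set name, and selection names are assumptions matching the structure described above, so check your stage for the exact names:

from pxr import Usd

# Hypothetical prim path and names; adjust to your stage.
stage = Usd.Stage.Open("purse.usd")
context_prim = stage.GetPrimAtPath("/World/context")
context_prim.GetVariantSets().GetVariantSet("context").SetVariantSelection("portal")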

Animation Playback#

If playing back animation in Omniverse, add these flags and enable this extension to get the best possible playback experience.

--/app/player/useFixedTimeStepping=False --/app/player/useFastMode=True --enable omni.anim.window.timeline

Capturing High Quality Recordings#

There are multiple ways to capture what the Vision Pro user is seeing:

  • They can start a Recording from Control Center.

  • They can AirPlay/Share View their view to a Mac and capture that screen.

  • They can use Reality Composer Pro to perform a Developer Capture.

Developer Capture is the only way to get a high-quality 4K capture of your experience, and there are some unique complexities of our SDK that require setup to perform these captures. Developer Capture turns off foveation on the device locally, but we must also turn off foveation on the server in Omniverse. In our sample, we present this in the connect window under Resolution Preset as 4K Recording Mode:

_images/4k-recording.png

Enabling this tells Omniverse to turn off foveation in the XR settings:

_images/foveation.png

Note that you may not want to ship this mode in your final app, but only use it or enable it when capturing recordings for promotional purposes.

When we turn off foveation, we introduce latency in three places:

  1. We increase the number of full-resolution pixels Omniverse must create.

  2. We increase the load on the encoder and decoder of the stream.

  3. We increase the bandwidth used to deliver those pixels to the device.

This latency reveals itself as a visible delay in the view updating in the portal and volume modes, usually with black pixels shown outside the frustum as the camera attempts to update. To mitigate it, we recommend tethering the Vision Pro via the Developer Strap and a USB-C connection to a MacBook sharing its Ethernet connection. Here is a basic diagram of how we set up for our shoots:

_images/shoot-setup.png

Note that this setup will not remove all latency; you will need to consider the speed of your head and body movements, and plan for some amount of cropping of your frame in post-production. Keeping your assets in a “title safe” region of your view can help. Be prepared for the Vision Pro to need a break between captures, as it is prone to overheating depending on the ambient temperature.