Client & Server Overview#
This section is a general overview of some of the basic client and server functionality we've built for the Purse Configurator. Use it as a reference for building your own application and to establish best practices for maintaining client and server session synchronization.
Client Session Functions#
Clients use CloudXRSession to configure, initiate, pause, resume, and end a session with the server. In addition, it maintains the current session state for the client.
Session life cycle#
The following flowchart shows the life cycle of a streaming session.
As the above diagram illustrates, there are a few operations clients can perform to manage a streaming session:
- configure()
The session needs to be configured before it can begin. The client first defines the configuration (e.g., a local server IP or a GDN zone with an authentication method). It then provides the configuration to CloudXRSession in the constructor, and can optionally update it later by calling configure() with a different configuration. Any prior sessions must be disconnected before calling configure().
- connect()
The client calls connect() to start the session. This first performs any authentication steps that may be needed for GDN, and then initiates the connection. It is asynchronous and returns when the session reaches the connecting state.
- disconnect()
This ends and deletes a session, as opposed to the pause() described below.
- pause()
This disconnects a session without deleting it (e.g., to keep using the same GDN host when resuming). Note that GDN will time out after 180 seconds of being paused if the client does not reconnect.
- resume()
This reconnects to a previously paused session.
Omniverse ActionGraph Logic#
Now that you're familiar with the client, the next step is to see how those commands are intercepted by Omniverse and how you use those messages to update the data in your stage. To accomplish this, you'll use ActionGraph. You can learn more about ActionGraph here.
Remote Scene State with Messaging and Events#
Next, you'll send messages back to the client using ActionGraph with the latentLoadingComplete graph. It's important for both the client and the server to be aware of each other's current state. You can use ActionGraph to let your client know when data is loaded or a variant change is complete by sending a message back to the client. You manage this with a series of events:
ack Events#
In Omniverse development, acknowledgment (ack) events are essential for indicating the completion of actions like changing a variant. You can incorporate a sendMessagebusEvent node at the end of your graph to dispatch an ack event. For instance, in the styleController graph, this event might carry the variantSetName handle. The ackController graph then listens for these events and uses a ScriptNode to convert them into a message.
def setup(db: og.Database):
    pass

def cleanup(db: og.Database):
    pass

def compute(db: og.Database):
    import json

    message_dict = {
        "Type": "switchVariantComplete",
        "variantSetName": db.inputs.variantSetName,
        "variantName": db.inputs.variantName
    }
    db.outputs.message = json.dumps(message_dict)
    return True  # Indicates the function executed successfully
We then send the result of this message composition to the latentLoadingComplete graph via the omni.graph.action.queue_loading_message event.
Loading Complete Messages#
The latentLoadingComplete graph contains all the logic for managing the messaging of these events. First, you need a location to accumulate these messages as they are generated by the stage. You do this with a Write node that takes messages sent to omni.graph.action.queue_loading_message and adds them to an array.
As this array is filled with messages, we use another portion of the graph to determine if loading is complete with a ScriptNode:
import omni.usd
from omni.graph.action_core import get_interface

def setup(db: og.Database):
    pass

def cleanup(db: og.Database):
    pass

def compute(db: og.Database):
    # Non-zero counts indicate the stage is still loading files
    _, files_loaded, total_files = omni.usd.get_context().get_stage_loading_status()
    db.outputs.is_loading = files_loaded or total_files
    return True
This graph detects if any messages have been queued, checks to see if the loading associated with those events is complete, and if so, a Send Custom Event is triggered for send_queued_messages. This event is what finally sends your message back to the client under the event name omni.kit.cloudxr.send_message, containing the message that you generated with ackController, queued for sending with queue_loading_message, and executed with send_queued_messages.
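Outside of the graph, the same delivery idea can be sketched from Python by pushing an event onto Kit's message bus. The snippet below is only an illustration of that concept, not the project's actual graph logic; the sample JSON string and the "message" payload key are assumptions.

import carb.events
import omni.kit.app

# Hypothetical message of the kind the ackController graph composes
queued_message = '{"Type": "switchVariantComplete", "variantSetName": "color", "variantName": "Beige"}'

# Push the message onto Kit's message bus under the event name used above;
# the payload layout here is an assumption made for this illustration.
bus = omni.kit.app.get_app().get_message_bus_event_stream()
event_type = carb.events.type_from_string("omni.kit.cloudxr.send_message")
bus.push(event_type, payload={"message": queued_message})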
Loading Complete Events#
Messages are sent back to the client, but events are what you send to other ActionGraph logic. Alongside the logic for generating, queuing, and sending messages is the same logic for events. This way, you could trigger a different portion of ActionGraph when a specific piece of content has finished loading. You can find this logic next to the messaging logic in the latentLoadingComplete graph.
Portal Camera Setup#
Let's cover how we use cameras in your scene to establish predetermined locations a user can teleport their view to in the client. You can see in our stage that we've set up a couple of example cameras, Front and Front_Left_Quarter, under the Cameras scope.
Note
To approximate the user’s field of view inside the headset, the camera’s focal length is set to 70 and the aperture to 100.
Position your cameras to frame the desired area of the stage.
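For reference, authoring the values from the note above on a camera prim can be done with standard UsdGeom calls, as in this minimal sketch; the prim path is hypothetical, and we assume "aperture" refers to the horizontal aperture.

import omni.usd
from pxr import UsdGeom

stage = omni.usd.get_context().get_stage()

# Hypothetical prim path; substitute the path of your own camera under the Cameras scope
camera = UsdGeom.Camera(stage.GetPrimAtPath("/World/Cameras/Front"))
camera.GetFocalLengthAttr().Set(70.0)
camera.GetHorizontalApertureAttr().Set(100.0)  # assumes the aperture value is the horizontal aperture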
As a final step, we set our cameras on the "ground" of the environment we're presenting. On the client, we can automatically infer the distance from the user's head to the ground in their physical space. We then apply this same offset to the cameras in the USD stage. The effect is that the ground in the USD stage appears at the same height as the ground in the user's physical space, lending additional realism to the content inside the portal.
You can author this “ground” height as an opinion on your cameras so that you can artistically place them and later apply a “ground” offset with an opinion on a reference or layer. Additionally, if you do not want to align the ground between the USD Stage and the user’s physical ground, you can ignore this setup.
Because we cannot predict the head pose a user will have upon connecting, it’s important to remember that the “default” state of your stage should be one in which the camera is located on the ground of your stage. Once the head pose is detected from an active connection, we compose that coordinate system on top of the camera in the stage. For example, if your camera is positioned at (0, 15, 20) and the initial head pose upon client connection is at (0, 100, 200), the asset will appear to be at (0, 115, 220) relative to the user’s head.
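For the translation-only case, that composition is simply a vector addition; the numbers below mirror the example above.

from pxr import Gf

camera_position = Gf.Vec3d(0, 15, 20)      # camera authored in the stage
initial_head_pose = Gf.Vec3d(0, 100, 200)  # head pose reported when the client connects

# Composing the head pose on top of the stage camera gives the apparent asset position
print(camera_position + initial_head_pose)  # (0, 115, 220)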
In ActionGraph, we change cameras with the cameraController graph. First, we listen for a camera change event from the client with a script node. The client sends the full path of the camera prim in the message. If needed, we can pass a y_offset from the client.
...
# Set the CameraPath output if it exists in the parsed message
if "cameraPath" in state:
    db.outputs.cameraPath = state["cameraPath"]

# Initialize y_offset output to 0
db.outputs.y_offset = 0

# Set y_offset output if it exists in the parsed message
if "y_offset" in state:
    db.outputs.y_offset = state["y_offset"]
...
Next, we'll use another script node to get the camera transforms and apply them to our XR teleport function, profile.teleport.
# Import necessary modules for XR functionality
from omni.kit.xr.core import XRCore, XRUtils

# Check if XR is enabled, return if it is not
if not XRCore.get_singleton().is_xr_enabled():
    return True

# Get the current XR profile
profile = XRCore.get_singleton().get_current_profile()

# Get the world transform matrix of the camera
camera_mat = XRUtils.get_singleton().get_world_transform_matrix(db.inputs.cameraPath)

# Extract the translation component of the camera matrix
t = camera_mat.ExtractTranslation()

# Adjust the Y component of the translation by the input offset
t[1] += db.inputs.y_offset

# Apply the updated translation back to the camera matrix
camera_mat.SetTranslateOnly(t)

# Log the new camera matrix position for teleportation
print(f"XR Teleport to: {camera_mat}")

# Teleport the XR profile to the updated camera matrix position
profile.teleport(camera_mat)

return True  # Return True to indicate successful execution
Variant Change#
To change variants, we use graph logic to listen for incoming variantSet names. For example, when changing the environment, we have a node that catches any messages carrying the "environment" token. We then route this to a script node with some Python inside to update the variant on the specified prim.
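At its core, such a script node reduces to a standard OpenUSD variant selection. This is a minimal sketch rather than the project's exact node; the prim path and variant values are hypothetical.

import omni.usd

# Hypothetical values of the kind delivered in the client message
prim_path = "/World/Purse"
variant_set_name = "environment"
variant_name = "Studio"

stage = omni.usd.get_context().get_stage()
prim = stage.GetPrimAtPath(prim_path)

# Select the requested variant on the prim's variantSet
prim.GetVariantSets().GetVariantSet(variant_set_name).SetVariantSelection(variant_name)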
Variants are powerful composition systems native to OpenUSD. You can learn more about creating and leveraging variants here.
When creating immersive experiences, it's important to remember the performance implications of variants: loading and unloading large amounts of data from disk can cause your experience to slow down. Consider keeping most of your variant composition loaded inside the stage and leveraging mechanisms like visibility to turn prims on and off, instead of composition mechanisms like prepending payloads and references.
With ActionGraph, it’s easy to create conditional graphs that can trigger multiple variants. For example, if we wanted the lighting variant to change when we change the environment variant, we could add additional Switch On Token nodes.
Toggle Viewing Mode (Portal/Volume)#
We change between Portal and Volume with the contextModeController graph.
Omniverse has two output modes for XR: AR and VR. AR mode enables specific settings for compositing an asset in a mixed reality context, whereas VR mode disables these compositing settings for a fully immersive render. When changing between Portal and Volume, we toggle between VR and AR mode.
First we listen for a setMode message from the client, and then we branch depending on whether we send portal or tabletop. We then change the variant on our context prim, /World/Background/context. This variantSet hides and unhides prims that are specific to these modes: for portals we show the environments; for volumes we hide the environments and show the HDRI. We then use a script node to turn ar_mode on and off:
def setup(db: og.Database):
    pass

def cleanup(db: og.Database):
    pass

def compute(db: og.Database):
    import carb
    from omni.kit.xr.core import XRCore, XRUtils

    profile = XRCore.get_singleton().get_current_profile()
    if hasattr(profile, "set_ar_mode"):
        print(f"Portal setting ar mode to: {db.inputs.ar_mode}")
        profile.set_ar_mode(db.inputs.ar_mode)
    else:
        carb.log_error("XRProfile missing function set_ar_mode")
    return True
Next, we use a script node to change some render settings based on the context mode, and finally send an ack event to signal the script has finished. We detail the render settings in the Performance section.
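From Python, render settings are typically adjusted through carb settings, as in the hedged sketch below; the setting keys and values shown here are placeholders, not the ones the configurator actually uses (those are listed in the Performance section).

import carb.settings

settings = carb.settings.get_settings()

# Placeholder keys and values for illustration only; see the Performance
# section for the settings that are actually changed per context mode.
settings.set("/rtx/post/aa/op", 3)
settings.set("/rtx/reflections/enabled", False)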
Additional Examples#
Similarly to how we execute variant changes, we can also listen for and execute other graph logic, such as animations, visibility changes, and lighting adjustments. The Graphs folder contains examples including AnimationController, PurseVisibilityController, and LightSliderController.
Client Messaging with Swift#
Let’s take a look at how the client communicates with the above ActionGraph using Swift.
State Management#
On the client side, state is managed by sending and receiving messages to and from Omniverse. Specifically, OmniverseStateManager defines all possible states and whether the server notifies their completion.
All possible state variables should be defined in OmniverseStateManager using OmniverseMessageProtocol. See PurseColor in OmniverseClientEvents for an example of how a state is defined and used in the state manager. State synchronization between the server and client happens automatically after each state update and can be forced via OmniverseClientEvents.sync() or OmniverseClientEvents.resync(). Resync forces a client update of all state variables to ensure the server and client match.
Portal/Volume mode is managed by ViewingModeSystem, which in turn uses OmniverseStateManager to switch between the portal and volume states. While changing between modes, we use serverNotifiesCompletion: true to hide the UI while we wait for Omniverse to load and unload large amounts of scene data. The portal is created using the RealityKit portal API, and the viewing mode state is synchronized with OV using the state manager.
Portal Camera Setup#
The client can switch between cameras via CameraProtocol and OmniverseStateManager.send(protocol). We set some specific information about our scene cameras in OmniverseClientEvents:
...
public struct CameraClientInputEvent: EncodableInputEvent {
    // Note that we're setting an explicit prim path for cameras
    static let cameraPrefix = "/World/cameraViews/RIG_Main/RIG_Cameras/"
    static let setActiveCameraEventType = "set_active_camera"

    public let message: Dictionary<String, String>
    public let type = Self.setActiveCameraEventType

    public init(_ camera: any CameraProtocol) {
        message = ["cameraPath": "\(Self.cameraPrefix)\(camera.rawValue)"]
    }
}

// Cameras need a `description` since their rawValue may not be pretty
public protocol CameraProtocol: CustomStringConvertible, OmniverseMessageProtocol,
                                RawRepresentable where RawValue: StringProtocol { }

extension CameraProtocol {
    public var encodable: any EncodableInputEvent { CameraClientInputEvent(self) }
}

public enum ExteriorCamera: String, CaseIterable, CameraProtocol {
    case front = "Front"
    case frontLeftQuarter = "Front_Left_Quarter"

    public var description: String { rawValue.replacingOccurrences(of: "_", with: " ") }
}
...
We then create a list of these cameras in CameraSheet:
...
func cameraButton(_ camera: any CameraProtocol, reset: Bool) -> some View {
    Button {
        appModel.stateManager.send(camera)
        dismiss()
    } label: {
        Text(camera.description)
            .font(.callout)
    }
}
...
Variant Change#
To change variants, we first set up an event in OmniverseClientEvents:
public struct ColorClientInputEvent: EncodableInputEvent {
    public let message: Dictionary<String, String>
    public let type = setVariantEventType

    public init(_ color: PurseColor) {
        message = [
            // These are the messages we send to Omniverse
            "variantSetName": "color",
            "variantName": color.rawValue
        ]
    }
}

public enum PurseColor: String, CaseIterable, OmniverseMessageProtocol {
    case Beige
    case Black
    case BlackEmboss
    case Orange
    case Tan
    case White

    public var description: String {
        switch self {
        case .Beige:
            "Beige Leather"
        case .Black:
            "Black Leather"
        case .BlackEmboss:
            "Black Emboss Leather"
        case .Orange:
            "Orange Leather"
        case .Tan:
            "Tan Leather"
        case .White:
            "White Leather"
        }
    }

    public var encodable: any EncodableInputEvent { ColorClientInputEvent(self) }
}
We'll also want the state of this variant in OmniverseStateManager; this is also where we can set a default variant, in this case Beige:
private var stateDict: StateDictionary = [
    "color": .init(PurseColor.Beige, serverNotifiesCompletion: false),
Finally, we add a Button in our ConfigureView SwiftUI view:
func colorAsset(_ item: PurseColor, size: CGFloat = UIConstants.assetWidth) -> some View {
    Button {
        appModel.stateManager["color"] = item
    } label: {
        VStack {
            // Item image
            Image(String(item.rawValue))
                .resizable()
                .aspectRatio(contentMode: .fit)
                .font(.system(size: 128, weight: .medium))
                .cornerRadius(UIConstants.margin)
                .frame(width: size)
            // Item name
            HStack {
                Text(String(item.description))
                    .font(UIConstants.itemFont)
                Spacer()
            }
        }.frame(width: size)
    }
    .buttonStyle(CustomButtonStyle())
}
Gesture Change#
To see how we're using native visionOS gestures, open ImmersiveView.swift. For scale, we use the standard RealityKit Entity methods and communicate directly with Omniverse Kit to modify scale and rotation. In the future, we'll provide deeper access to how these gestures communicate with Omniverse.
.gesture(
    SimultaneousGesture(
        RotateGesture3D(constrainedToAxis: .z, minimumAngleDelta: minimumRotation)
            .onChanged { value in
                guard currentGesture != .scaling, viewModel.currentViewingMode == .tabletop
                else {
                    return
                }
                currentGesture = .rotating
                // Rotation direction is the Z axis direction (+/-)
                let radians = value.rotation.angle.radians * -sign(value.rotation.axis.z)
                rotate(to: Float(radians))
            }
            .onEnded { value in
                guard currentGesture != .scaling, viewModel.currentViewingMode == .tabletop
                else {
                    return
                }
                // Rotation direction is the Z axis direction (+/-)
                let radians = value.rotation.angle.radians * -sign(value.rotation.axis.z)
                rotate(to: Float(radians))
                lastRotation = 0
                // Turn off currentGesture after a short delay;
                // otherwise we might get a spurious scale gesture
                DispatchQueue.main.asyncAfter(deadline: .now() + 0.2) {
                    currentGesture = .none
                }
            },
        MagnifyGesture(minimumScaleDelta: minimumScale)
            .onChanged { value in
                guard currentGesture != .rotating, viewModel.currentViewingMode == .tabletop
                else {
                    return
                }
                currentGesture = .scaling
                scale(by: Float(value.magnification))
            }
            .onEnded() { value in
                guard
                    currentGesture != .rotating,
                    viewModel.currentViewingMode == .tabletop
                else {
                    return
                }
                scale(by: Float(value.magnification))
                lastScale = sessionEntity.scale.x
                // Turn off currentGesture after a short delay;
                // otherwise we might get a spurious rotation gesture
                DispatchQueue.main.asyncAfter(deadline: .now() + 0.2) {
                    currentGesture = .none
                }
            }
    )
)
Example Swift Code for Placement Tool - ImmersiveView.swift#
Xcode Project Configuration for the Placement Tool#
Navigate to your project's Configurator target settings by going to Configurator > Info > Targets > Configurator.
Verify the following keys exist. These keys are required for AR placement functionality. They inform the user why the app needs access to world sensing and hand tracking capabilities.
- NSWorldSensingUsageDescription: Needed to track model position in the world
- NSHandsTrackingUsageDescription: Required to accurately place models, and streaming hand tracking data to a remote server
The ImmersiveView.swift file manages AR content, integrates the PlacementManager, and configures the RealityView to handle rendering, interaction, and dynamic placement of UI elements.
PlacementManager State#
The PlacementManager handles the logic for positioning, moving, and anchoring virtual objects within the AR environment, making sure they interact correctly with the user's surroundings and inputs:
@State private var placementManager = PlacementManager()
RealityView#
The RealityView in this file is configured to work with the PlacementManager, ensuring the placement puck appears and tracks the user’s head movements.
var body: some View {
    RealityView { content, attachments in
        placementManager.placeable = viewModel
    } update: { content, attachments in
        if let session = appModel.session {
            placementManager.update(session: session, content: content, attachments: attachments)
        }
    } attachments: {
        if viewModel.isPlacing {
            placementManager.attachments()
        }
    }
    .placing(with: placementManager, sceneEntity: sceneEntity, placeable: viewModel)
}
UI and User Interaction#
The ImmersiveView.swift file manages UI elements, including the placement puck, and handles user interactions related to the placement process.
Connection to the Rest of the System#
ImmersiveView.swift interacts with ViewModel.swift, ConfigureView.swift, ViewSelector.swift, and the PlacementManager, ensuring the Placement Tool functions correctly in the AR environment. The majority of the placement code is located within the Placement folder in the Configurator project.