Limitations, Restrictions & Known issues#
Kubernetes & Helm#
Streaming Sessions#
The number of active streams is determined by the number of GPU worker instances. To avoid performance issues, limit each GPU worker instance to a single stream.
No Multi-Namespace Installation on Cluster#
The Helm charts cannot be installed in multiple Kubernetes namespaces simultaneously on a single cluster. Currently, only one namespace is supported.
Performance#
Stream Latency#
To achieve low-latency streaming, make sure your API endpoints and microservices running in Kubernetes, as well as the browser-based clients, are located in the same geographical region.
Streaming Session Startup Times#
An application generally takes 30 to 40 seconds to start streaming. While the stream is initializing, requests that create or retrieve the stream session's parameters return a 202 status code.
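As a minimal sketch of handling this behavior (the host, path, and response fields below are assumptions for illustration, not the documented API surface), a client can simply poll until the service stops returning 202:

```python
import time

import requests

# Hypothetical endpoint; substitute the actual streaming session service URL and path.
SESSION_URL = "https://streaming.example.com/streaming/stream/{session_id}"


def wait_for_stream(session_id: str, timeout_s: float = 120, interval_s: float = 5) -> dict:
    """Poll the stream session until it is ready (no longer 202) or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        resp = requests.get(SESSION_URL.format(session_id=session_id), timeout=10)
        if resp.status_code == 202:
            # Still initializing; streams typically become ready within 30 to 40 seconds.
            time.sleep(interval_s)
            continue
        resp.raise_for_status()
        return resp.json()  # Stream session parameters (routes, ports, and so on).
    raise TimeoutError(f"Stream session {session_id} was not ready within {timeout_s}s")
```

Clients should also budget for the shader-caching delay described below when a stream is started for the first time.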
Shader Caching#
On first use of the stack, it can take 15 minutes or more for the stream to become available. This delay occurs because the shader cache, hosted in memcached, is empty and must be populated during the initial run.
Similar delays can occur whenever new shaders are first encountered.
Dynamic Load Balancer Creation and Streaming Sessions#
Omniverse Kit App Streaming supports various options for routing an Omniverse Kit App stream through an NLB, ranging from a single NLB shared by all streams to one NLB per stream. The time to create an NLB and its DNS entry varies by platform; on most platforms it takes only a few seconds.
On AWS, creating an NLB and its DNS entry can take several minutes. To mitigate this delay, NVIDIA optionally supports creating an NLB pool in conjunction with an NLB management service.
Supported Browsers#
Chromium-based browsers, such as Google Chrome version 122 and above, are supported.
Security#
Network#
The Network Load Balancer (NLB) must be able to communicate with the GPU nodes over both UDP and TCP to forward the streaming traffic.
The GPU nodes do not need to be exposed externally and can reside in a private subnet.
Egress traffic from the Kubernetes cluster, particularly from the nodes running the streaming session service, must be able to reach the NLB. During stream creation, the streaming session service performs an end-to-end check to verify that the route through the NLB is properly initialized.
If this check is a security concern, it can be disabled as a follow-up step.
For more information, review Flux’s security posture.
For AWS only (optional)
An IAM role and policy with permissions to create and manage AWS Network Load Balancer listeners and target groups are required.
An example policy is available in the provided resource documents.
If the AWS NLB Manager (optional) is used for NLB pooling, listeners are added to preconfigured NLBs. By default, one TCP listener on port 443 and one UDP listener on port 80 are added, which effectively configures one NLB per stream (see the sketch following this list).
This can be tuned to support a maximum of 25 streams per NLB.
The port ranges can be modified, provided that the TCP and UDP ranges do not overlap.
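As an illustrative sketch only (the load balancer ARN, VPC ID, ports, and naming are placeholders, and this is not the NLB Manager's actual implementation), the following boto3 calls show the kind of per-stream listener and target group operations the IAM policy needs to allow:

```python
import boto3

elbv2 = boto3.client("elbv2")

# Placeholder values; a real deployment derives these from its NLB pool configuration.
NLB_ARN = "arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/net/example/abc"
VPC_ID = "vpc-0123456789abcdef0"


def add_stream_listeners(stream_id: str, tcp_port: int, udp_port: int) -> None:
    """Create one TCP and one UDP target group and listener for a single stream."""
    for protocol, port in (("TCP", tcp_port), ("UDP", udp_port)):
        target_group = elbv2.create_target_group(
            Name=f"stream-{stream_id}-{protocol.lower()}"[:32],  # Target group names are capped at 32 characters.
            Protocol=protocol,
            Port=port,
            VpcId=VPC_ID,
            TargetType="ip",
        )["TargetGroups"][0]
        elbv2.create_listener(
            LoadBalancerArn=NLB_ARN,
            Protocol=protocol,
            Port=port,
            DefaultActions=[{
                "Type": "forward",
                "TargetGroupArn": target_group["TargetGroupArn"],
            }],
        )
```

When more than one stream shares an NLB, each stream must be assigned its own TCP and UDP ports drawn from the configured, non-overlapping port ranges.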
Kubernetes#
When necessary, NVIDIA-created services are assigned a Kubernetes service account and the required roles. They never require elevated privileges or cluster roles.
The RMCP service manages deployed session instances through Flux.
It requires only the Flux Helm controller and Source controller.
Encryption in Transit#
WebRTC Media Stream#
The WebRTC Media Stream is encrypted using SRTP (Secure Real-Time Transport Protocol) over UDP, as specified by the WebRTC standard. Messaging, such as remote input, uses SCTP (Stream Control Transmission Protocol) data channels, which WebRTC encrypts with DTLS over UDP.
Kit App Streaming API#
The Kit App Streaming APIs can be set up and secured with TLS. For more details, please refer to this document.
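As a brief, hedged client-side example (the endpoint URL, route, and CA bundle location are placeholders for illustration), calling an API that has been secured with TLS amounts to verifying the server certificate against the issuing CA:

```python
import requests

# Placeholders: substitute the actual API endpoint and the CA bundle that issued
# the service certificate (for example, a private or cluster-internal CA).
API_BASE = "https://kit-app-streaming.example.com"
CA_BUNDLE = "/etc/ssl/certs/internal-ca.pem"


def list_applications() -> list:
    """Call a hypothetical API route over HTTPS, verifying the certificate against CA_BUNDLE."""
    resp = requests.get(f"{API_BASE}/cfg/apps", verify=CA_BUNDLE, timeout=10)
    resp.raise_for_status()
    return resp.json()
```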
Secret Management#
Although this document provides examples of creating Kubernetes secrets via the command line, solutions such as External Secrets Operator, HashiCorp Vault, or AWS Secrets Manager can also be used; they are beyond the scope of this documentation.
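For completeness, here is a minimal sketch of the same idea using the official Kubernetes Python client instead of the command line (the namespace, secret name, and keys are purely illustrative):

```python
from kubernetes import client, config


def create_example_secret(namespace: str = "omni-streaming") -> None:
    """Create a simple Opaque secret; the name, keys, and namespace are illustrative only."""
    config.load_kube_config()  # Use config.load_incluster_config() when running in-cluster.
    secret = client.V1Secret(
        metadata=client.V1ObjectMeta(name="example-registry-credentials"),
        string_data={"username": "example-user", "password": "example-password"},
        type="Opaque",
    )
    client.CoreV1Api().create_namespaced_secret(namespace=namespace, body=secret)
```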