Telemetry and Logging
NVIDIA-managed services automatically deploy a ServiceMonitor resource to allow metrics collection by a Kubernetes Prometheus stack. To disable this, set monitoring.enabled=false in the Helm values file for each deployment.
By default, the ServiceMonitor is deployed in the same namespace as the service. To deploy it in a different namespace, set the monitoring.prometheusNamespace value.
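As a rough illustration of how the dotted keys above map onto a Helm values file, the following sketch expands them into the nested structure Helm expects and prints it as JSON. Because JSON is valid YAML, the output should be usable as a values override file. The keys monitoring.enabled and monitoring.prometheusNamespace come from the text above; the helper name nest and the namespace "observability" are placeholders chosen for this example, not values from the product.

```python
import json


def nest(dotted: dict) -> dict:
    """Expand dotted keys (as written in the docs) into a nested values mapping."""
    root: dict = {}
    for path, value in dotted.items():
        node = root
        *parents, leaf = path.split(".")
        for key in parents:
            # Create intermediate mappings on demand.
            node = node.setdefault(key, {})
        node[leaf] = value
    return root


# Keys are taken from the text; "observability" is a placeholder namespace.
overrides = nest({
    "monitoring.enabled": True,
    "monitoring.prometheusNamespace": "observability",
})

# JSON is a subset of YAML, so this can be saved and passed as a values file.
print(json.dumps(overrides, indent=2))
```

Setting "monitoring.enabled" to False in the same mapping would produce the override that disables the ServiceMonitor for that deployment.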
For alternative monitoring setups, Omniverse services export Prometheus- and OpenTelemetry-compatible metrics. Prometheus metrics are available at the /metrics endpoint, which exposes default rate, error, and duration metrics.
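For setups that read the /metrics endpoint directly, the sketch below parses a small sample of the Prometheus text exposition format and derives an error ratio from a request counter. It is a minimal parser for illustration only. The metric name http_requests_total and its status label are assumptions made for this example; the actual metric names exposed by a given service are not listed here and should be checked against its /metrics output.

```python
import re

# Hypothetical sample; real metric names depend on the service.
SAMPLE = """\
# HELP http_requests_total Total requests (hypothetical metric name).
# TYPE http_requests_total counter
http_requests_total{status="200"} 97
http_requests_total{status="500"} 3
"""

# metric_name, optional {labels}, then the sample value.
LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse(text):
    """Return a list of (name, labels, value) tuples, skipping comment lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and # HELP / # TYPE metadata
        match = LINE.match(line)
        if match is None:
            continue  # ignore anything this simple parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def error_ratio(samples, metric="http_requests_total"):
    """Fraction of requests whose (assumed) status label starts with 5."""
    total = sum(v for n, _, v in samples if n == metric)
    errors = sum(
        v for n, labels, v in samples
        if n == metric and labels.get("status", "").startswith("5")
    )
    return errors / total if total else 0.0


samples = parse(SAMPLE)
print(error_ratio(samples))
```

To run this against a live service, the sample text could be replaced with the body of an HTTP GET to the service's /metrics URL, for example via urllib.request.urlopen; the host and port are deployment-specific.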