Customizing Endpoint Resolution

The sample APIs provide an option for developers to customize how the streaming endpoints are resolved.

Latest Container Release: 1.8.0

By default the resolver class is Generic; the sample also includes an AWS target group specific resolver class (AWS).

Both can be found within the container at the following path: /usr/local/lib/python3.11/site-packages/nv/svc/streaming/_csp.py

The class and arguments to load at startup are managed by the following settings:

  • backend.csp.cls

  • backend.csp.args

These can be overridden via the following settings in the Helm values file (see the example after this list):

  • streaming.serviceConfig.backend_csp_cls

  • streaming.serviceConfig.backend_csp_args
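
For example, pointing the service at the built-in Generic class might look like the following in the values file. This is only an illustration: the module:Class path format matches the NodePortResolver example later on this page, and the exact defaults in your chart may differ:

    streaming:
       serviceConfig:
          # -- CSP customization manager
          backend_csp_cls: "nv.svc.streaming._csp:Generic"
          # -- CSP customization manager arguments
          backend_csp_args: {}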

Below is the Generic class defined in _csp.py. It inherits from the _CSP class. The class always gets passed a configured Kubernetes client that uses the service account defined in the Helm chart of the streaming session manager. An example of the client being used can be seen in the _fetch_resources function inside the _CSP class.

class Generic(_CSP):
    """Generic and default CSP manager."""

    def __init__(
        self,
        k8s_client: KubernetesClient,
        enable_wss: bool = False,
        hostname_annotation_key: str = "external-dns.alpha.kubernetes.io/hostname",
        service_annotations_location: str = "streamingKit.service.annotations",
        base_domain: str = ""
    ) -> Any:
        """Initialize."""
        super().__init__(k8s_client)
        self._enable_wss = enable_wss
        self._service_annotations_location = service_annotations_location
        self._hostname_annotation_key = hostname_annotation_key
        self._base_domain = base_domain

    async def on_create(self, profile_data: Dict, settings: Dict) -> Dict:
        """Process CSP customisations on stream creation."""
        if not self._enable_wss:
            return {}

        values = profile_data.get("settings", {}).get("values", {})
        keys = self._service_annotations_location.split(".")
        data = values
        for key in keys:
            data = data.get(key, {})

        prefix = self._generate_random_dns_prefix()
        data[self._hostname_annotation_key] = f"{prefix}.{self._base_domain}"
        settings[self._service_annotations_location] = data

        logging.debug(f"Generated settings {settings}")

        return settings

    async def resolve_endpoints(self, session_id: str) -> Tuple[Dict, bool]:
        """Resolve the endpoint to connect to for a given session."""
        services = await self._fetch_resources("service", selectors={"sessionId": session_id})
        routes, status = await self._extract_routes(services)
        return routes, status

    def _extract_ports(self, service_spec: Dict) -> Tuple[Dict, bool]:
        """Extract the port information from the service specification.

        Args:
            service_spec (Dict): The service specification dictionary.

        Returns:
            Dict: A dictionary containing the route information.
        """
        ports = service_spec.spec.ports

        routes = []
        for port in ports:
            routes.append(
                {
                    "source_port": port.port,
                    "description": port.name,
                    "protocol": port.protocol,
                    "destination_port": port.node_port
                }
            )

        status = True if routes else False

        logging.debug(f"Extracted port mappings {routes}, readiness status {status}")
        return {"routes": routes}, status

    async def _extract_routes(self, services):
        routes = {}
        statuses = []

        for service in services:
            ports, port_ready = self._extract_ports(service)

            entries = []
            lb_ready = False
            if self._enable_wss:
                hostname, lb_ready = await self._extract_hostname(service)
                entries.append(hostname)
            else:
                ips, lb_ready = await self._extract_lb_ips(service)
                entries.extend(ips)

            statuses.append(all([port_ready, lb_ready]))
            if not lb_ready:
                continue

            for entry in entries:
                routes[entry] = ports

        status = all(statuses) if statuses else False
        logging.debug(f"Extracted routes {routes}, readiness status {status}")
        return routes, status

    async def _extract_lb_ips(self, service) -> Tuple[List, bool]:
        ips = []
        hostname = None

        ingress = service.status.load_balancer.ingress
        if not ingress:
            return ips, False

        for entry in ingress:
            if entry.ip:
                ips.append(entry.ip)
            hostname = entry.hostname

        if not ips and hostname:
            logging.debug("No IPs were found attached to the service, trying to resolve hostname")
            ips = await self._resolve_hostname(hostname)

        status = True if ips else False

        logging.debug(f"Extracted IPs {ips}, readiness status {status}")
        return ips, status

    async def _extract_hostname(self, service) -> Tuple[str, bool]:
        annotations = service.metadata.annotations
        hostname = None

        try:
            hostname = annotations[self._hostname_annotation_key]
        except KeyError:
            logging.error(f"Hostname field `{self._hostname_annotation_key}` not found in annotations")
            raise HostnameNotFoundError("Unable to find hostname")

        ready = False
        try:
            ips = await self._resolve_hostname(hostname)
            if ips:
                ready = True
        except aiodns.error.DNSError as exc:
            logging.warning(f"Unable to resolve {hostname}: {exc}")

        return hostname, ready
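
Based on _extract_ports and _extract_routes above, the value returned by resolve_endpoints maps each endpoint (an IP address, or a hostname when WSS is enabled) to a routes payload, together with a readiness flag. The values below are purely illustrative:

    # Illustrative only: shape of the (routes, ready) tuple from resolve_endpoints()
    routes = {
        "203.0.113.10": {
            "routes": [
                {
                    "source_port": 443,        # port.port in Generic (node_port in the NodePortResolver example below)
                    "description": "signaling",
                    "protocol": "TCP",
                    "destination_port": 31445  # port.node_port in Generic
                }
            ]
        }
    }
    ready = True  # False makes the API answer the client with a 202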
The Generic class inherits from the _CSP base class, which wraps the configured Kubernetes client and provides the shared helpers (_fetch_resources, _resolve_hostname, _generate_random_dns_prefix, and _tcp_port_ready) used by the resolvers:
class _CSP(object):
    """Custom CSP behaviour base class."""

    def __init__(self, k8s_client: KubernetesClient) -> None:
        """Initialize."""
        self._k8s_client = k8s_client
        self._default_namespace = "omni-streaming"

    def _generate_random_dns_prefix(self, length=6):
        letters = string.ascii_lowercase
        return ''.join(random.choice(letters) for _ in range(length))

    async def on_create(self, profile_data: dict, settings: dict) -> Dict:
        """Process CSP customisations on stream creation."""
        return {}

    async def on_delete(self, data: dict) -> None:
        """Process CSP customisations on stream termination."""
        pass

    async def resolve_endpoints(self, session_id: str) -> Tuple[Dict, bool]:
        """Resolve the endpoint to connect to for a given session."""
        return {}, False

    async def _resolve_hostname(self, hostname):
        ips = []

        resolver = aiodns.DNSResolver()
        try:
            ipv4_records = await resolver.query(hostname, 'A')
        except aiodns.error.DNSError as exc:
            logging.warning(f"Failed to resolve DNS for {hostname}. The domain might not have propagated yet if this is a new stream. {exc}")
            return []

        ips.extend([record.host for record in ipv4_records])
        return ips

    async def _fetch_resources(self, resource_type, selectors=None, args: Dict = None):
        api_class = self._get_k8s_api_class(resource_type)

        args = args or {}
        async with self._k8s_client as api_client:
            api_instance = api_class(api_client.api_http)
            func_name = args.pop('func_name', f"list_namespaced_{resource_type}")
            func = getattr(api_instance, func_name)

            if selectors:
                args['label_selector'] = ",".join(f"{k}={v}" for k, v in selectors.items())

            res = await func(
                namespace=self._default_namespace,
                **args
            )
            return res['items'] if isinstance(res, dict) else res.items

    def _get_k8s_api_class(self, resource_type):
        resource_to_api_class = {
            'pod': CoreV1Api,
            'service': CoreV1Api,
            'deployment': AppsV1Api,
            'targetgroupbinding': CustomObjectsApi,
        }
        return resource_to_api_class.get(resource_type, CoreV1Api)

    async def _tcp_port_ready(self, host: str, port: int) -> bool:
        try:
            logging.debug(f"Testing stream readiness on {host}:{port}")
            reader, writer = await asyncio.wait_for(asyncio.open_connection(host, port), timeout=2)
            writer.close()
            await writer.wait_closed()
            logging.debug(f"Stream connection on {host}:{port} successful")
            return True
        except asyncio.TimeoutError:
            logging.info(f"Failed to connect to {host}:{port}. Connection not ready.")
        except Exception as exc:
            logging.error(f"Failed to connect to {host}:{port}: {exc}")

        return False
The AWS class is the AWS target group specific resolver. On stream creation it requests listener and target group allocations from an NLB management service, and it resolves endpoints from the TargetGroupBinding custom resources labelled with the session ID:
class AWS(_CSP):
    """AWS customisations."""

    def __init__(
        self,
        k8s_client: KubernetesClient,
        nlb_mgmt_svc: str = "",
        enable_wss: bool = False
    ) -> Any:
        """Initialize AWS class."""
        super().__init__(k8s_client=k8s_client)
        self._nlb_mgmt_svc_url = nlb_mgmt_svc

        self._port_locations = {
            "media": "streamingKit.service.mediaPort",
            "signaling": "streamingKit.service.signalingPort"
        }

        self._targetgroup_arn_locations = {
            "media": "streamingKit.aws.targetgroups.media",
            "signaling": "streamingKit.aws.targetgroups.signaling",
        }

        self._listeners_arn_locations = {
            "media": "streamingKit.aws.listeners.media",
            "signaling": "streamingKit.aws.listeners.signaling",
        }

        self._nlb_location = "streamingKit.aws.nlb"
        self._alias_location = "streamingKit.aws.alias"
        self._enable_wss = enable_wss

    def _lookup_nested_dict(self, nested_dict, key_string):
        keys = key_string.split('.')
        value = nested_dict
        for key in keys:
            value = value[key]
        return value

    async def on_create(self, profile_data: dict, settings: dict) -> Dict:
        """Process AWS customisations on stream creation."""
        values = profile_data["settings"]["values"]

        ports = {}
        for port_name, location in self._port_locations.items():
            ports[port_name] = self._lookup_nested_dict(values, location)

        allocations = []
        default_protocol = "TLS" if self._enable_wss else "TCP"

        for name in ports.keys():
            allocations.append(
                {
                    "name": name,
                    "protocol": "UDP" if name == "media" else default_protocol
                }
            )

        url = f"{self._nlb_mgmt_svc_url}/allocation"
        async with aiohttp.ClientSession() as session:
            async with session.post(url, json={"allocations": allocations}) as resp:
                if resp.status not in [200]:
                    detail = await resp.text()
                    error_msg = f"Failed request to {url}: {resp.status}, {detail}"
                    logging.error(error_msg)
                    raise _APIError(status_code=resp.status, details=error_msg)

                arns = await resp.json()

        settings[self._nlb_location] = arns["loadbalancer"]["dnsName"]
        settings[self._alias_location] = arns["loadbalancer"].get("alias", "")

        for key, location in self._targetgroup_arn_locations.items():
            listener_arn = arns['allocations'][key]['listenerArn']
            listener_port = arns['allocations'][key]['listenerPort']
            listener_protocol = arns['allocations'][key]['listenerProtocol']
            settings[location] = arns["allocations"][key]["targetGroupArn"]
            settings[self._listeners_arn_locations[key]] = f"{listener_arn}@{listener_port}@{listener_protocol}@{key}"

        logging.debug(f"Generated settings {settings}")
        return settings

    async def resolve_endpoints(self, session_id: str) -> Tuple[Dict, bool]:
        """Resolve the endpoint to connect to for a given session."""

        args = {
            'func_name': 'list_namespaced_custom_object',
            'group': 'elbv2.k8s.aws',
            'version': 'v1beta1',
            'plural': 'targetgroupbindings',
        }

        tgbs = await self._fetch_resources(
            "targetgroupbinding",
            selectors={"sessionId": session_id},
            args=args
        )

        routes, status = await self._extract_routes(tgbs)

        return routes, status

    async def _extract_routes(self, tgbs: List) -> Tuple[Dict, bool]:
        routes = []
        statuses = []

        hostnames = []

        for tgb in tgbs:
            annotations = tgb["metadata"]["annotations"]
            listener = annotations.get('nvidia.com/omniverse.listener', '')
            nlb_hostname = annotations.get('nvidia.com/omniverse.nlb', '')
            alias = annotations.get('nvidia.com/omniverse.alias', '')

            hostname = alias if self._enable_wss else nlb_hostname

            if not listener or not hostname:
                statuses.append(False)
                logging.error(f"Invalid targetgroupbinding was found. No listener annotation was found {tgb}")
                continue

            listener_arn, port, protocol, name = listener.split("@")
            hostnames.append(hostname)
            routes.append({
                "source_port": int(port),
                "description": name,
                "protocol": protocol,
                "destination_port": -1
            })
            status = await self._tcp_port_ready(hostname, port) if protocol.lower() == "tcp" else True
            statuses.append(status)

        status = all(statuses) if statuses else False

        assembled_routes = {}

        ips = []
        if not self._enable_wss:
            for hostname in hostnames:
                ips.extend(await self._resolve_hostname(hostname))
            hostnames = ips

        for hostname in hostnames:
            assembled_routes[hostname] = {"routes": routes}

        logging.debug(f"Extracted routes {assembled_routes}, readiness status {status}")
        return assembled_routes, status
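
To use the AWS resolver instead of the default Generic one, the same two Helm settings can point at the AWS class. The snippet below is only a sketch; the nlb_mgmt_svc URL is a placeholder for your NLB management service endpoint:

    streaming:
       serviceConfig:
          # -- CSP customization manager
          backend_csp_cls: "nv.svc.streaming._csp:AWS"
          # -- CSP customization manager arguments
          backend_csp_args:
             nlb_mgmt_svc: "http://nlb-mgmt-svc"
             enable_wss: false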


Creating a custom CSP resolver

Because the class and its arguments are loaded dynamically, a developer can provide their own class and arguments, as long as the Python file containing the class is inside the container and part of a Python package (i.e., an __init__.py file is present).

There are three methods that allow custom behavior (a minimal skeleton follows this list):

  • on_create

    • Allows customizing the chart values that are passed to the rmcp service before the session gets instantiated.

    • Takes as arguments the chart values that come from resolving the profile; this gives access to the resolved values and to the settings (the overrides specified within a version).

    • Returns an updated settings dictionary that is used to update the chart values.

  • on_delete

    • Allows additional actions to be taken when a session is terminated

    • Takes as an argument a dictionary which currently only contains the session_id

    • Returns nothing.

  • resolve_endpoints

    • Determines the routes passed back to the web client, and whether the session is ready to be connected to.

    • Takes as an argument the session_id.

    • Returns a tuple of a dictionary with routes and a boolean indicating if the session is ready.

      • If the boolean is false, a 202 will be returned to the client.
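
Putting the three methods together, a do-nothing resolver skeleton could look like the sketch below (the class and module names are illustrative; the NodePortResolver in the next section shows a working implementation):

    from typing import Dict, Tuple

    from nv.svc.streaming._csp import _CSP


    class MyResolver(_CSP):
        """Sketch of a custom resolver; signatures match the _CSP base class."""

        async def on_create(self, profile_data: dict, settings: dict) -> Dict:
            # Return an updated settings dictionary used to update the chart values.
            return settings

        async def on_delete(self, data: dict) -> None:
            # data currently only contains the session_id; nothing to clean up here.
            return None

        async def resolve_endpoints(self, session_id: str) -> Tuple[Dict, bool]:
            # Return the routes for the web client and whether the session is ready;
            # returning False causes a 202 to be sent to the client.
            return {}, False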

Example custom endpoint resolver

Here is an example of creating a custom endpoint resolver and deploying it to an Omniverse Kit App Streaming instance.

Example Python class that extracts a streaming session's host IPs and node ports
from typing import Tuple, Dict, List

from nv.svc.streaming._csp import Generic

class NodePortResolver(Generic):

    async def resolve_endpoints(self, session_id: str) -> Tuple[Dict, bool]:
        """Resolve the endpoint to connect to for a given session."""
        services = await self._fetch_resources("service", selectors={"sessionId": session_id})
        pods = await self._fetch_resources("pod", selectors={"sessionId": session_id})
        routes, status = await self._extract_routes(services, pods)
        return routes, status

    def _extract_ports(self, service_spec: Dict) -> Tuple[Dict, bool]:
        """Extract the port information from the service specification.

        Args:
            service_spec (Dict): The service specification dictionary.

        Returns:
            Dict: A dictionary containing the route information.
        """
        ports = service_spec.spec.ports

        routes = []
        for port in ports:
            routes.append(
                {
                    "source_port": port.node_port,
                    "description": port.name,
                    "protocol": port.protocol,
                    "destination_port": 0  # this field is no longer used.
                }
            )

        status = True if routes else False

        return {"routes": routes}, status

    async def _extract_routes(self, services, pods):
        routes = {}
        statuses = []

        for service in services:
            ports, port_ready = self._extract_ports(service)
            ips, ip_ready = await self._extract_pod_node_ips(pods)
            statuses.append(port_ready and ip_ready)

            if not ip_ready:
                continue

            for ip in ips:
                routes[ip] = ports

        status = all(statuses) if statuses else False
        return routes, status

    async def _extract_pod_node_ips(self, pods) -> Tuple[List[str], bool]:
        """Extract the host IPs of the nodes hosting the pods of the stream."""
        node_ips = set()
        for pod in pods:  # _fetch_resources already returns the list of pod objects
            if pod.status.host_ip:
                node_ips.add(pod.status.host_ip)

        return list(node_ips), bool(node_ips)
  1. Save the class above into a file named node_port.py.

  2. Create a dockerfile that copies node_port.py into the streaming session manager container:

Note

The code snippets below include a variable for $APP_VERSION. This variable can be replaced manually with the latest release version number, or set using an environment variable (e.g., export APP_VERSION=1.8.0). The latest container versions are listed at the top of this page for quick reference.

# Pass the release version in with --build-arg (or edit the tag below manually)
ARG APP_VERSION
FROM nvcr.io/nvidia/omniverse/kit-appstreaming-manager:$APP_VERSION

# Create the resolvers directory
RUN mkdir -p /resolvers

# Copy node_port.py into /resolvers
COPY node_port.py /resolvers/

# Create an empty __init__.py to make it a package
RUN touch /resolvers/__init__.py
  3. Build and push the container, replacing the registry, name and version:

    docker build --build-arg APP_VERSION=$APP_VERSION -t {my-registry}/{kit-appstreaming-manager-node-port}:{1.0.0} .
    docker push {my-registry}/{kit-appstreaming-manager-node-port}:{1.0.0}
    
  4. Create or update the values file to use the new container and instruct it to use the new resolver class (this is a partial values file with just the required fields changed):

    streaming:
       image:
          # -- Image repository.
          repository: "my-registry/kit-appstreaming-manager-node-port"
          # -- Image pull policy.
          pullPolicy: Always
          # -- Image tag.
          tag: 1.0.0
    
       serviceConfig:
          # -- CSP customization manager
          backend_csp_cls: "resolvers.node_port:NodePortResolver"
    
          # -- CSP Customization manager arguments
          backend_csp_args: {}
    
  5. Apply the Helm chart, for example:
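
For example, assuming the streaming session manager was installed as a Helm release (release name, chart reference, and namespace are placeholders for your deployment):

    helm upgrade --install {my-release} {chart-reference} \
        --namespace {namespace} \
        -f values.yaml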

At this point, the NodePortResolver will be loaded and called to resolve the routes based on NodePort and HostIP.