Deploying Omniverse Farm on headless systems

../_images/app_farm_banner_baremetal.png

1. Introduction

This guide will go through the installation of Omniverse Farm in order to be able to run it headlessly.

This document will work for the deployment across a few nodes but for anything at a larger scale we’d recommend using a solution like Ansible to help with the orchestration of the nodes. If baremetal and/or VMs are not a hard requirement we’d recommend running OV Farm in a Kubernetes environment as it allows for better control and scalability.

This deployment is similar to a deployment done via the Launcher and will have similar limitations where, by default, scale and redundancy are a limiting factor in this deployment.

At the end of this guide there is some information on how to add some scalability and persistence by deploying a SQL database and Redis instance.

2. Installation

A. Queue installation

To automate deployment of Farm Queue on Linux, it may be convenient to install it in headless manner.

Prerequisites include Ubuntu Server 20.04 or greater, with an Internet connection in order to download the necessary additional software and packages.

Note

Other Linux distributions should also be compatible with Omniverse Farm Queue, although only Ubuntu 20.04 is officially supported for production use.

  1. Start by installing the required software dependencies:

    $ sudo apt-get install -y --no-install-recommends \
            curl \
            libatomic1 \
            libxi6 \
            libxrandr2 \
            libxt6 \
            libegl1 \
            libglu1-mesa \
            libgomp1 \
            libsm6 \
            unzip
    
  2. Upload the farm_queue_install.sh script to the server running Farm Queue, and place in the /opt/ove folder after creating it:

    farm_queue_install.sh
     1#!/bin/bash
     2
     3#
     4# Note: Specific package versions can be retrieved from the Omniverse Launcher.
     5#
     6
     7# Install the Omniverse Farm Queue package, containing the core Queue capabilities:
     8mkdir -p ov-farm-queue
     9cd ov-farm-queue
    10pwd
    11curl https://d4i3qtqj3r0z5.cloudfront.net/farm-queue-launcher@104.0.0%2B104.0.0.104.6a22d2ce.teamcity.linux-x86_64.release.zip > farm-queue-launcher.zip
    12unzip farm-queue-launcher.zip
    13rm farm-queue-launcher.zip
    14
    15# Install the Kit SDK package, containing the set of features and extensions shared by Omniverse applications:
    16mkdir kit
    17cd kit
    18pwd
    19curl https://d4i3qtqj3r0z5.cloudfront.net/kit-sdk-launcher@104.1%2Brelease.309.2a880d42.tc.linux-x86_64.release.zip > kit-sdk-launcher.zip
    20unzip kit-sdk-launcher.zip
    21rm kit-sdk-launcher.zip
    22
    23cd ..
    24
    25# Create a boilerplate launch script for the Queue:
    26cat << 'EOF' > queue.sh
    27#!/bin/bash
    28
    29BASEDIR=$(dirname "$0")
    30exec $BASEDIR/kit/kit $BASEDIR/apps/omni.farm.queue.headless.kit \
    31    --ext-folder $BASEDIR/exts-farm-queue \
    32    --/exts/omni.services.farm.management.tasks/dbs/task-persistence/connection_string=sqlite:///$BASEDIR//task-management.db
    33EOF
    34
    35chmod +x queue.sh
    
  3. Change the permission of the farm_queue_install.sh script in order to make it executable:

    $ chmod +x farm_queue_install.sh
    
  4. Run the script from within the /opt/ove folder as a non-root user:

    $ ./farm_queue_install.sh
    
  5. Once the files are downloaded and extracted, you will have a folder named /opt/ove/ov-farm-queue.

  6. Ensure that the /opt/ove/ov-farm-queue and all the files within are owned by a non-root user, then launch the Queue:

    $ ./queue.sh &
    

To confirm that the installation completed successfully, you may attempt to reach the API endpoint of the Queue responsible for providing a health status about the service, by emitting a curl request and validating it returns a response of “OK”:

$ curl http://localhost:8222/status

A successful response should contain information similar to the following, illustrating that all required Omniverse Extension were loaded, that configuration options we successfully applied, and that the Queue is ready to receive and dispatch tasks:

[user@machine ov-farm-queue] $ curl http://localhost:8222/status
"OK"

B. Agent installation

To automate deployment of Farm Agents on Linux, and scale the compute capabilities to multiple machines, it may be convenient to install Agents in headless manner.

Prerequisites include Ubuntu Server 20.04 or greater, with an Internet connection in order to download the necessary additional software and packages.

Note

Other Linux distributions should also be compatible with Omniverse Farm Agent, although only Ubuntu 20.04 is officially supported for production use.

  1. Start by installing the required software dependencies:

    $ sudo apt-get install -y --no-install-recommends \
            curl \
            libatomic1 \
            libxi6 \
            libxrandr2 \
            libxt6 \
            libegl1 \
            libglu1-mesa \
            libgomp1 \
            libsm6 \
            unzip
    
  2. Upload the farm_agent_install.sh script to the server running Farm Agent, and place in the /opt/ove folder after creating it:

    farm_agent_install.sh
     1#!/bin/bash
     2
     3#
     4# Note: Specific package versions can be retrieved from the Omniverse Launcher.
     5#
     6
     7# Install the Omniverse Farm Agent package, containing the core Agent capabilities:
     8mkdir -p ov-farm-agent
     9cd ov-farm-agent
    10pwd
    11curl https://d4i3qtqj3r0z5.cloudfront.net/farm-agent-launcher%40104.0.0%2B104.0.0.134.6896994f.teamcity.linux-x86_64.release.zip > farm-agent-launcher.zip
    12unzip farm-agent-launcher.zip
    13rm farm-agent-launcher.zip
    14
    15# Install the Kit SDK package, containing the set of features and extensions shared by Omniverse applications:
    16mkdir kit
    17cd kit
    18pwd
    19curl https://d4i3qtqj3r0z5.cloudfront.net/kit-sdk-launcher@104.1%2Brelease.309.2a880d42.tc.linux-x86_64.release.zip > kit-sdk-launcher.zip
    20unzip kit-sdk-launcher.zip
    21rm kit-sdk-launcher.zip
    22
    23# Install the Multiview Batch package:
    24cd ..
    25mkdir -p jobs/multiview-batch
    26cd jobs/multiview-batch
    27pwd
    28curl https://d4i3qtqj3r0z5.cloudfront.net/farm-job-multiview-batch-render%40104.1.0%2Bmain.69.a2e383e2.tc.linux-x86_64.release.zip > farm-job-multiview-batch.zip
    29unzip farm-job-multiview-batch.zip
    30rm farm-job-multiview-batch.zip
    31
    32# Install the "create-render" package, containing the job definition for the rendering task:
    33cd ../..
    34mkdir -p jobs/create-render
    35cd jobs/create-render
    36pwd
    37curl https://d4i3qtqj3r0z5.cloudfront.net/farm-job-create-render%40104.1.0%2Bmain.70.99816786.tc.linux-x86_64.release.zip > farm-job-create-render.zip
    38unzip farm-job-create-render.zip
    39rm farm-job-create-render.zip
    40
    41cd ../..
    42
    43# Create a boilerplate launch script for the Agent:
    44cat << 'EOF' > agent.sh
    45#!/bin/bash
    46
    47BASEDIR=$(dirname "$0")
    48exec $BASEDIR/kit/kit $BASEDIR/apps/omni.farm.agent.headless.kit \
    49    --ext-folder $BASEDIR/exts-farm-agent \
    50    --/exts/omni.services.farm.agent.operator/job_store_args/job_directories/0=$BASEDIR/jobs/* \
    51    --/exts/omni.services.farm.agent.controller/manager_host=http://<QUEUE IP>:<QUEUE PORT> \
    52    --/exts/omni.services.farm.agent.operator/manager_host=http://<QUEUE IP>:<QUEUE PORT>
    53EOF
    54
    55chmod +x agent.sh
    
  3. Change the permission of the farm_agent_install.sh script in order to make it executable:

    $ chmod +x farm_agent_install.sh
    
  4. Run the script from within the /opt/ove folder as a non-root user:

    $ ./farm_agent_install.sh
    
  5. Once the files are downloaded and extracted, you will have a folder named /opt/ove/ov-farm-agent.

  6. Ensure that the /opt/ove/ov-farm-agent and all the files within are owned by a non-root user.

  7. Configure the Farm Agent Controller and Operator addresses to supply the Farm Queue Server address to agent.sh:

    agent.sh
    # [...]
    
    --/exts/omni.services.farm.agent.controller/manager_host=http://<QUEUE IP>:<QUEUE PORT>
    --/exts/omni.services.farm.agent.operator/manager_host=http://<QUEUE IP>:<QUEUE PORT>
    
  8. Once the Controller and Operator addresses configured, launch the Agent:

    $ ./agent.sh &
    

Output Log

When executing the Agent, the console will display an output similar to the following to indicate it is running successfully:

[user@machine ov-farm-agent] $ ./agent.sh
[Info] [carb] Logging to file: /home/user/.nvidia-omniverse/logs/Kit/omni.farm.agent.headless/102.1/kit_20220502_132007.log
[0.346s] [ext: omni.kit.pipapi-0.0.0] startup
[0.360s] [ext: omni.services.pip_archive-0.3.0] startup
. . .
[1.464s] [ext: omni.farm.agent.headless-102.1.0] startup
[1.574s] app ready

In case of error due to an Agent is running without proper acceleration, an output similar to the following will be displayed:

[1.964s] [ext: omni.farm.agent.headless-102.1.0] startup
2022-04-29 20:57:11 [2,049ms] [Error] [omni.services.farm.facilities.agent.capacity.managers.base] Failed to load capacities for omni.services.farm.facilities.agent.capacity.GPU: NVML Shared Library Not Found
[2.075s] app ready

See the Linux Troubleshooting for any installation issues.

3. Scaling

A. SQL database

By default OV Farm, when installed in this configuration, will use a SQLite DB. This will suffer from performance and scalability as many agents come and request work. It also prevents multiple instances of the various services to be active.

It is possible to instead use a different SQL database such as MariaDB to run remotely or on the same host and provide better performance.

The user account will need to have create permissions to create the DB and the tables.

To change the database connection string to use a MariaDB based SQL instance, Add the follwing value in the omni.queue.headless.kit file:

omni.queue.headless.kit
[settings]
# Avoids shader chache compilation at startup/RTX requirements.
exts."omni.kit.renderer.core".compatibilityMode = true
exts."omni.kit.async_engine".event_loop_windows = "ProactorEventLoop"
exts."omni.services.transport.server.http".allow_port_range = false
exts."omni.services.transport.server.http".port = 8222
exts."omni.services.farm.management.tasks".dbs.task-persistence.connection_string="mysql://<username>:<password>@<host>:<port>/<db_name>"

And remove the following line from the queue.sh script:

queue.sh
#!/bin/bash

BASEDIR=$(dirname "$0")
exec $BASEDIR/kit/kit $BASEDIR/apps/omni.farm.queue.headless.kit \
    --ext-folder $BASEDIR/exts-farm-queue \
    --/exts/omni.services.farm.management.tasks/dbs/task-persistence/connection_string=sqlite:///$BASEDIR//task-management.db

B. Redis

B.1 Agent services

By default the agent store is in memory meaning that only a single instance of the agent service is supported. It is possible to swap out the in memory agent store with a Redis backed instance.

To change the agent’s backend, add the following values in the omni.queue.headless.kit file, installed as part of the installations steps above:

Be sure to replace the <host> and <port> with the host and port of the redis instance.

The connection_string format documentation is available here

omni.queue.headless.kit
[settings]
# Avoids shader chache compilation at startup/RTX requirements.
exts."omni.kit.renderer.core".compatibilityMode = true
exts."omni.kit.async_engine".event_loop_windows = "ProactorEventLoop"
exts."omni.services.transport.server.http".allow_port_range = false
exts."omni.services.transport.server.http".port = 8222
exts."omni.services.farm.management.agents".manager_class = "omni.services.farm.management.agents.managers.redis.RedisAgentManager"

[settings.exts."omni.services.farm.management.agents".manager_args]
connection_string="redis://<host>:<port>"