Deploying Omniverse Farm on headless systems
1. Introduction
This guide walks through installing Omniverse Farm so that it can run headlessly.
This document covers deployment across a few nodes. For anything at a larger scale, we recommend using a solution like Ansible to help orchestrate the nodes. If bare metal and/or VMs are not a hard requirement, we recommend running OV Farm in a Kubernetes environment, as it allows for better control and scalability.
This deployment is similar to one done via the Launcher and shares its limitations: by default, scale and redundancy are limited.
The end of this guide describes how to add scalability and persistence by deploying a SQL database and a Redis instance.
2. Installation
A. Queue installation
To automate deployment of Farm Queue on Linux, it may be convenient to install it in a headless manner.
Prerequisites include Ubuntu Server 20.04 or greater, with an Internet connection in order to download the necessary additional software and packages.
Note
Other Linux distributions should also be compatible with Omniverse Farm Queue, although only Ubuntu 20.04 is officially supported for production use.
Start by installing the required software dependencies:
$ sudo apt-get install -y --no-install-recommends \
    curl \
    libatomic1 \
    libxi6 \
    libxrandr2 \
    libxt6 \
    libegl1 \
    libglu1-mesa \
    libgomp1 \
    libsm6 \
    unzip
Create the /opt/ove folder on the server that will run Farm Queue, then upload the farm_queue_install.sh script and place it in that folder:
#!/bin/bash

#
# Note: Specific package versions can be retrieved from the Omniverse Launcher.
#

# Install the Omniverse Farm Queue package, containing the core Queue capabilities:
mkdir -p ov-farm-queue
cd ov-farm-queue
pwd
curl https://d4i3qtqj3r0z5.cloudfront.net/farm-queue-launcher%40105.1.0%2B105.1.x.174.c6feac39.teamcity.linux-x86_64.release.zip > farm-queue-launcher.zip
unzip farm-queue-launcher.zip
rm farm-queue-launcher.zip

# patch the 105.1.0 headless release that shipped referencing old modules that cause errors
sed -i'.backup' 's/\t"omni\.services\.farm\.management\.tasks-0\.19\.3"/\t"omni\.services\.farm\.management\.tasks-0\.19\.4"/g' apps/omni.farm.queue.headless.kit
sed -i'.backup' 's/\t"omni\.services\.farm\.facilities\.store\.db-0\.11\.2"/\t"omni\.services\.farm\.facilities\.store\.db-0\.11\.3"/g' apps/omni.farm.queue.headless.kit

# Install the Kit SDK package, containing the set of features and extensions shared by Omniverse applications:
mkdir kit
cd kit
pwd
curl https://d4i3qtqj3r0z5.cloudfront.net/kit-sdk-launcher@105.1%2Bmaster.120930.709ebe37.tc.linux-x86_64.release.zip > kit-sdk-launcher.zip
unzip kit-sdk-launcher.zip
rm kit-sdk-launcher.zip

cd ..

# Create a boilerplate launch script for the Queue:
cat << 'EOF' > queue.sh
#!/bin/bash

BASEDIR=$(dirname "$0")
exec $BASEDIR/kit/kit $BASEDIR/apps/omni.farm.queue.headless.kit \
    --ext-folder $BASEDIR/exts-farm-queue \
    --/exts/omni.services.farm.management.tasks/dbs/task-persistence/connection_string=sqlite:///$BASEDIR//task-management.db
EOF

chmod +x queue.sh
Change the permission of the farm_queue_install.sh script in order to make it executable:
$ chmod +x farm_queue_install.sh
Run the script from within the /opt/ove folder as a non-root user:
$ ./farm_queue_install.sh
Once the files are downloaded and extracted, you will have a folder named /opt/ove/ov-farm-queue.
Ensure that the /opt/ove/ov-farm-queue folder and all the files within it are owned by a non-root user, then launch the Queue from that folder:
$ ./queue.sh &
To confirm that the installation completed successfully, query the Queue's status endpoint with curl and check that it returns a response of "OK":
$ curl http://localhost:8222/status
A successful response looks similar to the following, indicating that all required Omniverse Extensions were loaded, that the configuration options were successfully applied, and that the Queue is ready to receive and dispatch tasks:
[user@machine ov-farm-queue] $ curl http://localhost:8222/status
"OK"
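When the Queue is started from a script, the status check above can be repeated until the service answers. The following is a minimal sketch, assuming the endpoint returns a body containing OK as shown above. The function name wait_for_queue, its arguments, and the retry defaults are illustrative and not part of Omniverse Farm.

```shell
#!/bin/sh
# Poll the Queue status endpoint until it reports OK or the attempts run out.
# wait_for_queue, its arguments, and the defaults are hypothetical helpers.
wait_for_queue() {
    url=${1:-http://localhost:8222/status}
    attempts=${2:-30}
    delay=${3:-2}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -s hides progress output, -f makes curl fail on HTTP errors.
        if curl -sf "$url" 2>/dev/null | grep -q 'OK'; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage:
#   wait_for_queue http://localhost:8222/status 30 2 && echo "Queue is ready"
```

The function returns a non-zero status when the Queue never answers, so it can gate later steps such as starting Agents.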
B. Agent installation
To automate deployment of Farm Agents on Linux, and scale the compute capabilities to multiple machines, it may be convenient to install Agents in a headless manner.
Prerequisites include Ubuntu Server 20.04 or greater, with an Internet connection in order to download the necessary additional software and packages.
Note
Other Linux distributions should also be compatible with Omniverse Farm Agent, although only Ubuntu 20.04 is officially supported for production use.
Start by installing the required software dependencies:
$ sudo apt-get install -y --no-install-recommends \
    curl \
    libatomic1 \
    libxi6 \
    libxrandr2 \
    libxt6 \
    libegl1 \
    libglu1-mesa \
    libgomp1 \
    libsm6 \
    unzip
Create the /opt/ove folder on the server that will run Farm Agent, then upload the farm_agent_install.sh script and place it in that folder:
#!/bin/bash

#
# Note: Specific package versions can be retrieved from the Omniverse Launcher.
#

# Install the Omniverse Farm Agent package, containing the core Agent capabilities:
mkdir -p ov-farm-agent
cd ov-farm-agent
pwd
curl https://d4i3qtqj3r0z5.cloudfront.net/farm-agent-launcher@105.1.0%2Bmaster.267.63d5b393.tc.linux-x86_64.release.zip > farm-agent-launcher.zip
unzip farm-agent-launcher.zip
rm farm-agent-launcher.zip

# Install the Kit SDK package, containing the set of features and extensions shared by Omniverse applications:
mkdir kit
cd kit
pwd
curl https://d4i3qtqj3r0z5.cloudfront.net/kit-sdk-launcher@105.1%2Bmaster.120930.709ebe37.tc.linux-x86_64.release.zip > kit-sdk-launcher.zip
unzip kit-sdk-launcher.zip
rm kit-sdk-launcher.zip

# Install the Multiview Batch package:
cd ..
mkdir -p jobs/multiview-batch
cd jobs/multiview-batch
pwd
curl https://d4i3qtqj3r0z5.cloudfront.net/farm-job-multiview-batch-render%40105.1.0%2Bmain.136.1ebfd569.tc.linux-x86_64.release.zip > farm-job-multiview-batch.zip
unzip farm-job-multiview-batch.zip
rm farm-job-multiview-batch.zip

# Install the "create-render" package, containing the job definition for the rendering task:
cd ../..
mkdir -p jobs/create-render
cd jobs/create-render
pwd
curl https://d4i3qtqj3r0z5.cloudfront.net/farm-job-create-render@105.1.0%2Bmain.205.7755a0d5.tc.linux-x86_64.release.zip > farm-job-create-render.zip
unzip farm-job-create-render.zip
rm farm-job-create-render.zip

cd ../..

# Create a boilerplate launch script for the Agent:
cat << 'EOF' > agent.sh
#!/bin/bash

BASEDIR=$(dirname "$0")
exec $BASEDIR/kit/kit $BASEDIR/apps/omni.farm.agent.headless.kit \
    --ext-folder $BASEDIR/exts-farm-agent \
    --/exts/omni.services.farm.agent.operator/job_store_args/job_directories/0=$BASEDIR/jobs/* \
    --/exts/omni.services.farm.agent.operator/manager_host=http://<QUEUE IP>:<QUEUE PORT> \
    --/exts/omni.services.farm.agent.controller/agents_service_host=http://<QUEUE IP>:<QUEUE PORT> \
    --/exts/omni.services.farm.agent.controller/tasks_service_host=http://<QUEUE IP>:<QUEUE PORT>
EOF

chmod +x agent.sh
Change the permission of the farm_agent_install.sh script in order to make it executable:
$ chmod +x farm_agent_install.sh
Run the script from within the /opt/ove folder as a non-root user:
$ ./farm_agent_install.sh
Once the files are downloaded and extracted, you will have a folder named /opt/ove/ov-farm-agent.
Ensure that the /opt/ove/ov-farm-agent folder and all the files within it are owned by a non-root user.
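The ownership requirement can be checked before launching. The sketch below lists any file under a folder that is not owned by a given account. The helper name list_foreign_files is hypothetical, and the choice of account is an assumption left to the administrator.

```shell
#!/bin/sh
# List every file under a directory that is NOT owned by the given user.
# list_foreign_files is a hypothetical helper name.
list_foreign_files() {
    dir=$1
    owner=$2
    find "$dir" ! -user "$owner" -print
}

# Usage (folder from this guide; the account is whichever non-root user runs the Agent):
#   list_foreign_files /opt/ove/ov-farm-agent "$(id -un)"
#   sudo chown -R "$(id -un)" /opt/ove/ov-farm-agent   # fix ownership if anything is listed
```

An empty listing means every file already belongs to that account. The same check applies to /opt/ove/ov-farm-queue on the Queue host.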
In agent.sh, set the Farm Agent Controller and Operator addresses to the Farm Queue server address and port (the default Farm Queue port is 8222 unless it was explicitly changed):
# [...]
--/exts/omni.services.farm.agent.operator/manager_host=http://<QUEUE IP>:8222
--/exts/omni.services.farm.agent.controller/agents_service_host=http://<QUEUE IP>:8222
--/exts/omni.services.farm.agent.controller/tasks_service_host=http://<QUEUE IP>:8222
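When many Agents are deployed, the placeholders can be filled in by a script instead of by hand. This is a minimal sketch, assuming agent.sh still contains the literal <QUEUE IP> and <QUEUE PORT> placeholders written by farm_agent_install.sh and that GNU sed is available (as on Ubuntu). The helper name set_queue_address is hypothetical.

```shell
#!/bin/sh
# Replace the <QUEUE IP> and <QUEUE PORT> placeholders in a generated agent.sh.
# set_queue_address is a hypothetical helper; a .backup copy of the file is kept.
set_queue_address() {
    file=$1
    queue_ip=$2
    queue_port=${3:-8222}
    sed -i.backup \
        -e "s/<QUEUE IP>/$queue_ip/g" \
        -e "s/<QUEUE PORT>/$queue_port/g" \
        "$file"
}

# Usage (192.0.2.10 is a documentation-only example address):
#   set_queue_address /opt/ove/ov-farm-agent/agent.sh 192.0.2.10 8222
```

Leaving the placeholders in place is not harmless: an unquoted < in a shell script is parsed as an input redirection, so agent.sh would fail before Kit starts.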
Once the Controller and Operator addresses are configured, launch the Agent:
$ ./agent.sh &
Updating Job Definitions For Manually Installed Builds
By default, the Agent’s create-render job definition will use the default Composer build that is installed and managed by the Omniverse Launcher.
If you install Composer builds manually instead of managing them with the Omniverse Launcher, you must update the create-render job definition to point to the Composer startup script. The same applies to any other manually installed application that is normally managed through the Omniverse Launcher.
To have the agent use your manually installed Composer build, modify the [job.create-render] section of the agent Kit file and replace the command = "launcher:///create" line so that it points to the absolute path of your Composer startup script:
command = "/absolute-path-to-my-manually-installed-composer/startup.sh"
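The replacement can also be scripted. The sketch below assumes the Kit file contains the line exactly in the form command = "launcher:///create" and that GNU sed is available. The helper name set_composer_command is hypothetical, and the Kit file path is passed in rather than guessed.

```shell
#!/bin/sh
# Point every command = "launcher:///create" line in a Kit file at a startup script.
# set_composer_command is a hypothetical helper; a .backup copy of the file is kept.
# The startup script path must not contain the | character used as the sed delimiter.
set_composer_command() {
    kit_file=$1
    startup_script=$2
    sed -i.backup \
        "s|^\([[:space:]]*command[[:space:]]*=[[:space:]]*\)\"launcher:///create\"|\1\"$startup_script\"|" \
        "$kit_file"
}

# Usage (both paths are placeholders to adapt):
#   set_composer_command <path-to-agent-kit-file> /absolute-path-to-my-manually-installed-composer/startup.sh
```

The substitution is not limited to the [job.create-render] section, so review the result if the file defines several jobs that use the same launcher URL.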
Failure to update the create-render job definition in the case of manually managing the Composer builds will result in an error resembling the following:
[Error] [omni.services.farm.facilities.jobs.store.directory] Failed to resolve launch settings for create-render. Make sure the launcher is running and that the requested app is installed. Error: Cannot connect to host localhost:33480 ssl:default [Connect call failed ('127.0.0.1', 33480)]
Output Log
When executing the Agent, the console will display an output similar to the following to indicate it is running successfully:
[user@machine ov-farm-agent] $ ./agent.sh
[Info] [carb] Logging to file: /home/user/.nvidia-omniverse/logs/Kit/omni.farm.agent.headless/102.1/kit_20220502_132007.log
[0.346s] [ext: omni.kit.pipapi-0.0.0] startup
[0.360s] [ext: omni.services.pip_archive-0.3.0] startup
.
.
.
[1.464s] [ext: omni.farm.agent.headless-102.1.0] startup
[1.574s] app ready
If an Agent is running without proper GPU acceleration, an error similar to the following will be displayed:
[1.964s] [ext: omni.farm.agent.headless-102.1.0] startup
2022-04-29 20:57:11 [2,049ms] [Error] [omni.services.farm.facilities.agent.capacity.managers.base] Failed to load capacities for omni.services.farm.facilities.agent.capacity.GPU: NVML Shared Library Not Found
[2.075s] app ready
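The NVML error above refers to the NVIDIA Management Library, which ships with the NVIDIA driver as libnvidia-ml.so. A quick way to see whether the dynamic linker can find it is sketched below. The helper name has_nvml is hypothetical, and the check only reports presence of the library, not whether the driver itself is healthy.

```shell
#!/bin/sh
# Succeeds when the text on standard input mentions the NVML shared library.
# has_nvml is a hypothetical helper name.
has_nvml() {
    grep -q 'libnvidia-ml\.so'
}

# Usage on the Agent host:
#   ldconfig -p | has_nvml && echo "NVML found" || echo "NVML missing: check the NVIDIA driver installation"
#   nvidia-smi   # should list the GPUs when the driver is working
```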
See the Linux Troubleshooting guide for help with installation issues.
3. Scaling
A. SQL database
By default, OV Farm installed in this configuration uses a SQLite database. Performance and scalability will suffer as many agents connect and request work. SQLite also prevents multiple instances of the various services from being active at the same time.
It is possible to instead use a different SQL database such as MariaDB to run remotely or on the same host and provide better performance.
The user account will need to have create permissions to create the DB and the tables.
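One way to satisfy the permission requirement is to grant the account all privileges on a single database name, which in MariaDB includes the right to create that database and its tables. The sketch below only prints the SQL so it can be reviewed first. The helper name farm_db_grants, the '%' host wildcard, and every name passed to it are assumptions to adapt to your environment.

```shell
#!/bin/sh
# Print SQL that creates a MariaDB account allowed to create one database and its tables.
# farm_db_grants and all names passed to it are hypothetical.
farm_db_grants() {
    db_name=$1
    db_user=$2
    db_pass=$3
    printf "CREATE USER IF NOT EXISTS '%s'@'%%' IDENTIFIED BY '%s';\n" "$db_user" "$db_pass"
    printf "GRANT ALL PRIVILEGES ON \`%s\`.* TO '%s'@'%%';\n" "$db_name" "$db_user"
}

# Usage (run as a MariaDB administrator; the client may be named mariadb on newer installs):
#   farm_db_grants farm_tasks farm_user 'change-me' | mysql -u root -p
```

Restricting the host part from '%' to the Queue server's address is a sensible tightening once the deployment is known to work.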
To change the database connection string to use a MariaDB-based SQL instance, add the connection_string value (the last line below) to the [settings] section of the omni.farm.queue.headless.kit file:
[settings]
# Avoids shader cache compilation at startup/RTX requirements.
exts."omni.kit.renderer.core".compatibilityMode = true
exts."omni.kit.async_engine".event_loop_windows = "ProactorEventLoop"
exts."omni.services.transport.server.http".allow_port_range = false
exts."omni.services.transport.server.http".port = 8222
exts."omni.services.farm.management.tasks".dbs.task-persistence.connection_string="mysql://<username>:<password>@<host>:<port>/<db_name>"
Then remove the final connection_string=sqlite argument from the queue.sh script shown below, along with the trailing backslash on the line before it:
#!/bin/bash
BASEDIR=$(dirname "$0")
exec $BASEDIR/kit/kit $BASEDIR/apps/omni.farm.queue.headless.kit \
--ext-folder $BASEDIR/exts-farm-queue \
--/exts/omni.services.farm.management.tasks/dbs/task-persistence/connection_string=sqlite:///$BASEDIR//task-management.db
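The removal can be scripted as well. This is a minimal sketch, assuming queue.sh still has the layout generated by farm_queue_install.sh, where the SQLite argument directly follows the --ext-folder line, and that GNU sed is available. The helper name strip_sqlite_arg is hypothetical.

```shell
#!/bin/sh
# Remove the SQLite connection_string argument from a queue.sh launch script.
# strip_sqlite_arg is a hypothetical helper; a .backup copy of the file is kept.
strip_sqlite_arg() {
    file=$1
    sed -i.backup \
        -e '/task-persistence\/connection_string=sqlite/d' \
        -e '/--ext-folder/ s/[[:space:]]*\\$//' \
        "$file"
}

# Usage:
#   strip_sqlite_arg /opt/ove/ov-farm-queue/queue.sh
```

The second expression drops the line-continuation backslash from the --ext-folder line, which becomes the last line of the exec command once the SQLite argument is gone.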
B. Redis
B.1 Agent services
By default, the agent store is in memory, meaning that only a single instance of the agent service is supported. It is possible to swap out the in-memory agent store for a Redis-backed instance.
To change the agent's backend, add the following values to the omni.farm.queue.headless.kit file installed as part of the installation steps above, replacing <host> and <port> with the host and port of the Redis instance.
The connection_string format documentation is available here
[settings]
# Avoids shader cache compilation at startup/RTX requirements.
exts."omni.kit.renderer.core".compatibilityMode = true
exts."omni.kit.async_engine".event_loop_windows = "ProactorEventLoop"
exts."omni.services.transport.server.http".allow_port_range = false
exts."omni.services.transport.server.http".port = 8222
exts."omni.services.farm.management.agents".manager_class = "omni.services.farm.management.agents.managers.redis.RedisAgentManager"
[settings.exts."omni.services.farm.management.agents".manager_args]
connection_string="redis://<host>:<port>"
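Before restarting the Queue with this configuration, it can help to confirm that the Redis instance is reachable from the Queue host. The sketch below uses the standard redis-cli ping command, which prints PONG when the server responds. The helper name redis_ready is hypothetical, and it assumes redis-cli is installed and that the instance does not require authentication.

```shell
#!/bin/sh
# Succeeds when the Redis server at host:port answers PING with PONG.
# redis_ready is a hypothetical helper; it requires redis-cli on the PATH.
redis_ready() {
    host=$1
    port=${2:-6379}
    [ "$(redis-cli -h "$host" -p "$port" ping 2>/dev/null)" = "PONG" ]
}

# Usage (replace the host and port with those used in connection_string):
#   redis_ready <host> <port> && echo "Redis is reachable"
```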