1. Omniverse Isaac Gym¶
The Omniverse Isaac Gym extension provides an interface for performing reinforcement learning training and inferencing in Isaac Sim. This framework simplifies the process of connecting reinforcement learning libraries and algorithms with other components in Isaac Sim. Similar to existing frameworks and environment wrapper classes that inherit from gym.Env, the Omniverse Isaac Gym extension also provides an interface inheriting from gym.Env and implements a simple set of APIs required by most common RL libraries. This interface can be used as a bridge connecting RL libraries with physics simulation and tasks running in the Isaac Sim framework.
1.1. Learning Objectives¶
In this tutorial, we will introduce Omniverse Isaac Gym and the interfaces provided in the extension. We will:
Introduce the reinforcement learning ecosystem in Isaac Sim
Introduce different environment wrapper interfaces in Omniverse Isaac Gym
5 Minute Tutorial
1.2. Getting Started¶
This is an introductory tutorial that covers the basics of reinforcement learning interfaces provided in Isaac Sim.
1.3. Reinforcement Learning in Isaac Sim¶
We can view the RL ecosystem as three main pieces: the Task, the RL policy, and the Environment wrapper that provides an interface for communication between the task and the RL policy. We aim to provide the latter with Omniverse Isaac Gym.
1.3.1. Task¶
The Task is where the main task logic is implemented, such as computing observations and rewards. This is where we can collect states of actors in the scene and apply controls or actions to our actors. Omniverse Isaac Gym allows for tasks to be defined following the BaseTask definition in omni.isaac.core. This provides flexibility for users to re-use task implementations for both RL and non-RL use cases.
1.3.2. Environment Wrappers¶
The main purpose of the Omniverse Isaac Gym extension is to provide Environment Wrapper interfaces that allow for RL policies to communicate with simulation in Isaac Sim. As a base interface, we are providing a class named VecEnvBase, a vectorized interface inheriting from gym.Env that implements common RL APIs. This class can also be easily extended towards RL libraries that require additional APIs by creating a new derived class.
Commonly used APIs provided by the base wrapper class VecEnvBase include:
render(self, mode: str = "human"): renders the current frame
close(self): closes the simulator
seed(self, seed: int = -1): sets a seed. Use -1 for a random seed.
step(self, actions: Union[np.ndarray, torch.Tensor]): triggers task pre_physics_step with actions, steps simulation and renderer, computes observations, rewards, dones, and returns state buffers
reset(self): triggers task reset(), steps simulation, and re-computes observations
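To show how an RL library would drive this API surface, here is a toy stand-in implementing the same method signatures listed above. It is a sketch only: the random observations and zero rewards are placeholders, and the real VecEnvBase would step the Isaac Sim physics and renderer inside step() and reset().

```python
import numpy as np

# Toy stand-in for a VecEnvBase-style vectorized environment, illustrating
# the gym.Env-like API surface (step/reset/seed/render/close). The actual
# class would drive Isaac Sim's simulation; this version returns dummy data.
class ToyVecEnv:
    def __init__(self, num_envs: int = 2, obs_dim: int = 3):
        self.num_envs = num_envs
        self.obs_dim = obs_dim
        self._rng = np.random.default_rng(0)

    def seed(self, seed: int = -1) -> int:
        # -1 requests a random seed, mirroring the API described above.
        seed = int(self._rng.integers(2**31)) if seed == -1 else seed
        self._rng = np.random.default_rng(seed)
        return seed

    def reset(self) -> np.ndarray:
        # Would trigger the task's reset(), step simulation, recompute observations.
        return np.zeros((self.num_envs, self.obs_dim), dtype=np.float32)

    def step(self, actions: np.ndarray):
        # Would trigger pre_physics_step with actions, step simulation and
        # renderer, then compute observations, rewards, and dones.
        obs = self._rng.standard_normal((self.num_envs, self.obs_dim)).astype(np.float32)
        rewards = -np.abs(actions).sum(axis=-1)
        dones = np.zeros(self.num_envs, dtype=bool)
        return obs, rewards, dones, {}

    def render(self, mode: str = "human") -> None:
        pass  # would render the current frame

    def close(self) -> None:
        pass  # would close the simulator


env = ToyVecEnv()
obs = env.reset()
for _ in range(3):
    actions = np.zeros((env.num_envs, 1), dtype=np.float32)
    obs, rewards, dones, info = env.step(actions)
print(obs.shape)  # (2, 3)
```

Because the wrapper exposes the familiar gym.Env-style contract, most RL libraries can consume it directly, and libraries that need additional APIs can be supported by subclassing.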
1.3.2.1. Multi-Threaded Environment Wrapper¶
VecEnvBase is a simple interface that's designed to provide the commonly used gym.Env APIs required by RL libraries. Users can create an instance of this class, attach their task to the interface, and provide the wrapper instance to the RL policy. Since the RL algorithm maintains the main loop of execution, interaction with the UI and environments in the scene can be limited and may interfere with the training loop.
We also provide another environment wrapper class called VecEnvMT, which is designed to isolate the RL policy in a new thread, separate from the main simulation and rendering thread. This class provides the same set of APIs as VecEnvBase, but also includes threaded queues for sending and receiving actions and states between the RL policy and the task. In order to use this wrapper interface, users have to implement a TrainerMT class, which should implement a run() method that initiates the RL loop on a new thread. The setup for using VecEnvMT is more involved compared to the single-threaded VecEnvBase interface, but will allow users to have more control over starting and stopping the training loop through interaction with the UI.
Note that VecEnvMT has a timeout variable, which defaults to 30 seconds. If either the RL thread waiting for physics state exceeds the timeout amount or the simulation thread waiting for RL actions exceeds the timeout amount, the threaded queues will throw an exception and terminate training. For larger scenes that require longer simulation or training time, try increasing the timeout variable in VecEnvMT to prevent unnecessary timeouts. This can be done by passing in a timeout argument when calling VecEnvMT.initialize().
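The queue-with-timeout handshake described above can be sketched with Python's standard queue and threading modules. This is not the VecEnvMT implementation; it only illustrates the mechanism: each side blocks on its queue with a timeout, and a stalled peer turns into a timeout error rather than a hang. The 30-second default is shortened here for the demo.

```python
import queue
import threading

# Illustrative sketch of the VecEnvMT-style handshake: the simulation
# thread and the RL thread exchange actions and states through queues,
# and a blocking get() with a timeout terminates the loop if the other
# side stalls. Names and structure are assumptions for illustration.
TIMEOUT_S = 1.0  # VecEnvMT defaults to 30 s; shortened for this demo

action_queue: queue.Queue = queue.Queue()
state_queue: queue.Queue = queue.Queue()

def sim_thread(steps: int) -> None:
    """Stands in for the simulation/rendering side."""
    for _ in range(steps):
        try:
            actions = action_queue.get(timeout=TIMEOUT_S)
        except queue.Empty:
            raise TimeoutError("simulation thread timed out waiting for RL actions")
        # ... step physics with `actions`, then publish the new state ...
        state_queue.put({"obs": [0.0], "reward": 1.0})

def rl_loop(steps: int) -> list:
    """Stands in for the TrainerMT run() loop on the RL side."""
    rewards = []
    for _ in range(steps):
        action_queue.put([0.5])  # send an action to the simulation
        try:
            state = state_queue.get(timeout=TIMEOUT_S)
        except queue.Empty:
            raise TimeoutError("RL thread timed out waiting for physics state")
        rewards.append(state["reward"])
    return rewards

sim = threading.Thread(target=sim_thread, args=(3,))
sim.start()
total = sum(rl_loop(3))
sim.join()
print(total)  # 3.0
```

Raising the timeout, as the note above suggests for large scenes, simply widens the window each side waits before giving up.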
1.4. Summary¶
This tutorial covered the following topics:
Introduction to RL in Isaac Sim
Introduction to environment wrapper interfaces in Omniverse Isaac Gym