3. Offline Dataset Generation

3.1. Learning Objectives

This tutorial demonstrates how to generate an offline synthetic dataset (the generated data is stored on disk) that can be used for training deep neural networks. The full example runs in the Isaac Sim Python environment. The tutorial uses <install_path>/isaac_sim/standalone_examples/replicator/offline_generation.py to show how the omni.replicator extension works with simulated scenes to collect ground-truth information from the sensors that come with omni.replicator.

After this tutorial, you should be able to collect and save sensor data from a stage and randomize components in it.

25-30 min tutorial

3.1.1. Prerequisites

Read the Getting Started With Replicator document to become familiar with the basics of omni.replicator.

3.2. Getting Started

To generate a synthetic dataset offline, run the following command.

./python.sh standalone_examples/replicator/offline_generation.py

3.3. Running as a SimulationApp

The code for this tutorial differs from the default omni.replicator examples, which are usually executed using the script editor in the Kit GUI. The provided script runs an instance of Omniverse Isaac Sim in headless mode. For this, the SimulationApp object must be created before importing any other dependencies (such as omni.replicator.core).

Starting Isaac Sim
from omni.isaac.kit import SimulationApp
import os

# Set rendering parameters and create an instance of kit
CONFIG = {"renderer": "RayTracedLighting", "headless": True, "width": 1024, "height": 1024, "num_frames": 10}
simulation_app = SimulationApp(launch_config=CONFIG)

ENV_URL = "/Isaac/Environments/Simple_Warehouse/full_warehouse.usd"
FORKLIFT_URL = "/Isaac/Props/Forklift/forklift.usd"
PALLET_URL = "/Isaac/Environments/Simple_Warehouse/Props/SM_PaletteA_01.usd"
CARDBOX_URL = "/Isaac/Environments/Simple_Warehouse/Props/SM_CardBoxD_04.usd"
CONE_URL = "/Isaac/Environments/Simple_Warehouse/Props/S_TrafficCone.usd"
SCOPE_NAME = "/MyScope"

import carb
import random
import math
import numpy as np
from pxr import UsdGeom, Usd, Gf, UsdPhysics, PhysxSchema

import omni.usd
from omni.isaac.core import World
from omni.isaac.core.utils import prims
from omni.isaac.core.prims import RigidPrim
from omni.isaac.core.utils.nucleus import get_assets_root_path
from omni.isaac.core.utils.stage import get_current_stage, open_stage
from omni.isaac.core.utils.rotations import euler_angles_to_quat, quat_to_euler_angles, lookat_to_quatf
from omni.isaac.core.utils.bounds import compute_combined_aabb, create_bbox_cache

import omni.replicator.core as rep

3.4. Loading the Environment

The environment is a USD stage. As a first step, the stage is loaded using the helper function open_stage.

Load the stage
def main():
    # Open the environment in a new stage
    print(f"Loading Stage {ENV_URL}")
    open_stage(prefix_with_isaac_asset_server(ENV_URL))
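
The snippet above calls prefix_with_isaac_asset_server, which this section does not show. The following is a minimal sketch of what such a helper might do, assuming it simply prepends the assets root (for example the value returned by get_assets_root_path, imported earlier) to a relative asset path. The name prefix_with_asset_root, the injected get_root parameter, and the server URL are hypothetical and chosen so that the sketch runs without Isaac Sim.

```python
from typing import Callable, Optional


def prefix_with_asset_root(relative_path: str, get_root: Callable[[], Optional[str]]) -> str:
    """Prepend the assets root to a relative asset path (hypothetical helper)."""
    root = get_root()  # in Isaac Sim this could be get_assets_root_path
    if root is None:
        # Assumption: a missing root means the asset server is unreachable
        raise RuntimeError("Could not find the assets root folder")
    return root + relative_path


# Usage with a stand-in root; the server URL below is made up for illustration
example = prefix_with_asset_root(
    "/Isaac/Environments/Simple_Warehouse/full_warehouse.usd",
    lambda: "omniverse://localhost/NVIDIA/Assets",
)
```

Failing early when no root is found avoids passing an invalid path to open_stage.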

3.5. Creating the Cameras and the Writer

The example shows two ways of creating cameras: rep.create.camera (Replicator API) and prims.create_prim (Isaac Sim API). The cameras are used to create the render products that generate the data. The render products are attached to the built-in BasicWriter, which collects the data from the selected annotators (rgb, semantic_segmentation, bounding_box_3d, etc.) and writes it to the given output path. rep.get.prim_at_path wraps the driver_cam_prim prim in an OmniGraph node so that the randomization graph generated by Replicator can randomize it at each step.

Creating the cameras
    driver_cam_prim = prims.create_prim(
        prim_path=f"{SCOPE_NAME}/DriverCamera",
        prim_type="Camera",
        position=driver_cam_pos_gf,
        orientation=look_at_pallet_xyzw,
        attributes={"focusDistance": 400, "focalLength": 24, "clippingRange": (0.1, 10000000)},
    )

    driver_cam_node = rep.get.prim_at_path(str(driver_cam_prim.GetPath()))

    # Camera looking at the pallet
    pallet_cam = rep.create.camera()

    # Camera looking at the forklift from a top view with large min clipping to see the scene through the ceiling
    top_view_cam = rep.create.camera(clipping_range=(6.0, 1000000.0))

Being a built-in writer, BasicWriter is already registered, and can be accessed from the WriterRegistry. The writer is then initialized with the output directory and the selected annotators. Finally, the render products are created from the cameras and attached to the writer.

Creating the writer
    # Initialize and attach writer
    writer = rep.WriterRegistry.get("BasicWriter")
    output_directory = os.getcwd() + "/_output_headless"
    print("Outputting data to ", output_directory)
    writer.initialize(
        output_dir=output_directory,
        rgb=True,
        bounding_box_2d_tight=True,
        semantic_segmentation=True,
        instance_segmentation=True,
        distance_to_image_plane=True,
        bounding_box_3d=True,
        occlusion=True,
        normals=True,
    )

    RESOLUTION = (CONFIG["width"], CONFIG["height"])
    driver_rp = rep.create.render_product(str(driver_cam_prim.GetPrimPath()), RESOLUTION)
    pallet_rp = rep.create.render_product(pallet_cam, RESOLUTION)
    forklift_rp = rep.create.render_product(top_view_cam, RESOLUTION)
    writer.attach([driver_rp, forklift_rp, pallet_rp])
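
After a run has finished, one quick sanity check is to count the files the writer produced per file type. The sketch below uses only the Python standard library and makes no assumption about how BasicWriter names its files. The function name count_files_by_extension is hypothetical, and the demo runs on a temporary directory with placeholder file names rather than a real _output_headless folder.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_files_by_extension(output_dir):
    """Count files under output_dir (recursively), grouped by file extension."""
    return Counter(p.suffix for p in Path(output_dir).rglob("*") if p.is_file())


# Demo on a throwaway directory; file names are placeholders, not real writer output
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.png", "b.png", "c.json"):
        (Path(tmp) / name).write_text("")
    counts = count_files_by_extension(tmp)
```

Pointing the function at the actual output directory should give a rough idea of whether every enabled annotator wrote data for every frame.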

3.6. Domain Randomization

The following snippets show several randomization options using the Isaac Sim and Replicator APIs. The example first uses the Isaac Sim API to spawn a forklift at a randomly generated pose. It then uses the forklift pose to place a pallet in front of it, at a random distance within fixed bounds.

    # Spawn a new forklift at a random pose
    forklift_prim = prims.create_prim(
        prim_path=f"{SCOPE_NAME}/Forklift",
        position=(random.uniform(-20, -2), random.uniform(-1, 3), 0),
        orientation=euler_angles_to_quat([0, 0, random.uniform(0, math.pi)]),
        usd_path=prefix_with_isaac_asset_server(FORKLIFT_URL),
        semantic_label="Forklift",
    )

    # Spawn the pallet in front of the forklift with a random offset on the Y axis
    forklift_tf = omni.usd.get_world_transform_matrix(forklift_prim)
    pallet_offset_tf = Gf.Matrix4d().SetTranslate(Gf.Vec3d(0, random.uniform(-1.2, -2.4), 0))
    pallet_pos_gf = (pallet_offset_tf * forklift_tf).ExtractTranslation()
    forklift_quat_gf = forklift_tf.ExtractRotation().GetQuaternion()
    # Note: despite the variable name, the tuple is built real part first, i.e. (w, x, y, z)
    forklift_quat_xyzw = (forklift_quat_gf.GetReal(), *forklift_quat_gf.GetImaginary())

    pallet_prim = prims.create_prim(
        prim_path=f"{SCOPE_NAME}/Pallet",
        position=pallet_pos_gf,
        orientation=forklift_quat_xyzw,
        usd_path=prefix_with_isaac_asset_server(PALLET_URL),
        semantic_label="Pallet",
    )
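
The pallet position above comes from multiplying an offset transform with the forklift's world transform, which expresses the offset in the forklift's local frame. The following is a simplified sketch of the same idea in plain Python, assuming a rotation about the Z axis only (the example also randomizes only the yaw of the forklift). The function name offset_in_local_frame is hypothetical.

```python
import math


def offset_in_local_frame(position, yaw, local_offset):
    """Return the world XY position of local_offset, given in a frame
    located at position and rotated by yaw (radians) about Z."""
    x, y = position
    ox, oy = local_offset
    # Rotate the local offset by yaw, then translate by the frame position
    world_x = x + ox * math.cos(yaw) - oy * math.sin(yaw)
    world_y = y + ox * math.sin(yaw) + oy * math.cos(yaw)
    return world_x, world_y


# An object at (1, 1) rotated by 90 degrees, with a local offset of -2 along Y
pallet_xy = offset_in_local_frame((1.0, 1.0), math.pi / 2, (0.0, -2.0))
```

Because the offset is rotated together with the forklift, the pallet stays on the same side of the forklift regardless of the sampled yaw.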

After spawning the forklift and the empty pallet, the example runs a short physics simulation that drops several stacked boxes onto a second pallet behind the forklift.

def simulate_falling_objects(prim, num_sim_steps=250, num_boxes=8):
    # Create a simulation ready world
    world = World(physics_dt=1.0 / 90.0, stage_units_in_meters=1.0)

    # Choose a random spawn offset relative to the given prim
    prim_tf = omni.usd.get_world_transform_matrix(prim)
    spawn_offset_tf = Gf.Matrix4d().SetTranslate(Gf.Vec3d(random.uniform(-0.5, 0.5), random.uniform(3, 3.5), 0))
    spawn_pos_gf = (spawn_offset_tf * prim_tf).ExtractTranslation()

    # Spawn pallet prim
    # ... (elided in this tutorial; bb_cache and curr_spawn_height, used below, are not shown here)

    # Spawn boxes falling on the pallet
    for i in range(num_boxes):
        # Spawn box prim
        cardbox_prim_name = f"SimulatedCardbox_{i}"
        box_prim = prims.create_prim(
            prim_path=f"{SCOPE_NAME}/{cardbox_prim_name}",
            usd_path=prefix_with_isaac_asset_server(CARDBOX_URL),
            semantic_label="Cardbox",
        )

        # Add the height of the box to the current spawn height
        curr_spawn_height += bb_cache.ComputeLocalBound(box_prim).GetRange().GetSize()[2] * 1.1

        # Wrap the cardbox prim into a rigid prim to be able to simulate it
        box_rigid_prim = RigidPrim(
            prim_path=str(box_prim.GetPrimPath()),
            name=cardbox_prim_name,
            position=spawn_pos_gf + Gf.Vec3d(random.uniform(-0.2, 0.2), random.uniform(-0.2, 0.2), curr_spawn_height),
            orientation=euler_angles_to_quat([0, 0, random.uniform(0, math.pi)]),
        )

        # Make sure physics are enabled on the rigid prim
        box_rigid_prim.enable_rigid_body_physics()

        # Register rigid prim with the scene
        world.scene.add(box_rigid_prim)

    # Reset world after adding simulated assets for physics handles to be propagated properly
    world.reset()

    # Simulate the world for the given number of steps or until the highest box stops moving
    last_box = world.scene.get_object(f"SimulatedCardbox_{num_boxes - 1}")
    for i in range(num_sim_steps):
        world.step(render=False)
        if last_box and np.linalg.norm(last_box.get_linear_velocity()) < 0.001:
            print(f"Simulation stopped after {i} steps")
            break
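
The loop above stops early once the top box is nearly at rest. The stopping rule can be illustrated without a physics engine. In this sketch an exponential damping factor stands in for world.step(); the function name simulate_until_rest and the damping value are assumptions made for illustration only.

```python
import math


def simulate_until_rest(velocity, damping=0.5, max_steps=250, threshold=0.001):
    """Return the step index at which the speed drops below threshold,
    or max_steps if that never happens."""
    for step in range(max_steps):
        # Stand-in for a physics step: the velocity decays by a fixed factor
        velocity = tuple(v * damping for v in velocity)
        speed = math.sqrt(sum(v * v for v in velocity))
        if speed < threshold:
            return step
    return max_steps


settle_step = simulate_until_rest((1.0, 0.0, 0.0))
```

Capping the loop with max_steps mirrors num_sim_steps in the example, so the simulation ends even if the monitored object never settles.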

The example then registers several randomizers using the Replicator API. The first one uses rep.randomizer.scatter_2d to scatter boxes randomly on the surface of the pallet in front of the forklift. It also randomizes the materials of the boxes using rep.randomizer.materials. The generated randomization graph is then registered using rep.randomizer.register.

Domain Randomization
# Randomize boxes materials and their location on the surface of the given prim
def register_scatter_boxes(prim):
    # Calculate the bounds of the prim to create a scatter plane of its size
    bb_cache = create_bbox_cache()
    bbox3d_gf = bb_cache.ComputeLocalBound(prim)
    prim_tf_gf = omni.usd.get_world_transform_matrix(prim)

    # Calculate the bounds of the prim
    bbox3d_gf.Transform(prim_tf_gf)
    range_size = bbox3d_gf.GetRange().GetSize()

    # Get the quaternion of the prim from usd (built real part first, i.e. (w, x, y, z), despite the variable name)
    prim_quat_gf = prim_tf_gf.ExtractRotation().GetQuaternion()
    prim_quat_xyzw = (prim_quat_gf.GetReal(), *prim_quat_gf.GetImaginary())

    # Create a plane on the pallet to scatter the boxes on
    plane_scale = (range_size[0] * 0.8, range_size[1] * 0.8, 1)
    plane_pos_gf = prim_tf_gf.ExtractTranslation() + Gf.Vec3d(0, 0, range_size[2])
    plane_rot_euler_deg = quat_to_euler_angles(np.array(prim_quat_xyzw), degrees=True)
    scatter_plane = rep.create.plane(
        scale=plane_scale, position=plane_pos_gf, rotation=plane_rot_euler_deg, visible=False
    )

    cardbox_mats = [
        prefix_with_isaac_asset_server("/Isaac/Environments/Simple_Warehouse/Materials/MI_PaperNotes_01.mdl"),
        prefix_with_isaac_asset_server("/Isaac/Environments/Simple_Warehouse/Materials/MI_CardBoxB_05.mdl"),
    ]

    def scatter_boxes():
        cardboxes = rep.create.from_usd(
            prefix_with_isaac_asset_server(CARDBOX_URL), semantics=[("class", "Cardbox")], count=5
        )
        with cardboxes:
            rep.randomizer.scatter_2d(scatter_plane, check_for_collisions=True)
            rep.randomizer.materials(cardbox_mats)
        return cardboxes.node

    rep.randomizer.register(scatter_boxes)
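
The scatter plane above covers 80% of the pallet footprint so that boxes stay away from the edges. The sketch below illustrates only that sizing idea in plain Python: it samples points uniformly on a scaled, axis-aligned rectangle. It ignores the plane rotation and the collision checks that rep.randomizer.scatter_2d performs, and the function name sample_on_scaled_footprint is hypothetical.

```python
import random


def sample_on_scaled_footprint(center, size, scale=0.8, count=5, seed=0):
    """Sample count points uniformly on a rectangle of size * scale centered at center."""
    rng = random.Random(seed)  # seeded for reproducible illustration
    half_x = size[0] * scale / 2.0
    half_y = size[1] * scale / 2.0
    return [
        (center[0] + rng.uniform(-half_x, half_x), center[1] + rng.uniform(-half_y, half_y))
        for _ in range(count)
    ]


# Footprint of 1.2 x 0.8 units (made-up values) centered at the origin
points = sample_on_scaled_footprint(center=(0.0, 0.0), size=(1.2, 0.8))
```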

The next randomization example calculates the bottom corners of the combined bounding box of the forklift and the pallet and uses them as a predefined list of locations for placing a traffic cone. Note that rep.distribution.sequence steps through the list in order rather than sampling from it at random.

# Randomly place cones from calculated locations around the working area (combined bounds) of the forklift and pallet
def register_cone_placement(forklift_prim, pallet_prim):
    # Helper function to get the combined bounds of the forklift and pallet
    bb_cache = create_bbox_cache()
    combined_range_arr = compute_combined_aabb(bb_cache, [forklift_prim.GetPrimPath(), pallet_prim.GetPrimPath()])

    min_x = float(combined_range_arr[0])
    min_y = float(combined_range_arr[1])
    min_z = float(combined_range_arr[2])
    max_x = float(combined_range_arr[3])
    max_y = float(combined_range_arr[4])
    corners = [(min_x, min_y, min_z), (max_x, min_y, min_z), (min_x, max_y, min_z), (max_x, max_y, min_z)]

    def place_cones():
        cones = rep.create.from_usd(prefix_with_isaac_asset_server(CONE_URL), semantics=[("class", "TrafficCone")])
        with cones:
            rep.modify.pose(position=rep.distribution.sequence(corners))
        return cones.node

    rep.randomizer.register(place_cones)
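
Judging by how the code above indexes it, compute_combined_aabb returns the bounds as [min_x, min_y, min_z, max_x, max_y, max_z]. The following sketch derives the four bottom corners from such an array in plain Python and steps through them in order. The function name bottom_corners, the sample bounds, and the wrap-around after the last corner are assumptions made for illustration.

```python
from itertools import cycle, islice


def bottom_corners(aabb):
    """Return the four bottom corners of an axis-aligned box given as
    [min_x, min_y, min_z, max_x, max_y, max_z]."""
    min_x, min_y, min_z, max_x, max_y, _max_z = (float(v) for v in aabb)
    return [(min_x, min_y, min_z), (max_x, min_y, min_z), (min_x, max_y, min_z), (max_x, max_y, min_z)]


corners = bottom_corners([-1, -2, 0, 3, 4, 5])
# Visit the corners one after another over six frames, wrapping around (assumed behavior)
placements = list(islice(cycle(corners), 6))
```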

The following example randomizes light parameters and their placement above the forklift and the pallet area.

# Randomize lights around the scene
def register_lights_placement(forklift_prim, pallet_prim):
    bb_cache = create_bbox_cache()
    combined_range_arr = compute_combined_aabb(bb_cache, [forklift_prim.GetPrimPath(), pallet_prim.GetPrimPath()])
    pos_min = (combined_range_arr[0], combined_range_arr[1], 6)
    pos_max = (combined_range_arr[3], combined_range_arr[4], 7)

    def randomize_lights():
        lights = rep.create.light(
            light_type="Sphere",
            color=rep.distribution.uniform((0.2, 0.1, 0.1), (0.9, 0.8, 0.8)),
            intensity=rep.distribution.uniform(500, 2000),
            position=rep.distribution.uniform(pos_min, pos_max),
            scale=rep.distribution.uniform(5, 10),
            count=3,
        )
        return lights.node

    rep.randomizer.register(randomize_lights)

Replicator supports many other randomizations beyond the examples above. For more information, refer to Replicator’s randomizer examples tutorials.

Finally, the registered randomizations are triggered each frame, together with the camera movements. One camera looks at the pallet in front of the forklift from randomized positions around it, another looks down at the whole scene from various heights, and the driver camera varies its height slightly while looking at the pallet.

Domain Randomization
with rep.trigger.on_frame(num_frames=CONFIG["num_frames"]):
    rep.randomizer.scatter_boxes()
    rep.randomizer.place_cones()
    rep.randomizer.randomize_lights()

    pallet_cam_min = (pallet_pos_gf[0] - 2, pallet_pos_gf[1] - 2, 2)
    pallet_cam_max = (pallet_pos_gf[0] + 2, pallet_pos_gf[1] + 2, 4)
    with pallet_cam:
        rep.modify.pose(
            position=rep.distribution.uniform(pallet_cam_min, pallet_cam_max),
            look_at=str(pallet_prim.GetPrimPath()),
        )

    top_view_cam_min = (foklift_pos_gf[0], foklift_pos_gf[1], 9)
    top_view_cam_max = (foklift_pos_gf[0], foklift_pos_gf[1], 11)
    with top_view_cam:
        rep.modify.pose(
            position=rep.distribution.uniform(top_view_cam_min, top_view_cam_max),
            rotation=rep.distribution.uniform((0, -90, 0), (0, -90, 180)),
        )

    driver_cam_min = (driver_cam_pos_gf[0], driver_cam_pos_gf[1], driver_cam_pos_gf[2] - 0.25)
    driver_cam_max = (driver_cam_pos_gf[0], driver_cam_pos_gf[1], driver_cam_pos_gf[2] + 0.25)
    with driver_cam_node:
        rep.modify.pose(
            position=rep.distribution.uniform(driver_cam_min, driver_cam_max),
            look_at=str(pallet_prim.GetPrimPath()),
        )

3.7. Running the Script

The run_orchestrator function triggers the randomizations and the data writing by starting the process with rep.orchestrator.run(). It then waits until the requested number of frames has been processed. Finally, rep.BackendDispatch.wait_until_done() waits until all data is written to disk, and rep.orchestrator.stop() ends the process before the SimulationApp is closed.

The resulting data is saved in the _output_headless subfolder of the directory from which the process was started.

# Starts replicator and waits until all data was successfully written
def run_orchestrator():
    rep.orchestrator.run()

    # Wait until started
    while not rep.orchestrator.get_is_started():
        simulation_app.update()

    # Wait until stopped
    while rep.orchestrator.get_is_started():
        simulation_app.update()

    rep.BackendDispatch.wait_until_done()
    rep.orchestrator.stop()
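
The function above polls twice: first until the orchestrator reports that it has started, then until it reports that it is no longer running. The sketch below reproduces this pattern with a stand-in object so that it runs without Isaac Sim. FakeOrchestrator, its warm-up and frame counts, and run_and_wait are hypothetical and only mimic the get_is_started() behavior assumed here.

```python
class FakeOrchestrator:
    """Stand-in that reports 'started' for a fixed number of updates after run()."""

    def __init__(self, warmup_updates=2, frames=3):
        self._ticks = 0
        self._warmup = warmup_updates
        self._frames = frames
        self._running = False

    def run(self):
        self._running = True

    def tick(self):
        # Plays the role of simulation_app.update()
        if self._running:
            self._ticks += 1

    def get_is_started(self):
        return self._running and self._warmup < self._ticks <= self._warmup + self._frames


def run_and_wait(orchestrator, update):
    """Start the orchestrator and keep updating until it has started and then stopped."""
    orchestrator.run()
    updates = 0
    while not orchestrator.get_is_started():  # wait until started
        update()
        updates += 1
    while orchestrator.get_is_started():  # wait until stopped
        update()
        updates += 1
    return updates


fake = FakeOrchestrator(warmup_updates=2, frames=3)
total_updates = run_and_wait(fake, fake.tick)
```

The first loop matters because the start is not immediate: without it, the second loop could exit right away, before any frame has been processed.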

3.8. Summary

This tutorial covered the following topics:

  1. Starting a SimulationApp instance of Omniverse Isaac Sim to work with Replicator

  2. Loading a stage and various assets to random poses using plain Isaac Sim API

  3. Setting up cameras and the writer to write out data

  4. Registering randomizations with Replicator

  5. Using orchestrator to run the data collection

3.8.1. Next Steps

One possible use for the created data is with the TAO Toolkit.

Once the generated synthetic data is in KITTI format, you can use the TAO Toolkit to train a model. TAO provides segmentation, classification, and object detection models. This example uses object detection with the DetectNet_v2 model as a use case.

To get started with TAO, follow the set-up instructions. Then, activate the virtual environment and run the Jupyter notebooks as explained in the TAO Toolkit documentation.

TAO uses Jupyter notebooks to guide you through the training process. The folder cv_samples_v1.3.0 contains notebooks for multiple models. Any of the object detection networks can be used for this use case, but this example uses DetectNet_v2.

The detectnet_v2 folder contains the Jupyter notebook and the specs folder. The TAO DetectNet_v2 documentation goes into more detail about this sample. TAO works with configuration files that can be found in the specs folder. Here, you need to modify the specs to refer to the generated synthetic data as the input.

To prepare the data, you need to run the following command.

tao detectnet_v2 dataset-convert [-h] -d DATASET_EXPORT_SPEC -o OUTPUT_FILENAME [-f VALIDATION_FOLD]

The Jupyter notebook includes this command with a sample configuration. Modify the spec file to match the folder structure of your synthetic data. The converted data is in TFRecord format and is ready for training. Again, you need to update the training spec file with the path to the synthetic data and the classes to be detected.

tao detectnet_v2 train [-h] -k <key>
                        -r <result directory>
                        -e <spec_file>
                        [-n <name_string_for_the_model>]
                        [--gpus <num GPUs>]
                        [--gpu_index <comma separate gpu indices>]
                        [--log_file <log_file>]

For any questions regarding the TAO Toolkit, refer to the TAO documentation, which goes into further detail.

3.8.2. Further Learning

To learn how to use Omniverse Isaac Sim to create data sets in an interactive manner, see the Synthetic Data Recorder, and then visualize them with the Synthetic Data Visualizer.