3. Offline Dataset Generation

3.1. Learning Objectives

This tutorial demonstrates how to generate a synthetic dataset offline for training deep neural networks. The full example runs in the Isaac Sim Python environment. The tutorial walks through standalone_examples/replicator/offline_generation.py, which uses the omni.replicator extension together with simulated scenes to collect ground-truth data from the sensors that come with omni.replicator.

After this tutorial, you should be able to collect and save sensor data from a stage and randomize components in it.

25-30 min tutorial

3.1.1. Prerequisites

Read the Getting Started With Replicator document to become familiar with the basics of omni.replicator.

3.2. Getting Started

To generate a synthetic dataset offline, run the following command.

./python.sh standalone_examples/replicator/offline_generation.py

3.3. Running as a SimulationApp

The code for this tutorial differs from the default omni.replicator examples, which are usually executed from the script editor in the Kit GUI. The provided script runs an instance of Omniverse Isaac Sim in headless mode. For this to work, the SimulationApp object must be created before importing any other dependencies (such as omni.replicator.core).

Starting Isaac Sim
from omni.isaac.kit import SimulationApp
import os

# Set rendering parameters and create an instance of kit
CONFIG = {"renderer": "RayTracedLighting", "headless": True, "width": 1024, "height": 1024, "num_frames": 10}
simulation_app = SimulationApp(launch_config=CONFIG)

ENV_URL = "/Isaac/Environments/Simple_Warehouse/full_warehouse.usd"
FORKLIFT_URL = "/Isaac/Props/Forklift/forklift.usd"
PALLET_URL = "/Isaac/Environments/Simple_Warehouse/Props/SM_PaletteA_01.usd"
CARDBOX_URL = "/Isaac/Environments/Simple_Warehouse/Props/SM_CardBoxD_04.usd"
CONE_URL = "/Isaac/Environments/Simple_Warehouse/Props/S_TrafficCone.usd"
SCOPE_NAME = "/MyScope"

import carb
import random
import math
import numpy as np
from pxr import UsdGeom, Usd, Gf, UsdPhysics, PhysxSchema

import omni.usd
from omni.isaac.core import World
from omni.isaac.core.utils import prims
from omni.isaac.core.prims import RigidPrim
from omni.isaac.core.utils.nucleus import get_assets_root_path
from omni.isaac.core.utils.stage import get_current_stage, open_stage
from omni.isaac.core.utils.rotations import euler_angles_to_quat, quat_to_euler_angles, lookat_to_quatf
from omni.isaac.core.utils.bounds import compute_combined_aabb, create_bbox_cache

import omni.replicator.core as rep

3.4. Loading the Environment

The environment is a USD stage. As a first step, the stage is loaded using the helper function open_stage.

Load the stage
def main():
    # Open the environment in a new stage
    print(f"Loading Stage {ENV_URL}")
    open_stage(prefix_with_isaac_asset_server(ENV_URL))

3.5. Creating the Cameras and the Writer

The example creates cameras in two ways: with the Replicator API (rep.create.camera) and with the Isaac Sim API (prims.create_prim). The cameras are used to create render products, which are attached to the built-in BasicWriter. The writer collects the data from the selected annotators (rgb, semantic_segmentation, bounding_box_3d, etc.) and writes it to the given output path. rep.get.prim_at_path gives access to the driver_cam_prim prim wrapped in an OmniGraph node, so that the randomization graph generated by Replicator can randomize it at each step.

Creating the cameras
    driver_cam_prim = prims.create_prim(
        prim_path=f"{SCOPE_NAME}/DriverCamera",
        prim_type="Camera",
        position=driver_cam_pos_gf,
        orientation=look_at_pallet_xyzw,
        attributes={"focusDistance": 400, "focalLength": 24, "clippingRange": (0.1, 10000000)},
    )

    driver_cam_node = rep.get.prim_at_path(str(driver_cam_prim.GetPath()))

    # Camera looking at the pallet
    pallet_cam = rep.create.camera()

    # Camera looking at the forklift from a top view with large min clipping to see the scene through the ceiling
    top_view_cam = rep.create.camera(clipping_range=(6.0, 1000000.0))

Being a built-in writer, BasicWriter is already registered, and can be accessed from the WriterRegistry. The writer is then initialized with the output directory and the selected annotators. Finally, the render products are created from the cameras and attached to the writer.

Creating the writer
    # Initialize and attach writer
    writer = rep.WriterRegistry.get("BasicWriter")
    output_directory = os.getcwd() + "/_output_headless"
    print("Outputting data to ", output_directory)
    writer.initialize(
        output_dir=output_directory,
        rgb=True,
        bounding_box_2d_tight=True,
        semantic_segmentation=True,
        instance_segmentation=True,
        distance_to_image_plane=True,
        bounding_box_3d=True,
        occlusion=True,
        normals=True,
    )

    RESOLUTION = (CONFIG["width"], CONFIG["height"])
    driver_rp = rep.create.render_product(str(driver_cam_prim.GetPrimPath()), RESOLUTION)
    pallet_rp = rep.create.render_product(pallet_cam, RESOLUTION)
    forklift_rp = rep.create.render_product(top_view_cam, RESOLUTION)
    writer.attach([driver_rp, forklift_rp, pallet_rp])

3.6. Domain Randomization

The following snippet shows several randomization options using the Isaac Sim and Replicator APIs. It starts by spawning a forklift at a randomly generated pose using the Isaac Sim API. It then uses the forklift pose to place a pallet in front of it, at a random distance within fixed bounds.

    # Spawn a new forklift at a random pose
    forklift_prim = prims.create_prim(
        prim_path=f"{SCOPE_NAME}/Forklift",
        position=(random.uniform(-20, -2), random.uniform(-1, 3), 0),
        orientation=euler_angles_to_quat([0, 0, random.uniform(0, math.pi)]),
        usd_path=prefix_with_isaac_asset_server(FORKLIFT_URL),
        semantic_label="Forklift",
    )

    # Spawn the pallet in front of the forklift with a random offset on the Y axis
    forklift_tf = omni.usd.get_world_transform_matrix(forklift_prim)
    pallet_offset_tf = Gf.Matrix4d().SetTranslate(Gf.Vec3d(0, random.uniform(-1.2, -2.4), 0))
    pallet_pos_gf = (pallet_offset_tf * forklift_tf).ExtractTranslation()
    forklift_quat_gf = forklift_tf.ExtractRotation().GetQuaternion()
    forklift_quat_xyzw = (forklift_quat_gf.GetReal(), *forklift_quat_gf.GetImaginary())

    pallet_prim = prims.create_prim(
        prim_path=f"{SCOPE_NAME}/Pallet",
        position=pallet_pos_gf,
        orientation=forklift_quat_xyzw,
        usd_path=prefix_with_isaac_asset_server(PALLET_URL),
        semantic_label="Pallet",
    )
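
The pallet placement above composes a local offset with the forklift's world transform. As a rough illustration of the same idea without the USD types, the sketch below rotates a local offset by the forklift's yaw and adds the forklift position. It assumes a rotation about the Z axis only and a counter-clockwise-positive yaw; the function name offset_in_world and the pose values are hypothetical and not part of the example script.

```python
import math


def offset_in_world(position, yaw, local_offset):
    """Return the world-space point at `local_offset` from an object at
    `position` that is rotated by `yaw` radians about the Z axis."""
    ox, oy, oz = local_offset
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    # Rotate the local offset into the world frame, then translate
    world_x = position[0] + ox * cos_y - oy * sin_y
    world_y = position[1] + ox * sin_y + oy * cos_y
    world_z = position[2] + oz
    return (world_x, world_y, world_z)


# Hypothetical forklift pose; the offset lies on the local Y axis as in the snippet above
forklift_position = (-5.0, 1.0, 0.0)
forklift_yaw = math.pi / 2
pallet_position = offset_in_world(forklift_position, forklift_yaw, (0.0, -2.0, 0.0))
```

Because the offset is expressed in the forklift's local frame, the pallet stays on the same side of the forklift regardless of the random yaw.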

After spawning the forklift and the empty pallet, the example runs a short physics simulation by dropping several stacked boxes on a pallet behind the forklift.

def simulate_falling_objects(prim, num_sim_steps=250, num_boxes=8):
    # Create a simulation ready world
    world = World(physics_dt=1.0 / 90.0, stage_units_in_meters=1.0)

    # Choose a random spawn offset relative to the given prim
    prim_tf = omni.usd.get_world_transform_matrix(prim)
    spawn_offset_tf = Gf.Matrix4d().SetTranslate(Gf.Vec3d(random.uniform(-0.5, 0.5), random.uniform(3, 3.5), 0))
    spawn_pos_gf = (spawn_offset_tf * prim_tf).ExtractTranslation()

    # Spawn pallet prim
    # ... (code omitted in this excerpt)

    # Spawn boxes falling on the pallet
    for i in range(num_boxes):
        # Spawn box prim
        cardbox_prim_name = f"SimulatedCardbox_{i}"
        box_prim = prims.create_prim(
            prim_path=f"{SCOPE_NAME}/{cardbox_prim_name}",
            usd_path=prefix_with_isaac_asset_server(CARDBOX_URL),
            semantic_label="Cardbox",
        )

        # Add the height of the box to the current spawn height
        curr_spawn_height += bb_cache.ComputeLocalBound(box_prim).GetRange().GetSize()[2] * 1.1

        # Wrap the cardbox prim into a rigid prim to be able to simulate it
        box_rigid_prim = RigidPrim(
            prim_path=str(box_prim.GetPrimPath()),
            name=cardbox_prim_name,
            position=spawn_pos_gf + Gf.Vec3d(random.uniform(-0.2, 0.2), random.uniform(-0.2, 0.2), curr_spawn_height),
            orientation=euler_angles_to_quat([0, 0, random.uniform(0, math.pi)]),
        )

        # Make sure physics are enabled on the rigid prim
        box_rigid_prim.enable_rigid_body_physics()

        # Register rigid prim with the scene
        world.scene.add(box_rigid_prim)

    # Reset world after adding simulated assets for physics handles to be propagated properly
    world.reset()

    # Simulate the world for the given number of steps or until the highest box stops moving
    last_box = world.scene.get_object(f"SimulatedCardbox_{num_boxes - 1}")
    for i in range(num_sim_steps):
        world.step(render=False)
        if last_box and np.linalg.norm(last_box.get_linear_velocity()) < 0.001:
            print(f"Simulation stopped after {i} steps")
            break
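
The final loop in simulate_falling_objects follows a common pattern: step the simulation up to a fixed budget and stop early once the tracked object is nearly at rest. The sketch below isolates that pattern in plain Python. The names run_until_settled and DampedBox are hypothetical stand-ins for illustration; they are not part of Isaac Sim.

```python
def run_until_settled(step, get_speed, max_steps=250, threshold=0.001):
    """Call `step()` up to `max_steps` times, stopping early once
    `get_speed()` falls below `threshold`. Returns the steps taken."""
    for i in range(max_steps):
        step()
        if get_speed() < threshold:
            return i + 1
    return max_steps


class DampedBox:
    """Toy stand-in for a simulated box whose speed halves each step."""

    def __init__(self, speed):
        self.speed = speed

    def step(self):
        self.speed *= 0.5


box = DampedBox(speed=1.0)
steps_taken = run_until_settled(box.step, lambda: box.speed)
```

Capping the number of steps keeps the script from hanging if the tracked box never comes to rest.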

The example also registers several randomizers using the Replicator API. The first uses rep.randomizer.scatter_2d to scatter boxes randomly on the surface of the pallet in front of the forklift. The same randomizer also randomizes the materials of the boxes using rep.randomizer.materials. The generated randomization graph is then registered using rep.randomizer.register.

Domain Randomization
# Randomize boxes materials and their location on the surface of the given prim
def register_scatter_boxes(prim):
    # Calculate the bounds of the prim to create a scatter plane of its size
    bb_cache = create_bbox_cache()
    bbox3d_gf = bb_cache.ComputeLocalBound(prim)
    prim_tf_gf = omni.usd.get_world_transform_matrix(prim)

    # Calculate the bounds of the prim
    bbox3d_gf.Transform(prim_tf_gf)
    range_size = bbox3d_gf.GetRange().GetSize()

    # Get the quaternion of the prim from USD (the tuple below is ordered w, x, y, z despite the variable name)
    prim_quat_gf = prim_tf_gf.ExtractRotation().GetQuaternion()
    prim_quat_xyzw = (prim_quat_gf.GetReal(), *prim_quat_gf.GetImaginary())

    # Create a plane on the pallet to scatter the boxes on
    plane_scale = (range_size[0] * 0.8, range_size[1] * 0.8, 1)
    plane_pos_gf = prim_tf_gf.ExtractTranslation() + Gf.Vec3d(0, 0, range_size[2])
    plane_rot_euler_deg = quat_to_euler_angles(np.array(prim_quat_xyzw), degrees=True)
    scatter_plane = rep.create.plane(
        scale=plane_scale, position=plane_pos_gf, rotation=plane_rot_euler_deg, visible=False
    )

    cardbox_mats = [
        prefix_with_isaac_asset_server("/Isaac/Environments/Simple_Warehouse/Materials/MI_PaperNotes_01.mdl"),
        prefix_with_isaac_asset_server("/Isaac/Environments/Simple_Warehouse/Materials/MI_CardBoxB_05.mdl"),
    ]

    def scatter_boxes():
        cardboxes = rep.create.from_usd(
            prefix_with_isaac_asset_server(CARDBOX_URL), semantics=[("class", "Cardbox")], count=5
        )
        with cardboxes:
            rep.randomizer.scatter_2d(scatter_plane, check_for_collisions=True)
            rep.randomizer.materials(cardbox_mats)
        return cardboxes.node

    rep.randomizer.register(scatter_boxes)
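
To build intuition for scattering with collision checks, the sketch below places points on a rectangle using simple rejection sampling. This is only a conceptual illustration and is not how rep.randomizer.scatter_2d is implemented; the function scatter_on_plane, the pallet dimensions, and the minimum distance are all assumptions made for the example.

```python
import math
import random


def scatter_on_plane(count, size_x, size_y, min_distance, rng, max_tries=1000):
    """Sample up to `count` points on a plane centered at the origin so that
    no two points are closer than `min_distance` (rejection sampling)."""
    points = []
    for _ in range(max_tries):
        if len(points) == count:
            break
        candidate = (
            rng.uniform(-size_x / 2, size_x / 2),
            rng.uniform(-size_y / 2, size_y / 2),
        )
        # Keep the candidate only if it is far enough from every accepted point
        if all(math.dist(candidate, p) >= min_distance for p in points):
            points.append(candidate)
    return points


# Hypothetical pallet footprint in meters, shrunk to 80% as in the example above
rng = random.Random(0)
box_positions = scatter_on_plane(5, 1.2 * 0.8, 0.8 * 0.8, min_distance=0.2, rng=rng)
```

Shrinking the scatter area relative to the pallet, as the example does with the 0.8 factor, leaves a margin so that objects centered near the edge do not hang over it.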

The next randomization example calculates the corners of the bounding box of the forklift together with the pallet and uses the corners as a predefined list of locations to randomly place a traffic cone.

# Randomly place cones from calculated locations around the working area (combined bounds) of the forklift and pallet
def register_cone_placement(forklift_prim, pallet_prim):
    # Helper function to get the combined bounds of the forklift and pallet
    bb_cache = create_bbox_cache()
    combined_range_arr = compute_combined_aabb(bb_cache, [forklift_prim.GetPrimPath(), pallet_prim.GetPrimPath()])

    min_x = float(combined_range_arr[0])
    min_y = float(combined_range_arr[1])
    min_z = float(combined_range_arr[2])
    max_x = float(combined_range_arr[3])
    max_y = float(combined_range_arr[4])
    corners = [(min_x, min_y, min_z), (max_x, min_y, min_z), (min_x, max_y, min_z), (max_x, max_y, min_z)]

    def place_cones():
        cones = rep.create.from_usd(prefix_with_isaac_asset_server(CONE_URL), semantics=[("class", "TrafficCone")])
        with cones:
            rep.modify.pose(position=rep.distribution.sequence(corners))
        return cones.node

    rep.randomizer.register(place_cones)
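
The corner computation can be tried out independently of Isaac Sim. The sketch below takes bounds in the [min_x, min_y, min_z, max_x, max_y, max_z] layout used above and steps through the four bottom corners in order. It assumes that a sequence distribution visits the list in order and wraps around; the helper footprint_corners and the bounds are hypothetical.

```python
from itertools import cycle, islice


def footprint_corners(aabb):
    """Return the four bottom corners of an axis-aligned bounding box given
    as [min_x, min_y, min_z, max_x, max_y, max_z]."""
    min_x, min_y, min_z, max_x, max_y, _max_z = (float(v) for v in aabb)
    return [
        (min_x, min_y, min_z),
        (max_x, min_y, min_z),
        (min_x, max_y, min_z),
        (max_x, max_y, min_z),
    ]


# Hypothetical combined bounds of the forklift and pallet
corners = footprint_corners([-6.0, -1.0, 0.0, -2.0, 3.0, 2.5])

# Visit the corners in order, wrapping around after the last one
positions_for_six_frames = list(islice(cycle(corners), 6))
```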

The following example randomizes light parameters and their placement above the forklift and the pallet area.

# Randomize lights around the scene
def register_lights_placement(forklift_prim, pallet_prim):
    bb_cache = create_bbox_cache()
    combined_range_arr = compute_combined_aabb(bb_cache, [forklift_prim.GetPrimPath(), pallet_prim.GetPrimPath()])
    pos_min = (combined_range_arr[0], combined_range_arr[1], 6)
    pos_max = (combined_range_arr[3], combined_range_arr[4], 7)

    def randomize_lights():
        lights = rep.create.light(
            light_type="Sphere",
            color=rep.distribution.uniform((0.2, 0.1, 0.1), (0.9, 0.8, 0.8)),
            intensity=rep.distribution.uniform(500, 2000),
            position=rep.distribution.uniform(pos_min, pos_max),
            scale=rep.distribution.uniform(5, 10),
            count=3,
        )
        return lights.node

    rep.randomizer.register(randomize_lights)
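
When a uniform distribution is given tuples as bounds, each component is drawn from its own range. The plain-Python sketch below mimics that behavior for the light positions; it is an illustration of the idea rather than Replicator's implementation, and the helper sample_uniform and the bounds are assumptions.

```python
import random


def sample_uniform(lower, upper, rng):
    """Sample each component independently between `lower` and `upper`."""
    return tuple(rng.uniform(lo, hi) for lo, hi in zip(lower, upper))


rng = random.Random(42)
# Hypothetical bounds: X/Y from a combined bounding box, Z between 6 and 7
pos_min = (-6.0, -1.0, 6.0)
pos_max = (-2.0, 3.0, 7.0)
light_positions = [sample_uniform(pos_min, pos_max, rng) for _ in range(3)]
```

Fixing the Z range to a narrow band, as the example does, keeps the lights above the working area while still varying their horizontal placement.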

Similarly to the above examples, Replicator has support for many other randomizations. For more information, please refer to Replicator’s randomizer examples tutorials.

Finally, the registered randomizations are triggered each frame, together with the camera movements. One camera is looking at the pallet in front of the forklift and orbiting it, while the other camera is looking at the whole scene from various heights above.

Domain Randomization
with rep.trigger.on_frame(num_frames=CONFIG["num_frames"]):
    rep.randomizer.scatter_boxes()
    rep.randomizer.place_cones()
    rep.randomizer.randomize_lights()

    pallet_cam_min = (pallet_pos_gf[0] - 2, pallet_pos_gf[1] - 2, 2)
    pallet_cam_max = (pallet_pos_gf[0] + 2, pallet_pos_gf[1] + 2, 4)
    with pallet_cam:
        rep.modify.pose(
            position=rep.distribution.uniform(pallet_cam_min, pallet_cam_max),
            look_at=str(pallet_prim.GetPrimPath()),
        )

    top_view_cam_min = (forklift_pos_gf[0], forklift_pos_gf[1], 9)
    top_view_cam_max = (forklift_pos_gf[0], forklift_pos_gf[1], 11)
    with top_view_cam:
        rep.modify.pose(
            position=rep.distribution.uniform(top_view_cam_min, top_view_cam_max),
            rotation=rep.distribution.uniform((0, -90, 0), (0, -90, 180)),
        )

    driver_cam_min = (driver_cam_pos_gf[0], driver_cam_pos_gf[1], driver_cam_pos_gf[2] - 0.25)
    driver_cam_max = (driver_cam_pos_gf[0], driver_cam_pos_gf[1], driver_cam_pos_gf[2] + 0.25)
    with driver_cam_node:
        rep.modify.pose(
            position=rep.distribution.uniform(driver_cam_min, driver_cam_max),
            look_at=str(pallet_prim.GetPrimPath()),
        )

3.7. Running the Script

The run_orchestrator function triggers the randomizations and the data writing by starting the process with rep.orchestrator.run(). It then waits until the requested number of frames has been processed. Afterwards, rep.BackendDispatch.wait_until_done() waits until all data is written to disk, and rep.orchestrator.stop() finishes the process before the SimulationApp is closed.

Note

The resulting data is saved in the _output_headless subfolder of the directory from which the process was started.

# Starts replicator and waits until all data was successfully written
def run_orchestrator():
    rep.orchestrator.run()

    # Wait until started
    while not rep.orchestrator.get_is_started():
        simulation_app.update()

    # Wait until stopped
    while rep.orchestrator.get_is_started():
        simulation_app.update()

    rep.BackendDispatch.wait_until_done()
    rep.orchestrator.stop()
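
After a run, it can be useful to check that data was actually written. The sketch below counts the files under an output directory by extension. It makes no assumptions about the file names or folder layout that the writer produces; the helper summarize_output is hypothetical and not part of the example script.

```python
from collections import Counter
from pathlib import Path


def summarize_output(output_dir):
    """Count the files under `output_dir` (recursively) by extension."""
    counts = Counter()
    for path in Path(output_dir).rglob("*"):
        if path.is_file():
            counts[path.suffix or "<none>"] += 1
    return dict(counts)
```

For example, calling summarize_output on the _output_headless folder after the script finishes gives a quick overview of how many files of each type were produced.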

3.8. Summary

This tutorial covered the following topics:

  1. Starting a SimulationApp instance of Omniverse Isaac Sim to work with Replicator

  2. Loading a stage and various assets to random poses using plain Isaac Sim API

  3. Setting up cameras and the writer to write out data

  4. Registering randomizations with Replicator

  5. Using orchestrator to run the data collection

3.8.1. Next Steps

One possible use for the created data is with the TAO Toolkit.

Once the generated synthetic data is in KITTI format, you can use the TAO Toolkit to train a model. TAO provides segmentation, classification, and object detection models. This example uses object detection with the Detectnet_V2 model as a use case.

To get started with TAO, follow the set-up instructions. Then, activate the virtual environment and download the Jupyter Notebooks as explained in detail here.

TAO uses Jupyter notebooks to guide you through the training process. In the folder cv_samples_v1.3.0, you will find notebooks for multiple models. You can use any of the object detection networks for this use case, but this example uses Detectnet_V2.

In the detectnet_v2 folder, you will find the Jupyter notebook and the specs folder. The TAO Detectnet_V2 documentation goes into more detail about this sample. TAO works with configuration files that can be found in the specs folder. Here, you need to modify the specs to refer to the generated synthetic data as the input.

To prepare the data, you need to run the following command.

tao detectnet_v2 dataset-convert [-h] -d DATASET_EXPORT_SPEC -o OUTPUT_FILENAME [-f VALIDATION_FOLD]

This command is included in the Jupyter notebook with a sample configuration. Modify the spec file to match the folder structure of your synthetic data. The converted data is in TFRecord format and ready for training. Likewise, update the training spec file to point to the synthetic data and to list the classes being detected.

tao detectnet_v2 train [-h] -k <key>
                        -r <result directory>
                        -e <spec_file>
                        [-n <name_string_for_the_model>]
                        [--gpus <num GPUs>]
                        [--gpu_index <comma separate gpu indices>]
                        [--use_amp]
                        [--log_file <log_file>]

For any questions regarding the TAO Toolkit, refer to the TAO documentation, which goes into further detail.

3.8.2. Further Learning

To learn how to use Omniverse Isaac Sim to create data sets in an interactive manner, see the Synthetic Data Recorder, and then visualize them with the Synthetic Data Visualizer.