10.9.6.3.3. Writer Control

10.9.6.3.3.1. Overview

Omni.Replicator.Character provides built-in writers, as well as the ability for users to create their own custom writers.

10.9.6.3.3.2. Built-in Writers

10.9.6.3.3.2.1. Tao Writer

10.9.6.3.3.2.1.1. Overview

TaoWriter is a writer that publishes aligned images with detected character data. It produces RGB images from the camera, along with corresponding labels containing 2D bounding box, 3D bounding box, and segmentation data for people.
The labels from TaoWriter contain bounding box coordinates for each character in the scene and follow the KITTI 3D annotation format.

10.9.6.3.3.2.1.2. Parameters

Tao Writer exposes the following parameters:

  • rgb: controls whether the rgb annotator is output

  • bbox: controls whether character data (2D bounding box, 3D bounding box, joint positions, etc.) is output

  • semantic_segmentation: controls whether semantic segmentation is output
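
These parameters can be wired up with the same get/initialize/attach pattern shown for RTSPWriter later in this section. The snippet below is a hypothetical sketch: the registered name "TaoWriter" and the keyword arguments are assumed from the parameter list above, and it only runs inside an Omniverse app.

```python
import omni.replicator.core as rep

# Hypothetical initialization; keyword names follow the parameter list above
writer = rep.WriterRegistry.get("TaoWriter")
writer.initialize(rgb=True, bbox=True, semantic_segmentation=True)
writer.attach([rep.create.render_product("/World/Camera", (1280, 720))])
```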

10.9.6.3.3.2.1.3. Output

  • Rgb: the current frame's render product from the camera's viewport

  • Colored Semantic Segmentation: every character is colored with a distinct color

  • Character Information

    For each character that passes the width/height threshold check, Tao Writer outputs the following data.

    • Semantic label: character’s semantic label

    • 2d tight bounding box: x_max, x_min, y_max, y_min

    • 2d loose bounding box: x_max, x_min, y_max, y_min

    • Character 3d position: the character's position in Isaac Sim world space

    • Character joint information

      • Joint Order: [‘Pelvis’, ‘Head’, ‘Left_Shoulder’, ‘Left_Elbow’, ‘Left_Hand’, ‘Right_Shoulder’, ‘Right_Elbow’, ‘Right_Hand’, ‘Left_Thigh’, ‘Left_Knee’, ‘Left_Foot’, ‘Left_Toe’, ‘Right_Thigh’, ‘Right_Knee’, ‘Right_Foot’, ‘Right_Toe’]

      • 3d_joint_position: for each joint, its translation in x, y, z

      • 2d_joint_position: for each joint, its 2D position on the screen

    • Character 3d bounding_box information

      • 3d bounding box scale: scale value in x, y, z

      • 3d bounding box camera space location: bounding box location in camera space

      • 3d bounding box rotation in degrees in camera space

      • 3d bounding box's vertices' 2d projection on the screen
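
For illustration, the tight 2D box fields above can be computed as the extrema of a character's projected 2D points. The helper below is a hypothetical sketch, not part of Tao Writer:

```python
from typing import Dict, List, Tuple

def tight_bbox_2d(points_2d: List[Tuple[float, float]]) -> Dict[str, float]:
    """Compute a tight 2D bounding box (x_max, x_min, y_max, y_min)
    from a character's projected 2D points (e.g. joint positions)."""
    xs = [p[0] for p in points_2d]
    ys = [p[1] for p in points_2d]
    return {"x_max": max(xs), "x_min": min(xs),
            "y_max": max(ys), "y_min": min(ys)}

# Example: three projected joints
box = tight_bbox_2d([(120.0, 40.0), (150.0, 200.0), (110.0, 90.0)])
print(box)  # {'x_max': 150.0, 'x_min': 110.0, 'y_max': 200.0, 'y_min': 40.0}
```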

10.9.6.3.3.2.2. Lidar Fusion Writer

10.9.6.3.3.2.2.1. Overview

Lidar Fusion Writer is a writer that publishes aligned images and point cloud data with 3D bounding box annotations. It produces RGB images from the camera, corresponding point cloud data of the same scene from the lidar, extrinsic and intrinsic calibration matrices, and labels with 3D bounding boxes on people.
The labels from Lidar Fusion Writer contain bounding box coordinates for each person in the scene and follow the KITTI 3D annotation format: (center_x, center_y, center_z, scale_x, scale_y, scale_z, rotation_x, rotation_y, rotation_z). The output data from Lidar Fusion Writer can be used for any 3D people detection task or any image-lidar fusion task.
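
As an illustrative sketch of consuming these labels, the nine values can be unpacked into named fields. The on-disk layout and the helper below are assumptions, not the writer's actual API:

```python
from dataclasses import dataclass

@dataclass
class Box3D:
    """One KITTI-style 3D box annotation: center, scale, rotation."""
    center_x: float
    center_y: float
    center_z: float
    scale_x: float
    scale_y: float
    scale_z: float
    rotation_x: float
    rotation_y: float
    rotation_z: float

def parse_box3d(values):
    """Unpack the nine-value tuple documented above into a Box3D."""
    if len(values) != 9:
        raise ValueError("expected 9 values: 3 center, 3 scale, 3 rotation")
    return Box3D(*[float(v) for v in values])

box = parse_box3d([1.0, 2.0, 0.5, 0.6, 0.6, 1.8, 0.0, 0.0, 1.57])
print(box.scale_z)  # 1.8
```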

10.9.6.3.3.2.2.2. Parameters

Lidar Fusion Writer exposes the following parameters:

  • rgb: controls whether the rgb annotator is output

  • bbox: controls whether character data (2D bounding box, 3D bounding box camera-space position and rotation, etc.) is output

  • lidar: controls whether the point cloud data is output

10.9.6.3.3.2.2.3. Output

  • Camera Information: extrinsic and intrinsic calibration matrices

  • Rgb: the current frame's render product from the camera's viewport

  • Lidar Data: point cloud data captured from the lidar

  • Character Information:

For each character that passes the width/height threshold check, the Lidar Fusion Writer outputs the following data.

  • Semantic Label: character’s semantic label

  • 2d tight bounding box: x_max, x_min, y_max, y_min

  • Character_3d_bounding_box information:

    • 3d bounding box scale: scale value in x, y, z

    • 3d bounding box camera space location: bounding box location in camera space

    • 3d bounding box rotation in degrees in camera space

    • 3d bounding box's vertices' 2d projection on the screen
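
The extrinsic and intrinsic matrices above are what tie the two modalities together. Below is a minimal sketch of projecting a lidar point into the image; the helpers are hypothetical and assume an ideal pinhole camera with no distortion:

```python
def mat_vec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(mi[j] * v[j] for j in range(len(v))) for mi in m]

def project_lidar_point(point_lidar, extrinsic, intrinsic):
    """Project a 3D lidar point to pixel coordinates using a 4x4
    extrinsic (lidar -> camera) and a 3x3 intrinsic matrix."""
    p = list(point_lidar) + [1.0]        # homogeneous coordinates
    cam = mat_vec(extrinsic, p)[:3]      # lidar frame -> camera frame
    u, v, w = mat_vec(intrinsic, cam)    # camera frame -> image plane
    return u / w, v / w                  # perspective divide

# Identity extrinsic (frames coincide), fx = fy = 500, cx = 320, cy = 240
extrinsic = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
intrinsic = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
print(project_lidar_point([0.0, 0.0, 2.0], extrinsic, intrinsic))  # (320.0, 240.0)
```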

10.9.6.3.3.2.3. Objectron Writer

10.9.6.3.3.2.3.1. Overview

Objectron writer outputs a 3D label file in JSON format. This ground truth file is useful for training 3D object detection models or 6-DoF pose estimation. Each file records the camera poses, as well as the 3D bounding boxes of the objects of interest, both in the 2D image plane and the 3D camera space. Additionally, it includes the pose and class of each object.

10.9.6.3.3.2.3.2. Parameters

Objectron Writer exposes the following parameters:

  • rgb: controls whether the rgb annotator is output

  • bbox: controls whether character data (3D bounding box scale, camera-space rotation and translation, etc.) is output

  • semantic_segmentation: controls whether semantic segmentation is output

  • distance_to_camera: controls whether the depth image is output

10.9.6.3.3.2.3.3. Output

  • Rgb: the current frame's render product from the camera's viewport

  • Colored Semantic Segmentation: every character is colored with a distinct color

  • Distance to the Camera: depth image

  • Camera Information:

    • Camera Projection Matrix: camera projection matrix

    • Camera View: camera view matrix

    • Camera Intrinsic Data: cx, cy, fx, fy

    • Viewport Width/Height

    • Camera's translation: x, y, z in Isaac Sim world coordinates

    • Camera's rotation as a quaternion in xyzw order

  • Character Information

    For each character that passes the width/height threshold check, the Objectron Writer outputs the following data.

    • Semantic label: character’s semantic label

    • Character_3d_bounding_box information:

      • 3d bounding box camera space location: bounding box location in camera space

      • 3d bounding box rotation in camera space: quaternion xyzw format

      • 3d bounding box's vertices' 2d projection on the screen

      • 3d bounding box's vertices' 3d position in camera space
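
A sketch of consuming the quaternion field above (assumes a unit quaternion in xyzw order; this helper is illustrative and not part of the writer):

```python
def quat_xyzw_to_matrix(q):
    """Convert a unit quaternion in xyzw order (the convention used by
    the labels above) to a 3x3 rotation matrix."""
    x, y, z, w = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

# The identity rotation: zero vector part, unit scalar part
print(quat_xyzw_to_matrix((0.0, 0.0, 0.0, 1.0)))
```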

10.9.6.3.3.2.4. RTSP Writer

10.9.6.3.3.2.4.1. Overview

RTSPWriter is a custom writer that publishes annotations of attached render products to an RTSP server. It tracks a dictionary of render products (HydraTexture), keyed by the combination of the annotator name and the render product's prim path. Each render product is recorded as an instance of RTSPCamera. The published RTSP URL of each RTSPCamera instance is constructed by appending the render product's camera prim path and the annotator name to the base output directory.

10.9.6.3.3.2.4.2. Annotators

These annotators are supported:

  • LdrColor / rgb

  • semantic_segmentation

  • instance_id_segmentation

  • instance_segmentation

  • DiffuseAlbedo

  • Roughness

  • EmissionAndForegroundMask

  • distance_to_camera

  • distance_to_image_plane

  • DepthLinearized

  • HdrColor

10.9.6.3.3.2.4.3. Parameters

RTSPWriter accepts the following parameters:

  • device: an integer specifying the ID of the GPU device on which NVENC operates. The annotator data of all attached render products is encoded on the same GPU device.

  • annotator: a string specifying the annotator of all attached render products to be streamed. The accepted values are LdrColor / rgb, semantic_segmentation, instance_id_segmentation, instance_segmentation, HdrColor, distance_to_camera, and distance_to_image_plane. The string value must match exactly.

  • output_dir: a string specifying the base RTSP URL in the form rtsp://<RTSP server hostname>:8554/<base topic name>. Given a render product and the annotator, the full RTSP URL is formatted as rtsp://<RTSP server hostname>:8554/<base topic name>_<camera prim path>_<annotator>. For example, if the RTSP server hostname is my_rtsp_server.com, the base topic name is RTSPWriter, the camera prim path is /World/Cameras/Camera_01, and the annotator is rgb, then the full RTSP URL is rtsp://my_rtsp_server.com:8554/RTSPWriter_World_Cameras_Camera_01_LdrColor
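
The URL scheme described above can be sketched as a small helper. This is an illustrative reconstruction of the documented naming, not the writer's actual code:

```python
def full_rtsp_url(output_dir: str, camera_prim_path: str, annotator: str) -> str:
    """Build the full RTSP URL from the base URL (output_dir), the camera
    prim path, and the annotator name; '/' in the prim path becomes '_'."""
    path_part = camera_prim_path.replace("/", "_")  # "/World/.../Camera_01" -> "_World_..._Camera_01"
    return f"{output_dir}{path_part}_{annotator}"

url = full_rtsp_url("rtsp://my_rtsp_server.com:8554/RTSPWriter",
                    "/World/Cameras/Camera_01", "LdrColor")
print(url)  # rtsp://my_rtsp_server.com:8554/RTSPWriter_World_Cameras_Camera_01_LdrColor
```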

10.9.6.3.3.2.4.4. Instructions

Before RTSPWriter can stream camera viewports in Omniverse, the following steps are required.

  1. Install and start an RTSP server

    The Writer sends all streams to the same RTSP server. The server can be local. One candidate RTSP server is MediaMTX. Here are the steps to install and start the Linux version of the server.
    1. Download and extract a standalone binary from the release page

      mkdir mediamtx; cd mediamtx
      wget https://github.com/bluenviron/mediamtx/releases/download/v1.1.1/mediamtx_v1.1.1_linux_amd64.tar.gz
      tar xvzf mediamtx_v1.1.1_linux_amd64.tar.gz
      
    2. Start the server: ./mediamtx

  2. Install FFmpeg

    Install FFmpeg on the same machine where RTSPWriter is executed: sudo apt update && sudo apt install -y ffmpeg

  3. Register and initialize RTSPWriter

    As an example, the code snippet below prepares to send the rgb (equivalently, LdrColor) annotator of the render products to rtsp://<RTSP server hostname>:8554. The base topic name is RTSPWriter. The full topic name of the RTSP stream is constructed as RTSPWriter_<camera prim path>_<annotator>. If the annotator data format is supported by NVENC, frame encoding is performed on GPU device=0; otherwise, the device= option has no effect.

    import omni.replicator.core as rep
    ...
    render_products = []
    render_products.append(rep.create.render_product(...))
    ...
    render_products.append(rep.create.render_product(...))
    ...
    writer = rep.WriterRegistry.get("RTSPWriter")
    writer.initialize(device=0, annotator="rgb", output_dir="rtsp://<RTSP server hostname>:8554/RTSPWriter")
    writer.attach(render_products)
    

Assuming the camera prim path of a render product is /World/Cameras/Camera_01, the full topic name of the RTSP stream is RTSPWriter_World_Cameras_Camera_01_LdrColor.

10.9.6.3.3.3. Custom Writers

ORC supports custom writers created by users, either through the UI or directly through the config file.

  • To enable it from the UI, select “Custom” from the dropdown in the “Replicator Setting” panel, then enter the writer’s name and its input parameters in the text boxes.

  • To enable it from the config file, simply put the writer’s name and parameters in the “replicator” section.

ORC obtains writers by name from the Replicator extension, so writers are expected to be registered with Replicator beforehand. The writers described above (provided by ORC) are registered when ORC is loaded. Custom writers must be registered by the user. Please follow the Replicator documentation on custom writers to create and register them.
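
As a sketch of that registration step, the usual Replicator pattern is to subclass Writer and register the class with WriterRegistry. The writer name, parameters, and output layout below are illustrative, and the snippet only runs inside an Omniverse app with omni.replicator.core available:

```python
import omni.replicator.core as rep
from omni.replicator.core import AnnotatorRegistry, BackendDispatch, Writer

class MyCustomWriter(Writer):  # hypothetical example writer
    def __init__(self, output_dir: str):
        self._frame_id = 0
        # Backend that writes files under output_dir
        self.backend = BackendDispatch({"paths": {"out_dir": output_dir}})
        # Annotators this writer consumes
        self.annotators = [AnnotatorRegistry.get_annotator("rgb")]

    def write(self, data):
        # Called once per captured frame with the annotator data
        self.backend.write_image(f"rgb_{self._frame_id}.png", data["rgb"])
        self._frame_id += 1

# Register the class so ORC can look the writer up by name ("MyCustomWriter")
rep.WriterRegistry.register(MyCustomWriter)
```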

10.9.6.3.3.4. Notes

10.9.6.3.3.4.1. Width/Height Threshold Checking

To make sure characters are not mostly occluded, the following test is applied to select only valid characters.
We define visibility conditions in height and width; a character is labeled when the applicable conditions below are met.
For occluded objects (objects that are blocked by another object within the camera frame), both the height and width visibility requirements must be satisfied:
  • Visibility in height

    • If the head is visible and at least 20% of the height is visible, label the object

    • If the head is not visible, label the object if at least 60% of the height is visible

  • Visibility in width

    • Label the object if more than 60% of the body width is visible
For truncated objects (objects that are cut off by the camera frame), EITHER the height conditions OR the width condition must be satisfied:
  • Visibility in height

    • If the head is visible and at least 20% of the height is visible, label the object

    • If the head is not visible, label the object if at least 60% of the height is visible

  • Visibility in width

    • Label the object if more than 60% of the body width is visible
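
The occlusion and truncation rules above can be summarized as a small predicate. The sketch below is illustrative, not the extension's actual implementation:

```python
def height_visible_ok(head_visible: bool, height_visible_frac: float) -> bool:
    """Height condition: 20% visibility suffices when the head is visible,
    otherwise 60% is required."""
    return height_visible_frac >= (0.2 if head_visible else 0.6)

def width_visible_ok(width_visible_frac: float) -> bool:
    """Width condition: more than 60% of the body width must be visible."""
    return width_visible_frac > 0.6

def should_label(head_visible, height_frac, width_frac, truncated):
    """Occluded characters must pass BOTH conditions; truncated characters
    (cut off by the camera frame) need only EITHER one."""
    h = height_visible_ok(head_visible, height_frac)
    w = width_visible_ok(width_frac)
    return (h or w) if truncated else (h and w)

# Occluded, head visible, 25% height but only 50% width -> not labeled
print(should_label(True, 0.25, 0.5, truncated=False))  # False
# Truncated with the same visibility -> labeled (height condition passes)
print(should_label(True, 0.25, 0.5, truncated=True))   # True
```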