10.9.6.3.3. Writer Control

10.9.6.3.3.1. Overview

Omni.Replicator.Character provides built-in writers, as well as the ability for users to create their own custom writers.

10.9.6.3.3.2. Built-in Writers

10.9.6.3.3.2.1. Tao Writer

10.9.6.3.3.2.1.1. Overview

TaoWriter is a writer that publishes aligned images with detected character data. It produces RGB images from the camera, along with corresponding labels containing 2D bounding box, 3D bounding box, and segmentation data for people.
The labels from TaoWriter contain bounding box coordinates for each character in the scene and follow the KITTI 3D annotation format.

10.9.6.3.3.2.1.2. Parameters

Tao Writer exposes the following parameters:

  • rgb: controls whether the rgb annotator is output

  • bbox: controls whether character data (2D bounding box, 3D bounding box, joint positions, etc.) is output

  • semantic_segmentation: controls whether semantic segmentation is output
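
These parameters can be wired up with the same get/initialize/attach pattern shown for RTSPWriter later in this section. The snippet below is a hypothetical sketch: the registered name "TaoWriter" and the keyword arguments are assumed from the parameter list above, and it only runs inside an Omniverse app.

```python
import omni.replicator.core as rep

# Hypothetical initialization; keyword names follow the parameter list above
writer = rep.WriterRegistry.get("TaoWriter")
writer.initialize(rgb=True, bbox=True, semantic_segmentation=True)
writer.attach([rep.create.render_product("/World/Camera", (1280, 720))])
```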

10.9.6.3.3.2.1.3. Output

  • Rgb: the current frame's render product from the camera's viewport

  • Colored Semantic Segmentation: every character is colored with a distinct color

  • Character Information

    For each character that passes the width/height threshold check, Tao Writer outputs the following data.

    • Semantic label: character’s semantic label

    • 2d tight bounding box: x_max, x_min, y_max, y_min

    • 2d loose bounding box: x_max, x_min, y_max, y_min

    • Character 3d position: the character's position in Isaac Sim world space

    • Character joint information

      • Joint Order: [‘Pelvis’, ‘Head’, ‘Left_Shoulder’, ‘Left_Elbow’, ‘Left_Hand’, ‘Right_Shoulder’, ‘Right_Elbow’, ‘Right_Hand’, ‘Left_Thigh’, ‘Left_Knee’, ‘Left_Foot’, ‘Left_Toe’, ‘Right_Thigh’, ‘Right_Knee’, ‘Right_Foot’, ‘Right_Toe’]

      • 3d_joint_position: for each joint, its translation in x, y, z

      • 2d_joint_position: for each joint, its 2D position on the screen

    • Character 3d bounding_box information

      • 3d bounding box scale: scale value in x, y, z

      • 3d bounding box camera space location: bounding box location in camera space

      • 3d bounding box rotation in degrees in camera space

      • 3d bounding box's vertices' 2d projection on the screen
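
For illustration, the tight 2D box fields above can be computed as the extrema of a character's projected 2D points. The helper below is a hypothetical sketch, not part of Tao Writer:

```python
from typing import Dict, List, Tuple

def tight_bbox_2d(points_2d: List[Tuple[float, float]]) -> Dict[str, float]:
    """Compute a tight 2D bounding box (x_max, x_min, y_max, y_min)
    from a character's projected 2D points (e.g. joint positions)."""
    xs = [p[0] for p in points_2d]
    ys = [p[1] for p in points_2d]
    return {"x_max": max(xs), "x_min": min(xs),
            "y_max": max(ys), "y_min": min(ys)}

# Example: three projected joints
box = tight_bbox_2d([(120.0, 40.0), (150.0, 200.0), (110.0, 90.0)])
print(box)  # {'x_max': 150.0, 'x_min': 110.0, 'y_max': 200.0, 'y_min': 40.0}
```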

10.9.6.3.3.2.2. Lidar Fusion Writer

10.9.6.3.3.2.2.1. Overview

Lidar Fusion Writer is a writer that publishes aligned images and point cloud data with 3D bounding box annotations. It produces RGB images from the camera, corresponding point cloud data of the same scene from the lidar, extrinsic and intrinsic calibration matrices, and labels with 3D bounding boxes on people.
The labels from Lidar Fusion Writer contain bounding box coordinates for each person in the scene and follow the KITTI 3D annotation format: (center_x, center_y, center_z, scale_x, scale_y, scale_z, rotation_x, rotation_y, rotation_z). The output data from Lidar Fusion Writer can be used for any 3D people detection task or any image-lidar fusion task.
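
As an illustrative sketch of consuming these labels, the nine values can be unpacked into named fields. The on-disk layout and the helper below are assumptions, not the writer's actual API:

```python
from dataclasses import dataclass

@dataclass
class Box3D:
    """One KITTI-style 3D box annotation: center, scale, rotation."""
    center_x: float
    center_y: float
    center_z: float
    scale_x: float
    scale_y: float
    scale_z: float
    rotation_x: float
    rotation_y: float
    rotation_z: float

def parse_box3d(values):
    """Unpack the nine-value tuple documented above into a Box3D."""
    if len(values) != 9:
        raise ValueError("expected 9 values: 3 center, 3 scale, 3 rotation")
    return Box3D(*[float(v) for v in values])

box = parse_box3d([1.0, 2.0, 0.5, 0.6, 0.6, 1.8, 0.0, 0.0, 1.57])
print(box.scale_z)  # 1.8
```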

10.9.6.3.3.2.2.2. Parameters

Lidar Fusion Writer exposes the following parameters:

  • rgb: controls whether the rgb annotator is output

  • bbox: controls whether character data (2D bounding box, 3D bounding box camera-space position and rotation, etc.) is output

  • lidar: controls whether the point cloud data is output

10.9.6.3.3.2.2.3. Output

  • Camera Information: extrinsic and intrinsic calibration matrices

  • Rgb: the current frame's render product from the camera's viewport

  • Lidar Data: point cloud data captured from the lidar

  • Character Information:

For each character that passes the width/height threshold check, the Lidar Fusion Writer outputs the following data.

  • Semantic Label: character’s semantic label

  • 2d tight bounding box: x_max, x_min, y_max, y_min

  • Character_3d_bounding_box information:

    • 3d bounding box scale: scale value in x, y, z

    • 3d bounding box camera space location: bounding box location in camera space

    • 3d bounding box rotation in degrees in camera space

    • 3d bounding box's vertices' 2d projection on the screen
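
The extrinsic and intrinsic matrices above are what tie the two modalities together. Below is a minimal sketch of projecting a lidar point into the image; the helpers are hypothetical and assume an ideal pinhole camera with no distortion:

```python
def mat_vec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(mi[j] * v[j] for j in range(len(v))) for mi in m]

def project_lidar_point(point_lidar, extrinsic, intrinsic):
    """Project a 3D lidar point to pixel coordinates using a 4x4
    extrinsic (lidar -> camera) and a 3x3 intrinsic matrix."""
    p = list(point_lidar) + [1.0]        # homogeneous coordinates
    cam = mat_vec(extrinsic, p)[:3]      # lidar frame -> camera frame
    u, v, w = mat_vec(intrinsic, cam)    # camera frame -> image plane
    return u / w, v / w                  # perspective divide

# Identity extrinsic (frames coincide), fx = fy = 500, cx = 320, cy = 240
extrinsic = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
intrinsic = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
print(project_lidar_point([0.0, 0.0, 2.0], extrinsic, intrinsic))  # (320.0, 240.0)
```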

10.9.6.3.3.2.3. Objectron Writer

10.9.6.3.3.2.3.1. Overview

Objectron writer outputs a 3D label file in JSON format. This ground truth file is useful for training 3D object detection models or 6-DoF pose estimation. Each file records the camera poses, as well as the 3D bounding boxes of the objects of interest, both in the 2D image plane and the 3D camera space. Additionally, it includes the pose and class of each object.

10.9.6.3.3.2.3.2. Parameters

Objectron Writer exposes the following parameters:

  • rgb: controls whether the rgb annotator is output

  • bbox: controls whether character data (3D bounding box scale, camera-space rotation and translation, etc.) is output

  • semantic_segmentation: controls whether semantic segmentation is output

  • distance_to_camera: controls whether the depth image is output

10.9.6.3.3.2.3.3. Output

  • Rgb: the current frame's render product from the camera's viewport

  • Colored Semantic Segmentation: every character is colored with a distinct color

  • Distance to the Camera: depth image

  • Camera Information:

    • Camera Projection Matrix: camera projection matrix

    • Camera View: camera view matrix

    • Camera Intrinsic Data: cx, cy, fx, fy

    • Viewport Width/Height

    • Camera's translation: x, y, z in Isaac Sim world coordinates

    • Camera's rotation as a quaternion in xyzw order

  • Character Information

    For each character that passes the width/height threshold check, the Objectron Writer outputs the following data.

    • Semantic label: character’s semantic label

    • Character_3d_bounding_box information:

      • 3d bounding box camera space location: bounding box location in camera space

      • 3d bounding box rotation in camera space: quaternion xyzw format

      • 3d bounding box's vertices' 2d projection on the screen

      • 3d bounding box's vertices' 3d position in camera space
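
A sketch of consuming the quaternion field above (assumes a unit quaternion in xyzw order; this helper is illustrative and not part of the writer):

```python
def quat_xyzw_to_matrix(q):
    """Convert a unit quaternion in xyzw order (the convention used by
    the labels above) to a 3x3 rotation matrix."""
    x, y, z, w = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

# The identity rotation: zero vector part, unit scalar part
print(quat_xyzw_to_matrix((0.0, 0.0, 0.0, 1.0)))
```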

10.9.6.3.3.2.4. RTSP Writer

10.9.6.3.3.2.4.1. Overview

RTSPWriter is a custom writer that publishes annotations of attached render products to an RTSP server. It tracks a dictionary of render products (HydraTexture), keyed by the combination of the annotator name and the render product's prim path. Each render product is recorded as an instance of RTSPCamera. The published RTSP URL of each RTSPCamera instance is constructed by appending the render product's camera prim path and the annotator name to the base output directory.

10.9.6.3.3.2.4.2. Annotators

These annotators are supported:

  • LdrColor / rgb

  • semantic_segmentation

  • instance_id_segmentation

  • instance_segmentation

  • DiffuseAlbedo

  • Roughness

  • EmissionAndForegroundMask

  • distance_to_camera

  • distance_to_image_plane

  • DepthLinearized

  • HdrColor

10.9.6.3.3.2.4.3. Parameters

RTSPWriter accepts the following parameters:

  • device: an integer specifying the ID of the GPU device on which NVENC operates. The annotator data of all attached render products is encoded on the same GPU device.

  • annotator: a string specifying the annotator of all attached render products to be streamed. The accepted values are LdrColor / rgb, semantic_segmentation, instance_id_segmentation, instance_segmentation, HdrColor, distance_to_camera, and distance_to_image_plane. The string value must match exactly.

  • output_dir: a string specifying the base RTSP URL in the form rtsp://<RTSP server hostname>:8554/<base topic name>. Given a render product and the annotator, the full RTSP URL is formatted as rtsp://<RTSP server hostname>:8554/<base topic name>_<camera prim path>_<annotator>. For example, if the RTSP server hostname is my_rtsp_server.com, the base topic name is RTSPWriter, the camera prim path is /World/Cameras/Camera_01, and the annotator is rgb, then the full RTSP URL is rtsp://my_rtsp_server.com:8554/RTSPWriter_World_Cameras_Camera_01_LdrColor
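
The URL scheme described above can be sketched as a small helper. This is an illustrative reconstruction of the documented naming, not the writer's actual code:

```python
def full_rtsp_url(output_dir: str, camera_prim_path: str, annotator: str) -> str:
    """Build the full RTSP URL from the base URL (output_dir), the camera
    prim path, and the annotator name; '/' in the prim path becomes '_'."""
    path_part = camera_prim_path.replace("/", "_")  # "/World/.../Camera_01" -> "_World_..._Camera_01"
    return f"{output_dir}{path_part}_{annotator}"

url = full_rtsp_url("rtsp://my_rtsp_server.com:8554/RTSPWriter",
                    "/World/Cameras/Camera_01", "LdrColor")
print(url)  # rtsp://my_rtsp_server.com:8554/RTSPWriter_World_Cameras_Camera_01_LdrColor
```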

10.9.6.3.3.2.4.4. Instructions

Before RTSPWriter can stream camera viewports in Omniverse, the following steps are required.

  1. Install and start an RTSP server

    The Writer sends all streams to the same RTSP server. The server can be local. One candidate RTSP server is MediaMTX. Here are the steps to install and start the Linux version of the server.
    1. Download and extract a standalone binary from the release page

      mkdir mediamtx; cd mediamtx
      wget https://github.com/bluenviron/mediamtx/releases/download/v1.1.1/mediamtx_v1.1.1_linux_amd64.tar.gz
      tar xvzf mediamtx_v1.1.1_linux_amd64.tar.gz
      
    2. Start the server: ./mediamtx

  2. Install FFmpeg

    Install FFmpeg on the same machine where RTSPWriter is executed: sudo apt update && sudo apt install -y ffmpeg

  3. Register and initialize RTSPWriter

    As an example, the code snippet below prepares to send the rgb (equivalently, LdrColor) annotator of the render products to rtsp://<RTSP server hostname>:8554. The base topic name is RTSPWriter. The full topic name of the RTSP stream is constructed as RTSPWriter_<camera prim path>_<annotator>. If the annotator data format is supported by NVENC, frame encoding is performed on GPU device=0; otherwise, the device= option has no effect.

    import omni.replicator.core as rep
    ...
    render_products = []
    render_products.append(rep.create.render_product(...))
    ...
    render_products.append(rep.create.render_product(...))
    ...
    writer = rep.WriterRegistry.get("RTSPWriter")
    writer.initialize(device=0, annotator="rgb", output_dir="rtsp://<RTSP server hostname>:8554/RTSPWriter")
    writer.attach(render_products)
    

Assuming the camera prim path of a render product is /World/Cameras/Camera_01, the full topic name of the RTSP stream is RTSPWriter_World_Cameras_Camera_01_LdrColor.

10.9.6.3.3.3. Custom Writers

ORC supports custom writers created by users, either through the UI or directly through the config file.

  • To enable it from the UI, select “Custom” from the dropdown in the “Replicator Setting” panel, then enter the writer’s name and its input parameters in the text boxes.

  • To enable it from the config file, simply put the writer’s name and parameters in the “replicator” section.

ORC obtains writers by name from the Replicator extension, so writers are expected to be registered with Replicator beforehand. The writers described above (provided by ORC) are registered when ORC is loaded. Custom writers must be registered by the user. Please follow the Replicator documentation on custom writers to create and register them.
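
As a sketch of that registration step, the usual Replicator pattern is to subclass Writer and register the class with WriterRegistry. The writer name, parameters, and output layout below are illustrative, and the snippet only runs inside an Omniverse app with omni.replicator.core available:

```python
import omni.replicator.core as rep
from omni.replicator.core import AnnotatorRegistry, BackendDispatch, Writer

class MyCustomWriter(Writer):  # hypothetical example writer
    def __init__(self, output_dir: str):
        self._frame_id = 0
        # Backend that writes files under output_dir
        self.backend = BackendDispatch({"paths": {"out_dir": output_dir}})
        # Annotators this writer consumes
        self.annotators = [AnnotatorRegistry.get_annotator("rgb")]

    def write(self, data):
        # Called once per captured frame with the annotator data
        self.backend.write_image(f"rgb_{self._frame_id}.png", data["rgb"])
        self._frame_id += 1

# Register the class so ORC can look the writer up by name ("MyCustomWriter")
rep.WriterRegistry.register(MyCustomWriter)
```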

10.9.6.3.3.4. Notes

10.9.6.3.3.4.1. Width/Height Threshold Checking

To make sure characters are not mostly occluded, the following test is applied to select only valid characters.
We define visibility conditions in height and width; a character is labeled when the applicable conditions below are met.
For occluded objects (objects that are blocked by another object within the camera frame), both the height and width visibility requirements must be satisfied:
  • Visibility in height

    • If the head is visible and at least 20% of the height is visible, label the object

    • If the head is not visible, label the object if at least 60% of the height is visible

  • Visibility in width

    • Label the object if more than 60% of the body width is visible
For truncated objects (objects that are cut off by the camera frame), EITHER the height conditions OR the width condition must be satisfied:
  • Visibility in height

    • If the head is visible and at least 20% of the height is visible, label the object

    • If the head is not visible, label the object if at least 60% of the height is visible

  • Visibility in width

    • Label the object if more than 60% of the body width is visible
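
The occlusion and truncation rules above can be summarized as a small predicate. The sketch below is illustrative, not the extension's actual implementation:

```python
def height_visible_ok(head_visible: bool, height_visible_frac: float) -> bool:
    """Height condition: 20% visibility suffices when the head is visible,
    otherwise 60% is required."""
    return height_visible_frac >= (0.2 if head_visible else 0.6)

def width_visible_ok(width_visible_frac: float) -> bool:
    """Width condition: more than 60% of the body width must be visible."""
    return width_visible_frac > 0.6

def should_label(head_visible, height_frac, width_frac, truncated):
    """Occluded characters must pass BOTH conditions; truncated characters
    (cut off by the camera frame) need only EITHER one."""
    h = height_visible_ok(head_visible, height_frac)
    w = width_visible_ok(width_frac)
    return (h or w) if truncated else (h and w)

# Occluded, head visible, 25% height but only 50% width -> not labeled
print(should_label(True, 0.25, 0.5, truncated=False))  # False
# Truncated with the same visibility -> labeled (height condition passes)
print(should_label(True, 0.25, 0.5, truncated=True))   # True
```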