Simulation Performance Guide#
Please consider the following hints and best practices if you are optimizing simulation performance for your application.
Solver Settings#
Lowering the solver iteration counts to the lowest values that still give acceptable simulation fidelity can significantly improve performance. You can set global iteration clamps on the scene with the PhysxSchema.PhysxSceneAPI using the snippet below.
PGS is typically faster than TGS per iteration, but TGS can often achieve better stability with fewer iterations. See Physics Solver for more solver details.
from pxr import PhysxSchema, UsdPhysics
import omni.usd

# get stage, create scene prim, and apply physx schema scene API
stage = omni.usd.get_context().get_stage()
scenePath = "/World/PhysicsScene"
scene = UsdPhysics.Scene.Define(stage, scenePath)
physxSceneAPI = PhysxSchema.PhysxSceneAPI.Apply(scene.GetPrim())

# force all actors in the scene to a fixed position and velocity iteration count
posIters = 16
velIters = 1
physxSceneAPI.CreateMaxPositionIterationCountAttr().Set(posIters)
physxSceneAPI.CreateMinPositionIterationCountAttr().Set(posIters)
physxSceneAPI.CreateMaxVelocityIterationCountAttr().Set(velIters)
physxSceneAPI.CreateMinVelocityIterationCountAttr().Set(velIters)
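To build intuition for why the iteration count trades accuracy against cost, here is a toy sketch of a projected Gauss-Seidel (PGS) sweep on a tiny linear complementarity problem. It is not the PhysX solver: the matrix A, the vector b, and both function names are made up for illustration, and the code runs in plain Python without Omniverse.

```python
def projected_gauss_seidel(A, b, iterations):
    """Run a fixed number of PGS sweeps on the LCP: w = A x + b, x >= 0, w >= 0, x.w = 0."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        for i in range(n):
            # Contribution of the other unknowns, using their newest values.
            off_diagonal = sum(A[i][j] * x[j] for j in range(n) if j != i)
            # Solve row i for x[i], then project onto the constraint x[i] >= 0.
            x[i] = max(0.0, (-b[i] - off_diagonal) / A[i][i])
    return x


def residual(A, b, x):
    """Largest violation of the complementarity conditions, measured as |min(x_i, w_i)|."""
    n = len(b)
    w = [sum(A[i][j] * x[j] for j in range(n)) + b[i] for i in range(n)]
    return max(abs(min(x[i], w[i])) for i in range(n))


# Illustrative (made-up) symmetric positive definite system.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [-1.0, -2.0]

# Each extra sweep costs the same amount of work but shrinks the remaining error.
for sweeps in (1, 2, 4, 16):
    x = projected_gauss_seidel(A, b, sweeps)
    print(sweeps, residual(A, b, x))
```

Every sweep does the same amount of work, so halving the iteration count roughly halves the solver cost. Whether the larger remaining error is acceptable depends on the scene, which is why the lowest workable count has to be found by experiment.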
CPU and GPU Simulation Depending on Scene Size#
If you are simulating only a few rigid bodies and articulations, it is likely beneficial to switch to CPU simulation. You can select CPU simulation through the PhysxSchema.PhysxSceneAPI:
from pxr import PhysxSchema
import omni.usd

stage = omni.usd.get_context().get_stage()

scenePath = "/World/PhysicsScene"
scenePrim = stage.GetPrimAtPath(scenePath)
physxSceneAPI = PhysxSchema.PhysxSceneAPI(scenePrim)

# disable GPU dynamics (i.e. running simulation on the GPU)
physxSceneAPI.CreateEnableGPUDynamicsAttr().Set(False)
# switch from running broadphase on the GPU to CPU
physxSceneAPI.CreateBroadphaseTypeAttr().Set("MBP")
For more information on the CPU broadphase types, please refer to the PhysX SDK Broadphase Types.
If you are simulating large scenes, for example in an RL setting, you should use GPU simulation and access simulation state through a batch API, e.g. an ArticulationView.
Render Transforms in Runtime Fabric#
Writing render transforms and other simulation state back to USD is expensive; use the physics Fabric extension instead (see enabling Fabric).
Asynchronous Simulation and Rendering#
In the PhysxSchema.PhysxSceneAPI, you can try the async update setting to run simulation and rendering in parallel.
This is best suited to simulations that do not need to run custom Python code every step, since Python can have issues when it is called from a separate thread.
from pxr import PhysxSchema, UsdPhysics
import omni.usd

stage = omni.usd.get_context().get_stage()

# create a scene, set to async update
scenePath = "/World/PhysicsScene"
scene = UsdPhysics.Scene.Define(stage, scenePath)
physxSceneAPI = PhysxSchema.PhysxSceneAPI.Apply(scene.GetPrim())
physxSceneAPI.CreateUpdateTypeAttr().Set(PhysxSchema.Tokens.asynchronous)
Physics Thread Count#
You may adjust the number of CPU threads used for simulation. We have seen cases where lower thread counts result in a speedup.
This snippet changes the thread count to zero so that the simulation runs single-threaded, which usually gives the best performance for small scenes.
import carb
import omni.physx.bindings._physx as pxb

settings = carb.settings.acquire_settings_interface()
# set number of simulation threads to 0, to run a small scene in a single thread
settings.set(pxb.SETTING_NUM_THREADS, 0)
Unintended Collision Geometry Overlap#
When setting up reinforcement learning simulation scenes with many parallel environments, make sure that you do not duplicate collision geometry such as ground planes. Duplicates generate many inter-environment overlaps that are expensive to check in the collision phase.
Below is an example of an incorrectly set up scene, where the ground plane in the environment has been duplicated many times. Since a ground plane is infinite, each one touches all the environments, leading to performance and memory usage problems.

Use the OmniPVD - PhysX Visual Debugger extension to inspect the scene and visually confirm that everything looks fine. In the above screenshot, the visual hint that something is wrong is the unexpected sea of magenta debug lines all around the scene.
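As a rough back-of-the-envelope sketch of why duplicated ground planes are costly, the snippet below counts the plane-versus-body pairs that the collision phase may have to consider. The function name, the environment count, and the bodies-per-environment figure are illustrative assumptions, and the model simply assumes that every body can overlap every infinite plane.

```python
def ground_plane_pairs(num_envs, bodies_per_env, shared_plane):
    """Count potential plane-vs-body pairs under the assumption that
    every body in the scene can overlap every infinite ground plane."""
    num_planes = 1 if shared_plane else num_envs
    total_bodies = num_envs * bodies_per_env
    return num_planes * total_bodies


# Illustrative numbers only: 1024 environments with 10 bodies each.
duplicated = ground_plane_pairs(1024, 10, shared_plane=False)
shared = ground_plane_pairs(1024, 10, shared_plane=True)

print("one plane per environment:", duplicated)
print("single shared plane:      ", shared)
print("ratio:                    ", duplicated // shared)
```

Under these assumptions the number of candidate pairs grows with the square of the environment count when each environment has its own plane, but only linearly when one plane is shared.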
Disable Scene Query Support#
If you are not using scene queries, you may disable support for them to improve performance.
from pxr import PhysxSchema
import omni.usd

stage = omni.usd.get_context().get_stage()

scenePath = "/World/PhysicsScene"
scenePrim = stage.GetPrimAtPath(scenePath)
physxSceneAPI = PhysxSchema.PhysxSceneAPI(scenePrim)

# Disable Scene-Query Support for More Perf
physxSceneAPI.CreateEnableSceneQuerySupportAttr().Set(False)
Multi-GPU Simulation and Rendering#
You can distribute simulation and rendering over multiple GPUs for a potential performance gain. See Simulation on multiple GPUs for details.
Collision Geometry Choice#
The simpler the collision geometry, the faster the simulation. For example, an SDF mesh collider will be more expensive than a simple sphere.
Deformable or Particle Features#
Deformables and particles are considerably more expensive to simulate than rigid bodies. Instead of deformables, you may want to experiment with rigid bodies and compliant contacts, or rigid bodies connected with joints to approximate a deformable geometry.
Python Callbacks#
For example, users can implement robot controllers in per-physics-step callbacks set up with subscribe_physics_step_events. If the callback's compute load is high, it can become a bottleneck.
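One way to check whether a callback is the bottleneck is to measure how long it takes per step. The sketch below wraps a callback in a small timer. StepCallbackTimer and controller are hypothetical names, the plain loop only stands in for the simulation, and the callback is assumed to take the step size dt as its single argument.

```python
import time


class StepCallbackTimer:
    """Wraps a per-step callback and records how long each call takes."""

    def __init__(self, callback):
        self._callback = callback
        self.calls = 0
        self.total_seconds = 0.0

    def __call__(self, dt):
        start = time.perf_counter()
        try:
            self._callback(dt)
        finally:
            # Record the elapsed time even if the callback raises.
            self.total_seconds += time.perf_counter() - start
            self.calls += 1

    def mean_seconds(self):
        """Average wall-clock time per call, or 0.0 before the first call."""
        return self.total_seconds / self.calls if self.calls else 0.0


def controller(dt):
    # Hypothetical stand-in for a robot controller; replace with real work.
    sum(i * i for i in range(1000))


timed_controller = StepCallbackTimer(controller)

# Stand-in for the simulation loop: one callback invocation per physics step.
step_dt = 1.0 / 60.0
for _ in range(100):
    timed_controller(step_dt)

print("mean callback time [s]:", timed_controller.mean_seconds())
print("fraction of step budget:", timed_controller.mean_seconds() / step_dt)
```

The wrapped callable can be registered wherever the original callback would be. If the mean time is a large fraction of the physics step, consider moving work out of the callback or running the controller less often.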