4. Scripting Behaviors

4.1. Learning Objectives

In this tutorial, we’ll cover the basics of designing behaviors and activating them.

Prerequisites

  • Completed the Overview and Fundamentals tutorial to understand the basic concepts, especially the separation between the belief robot (the robot making decisions) and sim / real robot (the robot being controlled, either in simulation or reality).

  • Completed the Manually Commanding the Robot tutorial to learn how to manually control the robot’s target prim.

Scripts throughout this tutorial are generally referenced relative to standalone_examples/cortex. For reference, the decision framework’s tools are implemented in exts/omni.isaac.cortex/omni/isaac/cortex/df.py.

4.2. Basics of behaviors

Every behavior file has a build_behavior(tools) method. Take go_home.py as an example. Its code is very simple:

from df import DfNetwork
from dfb import DfToolsContext, make_go_home

def build_behavior(tools):
    return DfNetwork(decider=make_go_home(), context=DfToolsContext(tools))

and we can run it using:

# Terminal 1: Launches the default Franka blocks belief world.
./cortex launch

# Manually control the robot to move it away from its default home position.

# Terminal 2: Run the go home behavior to send the robot back to its home position.
./cortex activate go_home.py

The robot will move from wherever it is to the home configuration.

So far, we haven’t discussed what a DfNetwork is, and its parameters may look like magic. But this is an example of a behavior. The job of the build_behavior(tools) method is to construct and return a behavior object, which is anything with a tick() method. When a behavior is activated, Cortex loads it by calling this build_behavior(tools) method, passing in a ContextTools object as described below. Once loaded, Cortex starts ticking the behavior (i.e. calling tick()) at 60Hz.

DfNetwork objects, which represent decider networks, are the standard behavior type used throughout our decision framework, but anything with a tick() method can be used. For instance, try implementing your own behavior. Add the following to a file my_behavior.py:

class Behavior(object):
    def tick(self):
        print("<tick>")

def build_behavior(tools):
    return Behavior()

Then run it using

./cortex activate my_behavior.py

You should see “<tick>” printed repeatedly to the console where Cortex was launched, once per call to tick().

You can clear the behavior using

./cortex clear

4.2.1. Tools available to behaviors

A ContextTools object is created by Cortex on startup containing information about the world, the objects and obstacles in the world, and the robot. It also carries a MotionCommander object for commanding the robot (see Motion command actions). This object is passed automatically to the behavior builder as the tools parameter.

Here’s the full structure:

class ContextTools:
    """ The tools passed in to a behavior when build_behavior(tools) is called.
    """
    def __init__(self, world, objects, obstacles, robot, commander):
        self.world = world  # The World singleton.
        self.objects = objects  # The objects under /cortex/belief/objects as core API objects.
        self.obstacles = obstacles  # Those objects marked as obstacles.
        self.robot = robot  # The belief robot.
        self.commander = commander  # The motion commander.

    def enable_obstacles(self):
        """ Ensures the obstacles are enabled. This can be called by a behavior on construction. To
        reset any previous obstacle suppression.
        """
        for _,obs in self.obstacles.items():
            self.commander.enable_obstacle(obs)

These tools are the interface behaviors have to both properties of the world and actions that can be taken.
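
As a quick illustration, a behavior builder might inspect these tools on construction. This is only a sketch (assuming, as enable_obstacles() suggests for obstacles, that objects and obstacles are dictionaries keyed by name, and with MyBehavior standing in for whatever behavior class you define):

def build_behavior(tools):
    # List what the belief world contains (names depend on your belief USD).
    print("objects:", list(tools.objects.keys()))
    print("obstacles:", list(tools.obstacles.keys()))

    # Reset any obstacle suppression left over from a previous behavior.
    tools.enable_obstacles()

    return MyBehavior(tools)  # MyBehavior is a placeholder behavior class.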

4.2.2. Motion command actions

Actions are typically taken through the motion command API accessible through the commander member of ContextTools. See exts/omni.isaac.cortex/omni/isaac/cortex/motion_commander.py for detailed documentation on the API.

The most commonly used methods are

  • tools.commander.set_command(motion_command): Send a motion command to the robot. motion_command is a MotionCommand object (see below).

  • tools.commander.get_fk_{T,pq,p,R}(): Get the forward kinematics of the robot’s end-effector and return it as a homogeneous transform matrix T, a PosePq position-quaternion object pq, just the position p, or just the rotation matrix R.

  • tools.commander.disable_obstacle(obs): Disable the given obstacle. The obstacle must be labeled as an obstacle (cortex:is_obstacle=true) in the belief USD.

  • tools.commander.enable_obstacle(obs): Enable the given obstacle. The obstacle must be labeled as an obstacle (cortex:is_obstacle=true) in the belief USD.
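
Here’s a rough sketch of how these calls might combine inside a behavior’s tick(). The obstacle name "bin_walls", the height threshold, and the pre-built command are hypothetical, not part of the example worlds:

class BinReacher(object):
    """ Hypothetical sketch: suppress a bin-wall obstacle while reaching down into the bin.
    """
    def __init__(self, tools, command):
        self.tools = tools
        self.command = command  # a MotionCommand built elsewhere (see below)

    def tick(self):
        # Query the end-effector's current position via forward kinematics.
        eff_p = self.tools.commander.get_fk_p()

        # Hypothetical rule: treat the end-effector as "inside the bin" when it's low.
        obs = self.tools.obstacles["bin_walls"]  # hypothetical obstacle name
        if eff_p[2] < 0.1:
            self.tools.commander.disable_obstacle(obs)
        else:
            self.tools.commander.enable_obstacle(obs)

        # Re-send the motion command every tick (see the note on smoothing below).
        self.tools.commander.set_command(self.command)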

Note

Motion commands received by the MotionCommander object are exponentially smoothed to prevent discrete jumps that would jerk the physical robot. That means sending a command once won’t immediately move the underlying target all the way to the commanded pose; commands need to be sent repeatedly to converge to the desired target. Behaviors should therefore usually send their target every tick. This practice also encourages reactive design. In the behaviors below, notice in the viewport how the motion controller target prim never jumps directly to a target; instead it moves toward the target as though animated. This is a result of the exponential smoothing.

Here’s the relevant code defining a MotionCommand, including ApproachParams and PosePq (taken from exts/omni.isaac.cortex/omni/isaac/cortex/motion_commander.py):

class ApproachParams(object):
    """ Parameters describing how to approach a target (in position).

    The direction is a 3D vector pointing in the direction of approach. Its magnitude defines the
    max offset from the position target by which the intermediate approach target will be shifted.
    The std dev defines the length scale of a radial basis (Gaussian) weight function that defines
    what fraction of the shift we take. The radial basis function is defined on the orthogonal
    distance to the line defined by the target and the direction vector.

    Intuitively, the normalized direction of the direction vector defines which direction to
    approach from, and its magnitude defines how far back we want the end effector to come in from.
    The std dev defines how tightly the end-effector approaches along that line. A small std dev is
    tight around that approach line; a large std dev is looser. A good value is often between 1 and
    3 cm.

    See calc_shifted_approach_target() for the specific implementation of how these parameters are
    used.
    """

    def __init__(self, direction, std_dev):
        self.direction = direction
        self.std_dev = std_dev

    def __str__(self):
        return "{direction: %s, std_dev %s}" % (str(self.approach), str(self.std_dev))


class PosePq:
    """ A pose represented internally as a position p and quaternion orientation q.
    """

    def __init__(self, p, q):
        self.p = p
        self.q = q

    def to_T(self):
        return math_util.pack_Rp(quat_to_rot_matrix(self.q), self.p)


class MotionCommand:
    """ A motion command includes the motion API parameters: a target pose (required), optional
    approach parameters, and an optional posture configuration.

    The target pose is a full position and orientation target. The approach params define how the
    end-effector should approach that target. And the posture config defines how the system should
    resolve redundancy and generally posture the arm on approach.
    """

    def __init__(self, target_pose, approach_params=None, posture_config=None):
        self.target_pose = target_pose
        self.approach_params = approach_params
        self.posture_config = posture_config

    @property
    def has_approach_params(self):
        return self.approach_params is not None

    @property
    def has_posture_config(self):
        return self.posture_config is not None

Each command is a MotionCommand object, which takes

  1. A target pose, given as a PosePq position-quaternion pair object. This is the primary pose target the end-effector will try to reach. In position-only command mode, the quaternion is ignored by the controller and can be passed as None. In full-pose command mode, if None is passed in as the target orientation, the orientation will be automatically filled in using the end-effector’s current orientation.

  2. An optional approach parameter specified as an ApproachParams object. These parameters define the direction along which the end-effector should approach, including how far back the end-effector should approach from and how tightly it should approach along that line.

  3. An optional posture configuration given as numpy array of joint values, one for each actively controlled arm joint. For instance, in the case of the Franka, this would be 7-dimensional. The posture config defines how the arm should resolve redundancy in the nullspace. This redundancy resolution is most important when using position-only command mode since the problem is less constrained.
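
For concreteness, here are a few ways a MotionCommand might be constructed from the class above. The numeric values are illustrative only, and p, q, and posture_config are placeholders standing in for some position, quaternion, and joint-value array:

# Position-only target; the orientation is left as None.
MotionCommand(target_pose=PosePq(np.array([.4, .0, .3]), None))

# Full pose target with an approach from 10 cm above, fairly tight (2 cm std dev).
MotionCommand(target_pose=PosePq(p, q),
              approach_params=ApproachParams(direction=np.array([0., 0., -.1]), std_dev=.02))

# Full pose target plus a posture config (for the Franka, a 7-element numpy array)
# biasing how redundancy is resolved.
MotionCommand(target_pose=PosePq(p, q), posture_config=posture_config)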

Modify your behavior to be the following and try it out:

import numpy as np
from omni.isaac.cortex.motion_commander import MotionCommand, ApproachParams, PosePq
import omni.isaac.cortex.math_util as math_util

class Behavior(object):
    def __init__(self, tools):
        self.tools = tools

        target_p = np.array([.5,.3,.01])
        target_q = math_util.matrix_to_quat(math_util.make_rotation_matrix(
            az_dominant=np.array([0., 0., -1.]),
            ax_suggestion=-target_p))
        self.target_pose = PosePq(target_p, target_q)
        self.approach_params = ApproachParams(direction=np.array([0.,0.,-.1]), std_dev=.05)

    def tick(self):
        self.tools.commander.set_command(
                MotionCommand(
                    target_pose=self.target_pose,
                    approach_params=self.approach_params))

def build_behavior(tools):
    tools.commander.set_target_full_pose()
    return Behavior(tools)

This code is available in: simple_examples/simple_behavior.py.

Note

Note that this behavior requires the full-pose command mode, so we call the motion commander’s API for setting the command mode (full-pose, in this case) first thing when building the behavior. It’s generally good form to include a line like that to prevent user error (all our examples do).

Play with choosing different targets and approach parameters to get a feel for how they affect the behavior. Note that the length of the approach direction defines how far back the end-effector approaches the target from (think of the tip of the direction arrow as sitting at the target); longer vectors result in longer approaches. The approach standard deviation, in meters, controls how loose the approach is when the end-effector first enters the approach region; smaller values result in tighter approaches.
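
For example, a couple of variations you might try in place of the approach parameters in the behavior above (the values are illustrative):

ApproachParams(direction=np.array([0., 0., -.2]), std_dev=.02)   # come in from 20 cm above, tighter
ApproachParams(direction=np.array([-.1, 0., -.1]), std_dev=.05)  # approach moving in -x and -z, i.e. from +x and above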

Also, to see an example of how the optional posture_config parameter of the motion command can affect the behavior, run franka/nullspace.py:

./cortex activate franka/nullspace.py

This behavior moves the end-effector to a fixed target using the position-only command mode, and changes the posture_config parameter every 2 seconds. You’ll see the robot hold the end-effector position, but move within the nullspace to minimize the distance to each posture_config.

The behavior itself uses some concepts we introduce down below, so we won’t review it here.
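
The core idea, though, is simply to hold a fixed position target while periodically swapping the posture_config in the motion command. Here’s a rough sketch of that idea (not the actual franka/nullspace.py implementation; the target position and posture values are illustrative):

import numpy as np
from omni.isaac.cortex.motion_commander import MotionCommand, PosePq

class PostureCycler(object):
    def __init__(self, tools, posture_configs):
        self.tools = tools
        self.posture_configs = posture_configs  # list of 7-element numpy arrays (Franka)
        self.target_p = np.array([.5, 0., .4])  # illustrative fixed position target
        self.num_ticks = 0

    def tick(self):
        # Switch posture configs every 2 seconds (120 ticks at 60Hz).
        index = (self.num_ticks // 120) % len(self.posture_configs)
        self.tools.commander.set_command(MotionCommand(
            target_pose=PosePq(self.target_p, None),
            posture_config=self.posture_configs[index]))
        self.num_ticks += 1

def build_behavior(tools):
    tools.commander.set_target_position_only()
    posture_configs = [  # illustrative joint-space postures
        np.array([0., -0.8, 0., -2.3, 0., 1.6, 0.8]),
        np.array([1.0, -0.8, 0., -2.3, 0., 1.6, 0.8]),
    ]
    return PostureCycler(tools, posture_configs)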

4.3. Built-in behavior tools

The decision framework is built around decider networks (see Example: Reactivity Using Deciders and Example: Designing Context Monitors) and their interaction with state machines (see Example: State Machines). In this section, we cover the basics and build up some simple examples. See the respective tutorials for details.

The decision framework is implemented in exts/omni.isaac.cortex/omni/isaac/cortex/df.py with useful reusable behaviors / skills / actions implemented in exts/omni.isaac.cortex/omni/isaac/cortex/dfb.py.

4.3.1. State machines

The easiest starting point when thinking about behavior is the familiar state machine. States of a state machine are represented as state objects with three methods: enter(), step(), and exit(). When transitioning into a state, enter() is called. While in the state, step() is called. And when transitioning out of a state, exit() is called. Transitions are represented by the return value of step(), which should return the next state to transition to (often the current state itself).

Here’s the basic interface:

class DfState(object):
    def enter(self):
        pass

    def step(self):
        pass

    def exit(self):
        pass

The decision framework does not provide a behavior type (implementing the tick() method) that explicitly represents a state machine. Instead, as we’ll see below, state machines can be embedded inside decider networks, and vice versa. The decision framework’s model is that everything is a decider network, even if that network is simply a single decider node that internally just steps a state machine (i.e. it is effectively just a state machine).

As an example of designing behaviors, though, we can create our own state machine behavior class, and use it to tick a simple state machine.

class StateMachine:
    def __init__(self, init_state):
        self.needs_entry = True
        self.state = init_state

    def tick(self):
        if self.state is None:
            return

        if self.needs_entry:
            self.state.enter()
            self.needs_entry = False

        new_state = self.state.step()

        if new_state != self.state:
            self.state.exit()
            self.state = new_state
            self.needs_entry = True

Now let’s write a simple state machine behavior that moves back and forth between two target points.

Code: simple_examples/simple_state_machine.py

import numpy as np
from omni.isaac.cortex.motion_commander import MotionCommand, ApproachParams, PosePq
from omni.isaac.cortex.df import DfState

# << Add the above StateMachine code here. >>

class ReachState(DfState):
    def __init__(self, target_p, tools):
        self.target_p = target_p
        self.tools = tools
        self.next_state = None

    def step(self):
        self.tools.commander.set_command(MotionCommand(target_pose=PosePq(self.target_p, None)))
        if np.linalg.norm(self.target_p - self.tools.commander.get_fk_p()) < .01:
            return self.next_state
        return self

def build_behavior(tools):
    tools.commander.set_target_position_only()

    p1 = np.array([.3,-.1,.01])
    p2 = np.array([.6,.3,.6])
    state1 = ReachState(p1, tools)
    state2 = ReachState(p2, tools)

    state1.next_state = state2
    state2.next_state = state1
    return StateMachine(state1)

Run this example using:

./cortex activate simple_examples/simple_state_machine.py

The robot will move indefinitely between the two specified points.

4.3.2. Decider networks

A decider network is an acyclic graph of decider nodes (deciders), each of which is responsible for choosing among its children. The root is the entry point, and the leaves (decider nodes with no children) are terminal points, often modeling actions. Every tick, the algorithm traces a path from the root to a leaf, following the decisions of the encountered deciders to ultimately choose a leaf action.

Decider networks directly model the natural flow of hierarchical decisions. Here’s a schematic of the decisions that need to be made for two variants of the block stacking task. On the left, the blocks can be stacked in any order (the task is purely constructive); on the right, they must be stacked in a specific order (the robot may need to deconstruct a mis-ordered tower before rebuilding it in the correct order).

Block stacking network of decisions.

At a high level, the decisions are all about whether the robot should pick up a block (and from where) or put down a block (and where to place it). Naively, the specific-ordering variant seems like it would require explicit task planning, but the diagram of decisions exposes an inherent simplicity. Decider networks make it easy to program these sorts of tasks.

The primary API of a decider is

class DfDecider(object):
    def enter(self):
        pass

    def decide(self):
        pass

    def exit(self):
        pass

Each call to decide() returns a DfDecision() object naming its chosen child and optionally passing it parameters. The parameters can be any object expected by the child. We’ll see examples below.

The API of a DfDecider is very similar to that of a state machine. When the decider is reached along a given path for the first time, enter() is called followed by decide(). As long as subsequent ticks reach the same decider along the same path, only decide() is called. The first time the decider is no longer reached along the tick’s path, exit() is called on it.

Specifically, the descent algorithm keeps track of the previous tick’s path, and as it generates the new path, it compares the new path to the previous path. As long as the two paths are the same, it calls only decide() on the encountered deciders. However, if it detects the new path is branching from the previous path, it calls exit() on all nodes of the previous path that will no longer be traversed, in reverse order from the leaf up, and it calls enter() on all the new nodes encountered along the remaining path to the leaf in the order they’re reached.

See df_descend() in exts/omni.isaac.cortex/omni/isaac/cortex/df.py for details.

Decider Networks.

Here’s a standard pattern of a decider implementation (pseudocode):

class Chooser(DfDecider):
    def __init__(self):
        super().__init__()
        self.add_child("child1", ChildChooser1())
        self.add_child("child2", ChildChooser2())
        self.params = Params()  # Can be anything.

    def enter(self):
        self.do_some_local_initialization()

    def decide(self):
        if self.context.condition1:
            return DfDecision("child1")  # Choose child1, no params passed.
        if self.context.condition2:
            return DfDecision("child2", self.params)  # Choose child2 and pass it params.
        return None  # Do nothing

    def exit(self):
        self.do_some_local_teardown()

The tutorials Example: Reactivity Using Deciders and Example: Designing Context Monitors provide more in-depth demonstrations of decider networks, and below we describe how to use monitors to track logical state, leveraging functionality built into the DfNetwork class (which represents decider networks). But first, let’s look at how decider networks interface with state machines like the one we designed above.

Code: simple_examples/simple_decider_state_machine.py

import numpy as np
from omni.isaac.cortex.motion_commander import MotionCommand, ApproachParams, PosePq
from omni.isaac.cortex.df import DfNetwork, DfBindableState, DfStateMachineDecider, DfStateSequence
from omni.isaac.cortex.dfb import DfToolsContext

class ReachState(DfBindableState):
    def __init__(self, target_p):
        self.target_p = target_p

    def step(self):
        self.context.tools.commander.set_command(MotionCommand(target_pose=PosePq(self.target_p, None)))
        if np.linalg.norm(self.target_p - self.context.tools.commander.get_fk_p()) < .01:
            return None
        return self

def build_behavior(tools):
    tools.commander.set_target_position_only()
    root = DfStateMachineDecider(DfStateSequence([
            ReachState(np.array([.3,-.1,.01])),
            ReachState(np.array([.6,.3,.6]))],
        loop=True))
    return DfNetwork(decider=root, context=DfToolsContext(tools))

This behavior acts exactly the same as the above state machine version (see video) but is somewhat easier to construct due to the tooling built into the decision framework. Some new elements:

  • We create the state machine using the DfStateSequence tool. This tool takes a list of states, each of which runs until it terminates (returns None); the sequence automatically transitions to the next state when the current one terminates. We use loop=True so it transitions back to the first state once the final state terminates.

  • We wrap that state machine in a DfStateMachineDecider. The state machine decider is a decider node which internally runs the given state machine. The decider’s enter() and exit() call the internal state machine’s enter() and exit() methods, and decide() simply calls the state machine’s step() method. This example demonstrates the standard method of using state machines in the decision framework — ticking it within a decider node of a decider network. In this case, the only decider node is this DfStateMachineDecider, so this behavior is entirely defined by the state machine.

  • The descent method df_descend(...) binds each encountered decider node to the context object (and any passed in parameters) as described above. For that reason, we needed to switch the ReachState to derive from DfBindableState to give it the bind API. That allows ReachState’s state API methods to access the context object passed into the DfNetwork through the member self.context. Likewise, if this decider were a child of another decider, any parameters passed to it would be accessed through self.params. Note that since we’re using a DfToolsContext context object, which wraps the tools, the tools are now automatically accessible through self.context.tools.

4.4. Context objects and monitors

It’s common with decider networks to monitor the environment and record the logical state in a collection of variables up front, at the beginning of every tick, so the deciders don’t need to perform those computations internally. Decider networks have built in tools for easily implementing that design pattern. Users can design a custom context object with a collection of monitor functions. The context object can be anything, and monitors are simply functions which take the context as an argument. For instance, member functions of the context object which take self as their argument are common.

DfNetwork, on construction, can be given a context object as well as a list of monitor functions. If no explicit list is provided to the constructor, it looks for a monitors member of the context object and uses that. At the beginning of each tick, these monitors are called in the order they’re listed.

For instance, the following context object has monitor methods that track whether the end-effector is in the middle region, to the left, or to the right; the decider network below prints a note whenever the end-effector enters a different region.

Code: simple_examples/simple_decider_network.py

from omni.isaac.cortex.df import DfNetwork, DfDecider, DfAction, DfDecision
from omni.isaac.cortex.dfb import DfToolsContext

class Context(DfToolsContext):
    def __init__(self, tools):
        super().__init__(tools)

        self.y = None
        self.is_left = None
        self.is_middle = None

        self.monitors = [
            Context.monitor_y,
            Context.monitor_is_left,
            Context.monitor_is_middle,
        ]

    def monitor_y(self):
        self.y = self.tools.commander.get_fk_p()[1]

    def monitor_is_left(self):
        self.is_left = (self.y < 0)

    def monitor_is_middle(self):
        self.is_middle = (-.15 < self.y and self.y < .15)

class PrintAction(DfAction):
    def __init__(self, msg=None):
        super().__init__()
        self.msg = msg

    def enter(self):
        if self.params is not None:
            print(self.params)
        else:
            print(self.msg)

class Dispatch(DfDecider):
    def __init__(self):
        super().__init__()
        self.add_child("print_left", PrintAction("<left>"))
        self.add_child("print_right", PrintAction("<right>"))
        self.add_child("print", PrintAction())

    def decide(self):
        if self.context.is_middle:
            return DfDecision("print", "<middle>")  # Send parameters down to generic print.

        if self.context.is_left:
            return DfDecision("print_left")
        else:
            return DfDecision("print_right")

def build_behavior(tools):
    tools.commander.set_target_position_only()
    return DfNetwork(decider=Dispatch(), context=Context(tools))

Run this example and manually move the end-effector left and right. In the middle region (30 cm, or about 1 foot, wide), it’ll print “<middle>” on entry (which will happen on activation if the robot is already there). Beyond the middle region to the left it’ll print “<left>”, and beyond it to the right it’ll print “<right>”. Play with moving the end-effector around and watch it print the region names as it enters each new region.