5. Example: State Machines

5.1. Learning Objectives

This section shows how to design state machines and walks through an example where their structure leads to a subtle reactivity bug.

Prerequisites

Scripts throughout this tutorial are generally referenced relative to standalone_examples/cortex.

5.2. Running the example

This tutorial is based on the example franka/peck_state_machine.py.

To run the example:

./cortex launch  # Launches the default franka blocks belief world.
./cortex activate franka/peck_state_machine.py  # Run the example.

The Franka robot will peck at the ground avoiding the blocks. You can move the blocks around to see how that affects where the robot chooses to peck.

5.3. An error case in reactivity

Now try moving a block directly into the path of a peck that is already in progress. Because the peck state chooses its target on entry and keeps it fixed until it exits, the robot gets stuck trying to reach a target that the obstacle now blocks.
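The failure mode can be reproduced in a toy setting without the simulator. The sketch below is not Cortex code: ToyPeckState, run, the 1-D world, and all of its numbers are hypothetical stand-ins chosen for illustration. It only borrows the convention from the example that step() returns self to keep going and None to exit.

```python
class ToyPeckState:
    """Toy 1-D stand-in for a state that commits to its target on entry."""

    STEP = 0.1        # Distance covered per tick (illustrative value).
    CLEARANCE = 0.15  # Never move closer than this to the obstacle.

    def __init__(self, world):
        self.world = world
        self.target = None

    def enter(self):
        # The target is fixed here and never reconsidered in step().
        self.target = 1.0

    def step(self):
        pos = self.world["pos"]
        if abs(self.target - pos) < 1e-6:
            return None  # Arrived: exit the state.
        move = max(-self.STEP, min(self.STEP, self.target - pos))
        obstacle = self.world["obstacle"]
        if obstacle is not None and abs(pos + move - obstacle) < self.CLEARANCE:
            return self  # Blocked, but still committed to the old target.
        self.world["pos"] = pos + move
        return self


def run(drop_obstacle_at=None, max_ticks=100):
    """Return the tick on which the state exits, or None if it never does."""
    world = {"pos": 0.0, "obstacle": None}
    state = ToyPeckState(world)
    state.enter()
    for tick in range(1, max_ticks + 1):
        if tick == drop_obstacle_at:
            world["obstacle"] = 0.6  # Block the path mid-peck.
        if state.step() is None:
            return tick
    return None


print(run())                    # Exits after a handful of ticks.
print(run(drop_obstacle_at=3))  # None: stuck until the tick budget runs out.
```

Nothing in step() ever revisits the choice made in enter(), so once the path is blocked the state can neither finish nor pick a new target.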

State machines, by themselves, aren’t great at modeling reactive behavior. We use decider networks, in conjunction with state machines, to solve this problem in Example: Reactivity Using Deciders.

But first, let’s take a look at the simple state-machine-based implementation.

5.4. The Code

Example: franka/peck_state_machine.py

import numpy as np

from omni.isaac.cortex.df import DfNetwork, DfBindableState, DfStateSequence, DfTimedDeciderState, DfStateMachineDecider
from omni.isaac.cortex.dfb import DfToolsContext, DfLift, DfCloseGripper
import omni.isaac.cortex.math_util as math_util
from omni.isaac.cortex.motion_commander import MotionCommand, ApproachParams, PosePq

def sample_target_p():
    min_x = .3
    max_x = .7
    min_y = -.4
    max_y = .4

    pt = np.zeros(3)
    pt[0] = (max_x-min_x) * np.random.random_sample() + min_x
    pt[1] = (max_y-min_y) * np.random.random_sample() + min_y
    pt[2] = .01

    return pt

def make_target_rotation(target_p):
    return math_util.matrix_to_quat(math_util.make_rotation_matrix(
        az_dominant=np.array([0., 0., -1.]),
        ax_suggestion=-target_p))

class PeckState(DfBindableState):
    def is_near_obs(self, p):
        for _,obs in self.context.tools.obstacles.items():
            obs_p,_ = obs.get_world_pose()
            if np.linalg.norm(obs_p - p) < .2:
                return True
        return False

    def sample_target_p_away_from_obs(self):
        target_p = sample_target_p()
        while self.is_near_obs(target_p):
            target_p = sample_target_p()
        return target_p

    def enter(self):
        # On entry, sample a target.
        target_p = self.sample_target_p_away_from_obs()
        target_q = make_target_rotation(target_p)
        self.target = PosePq(target_p, target_q)
        self.approach_params = ApproachParams(direction=np.array([0.,0.,-.1]), std_dev=.04)

    def step(self):
        # Send the command each cycle so exponential smoothing will converge.
        self.context.tools.commander.set_command(
                MotionCommand(self.target, approach_params=self.approach_params))
        target_dist = np.linalg.norm(self.context.tools.commander.get_fk_p() - self.target.p)

        if target_dist < .01:
            return None  # Exit
        return self  # Keep going

def build_behavior(tools):
    tools.enable_obstacles()
    tools.commander.set_target_full_pose()

    # Build a state machine decider from a sequential state machine. The sequence will be 1. close
    # gripper, 2. peck at target, 3. lift the end-effector. It's set to loop, so it will simply peck
    # repeatedly until the behavior is replaced. Note that PeckState chooses its target on entry.
    root = DfStateMachineDecider(DfStateSequence([
            DfCloseGripper(width=.0),
            PeckState(),
            DfTimedDeciderState(DfLift(height=.05), activity_duration=.25)],
        loop=True))
    return DfNetwork(decider=root, context=DfToolsContext(tools))
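The sample_target_p_away_from_obs method in the listing is a form of rejection sampling: draw a candidate, discard it if it lies within .2 of any obstacle, and repeat. Its while loop has no exit if the obstacles happen to cover the whole sampling region. The standalone sketch below uses the same bounds and distance threshold but adds a retry limit. The function names, the max_attempts parameter, and the use of plain tuples in place of NumPy arrays are assumptions made for illustration and are not part of the Cortex example.

```python
import math
import random


def sample_point(rng):
    """Draw a candidate target inside the same box the example uses."""
    return (rng.uniform(0.3, 0.7), rng.uniform(-0.4, 0.4), 0.01)


def is_near_any(p, obstacles, min_dist=0.2):
    """True if p is closer than min_dist to any obstacle position."""
    return any(math.dist(p, obs) < min_dist for obs in obstacles)


def sample_away_from(obstacles, rng, max_attempts=1000):
    """Rejection-sample a target; give up with None after max_attempts draws."""
    for _ in range(max_attempts):
        p = sample_point(rng)
        if not is_near_any(p, obstacles):
            return p
    return None  # Hypothetical fallback: the caller decides what to do.


rng = random.Random(0)
print(sample_away_from([(0.5, 0.0, 0.02)], rng))
```

Returning None instead of looping forever lets the caller hold position or retry on a later cycle. Whether that matters in practice depends on how crowded the workspace can get.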

The build_behavior(tools) function constructs the behavior as a DfStateMachineDecider that internally wraps a sequential state machine defined by DfStateSequence. On construction, we set up that state sequence to be:

  1. Send a close gripper command.

  2. Peck. On entry, the peck state chooses its next target. Each step then resends the motion command for that target until the end-effector arrives at it.

  3. Lift away from its peck point just briefly before looping again.

We use loop=True in the DfStateSequence so the state machine is looped indefinitely.
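To make the looping behavior concrete, here is a minimal pure-Python sketch of a looped state sequence. It is not the DfStateSequence implementation: ToySequence, LoggedState, and trace are hypothetical names, and details such as whether the next state is entered on the same tick that the previous one exits are assumptions. The point it illustrates is that enter() runs again every time the sequence comes back around to a state, which is where PeckState samples a fresh target.

```python
class LoggedState:
    """Toy state that runs for a fixed number of steps, then exits."""

    def __init__(self, name, n_steps, log):
        self.name = name
        self.n_steps = n_steps
        self.log = log
        self.remaining = 0

    def enter(self):
        self.remaining = self.n_steps
        self.log.append(f"enter {self.name}")

    def step(self):
        self.remaining -= 1
        return self if self.remaining > 0 else None  # None means "done".


class ToySequence:
    """Runs states in order; with loop=True it wraps back to the first."""

    def __init__(self, states, loop=False):
        self.states = states
        self.loop = loop
        self.index = 0

    def enter(self):
        self.index = 0
        self.states[0].enter()

    def step(self):
        if self.states[self.index].step() is not None:
            return self  # Current state is still running.
        self.index += 1
        if self.index == len(self.states):
            if not self.loop:
                return None  # Sequence finished.
            self.index = 0
        self.states[self.index].enter()  # Re-entry happens here on each pass.
        return self


def trace(loop, ticks):
    """Run the toy sequence for up to `ticks` steps and return the entry log."""
    log = []
    seq = ToySequence(
        [LoggedState("close_gripper", 1, log),
         LoggedState("peck", 3, log),
         LoggedState("lift", 2, log)],
        loop=loop)
    seq.enter()
    for _ in range(ticks):
        if seq.step() is None:
            break
    return log


print(trace(loop=True, ticks=8))
print(trace(loop=False, ticks=8))
```

With loop=True the log shows close_gripper and peck being entered a second time within eight ticks. With loop=False the sequence ends after lift, so each state is entered once.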