Skills Example (Grasp and Stack)

This skills example describes how to implement an agent that controls a robotic arm to grasp and stack objects, inspired by this research work: https://arxiv.org/pdf/1709.06977v1.pdf. For more detail on agent design variations, see the section "Brains are Organized by Functions and Strategies" in Designing Autonomous AI (O'Reilly, 2022): https://learning.oreilly.com/library/view/designing-autonomous-ai/9781098110741/ch04.html#idm45643834187872. Credit for the image below also goes to O'Reilly.

Agent Design

The robotic arm agent has five action skills that define different motions the arm can take: reach, move, orient, grasp, and stack. It also has a selector that determines the action that the agent should take.

The agent performs these skills in sequence. After each skill finishes, control returns to the selector, which passes it to the next skill.

In the agent design, each of the skills, including the selector, is practiced and learned with reinforcement learning.

Defining Skills

The five action skills and the selector are all designed as trainable skills within the Composabl SDK, so that they can be taught with deep reinforcement learning.

Algorithms such as inverse kinematics can also execute the reach and move skills quite effectively. So, we may later decide to set trainable=False for these two skills and build them with an inverse kinematics Python library instead.
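To make the inverse kinematics alternative concrete, here is a minimal sketch for a planar arm with two links, using only the standard library. It is an illustration under simplifying assumptions (two joints, motion in a plane), not the arm model from this example, and the function names `two_link_ik` and `forward_kinematics` are hypothetical rather than part of the Composabl SDK or any IK library.

```python
import math


def two_link_ik(x, y, l1, l2):
    """Return (shoulder, elbow) angles in radians that place the tip of a
    planar two-link arm at (x, y). Returns one of the two mirror solutions."""
    d2 = x * x + y * y
    # The law of cosines relates the target distance to the elbow angle.
    cos_elbow = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(cos_elbow) > 1.0 + 1e-9:
        raise ValueError("target is out of reach")
    # Clamp to guard against tiny floating-point overshoot at full extension.
    cos_elbow = max(-1.0, min(1.0, cos_elbow))
    elbow = math.acos(cos_elbow)
    shoulder = math.atan2(y, x) - math.atan2(
        l2 * math.sin(elbow), l1 + l2 * math.cos(elbow)
    )
    return shoulder, elbow


def forward_kinematics(shoulder, elbow, l1, l2):
    """Return the (x, y) tip position for the given joint angles."""
    x = l1 * math.cos(shoulder) + l2 * math.cos(shoulder + elbow)
    y = l1 * math.sin(shoulder) + l2 * math.sin(shoulder + elbow)
    return x, y


# Solve for a target, then check the answer by running it forward again.
shoulder, elbow = two_link_ik(1.0, 1.0, l1=1.0, l2=1.0)
tip_x, tip_y = forward_kinematics(shoulder, elbow, l1=1.0, l2=1.0)
```

Because the solution is closed-form, a skill built this way needs no training, which is why it is a candidate for a non-trainable skill.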

Reach

The reach skill extends the arm out from the robot body by extending the "elbow" and "wrist."

Define the skill like this:

python
reach_skill = Skill("reach", ReachSkillTeacher)
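The ReachSkillTeacher class itself is not shown in this example. As a rough illustration of what a reach skill could be taught to optimize, the sketch below scores the arm by how close the hand is to the block and declares success within a tolerance. It is plain Python, not the Composabl teacher interface. The function names, the 3D position tuples, and the 0.02 tolerance are all assumptions made for illustration.

```python
import math


def reach_reward(hand_pos, block_pos):
    """Hypothetical reward: the negative distance between hand and block,
    so the reward rises toward zero as the hand approaches the block."""
    return -math.dist(hand_pos, block_pos)


def reach_succeeded(hand_pos, block_pos, tolerance=0.02):
    """Hypothetical success criterion: the hand is within `tolerance`
    (in the same units as the positions) of the block."""
    return math.dist(hand_pos, block_pos) <= tolerance


# Example: the hand is 5 units from the block, so the skill is not done yet.
reward = reach_reward((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
done = reach_succeeded((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
```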

Move

The move skill moves the arm laterally using the "shoulder."

Define the skill like this:

python
move_skill = Skill("move", MoveSkillTeacher)

Orient

The orient skill turns the "wrist" to orient the end effector "hand."

Define the skill like this:

python
orient = Skill("orient", OrientSkillTeacher)

Grasp

The grasp skill manipulates the fingers of the end effector "hand" to clamp down on the block.

Define the skill like this:

python
grasp_skill = Skill("grasp", GraspSkillTeacher)

Stack

The stack skill uses the fingers to hold onto the block while performing a lateral movement, much like the "shoulder" movement in the move skill above.

Define the skill like this:

python
stack_skill = Skill("stack", TeachStackSkillTeacher)

Selector

The selector skill determines when to use each of the above skills. It acts like a supervisor that assigns work to each skill, as if to a worker, based on the scenario.

Define the skill like this:

python
selector_skill = Skill("selector", SelectorSkillTeacher)

Orchestration

Now, let's orchestrate the skills together in the agent.

python
agent.add_selector_skill(selector_skill, [reach_skill, move_skill, orient, grasp_skill, stack_skill], fixed_order=True, fixed_order_repeat=False)

In this orchestration, the success criteria for each skill must be met before control is handed back to the selector to decide the next skill.
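As a mental model of this fixed-order hand-off, the sketch below runs each skill until its success criterion is met and only then moves to the next one. It is plain Python and does not use the Composabl SDK. The `run_fixed_order` helper, the toy skills, and the single-number "state" are hypothetical stand-ins, and the real selector's behavior may differ.

```python
SKILL_NAMES = ["reach", "move", "orient", "grasp", "stack"]


def make_toy_skill(name, target):
    """Toy stand-in for a skill: the state is a single progress number,
    each action adds 1, and the skill succeeds once progress >= target."""
    return (name, lambda state: state + 1, lambda state: state >= target)


def run_fixed_order(skills, state, max_steps=100):
    """Run skills in list order. Each skill keeps acting until its success
    criterion is met, then control passes to the next skill in the list."""
    completed = []
    for name, act, is_done in skills:
        steps = 0
        while not is_done(state):
            if steps >= max_steps:
                raise RuntimeError(f"{name} did not meet its success criterion")
            state = act(state)
            steps += 1
        completed.append(name)
    return state, completed


# Each toy skill needs two more units of progress than the one before it.
skills = [make_toy_skill(n, 2 * (i + 1)) for i, n in enumerate(SKILL_NAMES)]
final_state, completed = run_fixed_order(skills, state=0)
```

In this sketch the list is traversed once and then stops, which is how the example reads fixed_order_repeat=False.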