Configure Programmed Algorithms as Skill Agents

Controllers are programmed skill agents used in Composabl agent systems. They can be optimizers, PID or MPC controllers, heuristics, or any other Python algorithm that is useful for controlling part of a process. Configure controllers in the SDK, then publish them to the UI for use in agent systems.

Create a New Skill Agent

To create a skill agent in the Python SDK, first log in by typing `composabl login` in the CLI.

Then type `composabl skill new`.

Give the skill agent a name and a description in response to the prompts that follow. When asked for the skill agent type, choose controller (a programmed module such as an optimization algorithm or MPC controller).

Specify the folder where you’d like to create the skill agent.

The Composabl SDK will create a folder and a Python controller.py file from the template.

The Python Controller Class

The Python controller class offers several functions that you can use to build your algorithms or rule-based controllers in Composabl.

Functions for Training

Controllers don't need training, since they are based on programming rather than learning, but they include functions that connect them to the rest of the agent system during training.

Initialize Your Algorithm: the `__init__` Function

The __init__ function initializes your algorithm and defines initial configuration variables. It is called once, when the runtime starts. Suppose that you want to use an MPC algorithm:

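def __init__(self, *args, **kwargs):
    # Start the step counter at zero; compute_action increments it on every call.
    self.counter = 0
    # LinearMPC stands in for your own MPC implementation, imported or defined in controller.py.
    self.mpc = LinearMPC()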
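The initialization pattern above can be tried outside the platform with a small stand-in algorithm. The sketch below is only an illustration: `ProportionalController`, `TemperatureController`, and the `gain` and `setpoint` keyword arguments are hypothetical names that do not come from the Composabl SDK, and a real controller would extend the base class used in the generated controller.py template.

```python
class ProportionalController:
    """Hypothetical stand-in for a real algorithm such as an MPC solver."""

    def __init__(self, gain, setpoint):
        self.gain = gain
        self.setpoint = setpoint

    def solve(self, measurement):
        # Proportional law: the action grows with the distance from the setpoint.
        return self.gain * (self.setpoint - measurement)


class TemperatureController:
    """Sketch of a controller skill agent (no SDK base class, for illustration)."""

    def __init__(self, *args, **kwargs):
        # Counters and other state must be created here before they are used.
        self.counter = 0
        # Configuration is read from keyword arguments (assumed names) with defaults.
        self.algorithm = ProportionalController(
            gain=kwargs.get("gain", 0.5),
            setpoint=kwargs.get("setpoint", 70.0),
        )


controller = TemperatureController(setpoint=80.0)
```

Creating all state in `__init__` means later functions can rely on `self.counter` and `self.algorithm` being present from the first call.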

Process Observation to Compute Action: the `compute_action` Function

The compute_action function processes the observation and returns an action. The return value is a number that represents the action sent to the simulation.

async def compute_action(self, obs, action):
    self.counter += 1
    action = self.mpc.solve(obs)
    return action
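As a concrete illustration of this function, here is a minimal sketch that replaces the MPC solver with a proportional rule. It assumes the observation arrives as a dictionary keyed by sensor name and that the actuator accepts values between -1 and 1. Both are assumptions made for the example, as are the names `ThermostatController` and `"temperature"`.

```python
import asyncio


class ThermostatController:
    """Sketch of a rule-based controller; names and ranges are hypothetical."""

    def __init__(self, setpoint=70.0, gain=0.5):
        self.setpoint = setpoint
        self.gain = gain
        self.counter = 0

    async def compute_action(self, obs, action):
        # Track how many decisions this controller has made.
        self.counter += 1
        # Assumption: obs is a dict keyed by sensor name.
        error = self.setpoint - obs["temperature"]
        # Clamp the proportional response to the assumed actuator range [-1, 1].
        return max(-1.0, min(1.0, self.gain * error))


controller = ThermostatController()
heating = asyncio.run(controller.compute_action({"temperature": 69.0}, None))
saturated = asyncio.run(controller.compute_action({"temperature": 50.0}, None))
```

Outside the platform, the coroutine can be driven with `asyncio.run` as shown, which is convenient for unit-testing control logic before publishing it.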

End Simulation: the `compute_termination` Function

The compute_termination function tells the Composabl platform when to terminate a practice episode and start over with a new practice scenario (episode). From a controller perspective, it makes the most sense to terminate an episode when the agent system succeeds, fails, or is pursuing a course of action that you do not find likely to succeed. This function returns a Boolean flag (True or False) that indicates whether to terminate the episode. You can calculate this criterion however seems best.

async def compute_termination(self, transformed_sensors, action):
    return False
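A termination rule that does more than return False might combine a safety check with a step limit. The sketch below assumes a dictionary observation with a hypothetical `"level"` sensor. The safe range, the step limit, and the class name `TankController` are illustrative values, not platform defaults.

```python
import asyncio


class TankController:
    """Sketch of a termination rule; thresholds are hypothetical."""

    MAX_STEPS = 200          # give up after this many decisions
    SAFE_LEVEL = (0.1, 0.9)  # tank level must stay inside this range

    def __init__(self):
        # In a full controller, compute_action would increment this counter.
        self.counter = 0

    async def compute_termination(self, transformed_sensors, action):
        low, high = self.SAFE_LEVEL
        level = transformed_sensors["level"]
        out_of_bounds = level < low or level > high
        timed_out = self.counter >= self.MAX_STEPS
        # End the episode on failure (unsafe level) or when it runs too long.
        return out_of_bounds or timed_out


controller = TankController()
normal = asyncio.run(controller.compute_termination({"level": 0.5}, None))
overflow = asyncio.run(controller.compute_termination({"level": 0.95}, None))
controller.counter = 200
too_long = asyncio.run(controller.compute_termination({"level": 0.5}, None))
```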

Define Success: the `compute_success_criteria` Function

The compute_success_criteria function provides a definition of the skill agent's success and a proxy for how completely the agent system has learned the skill. The platform uses the output of this function (True or False) to decide when to stop training one skill agent and move on to training the next. It is also used to determine when to move to the next skill agent in a fixed-order sequence: the agent system cannot move from one skill agent in the sequence to the next until the success criteria for the current skill agent are met.

async def compute_success_criteria(self, transformed_sensors, action):
    return self.counter > 100

Here are some examples of success criteria definitions:

A simple but naive success criterion might return True if a certain sensor or KPI value for an episode or scenario crosses a threshold, and False if it does not.
A complex success criterion might compare the agent system to a benchmark controller or another agent system across many key variables and trials. It returns True if the agent system beats the benchmark on these criteria, and False otherwise.
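Both kinds of criteria can be sketched in a few lines. The example below assumes a dictionary observation with a hypothetical `"yield"` KPI, and it interprets "beats the benchmark" as a higher mean score on every key variable. That interpretation, the helper `beats_benchmark`, and all numbers are assumptions chosen for illustration.

```python
import asyncio
from statistics import mean


def beats_benchmark(agent_scores, benchmark_scores):
    """Return True if the agent's mean score is higher on every key variable."""
    return all(
        mean(agent_scores[name]) > mean(benchmark_scores[name])
        for name in benchmark_scores
    )


class YieldController:
    TARGET_YIELD = 0.95  # hypothetical KPI threshold

    async def compute_success_criteria(self, transformed_sensors, action):
        # Simple criterion: the KPI for this step crosses a fixed threshold.
        return transformed_sensors["yield"] >= self.TARGET_YIELD


controller = YieldController()
simple_result = asyncio.run(
    controller.compute_success_criteria({"yield": 0.97}, None)
)

# Complex criterion: compare per-trial scores against a benchmark controller.
agent = {"yield": [0.90, 0.94], "energy_savings": [0.30, 0.34]}
benchmark = {"yield": [0.88, 0.90], "energy_savings": [0.25, 0.27]}
complex_result = beats_benchmark(agent, benchmark)
```

The threshold version is easy to reason about but judges a single value. The benchmark version needs results from many trials, so it would typically be fed from data the controller accumulates over episodes.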

Functions to Manage Information Inside Agent Systems

As information passes through perceptors, skill agents, and selectors in the agent system, sometimes it needs to change format along the way. You can use three teaching functions to transform sensor and action variables inside agent systems: transform_sensors, transform_action, and filtered_sensor_space.

Transform Sensor Variables: the `transform_sensors` Function

To transform sensor variables, use the transform_sensors function to calculate changes to specific sensors, then return the complete set of sensor variables (the observation space).

async def transform_sensors(self, sensors, action):
    return sensors

Two of the most common reasons for transforming sensor variables are conversion and normalization. For example, if a simulator reports temperature values in Fahrenheit, but the agent system expects temperature values in Celsius, use the transform_sensors function to convert between the two.

Normalization means rescaling variables into a common range. For example, one sensor variable in your agent system might have very large values (in the thousands), while another has small values (in the tenths). You might use the transform_sensors function to rescale these disparate sensor values to a range from 0 to 1 so that they can be compared and used more easily in the agent system.
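The conversion and normalization cases described above can be sketched together in one `transform_sensors` implementation. The example assumes the sensors arrive as a dictionary. The sensor names, the pressure range, and the class name `ReactorController` are hypothetical.

```python
import asyncio

# Hypothetical physical range of the pressure sensor, used for normalization.
PRESSURE_RANGE = (0.0, 5000.0)


class ReactorController:
    async def transform_sensors(self, sensors, action):
        # Copy first so the caller's data is left untouched.
        transformed = dict(sensors)
        # Conversion: Fahrenheit to Celsius.
        transformed["temperature"] = (sensors["temperature"] - 32.0) * 5.0 / 9.0
        # Normalization: min-max scaling of pressure into the range 0 to 1.
        low, high = PRESSURE_RANGE
        transformed["pressure"] = (sensors["pressure"] - low) / (high - low)
        # Return the complete set of sensor variables, changed or not.
        return transformed


raw = {"temperature": 212.0, "pressure": 2500.0, "valve": 1}
result = asyncio.run(ReactorController().transform_sensors(raw, None))
```

Note that the untouched `"valve"` entry is passed through, since the function is expected to return the whole observation and not only the values it modified.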

Transform Decisions within the Agent System: the `transform_action` Function

You may want to transform action variables for the same reasons as sensor variables.

async def transform_action(self, transformed_sensors, action):
    return action
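One common case is mapping a normalized decision back to physical units. The sketch below assumes the algorithm emits a value between -1 and 1 and that the simulation expects a valve opening between 0 and 100 percent. Both ranges, and the name `ValveController`, are assumptions for illustration.

```python
import asyncio

# Hypothetical actuator range expected by the simulation, in percent open.
VALVE_MIN, VALVE_MAX = 0.0, 100.0


class ValveController:
    async def transform_action(self, transformed_sensors, action):
        # Guard against out-of-range decisions before rescaling.
        clipped = max(-1.0, min(1.0, action))
        # Map [-1, 1] linearly onto [VALVE_MIN, VALVE_MAX].
        fraction = (clipped + 1.0) / 2.0
        return VALVE_MIN + fraction * (VALVE_MAX - VALVE_MIN)


controller = ValveController()
half_open = asyncio.run(controller.transform_action({}, 0.0))
fully_open = asyncio.run(controller.transform_action({}, 3.0))
closed = asyncio.run(controller.transform_action({}, -1.0))
```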

Filter the Sensor List: the `filtered_sensor_space` Function

Use the filtered_sensor_space function to pare down the list of sensor variables you need for a particular skill agent. Pass only the information that a skill agent or module needs in order to learn or perform well.

def filtered_sensor_space(self):
    return ["state1"]

Return a list of all the sensor variables that you want passed to the skill agent by this controller.
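To make the effect of the returned list concrete, the sketch below applies it to a full observation by hand. The platform performs the actual filtering. The helper `apply_filter`, the class `LevelController`, and the sensor names are hypothetical and only illustrate the idea, assuming a dictionary observation.

```python
class LevelController:
    def filtered_sensor_space(self):
        # Only these sensors are relevant to this skill agent.
        return ["level", "inflow"]


def apply_filter(controller, sensors):
    """Illustrative helper: keep only the sensors the controller asks for."""
    wanted = controller.filtered_sensor_space()
    return {name: sensors[name] for name in wanted}


full_observation = {"level": 0.4, "inflow": 1.2, "ambient_temperature": 21.0}
filtered = apply_filter(LevelController(), full_observation)
```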


