Glossary

Controller: A skill agent that makes decisions based on programmed algorithms. Also called a programmed skill agent.

Decisions: The outputs of a Composabl system - its instructions for controlling the system

Design patterns: Common structures that can be used to quickly create multi-agent systems

Episode: An entire run through the task

Functional Pattern: A machine teaching design pattern used to orchestrate skills in sequences

Iteration: One decision during agent training or performance

Learned skill agent: A skill agent that uses DRL to make decisions and learn by practicing in simulation

Orchestration: Arranging agents as modular building blocks to work together to complete tasks

Orchestrator: A special type of skill agent that assigns decision-making control to the right decision-making skill agent based on current conditions

Perceptor: An ML model or other algorithm that interprets sensor data

Plan-Execute Pattern: A machine teaching design pattern used to orchestrate skill agents in pairs that work together to make decisions

Programmed skill agent: A skill agent that makes decisions based on programmed algorithms. Also called a controller.

Project: A collection of agents for the same use case that share the same simulator and top-level goal

Scenarios: Conditions that are associated with specific skills

Sensors: The part of the system that takes in information from the simulator or real environment - the eyes, ears, and other senses of the system

Simulator: The virtual environment where agents practice and improve performance

Skill agents: Modules within a multi-agent system that make decisions to complete all or part of a task

Strategy Pattern: A machine teaching design pattern used to orchestrate skill agents in hierarchies

Teacher: An algorithm that creates a skill agent that uses DRL to learn to make decisions

Welcome to Composabl

Composabl is the platform for creating industrial-strength multi-agent AI systems that make high-impact decisions in the real world.

With Composabl, you can convert expert knowledge about how a process works into a team of agents with specialized skills that work together to make the right decision in every situation. These skills can be either programmed or learned through advanced AI techniques, and they are orchestrated so that the multi-agent system performs effectively in every part of the process and under any conditions. For skill agents that learn by practicing, Composabl trains them in realistic scenarios until the agent system can succeed at the task and outperform the alternatives.

Quick Links

Find articles on key topics with these links.

Try Out a Tutorial

Integrate with Composabl

You can use any model or Python algorithm with Composabl for training agent systems, adding perception, analysis, and communication, and making decisions. See how to configure different types of modules in the UI, and how to publish them via the data science workflow for agent system design, training, and deployment.

Create Modular Skill Agents

Composabl multi-agent systems are built on modular skills that break down a task into separate parts. Learn how to create skill agents to train with deep reinforcement learning.

Deploy Multi-Agent Systems

Once Composabl agentic systems are designed and trained, you can export them to the Composabl runtime to connect with your system. Learn how to deploy an agent within the runtime container and how to use Composabl's tools to analyze agent behavior during both training and deployment.

CLI Reference

Below, you can find the commands available in the Composabl CLI.

Login

| Command | Description |
| --- | --- |
| composabl login | Login to the Composabl platform |

Agent

| Command | Description |
| --- | --- |
| composabl agent train --json-path <PATH> | Train the agent based on the JSON structure provided |
| composabl agent visualize | Visualize the agent structure |

Sim (composabl sim)

| Command | Description |
| --- | --- |
| composabl sim new --name <NAME> --description <DESCRIPTION> --location <PATH> | Create a new sim template. This will create the boilerplate required to submit a sim |
| composabl sim validate <PATH> | Validate the simulator at the given path |
| composabl sim validate --path <PATH> | Validate the simulator at the given path |
| composabl sim validate --address <ADDRESS> | Validate the simulator running at the given address |
| composabl sim run <PATH> | Run a sim created with the command above locally (useful for development) |
| composabl sim list | List the sims associated with an organization in the Composabl UI |
| composabl sim list --local | List the Docker images for sims running locally |
| composabl sim status | Show the status of currently running local sims |
| composabl sim stop --sim-id <SIM_ID> | Stop the specified running sim |
| composabl sim stop | Stop all running sims |
| composabl sim mappings --address <ADDRESS> --image <IMAGE> | Get the observation and action space mappings for the sim at the given address or image |
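
For example, a typical local development loop with these commands might look like this (the sim name and paths are placeholders):

composabl sim new --name my-sim --description "Demo sim" --location ./my-sim
composabl sim validate ./my-sim
composabl sim run ./my-sim
composabl sim status
composabl sim stop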

Perceptor (composabl perceptor)

| Command | Description |
| --- | --- |
| composabl perceptor new --name <NAME> --description <DESCRIPTION> --location <PATH> | Create a new perceptor template. This will create the boilerplate required to submit a perceptor |
| composabl perceptor publish <PATH> | Publish the perceptor to the NoCode app |

Skill (composabl skill)

| Command | Description |
| --- | --- |
| composabl skill new --name <NAME> --description <DESCRIPTION> --location <PATH> | Create a new skill template. This will create the boilerplate required to submit a skill |
| composabl skill publish <PATH> | Publish the skill to the NoCode app |

Selector (composabl selector)

| Command | Description |
| --- | --- |
| composabl selector new --name <NAME> --description <DESCRIPTION> --location <PATH> | Create a new selector template. This will create the boilerplate required to submit a selector |
| composabl selector publish <PATH> | Publish the selector to the NoCode app |

Job (composabl job)

| Command | Description |
| --- | --- |
| composabl job list | List the running jobs |
| composabl job stop <JOB_ID> | Stop the running job |
| composabl job submit --path <PATH> --job-id <JOB_ID> --json <JSON> --template-path <TEMPLATE_PATH> | Submit a job for training |
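
For example (the path and job ID are placeholders, and some flags may not be needed for every job):

composabl job submit --path ./my-skill --job-id my-job
composabl job list
composabl job stop my-job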


Integrate a machine learning model

Use existing ML models for machine vision, prediction, or analysis in Composabl agent systems

Integrate a language model

Set up an LLM call to add communication to your agent system

Integrate a programmed algorithm

Configure any Python algorithm, such as a PID controller or optimization algorithm, as a skill agent in Composabl

Integrate third-party software

Use an API call to external software as a skill agent in Composabl

Integrate a simulator

Configure your simulator to work with Composabl

Publish to the platform

Upload configured models and algorithms to the Composabl UI with one CLI command

Create skill agents with goals and constraints

Create skills agents with subject matter expertise by configuring goals and constraints for learning

Create skill agents with rewards using the SDK

Create skill agents that learn with rewards and access additional teaching tools in Python

Publish to the platform

Upload skill agents configured with the SDK to the Composabl UI for training and deployment

Evaluate the performance of your multi-agent system

Evaluate performance using the Composabl benchmarking feature

Deploy a Composabl multi-agent system

Export a multi-agent system and connect to the Composabl runtime container

Connect the multi-agent system to your production system

Connect the Composabl runtime container to your system for deployment

Audit multi-agent system behavior with the historian

Use the Composabl historian to analyze system behavior in detail

Composabl SDK Documentation

Industrial Mixer

This tutorial will take you through the process of building agents for a realistic chemical manufacturing use case.

We wrote a white paper about this use case. Read the white paper to learn about the problem, the agent system designs, and agent system performance. Use this tutorial to build and train all the agents for yourself. The first thing you will do is create your use case in the Composabl UI.

  1. Learn about the use case (this page)

  2. Experiment with your own designs

About the Use Case

Why an industrial mixer?

Mixing and blending tasks are great use cases for intelligent agent systems because they are both complex and high-value. In the example in this tutorial, process improvements could lead to millions in ROI.

Use Case Overview

In the industrial mixer use case, raw materials are stirred together inside a tank, undergoing a reaction that produces the desired end product.

The goal of the process is to convert as much of the raw material as possible. But as the chemicals mix and the conversion occurs, the tank heats up. If the temperature gets too high, a condition called “thermal runaway” occurs, potentially causing explosions and fires.

To produce as much chemical as possible, the operator must constantly adjust the temperature in the tank, keeping it high enough to allow productivity but low enough to avoid any thermal runaway risk.

As in all machine teaching use cases, this process can be summarized in the form of a goal (maximize yield) and a constraint (avoid thermal runaway) that must be balanced against each other.

The process is controlled by adjusting the mixture's temperature in the tank using a "jacket" filled with coolant. Lowering the coolant temperature in the jacket lowers the temperature in the tank, decreasing the risk of thermal runaway.

However, cooling the tank can also reduce yield. By how much? The answer varies unpredictably – temperature changes affect chemical concentration differently at different parts of the reaction. That nonlinear relationship between temperature and yield is why this is a nuanced process that benefits so much from intelligent automation.

Simulating a Complex Reaction

The industrial mixer simulator uses principles of chemistry to model the behavior of the system in real life.

Learn more about the simulator

The reaction is modeled with continuous stirred tank reactor (CSTR) dynamics.
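
In the notation of the MPC controller code later in this tutorial, the concentration Ca and the temperature T in the tank evolve as

$$\frac{dC_a}{dt} = \frac{F}{V}(C_{af} - C_a) - k_0\,e^{-E/(RT)}\,C_a$$

$$\frac{dT}{dt} = \frac{F}{V}(T_f - T) - \frac{\Delta H}{\rho C_p}\,k_0\,e^{-E/(RT)}\,C_a - \frac{UA}{\rho C_p V}(T - T_c)$$

where F is the volumetric flow rate, V the reactor volume, Caf the inlet concentration, Tf the feed temperature, ΔH the heat of reaction, ρCp the density times heat capacity, UA the heat transfer coefficient times tank area, and Tc the cooling jacket temperature.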

For evaluating agent performance, yield is defined as the percentage of the product that is at or above the concentration defined by the manufacturer as meeting product specifications. The threshold for thermal runaway is set at 400 kelvin (K); in the real reaction, the threshold would vary somewhat depending on other factors.

The reaction is nuanced because the conversion rate and the temperature have a nonlinear correlation, expressed by the Arrhenius equation:
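
$$k = k_0\,e^{-E/(RT)}$$

where k is the reaction rate constant, k0 the pre-exponential factor, E the activation energy, R the ideal gas constant, and T the temperature. Because the rate depends exponentially on temperature, small temperature changes can produce large swings in conversion, which is what makes the control problem nonlinear.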

Agents can be tested with different levels of noise, random fluctuations that distort the ability to recognize patterns. All the agent results in this tutorial are tested with a noise level of 5%. Agents generally perform better with less noise, but as the noise level increases, some agent designs handle it better than others. The plan-execute pattern is the best of these agents at handling noise.

Benchmark

The current automation solution is a linear MPC controller. As the incumbent solution, this agent system's performance is the benchmark for the other designs.

The image below shows an MPC controller represented in Composabl's visual agent design system. The agent takes in sensor information about the temperature in the tank and the concentrations of the chemicals.

It passes that information to the skills layer of the agent system. The skills layer contains a single programmed skill agent: control reactor. This skill uses a mathematical model to determine the desired temperature set point for the tank. It also determines the control actions to take to achieve that temperature using the cooling jacket, and outputs those actions as decisions.

In simulation, this agent system's conversion rate was 82%: 82% of the reagents were turned into product, with 18% wasted. However, the risk of thermal runaway was high.

Like all technologies, model predictive control has a “personality,” a unique set of strengths and weaknesses that can be seen reflected in this performance. MPC is a rule follower. It is a way of controlling a process using mathematical relationships that can be fully mapped and programmed by human engineers. It works well in situations that are straightforward and linear.

The agent does a good job in the first steady state at the start. However, as the transition phase begins, the agent's performance starts to fail. The wide shadow shows that this agent can't adapt to the unpredictable conditions in the transition. Its performance becomes dangerously inconsistent, potentially allowing the temperature to exceed the thermal runaway threshold at nearly every point in the reaction.

When an MPC controller is used to control this process in the real world, a human operator needs to step in and take over control before the automated system lets the temperature cross the thermal runaway threshold.

Get Started

The following tutorials will walk you through the anatomy of some multi-agent systems that take different approaches to learning how to control the temperature of a mixer more effectively to maximize yield and avoid thermal runaway.

Deep Reinforcement Learning

The DRL agent system is a simple design with only one skill agent. This agent system does not use machine teaching to decompose the task into skills that can be trained separately. Instead, the entire reaction is controlled by a skill agent trained with deep reinforcement learning.

Let's get started!

1. Create your first skill agent

This agent system has a single skill agent called Control Full Reaction. To create this skill agent in the UI, go to the skill agent page and click Create new skill agent.

2. Set skill agent goals and constraints

Configure your skill agent to set the instructions for its training sessions. This agent has one goal, to maximize yield, and one constraint, to keep the temperature from going above 400 kelvin (K).

  1. Click Add goal. In the left drop-down menu, select Maximize, and in the right one, select Eps_Yield. This means the agent will train with the goal of maximizing the total product produced by the end of each episode.

  2. Click Add constraint. In the left drop-down menu, select Avoid, and in the right one, select T. After you select T, a slider appears for you to set the boundaries you want to train the system to avoid. In this case, set the boundaries from 400 to 500.

  3. Save your skill agent configuration and return to the Agent Orchestration Studio.

3. Create a Scenario

Set scenarios to tell each skill agent what specific conditions or phases of the process to practice in. This skill agent controls the full reaction, so it needs to practice with the reaction as a whole.

Go to the Scenarios page and select Add scenario, then name it Control full reaction and click Save. We're going to add two criteria to this scenario: a reference concentration and a reference temperature.

Control full reaction: Cref is 8.57, Tref is 311

4. Add the Skill Agent to Your Agent Configuration

Drag the skill agent control_full_reaction that you can now see on the left-hand side of your project onto the skills layer. Click on the skill agent once it's in the skills layer and assign the scenario.

5. Run Your Training Session

Now, we are ready to train your agent and see the results. First, select our built-in training cluster or one you own and have connected to the platform. Then set the number of training cycles; for this tutorial, we suggest running 50. You can run multiple simulations in parallel to speed up training. Under advanced options, you can use GPUs instead of CPUs, set a rollout fragment length, and set the number of benchmark runs.

Once you have everything configured, click Allocate training cycles. This agent system design has only one skill agent, so all training cycles will be allocated to our DRL agent. In a multi-agent system, you can assign a different number of training cycles to each agent depending on the complexity of its skill.

6. View Results

When the training has been completed, you can view your results in the training sessions tab in the UI. This will show you information on how well the agent is learning.

You will likely see a steep learning curve as the agent experiments with different control strategies and learns from the results. When the learning curve plateaus, that usually means that the skill is trained.

Analyze the DRL Agent's Performance

Conversion rate: 90% Thermal runaway risk: Low

We tested this fully trained agent and plotted the results.

The DRL agent system performs well. Its relatively thin shadow means that it performs consistently over different conditions and stays within the safety threshold almost every time.

This agent controls the initial steady state well, staying on the benchmark line. But during the transition, the DRL agent goes off the benchmark line quite a bit. It doesn't notice right away when the transition phase begins, staying too long in the lower region of the graph and then overcorrecting. That's because DRL works by experimentation, teaching itself how to get results by exploring every possible way to tackle a problem. It has no prior knowledge or understanding of a situation and relies entirely on trial and error. That means it is potentially well-suited to complex processes, like the transition phase, that can’t be easily represented mathematically.

However, its behavior is erratic because it can’t distinguish between the phases. The DRL agent’s skills do better than the traditional automation benchmark, but still leave room for improvement.

Get Started

This getting started guide takes you through the steps you need to complete before you can start building your teams of agents.

Prerequisites

  • A Composabl account

Create Your Use Case

Select the simulator you will use for the project and the assigned team. If you haven't created a team yet, you will need to do that before you create your first project.

Then, you can use the AI prompt to explain what you are attempting to achieve with your project and have it bootstrap the use case based on your description. We'll describe the use case's skills, goals, and constraints and call it Chemical Process Control.

Click New use case, then Create use case.

Set Use Case Goals

To finish setting up your use case, set your top-level project goal.

Now, we will set the goal for this example. In this project, we want to:

  • Maximize yield (make as much product as we can)

Goal Title and Description

Fill out the goal title and description for the top-level goal.

Goal Title: Chemical Process Control Goals

Description: The goal for this project is to maximize yield

Logical Structure

We are going to add one condition to represent our goal.

  1. To maximize yield, the objective is Maximize, and the variable is Eps_Yield.

Now you can save your changes.

Now that you have created your new use case and set the goal, you can start creating agent systems!

Optional: Set Up the Data Science Workflow

Some of the steps in this tutorial use Composabl's data science workflow, which allows data scientists and controls engineers to publish any Python module to the Composabl platform to use in multi-agent systems.

To use this workflow, complete these steps:

Now that you have your project created and your goals set, you're ready to start creating agents!

Plan-Execute Pattern

Along with the strategy pattern and the functional pattern, the plan-execute pattern is one of the major design patterns of machine teaching. In this pattern, the skill agents work together in a skill group, with the first skill agent determining what the action should be and the second skill agent determining how to achieve it.

What is special about this agent system is that it combines DRL and MPC, the technologies from the two single-skill agent systems — the worst performers — to create the best-performing agent.

In this example, the DRL skill agent first uses its powers of learning and experimentation to determine the goal temperature for the cooling jacket — the set point. It then passes this information on to the MPC skill agent, which uses its powers of control and execution to direct the agent on what action to take to achieve the desired temperature.

Remember how the strategy pattern is like a math class where each student solves the problems they are best at, as assigned by the teacher? In the plan-execute pattern, the students work in groups to solve problems together. Let’s say Student A is good at translating word problems into equations, while Student B is good at solving equations. Student A works on each problem first, and then passes it over to Student B, who produces the solution. No teacher is needed here, because the students divide each problem the same way.

This tutorial will show you how to publish the MPC controller to the platform using the data science workflow and then use it to create a multi-agent system using the plan-execute pattern.

Let's get started configuring this agent!

1. Publish the MPC Skill Agent to Your Project

This agent system has two skill agents, called control_full_reaction and mpc-skill-group. We already have control_full_reaction in our project, so we only need to publish mpc-skill-group to build this agent system in the Agent Builder UI. To publish mpc-skill-group to your use case, open your favorite code editor and terminal. In your terminal, navigate to the skills folder and use this command with the Composabl CLI:

composabl skill publish mpc-skill-group

Return to the agent orchestration studio and refresh the page. The skill agent will appear in the skills menu on the left of your page.

Explore the Code Files

All skill agents, perceptors, and orchestrators have at least two files in them: a Python file containing the code the skill agent will use, and a config file.

  1. pyproject.toml, a config file with the following information.

  2. A Python file containing the code the skill agent will use. For this skill agent, that file is controller.py, with explanations in comments inline.

File Structure

See the Code

MPC Skill Group Controller Skill Agent

pyproject.toml


controller.py


2. Build the Plan Execute Pattern Agent System in the Agent Orchestrator UI

First, drag the skill agent control_full_reaction from the left-hand side of the page to the skill layer. Once it's there, drag over the mpc-skill-group and make sure that it is dropped below the control_full_reaction skill agent and not beside it.

3. Run Your Training Session

We are ready to train your agent system and see the results. Select the cluster you want to use and the number of training cycles; we suggest running 50. The skill agents train one at a time, and you assign the number of cycles you want each skill agent to use. Composabl assigns an equal number of training cycles to each skill agent by default, but in some agent system designs, some skill agents might require more training than others.

4. View Results

When the training has been completed, you can view your results in the training sessions tab in the UI. This will show you information on how well the agent is learning.

Analyzing the Plan-Execute Pattern Agent’s Performance

Conversion rate: 95% Thermal runaway risk: Very low

We tested this fully trained agent and plotted the results.

This agent system is the best performer of the group. Combining two imperfect technologies with machine teaching produces much better results than either technology achieves alone.

Strategy Pattern with a Perception Layer

The strategy pattern agent performs well, but it's not perfect at avoiding thermal runaway. One good way to address that is to add a perception layer.

Perceptors are special skill agents that process and interpret sensor data before passing it to the rest of the agent system. To improve the strategy pattern's performance on temperature control, you can add a perception layer that uses machine learning to predict thermal runaway.

In this tutorial, you'll use Composabl's data science workflow to publish the perceptor to your use case so that you can add it to your agent configuration.

While skill agents can be built within Composabl's no-code studio or created externally and then imported, perceptors are always built outside of Composabl and then published to the platform to use in agents.

In this case, the perceptor is a pre-built ML model saved as a pickle file. This ML model is trained to interpret the sensor data and check for conditions that might indicate an elevated risk of thermal runaway, and then pass that information to the selector along with the rest of the sensor data.

Think of the perception layer as an additional set of senses that helps the agent system predict when something might go wrong, like a teacher monitoring the class for early signs of trouble.

Let's get started configuring this agent system!

1. Add the Perceptor Skill Agent to Your Use Case

This agent system has a perceptor skill agent called thermal_runaway_predictor. To publish it to your use case, open your favorite code editor and terminal. In your terminal, navigate to the perceptors folder and use this command with the Composabl CLI:

composabl perceptor publish thermal_runaway_predictor

Return to the agent orchestration studio and refresh the page. The skill agent will appear in the skill agents menu on the left side of your page.

Explore the Code Files

All skill agents, perceptors, and selectors have a minimum of two files in them: a Python file containing the code that the agent system will use, and a config file. Perceptors may also include additional files to load ML models and other Python packages.

File Structure

Thermal Runaway Perceptor

pyproject.toml


thermal_runaway_predictor.py


2. Copy the Strategy Pattern Agent System, Name It Strategy Pattern with Perceptor, and Add the Perceptor Skill Agent

Drag the Perceptor thermal_runaway_predictor that you can now see on the left-hand side of your use case onto the perception layer.

3. Run Your Training Session

We are ready to train your agent system and see the results. Select the cluster you want to use and the number of training cycles; we suggest running 150. The skill agents train one at a time, and you assign the number of cycles you want each skill agent to use. Composabl assigns an equal number of training cycles to each skill agent by default, but in some agent system designs, some skill agents might require more training than others.

4. View Results

When the training has been completed, you can view your results in the training sessions tab in the UI. This will show you information on how well the agent system is learning.

The agent system training results will be a little bit different from the strategy pattern alone. That's because the thermal runaway predictor is making a difference in how the agent system performs.

Analyzing the Strategy Pattern Agent System’s Performance with Perception

Conversion rate: 92% Thermal runaway risk: Very low

We tested this fully trained agent system and plotted the results.

Adding perception improves agent system temperature control performance.

The red lines on the graph show where the perceptor helped the agent system make adjustments to avoid thermal runaway. This agent system gets the same yield as the strategy pattern agent, but the improved temperature control has reduced thermal runaway incidents from a low rate to zero.

Composabl Simulation API

Composabl's simulation API extends the Gymnasium-style environment interface.

You can integrate your simulator with the Composabl SDK by using the ServerComposabl class. This class provides the necessary methods for the simulator to interact with the Composabl SDK.

The methods of the ServerComposabl class allow the Composabl SDK to automatically take care of serializing and deserializing the different requests and responses.

To conform your simulator to the Composabl SDK, you must define a server implementation class whose methods define how to communicate with the Composabl SDK.

Set Up the Simulation Environment Instance

Make

Make is a request to create a new instance of the environment with the specifications requested.

  • string env_Id: Identifier for the type of environment to create.

  • dictionary env_Init: Initial configuration for the environment, as defined within the runtime configuration.

Sensor_space_info

Sensor_space_info provides details about the environment’s sensor space.

Action_space_info

Action_space_info defines the agent system's action space.

Action_space_sample

The action_space_sample function returns an element of the simulator’s action space.

Run the Simulation Environment Instance

Reset

Reset is a request to reset the environment, and returns the first observation of the newly reset environment.

  • observation: Initial observation of the environment.

  • dictionary info: Additional information about the reset environment.

Step

Step provides the agent system action to be applied to the environment. The return structure is as follows:

  • observation: The observation following the action.

  • float reward: The reward received after taking the action.

  • bool terminated: Whether the episode has ended.

  • bool truncated: Whether the episode was truncated before a natural conclusion.

  • dictionary info: Additional information about the step.

Close

Close signals that the simulator is done being used and may perform any necessary cleanup.

Set_Scenario

Set_scenario tells the simulator the current scenario the agent system wishes to train on.

Get_Scenario

Get_scenario returns the scenario that the simulation is currently running.

Create Visualizations

Get_Render

Get_render provides the current rendered image of the environment, either as a numpy array or a string.
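
Putting these operations together, a server implementation class has roughly the following shape. This is a schematic sketch only: the exact base class, method signatures, and space descriptions are defined by the Composabl SDK, and the single-temperature environment here is a made-up placeholder.

import random

class ExampleServerImpl:
    """Sketch of a server implementation. Method names follow the operations
    described above; consult the SDK reference for exact signatures."""

    def __init__(self):
        self.state = {"T": 298.0}
        self.scenario = None

    async def make(self, env_id, env_init):
        # Create a new environment instance with the requested configuration
        self.state = {"T": float(env_init.get("T", 298.0))}

    async def sensor_space_info(self):
        return {"T": "float"}  # placeholder description of the sensor space

    async def action_space_info(self):
        return {"dTc": "float"}  # placeholder description of the action space

    async def action_space_sample(self):
        return {"dTc": random.uniform(-10.0, 10.0)}

    async def reset(self):
        # Return the first observation of the newly reset environment, plus info
        self.state = {"T": 298.0}
        return self.state, {}

    async def step(self, action):
        # Apply the action, then return observation, reward, terminated,
        # truncated, and info, as described above
        self.state["T"] += action["dTc"]        # toy dynamics, for shape only
        reward = -abs(self.state["T"] - 311.0)  # toy reward
        terminated = self.state["T"] >= 400.0   # e.g. a thermal runaway threshold
        return self.state, reward, terminated, False, {}

    async def close(self):
        pass  # release any resources held by the environment

    async def set_scenario(self, scenario):
        self.scenario = scenario  # scenario the agent system wishes to train on

    async def get_scenario(self):
        return self.scenario

    async def get_render(self):
        return f"T={self.state['T']:.1f}"  # render as a string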

Sample Use Cases

The examples and code samples in this documentation refer to our sample use cases. These examples are real-world use cases with complex goals and constraints. In each case, the Composabl team has built agent systems that exceed the benchmark control technology by orders of magnitude.

Industrial Mixer

About the Use Case

The industrial mixer use case is a realistic case study of a chemical process control agent system controlling a continuous stirred tank chemical reaction. The agent system controls the temperature in a tank where a chemical reaction occurs to create a product.

As the chemicals are stirred together in the tank, the reaction produces heat at a nonlinear, unpredictable rate. If the tank isn’t cooled enough, it can reach dangerous temperatures, a condition called thermal runaway. If it’s cooled too much, not enough product will be produced. The agent system needs to balance these two goals, keeping the tank at the right temperature at every moment to optimize production while ensuring safety.

Explore Agent System Components

Production Scheduling

About the Use Case

The production scheduling use case is a complex production planning problem set in an industrial bakery. The agent system must determine the right amount of cookies, cakes, and cupcakes to make each day, directing teams of workers and equipment and responding to fluctuations in costs, pricing, and demand.

The case study, developed in partnership with a Composabl simulation partner, requires the agent system to make a choice every minute between 24 possible combinations of equipment, task, employee, and product, over the course of a 400-decision day, with the ultimate goal of maximizing profit.

Anatomy of a Multi-Agent System

Multi-agent systems have structure, just like our brains. Different parts perform different functions.

This modularity is a key to building high-performing agentic systems quickly and effectively. Most complex systems, from factories to rockets to software programs, are broken into modular pieces to separate concerns, reuse design patterns, and integrate pre-built components. Like these systems, decision-making AI also works best when decisions are separated into modular concerns. Modular structure makes intelligent agent systems easier to build, test, and maintain.

How Agentic Systems Make Decisions

Multi-agent systems work through a feedback loop. The system gets information from the simulation or real environment and then makes a decision to adjust one or more of its control variables in some way. This creates a change that is reflected in new sensor information that the agent system can then process and use to decide on the next adjustment.

While agent systems are being built and trained, they are connected to simulators that give them feedback about how their actions affect the system they are learning to control. Once an agent system is deployed, the simulator is replaced by an equivalent connection to the real system.

The simulator and the multi-agent system function in a continuous feedback loop, with the system outputting actions that affect the simulation environment, and the simulator returning new information to the agent system based on every decision.
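
As a toy illustration, the loop has the following shape. The object and method names here are placeholders for explanation, not the Composabl API.

class ToySim:
    """Minimal stand-in for a simulation environment."""
    def reset(self):
        self.t = 0
        return {"T": 298.0}, {}

    def step(self, action):
        self.t += 1
        obs = {"T": 298.0 + action}    # new sensor information after the action
        reward = -abs(obs["T"] - 311.0)
        terminated = self.t >= 90      # fixed-length episode
        return obs, reward, terminated, False, {}

def decide(obs):
    # Stand-in for the multi-agent system's decision logic
    return 311.0 - obs["T"]

sim = ToySim()
obs, info = sim.reset()
terminated = truncated = False
while not (terminated or truncated):
    action = decide(obs)  # one iteration: sensor information in, one decision out
    obs, reward, terminated, truncated, info = sim.step(action)
# One complete pass through the task like this is an episode.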

Sensors are the part of the multi-agent system that take in information from the simulator or the real environment about conditions and about the results of the agent system's actions. They are the equivalent of human senses – the eyes and ears. Sensor readings come from specific variables in the simulation or control system that measure various aspects of the state space.

Decisions are the system's outputs, adjusting one or more of the control variables to control the process dynamically throughout the performance of the task. Each decision during training is known as an iteration, while a series of iterations to complete the task is called an episode.

Inside the Multi-Agent System

The team of agents is organized in layers, with information and decisions passing through each layer in sequence.

Skill Agents

The skill agent layer, or decision-making layer, is the heart of a multi-agent system. This is where the system makes its control decisions. When a specific skill agent is activated, it determines the control action the system should take.

Multi-agent systems contain multiple skill agents orchestrated together according to the design. Skill agents can work together in groups, sequences, hierarchies, or coordinated teams. This modularity allows the agent to learn more efficiently and perform better.

You can imagine skill agents being like students on a math team who are working together to complete a set of problems. Each student performs best solving a particular kind of problem: one is good at fractions and one at decimals. Depending on the type of problem, the appropriate student will use their expertise to solve the problem and produce the answer for the team. Sometimes one student might handle the entire problem, and for other problems more than one student might need to work together.

Just as different students’ capabilities make them able to solve particular problems, different skill agents may make use of different technologies. Some types of decisions are best approached through skill agents that can be programmed with mathematical calculations, rules, or optimization algorithms. Others that are more complex and nonlinear can be trained using deep reinforcement learning.

Examples of Skills

For an HVAC system regulating temperature in an office building:

  • Control the system during the day

  • Control the system at night

For a factory where responses are needed to different types of alarms:

  • Handle safety critical alarms (programmed with rules)

  • Handle simple alarms (programmed with calculations)

  • Handle complex alarms (learned with reinforcement learning)

For a drone autopilot:

  • Stabilize

  • Get to landing zone

  • Land

  • Avoid obstacles

For a robotic arm used to grab and stack objects:

  • Reach (extend the robot arm from the "elbow" and "wrist")

  • Move (move the arm laterally using the "shoulder")

  • Orient (turn the "wrist" to position the "hand")

  • Grasp (Manipulate the "fingers" to clamp down)

  • Stack (Move laterally while grasping)

Orchestrators

Orchestrators are the supervisors for your agent system. In the math class analogy, the orchestrator would be like the teacher. The teacher assesses the type of problem and assigns the right student.

In a multi-agent system, an orchestrator uses information from the sensors and perceptors to understand the scenario and then determine which skill agent is needed. Once the skill agent is called into service, it makes the decision.

For example, for HVAC control, an orchestrator would determine whether day or night control is needed, and then pass control to the appropriate skill agent. In the safety alarm example, the orchestrator determines the type of alarm and then passes the decision to the right skill agent. In the drone and robotic arm examples, the skills need to be performed in sequence. In these cases, the orchestrator assigns the appropriate skill agent as needed for each step in the process.

Perceptors

A perception layer is an optional enhancement layer. Perceptors process and translate sensor data into a format that can be used to make decisions. In other words, the perception layer inputs the sensor variables and outputs new variables deduced by calculation, machine learning, or other programming.

For example, if we design a team of agents as an autopilot for a drone, we might have sensor variables that measure pitch, yaw, roll (position of the drone), velocity in each of those three directions, and acceleration in each of those three directions. But what about stability? Stability is an important characteristic to understand while flying a drone, but there is no sensor variable that describes stability. It is too complex to be captured by a single sensor.

The perception layer allows us to create a variable for stability. It can be calculated using dynamics equations or trained with supervised machine learning. The new variable then becomes accessible to the rest of the multi-agent system along with the other sensor variables.
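
As a sketch, a stability perceptor for this drone example might look like the following. The PerceptorImpl interface mirrors the thermal runaway perceptor shown later in these docs; the stability heuristic and sensor names are made-up placeholders, not real drone dynamics.

from composabl_core import PerceptorImpl

class StabilityPerceptor(PerceptorImpl):
    """Adds a derived 'stability' variable computed from motion sensors."""

    async def compute(self, obs_spec, obs):
        # Higher angular rates mean less stability in this toy heuristic
        wobble = abs(obs["pitch_rate"]) + abs(obs["yaw_rate"]) + abs(obs["roll_rate"])
        stability = 1.0 / (1.0 + wobble)
        # The new variable becomes available to the rest of the multi-agent
        # system alongside the raw sensor variables
        return {"stability": stability}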

Examples of Perceptors

  • Computer Vision: A camera sensor passes image or video feeds into a perceptor module that identifies object types and locations

  • Auditory Perception: A microphone sensor passes machine sounds to a perceptor module that identifies which state the machine is in based on the sounds that it is making

  • Prediction: A perceptor module inputs quality measurements and past agent system actions and predicts whether current actions will lead to acceptable quality measurements

  • Anomaly Detection: A perceptor module inputs market variables and detects when the market is changing regimes.

  • Classification and Clustering: A perceptor module inputs machine and process data and classifies which of several conditions a manufacturing line is currently in.

Simulation Overview

The simulator is the part of the agent system ecosystem that tells you what will happen when you take a certain action. Whether based on historical data, physical principles and math, or other methods, the simulation environment models the real system and allows your agent systems to train.

Explore Composabl's Simulators

To look under the hood and see how a simulator works, explore Composabl's public Python simulators hosted on Docker Hub.

You can build and train agent systems using these simulations to explore Composabl, educate yourself and your team, and create demonstrations and POCs. These are relatively simple simulations that can train agent systems quickly.

The best way to access these simulations is through the CLI, using the sim commands from the CLI Reference above to see the options available and to connect to one of the simulators.

Simulation Help

If you have a simulator, this section of the documentation will explain how to connect it to Composabl so that you can use it to train agent systems.

If you don’t have a simulator, you may want to work with a Composabl partner to help you develop and connect one. Contact the Composabl team for more information about finding a simulation partner.


composabl skill publish mpc-skill-group
[project]
name = "MPC Skill Group"
version = "0.1.0"
description = "MPC prepared for Skill Group"
authors = [{ name = "John Doe", email = "john.doe@composabl.com" }]
dependencies = [
    "composabl-core",
    "scipy",
    "casadi==3.6.6",
    "do_mpc==4.6.5"
]

[composabl]
type = "skill-controller"
entrypoint = "mpc_skill_group.controller:Controller"
import math

import numpy as np
import do_mpc
from casadi import exp  # casadi's exp handles the symbolic model variables below
from scipy import interpolate

from composabl_core import SkillController

# time step (seconds) between state updates
Δt = 1

π = math.pi

class Controller(SkillController):
    def __init__(self, *args, **kwargs):
        """
        Initialize the Controller skill with default values.
        
        Args:
            *args: Variable length argument list.
            **kwargs: Arbitrary keyword arguments.
        """
        # Initialize a counter to track the number of actions computed
        self.count = 0

    async def compute_action(self, obs, action):
        """
        Compute the control action (ΔTc) based on current observations using Model Predictive Control (MPC).
        
        Args:
            obs (list or dict): Current sensor observations.
                If list, expected order: ['T', 'Tc', 'Ca', 'Cref', 'Tref']
            action: The previous action taken (not directly used but considered for ΔTc calculation).
        
        Returns:
            list: A list containing the computed change in Tc (ΔTc).
                  Example: [ΔTc]
        """
        # Convert observations to dictionary if they are provided as a list
        if type(obs) == list:
            obs = {
                'T': obs[0],
                'Tc': obs[1],
                'Ca': obs[2],
                'Cref': obs[3],
                'Tref': obs[4]
            }
        # else:
        #     # Uncomment if you need to ensure all values are floats
        #     for key, value in obs.items():
        #         obs[key] = float(value)

        # Handle action input: ensure it's a float value
        if type(action) == list or type(action) == np.ndarray:
            action = action[0]
        elif type(action) == dict:
            assert type(action['action']) == float
            action = float(action['action'])
        else:
            action = float(action)

        # Initialize noise variable (currently set to 0; can be modified for stochasticity)
        noise = 0

        # Extract and convert sensor readings to float
        CrSP = float(obs['Cref'])    # Reference concentration
        Ca0 = float(obs['Ca'])       # Actual concentration at current step
        T0 = float(obs['T'])         # Temperature at current step
        Tc0 = float(obs['Tc']) + action  # Cooling liquid temperature adjusted by action

        # Define constants for the CSTR model
        F = 1          # Volumetric flow rate (m³/h)
        V = 1          # Reactor volume (m³)
        k0 = 34930800  # Pre-exponential nonthermal factor (1/h)
        E = 11843      # Activation energy per mole (kcal/kmol)
        R = 1.985875   # Ideal gas constant (kcal/(kmol·K))
        ΔH = -5960     # Heat of reaction per mole (kcal/kmol)
        phoCp = 500    # Density multiplied by heat capacity (kcal/(m³·K))
        UA = 150       # Overall heat transfer coefficient multiplied by tank area (kcal/(K·h))
        Cafin = 10     # Inlet concentration (kmol/m³)
        Tf = 298.2     # Feed temperature (K)

        # --- MPC MODEL SETUP ---
        model_type = 'continuous'  # Define model type: 'discrete' or 'continuous'
        model = do_mpc.model.Model(model_type)

        # Define state variables
        Ca = model.set_variable(var_type='_x', var_name='Ca', shape=(1,1))  # Concentration
        T = model.set_variable(var_type='_x', var_name='T', shape=(1,1))    # Temperature

        # Define measurements (if any) with optional measurement noise
        model.set_meas('Ca', Ca, meas_noise=True)
        model.set_meas('T', T, meas_noise=True)

        # Define control input
        Tc = model.set_variable(var_type='_u', var_name='Tc')  # Cooling liquid temperature

        # Define time-varying parameters (TVPs)
        model.set_variable(var_type='_tvp', var_name='Caf')   # Inlet concentration (kmol/m³)
        model.set_variable(var_type='_tvp', var_name='Tref')  # Reference temperature (K)

        # Define model equations (right-hand side)
        model.set_rhs('Ca', (F/V * (Cafin - Ca)) - (k0 * exp(-E/(R*T))*Ca))
        model.set_rhs('T', (F/V *(Tf-T)) - ((ΔH/phoCp)*(k0 * exp(-E/(R*T))*Ca)) - ((UA /(phoCp*V)) *(T-Tc)))

        # Finalize model setup
        model.setup()

        # --- CONTROLLER SETUP ---
        mpc = do_mpc.controller.MPC(model)
        setup_mpc = {
            'n_horizon': 20,       # Prediction horizon
            'n_robust': 1,         # Number of robust steps
            'open_loop': 0,        # Open-loop setting
            't_step': Δt,          # Time step (seconds)
            'store_full_solution': True  # Store full solution
        }

        mpc.set_param(**setup_mpc)

        # Suppress IPOPT solver output for cleaner logs
        suppress_ipopt = {'ipopt.print_level': 0, 'ipopt.sb': 'yes', 'print_time': 0}
        mpc.set_param(nlpsol_opts=suppress_ipopt)

        # Scaling for states and inputs to improve numerical stability
        mpc.scaling['_x', 'T'] = 100
        mpc.scaling['_u', 'Tc'] = 100

        # --- OBJECTIVE FUNCTION ---
        _x = model.x
        _tvp = model.tvp
        _u = model.u

        # Define terminal and stage cost
        mterm = ((_x['Ca'] - CrSP))**2  # Terminal cost
        lterm = ((_x['Ca'] - CrSP))**2  # Stage cost

        mpc.set_objective(mterm=mterm, lterm=lterm)

        # Define control input penalties to discourage large control actions
        mpc.set_rterm(Tc=1.5)  # Input penalty for Tc

        # --- CONSTRAINTS ---
        # Bounds for state variables
        mpc.bounds['lower', '_x', 'Ca'] = 0.1   # Minimum concentration
        mpc.bounds['upper', '_x', 'Ca'] = 12    # Maximum concentration

        mpc.bounds['upper', '_x', 'T'] = 400    # Maximum temperature
        mpc.bounds['lower', '_x', 'T'] = 100    # Minimum temperature

        # Bounds for control inputs
        mpc.bounds['lower', '_u', 'Tc'] = 273   # Minimum cooling temperature (K)
        mpc.bounds['upper', '_u', 'Tc'] = 322   # Maximum cooling temperature (K)

        # --- TIME-VARYING PARAMETERS (TVPs) SETUP ---
        # Define templates for TVPs
        tvp_temp_1 = mpc.get_tvp_template()
        tvp_temp_1['_tvp', :] = np.array([8.5698])

        tvp_temp_2 = mpc.get_tvp_template()
        tvp_temp_2['_tvp', :] = np.array([2])

        tvp_temp_3 = mpc.get_tvp_template()
        tvp_temp_3['_tvp', :] = np.array([2])

        # Define a function to update TVPs based on current time
        def tvp_fun(t_now):
            p1 = 22    # Time step 1
            p2 = 74    # Time step 2
            time = 90  # Total time

            # Define concentration and temperature equilibrium points
            ceq = [8.57, 6.9275, 5.2850, 3.6425, 2]
            teq = [311.2612, 327.9968, 341.1084, 354.7246, 373.1311]

            # Interpolate concentration and temperature based on current time
            C = interpolate.interp1d([0, p1, p2, time], [8.57, 8.57, 2, 2])
            T_ = interpolate.interp1d([0, p1, p2, time], [311.2612, 311.2612, 373.1311, 373.1311])

            if t_now < p1:
                return tvp_temp_1
            elif p1 <= t_now < p2:
                y = float(C(t_now))
                tvp_temp_3['_tvp', :] = np.array([y])
                return tvp_temp_3
            else:
                return tvp_temp_2

        mpc.set_tvp_fun(tvp_fun)

        # Finalize MPC setup
        mpc.setup()

        # --- ESTIMATOR SETUP ---
        estimator = do_mpc.estimator.StateFeedback(model)

        # --- SIMULATOR SETUP ---
        simulator = do_mpc.simulator.Simulator(model)
        params_simulator = {
            't_step': Δt  # Time step (seconds)
        }

        simulator.set_param(**params_simulator)

        # Define templates for simulator parameters
        p_num = simulator.get_p_template()
        tvp_num = simulator.get_tvp_template()

        # Define functions for TVPs and uncertain parameters in the simulator
        def tvp_fun_sim(t_now):
            return tvp_num

        def p_fun_sim(t_now):
            return p_num

        simulator.set_tvp_fun(tvp_fun_sim)
        simulator.set_p_fun(p_fun_sim)

        # Finalize simulator setup
        simulator.setup()

        # --- INITIAL STATE SETUP ---
        # Initialize states for MPC, simulator, and estimator
        x0 = simulator.x0
        x0['Ca'] = Ca0
        x0['T'] = T0

        u0 = simulator.u0
        u0['Tc'] = Tc0

        mpc.x0 = x0
        simulator.x0 = x0
        estimator.x0 = x0

        mpc.u0 = u0
        simulator.u0 = u0
        estimator.u0 = u0

        # Set initial guess for MPC
        mpc.set_initial_guess()

        # --- MPC CONTROL LOOP ---
        # Simulate N steps (currently set to 1)
        u0_old = 0
        time_steps = 1
        for k in range(time_steps):
            if k > 1:
                u0_old = u0[0][0]

            # Make a control step using MPC
            u0 = mpc.make_step(x0)

            # Enforce a maximum change of ±10 on the control input
            if k > 1:
                if u0[0][0] - u0_old > 10:
                    u0 = np.array([[u0_old + 10]])
                elif u0[0][0] - u0_old < -10:
                    u0 = np.array([[u0_old - 10]])
            else:
                if u0[0][0] - Tc0 >= 10:
                    u0 = np.array([[Tc0 + 10]])
                elif u0[0][0] - Tc0 <= -10:
                    u0 = np.array([[Tc0 - 10]])

            # Add Gaussian noise to the measurements
            error_var = noise
            σ_max1 = error_var * (8.5698 - 2)
            σ_max2 = error_var * (373.1311 - 311.2612)
            mu = 0
            v0 = np.array([
                mu + σ_max1 * np.random.randn(1, 1)[0],
                mu + σ_max2 * np.random.randn(1, 1)[0]
            ])

            # Simulate the next step with the control input and noise
            y_next = simulator.make_step(u0, v0=v0)  # MPC simulation step

            # Reshape state values for consistency
            state_ops = y_next.reshape((1, 2))

            # --- BENCHMARK SETUP ---
            p1 = 22
            p2 = 74
            total_time = 90  # total episode time, matching tvp_fun above
            ceq = [8.57, 6.9275, 5.2850, 3.6425, 2]
            teq = [311.2612, 327.9968, 341.1084, 354.7246, 373.1311]

            # Interpolate reference concentration and temperature over the episode
            C = interpolate.interp1d([0, p1, p2, total_time], [8.57, 8.57, 2, 2])
            T_ = interpolate.interp1d([0, p1, p2, total_time], [311.2612, 311.2612, 373.1311, 373.1311])

            # Update reference concentrations and temperatures based on current step
            if k < p1:
                Cref = 8.5698
                Tref = 311.2612
            elif p1 <= k < p2:
                y = float(C(k))
                y2 = float(T_(k))
                Cref = y
                Tref = y2
            else:
                Cref = 2
                Tref = 373.1311

            # Update the estimator with the new measurements
            x0 = estimator.make_step(y_next)  # Update state estimates

        # Increment the action counter
        self.count += 1

        # Compute the change in Tc (ΔTc) based on the new control input
        newTc = u0[0][0]
        dTc = float(newTc) - float(obs['Tc'])

        # Return the computed ΔTc as a list
        return [dTc]

    async def transform_sensors(self, obs):
        """
        Process and potentially modify sensor observations before they are used.

        Args:
            obs (dict): Current sensor observations.

        Returns:
            dict: Transformed sensor observations.

        Note:
            - Currently, this method returns the observations unchanged.
            - This can be customized to apply transformations if needed.
        """
        # Currently, no transformation is applied to sensors
        return obs

    async def filtered_sensor_space(self):
        """
        Define which sensors are relevant for this controller.

        Returns:
            list: Names of the sensors to be used.

        Note:
            - Specifies a list of sensor names that this controller will utilize.
            - Helps in focusing the controller's operations on relevant data.
        """
        # Specify the sensors that this controller will use
        return ['T', 'Tc', 'Ca', 'Cref', 'Tref', 'Conc_Error', 'Eps_Yield', 'Cb_Prod']

    async def compute_success_criteria(self, transformed_obs, action):
        """
        Determine whether the success criteria have been met.

        Args:
            transformed_obs (dict): Transformed sensor observations.
            action: The action taken.

        Returns:
            bool: True if success criteria are met, False otherwise.

        Behavior:
            - Currently always returns False.
            - Can be implemented with logic to check if certain conditions are satisfied.
        """
        # Placeholder for success criteria logic
        return False

    async def compute_termination(self, transformed_obs, action):
        """
        Determine whether the training episode should terminate.

        Args:
            transformed_obs (dict): Transformed sensor observations.
            action: The action taken.

        Returns:
            bool: True if the episode should terminate, False otherwise.

        Behavior:
            - Currently always returns False.
            - Can be implemented with logic to terminate based on certain conditions.
        """
        # Placeholder for termination condition logic
        return False
composabl perceptor publish thermal_runaway_predictor
[project]
name = "Thermal Runaway Predictor - ML 1.2.2"
version = "0.1.0"
description = "ML thermal runaway predictor"
authors = [{ name = "John Doe", email = "john.doe@composabl.com" }]
dependencies = [
    "composabl-core",
    "scikit-learn==1.2.2"
]

[composabl]
type = "perceptor"
entrypoint = "thermal_runaway_predictor.perceptor:ThermalRunawayPredict"

# Include additional data files
[tool.setuptools.packages.find]
where = ["thermal_runaway_predictor"]

[tool.setuptools.package-data]
"*" = ["*.json", "*.pkl"]
from composabl_core import PerceptorImpl

#######
import os
import pickle

# Determine the directory where the current script is located
path = os.path.dirname(os.path.realpath(__file__))

class ThermalRunawayPredict(PerceptorImpl):
    def __init__(self, *args, **kwargs):
        """
        Initialize the ThermalRunawayPredict perceptor with default values and load the machine learning model.
        
        Args:
            *args: Variable length argument list.
            **kwargs: Arbitrary keyword arguments.
        """
        # Initialize the prediction output variable
        self.y = 0
        
        # Initialize a flag to indicate thermal runaway status
        self.thermal_run = 0
        
        # Load the pre-trained machine learning model from a pickle file
        # The model is expected to be located in the 'ml_models' directory relative to the script's path
        model_path = os.path.join(path, "ml_models", "ml_predict_temperature_122.pkl")
        try:
            with open(model_path, 'rb') as model_file:
                self.ml_model = pickle.load(model_file)
        except FileNotFoundError:
            print(f"Machine learning model not found at {model_path}. Please ensure the model file exists.")
            self.ml_model = None
        except Exception as e:
            print(f"An error occurred while loading the ML model: {e}")
            self.ml_model = None
        
        # Initialize a list to store historical ML predictions if needed
        self.ML_list = []
        
        # Initialize the last recorded 'Tc' value to compute its change (ΔTc)
        self.last_Tc = 0

    async def compute(self, obs_spec, obs):
        """
        Compute the thermal runaway prediction based on current sensor observations.
        
        Args:
            obs_spec: Observation specification (not used in this implementation).
            obs: Current sensor observations. Can be a list or a dictionary.
        
        Returns:
            dict: A dictionary containing the thermal runaway prediction.
                  Example: {"thermal_runaway_predict": 1}
        """
        # Ensure that 'obs' is a dictionary. If not, convert it using predefined sensor keys.
        if not isinstance(obs, dict):
            # Define the expected sensor keys
            obs_keys = ['T', 'Tc', 'Ca', 'Cref', 'Tref', 'Conc_Error', 'Eps_Yield', 'Cb_Prod']
            # Convert the list to a dictionary by zipping it with the sensor keys
            obs = dict(zip(obs_keys, obs))
            print("Converted 'obs' to dictionary format using predefined sensor keys.")
        
        # Calculate the change in 'Tc' (ΔTc) since the last observation
        if self.last_Tc == 0:
            # If this is the first observation, assume an initial ΔTc of 5
            self.ΔTc = 5
        else:
            # Compute ΔTc as the difference between current 'Tc' and the last recorded 'Tc'
            try:
                current_Tc = float(obs['Tc'])
                self.ΔTc = current_Tc - self.last_Tc
            except (KeyError, ValueError, TypeError) as e:
                # Handle cases where 'Tc' is missing or cannot be converted to float
                print(f"Error accessing or converting 'Tc': {e}")
                self.ΔTc = 0  # Default to 0 if there's an error
        
        # Initialize the prediction output
        y = 0
        
        # Check if the current temperature 'T' exceeds or equals 340
        try:
            current_T = float(obs['T'])
        except (KeyError, ValueError, TypeError) as e:
            print(f"Error accessing or converting 'T': {e}")
            current_T = 0  # Default to 0 if there's an error
        
        if current_T >= 340:
            # Prepare the feature vector for the ML model
            try:
                Ca = float(obs['Ca'])
                current_Tc = float(obs['Tc'])
            except (KeyError, ValueError, TypeError) as e:
                print(f"Error accessing or converting 'Ca' or 'Tc': {e}")
                Ca = 0
                current_Tc = 0

            # Feature vector: [Ca, T, Tc, ΔTc]
            X = [[Ca, current_T, current_Tc, self.ΔTc]]
            
            # If the ML model was loaded successfully, make a prediction
            if self.ml_model:
                try:
                    # Predict the probability of thermal runaway
                    y_proba = self.ml_model.predict_proba(X)
                    
                    # Get the predicted class label (e.g., 0 or 1)
                    y = self.ml_model.predict(X)[0]
                    
                    # Optionally, use the probability to adjust prediction confidence
                    # For example, set y=1 only if the probability of class 1 is >= 0.3
                    if y_proba[0][1] >= 0.3:
                        y = 1
                    else:
                        y = 0
                except Exception as e:
                    print(f"Error during ML model prediction: {e}")
                    y = 0
            else:
                print("ML model is not loaded. Cannot make predictions.")
                y = 0
        
        # Update the last recorded 'Tc' with the current value for the next computation
        try:
            self.last_Tc = float(obs['Tc'])
        except (KeyError, ValueError, TypeError) as e:
            print(f"Error accessing or converting 'Tc' for updating last_Tc: {e}")
            pass  # Keep the previous value if there's an error
        
        # Optionally, store the prediction in ML_list for historical tracking
        self.ML_list.append(y)
        
        # Update the prediction output variable
        self.y = y
        
        # Return the prediction as a dictionary
        return {"thermal_runaway_predict": y}

    def filtered_sensor_space(self, obs):
        """
        Define which sensors are relevant for this perceptor.
        
        Args:
            obs: Current sensor observations (not used in this implementation).
        
        Returns:
            list: Names of the sensors to be used.
        """
        # Specify the sensors that this perceptor will use
        return ['T', 'Tc', 'Ca', 'Cref', 'Tref', 'Conc_Error', 'Eps_Yield', 'Cb_Prod']
    async def make(self, env_id: str, env_init: dict):
        self.env_id = env_id if env_id else self.env_id
        self.env_init = env_init if env_init else self.env_init

        self.env = Sim(self.env_init)

        return {
            "id": "my_simulator",
        }

    async def sensor_space_info(self) -> gym.Space:
        return self.env.sensor_space

    async def action_space_info(self) -> gym.Space:
        return self.env.action_space

    async def action_space_sample(self):
        return self.env.action_space.sample()

    async def reset(self):
        return self.env.reset()

    async def step(self, action):
        return self.env.step(action)

    async def close(self):
        self.env.close()

    async def set_scenario(self, scenario):
        self.env.scenario = scenario

    async def get_scenario(self):
        if self.env.scenario is None:
            return None

        return self.env.scenario

    async def get_render(self):
        return self.env.render()
composabl sim list
composabl sim run

Create a Use Case

A use case is a collection of agent systems that address the same problem. The best way to use Composabl is to build multiple agent systems within the same use case using different design patterns and variations. This allows you to iterate and improve your agent systems for the best possible performance.

All agent systems within a use case share the same goals and the same simulator.

Create a Use Case in the UI

To create a use case, click New Use Case in the upper right-hand corner of your dashboard.

You'll be prompted to enter the simulator associated with your use case and then choose your team.

Use the AI Use Case Creation Assistant

You also have the option to use Composabl's AI assistant to help set up the use case. This specialized AI assistant is designed to help solve the "cold start" problem and get you started designing your agent system more quickly.

Based on your written use case description, the AI will automatically create customized goals and agent skills within your use case. You can then access those goals and agent skills in the Agent Orchestration studio and edit and adjust them as needed.

When prompting the assistant, use as much detail as you can. You can tell the AI:

  • What the problem or use case is

  • What equipment or process the agent system will control

  • What the overall goal is

  • Anything you know about phases of the process or different scenarios

Set Goals, Constraints, and Success Criteria

The performance goal is the most important KPI or metric you will use to evaluate your agent system's success. This goal directs your agent as it trains. The AI learning technology within the agent will reward the agent system when it gets closer to the goal, helping it to improve.

For many business use cases, the top-level goal will be expressed in terms of profit or ROI. There are many factors to consider in the production scheduling use case, but profit is the ultimate goal, so it is the most important KPI.

Setting Goals, Constraints, and Success Criteria in the UI

To edit a goal created by the copilot or to create a new goal, follow these steps:

  1. Navigate to the use case page

  2. Click Set up goal and enter a name and description for your goal.

  3. Click Add condition to define the goal.

Use the dropdown menus to select the variables and then define the parameters for each goal, constraint, or success criterion you want to include.

Configure Goals

Goals define what an agent system should do. Goals apply to one of the sensor variables and are defined using one of three possible directives:

  • Maximize: Maximize the value of the variable

  • Minimize: Minimize the value of the variable

  • Maintain: Keep the variable at a specified value

Configure Constraints

Constraints set the boundaries for the skill agent. They define rules that the agent system must follow during operation. Constraints are defined using one of two possible directives:

  • Avoid: The skill agent learns to keep the variable from reaching a specified value or range by withholding rewards

  • Terminate: When the actions of the skill agent lead to certain conditions within a variable, the skill agent has failed and must stop and start a new episode

Configure Success Criteria

Success criteria tell the skill agent when it's doing something right. They are defined using one of two possible directives:

  • Approach: The skill agent learns to get close to a specified value by getting an increased reward

  • Succeed: When the success criteria are achieved, the session ends, and a new one begins so that the skill agent can keep practicing and learn to win every time

Using Goals, Constraints, and Success Criteria for Multi-Agent AI System Design

Goals, constraints, and success criteria transform complex AI agent building into an intuitive process that eliminates the need for programming expertise.

You directly apply your domain knowledge without coding by simply defining what you want the agent to achieve and what conditions to respect. This approach reduces development time while ensuring AI agents embody real-world operational wisdom, bridging the gap between technical AI capabilities and practical industrial knowledge in complex physical environments.

Create Skill Agents

Skills are the foundational building blocks for your intelligent agent system. They take action to achieve goals in key scenarios where your agent system needs to succeed. To build an agent with Machine Teaching, you will create multiple skill agents and then orchestrate them together.

You can use three different types of skill agents within Composabl.

For learned skills, you can use either the UI or the SDK to create teachers, but the SDK includes some fine-tuning options that are not available in the UI.

Controllers for programmed skills can be created only through the SDK. They can then be published to the UI for use in agent systems.

You can use either the UI or the SDK to create selectors.

Create Skill Agents with Rewards Using the SDK

The Composabl SDK offers a suite of advanced tools to train skills using deep reinforcement learning. Using the Python teacher class, you can fine-tune the rewards for your skills. Once you have configured a skill with the SDK, you can publish it to the UI to use in agent system designs.

Create a New Skill

To create a skill in the Python SDK, begin by logging in from the CLI with composabl login.

Then type composabl skill new.

Give the skill a name and a description in response to the prompts that follow. Choose whether your skill should be a teacher (learned with AI) or a controller (a programmed module like an optimization algorithm or MPC controller).

Specify the folder where you’d like to create the skill.

The Composabl SDK will create a folder and a Python teacher file from the template.

The Python Teacher Class

The Python teacher class offers several functions that you can use to fine-tune the training of your skills.

Functions for Training

Train with Rewards: the compute_reward Function

python
def compute_reward(self, transformed_sensors, action, sim_reward):
        self.counter += 1
        if self.past_sensors is None:
            self.past_sensors = transformed_sensors
            return 0
        else:
            if self.past_sensors["state1"] < transformed_sensors["state1"]:
                return 1
            else:
                return -1

The compute_termination function tells the Composabl platform when to terminate a practice episode and start over with a new practice scenario (episode). From a teaching perspective, it makes the most sense to terminate an episode when the agent system succeeds, fails, or is pursuing a course of action that you do not find likely to succeed. This function returns a Boolean flag (True or False) indicating whether to terminate the episode. You can calculate this criterion however seems best.

python
def compute_termination(self, transformed_sensors, action):
        return False

The compute_success_criteria function provides a definition of skill success and a proxy for how completely the agent system has learned the skill. The platform uses the output of this function (True or False) to calculate when to stop training one skill and move on to training the next. It is also used to determine when to move to the next skill in a fixed-order sequence: the agent system cannot move from one skill in a fixed-order sequence to the next until the success criteria for that skill are reached.

python
def compute_success_criteria(self, transformed_sensors, action):
        return self.counter > 100

Here are some examples of success criteria definitions:

  • A simple but naive success criterion might return True if the average reward for an episode or scenario crosses a threshold, but False if it does not.

  • A more complex success criterion might calculate root mean squared error (RMSE) for key variables across the episode and return True if the error is less than a customer-specified benchmark, but False otherwise (see the sketch after this list).

  • An even more complex success criterion might compare the agent system to a benchmark controller or another agent system across many key variables and trials. It returns True if the agent system beats the benchmark on these criteria, but False otherwise.
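
As an illustration, the RMSE-based criterion might look something like the following sketch. This is a minimal example, not platform code: the sensor names ("T" and "Tref"), the benchmark value, and the way errors are accumulated are all assumptions for illustration.

python
import math

class RMSETeacher(Teacher):  # assumes the Composabl SDK Teacher base class is imported
    def __init__(self, *args, **kwargs):
        self.errors = []
        self.benchmark_rmse = 2.5  # assumed customer-specified benchmark

    def compute_success_criteria(self, transformed_sensors, action):
        # Accumulate the error between the controlled variable and its reference
        self.errors.append(transformed_sensors["T"] - transformed_sensors["Tref"])
        rmse = math.sqrt(sum(e ** 2 for e in self.errors) / len(self.errors))
        return rmse < self.benchmark_rmse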

Train with Goals

Training with goals lets you use a predefined reward structure rather than configuring the rewards individually. When you use a goal, your agent system will inherit the compute reward, compute termination, and compute success functions from the goal. (You will still have the option to further customize those functions as needed.)

The five goal types you can use are:

  • AvoidGoal

  • MaximizeGoal

  • MinimizeGoal

  • ApproachGoal

  • MaintainGoal

Goals are added using specialized teacher classes rather than the general teacher class that you would otherwise use to teach skills. For example, for a skill named Balance that you wanted to train with a goal to maintain a specific orientation, you would use the MaintainGoal teacher class.

python
class BalanceTeacher(MaintainGoal):
    def __init__(self, *args, **kwargs):
        super().__init__("pole_theta", "Maintain pole to upright", target=0, stop_distance=0.418)

The parameters you can use for goals include the target value and the stopping distance, as shown in the example above.

You can also use more than one goal for a single skill using the CoordinatedGoal teacher class. This is useful when your agent system needs to behave in a way that creates a balance between two goals that are both important.

Functions to Guide Agent System Behavior with Rules

Just like rules guide training and behavior for humans, providing rules for the agent system to follow can guide the agent system's decision-making more quickly to success. Rules guide the behavior of an agent system based on expertise and constraints.

Add Rules: the compute_action_mask Function

The compute_action_mask teaching function expresses rules that trainable agent systems must follow.

python
 # The action mask provides rules at each step about which actions the agent system is allowed to take.
    def compute_action_mask(self, transformed_sensors, action):
        return [0, 1, 1]

The compute_action_mask teaching function works only for discrete action spaces (where the actions are integers or categories), not for continuous action spaces (where decision actions are decimal numbers). If you specify a mask for a skill whose actions are continuous, the platform will ignore the action mask.

The function returns a list of 0 and 1 values. Zero means that the action is forbidden by the rule. One means that the action is allowed by the rule. The function may change the returned value after each decision. This allows complex logic to express nuanced rules.

In the example above, the first action is forbidden for the next decision, but the second and third actions are allowed. The logic in the skill itself (whether learned or programmed) will choose between the allowed second and third actions.
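
To illustrate how the returned mask can change from decision to decision, here is a minimal sketch of a dynamic action mask. The sensor name "T", the threshold, and the three-action layout are illustrative assumptions.

python
def compute_action_mask(self, transformed_sensors, action):
    # Forbid the third action once the temperature nears its limit (assumed threshold)
    if transformed_sensors["T"] > 390:
        return [1, 1, 0]
    # Otherwise all three actions are allowed
    return [1, 1, 1]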

All selectors have a discrete action space (they choose which child skill to activate), so you can always apply the compute_action_mask function to teach them.

Functions to Manage Information Inside Agent Systems

As information passes through perceptors, skills, and selectors in the agent system, sometimes it needs to change format along the way. You can use three teaching functions to transform sensor and action variables inside agent systems: transform_sensors, transform_action, and filtered_sensor_space.

To transform sensor variables, use the transform_sensor function to calculate changes to specific sensors, then return the complete set of sensor variables (the observation space).

python
def transform_sensor(self, sensor, action):
        return sensor

Two of the most common reasons for transforming sensor variables are conversion and normalization. For example, if a simulator reports temperature values in Fahrenheit, but the agent system expects temperature values in Celsius, use the transform_sensor function to convert between the two.

Normalization is when you transform variables into different ranges. For example, one sensor variable in your agent system might have very large values (in the thousands), but another variable might have small values (in the tenths), so you might use the transform_sensor function to transform these disparate sensor values to a range from 0 to 1 so that they can be better compared and used in the agent system.
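
For example, a transform_sensor implementation covering both cases might look like this sketch, assuming sensors named "T" (reported in Fahrenheit) and "flow_rate" (with values in the thousands); the names, units, and ranges are illustrative.

python
def transform_sensor(self, sensor, action):
    # Conversion: Fahrenheit to Celsius for the temperature sensor (assumed units)
    sensor["T"] = (sensor["T"] - 32) * 5 / 9
    # Normalization: scale a large-valued sensor into the 0-1 range (assumed max of 5000)
    sensor["flow_rate"] = sensor["flow_rate"] / 5000
    return sensor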

You may want to transform action variables for the same reasons as sensor variables.

python
def transform_action(self, transformed_sensor, action):
    return action

Use the filtered_sensor_space function to pare down the list of sensor variables you need for a particular skill. Pass only the information that a skill or module needs in order to learn or perform well.

python
def filtered_sensor_space(self):
        return ["state1"]
Return a list of all the sensor variables that you want passed to the skill by this teacher.

Configure Programmed Algorithms as Skill Agents

Controllers are programmed skill agents used in Composabl agent systems. These may be optimizers, PID or MPC controllers, heuristics, or any other Python algorithms that are useful to control parts of a process. Configure controllers in the SDK and then publish them to the UI to use in agent systems.
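
For instance, a simple PID controller configured as a programmed skill agent might look like the following sketch. The gains, sensor names, and base class usage are illustrative assumptions rather than platform requirements.

python
class PIDController:  # in practice this would inherit from the Composabl controller class
    def __init__(self, *args, **kwargs):
        self.kp, self.ki, self.kd = 1.0, 0.1, 0.05  # assumed gains
        self.integral = 0.0
        self.last_error = 0.0

    async def compute_action(self, obs, action):
        # Drive the temperature T toward the reference Tref (assumed sensor names)
        error = obs["Tref"] - obs["T"]
        self.integral += error
        derivative = error - self.last_error
        self.last_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative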

Create a New Skill Agent

To create a skill agent in the Python SDK, begin by logging in from the CLI with composabl login.

Then type composabl skill new.

Give the skill agent a name and a description in response to the prompts that follow. Choose your skill agent type as a controller (a programmed module like an optimization algorithm or MPC controller).

Specify the folder where you’d like to create the skill agent.

The Composabl SDK will create a folder and a Python controller.py file from the template.

The Python Controller Class

The Python controller class offers several functions that you can use to build your algorithms or rule-based controllers in Composabl.

Functions for Training

Controllers don't need training, since they are based on programming rather than learning, but they include functions that connect them to the rest of the agent system during training.

Initialize your algorithm: the __init__ Function

The __init__ function can be used to initialize your algorithm and define initial configuration variables. It is called once when the Runtime starts. Suppose, for example, that you want to use an MPC algorithm, as in the code samples at the end of this section.

Process Observation to Compute Action: the compute_action Function

The compute_action function processes the observation and returns an action. This function returns a number that represents the action that will be sent to the simulation.

The compute_termination function tells the Composabl platform when to terminate a practice episode and start over with a new practice scenario (episode). From a controller perspective, it makes the most sense to terminate an episode when the agent system succeeds, fails, or is pursuing a course of action that you do not find likely to succeed. This function returns a Boolean flag (True or False) indicating whether to terminate the episode. You can calculate this criterion however seems best.

The compute_success_criteria function provides a definition of the skill agent's success and a proxy for how completely the agent system has learned the skill. The platform uses the output of this function (True or False) to calculate when to stop training one skill agent and move on to training the next. It is also used to determine when to move to the next skill agent in a fixed-order sequence: the agent system cannot move from one skill agent in a fixed-order sequence to the next until the success criteria for that skill agent are reached.

Here are some examples of success criteria definition:

  • A simple but naive success criterion might return True if a certain sensor or KPI value for an episode or scenario crosses a threshold, but False if it does not.

  • A more complex success criterion might compare the agent system to a benchmark controller or another agent system across many key variables and trials. It returns True if the agent system beats the benchmark on these criteria, but False otherwise.

Functions to Manage Information Inside Agent Systems

As information passes through perceptors, skill agents, and selectors in the agent system, sometimes it needs to change format along the way. You can use three teaching functions to transform sensor and action variables inside agent systems: transform_sensors, transform_action, and filtered_sensor_space.

To transform sensor variables, use the transform_sensor function to calculate changes to specific sensors, then return the complete set of sensor variables (the observation space).

Two of the most common reasons for transforming sensor variables are conversion and normalization. For example, if a simulator reports temperature values in Fahrenheit, but the agent system expects temperature values in Celsius, use the transform_sensor function to convert between the two.

Normalization is when you transform variables into different ranges. For example, one sensor variable in your agent system might have very large values (in the thousands), but another variable might have small values (in the tenths), so you might use the transform_sensor function to transform these disparate sensor values to a range from 0 to 1 so that they can better be compared and used in the agent system.

You may want to transform action variables for the same reasons as sensor variables.

Use the filtered_sensor_space function to pare down the list of sensor variables you need for a particular skill agent. Pass only the information that a skill agent or module needs in order to learn or perform well.

Create a New Perceptor

Perceptors use the SDK and CLI workflow.

To access a template for a perceptor, type composabl perceptor new into the CLI. Composabl will then generate a perceptor template that you can populate with your information.

In this simple perceptor example, we calculate the perceptor outputs that will be added as new sensor variables, and we create a list of perceptors that comprise the perception layer (see the DeltaCounter code sample below).

If you choose not to use the AI assistant, just click Skip. You can then configure your goals and skill agents manually.

Some use cases will have multiple goals the agent system must meet simultaneously. In fact, the need to balance more than one goal is a sign that a use case is a good fit for Machine Teaching. In the industrial mixer use case, the agent system needs to get the right chemical product concentration and keep the tank's temperature from becoming dangerously hot.

When you set up a use case in the UI, you can use natural language to describe your goals for the use case. Composabl's copilot assistant uses this information to create a starter goal in the format the platform can use for training. This goal may be ready to use, but you will likely need to adjust it.

For example, for the industrial mixer, we want to maximize the concentration of the product, Ca.

For example, we want to avoid the temperature, T, getting above 400 degrees Kelvin in the industrial mixer use case.

Create a teacher to implement a learned skill agent that uses DRL to make decisions. You can set up your teacher to train the skill agent using goals or to train with rewards. The skill agent will then train and practice in simulation until it can make decisions successfully.

Create a controller to implement a programmed skill agent that makes decisions based on programming rather than learning. A controller can use math, rules, optimization, or any other technology that can be expressed in a Python program. Configure controllers with the SDK and publish them to the UI to use in agent systems.

Create a selector to orchestrate skills together. Selectors are specialized skills that direct the agent system how to choose between different decision-making skills. Selectors can be either learned or programmed.

You can also create skills in two different ways, using the UI and using the SDK. If you use the SDK, you can then publish the skills to the UI to be included in agent system designs.

The compute_reward function provides the bulk of the feedback after each agent system action about how much that action contributed to the success of the skill. This function returns a number that represents the reward signal the agent system will receive for its last decision. Reward functions, as they are called in reinforcement learning, can be tricky to craft. Learn more about how to write good reward functions.

End Training: the compute_termination Function

Define Success: the compute_success_criteria Function

These have the same parameters and work the same way as the goal types in the UI.

Transform Sensor Variables: the transform_sensors function

Transform Decisions within the Agent System: the transform_action function

Filter the Sensor List: the filtered_sensor_space function

End Simulation: the compute_termination Function

Define Success: the compute_success_criteria Function

Transform Sensor Variables: the transform_sensors function

Transform Decisions within the Agent System: the transform_action function

Filter the Sensor List: the filtered_sensor_space function

def __init__(self, *args, **kwargs):
    self.counter = 0
    self.mpc = LinearMPC()  # assumes an MPC implementation is available in scope

async def compute_action(self, obs, action):
    self.counter += 1
    action = self.mpc.solve(obs)
    return action
python
async def compute_termination(self, transformed_sensors, action):
        return False
python
async def compute_success_criteria(self, transformed_sensors, action):
        return self.counter > 100
python
async def transform_sensor(self, sensor, action):
        return sensor
python
async def transform_action(self, transformed_sensor, action):
    return action
python
def filtered_sensor_space(self):
        return ["state1"]
Return a list of all the sensor variables that you want passed to the skill agent by this teacher.
python
class DeltaCounter():
    def __init__(self):
        self.key = "state1"
        self.previous_value = None

    def compute(self, sensors):
        if self.previous_value is None:
            self.previous_value = sensors[self.key]
            return {"delta_counter": 0, "state2": 0}

        delta = sensors["state1"] - self.previous_value
        self.previous_value = sensors["state1"]
        return {"delta_counter": delta, "state2": 0}

    def filtered_sensor_space(self, sensors):
        return ["state1"]

delta_counter = Perceptor(["delta_counter", "state2"], DeltaCounter, "the change in the counter from the last two steps")

Configure an ML Model as a Perceptor

In this tutorial, we will walk through how to integrate a trained machine learning (ML) model into your Composabl agent system as a Perceptor. A perceptor allows your agent system to interpret data from sensors, process it using a machine learning model, and output new variables that will help the agent system make better decisions.

The goal is to publish a pre-trained ML model as a perceptor that adds a new layer of perception to your agent system, enabling it to process sensor data in a more advanced way. This could be useful in a variety of scenarios, such as predictive maintenance, anomaly detection, or autonomous decision-making.


Step 1: Understanding the Perceptor

A Perceptor in Composabl is a module in the perception layer that inputs sensor data, processes it (potentially using an ML model), and outputs new variables that are automatically added to the list of available sensors.

For this example, let’s assume we are building a perceptor that uses a trained machine learning model to predict thermal runaway in a system.


Step 2: Setting Up the Trained Model

We will use a pre-trained ML model stored as a pickle file to predict thermal runaway based on certain temperature and chemical sensor readings. Here’s how to set up the trained ML model for use as a perceptor.

  1. Store the ML Model: Assume the ML model has been trained and saved as a .pkl file. For this example, the model is stored in the path: ml_models/ml_predict_temperature.pkl.

  2. Load the ML Model in the Perceptor: In the perceptor class, we will load the model and define how it processes the sensor data.
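
For context, a model like this could have been trained and saved with scikit-learn along the following lines. This is a hypothetical training-side script: the feature rows, labels, and classifier choice are placeholders, not part of the tutorial.

python
import pickle
from sklearn.linear_model import LogisticRegression

# Placeholder training data: rows of [Ca, T, Tc, delta_Tc]; label 1 = thermal runaway
X_train = [[8.5, 320.0, 295.0, 2.0], [4.2, 360.0, 305.0, 8.0]]
y_train = [0, 1]

model = LogisticRegression().fit(X_train, y_train)

# Save the model where the perceptor expects to find it
with open("ml_models/ml_predict_temperature.pkl", "wb") as f:
    pickle.dump(model, f)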


Step 3: Creating the Perceptor

Now, we’ll create the perceptor using the trained ML model to process the sensor data and predict thermal runaway events. The perceptor will be responsible for calling the model and returning the prediction as a new sensor variable.

We can start by creating the perceptor using the Composabl CLI with the following command:

composabl perceptor new

The new perceptor will have the following file structure:

.
└── perceptor_name/
    ├── perceptor_name/
    │   ├── __init__.py
    │   └── perceptor.py
    ├── pyproject.toml
    └── README.md

3.1. Configuring your pyproject.toml file

[project]

name = "perc" # Keep this as the same name as the directory
version = "0.1.0"
description = "perc" # You can change this to a description of your project
authors = [    
    { name = "John Doe", email = "john.doe@composabl.com" },
] # Change this to your name and email
dependencies = [    
    "composabl-core",    
    "<third-party-lib>", # Add any third-party dependencies here
]

[composabl]
# This metadata should not be changed with the exception of the entrypoint
# after the ":" marker if you change the name of the class. Those names
# should match.
type = "perceptor"
entrypoint = "perc.perceptor:DemoPerceptor"

3.2. Implementing the Perceptor in the perceptor.py file

Here’s the Python code to create the perceptor:

import pickle
from composabl import Perceptor, PerceptorImpl
from sensors import sensors  # Assuming sensors are already defined

class ThermalRunawayPredict(PerceptorImpl):
    def __init__(self, *args, **kwargs):
        self.last_Tc = 0  # Store the last observed temperature
        self.ml_model = pickle.load(open("ml_models/ml_predict_temperature.pkl", 'rb'))  # Load the trained ML model

    async def compute(self, obs_spec, obs):
        # Convert the observation to a dictionary using sensor names
        if not isinstance(obs, dict):
            obs_keys = [s.name for s in sensors]
            obs = dict(zip(obs_keys, obs))

        # Calculate delta temperature (change in Tc)
        if self.last_Tc == 0:
            delta_Tc = 5  # Initial change
        else:
            delta_Tc = float(obs['Tc']) - self.last_Tc

        # Prepare the input for the ML model
        X = [[float(obs['Ca']), float(obs['T']), float(obs['Tc']), delta_Tc]]
        
        # Get the prediction from the ML model
        prediction = self.ml_model.predict(X)[0]

        # Update last_Tc for the next iteration
        self.last_Tc = float(obs['Tc'])

        # Return the prediction as a new sensor variable
        return {"thermal_runaway_predict": prediction}

    def filtered_sensor_space(self, obs):
        # Define which sensor values the ML model needs
        return ['T', 'Tc', 'Ca']

In this perceptor:

  • We load the trained machine learning model from a pickle file.

  • The compute() method takes in sensor data (e.g., temperature, chemical concentrations), processes it, and uses the ML model to predict whether a thermal runaway event will occur.

  • The perceptor outputs the prediction as a new sensor variable, thermal_runaway_predict.

3.3. Adding the Perceptor to Your Agent System

Once the perceptor is defined, you can log in to the Composabl editor and add it to your agent system.


Conclusion

In this tutorial, we covered how to publish a trained ML model as a perceptor in Composabl. This allows the agent system to integrate more advanced decision-making by processing raw sensor data through a machine learning model and outputting predictions as new sensor variables. This method can be applied in various domains, such as predictive maintenance, anomaly detection, and control systems.


Add a Perception Layer

Adding perception modules to your agent can provide richer, complex, condensed, and nuanced information to the decision-making parts of the agent system. For example, you might include a computer vision model in your perception layer that inputs images or video from a camera and outputs classifications of objects that it identifies. You can also add large language models as perceptors to take in and interpret information in natural language.

Each module in the perception layer for a Composabl agent system inputs the sensor variables, processes those variables in some way, and outputs one or more new variables that the platform will automatically add to the list of sensors.

Perceptors can use any supported Python function or library to calculate outputs. They can even call machine learning and large language models or their APIs.

The next three pages explain how to use the SDK and CLI workflow to create new perceptors or configure existing models as perceptors to use in Composabl agent systems.

Add Perceptors to Agent Systems

Just like skill agents, perceptors can be dragged and dropped into agent systems using the UI. Perceptors will always be situated in the Perception layer that comes before orchestrators and skill agents. That’s because perception needs to be applied to the sensor inputs to create new variables that are then passed to the skills layer for the agent system to use in decision-making.

Configure Scenarios

Scenarios are situations where your agent system needs to behave differently to succeed.

Scenarios are created by carving out sections of the simulation space, as defined by specific configurations of variables and variable ranges. This allows you to train skill agents only in the scenarios where they will need to perform, leading to more efficient training and better performance. Orchestrators learn to recognize scenarios and pass control to the appropriate specialist skill agent to handle the scenario’s demands.

Types of Scenarios

How you define scenarios depends on the type of sensor variables you are working with.

Discrete variables are categories that describe a particular scenario. For each scenario, there is likely a perceptor in the agent system, such as a machine learning model, that inputs the sensors, processes the sensor values, and outputs the discrete variable category.

Continuous variables are numbers. When they are used to define a scenario, one number value determines the section of the space that your agent system will explore for decisions.

Sometimes a scenario is better defined by a range of continuous values than by a single continuous value. In that case, the scenario would be defined by a continuous variable range.

Here are some examples of how scenarios can be defined in different ways:

Use Case | Discrete Variables | Continuous Variables | Continuous Variable Ranges
Drone Control | windy, far_from_charger, low_battery | windspeed=20 (knots) | windspeed between 20-40 (knots)
Autonomous Driving | city, highway, high_traffic | vehicle_speed=65 (miles per hour) | vehicle_speed between 65-85 (miles per hour)
Cybersecurity Network Optimization | normal, high_traffic, cyberattack | site_traffic=0 (clicks per second) | site_traffic between 100-150 (clicks per second)
Process Control | startup, steady_state, shutdown | product_thickness=50 (millimeters) | product_thickness between 49.94-50.06 (millimeters)
Machine Control | break-in, normal, wearing_out | rpm=280 (revolutions per minute) | rpm between 250-295 (revolutions per minute)

Scenarios Example

Let’s say that we are creating an agent system to plan the operations of a restaurant with varying demand for three different recipes. We could create three different scenarios:

  • Low demand: 30 recipe A, 20 recipe B, and 10 recipe C

  • Normal demand: 60 recipe A, 45 recipe B, and 10 recipe C

  • High demand: 100 recipe A, 50 recipe B, and 25 recipe C

Once you have defined these scenarios, you could set the agent system to train in the low-demand scenario until the success criteria are reached. Then it would move to normal demand, and then to high demand. We would expect the agent to learn faster as it progressed through the scenarios, with cumulative knowledge building over time.
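
As a sketch, these three scenarios could be captured as simple dictionaries of condition values (variable names are illustrative; the exact scenario type used by the SDK is not shown here):

python
low_demand    = {"recipe_A": 30,  "recipe_B": 20, "recipe_C": 10}
normal_demand = {"recipe_A": 60,  "recipe_B": 45, "recipe_C": 10}
high_demand   = {"recipe_A": 100, "recipe_B": 50, "recipe_C": 25}

# Train in order of increasing difficulty so knowledge accumulates
training_curriculum = [low_demand, normal_demand, high_demand]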

Set Up Scenarios in the UI

Add Scenarios to a Use Case

Add scenarios to your use case by clicking on Scenarios from the left-hand menu to open the Scenarios page.

Click Add scenario to create a new scenario. Give your scenario a name and then click Add condition to configure it. You will then have the option to choose from any of the available sensor variables and apply conditions to them.

  • For continuous variables, select Is and enter the exact value.

  • For continuous variable ranges, select Is between and enter the range.

  • For discrete variables, select Is element of and enter the possible values.

Create Scenario Flows

In addition to identifying scenarios, Composabl also lets you teach your agent systems about how scenarios relate to each other. To do this, you can build scenario flows to establish sequences of scenarios for the agent system to practice.

Without scenario flows, the training platform will connect scenarios at random. But sometimes it is important for agent systems to practice scenarios in a specific order. For example, it might be important to practice flying a drone in high winds and then landing in the same conditions. Or an agent managing a production line might need to practice going from a scenario in which raw materials are scarce to one in which material costs rapidly rise. By creating a scenario flow, you can ensure that a given sequence of scenarios will be included in training.

Once you have scenarios built, create sequential sections of the task by clicking Add section.

After creating all the sections you need, drag and drop the scenarios into the sections.

Create as many flows as you need to capture the different scenario sequences your agent system will need to navigate.

Add Scenarios to Skill Agents

Scenarios are added to agent systems as part of configuring skill agents, so that each skill agent knows what specific conditions it needs to master. Not all scenarios are relevant to all skill agents. For example, a drone landing skill doesn’t need to practice taking off in windy weather.

The configurations you set up when building the scenario flows will appear in the skill agent configuration modal. For each section of the process, as defined by the flows, tell the orchestrator which scenarios it should apply by checking the boxes next to each scenario.

Orchestrate Skill Agents

Skill agents can be arranged in sequences or hierarchies, in skill groups, or as coordinated skills that output multiple decisions together. The orchestration structures reflect common design patterns that can be used to accelerate the design and creation of agent systems.

Orchestrate Skills in Hierarchies and Sequences

For some agent system designs, the task will be broken down into different skill agents that each control the system under certain conditions. For these agent systems, a special skill called an orchestrator chooses the right skill agent at the right time to accomplish the task. Orchestrators are the specialized supervisor skill agents that orchestrate the skill agents together, determining which skill agent to activate based on the conditions the system needs to respond to.

To add an orchestrator to an agent system, drag the orchestrator into your agent system above the skills layer.

You will then be prompted to configure the orchestrator.

You will also be prompted to choose between an additional set of options that correspond to two separate Machine Teaching design patterns.

  • Fixed-order sequence: perform the skills in a set order. This is used in the functional pattern, a design pattern that is useful for tasks that involve fixed sequences of actions.

  • Variable-order sequence: perform the skills in any order based on the orchestrator’s determination. This is used in the strategy pattern, a design pattern that is useful for tasks that require different control strategies in different situations or conditions.

Orchestrate Skill Agents in Groups

Unlike agent system designs that use an orchestrator to assign control to skills one at a time, agent systems with skill groups use skills working together to make decisions.

Skill groups always consist of two or more skill agents. To create a skill group, simply drag the second skill under the first, and a skill group will automatically be created.

Skill groups are used for the plan-execute pattern, where one skill determines what the action should be and a second skill then “turns the knobs” to implement the decision.

In the industrial mixer example, the DRL skill agent is able to train effectively because the actions of the MPC controller are predictable. That means that it can practice and learn knowing that variations in performance are due to its own actions.

In agent systems with multiple DRL skill agents arranged in plan-execute patterns, Composabl will always train the skills from the bottom to the top. In other words, the execute skill will have to achieve competence before the plan skill will start training. That allows each skill to effectively interpret the feedback from the system without confusion from each other.

Orchestrate Coordinated Skills

Some tasks require multiple skill agents to work together on a single decision, but in parallel rather than in sequence. Skill agents for these tasks use coordinated skills that learn to take action together toward a goal. Also known as Multi-Agent Training, coordinated skills are trained using a coach, rather than a teacher.

Examples of Coordinated Skill Agents

Traffic Optimization: Enhancing traffic flow and safety by teaching individual vehicles to navigate optimally and cooperate with each other.

Collaborative Robotics: Enabling robots to work together on tasks such as assembly in manufacturing or coordination in logistics.

Smart Grids: Optimizing energy distribution by having agents represent power plants, storage, and consumers to improve efficiency and stability.

Multiplayer Games: Creating adaptive and intelligent NPCs that can offer dynamic challenges to players in competitive or cooperative game settings.

Communication Networks: Improving network performance by optimizing resource allocation and traffic routing through agents representing network components.

Environmental Management: Balancing economic, ecological, and social goals in land use and resource management by simulating stakeholders as agents.

Healthcare Logistics: Strategizing resource allocation and treatment plans in scenarios like pandemics by considering the actions of hospitals, pharmacies, and patients as agents.

Supply Chain Optimization: Minimizing costs and delivery times in supply chains by coordinating agents representing various stages of the supply chain process.

Orchestrate Coordinated Skills with the SDK

Coordinated skill agents are not yet available in the UI. In the SDK, we have expanded the API to integrate coordinated skills through the add_coordinated_skill method on your agent. This method accepts a configured CoordinatedSkill class. Just as with the Teacher or Controller classes, you implement the coach for a coordinated skill by inheriting from the Coach class.

The coordinated skill agent takes the incoming observation and action spaces and passes them to the sub-skills as a shared environment for observation and action. The sub-skills then return their observations and actions, which are passed back to the coordinated skill agent. The coordinated skill agent then returns the combined observations and actions to the agent system.

python
# ####################################################################################################
# Define the Coordinated Coach
# ####################################################################################################
class CoordinatedCoach(Coach):
    def __init__(self):
        self.counter = 0

    def compute_reward(self, transformed_sensors, action, sim_reward):
        """
        Computes the reward for the given transformed observation and action
        :param transformed_sensors: The transformed observation
        :param action: The actions dict
        :param sim_reward: The reward from the simulation
        :return: The reward, as a dictionary, with each key the sub-skill name and the value the reward
        """
        self.counter += 1
        return {"skill1": 1, "skill2": 1}

    def compute_success_criteria(self, transformed_sensors, action):
        # keep the episodes short to make testing quicker
        return self.counter > 100

    def compute_termination(self, transformed_sensors, action):
        # keep the episodes short to make testing quicker
        return self.counter > 150

    def transform_action(self, composabl_sensors, action):
        return action


# ####################################################################################################
# Construct your Agent
# ####################################################################################################
s1 = Skill("skill1", IncrementTeacher)
s2 = Skill("skill2", IncrementTeacher)

a = Agent()
a.add_coordinated_skill(CoordinatedSkill(
  "my-coordinated-skill",
  CoordinatedCoach,
  [s1, s2]
))

Get Started

A Two-Part Platform

Composabl is a two-part platform with a no-code UI and a Python SDK and CLI. The interplay of these parts gives Composabl its combination of usability and power.

The two parts enable teams to work together effectively. People and roles who primarily use code, such as data scientists and controls engineers, use the SDK to create modules like ML models and deep reinforcement learning skills. Then, subject matter experts, process engineers, and others can use the no-code interface to easily create teams of agents from these modular building blocks and train them to succeed.

We designed the platform this way because, for complex, high-value processes, there are some tasks that can only be done through coding and some team members who prefer to work in code, while other tasks are better performed through a visual interface. Both parts of the platform work together.

Most users use both parts of the platform to some extent but spend more time in the no-code UI or the SDK, depending on their expertise and role. How you will use the platform depends on your role and what part of the process you are working on.

Set Up Your Environment

Access Composabl

You can access Composabl via a no-code UI and an SDK. They work together to enable you to build, train, and deploy autonomous agent systems.

No-Code UI

Python CLI and SDK Installation

Installing the SDK

pip install -U composabl

Development

pip install -U composabl-dev

Note: The Composabl CLI and SDK require Python version 3.10.x
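
If you are not sure which Python version is active in your environment, you can check before installing (shell commands shown; adjust for your setup):

python3 --version   # should report Python 3.10.x
pip --version       # confirms which interpreter pip is bound to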

Workflow Steps

  • Step 3: UI and SDK | Create agent systems

  • Step 7: Notebook | Export Historian training data and perform detailed analysis

Configure an LLM Model as a Perceptor

You can use an LLM as a perceptor to add language and communication capabilities to your agent system.

This allows you to create human-like assistants or copilots who can contribute natural language capabilities to your agent system. Composabl has several personas for LLM assistants to help structure your agent design.

  • The analyst interprets sensor data and passes it to an interface that the user can access, allowing real-time monitoring of conditions and the agent system's responses.

  • The executive reads external data sources in text and reports information to the agent system, such as trends in the business press that would help to anticipate demand for a product.

  • The plant manager allows operators to communicate directly with the agent system and gives it instructions based on information that would not otherwise be available in its sensor space.

LLM perceptors can either:

  1. Output language to the operator about what the agent system is doing (e.g., the analyst)

  2. Take in inputs in natural language and then transform them into information that the decision-making layer of the agent can use (e.g., the executive and plant manager)

Create an LLM Perceptor

Step 1: Create the perceptor shell

From the CLI, when logged into Composabl, type composabl perceptor new. You will be prompted for a location to save your new perceptor, and then a new directory with your perceptor will be created.

This will include the pyproject.toml file that will allow you to publish the perceptor to the UI once it is created.

Step 2: Define the perceptor class

Within the perceptor.py file, create the API call and prompt for the LLM.

Analyst Perceptor Code Sample

The analyst displays information to the human user, but doesn't send information to the decision-making layer of the agent, so the perceptor returns 0.
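
A minimal sketch of an analyst-style perceptor is shown below, assuming an OpenAI-style chat client; the client, model name, prompt, and sensor names are all illustrative assumptions, not the original sample.

python
import os
from openai import OpenAI  # assumed LLM client; any chat API would work
from composabl_core import PerceptorImpl

class AnalystPerceptor(PerceptorImpl):
    def __init__(self, *args, **kwargs):
        self.client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

    async def compute(self, obs_spec, obs):
        # Summarize current readings for the human operator
        prompt = f"Summarize these reactor readings for an operator: {obs}"
        response = self.client.chat.completions.create(
            model="gpt-4o-mini",  # assumed model name
            messages=[{"role": "user", "content": prompt}],
        )
        print(response.choices[0].message.content)  # displayed to the operator
        # The analyst reports outward only, so it returns 0 to the decision layer
        return {"analyst_summary": 0}

    def filtered_sensor_space(self, obs):
        return ["T", "Tc", "Ca"]  # illustrative sensor names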

Executive Code Sample

The executive in this sample, related to the industrial mixer use case, automatically queries a chemical engineering LLM for advice about control actions to take.

The perceptor returns an action that it recommends that the decision-making layer of the agent system take. This becomes a new sensor variable that the skill agent teacher(s) will take into account when training the agent system in simulation.
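
An executive-style perceptor follows the same shape but feeds its output back into the agent system. A compact sketch, again with an assumed LLM client and an illustrative prompt and parsing strategy (it reuses the imports from the analyst sketch above):

python
class ExecutivePerceptor(PerceptorImpl):
    def __init__(self, *args, **kwargs):
        self.client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

    async def compute(self, obs_spec, obs):
        prompt = (
            f"Reactor temperature is {obs['T']} K and concentration is {obs['Ca']}. "
            "Reply with a single number: the recommended coolant temperature change."
        )
        response = self.client.chat.completions.create(
            model="gpt-4o-mini",  # assumed model name
            messages=[{"role": "user", "content": prompt}],
        )
        try:
            recommendation = float(response.choices[0].message.content.strip())
        except ValueError:
            recommendation = 0.0  # fall back if the reply is not a clean number
        # Exposed as a new sensor variable for the decision-making layer
        return {"llm_recommended_action": recommendation}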


Step 3: Filter the Sensor Space

Composabl agent systems can include text fields in perceptors, but they must be transformed or filtered out in the teacher.py file before training with DRL. For any text variables that are not transformed into a different data type, use the filtered_sensor_space method of the teacher to remove them.
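
For example, if a perceptor adds a free-text variable, the teacher's filtered_sensor_space can simply omit it so that only numeric variables reach DRL training (sensor names are illustrative):

python
def filtered_sensor_space(self):
    # Keep numeric sensors; omit the raw text variable added by the LLM perceptor
    return ["T", "Tc", "Ca", "llm_recommended_action"]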

Step 4: Publish the Perceptor

Publish the perceptor to the UI.

Navigate to the folder above your perceptor. Then publish your perceptor.

Select the organization and project where you want to add your perceptor. Then refresh your Agent Orchestration Studio to see the perceptor and add it to agent systems.

Strategy Pattern

The strategy pattern is one of the key design patterns of machine teaching. When you use the strategy pattern, you break down the task into specific skill agents that each handle one aspect of the process to be controlled. This allows you to "teach" the agent system using subject matter expertise.

In the strategy pattern, each skill agent is either trained using deep reinforcement learning or controlled with a programmed algorithm. Then, a special skill agent called an orchestrator decides which skill agent should make the decision based on the current conditions.

In the industrial mixer problem, the process is divided into three skill agents based on the phase of the process. All three action skill agents and the orchestrator are trained with DRL: each skill agent practices in the conditions it will face and learns to control its part of the process by experimenting over time.

Think of the strategy pattern as like a math class with three students. Student A loves fractions, Student B is great at decimal problems, and Student C thinks in percentages. The orchestrator is their teacher. She reads each question, sees what kind of problem it is, and then assigns it to the student who can solve it best, because of their own special math talent.

Let's get started configuring this agent system!

1. Create your skill agents

This agent system has three skill agents called Start Process, Control Transition, and Produce Product. To create these skill agents in the UI, go to the skill agent page and click Add skill agent. Create all three skill agents and then set the goals and constraints.

2. Set Skill Agent Goals and Constraints

The goal for these skill agents is to maximize yield, and the constraint is to keep the temperature from going above 400 degrees Kelvin.

The goals and constraints are exactly the same in all three skill agents. The agents become specialized during training, as each skill agent trains in a different scenario, corresponding with the three phases of the process. We will create these scenarios later in the tutorial.

  1. Click Add goal. In the left drop-down menu, select Maximize, and in the right one, select Eps_Yield.

  2. Click Add constraint. In the left drop-down menu, select Avoid, and in the right one, select T. After you select T, you'll see a slider appear for you to set the boundaries you want to train the system to avoid. In this case, set the boundaries from 400 to 500.

  3. Save your skill agent and return to the Agent Orchestration Studio.

3. Add the Skill Agents to Your Strategy Pattern Agent System

Drag the skill agents Start Process, Control Transition, and Produce Product that you can now see on the left-hand side of your use case onto the skills layer. Drag the skill agents from the side in the order you want them to be used.

4. Configure the Orchestrator

The green diamond that appears when you place multiple skill agents alongside each other is the orchestrator. This is the "math teacher" agent that decides which specialized skill agent should be chosen to make each decision.

The goals of the top-level orchestrator in an agent system will typically be the same as the goals of the agent system as a whole. So, we can set it to Maximize Eps_Yield.

A fixed-order sequence is appropriate for a phased process like the industrial mixer reaction. That means the orchestrator applies the skill agents one at a time, rather than switching back and forth between them.

5. Configure Scenarios

Configuring scenarios is what allows the skill agents to differentiate from each other. The three specialized skill agents practice only in their designated phase of the process and learn to succeed in their own specific conditions. The orchestrator practices with the whole process so that it knows which skill agent to choose at any point.

5.1 Add Scenarios

Go to the Scenarios page using the left-hand navigation menu. Click Add Scenario to create a new scenario for your agent to use in training.

When building an agent system for your use case, you will define the scenarios based on your knowledge of the task and process. In this case, we provide the values that define the phases of the chemical manufacturing process. Create these scenarios for your agent:

  • Full reaction: Cref Is 8.57, Tref Is 311

  • Startup: Cref Is 8.5698, Tref Is 311.2612

  • Transition: Cref Is 8.56, Tref Is 311, Is 22

  • Production: Cref Is 2, Tref Is 373.1311

5.2 Create Scenario Flows

Scenario flows allow you to connect scenarios that have a sequential relationship to ensure that your agent gets practice in navigating the different conditions in the order in which they will occur.

For this problem, you do not need to create sequential connections between the scenarios. Drag all the scenarios to the first column to make them available to your skill agents and orchestrators.

5.3 Add Scenarios to Skill Agents and Orchestrators

Once you have your scenarios set up and connected with scenario flows, you can add them to skill agents and orchestrators to define the conditions each one needs to practice in. This helps them develop their specialized expertise.

In the Agent Orchestration Studio, click on each skill agent and the orchestrator in turn. Check the box for each scenario to apply to the skill agent.

  • Start Process: Startup

  • Control Transition: Transition

  • Produce Product: Production

  • Orchestrator: Full reaction

6. Run Your Training Session

We are ready to train your agent system and see the results. Select the cluster you want to use and the number of training cycles. We suggest you run 150 training cycles.

You will see the skill agents training one at a time, and you can assign the number of cycles you want each skill agent to use. Composabl automatically assigns an equal number of training cycles to each skill agent, but in some agent system designs, some skill agents require more training than others. For example, in this use case, the transition phase is more difficult to control than the two steady states, so the Control Transition skill agent may need more training time than the others to become effective.

7. View Results

When the training has been completed, you can view your results in the training sessions tab in the UI. This will show you information on how well the agent is learning.

You will likely see a steep learning curve as the agent system experiments with different control strategies and learns from the results. When the learning curve plateaus, that usually means the skill agent is trained.

Analyzing the Strategy Pattern Agent's Performance

We tested this fully trained agent system and plotted the results.

  • Conversion rate: 92%

  • Thermal runaway risk: Low

Orchestrators can be trained with DRL just like the other skill agents, and you set goals for them the same way. Click on Configure Orchestrator to be taken to the menu to set your orchestrator’s goals. Note that the goals of the agent system’s top-level orchestrator should be the same as the goals for the agent system as a whole.

A common plan-execute design is the set point pattern, where a DRL plan skill agent determines what the set point of a system should be, and then a traditional automation technology (a controller that uses model predictive control or optimization) determines what actions are necessary to get the system to the set point. For example, in the industrial mixer use case, the DRL plan skill agent decides what temperature the liquid in the tank should be. Then the MPC execute skill agent determines what temperature of coolant needs to flow around the tank to achieve the desired set point.

The first thing you will need to do after you log in to your account is to connect a simulation to Composabl.

Step 1: UI | Create your first use case

Step 2: UI | Set performance goals

Create skill agents to train with rewards using deep reinforcement learning in the UI or the SDK.

Create or package ML models to import to the UI with the SDK to add advanced perception to agent systems.

Create or package LLMs to import to the UI with the SDK to add natural language to agent systems.

Create or package controllers and optimization algorithms to import to the UI with the SDK to add programmed decision-making to skill agents.

Step 4: SDK | Publish to the UI with one CLI command

Step 5: UI | Orchestrate modular components together to create teams of agents in the UI

Step 6: UI | Train your teams of agents at scale with one click using the UI

Step 8: UI and SDK | Export trained multi-agent systems and connect them to the Composabl runtime for deployment

Scenarios are key to successfully training an agent system with the strategy pattern. Scenarios are different possible conditions represented within the simulation. Skill agents are trained to specialize in different scenarios - for example, the Start Reaction skill agent specializes in controlling the reaction when the temperature and concentration levels are those found at the beginning of the reaction.

This agent system's performance is not perfect, but it stays closer to the benchmark line than either of the two single-skill agent systems. It just needs some help avoiding thermal runaway. We can provide that by adding a perception layer. The code below shows two example perceptors - an analyst that posts LLM commentary to a factory console, and a chemical engineer that returns a recommended action - followed by the CLI commands to publish them.
from fake_llm import llm_client
from fake_factory_console import factory_console_client

from composabl_core import PerceptorImpl

class AnalystPerceptor(PerceptorImpl):
    """
    The analyst type that displays information to the human operators but doesn't send any information to the agent.
    """
    def __init__(self, *args, **kwargs):
        # Set up the clients for the LLM and the factory console
        self.llm_client = llm_client()
        self.factory_console_client = factory_console_client()

    async def compute(self, obs_spec, obs):
        # First, ask the LLM for its thoughts on the current state of the plant
        llm_response = self.llm_client.ask(f"You are controlling a CSTR plant, the current state of the plant is {obs}. What are your thoughts on the current state of the plant?")

        # Second, post the LLM's thoughts to the factory console for a human to read
        self.factory_console_client.post(f"The LLM thoughts on the current state of the plant are: {llm_response}")

        return {"chemical_engineer_llm": 0}
from fake_llm import llm_client
from composabl_core import PerceptorImpl

class ChemicalEngineerPerceptor(PerceptorImpl):
    """
    The perceptor for the text agent
    """
    def __init__(self, *args, **kwargs):
        self.llm_client = llm_client()

    async def compute(self, obs_spec, obs):
        """
        Asks the LLM for its thoughts on the current state of the plant, and returns a recommended action
        """
        llm_response = self.llm_client.ask(f"You are controlling a CSTR plant, the current state of the plant is {obs}. What action do you recommend?")
        # Extract the recommended action from the response text (placeholder parsing for the fake client)
        llm_action = llm_response.find("action")
        return {"chemical_engineer_llm": llm_action}
composabl login
composabl perceptor publish foldername

Analyze Data in Detail with the Historian

In this tutorial, we will explore how to use the historian to validate trained agent systems in Composabl and analyze training logs. The historian stores historical time-series data in an optimized format, parquet (https://www.databricks.com/glossary/what-is-parquet), which helps in evaluating how the agent system is performing during training.

Step 1: Accessing the Historian Data

The historian file stores time-series data essential for validating agent system training. There are several ways to access and store the historian data, but the recommended format is a delta file (parquet).

  1. Understanding the Format:

    • The historian data is typically large, around 500 megabytes for standard operations. It is stored in a Delta Lake file format, optimized for time-series data and supporting efficient queries.

  2. Downloading the Historian File:

    • From the Composabl UI, download the historian file. This file will come in a compressed format (e.g., .gz).

    • After extracting it, you should see the delta file containing time-series data.

Step 2: Setting Up for Validation

  1. Unpacking the Historian File:

    • If the historian file is compressed (e.g., .gz), unpack the file using a tool like gzip:

      
      gunzip -k historian_file.gz 
      
    • Once unzipped, you’ll see a 10 MB+ delta file with historical time-series data.

  2. Understanding the Delta File:

    • The delta file is optimized for fast reads and writes of time-series data.

    • It supports an append-only structure, which ensures that each new piece of data can be added efficiently without modifying the existing data.

Step 3: Querying the Historian Data

  1. Setting Up a Query Environment:

    • To validate your agent system’s training, you’ll need to set up an environment that allows you to query the delta file. Delta Lake integrates well with systems like Apache Spark, but for simple querying, you can use tools like pandas in Python.

  2. Querying for Agent Training Logs:

    • Extract and analyze relevant historical data from the delta file. Here's a simple Python example for querying the delta file using pandas:

    
    import ast
    import json

    import matplotlib.pyplot as plt
    import pandas as pd

    # Load the historian delta file
    df = pd.read_parquet('historian_delta_file.parquet')
    df = df.sort_values(by=['timestamp'])

    # Keep step and skill-training records
    df_data = df[df['category_sub'].isin(['step', 'skill-training', 'skill-training-cycle'])]

    # Keep rows whose "data" column contains composabl_obs, plus the training records
    df_data = df_data[
        (df_data['data'].str.contains('composabl_obs'))
        | (df_data['category_sub'].str.contains('skill-training'))
        | (df_data['category_sub'].str.contains('skill-training-cycle'))
    ]

    def convert_to_dict(x):
        """Parse the serialized data column, trying JSON first and Python literals second."""
        try:
            return json.loads(x)
        except (TypeError, ValueError):
            try:
                return ast.literal_eval(x)
            except (TypeError, ValueError, SyntaxError):
                return None

    df_data['data'] = df_data['data'].apply(convert_to_dict)

    # Extract the skill name, reward, observations, and training cycle from each record
    df_data['skill_name'] = df_data['data'].apply(lambda x: x['name'] if x and 'is_done' in x else None)
    df_data['skill_name'] = df_data['skill_name'].bfill()

    df_data['reward'] = df_data['data'].apply(lambda x: x['teacher_reward'] if x and 'composabl_obs' in x else None)
    df_data['obs'] = df_data['data'].apply(lambda x: x['composabl_obs'] if x and 'composabl_obs' in x else None)

    df_data['cycle'] = df_data['data'].apply(lambda x: x['cycle'] if x and 'cycle' in x else None)
    df_data['cycle'] = df_data['cycle'].bfill()

    df_data = df_data[df_data['category_sub'] == 'step']
    print(df_data)

    # Mean reward per run, skill, and training cycle
    df_group = df_data.groupby(['run_id', 'skill_name', 'cycle'])['reward'].mean()

    # Unpack the observation dicts into one column per sensor
    df_obs = pd.DataFrame(
        data=[[v[0] for v in list(x.values())] for x in df_data['obs'].values],
        columns=list(df_data['obs'].iloc[0].keys()),
    )
    df_obs['cycle'] = df_data['cycle'].values
    df_obs['run_id'] = df_data['run_id'].values
    df_obs['skill_name'] = df_data['skill_name'].values

    # Plot mean episode reward by cycle for each run and skill
    for run_id in list(set([x[0] for x in df_group.index])):
        for skill in list(set([x[1] for x in df_group.index])):
            plt.plot(df_group[run_id][skill])
            plt.ylabel('Mean Episode Reward')
            plt.xlabel('Cycle')
            plt.title(f'{run_id} - {skill}')
            plt.show()

Key Benefits of Using the Historian for Validation:

  • Optimized Data Handling: The Delta Lake format is designed for fast querying, making it ideal for time-series data.

  • Efficient Storage: The append-only nature ensures that new data can be added without overwriting or modifying existing data, making it easy to track data over time.

  • Continuous Monitoring: By continuously adding data to the historian, you can validate your agent system's long-term impact on machine performance, uptime, and safety.

Analyze Agent System Behavior

Analyzing agent system behavior is a key part of using Composabl. This capability is essential for:

  • Building agent systems that beat performance benchmarks

  • Monitoring the performance of your agent systems and simulators in real-time

  • Analyzing your agent systems and simulators to understand what went wrong in case of problems

  • Creating dashboards and reports to share with your team or customers

Composabl offers multiple options for analyzing agent system behavior. For high-level summaries, you can use the benchmark reporting features in the UI. For more in-depth information, you can use the Historian and raw benchmark data in the SDK, which creates a database of all the information gathered during training and deployment. You can also design an LLM module in your agent to provide customizable communications capacity using natural language.

See full code samples and more examples.

Access a Trained Agent System

Once you have completed training an agent system and are ready to deploy it, you can download it from the UI.

  1. Go to the Training tab in your project.

  2. Select the training session for the agent you want to deploy.

  3. Click on the Artifacts drop down in the upper right of the screen.

  4. Click Inference Model and you will download a file called agent.json.

Create Skill Agents

You can use Composabl to create skill agents that learn with deep reinforcement learning. Configure the Composabl teacher by setting goals, constraints, and success criteria for your skill agent. Composabl then turns these into reward functions and trains the skill agent.

Create a New Skill Agent

To create or edit a skill agent, follow these steps:

  1. Navigate to the Skill Agents page

  2. Click on a skill agent, or click + to create a new skill agent

  3. Configure the skill agent

Use the dropdown menus to select the variables and then define the parameters for each goal, constraint, or success criterion you want to include.

Choose Implementation Method

To create a new skill agent that learns with deep reinforcement learning, select Teacher under the Implementation Method dropdown. The other option, Controller, is used for a programmed skill agent that is configured with the Composabl SDK and published to the Agent Orchestration Studio.

Configure Goals

Goals define what a skill agent should do. Goals apply to one of the sensor variables and are defined using one of three possible directives:

  • Maximize: Maximize the value of the variable

  • Minimize: Minimize the value of the variable

  • Maintain: Keep the variable at a specified value or track a specified variable (such as a set point)
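
For example, for the industrial mixer, we want to maximize the yield of the product over the course of an entire episode, eps_yield.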

Configure Constraints

Constraints set the boundaries for the skill agent. They define rules that the skill agent must follow during operation. Constraints are defined using one of two possible directives:

  • Avoid: The skill agent learns to keep the variable from reaching a specified value or range through withholding rewards

  • Terminate: When the actions of the skill agent lead to certain conditions within a variable, the skill agent has failed and must stop and start a new episode
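
For example, we want to avoid the temperature, T, getting above 400 degrees Kelvin in the industrial mixer use case.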

Configure Success Criteria

Success criteria tell the skill agent when it's doing something right. They are defined using one of two possible directives:

  • Approach: The skill agent learns to get close to a specified value by getting increased reward

  • Succeed: When the success criteria are achieved, the session ends, and a new one begins so that the skill agent can keep practicing and learn to win every time
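
Composabl turns these directives into reward functions for you, so you never write one by hand in the UI. As a rough mental model only - not the Composabl implementation - a Maximize goal, an Avoid constraint, and a Succeed criterion combine like the sketch below, where the variable names and thresholds are illustrative:

def reward_step(obs: dict) -> tuple[float, bool]:
    """Illustrative only: Maximize Eps_Yield, Avoid T in [400, 500], Succeed at a target yield."""
    reward = obs["Eps_Yield"]          # Maximize: reward grows with the variable

    if 400 <= obs["T"] <= 500:         # Avoid: withhold reward inside the banned range
        reward = 0.0

    done = obs["Eps_Yield"] >= 0.95    # Succeed: end the episode so practice can restart
    return reward, done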

Connect a Simulator to Composabl

In this tutorial, we will learn how to upload simulators via the Composabl CLI, as well as how to upload your custom, adapted simulators to the Composabl UI.

Upload simulators via Composabl CLI

To upload a simulator that already follows the Composabl simulation specification, all you need to do is run the following command from the simulation folder:

composabl sim publish
  • After that, you can go to the Composabl editor and connect that sim to any project.

Upload Third-Party Simulators via Docker

Prerequisites

If your simulator isn't already compatible with the Composabl platform, you will need to create gRPC bindings and upload it as a Docker image in the Composabl editor. You can follow along with these Composabl API patterns and the following structure:

.
└── your-simulator-folder/
    ├── docker/
    │   └── entrypoint.sh
    ├── src/
    │   ├── exceptions/
    │   │   └── invalid_usage.py
    │   ├── __init__.py
    │   ├── main.py
    │   ├── server_impl.py
    │   └── sim.py
    ├── Dockerfile
    └── requirements.txt

Going through the files:

  • docker/entrypoint.sh: This file is the entrypoint of your Docker container.

  • src/exceptions/invalid_usage.py: This file contains the exception class InvalidUsage that is used to raise exceptions in the simulator.

  • src/__init__.py: This file is the initialization file of the module. No code is needed here, but for Python to recognize the folder as a module, this file is necessary.

  • src/main.py: This file is the main file of the simulator. It uses the composabl_core.networking module to expose the simulator to the Composabl platform. This file is also available zipped along with this tutorial.

  • src/server_impl.py: This file contains the implementation of the server that will be used to run the simulator.

  • src/sim.py: This file contains your implementation of the simulator itself. Usually, an Env class (inheriting from gym.Env) is implemented here, and it is used to run the simulator (see the sketch after this list).

  • Dockerfile: This file is the Dockerfile that will be used to build the Docker image of your simulator.

  • requirements.txt: This file contains the Python packages that are necessary to run your simulator. It is used to install the necessary packages in the Docker image.
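
If you are writing src/sim.py from scratch, the minimal sketch below may help. It assumes the gymnasium package and a toy one-variable process; the spaces, dynamics, and set point are placeholders for your own simulator logic, not a Composabl requirement:

import gymnasium as gym
import numpy as np
from gymnasium import spaces

class Env(gym.Env):
    """Toy simulator: a single temperature-like state nudged by a continuous action."""

    def __init__(self):
        self.observation_space = spaces.Box(low=0.0, high=500.0, shape=(1,), dtype=np.float32)
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(1,), dtype=np.float32)
        self.state = np.array([300.0], dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.state = np.array([300.0], dtype=np.float32)
        return self.state, {}

    def step(self, action):
        # Placeholder dynamics: the action nudges the state directly
        self.state = (self.state + np.asarray(action, dtype=np.float32)).clip(0.0, 500.0)
        reward = -float(abs(self.state[0] - 311.0))  # stay near an illustrative set point
        terminated = bool(self.state[0] >= 400.0)    # overheating ends the episode
        return self.state, reward, terminated, False, {}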

gRPC Spec for Implementation

Under the hood, the Composabl SDK uses gRPC to communicate with the Composabl platform.

To create a simulator that works with the Composabl API, you have to implement the following gRPC methods:

service Composabl {
  // Creates the environment with specified configurations.
  // - MakeRequest: Contains parameters to configure the environment.
  // - MakeResponse: Returns an identifier for the created environment and possibly other initial setup information.
  rpc make(MakeRequest) returns (MakeResponse) {}

  // Advances the environment by one timestep using the action provided in the request.
  // - StepRequest: Includes the action to be taken in the current state of the environment.
  // - StepResponse: Returns the new state of the environment, reward received, and a flag indicating if the episode has ended.
  rpc step(StepRequest) returns (StepResponse) {}

  // Resets the state of the environment, returning it to its initial conditions.
  // - ResetRequest: May include parameters for resetting to specific scenarios.
  // - ResetResponse: Provides the initial observation of the reset environment.
  rpc reset(ResetRequest) returns (ResetResponse) {}

  // Performs any necessary cleanup before the environment is closed.
  // - CloseRequest: May be empty or include specific closing instructions.
  // - CloseResponse: Acknowledges the environment has been successfully closed.
  rpc close(CloseRequest) returns (CloseResponse) {}

  // Generates a sample action from the environment's action space.
  // - ActionSpaceSampleRequest: May be empty or specify particular sampling criteria.
  // - ActionSpaceSampleResponse: Provides a sample action from the action space.
  rpc action_space_sample(ActionSpaceSampleRequest) returns (ActionSpaceSampleResponse) {}

  // Retrieves information about the environment's action space.
  // - ActionSpaceInfoRequest: May be empty or include parameters for the information request.
  // - ActionSpaceInfoResponse: Returns detailed information about the action space.
  rpc action_space_info(ActionSpaceInfoRequest) returns (ActionSpaceInfoResponse) {}

  // Retrieves information about the environment's observation space.
  // - ObservationSpaceInfoRequest: May be empty or include parameters for the information request.
  // - ObservationSpaceInfoResponse: Returns detailed information about the observation space.
  rpc observation_space_info(ObservationSpaceInfoRequest) returns (ObservationSpaceInfoResponse) {}

  // Sets the current scenario for the environment.
  // - SetScenarioRequest: Includes parameters defining the scenario to set.
  // - SetScenarioResponse: Acknowledges the scenario has been set.
  rpc set_scenario(SetScenarioRequest) returns (SetScenarioResponse) {}

  // Retrieves the current scenario of the environment.
  // - GetScenarioRequest: May be empty if simply retrieving the current scenario.
  // - GetScenarioResponse: Returns details of the current scenario.
  rpc get_scenario(GetScenarioRequest) returns (GetScenarioResponse) {}

  // Sets the render mode of the environment.
  // - SetRenderModeRequest: Includes parameters for the desired render mode.
  // - SetRenderModeResponse: Confirms the render mode has been set.
  rpc set_render_mode(SetRenderModeRequest) returns (SetRenderModeResponse) {}

  // Retrieves the current render mode of the environment.
  // - GetRenderModeRequest: May be empty if simply querying the current mode.
  // - GetRenderModeResponse: Returns the current render mode.
  rpc get_render_mode(GetRenderModeRequest) returns (GetRenderModeResponse) {}

  // Retrieves the current render of the environment.
  // - GetRenderRequest: May include parameters specifying the render details.
  // - GetRenderResponse: Provides the current render of the environment.
  rpc get_render(GetRenderRequest) returns (GetRenderResponse) {}
}

After making sure that your simulator is compatible with the Composabl platform, you can proceed to the next section.

Docker


  • Navigate to the folder where your simulator is located and take a look at what is inside it.

  • After that, ensure that you have Docker installed. You can check if Docker is installed by running the following command:

    docker --version
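
    If Docker is installed, you should see the Docker version. If not, you can install Docker by following the instructions on the official Docker website.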

DockerHub
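
Then, before building the image, ensure that you have a Docker Hub account. If you don't have one, you can create one by going to the Docker Hub website.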


  • To log in to Docker Hub, run the following command:

    docker login

    You will be prompted to enter your Docker Hub username and password. After that, you should see a message saying that you are logged in.

  • Now, we can build the Docker image of the simulator. To do so, run the following command:

    docker build -t <your-docker-hub-username>/<simulator-name> .

    This command will build the Docker image of the simulator. The -t flag is used to tag the image with the name <your-docker-hub-username>/<simulator-name>. The . at the end of the command indicates that the Dockerfile is in the current directory.

  • After building the image, you can check if it was built successfully by running the following command:

    docker images

    And then push the image to Docker Hub:

    docker push <your-docker-hub-username>/<simulator-name>

Composabl UI


After that, you can go to the Composabl UI and upload your simulator. To do so, follow the steps below:
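
  • Go to the Composabl UI by accessing the following link: https://app.composabl.com/. You'll be asked to log in and then redirected to your dashboard page.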

  • Then, on the left sidebar, click on the "Simulators" tab. You should see a list of simulators that are already available on the platform.

    On the top right corner, you should see a button to "New Simulator". Click on it.

  • A pop-up will appear, asking you to select between "Internal" and "External" simulators. Select "External".

  • Then, fill in the Title and Description of the simulator. We suggest a short name for the Title and a more detailed description for the Description.

  • After clicking Next Step, a brief tutorial will open. Read it carefully and then click on "Next Step" again.

  • After that, you can fill the Docker Image field with the name of the Docker image you pushed to Docker Hub <your-docker-hub-username>/<simulator-name>.

    If the image is public, no more fields are needed. If the image is private, you need to fill the Docker Username and Docker Password fields with your Docker Hub username and password, respectively. Then, click on "Validate and next step".

Conclusion

If you've followed all the steps correctly, you should have successfully uploaded your simulator to the Composabl UI. You can now use your simulator to train agents and run simulations on the platform. If you have any questions or need help, feel free to reach out to us.

Connecting to Agent System Runtime and Plotting Results of Agent System Operations

In this tutorial, we will cover how to connect to the agent system runtime, load a pre-trained agent system, run inference, and visualize the results in a production-like environment. The provided script, agent_inference.py, is a key component that demonstrates connecting to the Composabl agent system runtime, initializing the environment, and plotting agent system operation results.


Step 1: Understanding agent_inference.py

The script agent_inference.py connects to the runtime, loads a pre-trained agent system, connects to a local simulation, collects sensor data from the sim, and plots the results. Here is an outline of the core steps in the process:

  1. Start Runtime and Load Agent System: The script initializes the trainer and loads a pre-trained agent system from a model folder.

  2. Set Up the Simulation Environment: It connects to a simulation environment.

  3. Run Inference: The pre-trained agent system interacts with the simulation to perform inference (decisions), collecting observations and giving actions at each step.

  4. Collect Data and Plot Results: Sensor data and actions are collected in a Pandas DataFrame, and the results are plotted using Matplotlib to visualize how the agent system is performing over time in a production-like environment.


Step 2: Connecting to the Runtime and Loading the Agent System

The first task is to connect to the Composabl runtime and load the pre-trained agent system. This is accomplished using the Trainer and Agent classes. The agent system's model is loaded from the directory where the model was saved during training.
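
async def run_agent():
    # Start Runtime
    trainer = Trainer(config)

    # Load the pre-trained agent
    agent = Agent.load(PATH_CHECKPOINTS)

    # Prepare the agent for inference
    trained_agent = await trainer._package(agent)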

In this code:

  • Trainer(config) initializes the runtime with a configuration file.

  • Agent.load(PATH_CHECKPOINTS) loads the saved agent from the specified checkpoint directory.

  • trainer._package(agent) prepares the agent for inference by packaging it.


Step 3: Connecting to the Simulation Environment

Next, we connect the agent system to the simulation environment. The make() function creates a connection to the local simulator, and the environment is initialized.
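
    # Inference
    print("Creating Environment")
    sim = make(
        run_id="run-benchmark",
        sim_id="sim-benchmark",
        env_id="sim",
        address="localhost:1337",
        env_init={},
        init_client=False
    )

    print("Initializing Environment")
    await sim.init()
    print("Initialized")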

In this code:

  • The simulator is configured to run locally (localhost:1337); you must start it locally and manually before running the script.

  • The environment is initialized with sim.init(), and the agent system is connected to it.


Step 4: Setting the Scenario and Running Inference

After connecting to the simulator, you need to set up the specific scenario that the agent system will operate in. This scenario determines the environment's initial state.

With the environment set, the agent can now run inference for a set number of iterations. At each iteration, the agent observes the environment, takes an action, and collects the results (observations and rewards). This is done in a loop.
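
    # Set scenario
    noise = 0.0
    await sim.set_scenario(Scenario({
        "Cref_signal": "complete",
        "noise_percentage": noise
    }))
    obs_history = []
    df = pd.DataFrame()
    print("Resetting Environment")
    obs, info = await sim.reset()
    obs_history.append(obs)
    action_history = []

    for i in range(90):
        action = await trained_agent._execute(obs)  # Get action from agent
        obs, reward, done, truncated, info = await sim.step(action)  # Step the environment

        # Create a temporary DataFrame for the current observation
        df_temp = pd.DataFrame(columns=[s.name for s in sensors] + ['time'], data=[list(obs) + [i]])
        # Concatenate the new data to the existing DataFrame
        df = pd.concat([df, df_temp])

        obs_history.append(obs)
        action_history.append(action)

        if done:
            break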

In each iteration:

  • The agent system performs an action based on the current observations.

  • The environment advances one step with sim.step(action), and the agent receives a new observation and reward.

  • Sensor data and actions are logged into a Pandas DataFrame for later analysis.


Step 5: Saving Data and Plotting Results

Once the inference loop is complete, the collected data is saved, and the results are visualized. The results are plotted using Matplotlib.
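
    # Save the DataFrame to a pickle file for later use
    df.to_pickle(f"{PATH_HISTORY}/inference_data.pkl")

    # Plot results
    plt.figure(figsize=(10, 5))

    # Plot Temperature Controller Data (Tc)
    plt.subplot(3, 1, 1)
    plt.plot(df.reset_index()['time'], df.reset_index()['Tc'])
    plt.ylabel('Tc')
    plt.legend(['Tc'], loc='best')
    plt.title(f'Agent Inference DRL - Noise: {noise}')

    # Plot Temperature and Reference Temperature (T, Tref)
    plt.subplot(3, 1, 2)
    plt.plot(df.reset_index()['time'], df.reset_index()['T'])
    plt.plot(df.reset_index()['time'], df.reset_index()['Tref'], 'r--')
    plt.ylabel('Temp')
    plt.legend(['T', 'Tref'], loc='best')

    # Plot Concentration and Reference Concentration (Ca, Cref)
    plt.subplot(3, 1, 3)
    plt.plot(df.reset_index()['time'], df.reset_index()['Ca'])
    plt.plot(df.reset_index()['time'], df.reset_index()['Cref'], 'r--')
    plt.legend(['Ca', 'Cref'], loc='best')
    plt.ylabel('Concentration')
    plt.xlabel('Iteration')

    # Save plot
    plt.savefig(f"{PATH_BENCHMARKS}/inference_figure.png")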

This code generates three subplots:

  1. Temperature Controller (Tc) over time.

  2. Temperature (T) and Reference Temperature (Tref) over time.

  3. Concentration (Ca) and Reference Concentration (Cref) over time.

The plots provide a visual representation of the agent system's performance during the simulation. Finally, the figure is saved as inference_figure.png in the benchmarks directory.


Step 6: Running the Script

To run the script, execute agent_inference.py in your terminal:
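
python agent_inference.py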


Conclusion

In this tutorial, we demonstrated how to:

  • Connect a pre-trained Composabl agent system to a runtime and simulation environment.

  • Set up a scenario and run inference.

  • Collect observations and actions, and plot the results using Matplotlib.

By following these steps, you can visualize the performance of your agent system and gain insights into how it interacts with the environment over time.

Set KPI and ROI

The Composabl benchmarking feature allows you to compare the performance of different multi-agent systems against key performance indicators (KPIs). This tool helps you evaluate your agent system's effectiveness, track its improvements, and calculate potential return on investment (ROI).

Benchmarks are generated only after training is complete. You won't see benchmark data while training progresses, even for long-running jobs. If a training job fails, no benchmark data will be generated.

Note: Benchmarking does not influence agent system training, goals, or rewards. It is purely an analysis tool that helps you answer the question: "Based on how this agent system was trained and how we believe it should perform in real life, how much money is this agent system generating?"

Getting Started

Define the KPI for Your Project

The key performance indicator (KPI) for your project is the top-level objective that you will use to compare and evaluate your multi-agent systems' performance.

To define the KPI for your project, choose the sensor variable that represents the metric that best indicates performance success. For most processes, this is a metric with a clear business impact, such as product yield or energy use.

Setting up the KPI

To set up the KPI:

  1. Navigate to the Benchmark page from the main dashboard

  2. Click the "Settings" button in the top right of the KPI Performance Metrics section

  3. In the "Set up KPI" section, configure the following:

    • Sensor Name: Select a sensor or perceptor variable to track (e.g., "Eps_Yield")

    • Target Value: Set the desired target for this KPI (e.g., 0.33)

    • Benchmark Unit: Select the unit of measurement (e.g., "% of Max Theoretical Yield"). Note that this is for UI readability only and doesn't affect calculations.

Setting up ROI Calculations

Return on Investment (ROI) calculations help quantify the financial impact of your agent systems' performance. The ROI is calculated based on the monetary value of the difference between your target value and the actual value of the KPI generated by that agent system, minus training costs.

  1. In the Settings modal, navigate to the "Enter ROI criteria" section

  2. Define the conditions for calculating ROI:

    • When: Select the metric to track (e.g., "Eps_Yield")

    • Condition: Choose how the metric changes (e.g., "increases by")

    • Percentage: Enter the percentage change that matters (e.g., 2%)

    • Value: Set the financial value of this change (e.g., $1,000,000)

    • Period: Select the time period (e.g., "per year")

For example: "When Eps_Yield increases by 1% of max theoretical yield, it is worth $1,000,000 per year."

You can change KPI and ROI settings at any time after training completes, and the benchmark results will be automatically recalculated. This allows you to explore different business scenarios without retraining your agent systems.

Reading the Benchmarking Dashboard

Understanding the Overview Panel

At the top of the benchmarking page, you'll find an Overview panel containing three key metrics:

  • Highest performing Agent System: Displays the outcome of the highest performing agent system

  • Lowest performing Agent System: Displays the outcome of the lowest performing agent system

  • ROI: Shows the calculated financial return based on your KPI settings

Each metric displays both the value and the relevant unit (e.g., "% of Max Theoretical Yield").

Bar Graph Visualization

The bar graph in the KPI Performance Metrics section provides a visual comparison of your agent systems:

  • Each bar represents an agent system's performance on your selected KPI (all bars correspond to the same KPI)

  • The height of the bar indicates the specific value of the KPI for that agent system (Y-axis)

  • Horizontal dashed lines show benchmark averages

  • The actual value of the KPI for each agent system is shown above the respective bar

Performance Ranking Table

Below the graph is a detailed table showing:

  • Agent System Number: Sequential ID for each agent system

  • Agent System Name: Name or description of the agent system

  • ROI: Calculated financial return based on the agent system's performance

    • Green arrows (↑) indicate positive ROI with the percentage and absolute value

    • Red arrows (↓) indicate negative ROI with the percentage and absolute value

  • Eps_Yield (variable) per % of Max Theoretical Yield: Performance metric with indicators for highest agent system performers

Connect Runtime Container to Your Operation

This tutorial explains the steps for building the Docker container, deploying the agent system, and accessing the runtime for inference or interaction.


Tutorial: Accessing the Agent System Runtime After Deploying to Docker

Once you have packaged and deployed your agent system inside a Docker container, the next step is accessing its runtime for operations like model inference. This tutorial will guide you through the process of building and running the Docker container and then connecting to the agent system's runtime for further interactions.


Step 1: Preparing the Dockerfile and Environment

To deploy the agent system to Docker, we need to first create a Dockerfile. The Dockerfile will package the necessary runtime, model, and environment for the agent system.

  1. Dockerfile Setup: Your Dockerfile should contain the following key components:

    • Base Image: Use a Python base image (or any base that supports the necessary libraries).

    • Copy Model Files: Copy the pre-trained model (e.g., .gz file) to the container.

    • Install Dependencies: Install any required Python libraries (such as aiohttp or other packages the agent system needs).

Here’s an example Dockerfile:
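
# Use an official Python runtime as the base image
FROM python:3.10-slim

# Set the working directory
WORKDIR /usr/src/app

# Copy the necessary files into the Docker image
COPY . .

# Install any dependencies specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Expose port 8000 for the HTTP server
EXPOSE 8000

# Command to run the server when the container starts
CMD ["python", "agent_inference.py"]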


Step 2: Building the Docker Image

  1. Building the Image: You can build the Docker image by running the following command in the terminal. This will take the Dockerfile and the associated files (like the pre-trained model) and create an image.
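
docker build -t my-agent-runtime .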

  • The -t flag allows you to tag the image (my-agent-runtime), which makes it easier to reference later.

  • Make sure that the model file (agent.gz) and all relevant scripts are reachable within the Docker context (i.e., the directory from which you are building).

  2. Checking the Image: Once the build is complete, you can verify that the image was created successfully by running:
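
docker images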


Step 3: Running the Docker Container

Now that the image is built, the next step is to run it in a container. You will run the Docker container in an interactive mode to access the runtime.
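
docker run -it -p 8000:8000 my-agent-runtime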

  • -it: Runs the container interactively.

  • -p 8000:8000: Maps port 8000 from the container to port 8000 on your local machine so that you can access the HTTP server for the agent system runtime.

The HTTP server should now be up and running within the container, ready to handle model inference or other tasks.


Step 4: Accessing the Agent System Runtime

With the Docker container running, you can now connect to the agent system's runtime, which is exposed as an HTTP server. You can access it through a POST request for model inference or other operations.

  1. Sending Requests to the Agent System: You can send a POST request to the running server using a tool like curl, Postman, or any Python HTTP library (such as requests).

Here’s an example using curl:
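
curl -X POST http://localhost:8000/infer -d '{"input_data": "your_input_here"}'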

This request will:

  • POST data to the /infer endpoint on localhost:8000, which is being forwarded from the Docker container.

  • The agent system will handle the request, infer the model, and return the result.

  2. Interacting with the Agent System: If you prefer to interact with the agent system directly, you can also enter the container’s interactive mode and run commands.
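
docker exec -it <container_id> bash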

This will open a shell inside the running Docker container, allowing you to execute any runtime commands manually.


Step 5: Automating the Process

For convenience, you can automate the entire process of building the image, running the container, and interacting with the agent system by creating a script.

Here’s a basic example of an automation script:
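
#!/bin/bash

# Build the Docker image
docker build -t my-agent-runtime .

# Run the Docker container
docker run -it -p 8000:8000 my-agent-runtime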

Save this as run_agent.sh, and then execute it:
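
bash run_agent.sh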

This script will:

  • Build the Docker image.

  • Run the container, mapping the necessary port and exposing the HTTP server for inference.


Step 6: Troubleshooting and Debugging

If the container fails to start, or if the server doesn't respond, you can debug the container by checking the logs:
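
docker logs <container_id>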

This command displays the output of the running container, which can help diagnose issues such as missing dependencies or server errors.


Conclusion

In this tutorial, we walked through the process of:

  • Building a Docker image with your agent system and its runtime.

  • Running the Docker container interactively to expose the agent’s HTTP server.

  • Accessing the agent system runtime by sending HTTP requests for inference or other tasks.

By following these steps, you can deploy and interact with your Composabl agent system in a Dockerized environment.

Manual

Overview

Due to the many options available to you for installing Kubernetes clusters, this document will not go into the specifics of setting up the cluster. Rather, it will provide you with guidance and requirements for your cluster.

Nodes

Depending on whether you want to use GPUs or not, you need the following nodes:

Nodes that are always required:

  1. "main": This means nodes to run the control plane. The Composabl controller does not interact with these nodes, so they should be provisioned as recommended by the Kubernetes distribution you use.

  2. "composabl": This node or nodes will be where the Composabl controller and Historian software are scheduled.

  3. "envrunners": These nodes will handle training workloads. If you're not using GPUs, all training will be done on these nodes. If you are, these nodes will manage the communication with the simulators, and can be reduced in size

  4. "simscpu": These nodes are where the simulators will be scheduled. Sizing depends on the simulator.

If you want to use GPU training, you need the following node pool:

  5. "learners": These nodes with GPUs will accelerate the learning step of the training process.

If your simulator can be accelerated using a GPU, you can add the final node pool:

  6. "simsgpu": These nodes will run simulators, assigning a GPU to them.
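
A note on GPUs: Currently, only Nvidia GPUs are supported. The cluster must have the nvidia-gpu-operator installed for training on GPU to be enabled.

Whether or not you use autoscaling using cluster-autoscaler, each node type must be sized accordingly.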

1. Sizing

  1. main: As required by your Kubernetes distribution

  2. composabl: In total, 16GB of memory and 4 CPU, with at least one node having 8GB of memory.

  3. envrunners: If not using GPUs, we recommend 8 CPU and 8 or 16 GB of memory. In any case, the number of simulators that can be managed by each envrunner instance depends on the number of CPUs.

  4. simscpu: The sizing of these nodes depends on the resource requirements of your simulator

  5. learners: These nodes should have 1 Nvidia GPU. Other resources can be limited - 2 CPU and 8GB of memory is sufficient

  6. simsgpu: As with simscpu, depends on the simulator requirements.

2. Labels

All groups of nodes must be labeled accordingly. The name, as given in the sizing guide, is the name you should set as the agentpool label.

You may be able to define this during your cluster setup, but if not, you can use the following commands:
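
kubectl label node <my-composabl-node> agentpool=composabl --overwrite
kubectl label node <my-envrunners-node> agentpool=envrunners --overwrite
kubectl label node <my-simulator-node> agentpool=simscpu --overwrite
kubectl label node <my-learners-node> agentpool=learners --overwrite
kubectl label node <my-simulator-gpu-node> agentpool=simsgpu --overwrite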

Replace the values in between <> with the name of the nodes you'd like to assign to a specific pool.

Storage

The components also need access to (semi)persistent, shared storage. This section will detail the types and amount of storage needed.

The cluster needs the following PersistentVolumeClaims in the composabl-train namespace:

  1. pvc-controller-data with a size of ±1Gi and ReadWriteOnce (or better) accessMode. When using Azure, you will need to set the nobrl mountOption for this PVC, as this is required for the Composabl controller to function.

  2. pvc-training-results with a suitable size - this is where your final agent system data will be stored before it is uploaded to the No-code application. It needs accessMode to be ReadWriteMany (RWX). A good initial size is to match historian-tmp.

  3. historian-tmp is used as temporary storage for historian data. It needs to have an accessMode of ReadWriteOnce and the size will depend on the length of your training sessions. We recommend starting with 5Gi.

The size of pvc-training-results and historian-tmp depends on the number and size of training jobs you want to run simultaneously on your cluster. If you plan on running long-lived training sessions with many cycles, you may want to increase the capacity of both.

Private image registry

If you want to use a private registry for simulator images, you will need to set up this private registry yourself, and make sure the cluster is able to pull images from this registry.

Next steps
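
Once your cluster is running, and you have verified your setup is working, you can continue to Installing Composabl.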

Configure API Connections to Third-Party Software as Skill Agents

In this tutorial, we will walk through how to set up a skill agent in Composabl that integrates with a third-party API. This type of integration allows your agent system to communicate with external systems, such as machine performance APIs, and use the data to make informed decisions.

We will create a programmed skill agent that connects to a mock third-party API, processes its response, and returns an action based on the data received. This tutorial will also touch on orchestrating this skill agent within your agent system.


Step 1: Defining the Programmed Skill Agent

A programmed skill agent in Composabl is created by specifying the logic for interacting with the external API and processing the response. In this case, we will create a simple API connection to a fake endpoint that returns data about machine performance. The agent system will act based on the information received.

1.1. Creating the API Integration Skill Agent

We’ll define a programmed skill agent for making the API request. Here's an example of how to define the skill agent using a controller function that calls the API and processes the response.
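
import requests
from composabl import SkillController

# Define the programmed skill
class ThirdPartyAPISkill(SkillController):

    def __init__(self, *args, **kwargs):
        self.api_url = "https://api.example.com/machine-status"

    async def compute_action(self, obs, action):
        # Send sensor data to the third-party API
        response = self._call_api(obs)
        # Process the response and return an action
        action = self._process_response(response)
        return action

    def _call_api(self, observation):
        try:
            response = requests.post(
                self.api_url,
                json=observation,
                headers={'Content-Type': 'application/json'}
            )
            response.raise_for_status()
            return response.json()

        except requests.RequestException as e:
            print(f"API call failed: {e}")
            return None

    def _process_response(self, response):
        if not response:
            # Default action
            return 0.0

        action = float(response.get("action"))
        reason = response.get("reason", "No reason provided")

        print(f"Action: {action} - Reason: {reason}")
        return action

    async def transform_sensors(self, obs):
        return obs

    async def filtered_sensor_space(self):
        return ['sensor1', 'sensor2', 'sensor3']

    async def compute_success_criteria(self, transformed_obs, action):
        return False

    async def compute_termination(self, transformed_obs, action):
        return False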

In this example:

  • The compute_action() method sends observation data (e.g., from sensors) to a third-party API.

  • The _call_api() function makes the API call and handles any errors that might occur.

  • The _process_response() function processes the response from the third-party API and determines the appropriate action for the agent system to take based on the data.

Step 2: Adding the Programmed Skill Agent to the Agent System

2.1. Adding the Skill Agent to Composabl UI

Once the skill agent is defined, you can add it to your agent system in the UI using the methods below:

  1. Create a new Skill Agent using the Composabl CLI with a given name, description, and implementation type, which in this case will be a controller. The name will be "third_party_api_skill".
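
composabl skill new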

  2. Change the controller.py code to use the class that you created: ThirdPartyAPISkill(). Change the pyproject.toml file to include your class ThirdPartyAPISkill in the entrypoint, along with its name:
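
[project]
name = "Third Party API Skill Agent"

entrypoint = "third_party_api_skill.controller:ThirdPartyAPISkill"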

  3. Publish the Skill Agent to the UI:
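
composabl login
composabl skill publish third_party_api_skill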

Select your organization and project that you want to publish it to.

Reference: https://docs.composabl.io/changelog/0-8-0.html

2.2. Adding the Skill Agent to Composabl SDK

Once the skill agent is defined, you can add it to your agent system using the add_skill() SDK method. This allows the agent system to execute the API connection skill agent when necessary.

Here’s how to add the ThirdPartyAPISkill to the skill agent:
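
# Define and add the third-party API skill agent
third_party_skill = Skill("third_party_api", ThirdPartyAPISkill)
agent.add_skill(third_party_skill)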

By importing and creating the class with SkillController, you are indicating that this skill agent is programmed and does not require training. It will use predefined logic to interact with the third-party API and make decisions based on the data returned.


Conclusion

By following these steps, you’ve successfully defined and integrated a programmed skill agent that communicates with a third-party API into your Composabl agent system. The agent system can now take actions based on external data and dynamically respond to scenarios.

This approach allows agent systems to interface with a wide range of external systems, from monitoring equipment to adjusting machine settings, all through programmable skill agents.

Orchestration of skill agents through selectors ensures the agent system executes the correct skill agents at the right time, whether the skill agents are learned or programmed.

Creating a Cluster

Introduction
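
Composabl agent systems use clusters to train at scale. A cluster is a collection of computers that work on large tasks simultaneously. This provides enough compute to complete large training tasks as efficiently as possible.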

To train Composabl agent systems on a cluster, you need to install the Composabl controller onto a compatible Kubernetes cluster.

This document will guide you through setting up this cluster using various options:

  • Azure: Set up a cluster through our Azure Marketplace Composabl Training Cluster offer

  • Automated: Set up a compatible Kubernetes cluster on Azure through the use of Pulumi (IaaS)

  • Manually: Set up a compatible Kubernetes cluster through any tool and infrastructure of your choice

Components

The entire Composabl controller installation consists of two main components:

  • Controller: This component connects to the Composabl No-code UI and allows you to initiate training on your cluster.

  • Historian: The historian component records detailed statistics regarding the training process of your agent system. After training, these will be available alongside the agent system in the No-code UI.

RBAC

The installation of Composabl will introduce 2 namespaces, composabl-train and composabl-sims, for training workers and simulators respectively.

For normal operation, a series of permissions are required by the ServiceAccounts of the Composabl installation.

Cluster Setup

We currently have the following guides to setting up your own cluster:
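
  • Azure, our recommended way of setting up a training cluster

  • Automated, which describes the use of Pulumi for setting up a compatible cluster on Azure

  • Manual, if you're experienced in setting up a cluster, or have existing standard tooling for setting up Kubernetes clusters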

Submit a Training Job through the UI

About Cluster Training

Composabl offers two options for cluster training:

  • Use Composabl's Training as a Service offering to train on our clusters

  • Use your own compute clusters through Azure, AWS, or another provider

Ensure that Your Agent System is Ready

Before you submit your job for training on a cluster, make sure that your agent system is fully configured and all the parameters have been set. That means checking all the agent system components:

  • Goals

  • Perceptors

  • Orchestrators, including goals for learned selectors and scenarios

  • Skill agents, including goals for learned skills

  • Scenarios, including scenario flows

Any component of the agent system with a warning sign is not fully configured and not ready for training. Go back to edit that component and make sure that all of the fields are filled out.

Choose the Right Cluster

You can train on your own cluster or on Composabl’s clusters using training as a service (TaaS) credits. If you want to use Composabl’s clusters, ensure that you have credits available.

To train on your own cluster, make sure that you have set your cluster up and installed Composabl successfully.

Click Train and then choose the cluster option in the menu. You will then have the option to configure your training session.

Configure Your Training Session

Training session configuration options are the same whether you’re using TaaS or training on your own cluster.

Set the Number of Training Cycles

A training cycle is a complete pass through the entire task, with the agent system continuing until it reaches success or some other stop criteria. Your agent system will train each skill one at a time for the selected number of training cycles, starting from the bottom of the agent system design.

A training cycle involves about 1,000 agent decisions. Depending on the complexity of the task, agent systems may need to complete anywhere between 100 and several thousand training cycles to become proficient.

Set the Number of Simulators

You can run multiple simulators in parallel to speed up training. If you run more than one simulator during a training, the number of training cycles selected will be multiplied by the number of simulators, so 5 training cycles with 3 simulators selected would lead to 15 training cycles total.

You can use the Advanced Configuration to choose how powerful each machine running a simulator should be. If you choose Small, each training cycle selected will result in one training cycle completed. If you choose GPU, you will get 4 completed training cycles for each training cycle selected.

More training cycles running simultaneously will speed up training, but also increase costs. How long your training takes also depends on the complexity of your agent system and your simulator.

Start Training

When you have configured your settings correctly, click Start Training.

You will then be taken to the Training Sessions page. There you can follow the agent system training progress by viewing the real-time plots or the console output.

Note that it will take a few minutes for the visualization to begin.

import requests 
from composabl import SkillController

# Define the programmed skill 
class ThirdPartyAPISkill(SkillController): 

    def __init__(self, *args, **kwargs):
        self.api_url = "https://api.example.com/machine-status" 

    async def compute_action(self, obs, action):
        # Send sensor data to the third-party API 
        response = self._call_api(obs) 
        # Process the response and return an action 
        action = self._process_response(response) 
        return action
  
    def _call_api(self, observation): 
        try: 
            response = requests.post( 
                self.api_url,  
                json=observation,  
                headers={'Content-Type': 'application/json'} 
            ) 
            response.raise_for_status() 
            return response.json() 

        except requests.RequestException as e: 
            print(f"API call failed: {e}") 
            return None 

    def _process_response(self, response): 
        if not response:
            # Default action 
            return 0.0

        action = float(response.get("action"))
        reason = response.get("reason", "No reason provided") 

        print(f"Action: {action} - Reason: {reason}") 
        return action

    async def transform_sensors(self, obs):
        return obs

    async def filtered_sensor_space(self):
        return ['sensor1', 'sensor2', 'sensor3']

    async def compute_success_criteria(self, transformed_obs, action):
        return False

    async def compute_termination(self, transformed_obs, action):
        return False
composabl skill new
[project]
name = "Third Party API Skill Agent"

entrypoint = "third_party_api_skill.controller:ThirdPartyAPISkill"
composabl login
composabl skill publish third_party_api_skill
# Define and add the third-party API skill agent
third_party_skill = Skill("third_party_api", ThirdPartyAPISkill) 
agent.add_skill(third_party_skill) 

Publish Skill Agent Components to the UI

Once you have created new skill agent components in the SDK or configured existing algorithms and models, you can publish them with a simple CLI workflow to make them available in the UI to drag and drop into agent system designs.

Agent system components are published into projects. They will then be available to use for all the agent systems you create for that use case.

Publish Skill Agents and Orchestrators

  1. Log in to Composabl: composabl login. The system will redirect you to the UI to enter your credentials and log in.

  2. Return to the command line and navigate to the folder containing the skill agent or orchestrator you want to publish.

  3. Publish the skill agent or orchestrator: composabl skill publish or composabl selector publish.

  4. Select your organization from the dropdown menu.

  5. Select your project from the dropdown menu. Save your new skill agent or orchestrator there.

  6. Your skill agent or orchestrator will begin publishing. When the process is complete, go to the UI, navigate to the Agent Orchestration Studio, and refresh your browser to see your new skill agent or orchestrator in the sidebar.
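
Putting the steps together, a typical publish session from the terminal might look like this (the folder and skill names are illustrative):

cd <folder containing your skill agents>
composabl login
composabl skill publish third_party_api_skill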

Publish Perceptors

  1. Log in to Composabl: composabl login. The system will redirect you to the UI to enter your credentials and log in.

  2. Navigate to the perceptors folder (one level above the individual perceptor folder).

  3. Publish the perceptor: composabl perceptor publish perceptor_name

  4. Select your organization from the dropdown menu.

  5. Select your project from the dropdown menu. Save your new perceptor there.

  6. Your perceptor will begin publishing. When the process is complete, go to the UI, navigate to the Agent Orchestration Studio, and refresh your browser to see your new perceptor in the sidebar.

Connecting a Cluster

Introduction

This page covers manually installing the Composabl components onto the cluster.

Requirements

To install Composabl on your cluster, you'll need access to the Composabl No-Code UI and a valid license key. Additionally, to run the installation script, you need to have kubectl set up with access to the cluster.

Steps

  1. Log in to the No-Code app

  2. Navigate to the Clusters page

  3. Click "Create"

  4. In the pop-up, choose a display name for your cluster

  5. Select the cloud vendor you're using, or "custom" if none

  6. Copy the command

  7. Run the command in a terminal where you have kubectl access to the cluster

  8. When the command exits without errors, the installation is complete

  9. Refresh the Clusters page. Your custom cluster should now say "Connected" and list its specifications.

You can now select the cluster from the dropdown menu when training.

View Training Session Information

The Training Sessions page allows you to view agent systems' training in real time and analyze their performance in training.

When you begin a training session, the graphs for each trained skill agent will begin to generate. You can watch your skill agents learn by viewing the graphs, or you can click on the Console Output tab for detailed information about each training decision.

The shape of the curve can help you understand how your skill agents are learning. When the curve plateaus, that usually means that the skill has been successfully trained and will not learn more. If the curve shows jagged ups and downs, then the skill isn't performing consistently and has more learning to do. Sometimes this is a sign that you should go back and adjust the training settings.

The training sessions page shows a list of all the training sessions for a project in a menu on the left of the screen, allowing you to jump between different agents, as well as different training sessions for the same agent system.

Analyze Data

Benchmark Testing and Data Generation

After training a multi-agent system in Composabl, the system automatically runs a series of standardized tests to evaluate its performance. This benchmarking process:

  1. Places the system in controlled testing environments

  2. Records detailed metrics at each step of operation

  3. Aggregates results to provide comprehensive performance statistics

The output of this testing process is compiled into a structured benchmark.json file, which contains rich performance data that can be analyzed to assess effectiveness, identify improvement opportunities, and compare different design approaches. This file is a performance record and a valuable analytics resource for optimizing your agent systems.

Downloading Benchmark Artifacts

To download benchmark data for further analysis:

  1. Navigate to the "Training Sessions" page

  2. Click the artifacts dropdown at the top right of the page for a trained system

  3. Select "Benchmark"

  4. The benchmark.json file will be saved to your local machine

Understanding the Benchmark.json File

The benchmark.json file contains structured data about the performance of a trained agent system. Here's how to interpret this file:

File Structure

{
  "skill-name": {
    "scenario-0": {
      "scenario_data": { ... },
      "episode-0": [ ... ],
      "aggregate": { ... }
    }
  }
}

Key Components

Scenario Data: Contains reference values for the scenario:

"scenario_data": {
  "sensor_one": {"data": 8.57, "type": "is_equal"},
  "sensor_two": {"data": 373, "type": "is_equal"}
}

Episode Data: Array of state-action pairs showing how the agent performed in each step:

[
  {
    "state": "{'sensor_one': array([311.2639], dtype=float32), ...}",
    "action": "[-1.253192]",
    "teacher_reward": 1.0,
    "teacher_success": false,
    "teacher_terminal": null
  },
  ...
]

Aggregate Statistics: Summary statistics for the entire benchmark:

"aggregate": {
  "mean": { ... },
  "medians": { ... },
  "std_dev": { ... },
  "max": { ... },
  "min": { ... }
}

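
Because benchmark.json is plain JSON, you can analyze it with standard Python tooling. The sketch below, which assumes the key layout shown above, computes the mean teacher_reward per episode:

import json

# Load the benchmark file downloaded from the Training Sessions page
with open("benchmark.json") as f:
    benchmark = json.load(f)

# Walk skills -> scenarios -> episodes and summarize the reward signal
for skill_name, scenarios in benchmark.items():
    for scenario_name, scenario in scenarios.items():
        for key, steps in scenario.items():
            if key.startswith("episode-"):
                rewards = [step["teacher_reward"] for step in steps]
                mean_reward = sum(rewards) / len(rewards) if rewards else 0.0
                print(f"{skill_name}/{scenario_name}/{key}: "
                      f"steps={len(rewards)}, mean reward={mean_reward:.3f}")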

Automated

Introduction

This guide covers setting up your Composabl training cluster using Pulumi, an Infrastructure as Code tool. This example uses Azure Kubernetes Service, but can be adapted to other supported providers.

Prerequisites

  1. An Azure subscription with sufficient permissions to create and update various resources

  2. A working installation of Pulumi, and a new Pulumi project set up as per the Pulumi documentation

  3. If you're following along in TypeScript, a working installation of NodeJS

Overview

We will be deploying the following resources to your Azure subscription:

  1. Resource group, containing all resources

  2. A container registry, to hold simulator images

  3. An AKS cluster

Resource group

The resource group will contain all resources. It is also what determines in what Azure location the resources will be deployed.

import * as pulumi from "@pulumi/pulumi";
import * as resources from "@pulumi/azure-native/resources/index.js";

const resourceGroup = new resources.ResourceGroup('my-resource-group-', {
  location: 'eastus'
});

export const rgName = pulumi.interpolate`${resourceGroup.name}`;

At the end, we export the name of the resource group (which will be randomized by Pulumi) for further use in our definition.

Container registry

The container registry is where you will be able to privately store your simulator docker images, if any.

import * as containerregistry from "@pulumi/azure-native/containerregistry/index.js";

const registry = new containerregistry.Registry("registry", {
  resourceGroupName: resourceGroup.name,
  sku: {
    name: "Basic",
  },
  adminUserEnabled: true,
});

export const registryName = pulumi.interpolate`${registry.name}`;

Kubernetes Cluster

The cluster is where both the Composabl components and your training will be running. This configuration is more complex, so additional information will be provided as comments in the typescript definition:

import * as containerservice from "@pulumi/azure-native/containerservice/index.js";

const k8sCluster = new containerservice.ManagedCluster("aks", {
  resourceGroupName: resourceGroup.name, // Here, we reference the resourceGroup we created earlier
  location: resourceGroup.location,

  dnsPrefix: "composabl-aks",
  kubernetesVersion: "1.31.1", // you can get supported versions using the Azure CLI: az aks get-versions -l <location> -o table - replace <location> with the location you set in your resourcegroup.
  enableRBAC: true,

  // Assign a managed identity to the cluster
  // Note: appMiAKS is a user-assigned managed identity assumed to be created
  // elsewhere in your Pulumi program (e.g. with azure-native's managedidentity module)
  identity: {
    type: "UserAssigned",
    userAssignedIdentities: [appMiAKS.id],
  },

  // Configure 3 pools
  // 1. Main (the kubernetes control plane nodes)
  // 2. Train (Composabl system components and training workers)
  // 3. Sims (Composabl simulators)
  agentPoolProfiles: [
    // The Main pool has 3 small nodes to act as a control plane
    {
      name: "main",
      count: 3,
      vmSize: "Standard_B2s", // (2 core, 4GB RAM, 0.041/hour)
      osType: "Linux",
      osSKU: "Ubuntu",
      mode: "System",
    }
  ],
  sku: {
    name: "Base",
    tier: "Standard"
  },
  // This part is optional unless you are using very large clusters with several hundred nodes.
  networkProfile: {
    networkPlugin: "azure",
    networkPolicy: "calico",
  }
});

// the "Composabl" agent pool will run the composabl system components (Controller, Historian)
const composablPool = new containerservice.AgentPool("composabl",
  {
    resourceGroupName: resourceGroup.name,
    resourceName: k8sCluster.name,
    agentPoolName: "composabl",
    count: 1,
    vmSize: "Standard_D4s_v3", // (4 core, 16GB RAM)
    osType: "Linux",
    osSKU: "Ubuntu",
    mode: "System",
    osDiskSizeGB: 100,
    osDiskType: "Premium_LRS",
  },
  { replaceOnChanges: ["vmSize"] }
);

// the Env Runners will contain the part of the SDK that deals with data gathering from the simulators
// If training using GPU is disabled, all training will happen on these nodes as well
const envrunnersPool = new containerservice.AgentPool("envrunners",
  {
    resourceGroupName: resourceGroup.name,
    resourceName: k8sCluster.name,
    agentPoolName: "envrunners",
    vmSize: "Standard_D8d_v4",
    count: 1,
    minCount: 1,
    maxCount: 10,
    enableAutoScaling: true,
    osType: "Linux",
    osSKU: "Ubuntu",
  },
  { replaceOnChanges: ["vmSize"] }
);

// The Sims-CPU pool will run all simulator instances
const simsCpuPool = new containerservice.AgentPool("simscpu",
  {
    resourceGroupName: resourceGroup.name,
    resourceName: k8sCluster.name,
    agentPoolName: "simscpu",
    vmSize: "Standard_D8d_v4",
    count: 2,
    minCount: 2,
    maxCount: 1000,
    enableAutoScaling: true,
    osType: "Linux",
    osSKU: "Ubuntu",
  },
  { replaceOnChanges: ["vmSize"] }
);

GPU Training and simulators

If you want to enable GPU training and GPU-enhanced simulators, you will also need to add the following pools.

In addition, you will need to install the nvidia-gpu-operator on the cluster, following the instructions on the project website. Finally, GPU_ENABLED must be set to true on the Composabl controller deployment, if it hasn't been already.

// The learners will run the learning part of the training, accelerated by GPU
const learnersPool = new containerservice.AgentPool("learners",
  {
    resourceGroupName: resourceGroup.name,
    resourceName: k8sCluster.name,
    agentPoolName: "learners",
    vmSize: "Standard_NC4as_T4_v3", // (4vCPU, 28GB RAM, 1GPU (Nvidia Tesla T4), 0.0570$/hour)
    count: 1,
    minCount: 1,
    maxCount: 10,
    enableAutoScaling: true,
    osType: "Linux",
    osSKU: "Ubuntu",
    osDiskSizeGB: 100,
    osDiskType: "Premium_LRS",
  },
  { replaceOnChanges: ["vmSize"] }
);

// Optional - if you also want to run Simulators on machines with GPUs, provision this pool as well:
const simsGpuPool = new containerservice.AgentPool("simsgpu",
  {
    resourceGroupName: resourceGroup.name,
    resourceName: k8sCluster.name,
    agentPoolName: "simsgpu",
    vmSize: "Standard_NC4as_T4_v3", // (4vCPU, 28GB RAM, 1GPU (Nvidia Tesla T4), 0.0570$/hour)
    count: 1,
    minCount: 1,
    maxCount: 10,
    enableAutoScaling: true,
    osType: "Linux",
    osSKU: "Ubuntu",
  },
  { replaceOnChanges: ["vmSize"] }
);

Notes:

  1. Autoscaling:

    • This template enables autoscaling to have the cluster automatically scale to the required size and back down afterward to reduce costs.

    • You can disable autoscaling by removing the minCount, maxCount, and enableAutoScaling properties, but you'll have to set the count value accordingly (see the sketch after these notes).

  2. vmSize: The vmSizes used above can be adjusted to instance types that better fit your needs.
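
For example, a fixed-size version of the simscpu pool above, with autoscaling disabled, might look like this (a sketch, not part of the original template):

// A fixed-size simulator pool: no minCount/maxCount/enableAutoScaling,
// so the count value alone determines the pool size
const simsCpuPoolFixed = new containerservice.AgentPool("simscpu",
  {
    resourceGroupName: resourceGroup.name,
    resourceName: k8sCluster.name,
    agentPoolName: "simscpu",
    vmSize: "Standard_D8d_v4",
    count: 4,
    osType: "Linux",
    osSKU: "Ubuntu",
  },
  { replaceOnChanges: ["vmSize"] }
);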


Deploy an Agent System in a Container

This tutorial explains how to access the agent system runtime after deploying it to Docker, covering the steps for building the Docker container, deploying the agent system, and accessing the runtime for inference or interaction.


Tutorial: Accessing the Agent System Runtime After Deploying to Docker

Once you have packaged and deployed your agent system inside a Docker container (https://docs.composabl.com/deploy-agents/deploy-an-agent-in-a-container), the next step is accessing its runtime for operations like model inference. This tutorial will guide you through the process of building and running the Docker container and then connecting to the agent system's runtime for further interactions.


Step 1: Preparing the Dockerfile and Environment

To deploy the agent system to Docker, we first need to create an image from the Dockerfile (https://docs.composabl.com/deploy-agents/deploy-an-agent-in-a-container). The Dockerfile packages the necessary runtime, model, and environment for the agent system.

Step 2: Building the Docker Image

  1. Building the Image: You can build the Docker image by running the following command in the terminal. This will take the Dockerfile and the associated files (like the pre-trained model) and create an image.

docker build -t composabl_agent_api .
  • The -t flag allows you to tag the image (composabl_agent_api), which makes it easier to reference later.

  • Make sure that the model file (agent.json) and all relevant scripts are reachable within the Docker context (i.e., the directory from which you are building).

  2. Checking the Image: Once the build is complete, you can verify that the image was created successfully by running:

docker images

Step 3: Running the Docker Container

Now that the image is built, the next step is to run it in a container. You will run the Docker container in an interactive mode to access the runtime.

docker run -e COMPOSABL_LICENSE="<your_license>" -it -p 8000:8000 composabl_agent_api
  • -it: Runs the container interactively.

  • -p 8000:8000: Maps port 8000 from the container to port 8000 on your local machine so that you can access the HTTP server for the agent system runtime.

  • -e COMPOSABL_LICENSE="<your_license>": Sets the environment variable that holds your Composabl license inside the container.

The HTTP server should now be up and running within the container, ready to handle model inference or other tasks.


Step 4: Accessing the Agent System Runtime

With the Docker container running, you can now connect to the agent system's runtime, which is exposed as an HTTP server. You can access it through a POST request for model inference or other operations.

  1. Sending Requests to the Agent System: You can send a POST request to the running server using a tool like curl, Postman, or any Python HTTP library (such as requests).

Here’s an example using curl:

curl -X POST http://localhost:8000/predict -H "Content-Type: application/json" -d '{"observation": {"T": 311.0, "Tc": 292.0, "Ca": 8.56, "Cref": 8.56, "Tref": 311.0, "Conc_Error": 0.0, "Eps_Yield": 0.0, "Cb_Prod": 0.0}}'

This request will:

  • POST data to the /predict endpoint on localhost:8000, which is being forwarded from the Docker container.

  • The agent system will handle the request, run model inference, and return the action as a result.
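
The same request can be sent from Python using the requests library (a sketch mirroring the curl example above):

import requests

# The same observation payload used in the curl example
observation = {
    "T": 311.0, "Tc": 292.0, "Ca": 8.56, "Cref": 8.56,
    "Tref": 311.0, "Conc_Error": 0.0, "Eps_Yield": 0.0, "Cb_Prod": 0.0,
}

# POST the observation to the agent system's /predict endpoint
response = requests.post(
    "http://localhost:8000/predict",
    json={"observation": observation},
)
print(response.json())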


Conclusion

In this tutorial, we walked through the process of:

  • Building a Docker image with your agent system and its runtime.

  • Running the Docker container interactively to expose the agent system’s HTTP server.

  • Accessing the agent system runtime by sending HTTP requests for inference or other tasks.

By following these steps, you can deploy and interact with your Composabl agent system in a Dockerized environment.


Industrial Mixer

Build four agent systems for a realistic use case with step by step instructions

[Tutorial figures: MPC controller performance results; diagram of the DRL agent; plan-execute agent performance tracking closely to the reference line; strategy pattern performance.]

Resolving Certificate Issues for Installing the Composabl SDK on WSL

If you use Windows Subsystem for Linux (WSL) on a corporate machine with internal security tools, such as Zscaler, you may encounter issues installing the SDK due to certificate replacement during HTTPS traffic inspection. Below is a method to resolve this issue by importing the appropriate certificates into WSL.

Step-by-Step Guide to Import Certificates into WSL

  1. Identify the Certificates

    • Open https://app.composabl.com in your browser and click on the padlock symbol next to the URL to view the site certificates. This method may vary slightly depending on your browser.

    • Export the certificates in the chain needed to trust the "app.composabl.com" certificate, which may be replaced by Zscaler or similar security tools. The intermediate and root certificates are often issued by your organization's IT department.

  2. Export the Certificates

    • Export the certificates in Base 64 encoded X.509 (PEM) format.

    • Make sure the exported certificate files have a .crt extension.

  3. Copy Certificates to WSL

    • Copy the exported certificates to /usr/local/share/ca-certificates in WSL. You can use any file transfer method between Windows and WSL.

  4. Update the Certificate Store

    • Run the following command to update the certificate store in WSL: sudo update-ca-certificates

  5. Verify the Certificate

    • You can verify if the certificates are correctly recognized by running: openssl s_client -connect api.composabl.com:443 -prexit

This command should connect and print the certificate chain and other relevant information.

  6. Removing Certificates

    • If you need to remove the certificates, delete them from /usr/local/share/ca-certificates and then run: sudo update-ca-certificates --fresh

This command will rebuild the certificate bundle from the remaining individual certificates.

  7. Alternative Method: Export from Windows Certificate Store

    The intermediate/root certificates may already be available in the Windows certificate store. You can identify and export the required certificates from there if needed.
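
Steps 3 through 5 condensed into a short terminal session (a sketch; the certificate filename is illustrative):

# Copy the exported certificate into the WSL trust store (filename is an example)
sudo cp corporate-root.crt /usr/local/share/ca-certificates/

# Rebuild the certificate bundle
sudo update-ca-certificates

# Verify that the certificate chain is now trusted
openssl s_client -connect api.composabl.com:443 -prexit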

By following these steps, you should be able to configure your WSL environment to trust certificates replaced by corporate security tools, enabling the installation of the SDK without issues.

Deploy an Agent System as an API

Deploying an agent system means exporting the trained agent system, loading it into your production environment, and then asking the agent system for decisions. In Composabl, the export is a JSON file called agent.json that contains everything you need to deploy your agent system. You can load the agent system file into your IT infrastructure in many ways.

This document will show how you can deploy your agent system as an API using Python and Flask.

Step 1: Accessing and Preparing the Files

To deploy the agent system as an API, we need the agent.json model, the agent_api.py script to start the API, and requirements.txt to install the packages. You can find a sample of these files in our GitHub repo: https://github.com/Composabl/examples.composabl.io/tree/main/deploy_agent

This is the structure needed for the API (matching the paths used in agent_api.py below):

deploy_agent/
├── agent_api.py
├── requirements.txt
└── model/
    └── agent.json

How to extract the agent.json

  1. Log into the Composabl UI (https://app.composabl.com/onboarding), train your agent system, and navigate to the training sessions section.

  2. Check the status of the agent system:

    • Green status (Done) indicates finished training.

  3. Download the agent system file:

    • The agent system file (a .gz compressed file).

    • Extract the agent system file agent.json to the model folder, as shown below.
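
For example, assuming the downloaded artifact is named agent.json.gz (the actual filename may differ):

# Decompress the downloaded artifact into the model folder (filename is illustrative)
gunzip -c agent.json.gz > model/agent.json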

Step 2: Get the API Python file (agent_api.py, shown in full below)

Step 3: Install the requirement packages

Run the following in your terminal:

It will install these packages:

  • composabl

  • flask[async]

  • numpy

Step 4: Export your license and start the API

To start Composabl API, you will need to export your license as an environment variable and then use Python to start the Flask API with your agent system.

Step 5: Test your API

After running the API, you can test it by opening a terminal and running the script below:

In the POST request, we pass the use case "observation" with sensor variables and their values and receive an action from the agent system. This example corresponds to the agent.json demo for Chemical Process Control.

import asyncio
import os

from composabl import Agent, Trainer
import numpy as np

from flask import Flask, request, jsonify

# Initialize Flask app
app = Flask(__name__)

# Global objects (initialized once)
trainer = None
trained_agent = None

license_key = os.environ["COMPOSABL_LICENSE"]

PATH = os.path.dirname(os.path.realpath(__file__))
PATH_CHECKPOINTS = f"{PATH}/model/agent.json"

# Initialize the runtime, load the model, and package it when the app starts
async def init_runtime():
    """
    Initializes the trainer and agent before the first request is processed.
    This sets up the AI model for inference, loading it from checkpoints and preparing the agent.
    """
    global trainer, trained_agent

    # Assuming 'config' is required to initialize the Trainer
    config = {
        "license": license_key,
        "target": {
            "local": {"address": "localhost:1337"}
        },
        "env": {
            "name": "sim-deploy",
        },
        "trainer": {
            "workers": 1
        }
    }

    # Initialize the Trainer with the config
    trainer = Trainer(config)

    # Load the agent from the given checkpoint path
    agent = Agent.load(PATH_CHECKPOINTS)

    # Package the agent for inference using the Trainer's _package function (asynchronously)
    trained_agent = await trainer._package(agent)


# Asynchronous POST route to receive observation and return action
@app.route('/predict', methods=['POST'])
async def predict():
    """
    Receives an observation through a POST request, processes it using the pre-trained agent,
    and returns the corresponding action.
    """
    global trained_agent

    # Check if the agent has been successfully initialized
    if not trained_agent:
        return jsonify({"error": "Agent not initialized"}), 500

    # Extract the observation from the request's JSON body
    obs = request.json.get("observation")

    # Validate that the observation was provided in the request
    if obs is None:
        return jsonify({"error": "No observation provided"}), 400

    # Convert the observation values into a flat float array
    obs = np.array([float(x) for x in dict(obs).values()])

    # Asynchronously process the observation to generate the action
    action = await trained_agent._execute(obs)

    # Return the generated action in the response
    return jsonify({"action": str(action)})


if __name__ == "__main__":
    # Run the Flask application with async support on localhost, port 8000
    loop = asyncio.get_event_loop()
    loop.run_until_complete(init_runtime())

    app.run(host="0.0.0.0", port=8000, debug=True)
pip install --no-cache-dir -r requirements.txt
export COMPOSABL_LICENSE='<your license here>'

python agent_api.py
curl -X POST http://localhost:8000/predict -H "Content-Type: application/json" -d '{"observation": {"T": 311.0, "Tc": 292.0, "Ca": 8.56, "Cref": 8.56, "Tref": 311.0, "Conc_Error": 0.0, "Eps_Yield": 0.0, "Cb_Prod": 0.0}}'

Azure

Introduction

This will set up a fully functional Composabl Training Cluster on your Azure tenant.

Prerequisites

  1. An active Azure subscription

  2. Quota for the supported VM sizes - see VM sizing and types below for details. If using GPU training, you'll also need quota for GPU-enabled nodes (such as the NCasT4_v3 series).

Overview

Setting up your cluster consists of the following steps:

  1. Registering a new cluster in Composabl

  2. Running through the wizard in Azure

Registering a new cluster in Composabl

  1. Head to the Composabl UI and navigate to the Clusters page

  2. Choose "Create Cluster"

  3. Select "Azure Cluster"

  4. Take note of the Cluster ID and License key - you'll need them during the setup wizard on Azure, then click "Create on Azure"

Creating your cluster on Azure

You can find the Composabl Training Cluster application on the Microsoft Azure Marketplace.

  1. Click "Create" to start the setup wizard

  2. On the first step, select a resource group to create the cluster in. We recommend you use a new Resource Group made specifically for this purpose. Give the resource a descriptive name.

  3. On the second step, fill in the following:

  • Cluster name - this is a technical name and should be unique for all AKS clusters in your Azure subscription

  • Kubernetes version - select the latest version, unless a policy prevents you from doing so. Only supported versions are listed.

  • Container registry name - this is the name of the container registry you'll be pushing your simulator images to. As with the Cluster name, this name has to be unique for all Azure Container Registries in your Azure subscription.

  • GPU Training - Check this if you want to use GPUs to accelerate training.

  • Scaling settings - Whether to enable autoscaling, and limits for the autoscaling, if enabled. Note that we recommend you leave autoscaling enabled, as this can significantly reduce costs.

  • The advanced settings should not be used unless you require specific features to be enabled for testing or support purposes.

On the final step, fill in the Cluster ID and License key you received during cluster registration. Finally, click "Review + Create" to review the settings and create your training cluster.

VM sizing and types

A Composabl cluster consists of up to 6 different node types:

  1. Main nodes - these are Kubernetes system nodes, used for running the Kubernetes master processes

  2. Composabl nodes - these nodes run the Composabl software - e.g. the Composabl Controller and the Historian processor

  3. Environment runner nodes - The Composabl SDK will manage the simulator instances from these nodes. If GPU training is disabled, the training process will run on these nodes as well.

  4. Simulator nodes: These nodes are where the simulator(s) will be running

If you have enabled GPU training and/or GPU simulators, you will also have the following:

  5. (GPU Training enabled) Learner nodes - these GPU-enabled nodes will perform the actual training on GPU, after samples are collected from the simulators

  6. (GPU Training + GPU Sims enabled) GPU simulator nodes - these nodes will run GPU-enabled simulators

For most cases, we recommend leaving the default sizing. If you don't have quota available for one of the default sizes, you can change any of them to a similar node size.
