Composabl is the platform for creating industrial-strength multi-agent AI systems that make high-impact decisions in the real world.
With Composabl, you can convert expert knowledge about how a process works into a team of agents with specialized skills that work together to allow the system to make the right decision in every situation. These can be either programmed or learned through advanced AI techniques and orchestrated so that the multi-agent system performs effectively in every part of the process and under any conditions. For skill agents that learn by practicing, Composabl trains the agents in realistic scenarios until the agent can succeed at the task and outperform the alternatives.
Find articles on key topics with these links.
You can use any model or Python algorithm with Composabl for training agents, adding perception, analysis, and communication, and making decisions. See how to configure different types of modules and then publish them to the Composabl no-code UI for agent design, training, and deployment.
Composabl multi-agent systems are built on modular skills that break down a task into separate parts. Learn how to create skill agents to train with deep reinforcement learning.
Once Composabl agentic systems are designed and trained, you can export them to the Composabl runtime to connect with your system. Learn how to deploy an agent within the runtime container and how to use Composabl's tools to analyze agent behavior during both training and deployment.
Integrate a machine learning model
Use existing ML models for machine vision, prediction, or analysis in Composabl agents
Integrate a language model
Set up an LLM call to add communication to your agent
Integrate a programmed algorithm
Configure any Python algorithm, such as a PID controller or optimization algorithm, as a skill agent in Composabl
Integrate third-party software
Use an API call to external software as a skill agent in Composabl
Integrate a simulator
Configure your simulator to work with Composabl
Publish to the platform
Upload configured models and algorithms to the Composabl UI with one CLI command
Create skill agents with goals and constraints
Create skill agents with subject matter expertise by configuring goals and constraints for learning
Create skill agents with rewards using the SDK
Create skill agents that learn with rewards and access additional teaching tools in Python
Publish to the platform
Upload skill agents configured with the SDK to the Composabl UI for training and deployment
Evaluate the performance of your multi-agent system
Evaluate performance using the Composabl benchmarking feature
Deploy a Composabl multi-agent system
Export a multi-agent system and connect to the Composabl runtime container
Connect the multi-agent system to your production system
Connect the Composabl runtime container to your system for deployment
Audit multi-agent system behavior with the historian
Use the Composabl historian to analyze system behavior in detail
Install the Composabl CLI
Below, you can find the commands available in the Composabl CLI.
composabl login
Log in to the Composabl platform
composabl agent train --json-path <PATH>
Train the agent based on the JSON Structure provided
composabl agent visualize
Visualize the agent structure
composabl sim
composabl sim new --name <NAME> --description <DESCRIPTION> --location <PATH>
Create a new sim template. This will create the boilerplate required to submit a sim
composabl sim validate <PATH>
Validates the simulator at the given path
composabl sim validate --path <PATH>
Validates the simulator at the given path
composabl sim validate --address <ADDRESS>
Validates the simulator running at the given address
composabl sim run <PATH>
Run the sim created through the command above locally (useful for development)
composabl sim list
List the sims associated with an organization in the Composabl UI
composabl sim list --local
List the Docker images for sims running locally
composabl sim status
Gives a status of currently running local sims
composabl sim stop --sim-id <SIM_ID>
Stop the running sim
composabl sim stop
Stop all running sims
composabl sim mappings --address <ADDRESS> --image <IMAGE>
Based on the address or image, get the mappings for a sim's observation and action space
composabl perceptor
composabl perceptor new --name <NAME> --description <DESCRIPTION> --location <PATH>
Create a new perceptor template. This will create the boilerplate required to submit a perceptor
composabl perceptor publish <PATH>
Publish the perceptor to the NoCode app
composabl skill
composabl skill new --name <NAME> --description <DESCRIPTION> --location <PATH>
Create a new skill template. This will create the boilerplate required to submit a skill
composabl skill publish <PATH>
Publish the skill to the NoCode app
composabl selector
composabl selector new --name <NAME> --description <DESCRIPTION> --location <PATH>
Create a new selector template. This will create the boilerplate required to submit a selector
composabl selector publish <PATH>
Publish the selector to the NoCode app
composabl job
composabl job list
List the running jobs
composabl job stop <JOB_ID>
Stop the running job
composabl job submit --path <PATH> --job-id <JOB_ID> --json <JSON> --template-path <TEMPLATE_PATH>
Submit a job for training
Controller: A skill agent that makes decisions based on programmed algorithms. Also called a programmed skill agent.
Decisions: The outputs of a Composabl system - its instructions for controlling the system
Design patterns: Common structures that can be used to quickly create multi-agent systems
Episode: An entire run through the task
Functional Pattern: A machine teaching design pattern used to orchestrate skills in sequences
Iteration: One decision during agent training or performance
Learned skill agent: A skill agent that uses DRL to make decisions and learn by practicing in simulation
Orchestration: Arranging agents as modular building blocks to work together to complete tasks
Orchestrator: A special type of skill agent that assigns decision-making control to the right decision-making skill agent based on current conditions
Perceptor: An ML model or other algorithm that interprets sensor data
Plan-Execute Pattern: A machine teaching design pattern used to orchestrate skill agents in pairs that work together to make decisions
Programmed skill agent: A skill agent that makes decisions based on programmed algorithms. Also called a controller.
Project: A collection of agents for the same use case that share the same simulator and top-level goal
Scenarios: Conditions that are associated with specific skills
Sensors: The part of the system that takes in information from the simulator or real environment - the eyes, ears, and other senses of the system
Simulator: The virtual environment where agents practice and improve performance
Skill agents: Modules within a multi-agent system that make decisions to complete all or part of a task
Strategy Pattern: A machine teaching design pattern used to orchestrate skill agents in hierarchies
Teacher: An algorithm that creates a skill that uses DRL to learn to make decisions
Composabl is a two-part platform with a no-code UI and a Python SDK and CLI. The interplay of these parts gives Composabl its combination of usability and power.
The two parts enable teams to work together effectively. People and roles who primarily use code, such as data scientists and controls engineers, use the SDK to create modules like ML models and deep reinforcement learning skills. Then, subject matter experts, process engineers, and others can use the no-code interface to easily create teams of agents from these modular building blocks and train them to succeed.
We designed the platform this way because, for complex, high-value processes, there will be some tasks that can only be done through coding, and some team members who prefer to work in code, and other tasks that are better performed through a visual interface. Both parts of the platform work together.
Most users use both parts of the platform to some extent but spend more time in the no-code UI or the SDK, depending on their expertise and role. How you will use the platform depends on your role and what part of the process you are working on.
The first thing you will need to do after you log in to your account is to connect a simulation to Composabl.
You can access Composabl via a no-code UI and an SDK. They work together to enable you to build, train, and deploy autonomous agents.
Installing the SDK
Development
Note: The Composabl CLI and SDK require Python version 3.10.x
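To install the SDK and CLI, use pip (this assumes the package is published to PyPI as composabl):
pip install composabl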
Step 1: UI | Create your first use case
Step 2: UI | Set performance goals
Step 3: UI and SDK | Create agents
Create or package ML models to import to UI with the SDK to add advanced perception to agents.
Create or package LLMs to import with the SDK to the UI to add natural language to agents.
Create or package controllers and optimization algorithms to import to the UI with the SDK to add programmed decision-making to agents.
Step 4: SDK | Publish to the UI with one CLI command
Step 5: UI | Orchestrate modular components together to create teams of agents in the UI
Step 6: UI | Train your teams of agents at scale with one click using the UI
Step 7: Notebook | Export Historian training data and perform detailed analysis
Step 8: UI and SDK | Export trained multi-agent systems and connect them to the Composabl runtime for deployment
The examples and code samples in this documentation refer to our sample use cases. These examples are real-world use cases with complex goals and constraints. In each case, the Composabl team has built agents that exceed the benchmark control technology by orders of magnitude.
Learn more: Read the Composabl whitepaper about the industrial mixer use case.
The industrial mixer use case is a realistic case study of a chemical process control agent controlling a continuous stirred tank chemical reaction. The agent controls the temperature in a tank where a chemical reaction occurs to create a product.
As the chemicals are stirred together in the tank, the reaction produces heat at a nonlinear, unpredictable rate. If the tank isn’t cooled enough, it can reach dangerous temperatures, a condition called thermal runaway. If it’s cooled too much, not enough product will be produced. The agent needs to balance these two goals, keeping the tank at the right temperature at every moment to optimize production while ensuring safety.
Access perceptors, skills, and selectors for this use case.
Learn more: Read the Composabl whitepaper about the production scheduling use case.
The production scheduling use case is a complex production planning problem set in an industrial bakery. The agent must determine the right amount of cookies, cakes, and cupcakes to make each day, directing teams of workers and equipment and responding to fluctuations in costs, pricing, and demand.
The case study, developed in partnership with Composabl partner Rovisys, requires the agent to make a choice every minute between 24 possible combinations of equipment, task, employee and product, over the course of a 400-decision day, with the ultimate goal of maximizing profit.
Along with the strategy pattern and the perception pattern, the plan-execute pattern is one of the major design patterns of machine teaching. In this pattern, the skills work together in a skill group, with the first skill determining what the action should be and the second skill determining how to achieve it.
What is special about this agent is that it combines DRL and MPC, the technologies from the two single-skilled agents — the worst performers — to create the best performing agent.
In this example, the DRL skill first uses its powers of learning and experimentation to determine the goal temperature for the cooling jacket — the set point. It then passes this information on to the MPC skill, which uses its powers of control and execution to direct the agent on what action to take to achieve the desired temperature.
Remember how the strategy pattern is like a math class where each student solves the problems they are best at, as assigned by the teacher? In the plan-execute pattern, the students work in groups to solve problems together. Let’s say Student A is good at translating word problems into equations, while Student B is good at solving equations. Student A works on each problem first, and then passes it over to Student B, who produces the solution. No teacher is needed here, because the students divide each problem the same way.
Let's get started configuring this agent!
This agent has two skills, control_full_reaction and mpc-skill-group. We have already published control_full_reaction to our project, so we only need to publish mpc-skill-group to build this agent in the Agent Builder UI. To publish mpc-skill-group to your project, you will need to open up your favorite code editor and terminal. In your terminal, navigate to the skills folder of the Industrial Mixer repo and use this command with the Composabl CLI.
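For example, using the skill publish command from the CLI reference (the folder name below assumes it matches the skill):
composabl skill publish ./mpc-skill-group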
Return to the agent builder studio and refresh the page. The skills will appear in the skills menu on the left of your page.
All skills, perceptors, and selectors have a minimum of two files:
pyproject.toml, a config file with the following information.
A Python file that contains the code the agent will use. For this agent, we use a controller class with the following code and explanations in inline comments.
File Structure
pyproject.toml
controller.py
First drag the skill control_full_reaction from the left-hand side of the page to the skill container. Once it's there, drag the mpc-skill-group and make sure that it is dropped below the control_full_reaction skill and not beside it.
Now you are ready to train your agent and see the results. We suggest you run 50 training cycles.
When the training has been completed, you can view your results in the training sessions tab in the UI. This will show you information on how well the agent is learning.
Conversion rate: 95%
Thermal runaway risk: Very low
We tested this fully trained agent and plotted the results.
This agent is the best performer of the group. Combining two imperfect technologies together with Machine Teaching produces much better results than either technology achieves alone.
The DRL agent is a very simple agent design with only one skill. This agent does not use machine teaching to decompose the task into skills that can be trained separately. Instead, the entire reaction is controlled by a skill trained with deep reinforcement learning.
Let's get started configuring this agent!
This agent has a single skill called Control Full Reaction. To publish that skill to your project, you will need to open up your favorite code editor and terminal. In your terminal, navigate to the skills folder and use this command with the Composabl CLI.
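Using the skill publish command from the CLI reference, with <PATH> pointing at the skill folder:
composabl skill publish <PATH>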
Return to the agent builder studio and refresh the page. You will see the skill in the skills menu on the left of your page.
All skills, perceptors, and selectors have a minimum of two files: a Python file that contains the code the agent will use, and pyproject.toml, a config file with the following information.
Drag the skill control_reaction that you can now see on the left-hand side of your project onto the Skill Layer.
Now you are ready to train your agent and see the results. We suggest you run 50 training cycles.
When the training has been completed, you can view your results in the training sessions tab in the UI. This will show you information on how well the agent is learning.
You will most likely see a steep learning curve as the agent experiments with different control strategies and learns from the results. When the learning curve plateaus, that usually means that the skill is trained.
Conversion rate: 90%
Thermal runaway risk: Low
We tested this fully trained agent and plotted the results.
The DRL agent performs well. Its relatively thin shadow means that it performs consistently over different conditions, and it stays within the safety threshold almost every time.
This agent controls the initial steady state very well, staying right on the benchmark line. But during the transition, the DRL agent goes off the benchmark line quite a bit. It doesn't notice right away when the transition phase begins, staying too long in the lower region of the graph, and then overcorrecting. That's because DRL works by experimentation, teaching itself how to get results by exploring every possible way to tackle a problem. It has no prior knowledge or understanding of a situation and relies entirely on trial and error. That means that it is potentially well suited to complex processes – like the transition phase - that can’t easily be represented mathematically.
But its behavior is erratic because it can’t distinguish between the phases. The DRL agent’s skills do better than the traditional automation benchmark but still leave some room for improvement.
This tutorial will take you through the process of building agents for a realistic chemical manufacturing use case.
Learn about the use case (this page)
Experiment with your own designs
Mixing and blending tasks are great use cases for intelligent agents because they are both complex and high-value. In the example in this tutorial, improvements in process could lead to millions in ROI.
In the industrial mixer use case, raw materials are stirred together inside a tank, undergoing a reaction that produces the desired end product.
The goal of the process is to convert as much of the raw material as possible. But as the chemicals mix and the conversion occurs, the tank heats up. If the temperature gets too high, a condition called “thermal runaway” occurs, causing explosions and fires.
To produce as much chemical as possible, the operator must constantly adjust the temperature in the tank, keeping it high enough to allow productivity but low enough to avoid any thermal runaway risk.
As in all machine teaching use cases, this process can be summarized in the form of two separate goals that must be balanced against each other:
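Maximize yield (make as much product as possible)
Avoid thermal runaway (prevent dangerous overheating)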
The process is controlled by adjusting the mixture's temperature in the tank using a "jacket" filled with coolant. Lowering the coolant temperature in the jacket lowers the temperature in the tank, decreasing the risk of thermal runaway.
However, cooling the tank can also reduce yield. By how much? The answer varies unpredictably – temperature changes affect chemical concentration differently at different parts of the reaction. That nonlinear relationship between temperature and yield is why this is a nuanced process that benefits so much from intelligent automation.
The industrial mixer simulator uses principles of chemistry to model the behavior of the system in real life.
The current automation solution is a linear MPC controller. As the current solution, this agent's performance is the benchmark for the other designs.
The image below shows an MPC controller represented in the visual system of agent designs. The agent takes in sensor information about the temperature in the tank and the concentrations of the chemicals.
It passes that information to the skills layer of the agent. The skills layer contains a single programmed skill: control reactor. This skill uses a mathematical model to determine the desired temperature set point for the tank. It also determines the control actions to take to achieve that temperature using the cooling jacket, and outputs those actions as decisions.
In simulation, this agent's conversion rate was 82%. That means that 82% of the reagents were turned into product, with 18% waste, but the risk of thermal runaway is high.
Like all technologies, model predictive control has a “personality,” a unique set of strengths and weaknesses that can be seen reflected in this performance. MPC is a rule follower. It is a way of controlling a process using mathematical relationships that can be fully mapped and programmed by human engineers. It works well in situations that are straightforward and linear.
The agent does a good job at the start, in the first steady state. However, as the transition phase begins, the agent’s performance starts to fail. The wide shadow shows that this agent can’t adapt to the unpredictable conditions in the transition. Its performance becomes dangerously inconsistent, potentially allowing the temperature to exceed the thermal runaway checkpoint at nearly every point in the reaction.
When an MPC controller is used to control this process in the real world, a human operator needs to step in and take over control before the automated system lets the temperature cross the thermal runaway threshold.
The following tutorials will walk you through the anatomy of some multi-agent systems that take different approaches to learning how to control the temperature of a mixer more effectively to maximize yield and avoid thermal runaway.
A Python file. For this agent, we use a controller class with the following code and explanations in inline comments.
We wrote a white paper about this use case. Read the white paper to learn about the problem, the agent designs, and agent performance. Use this tutorial to build and train all the agents for yourself.
This getting started guide takes you through the steps you will need to complete before you can start building your teams of agents.
A Composabl account
Select the simulator you will use for the project and the assigned team. If you haven't created a team yet, you will need to do that before you create your first project.
Then, you can use the AI prompt to explain what you are attempting to achieve with your project and have it bootstrap the project based on your description, or you can tell the AI what to name your project. I'm going to ask it to call the project Chemical Process Control.
Click Create Project and Finish.
To finish setting up your project, set your top-level project goals.
Now, we will set the goals for this example. In this project, we want to:
Maximize yield (make as much product as we can)
Avoid thermal runaway (prevent dangerous overheating)
Fill out the goal title and description for the top level goals.
Goal Title: Chemical Process Control Goals
Description: The goal for this project is to maximize yield and avoid thermal runaway by keeping the temperature below 400 Kelvin
We are going to add two conditions to represent the two goals.
To maximize yield, the objective is Maximize and the variable is Eps_Yield.
To avoid thermal runaway, the objective is Avoid, the variable is T, and the target is 400.
Now you can save your changes.
Now that you have your project created and your goals set you're ready to start creating agents!
The simulator is the part of the agent ecosystem that tells you what will happen when you take a certain action. Whether based on historical data, physical principles and math, or other methods, the simulation environment models the real system and allows your agents to train.
To look under the hood and see how a simulator works, explore Composabl's public Python simulators hosted on Docker Hub.
You can build and train agents using these simulations to explore Composabl, educate yourself and your team, and create demonstrations and POCs. These are relatively simple simulations that can train agents quickly.
The best way to access these simulations is through the CLI.
To see the options available:
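composabl sim list
(This is the sim list command from the CLI reference; it lists the sims available to your organization.)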
To connect to one of the simulators:
If you have a simulator, this section of the documentation will explain how to connect it to Composabl so that you can use it to train agents.
If you don’t have a simulator, you may want to work with a Composabl partner to help you develop and connect one.
Multi-agent systems have structure, just like our brains. Different parts perform different functions.
This modularity is a key to building high performing agentic systems quickly and effectively. Most complex systems, from factories to rockets to software programs, are broken into modular pieces to separate concerns, reuse design patterns, and integrate pre-built components. Like these systems, decision-making AI also works best when decisions are separated into modular concerns. Modular structure makes intelligent agents easier to build, test, and maintain.
Multi-agent systems work through a feedback loop. The system gets information from the simulation or real environment and then makes a decision to adjust one or more of its control variables in some way. This creates a change that is reflected in new sensor information that the agent can then process and use to decide on the next adjustment.
While agents are being built and trained, they are connected to simulators that give them feedback about how their actions affect the system they are learning to control. Once an agent is deployed, the simulator is replaced by an equivalent connection to the real system.
The simulator and the multi-agent system function in a continuous feedback loop, with the system outputting actions that affect the simulation environment, and the simulator returning new information to the agent based on every decision.
Sensors are the part of the multi-agent system that takes in information from the simulator or the real environment about conditions and about the results of the agent's actions. They are the equivalent of human senses – the eyes and ears. Sensor readings come from specific variables in the simulation or control system that measure various aspects of the state space.
Decisions are the system's outputs, adjusting one or more of the control variables to control the process dynamically throughout the performance of the task. Each decision during training is known as an iteration, while a series of iterations to complete the task is called an episode.
The team of agents is organized in layers, with information and decisions passing through each layer in sequence.
The skills layer, or decision-making layer, is the heart of a multi-agent system. This is where the system makes its control decisions. When a specific skill agent is activated, it will determine the control action the system should take.
Multi-agent systems contain multiple agents orchestrated together according to the design. Skill agents can work together in groups, sequences, hierarchies, or coordinated teams. This modularity allows the agent to learn more efficiently and perform better.
You can imagine skill agents being like students on a math team who are working together to complete a set of problems. Each student performs best solving a particular kind of problem: one is good at fractions and one at decimals. Depending on the type of problem, the appropriate student will use their expertise to solve the problem and produce the answer for the team. Sometimes one student might handle the entire problem, and for other problems more than one student might need to work together.
Just as different students’ capabilities make them able to solve particular problems, different skill agents may make use of different technologies. Some types of decisions are best approached through agents that can be programmed with mathematical calculations, rules, or optimization algorithms. Others that are more complex and nonlinear can be trained using deep reinforcement learning.
For an HVAC system regulating temperature in an office building:
Control the system during the day
Control the system at night
For a factory where responses are needed to different types of alarms:
Handle safety critical alarms (programmed with rules)
Handle simple alarms (programmed with calculations)
Handle complex alarms (learned with reinforcement learning)
For a drone autopilot:
Stabilize
Get to landing zone
Land
Avoid obstacles
For a robotic arm used to grab and stack objects:
Reach (extend the robot arm from the "elbow" and "wrist")
Move (move the arm laterally using the "shoulder")
Orient (turn the "wrist" to position the "hand")
Grasp (Manipulate the "fingers" to clamp down)
Stack (Move laterally while grasping)
Orchestrator skills are the supervisors for your agent. In the math class analogy, the orchestrator would be like the teacher. The teacher assesses the type of problem and assigns the right student.
In a multi-agent system, an orchestrator uses information from the sensors and perceptors to understand the scenario and then determine which skill agent is needed. Once the skill agent is called into service, it makes the decision.
For example, for HVAC control, an orchestrator would determine whether day or night control is needed, and then pass control to the appropriate skill agent. In the safety alarm example, the orchestrator determines the type of alarm and then passes the decision to the right skill agent. In the drone and robotic arm examples, the skills need to be performed in sequence. In these cases, the orchestrator assigns the appropriate skill agent as needed for each step in the process.
A perception layer is an optional enhancement layer. Perceptors process and translate sensor data into a format that can be used to make decisions. In other words, the perception layer inputs the sensor variables and outputs new variables deduced by calculation, machine learning, or other programming.
For example, if we design a team of agents as an autopilot for a drone, we might have sensor variables that measure pitch, yaw, roll (position of the drone), velocity in each of those three directions, and acceleration in each of those three directions. But what about stability? Stability is an important characteristic to understand while flying a drone, but there is no sensor variable that describes stability. It is too complex to be captured by a single sensor.
The perception layer allows us to create a variable for stability. It can be calculated using dynamics equations or trained with supervised machine learning. The new variable then becomes accessible to the rest of the multi-agent system along with the other sensor variables.
Computer Vision: A camera sensor passes image or video feeds into a perceptor module that identifies object types and locations
Auditory Perception: A microphone sensor passes machine sounds to a perceptor module that identifies which state the machine is in based on the sounds that it is making
Prediction: A perceptor module inputs quality measurements and past agent actions and predicts whether current actions will lead to acceptable quality measurements
Anomaly Detection: A perceptor module inputs market variables and detects when the market is changing regimes.
Classification and Clustering: A perceptor module inputs machine and process data and classifies which of several conditions a manufacturing line is currently in.
Skills are the foundational building blocks for your intelligent agent. They take action to achieve goals in key scenarios where your agent needs to succeed. To build an agent with Machine Teaching, you will create multiple skills and then orchestrate them together.
You can use three different types of skill within Composabl.
For learned skills, you can use either the UI or the SDK to create teachers, but the SDK includes some fine-tuning options that are not available in the UI.
Controllers for programmed skills can be created only through the SDK. They can then be published to the UI for use in agents.
You can use either the UI or the SDK to create selectors.
The first thing you will do is connect a simulator in the Composabl UI.
See the simulator documentation for more information about finding a simulation partner.
Create a teacher to implement a learned skill that uses DRL to make decisions. You can set up your teacher to use goals and constraints or to use rewards. The skill will then train and practice in simulation until it can make decisions successfully.
Create a controller to implement a programmed skill that makes decisions based on programming rather than learning. A controller can use math, rules, optimization, or any other technology that can be expressed in a Python program. Configure controllers in the SDK and publish them to the UI to use in agents.
Create a selector to orchestrate skills together. Selectors are specialized skills that direct the agent how to choose between different decision-making skills. Selectors can be either learned or programmed.
You can also create skills in two different ways, using the UI and using the SDK. If you use the SDK, you can then publish skills to the UI to be included in agent designs.
The strategy pattern agent performs well, but it's not perfect in avoiding thermal runaway. One good way to address that is to add a perception layer.
Perceptors are special skills that process and interpret sensor data before passing it to the rest of the agent. To improve the strategy pattern's performance on temperature control, you can add a perception layer that uses machine learning to predict thermal runaway. This ML model is trained to interpret the sensor data and check for conditions that might indicate an elevated risk of thermal runaway, and then pass that information to the selector along with the rest of the sensor data.
Think of the perception layer as an additional set of senses that helps the agent predict when something might go wrong, like a teacher monitoring the class for early signs of trouble.
Let's get started configuring this agent!
This agent has a perceptor skill called thermal_runaway_predictor. To publish it to your project, you will need to open up your favorite code editor and terminal. In your terminal, navigate to the perceptors folder and use this command with the Composabl CLI.
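Using the perceptor publish command from the CLI reference, with <PATH> pointing at the perceptor folder:
composabl perceptor publish <PATH>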
Return to the agent builder studio and refresh the page. The skill will appear in the skills menu on the left side of your page.
All skills, perceptors, and selectors have a minimum of two files: a Python file that contains the code the agent will use, and a config file. Perceptors have some additional files to load ML models and other Python packages.
File Structure
pyproject.toml
thermal_runaway_predictor.py
Drag the perceptor thermal_runaway_predictor that you can now see on the left-hand side of your project onto the perception layer.
Now you are ready to train your agent and see the results. We suggest you run 50 training cycles.
When the training has been completed, you can view your results in the training sessions tab in the UI. This will show you information on how well the agent is learning.
The agent training results will be a little bit different from the strategy pattern alone. That's because the thermal runaway predictor is making a difference in how the agent performs.
Conversion rate: 92%
Thermal runaway risk: Very low
We tested this fully trained agent and plotted the results.
Adding perception improves agent temperature control performance.
The red lines on the graph show where the perceptor helped the agent make adjustments to avoid thermal runaway. This agent gets the same yield as the strategy pattern agent, but the improved temperature control has reduced the thermal runaway incidents from low to zero.
In this tutorial, we will learn how to upload simulators via the Composabl CLI, as well as how to upload your custom, adapted simulators to the Composabl UI.
To upload a simulator that already follows the Composabl simulation specification, all you need to do is run the following command from the simulation folder:
After that, you can go to the Composabl editor and connect that sim to any project.
If your simulator isn't already compatible with the Composabl platform, you will need to create gRPC bindings and upload it as a Docker image in the Composabl editor. You can follow along with these Composabl API patterns and the following structure:
Going through the files:
docker/entrypoint.sh: This file is the entrypoint of your Docker container.
src/exceptions/invalid_usage.py: This file contains the exception class InvalidUsage that is used to raise exceptions in the simulator.
src/__init__.py: This file is the initialization file of the module. No code is needed here, but for Python to recognize the folder as a module, this file is necessary.
src/main.py: This file is the main file of the simulator. It uses the composabl_core.networking module to expose the simulator to the Composabl platform. This file is also available zipped along with this tutorial.
src/server_impl.py: This file contains the implementation of the server that will be used to run the simulator.
src/sim.py: This file contains your implementation of the simulator itself. Usually, an Env class (inheriting from gym.Env) is implemented here, and it is used to run the simulator.
Dockerfile: This file is the Dockerfile that will be used to build the Docker image of your simulator.
requirements.txt: This file contains the Python packages that are necessary to run your simulator. It is used to install the necessary packages in the Docker image.
Under the hood, the Composabl SDK uses gRPC to communicate with the Composabl platform.
To create a simulator that works with the Composabl API, you have to implement the following gRPC methods:
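These correspond to the ServerComposabl methods documented later in this section: Make, Sensor_space_info, Action_space_info, Action_space_sample, Reset, Step, Close, Set_Scenario, Get_Scenario, and Get_render.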
After making sure that your simulator is compatible with the Composabl platform, you can proceed to the next section.
Go to the folder where your simulator is located and take a look at what is inside it.
After that, ensure that you have Docker installed. You can check if Docker is installed by running the following command:
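docker --version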
To log in to Docker Hub, run the following command:
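docker login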
You will be prompted to enter your Docker Hub username and password. After that, you should see a message saying that you are logged in.
Now, we can build the Docker image of the simulator. To do so, run the following command:
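docker build -t <your-docker-hub-username>/<simulator-name> .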
This command will build the Docker image of the simulator. The -t flag is used to tag the image with the name <your-docker-hub-username>/<simulator-name>. The . at the end of the command indicates that the Dockerfile is in the current directory.
After building the image, you can check if it was built successfully by running the following command:
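docker images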
And then push the image to Docker Hub:
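docker push <your-docker-hub-username>/<simulator-name>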
After that, you can go to the Composabl UI and upload your simulator. To do so, follow the steps below:
Then, on the left sidebar, click on the "Simulators" tab. You should see a list of simulators that are already available on the platform.
On the top right corner, you should see a button to "New Simulator". Click on it.
A pop-up will appear, asking you to select between "Internal" and "External" simulators. Select "External".
Then, you can fill in the Title and Description of the simulator. We suggest a short name for the Title and a more detailed description for the Description.
After clicking Next Step, a brief tutorial will open. Read it carefully and then click "Next Step" again.
After that, you can fill in the Docker Image field with the name of the Docker image you pushed to Docker Hub: <your-docker-hub-username>/<simulator-name>.
If the image is public, no more fields are needed. If the image is private, you need to fill in the Docker Username and Docker Password fields with your Docker Hub username and password, respectively. Then, click on "Validate and next step".
If you've followed all the steps correctly, you should have successfully uploaded your simulator to the Composabl UI. You can now use your simulator to train agents and run simulations on the platform. If you have any questions or need help, feel free to reach out to us.
Controllers are programmed skills used in Composabl agents. These may be optimizers, PID or MPC controllers, heuristics, or any other Python algorithms that are useful to control parts of a process. Configure controllers in the SDK and then publish them to the UI to use in agents.
To create a skill in the Python SDK, begin by logging in to the SDK by typing composabl login in the CLI. Then type composabl skill new.
Give the skill a name and a description in response to the prompts that follow. Choose your skill type as controller (a programmed module like an optimization algorithm or MPC controller).
Specify the folder where you’d like to create the skill.
The Composabl SDK will create a folder and a Python controller.py file from the template.
The Python controller class offers several functions that you can use to build your algorithms or rule-based controllers in Composabl.
Controllers don't need training, since they are based on programming rather than learning, but they include functions that connect them to the rest of the agent during training.
The __init__ function
The __init__ function can be used to initialize your algorithm and define initial configuration variables; it is called once when the runtime starts. Let's suppose that you want to use an MPC algorithm:
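A minimal sketch (the import path is an assumption, and the MPC internals are reduced to placeholder configuration values):

```python
from composabl import Controller  # assumed import path

class MPCController(Controller):
    def __init__(self):
        # Called once when the runtime starts: initialize the algorithm
        # and define initial configuration variables.
        self.setpoint = 311.0  # hypothetical target temperature (K)
        self.horizon = 10      # hypothetical MPC prediction horizon
```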
The compute_action function
The compute_action function processes the observation and returns an action. This function returns a number that represents the action that will be sent to the simulation.
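Continuing the hypothetical controller above (treating the observation as a dict keyed by sensor name is an assumption; a proportional step stands in for a real MPC solve):

```python
    def compute_action(self, obs):
        # Process the observation and return the action to send to
        # the simulation.
        error = self.setpoint - obs["T"]
        return 0.05 * error
```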
The compute_termination function
The compute_termination function tells the Composabl platform when to terminate a practice episode and start over with a new practice scenario (episode). From a controller perspective, it makes the most sense to terminate an episode when the agent succeeds, fails, or is pursuing a course of action that you do not find likely to succeed. This function returns a Boolean flag (True or False) indicating whether to terminate the episode. You can calculate this criterion however seems best.
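For example, continuing the hypothetical controller above, you might terminate once the agent has clearly failed:

```python
    def compute_termination(self, obs, action):
        # End the episode on failure: crossing the thermal runaway
        # threshold used in this project.
        return obs["T"] >= 400
```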
The success_criteria function
The success_criteria function provides a definition of skill success and a proxy for how completely the agent has learned the skill. The platform uses the output of this function (True or False) to calculate when to stop training one skill and move on to training the next skill. It is also used to determine when to move to the next skill in a fixed-order sequence. The agent cannot move from one skill in a fixed-order sequence to the next until the success criteria for that skill are reached.
Here are some examples of success criteria definitions:
A simple but naive success criterion might return True if a certain sensor or KPI value for an episode or scenario crosses a threshold, and False if it does not.
A complex success criterion might compare the agent against a benchmark controller or another agent across many key variables and trials. It returns True if the agent beats the benchmark on these criteria, and False otherwise.
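A sketch of the simple, threshold-based variety, continuing the hypothetical controller above (the method signature is an assumption):

```python
    def success_criteria(self, obs, action):
        # Success when the temperature tracks the setpoint closely.
        return abs(obs["T"] - self.setpoint) < 1.0
```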
As information passes through perceptors, skills, and selectors in the agent, sometimes it needs to change format along the way. You can use three teaching functions to transform sensor and action variables inside agents: transform_sensors, transform_action, and filtered_sensor_space.
To transform sensor variables, use the transform_sensors function to calculate changes to specific sensors, then return the complete set of sensor variables (the observation space).
Two of the most common reasons for transforming sensor variables are conversion and normalization. For example, if a simulator reports temperature values in Fahrenheit, but the agent expects temperature values in Celsius, use the transform_sensors function to convert between the two.
Normalization is when you transform variables into different ranges. For example, one sensor variable in your agent might have very large values (in the thousands), but another variable might have small values (in the tenths), so you might use the transform_sensors function to transform these disparate sensor values to a range from 0 to 1 so that they can better be compared and used in the agent.
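A sketch of the conversion case described above (assuming a dict-like observation keyed by sensor name):

```python
    def transform_sensors(self, obs):
        # Convert a Fahrenheit reading to Celsius, then return the
        # complete set of sensor variables (the observation space).
        obs["T"] = (obs["T"] - 32) * 5.0 / 9.0
        return obs
```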
You may want to transform action variables for the same reasons as sensor variables.
Use the filtered_sensor_space function to pare down the list of sensor variables you need for a particular skill. Pass only the information that a skill or module needs in order to learn or perform well.
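For example (sensor names other than T are illustrative):

```python
    def filtered_sensor_space(self):
        # Pass only the sensors this skill needs to learn or perform.
        return ["T", "Tc", "Ca"]
```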
Scenarios are situations where your agent needs to behave differently to succeed.
Scenarios are created by carving out sections of the simulation space, as defined by specific configurations of variables and variable ranges. This allows you to train skills only in the scenarios where they will need to perform, leading to more efficient training and better performance. Selectors learn to recognize scenarios and pass control to the appropriate specialist skill to handle the scenario’s demands.
How you define scenarios depends on the type of sensor variables you are working with.
Discrete variables are categories that describe a particular scenario. For each scenario, there is likely a perceptor in the agent (a machine learning model for example) that inputs the sensors, processes the sensor values, and outputs the discrete variable category.
Continuous variables are numbers. When a single value is used to define a scenario, that value defines the section of the space that your agent will explore for decisions.
Sometimes a scenario is better defined by a range of continuous values than by a single continuous value. In that case, the scenario would be defined by a continuous variable range.
Here are some examples of how scenarios can be defined in different ways:
Let’s say that we are creating an agent to plan the operations of a restaurant with varying demand for three different recipes. We could create three different scenarios:
Low demand: 30 recipe A, 20 recipe B, and 10 recipe C
Normal demand: 60 recipe A, 45 recipe B, and 10 recipe C
High demand: 100 recipe A, 50 recipe B, and 25 recipe C
Once you have defined these scenarios, you could set the agent to train in the low-demand scenario until the success criteria are reached. Then it would move to normal demand, and then to high demand. We would expect the agent to learn faster as it progressed through the scenarios, with cumulative knowledge building over time.
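As a sketch, these scenarios could be expressed as configurations of sensor variables (the variable names are illustrative):

```python
# Hypothetical sketch: each scenario as a configuration of sensor variables.
scenarios = [
    {"recipe_A": 30, "recipe_B": 20, "recipe_C": 10},   # low demand
    {"recipe_A": 60, "recipe_B": 45, "recipe_C": 10},   # normal demand
    {"recipe_A": 100, "recipe_B": 50, "recipe_C": 25},  # high demand
]
```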
Add scenarios to your project by clicking on Scenarios from the lefthand menu to open the Scenarios page.
Click Add scenario to create a new scenario. Give your scenario a name and then click Add condition to configure it. You will then have the option to choose from any of the available sensor variables and apply conditions to them.
For continuous variables, select Is and enter the exact value.
For continuous variable ranges, select Is between and enter the range.
For discrete variables, select Is element of and enter the possible values.
In addition to identifying scenarios, Composabl also lets you teach your agents about how scenarios relate to each other. To do this, you can build scenario flows to establish sequences of scenarios for the agent to practice.
Without scenario flows, the training platform will connect scenarios at random. But sometimes it is important for agents to practice scenarios in a specific order. For example, it might be important to practice flying a drone in high wind and then landing in the same conditions. Or an agent managing a production line might need to practice going from a scenario in which raw materials are scarce to one in which material costs rapidly rise. By creating a scenario flow, you can ensure that a given sequence of scenarios will be included in training.
Once you have scenarios built, create sequential sections of the task by clicking Add section.
After creating all the sections you need, drag and drop the scenarios into the sections.
Create as many flows as you need to capture the different scenario sequences your agent will need to navigate.
Scenarios are added to agents as part of configuring skills. Scenarios are added to skills so that skills know what specific conditions they need to master. Not all scenarios are relevant to all skills. For example, a drone landing skill doesn’t need to practice taking off in windy weather.
The configurations you set up when building the scenario flows will appear in the skill configuration modal. For each section of the process, as defined by the flows, tell the selector which scenarios it should apply by checking the boxes next to each scenario.
A project is a collection of agents for the same use case. The best way to use Composabl is to build multiple agents within the same project using different design patterns and variations. This allows you to iterate and improve your agents to get the best possible performance.
All agents within a project share the same goals and the same simulator.
To create a project, click on New Project in the upper right hand corner of your dashboard.
You'll be prompted to enter the simulator associated with your project and then choose your team.
You also have the option to use Composabl's AI assistant to help with project set up. This specialized AI assistant is designed to help solve the "cold start" problem and get you started with designing your agent more quickly.
Based on your written description of your project, the AI will automatically create customized goals and skills within your project. You can then access those goals and skills in the Agent Builder studio and edit and adjust them as needed.
When prompting the assistant, use as much detail as you can. You can tell the AI:
What the problem or use case is
What equipment or process the agent will control
What the overall goal is
Anything you know about phases of the process or different scenarios
You can integrate your simulator with the Composabl SDK by using the ServerComposabl class. This class provides the necessary methods for the simulator to interact with the Composabl SDK.
The methods of the ServerComposabl class allow the Composabl SDK to automatically take care of serializing and deserializing the different requests and responses.
To conform your simulator to the Composabl SDK, you must define a server implementation class whose methods define how to talk with the Composabl SDK.
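A skeletal sketch of such a class, using the method names from the reference below (the base-class import path and exact signatures are assumptions):

```python
from composabl_core.networking import ServerComposabl  # assumed import path

class ServerImpl(ServerComposabl):
    def Make(self, env_id, env_init):
        """Create a new environment instance per the request."""

    def Sensor_space_info(self):
        """Describe the environment's sensor space."""

    def Action_space_info(self):
        """Describe the agent's action space."""

    def Action_space_sample(self):
        """Return a sample element of the action space."""

    def Reset(self):
        """Reset the environment; return the first observation and info."""

    def Step(self, action):
        """Apply an action; return observation, reward, terminated, truncated, info."""

    def Close(self):
        """Clean up when the simulator is done being used."""

    def Set_Scenario(self, scenario):
        """Set the scenario the simulation should run."""

    def Get_Scenario(self):
        """Return the scenario the simulation is currently running."""
```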
Make
Make is a request to create a new instance of the environment with the specifications requested.
string env_Id;
Identifier for the type of environment to create.
dictionary env_Init;
Initial configuration for the environment, as defined within the runtime configuration.
Sensor_space_info
Sensor_space_info provides details about the environment's sensor space.
Action_space_info
Action_space_info defines the agent's action space.
Action_space_sample
The action_space_sample function returns an element of the simulator's action space.
Reset
Reset is a request to reset the environment, and returns the first observation of the newly reset environment.
observation
Initial observation of the environment.
Dictionary info
Additional information about the reset environment.
Step
Step provides the agent action to be applied to the environment. The return structure is as follows:
observation
The observation following the action.
float reward
The reward received after taking the action.
bool terminated
Whether the episode has ended.
bool truncated
Whether the episode was truncated before a natural conclusion.
Dictionary info
Additional information about the step.
Close
Close denotes the simulator is done being used and may perform any necessary cleanup.
Set_Scenario
Set_Scenario sets the scenario for the simulation to run.
Get_Scenario
Get_Scenario returns the scenario that the simulation is currently running.
Get_render
Get_render provides the current rendered image of the environment, either as a NumPy array or a string.
In this tutorial, we will walk through how to integrate a trained machine learning (ML) model into your Composabl agent as a Perceptor. A perceptor allows your agent to interpret data from sensors, process it using a machine learning model, and output new variables that will help the agent make better decisions.
The goal is to publish a pre-trained ML model as a perceptor that adds a new layer of perception to your agent, enabling it to process sensor data in a more advanced way. This could be useful in a variety of scenarios, such as predictive maintenance, anomaly detection, or autonomous decision-making.
A Perceptor in Composabl is a module in the perception layer that inputs sensor data, processes it (potentially using an ML model), and outputs new variables that are automatically added to the list of available sensors.
For this example, let’s assume we are building a perceptor that uses a trained machine learning model to predict thermal runaway in a system.
We will use a pre-trained ML model stored as a pickle file to predict thermal runaway based on certain temperature and chemical sensor readings. Here’s how to set up the trained ML model for use as a perceptor.
Store the ML Model: Assume the ML model has been trained and saved as a .pkl file. For this example, the model is stored in the path ml_models/ml_predict_temperature.pkl.
Load the ML Model in the Perceptor: In the perceptor class, we will load the model and define how it processes the sensor data.
Now, we’ll create the perceptor using the trained ML model to process the sensor data and predict thermal runaway events. The perceptor will be responsible for calling the model and returning the prediction as a new sensor variable.
We can start by creating the perceptor using the Composabl CLI with the following command:
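composabl perceptor new --name <NAME> --description <DESCRIPTION> --location <PATH>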
The new perceptor will have the following file structure:
Here’s the Python code to create the perceptor:
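A sketch of what that code might look like (the Perceptor import path and the sensor names are assumptions; the compute() method and the thermal_runaway_predict output follow this tutorial):

```python
import pickle

from composabl import Perceptor  # assumed import path

class ThermalRunawayPredictor(Perceptor):
    def __init__(self):
        # Load the pre-trained ML model from the pickle file.
        with open("ml_models/ml_predict_temperature.pkl", "rb") as f:
            self.model = pickle.load(f)

    def compute(self, obs):
        # Assemble the features the model expects from the sensor data
        # (sensor names here are illustrative).
        features = [[obs["T"], obs["Tc"], obs["Ca"]]]
        prediction = self.model.predict(features)[0]
        # Expose the prediction as a new sensor variable.
        return {"thermal_runaway_predict": float(prediction)}
```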
In this perceptor:
We load the trained machine learning model from a pickle file.
The compute() method takes in sensor data (e.g., temperature, chemical concentrations), processes it, and uses the ML model to predict whether a thermal runaway event will occur.
The perceptor outputs the prediction as a new sensor variable, thermal_runaway_predict.
Once the perceptor is defined, you can log in to the Composabl editor and add it to your agent.
In this tutorial, we covered how to publish a trained ML model as a perceptor in Composabl. This allows the agent to integrate more advanced decision-making by processing raw sensor data through a machine learning model and outputting predictions as new sensor variables. This method can be applied in various domains, such as predictive maintenance, anomaly detection, and control systems.
Skills can be arranged in sequences or hierarchies, in skill groups, or as coordinated skills that output multiple decisions together. The orchestration structures reflect common design patterns that can be used to accelerate the design and creation of agents.
For some agent designs, the task will be broken down into different skills that each control the system under certain conditions. For these agents, a special skill called a selector chooses the right skill at the right time to accomplish the task. Selectors are the specialized supervisor skills that orchestrate the skills together, determining which skill to activate based on the conditions the system needs to respond to.
To add a selector to an agent, drag the selector into your agent above the skills.
You will then be prompted to configure the selector.
You will also be prompted to choose between an additional set of options that correspond to two separate Machine Teaching design patterns.
Fixed-order sequence: perform the skills in a set order. This is used in the functional pattern, a design pattern that is useful for tasks that involve fixed sequences of actions.
Variable-order sequence: perform the skills in any order based on the selector's determination. This is used in the strategy pattern, a design pattern that is useful for tasks that require different control strategies to be used in different situations or conditions.
Unlike agent designs that use a selector to assign control to skills one at a time, agents with skill groups use skills working together to make decisions.
Skill groups always consist of two skills. To create a skill group, simply drag the second skill under the first, and a skill group will automatically be created.
Skill groups are used for the plan-execute pattern, where one skill determines what the action should be and a second skill then "turns the knobs" to implement the decision.
In the industrial mixer example, the DRL skill is able to train effectively because the actions of the MPC controller are predictable. That means that it can practice and learn knowing that variations in performance are due to its own actions.
In agents with multiple DRL skills arranged in plan-execute patterns, Composabl will always train the skills from the bottom to the top. In other words, the execute skill will have to achieve competence before the plan skill will start training. That allows each skill to effectively interpret the feedback from the system without confusion from each other.
Some tasks require multiple skills to work together on a single decision, but in parallel rather than in sequence. Agents for these tasks use coordinated skills that learn to take action together toward a goal. Also known as Multi-Agent Training, coordinated skills are trained using a coach, rather than a teacher.
Traffic Optimization: Enhancing traffic flow and safety by teaching individual vehicles to navigate optimally and cooperate with each other.
Collaborative Robotics: Enabling robots to work together on tasks such as assembly in manufacturing or coordination in logistics.
Smart Grids: Optimizing energy distribution by having agents represent power plants, storage, and consumers to improve efficiency and stability.
Multiplayer Games: Creating adaptive and intelligent NPCs that can offer dynamic challenges to players in competitive or cooperative game settings.
Communication Networks: Improving network performance by optimizing resource allocation and traffic routing through agents representing network components.
Environmental Management: Balancing economic, ecological, and social goals in land use and resource management by simulating stakeholders as agents.
Healthcare Logistics: Strategizing resource allocation and treatment plans in scenarios like pandemics by considering the actions of hospitals, pharmacies, and patients as agents.
Supply Chain Optimization: Minimizing costs and delivery times in supply chains by coordinating agents representing various stages of the supply chain process.
Coordinated skills are not yet available in the UI. In the SDK, the API has been extended to integrate coordinated skills through the add_coordinated_skill method on your agent. This method accepts a new class named CoordinatedSkill. Just as with the Teacher and Controller classes, you implement this class by inheriting from the Coach class.
The coordinated skill takes the incoming observation and action spaces and passes them to the sub-skills as a shared environment for observation and action. The sub-skills then return their own observations and actions, which are passed back to the coordinated skill. The coordinated skill combines them and returns them to the agent.
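As a rough sketch of how this wiring might look (the constructor arguments and import paths below are assumptions, not the SDK's confirmed signature):

```python
from composabl import Coach, CoordinatedSkill  # import path assumed

class MixerCoach(Coach):
    # Reward the sub-skills for the quality of their combined decision
    def compute_reward(self, transformed_obs, action, sim_reward):
        return sim_reward  # placeholder: pass the simulator reward through

coordinated = CoordinatedSkill(
    "coordinated_control",      # skill name (argument form assumed)
    MixerCoach,                 # the coach class that scores joint behavior
    skills=[skill_a, skill_b],  # sub-skills that decide in parallel (assumed)
)
agent.add_coordinated_skill(coordinated)
```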
The performance goal is the most important KPI or metric you will use to evaluate your agent’s success.
This goal directs your agent as it trains. The AI learning technology within the agent will reward the agent when it gets closer to the goal, helping it to improve.
To edit a goal created by the copilot, or to create a new goal, follow these steps.
Navigate to the project page
Click Set up goal and enter a name and description for your goal.
Click Add condition to define the goal.
Use the dropdown menus to select the variables and then define the parameters for each goal you want to include.
Goals apply to one of the sensor variables, and are defined using one of five possible directives:
Avoid: Keep the variable from reaching a specified value
Maximize: Maximize the value of the variable
Minimize: Minimize the value of the variable
Approach: Get the value to the target range as quickly as possible
Maintain: Keep the variable at a specified value
You can also use advanced goal settings to fine-tune your goals. Access the advanced settings by clicking on the settings icon associated with one of the goals. You'll then see additional options to configure your goal.
Tolerance only applies to the three objective types that include a target: Avoid, Approach, and Maintain. This setting allows you to tell the agent to accept a range of values around the target as successful performance. You might use this to prevent the agent from using too much compute power trying to get from good enough to perfect.
Stop value allows you to tell the agent to end the training episode when the variable reaches the target value. This could be because the agent has succeeded and the process is complete, or because it has failed and needs to try again.
Stop steps allows you to tell the agent to end the training episode after a certain number of iterations.
Boundaries are for normalizing rewards. This is useful when the problem has variables that are very different, which can otherwise make it difficult for the trainer to calculate reward. For example, one sensor variable in your agent might have very large values (in the thousands), but another variable might have small values (in the tenths), so you might use boundaries to allow the agent to better compare the two variables.
Scale allows you to provide relative weights to goals to account for goals that have different levels of importance or priority. This is very difficult to get right with Machine Teaching and should be handled with care.
If Docker is installed, you should see the Docker version. If not, you can install Docker by following the instructions on the Docker website.
Then, before building the image, ensure that you have a Docker Hub account. If you don't have one, you can create one on the Docker Hub website.
Go to the Composabl UI at https://app.composabl.com. You'll be asked to log in and then be redirected to your dashboard page.
If you choose not to use the AI assistant, just click Skip. You can then configure your agent components manually.
Composabl’s simulation API extends the simulator interface with scenario support. The set_scenario method tells the simulator the current scenario the agent wishes to train on.
Selectors can be trained with DRL just like the other skills, and you set up goals for them the same way. Click Configure Selector to open the menu and set your selector’s goals. Note that the goals of the agent’s top-level selector should be the same as the performance goal for the project.
A common plan-execute design is the set point pattern, where a DRL skill determines what the set point of a system should be, and then a traditional automation technology (a controller that uses model predictive control or optimization) determines what actions are necessary to get the system to the set point. For example, in the industrial mixer example, the DRL plan skill decides what temperature the liquid in the tank should be. Then the MPC execute skill determines what temperature of coolant needs to be flowed around the tank to achieve the desired set point.
For many business use cases, the top-level goal will be expressed in terms of profit or ROI. In such cases there are many different factors to consider, but profit is the ultimate goal, so it is the most important KPI.
Some use cases will have multiple goals that the agent needs to meet simultaneously. In fact, the need to balance more than one goal is a sign that a use case is a good fit for Machine Teaching. In the industrial mixer example, the agent needs to get the right concentration of the chemical product but also to keep the temperature in the tank from becoming dangerously hot.
When you create a project, you can use natural language to describe your goals. Composabl's copilot assistant uses this information to create a starter goal in the format the platform can use for training. This goal may be ready to use, but more likely you will need to adjust it.
For example, for the industrial mixer, we want to maximize the concentration of the product, Ca. We also want to avoid temperature, T, getting above 400 degrees Kelvin.
| Use Case | Example Scenarios | Scenario Variable (fixed value) | Scenario Variable (value range) |
|---|---|---|---|
| Drone Control | windy, far_from_charger, low_battery | windspeed = 20 (knots) | windspeed between 20-40 (knots) |
| Autonomous Driving | city, highway, high_traffic | vehicle_speed = 65 (miles per hour) | vehicle_speed between 65-85 (miles per hour) |
| Cybersecurity Network Optimization | normal, high_traffic, cyberattack | site_traffic = 0 (clicks per second) | site_traffic between 100-150 (clicks per second) |
| Process Control | startup, steady_state, shutdown | product_thickness = 50 (millimeters) | product_thickness between 49.94-50.06 (millimeters) |
| Machine Control | break-in, normal, wearing_out | rpm = 280 (revolutions per minute) | rpm between 250-295 (revolutions per minute) |
In this tutorial, we will walk through how to set up a skill in Composabl that integrates with a third-party API. This type of integration allows your agent to communicate with external systems, such as machine performance APIs, and use the data to make informed decisions.
We will create a programmed skill that connects to a mock third-party API, processes its response, and returns an action based on the data received. This tutorial will also touch on orchestrating this skill within your agent.
A programmed skill in Composabl is created by specifying the logic for interacting with the external API and processing the response. In this case, we will create a simple API connection to a fake endpoint that returns data about machine performance. The agent will act based on the information received.
We’ll define a programmed skill for making the API request. Here's an example of how to define the skill using a controller function that calls the API and processes the response.
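The sketch below assumes a mock endpoint URL and a SkillController base class with a compute_action() method, as described in the notes that follow; adjust the import path and signatures to your SDK version.

```python
import requests

from composabl import SkillController  # import path assumed

class ThirdPartyAPISkill(SkillController):
    def __init__(self):
        self.api_url = "https://api.example.com/machine-performance"  # mock endpoint

    async def compute_action(self, obs):
        # Send observation data (e.g., sensor readings) to the third-party API,
        # assuming obs is a mapping of sensor names to values
        payload = {"sensors": {k: float(v) for k, v in dict(obs).items()}}
        data = self._call_api(payload)
        return self._process_response(data)

    def _call_api(self, payload):
        # Make the API call and handle any errors that might occur
        try:
            resp = requests.post(self.api_url, json=payload, timeout=5)
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException:
            return {"status": "error"}

    def _process_response(self, data):
        # Determine the appropriate action based on the API's response
        if data.get("status") == "degraded":
            return [1]  # e.g., reduce machine speed
        return [0]      # e.g., continue normal operation
```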
In this example:
The compute_action() method sends observation data (e.g., from sensors) to a third-party API.
The _call_api() function makes the API call and handles any errors that might occur.
The _process_response() function processes the response from the third-party API and determines the appropriate action for the agent to take based on the data.
Once the skill is defined, you can add it to your agent in the UI using the methods below:
Create a new skill using the Composabl CLI with a given name, description, and implementation type, which in this case will be controller. The name will be "third_party_api_skill".
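For example (the prompt answers shown are illustrative):

```bash
composabl skill new
#   Name: third_party_api_skill
#   Description: Calls a third-party machine-performance API
#   Type: controller
```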
Change the controller.py code to use the class that you created: ThirdPartyAPISkill(). Then change the pyproject.toml file to include your class ThirdPartyAPISkill in the entrypoint, along with its name:
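For example (the entry-point group and module path below are illustrative; keep whatever structure the generated template uses):

```toml
[project.entry-points."composabl"]
third_party_api_skill = "third_party_api_skill.controller:ThirdPartyAPISkill"
```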
Publish the Skill to the UI
Select your organization and project that you want to publish it to.
Reference: https://docs.composabl.io/changelog/0-8-0.html
Once the skill is defined, you can add it to your agent using the add_skill() SDK method. This allows the agent to execute the API connection skill when necessary. Here’s how to add the ThirdPartyAPISkill to the agent:
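A short sketch, assuming the Skill constructor takes a name and the controller class (check your SDK version for the exact form):

```python
from composabl import Agent, Skill  # import paths assumed

agent = Agent()
# Register the programmed skill; no training is required for controllers
api_skill = Skill("third_party_api_skill", ThirdPartyAPISkill)
agent.add_skill(api_skill)
```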
By importing and creating the class with SkillController, you are indicating that this skill is programmed and does not require training. It will use predefined logic to interact with the third-party API and make decisions based on the data returned.
By following these steps, you’ve successfully defined and integrated a programmed skill that communicates with a third-party API into your Composabl agent. The agent can now take actions based on external data and dynamically respond to scenarios.
This approach allows agents to interface with a wide range of external systems, from monitoring equipment to adjusting machine settings, all through programmable skills.
Orchestration of skills through selectors ensures the agent executes the correct skills at the right time, whether the skills are learned or programmed.
Perceptors use the SDK and CLI workflow.
To access a template for a perceptor, type composabl perceptor new into the CLI. Composabl will then generate a perceptor template that you can populate with your information.
In this simple perceptor example, we calculate the perceptor outputs that will be added as new sensor variables, and we create the list of perceptors that comprises the perception layer.
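A minimal sketch of such a perceptor, with the base class, signature, and sensor names assumed for illustration:

```python
class HeatIndexPerceptor:
    # Base class and compute() signature are assumed from the template
    async def compute(self, obs_spec, obs):
        # Derive a new sensor variable from existing sensors (names illustrative)
        return {"heat_index": float(obs["T"]) * 1.1}

# The list of perceptors that makes up the perception layer
perceptors = [HeatIndexPerceptor()]
```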
Analyzing agent behavior is a key part of using Composabl. This capability is key for:
Building agents that beat performance benchmarks
Monitoring the performance of your agents and simulators in real-time
Analyzing your agents and simulators to understand what went wrong in case of problems
Creating dashboards and reports to share with your team or customers
Composabl offers multiple options for analyzing agent behavior. For high-level summaries, you can use the benchmark reporting features in the UI. For more in-depth information, you can use the Historian in the SDK, which creates a database of all the information gathered during training and deployment. You can also design an LLM module in your agent to provide customizable communications capacity using natural language.
In this tutorial, we will explore how to use the historian to validate a trained agent in Composabl and examine training logs. The historian stores historical time-series data in an optimized format, Parquet (https://www.databricks.com/glossary/what-is-parquet), which helps in evaluating how the agent performs over the course of training.
The historian file stores time-series data essential for validating agent training. There are several ways to access and store the historian data, but the recommended format is a Delta file (Parquet).
Understanding the Format:
The historian data is typically large, around 500 megabytes for standard operations. It is stored in the Delta Lake file format, which is optimized for time-series data and supports efficient queries.
The file can be downloaded as XML or another format (e.g., CSV or XLS), but Delta Lake is the most efficient for handling larger datasets.
Downloading the Historian File:
From the Composabl UI, download the historian file. It will likely come in a compressed format (e.g., .gz).
After extracting it, you should see the Delta file containing time-series data.
Unpacking the Historian File:
If the historian file is compressed (e.g., .gz), unpack it using a tool like gzip:
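For example, assuming the download is named historian.gz:

```bash
gzip -d historian.gz
```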
Once unzipped, you’ll see a 10 MB+ delta file with historical time-series data.
Understanding the Delta File:
The delta file is optimized for fast reads and writes of time-series data.
It supports an append-only structure, which ensures that each new piece of data can be added efficiently without modifying the existing data.
Setting Up a Query Environment:
To validate your agent’s training, you’ll need to set up an environment that allows you to query the delta file. Delta Lake integrates well with systems like Apache Spark, but for simple querying, you can use tools like pandas in Python.
Querying for Agent Training Logs:
Extract and analyze relevant historical data from the delta file. Here's a simple Python example for querying the delta file using pandas:
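A minimal sketch, assuming the extracted Delta/Parquet files live in a local ./historian directory (column names will depend on your agent's sensors):

```python
import pandas as pd

# Read the extracted Parquet data into a DataFrame
df = pd.read_parquet("./historian")

# Inspect what was recorded during training
print(df.head())
print(df.describe())
```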
Optimized Data Handling: The Delta Lake format is designed for fast querying, making it ideal for time-series data.
Efficient Storage: The append-only nature ensures that new data can be added without overwriting or modifying existing data, making it easy to track data over time.
Continuous Monitoring: By continuously adding data to the historian, you can validate your agent's long-term impact on machine performance, uptime, and safety.
Once you have created new agent components in the SDK or configured existing algorithms and models, you can publish them with a simple CLI workflow to make them available in the UI to drag and drop into agent designs.
Agent components are published into projects. They will then be available to use for all the agents you create for that project.
Log in to Composabl: composabl login.
The system will redirect you to the UI to enter your credentials and log in.
Return to the command line and navigate to the folder containing the skill or selector you want to publish.
Publish the skill or selector: composabl skill publish or composabl selector publish.
Select your organization from the dropdown menu.
Select your project from the dropdown menu. Save your new skill or selector there.
Your skill or selector will begin publishing. When the process is complete, go to the UI, navigate to the Agent Builder Studio, and refresh your browser to see your new skill or selector in the sidebar.
Log in to Composabl: composabl login.
The system will redirect you to the UI to enter your credentials and log in.
Navigate to the perceptors folder (one level above the individual perceptor folder).
Publish the perceptor: composabl perceptor publish perceptor_name
Select your organization from the dropdown menu.
Select your project from the dropdown menu. Save your new perceptor there.
Your perceptor will begin publishing. When the process is complete, go to the UI, navigate to the Agent Builder Studio, and refresh your browser to see your new perceptor in the sidebar.
Adding perception modules to your agent can provide richer, more complex, condensed, and nuanced information to the decision-making parts of the agent. For example, you might include a computer vision model in your perception layer that inputs images or video from a camera and outputs classifications of the objects it identifies. You can also add large language models as perceptors to take in and interpret information in natural language.
Each module in the perception layer for a Composabl agent inputs the sensor variables, processes those variables in some way, and outputs one or more new variables that the platform will automatically add to the list of sensors.
Perceptors can use any supported Python function or library to calculate outputs. They can even call machine learning and large language models or their APIs.
The next three pages explain how to use the SDK and CLI workflow to create new perceptors or configure existing models as perceptors to use in Composabl agents.
Just like skills, perceptors can be dragged and dropped into agents using the UI. Perceptors will always be situated in the Perception layer that comes before selectors and skills. That’s because perception needs to be applied to the sensor inputs to create new variables that are then passed to the skills layer for the agent to use in decision-making.
You can use the Composabl UI to create skills that learn with deep reinforcement learning. When you create a skill in the UI, you select goals. Composabl then turns these goals into reward functions to train the skill.
To create a skill from the UI, click New skill from the skills panel and name your skill in the modal that pops up. You will then be prompted to configure your skill.
If you want your skill to be taught with deep reinforcement learning, select Teacher. You’ll then be prompted to add goals to your skill for training.
Each skill in your agent succeeds as it approaches a specific goal. The goals of each skill should be clean and simple. If your agent is designed well, based on a good breakdown of the task into skills, each skill will have a clear goal.
Goals apply to one of the sensor variables, and are defined using one of five possible directives:
Avoid: Keep the variable from reaching a specified value
Maximize: Maximize the value of the variable
Minimize: Minimize the value of the variable
Approach: Get the value to the target range as quickly as possible
Maintain: Keep the variable at a specified value
For example, for the industrial mixer, we want to maximize the concentration of the product, Ca. We also want to avoid temperature, T, getting above 400 degrees Kelvin.
You can also use advanced goal settings to fine-tune your goals. Access the advanced settings by clicking on the settings icon associated with one of the goals.
You'll then see additional options to configure your goal.
Tolerance only applies to the three objective types that include a target: Avoid, Approach, and Maintain. This setting allows you to tell the agent to accept a range of values around the target as successful performance. You might use this to prevent the agent from using too much compute power trying to get from good enough to perfect.
Stop value allows you to tell the agent to end the training episode when the variable reaches the target value. This could be because the agent has succeeded and the process is complete, or because it has failed and needs to try again.
Stop steps allows you to tell the agent to end the training episode after a certain number of iterations.
Boundaries are for normalizing rewards. This is useful when the problem has variables that are very different, which can otherwise make it difficult for the trainer to calculate reward. For example, one sensor variable in your agent might have very large values (in the thousands), but another variable might have small values (in the tenths), so you might use boundaries to allow the agent to better compare the two variables.
Scale allows you to provide relative weights to goals to account for goals that have different levels of importance or priority. This is very difficult to get right with Machine Teaching and should be handled with care.
The Training Sessions page allows you to view agents' training in real time and analyze their performance in training.
When you begin a training session, the graphs for each trained skill will begin to generate. You can watch your skills learn by viewing the graphs, or you can click on the Console Output tab for detailed information about each training decision.
The shape of the curve can help you understand how your agent is learning. When the curve plateaus, that usually means the skill has been successfully trained and will not learn more. If the curve shows jagged ups and downs, the skill isn't performing consistently and has more learning to do. Sometimes this is a sign that you should go back and adjust the training settings.
The training sessions page shows a list of all the training sessions for a project in a menu on the left of the screen, allowing you to jump between different agents, as well as different training sessions for the same agent.
Composabl agents use Kubernetes clusters to train at scale. A cluster is a collection of computers that work on large tasks simultaneously. This provides enough compute to complete large training tasks as efficiently as possible.
Composabl offers two options for cluster training:
Use Composabl's Training as a Service offering to train on our clusters
Use your own compute clusters through Azure, AWS or another provider
Before you submit your job for training on a cluster, make sure that your agent is fully configured and all the parameters have been set. That means checking all the agent components:
Goals
Perceptors
Selectors, including goals for learned selectors and scenarios
Skills, including goals for learned skills
Scenarios, including scenario flows
Any component of the agent with a warning sign is not fully configured and not ready for training. Go back to edit that agent component and make sure that all of the fields are filled out.
You can train on your own cluster or on Composabl’s clusters using training as a service (TaaS) credits. If you want to use Composabl’s clusters, ensure that you have credits available.
To train on your own cluster, make sure that you have set your cluster up and installed Composabl successfully.
Click Train and then choose the cluster option in the menu. You will then have the option to configure your training session.
Training session configuration options are the same whether you’re using TaaS or training on your own cluster.
A training cycle is a complete pass through the entire task, with the agent continuing until it reaches success or some other stop criteria. Your agent will train each skill one at a time for the selected number of training cycles, starting from the bottom of the agent design.
A training cycle typically involves about 1,000 agent decisions. Depending on the complexity of the task, agents may need to complete anywhere between 100 and several thousand training cycles to become proficient.
You can run multiple simulators in parallel to speed up training. If you run more than one simulator during a training, the number of training cycles selected will be multiplied by the number of simulators, so 5 training cycles with 3 simulators selected would lead to 15 training cycles total.
You can use the Advanced Configuration to choose how powerful each machine running a simulator should be. If you choose Small, each training cycle selected will result in one training cycle completed. If you choose GPU, each training cycle selected will result in four training cycles completed.
More training cycles running simultaneously will speed up training but also increase costs. How long your training takes also depends on the complexity of your agent and your simulator.
When you have configured your settings correctly, click Start Training.
You will then be taken to the Training Sessions page. There you can follow the agent training progress by viewing the real-time plots or the console output.
Note that it will take a few minutes for the visualization to begin.
The Composabl benchmarking feature allows you to compare the performance of different multi-agent systems against key performance indicators (KPIs). This tool helps you evaluate your agent system's effectiveness, track their improvements, and calculate potential return on investment (ROI).
Benchmarks are generated only after training is complete. You won't see benchmark data while training progresses, even for long-running jobs. If a training job fails, no benchmark data will be generated.
Note: Benchmarking does not influence agent system training, goals, or rewards. It is purely an analysis tool that helps you answer the question: "Based on how this agent system was trained and how we believe it should perform in real life, how much money is this agent system generating?"
The key performance indicator (KPI) for your project is the top-level objective that you will use to compare and evaluate your multi-agent systems' performance.
To define the KPI for your project, choose the sensor variable that represents the metric that best indicates performance success. For most processes, this is a metric with a clear business impact, such as product yield or energy use.
To set up the KPI:
Navigate to the Benchmark page from the main dashboard
Click the "Settings" button in the top right of the KPI Performance Metrics section
In the "Set up KPI" section, configure the following:
Sensor Name: Select a sensor or perceptor variable to track (e.g., "Eps_Yield")
Target Value: Set the desired target for this KPI (e.g., 0.33)
Benchmark Unit: Select the unit of measurement (e.g., "% of Max Theoretical Yield"). Note that this is for UI readability only and doesn't affect calculations.
Return on Investment (ROI) calculations help quantify the financial impact of your agent systems' performance. The ROI is calculated based on the monetary value of the difference between your target value and the actual value of the KPI generated by that agent system, minus training costs.
In the Settings modal, navigate to the "Enter ROI criteria" section
Define the conditions for calculating ROI:
When: Select the metric to track (e.g., "Eps_Yield")
Condition: Choose how the metric changes (e.g., "increases by")
Percentage: Enter the percentage change that matters (e.g., 2%)
Value: Set the financial value of this change (e.g., $1,000,000)
Period: Select the time period (e.g., "per year")
For example: "When Eps_Yield increases by 1% of max theoretical yield, it is worth $1,000,000 per year."
You can change KPI and ROI settings at any time after training completes, and the benchmark results will be automatically recalculated. This allows you to explore different business scenarios without retraining your agent systems.
At the top of the benchmarking page, you'll find an Overview panel containing three key metrics:
Highest performing Agent System: Displays the outcome of the highest performing agent system
Lowest performing Agent System: Displays the outcome of the lowest performing agent system
ROI: Shows the calculated financial return based on your KPI settings
Each metric displays both the value and the relevant unit (e.g., "% of Max Theoretical Yield").
The bar graph in the KPI Performance Metrics section provides a visual comparison of your agent systems:
Each bar represents an agent system's performance on your selected KPI (all bars correspond to the same KPI)
The height of the bar indicates the specific value of the KPI for that agent system (Y-axis)
Horizontal dashed lines show benchmark averages
The actual value of the KPI for each agent system is shown above the respective bar
Below the graph is a detailed table showing:
Agent System Number: Sequential ID for each agent system
Agent System Name: Name or description of the agent system
ROI: Calculated financial return based on the agent system's performance
Green arrows (↑) indicate positive ROI with the percentage and absolute value
Red arrows (↓) indicate negative ROI with the percentage and absolute value
Eps_Yield (variable) per % of Max Theoretical Yield: Performance metric with indicators for highest agent system performers
Deploying an agent means exporting the trained agent, loading it into your production environment, and then asking the agent for decisions. In Composabl, the export is a JSON file called agent.json that contains everything you need to deploy your agent. You can load the agent file into your IT infrastructure in many ways.
This document will show how you can deploy your agent as an API using Python and Flask.
To deploy the agent as an API, we need the agent.json model, the agent_api.py script to start the API, and requirements.txt to install packages. You can find samples of these files in our GitHub repo: https://github.com/Composabl/examples.composabl.io/tree/main/deploy_agent
This is the structure needed for the API:
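The layout mirrors the sample repo linked above:

```
deploy_agent/
├── agent_api.py        # Flask script that starts the API
├── requirements.txt    # composabl, flask[async], numpy
└── model/
    └── agent.json      # the exported agent
```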
How to extract the agent.json
Log into the Composabl UI (https://app.composabl.com/onboarding), train your agent, and navigate to the training sessions section.
Check the status of the agent: a green status (Done) indicates finished training.
Download the agent essential file (a .gz compressed file).
Extract the agent file agent.json to the model folder.
Run the following in your terminal:
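Assuming you are in the folder containing requirements.txt:

```bash
pip install -r requirements.txt
```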
It will install these packages:
composabl
flask[async]
numpy
To start the Composabl API, export your license as an environment variable and then use Python to start the Flask API with your agent:
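For example (the license value and script name follow the sample files above):

```bash
export COMPOSABL_LICENSE="<your_license>"
python agent_api.py
```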
After running the API, you can test it by opening a terminal and running the script below:
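A test request might look like the following; the endpoint path, port, and sensor names are assumptions based on the chemical process demo:

```bash
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"observation": {"T": 311.0, "Tc": 292.0, "Ca": 8.56, "Cref": 8.57, "Tref": 311.0}}'
```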
In the POST request, we pass the use case "observation" with sensor variables and their values to receive an action from the agent. The code above relates to the agent.json demo for chemical process control.
Once you have completed training an agent and are ready to deploy it, you can download it from the UI.
Go to the Training tab in your project.
Select the training session for the agent you want to deploy.
Click the Artifacts dropdown in the upper right of the screen.
Click Inference Model and you will download a file called agent.json.
This tutorial explains the steps for building the Docker container, deploying the agent, and accessing the runtime for inference or interaction.
Once you have packaged and deployed your agent inside a Docker container (https://docs.composabl.com/deploy-agents/deploy-an-agent-in-a-container), the next step is accessing its runtime for operations like model inference. This tutorial will guide you through the process of building and running the Docker container and then connecting to the agent's runtime for further interactions.
To deploy the agent to Docker, we need to first create an image from the Dockerfile (https://docs.composabl.com/deploy-agents/deploy-an-agent-in-a-container). The Dockerfile will package the necessary runtime, model, and environment for the agent.
Building the Image: You can build the Docker image by running the following command in the terminal. This will take the Dockerfile and the associated files (like the pre-trained model) and create an image.
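Run this from the directory containing the Dockerfile:

```bash
docker build -t composabl_agent_api .
```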
The -t flag allows you to tag the image (composabl_agent_api), which makes it easier to reference later.
Make sure that the model file (agent.json) and all relevant scripts are reachable within the Docker context (i.e., the directory from which you are building).
Checking the Image: Once the build is complete, you can verify that the image was created successfully by running:
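For example:

```bash
docker images
```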
Now that the image is built, the next step is to run it in a container. You will run the Docker container in an interactive mode to access the runtime.
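Using the flags explained below:

```bash
docker run -it -p 8000:8000 -e COMPOSABL_LICENSE="<your_license>" composabl_agent_api
```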
-it: Runs the container interactively.
-p 8000:8000: Maps port 8000 from the container to port 8000 on your local machine so that you can access the HTTP server for the agent runtime.
-e COMPOSABL_LICENSE="<your_license>": Exports the environment variable that links to your Composabl license.
The HTTP server should now be up and running within the container, ready to handle model inference or other tasks.
With the Docker container running, you can now connect to the agent's runtime, which is an HTTP server. You can access it through a POST request for model inference or other operations.
Sending Requests to the Agent: You can send a POST request to the running server using a tool like curl, Postman, or any Python HTTP library (such as requests). Here’s an example using curl:
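(The endpoint, port, and sensor names below follow the earlier example and are illustrative.)

```bash
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"observation": {"T": 311.0, "Tc": 292.0, "Ca": 8.56, "Cref": 8.57, "Tref": 311.0}}'
```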
This request will POST data to the /predict endpoint on localhost:8000, which is being forwarded from the Docker container.
The agent will handle the request, infer the model, and return the action as a result.
If you need to create the Dockerfile yourself, it should package the necessary runtime, model, and environment for the agent.
Dockerfile Setup: Your Dockerfile should contain the following key components:
Base Image: Use a Python base image (or any base that supports the necessary libraries).
Copy Model Files: Copy the extracted agent file and any supporting scripts into the container.
Install Dependencies: Install any required Python libraries (such as the packages listed in requirements.txt).
Here’s an example Dockerfile:
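This sketch assumes the Flask API layout from earlier in this guide (agent_api.py, requirements.txt, and model/agent.json); adjust the file names to your project:

```dockerfile
FROM python:3.10-slim
WORKDIR /app

# Install dependencies first to take advantage of Docker layer caching
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy the exported agent and the API script into the image
COPY model/agent.json model/agent.json
COPY agent_api.py .

EXPOSE 8000
CMD ["python", "agent_api.py"]
```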
Interacting with the Agent: If you prefer to interact with the agent directly, you can also enter the container’s interactive mode and run commands.
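For example:

```bash
docker exec -it <container_id> /bin/bash
```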
This will open a shell inside the running Docker container, allowing you to execute any runtime commands manually.
For convenience, you can automate the entire process of building the image, running the container, and interacting with the agent by creating a script.
Here’s a basic example of an automation script:
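A minimal sketch (the image tag matches the build example above; adapt as needed):

```bash
#!/bin/bash
set -e

# Build the Docker image
docker build -t composabl_agent_api .

# Run the container, mapping the HTTP port and passing the license
docker run -d -p 8000:8000 -e COMPOSABL_LICENSE="$COMPOSABL_LICENSE" composabl_agent_api
```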
Save this as run_agent.sh, and then execute it:
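```bash
chmod +x run_agent.sh
./run_agent.sh
```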
This script will:
Build the Docker image.
Run the container, mapping the necessary port and exposing the HTTP server for inference.
If the container fails to start, or if the server doesn't respond, you can debug the container by checking the logs:
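```bash
docker logs <container_id>
```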
This command will display the output of the running container, which can help diagnose issues like missing dependencies or server errors.
In this tutorial, we walked through the process of:
Building a Docker image with your agent and its runtime.
Running the Docker container interactively to expose the agent’s HTTP server.
Accessing the agent runtime by sending HTTP requests for inference or other tasks.
By following these steps, you can deploy and interact with your Composabl agent in a Dockerized environment.
In this tutorial, we will cover how to connect to the agent runtime, load a pre-trained agent, run inference, and visualize the results in a production-like environment. The provided script, agent_inference.py, demonstrates connecting to the Composabl agent runtime, initializing the environment, and plotting agent operation results.
The script connects to the runtime, loads a pre-trained agent, connects to a local simulation, collects sensor data from the sim, and plots the results. Here is an outline of the core steps in the process:
Start Runtime and Load Agent: The script initializes the trainer and loads a pre-trained agent from a model folder.
Set Up the Simulation Environment: It connects to a simulation environment.
Run Inference: The pre-trained agent interacts with the simulation to perform inference (decisions), collecting observations and giving actions at each step.
Collect Data and Plot Results: Sensor data and actions are collected in a Pandas DataFrame, and the results are plotted using Matplotlib to visualize how the agent is performing over time in a production-like environment.
The first task is to connect to the Composabl runtime and load the pre-trained agent. This is accomplished using the Trainer and Agent classes. The agent's model is loaded from the directory where the model was saved during training.
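A sketch of this step, with the configuration contents and checkpoint path assumed for illustration:

```python
from composabl import Agent, Trainer  # core SDK classes

PATH_CHECKPOINTS = "./model"  # illustrative: the folder holding the trained agent

# The config contents are assumptions -- use your own license and target
config = {
    "license": "<your_license>",
    "target": {"local": {"address": "localhost:1337"}},
}

trainer = Trainer(config)
agent = Agent.load(PATH_CHECKPOINTS)
trained_agent = trainer._package(agent)  # package the agent for inference
```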
Here:
Trainer(config) initializes the runtime with a configuration file.
Agent.load(PATH_CHECKPOINTS) loads the saved agent from the specified checkpoint directory.
trainer._package(agent) prepares the agent for inference by packaging it.
Next, we connect the agent to the simulation environment. The make() function creates a connection to the local simulator, and the environment is initialized.
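A rough sketch; the exact arguments to make() are assumptions, so check the SDK for the real signature:

```python
from composabl import make  # import path assumed

# The simulator must already be running locally on this address
sim = make(
    run_id="inference-run",        # illustrative
    sim_id="industrial-mixer",     # illustrative
    address="localhost:1337",
)
sim.init()
```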
Here:
The simulator is configured to run locally (localhost:1337); you must start it locally and manually beforehand.
The environment is initialized with sim.init(), and the agent is connected to it.
After connecting to the simulator, you need to set up the specific scenario that the agent will operate in. This scenario determines the environment's initial state.
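For example (scenario variable names and values are illustrative, following the industrial mixer demo):

```python
sim.set_scenario({"Cref": 8.57, "Tref": 311.0})
```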
With the environment set, the agent can now run inference for a set number of iterations. At each iteration, the agent observes the environment, takes an action, and collects the results (observations and rewards). This is done in a loop, sketched after the list below.
In each iteration:
The agent performs an action based on the current observations.
The environment advances one step with sim.step(action), and the agent receives a new observation and reward.
Sensor data and actions are logged into a Pandas DataFrame for later analysis.
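A sketch of the loop, assuming obs behaves like a mapping of sensor names to values and that the packaged agent exposes an execute() call (method names are assumptions):

```python
import pandas as pd

records = []
obs = sim.reset()  # method name assumed

for step in range(100):  # iteration count is illustrative
    action = trained_agent.execute(obs)         # agent decision
    obs, reward, done, info = sim.step(action)  # advance the simulation
    records.append({"step": step, "reward": reward, **dict(obs)})

df = pd.DataFrame(records)
```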
Once the inference loop is complete, the collected data is saved, and the results are visualized. The results are plotted using Matplotlib.
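A sketch of the plotting step, assuming the DataFrame contains the industrial mixer columns named below:

```python
import matplotlib.pyplot as plt

fig, (ax1, ax2, ax3) = plt.subplots(3, 1, figsize=(8, 10), sharex=True)

ax1.plot(df["Tc"])                  # coolant temperature decisions
ax1.set_ylabel("Tc")

ax2.plot(df["T"], label="T")        # tank temperature vs. reference
ax2.plot(df["Tref"], label="Tref")
ax2.set_ylabel("Temperature")
ax2.legend()

ax3.plot(df["Ca"], label="Ca")      # concentration vs. reference
ax3.plot(df["Cref"], label="Cref")
ax3.set_ylabel("Concentration")
ax3.legend()

plt.xlabel("Step")
plt.savefig("benchmarks/inference_figure.png")
```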
This code generates three subplots:
Temperature Controller (Tc) over time.
Temperature (T) and Reference Temperature (Tref) over time.
Concentration (Ca) and Reference Concentration (Cref) over time.
The plots provide a visual representation of the agent's performance during the simulation. Finally, the figure is saved as inference_figure.png in the benchmarks directory.
To run the script, execute agent_inference.py in your terminal.
In this tutorial, we demonstrated how to:
Connect a pre-trained Composabl agent to a runtime and simulation environment.
Set up a scenario and run inference.
Collect observations and actions, and plot the results using Matplotlib.
By following these steps, you can visualize the performance of your agent and gain insights into how it interacts with the environment over time.
The Composabl SDK offers a suite of advanced tools to train skills using deep reinforcement learning. Using the Python teacher class, you can fine-tune the rewards for your skills. Once you have configured a skill with the SDK, you can publish it to the UI to use in agent designs.
To create a skill in the Python SDK, begin by logging in to the SDK by typing composabl login from the CLI. Then type composabl skill new.
Give the skill a name and a description in response to the prompts that follow. Choose whether your skill should be a teacher (learned with AI), controller (a programmed module like an optimization algorithm or MPC controller), or coach.
Specify the folder where you’d like to create the skill.
The Composabl SDK will create a folder and a Python teacher file from the template.
The Python teacher class offers several functions that you can use to fine-tune the training of your skills.
compute_termination Function
The compute_termination function tells the Composabl platform when to terminate a practice episode and start over with a new practice scenario (episode). From a teaching perspective, it makes the most sense to terminate an episode when the agent succeeds, fails, or is pursuing a course of action that you do not find likely to succeed. This function returns a Boolean flag (True or False) indicating whether to terminate the episode. You can calculate this criterion however seems best.
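A minimal sketch (the method signature, sensor name, and threshold are assumptions for illustration):

```python
def compute_termination(self, transformed_obs, action):
    # End the episode if the reactor overheats (threshold illustrative)
    return float(transformed_obs["T"]) > 400.0
```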
The success_criteria function provides a definition of skill success and a proxy for how completely the agent has learned the skill. The platform uses the output of this function (True or False) to calculate when to stop training one skill and move on to training the next skill. It is also used to determine when to move to the next skill in a fixed-order sequence: the agent cannot move from one skill in a fixed-order sequence to the next until the success criteria for that skill are reached.
Here are some examples of success criteria definitions:
A simple but naive success criterion might return True if the average reward for an episode or scenario crosses a threshold, and False if it does not.
A more complex success criterion might calculate the root mean squared error (RMSE) for key variables across the episode and return True if the error is less than a customer-specified benchmark, and False otherwise.
A still more complex success criterion might compare the agent to a benchmark controller or another agent across many key variables and trials, returning True if the agent beats the benchmark on these criteria, and False otherwise.
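A sketch of the first, naive variant; it assumes the teacher accumulates per-step rewards in self.episode_rewards inside compute_reward, and the signature is illustrative:

```python
def success_criteria(self, transformed_obs, action):
    # Succeed when the average episode reward clears a threshold (0.9 illustrative)
    if not self.episode_rewards:
        return False
    return sum(self.episode_rewards) / len(self.episode_rewards) > 0.9
```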
Training with goals lets you use a predefined reward structure rather than configuring the rewards individually. When you use a goal, your agent will inherit the compute reward, compute termination, and compute success functions from the goal. (You will still have the option to further customize those functions as needed.)
The five goal types you can use are:
AvoidGoal
MaximizeGoal
MinimizeGoal
ApproachGoal
MaintainGoal
Goals are added using specialized teacher classes rather than the general teacher class that you would otherwise use to teach skills. For example, for a skill named Balance that you wanted to train with a goal to maintain a specific orientation, you would use the MaintainGoal teacher class.
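A sketch of what that might look like; the import path and constructor arguments are assumptions, not the SDK's confirmed signature:

```python
from composabl import MaintainGoal  # import path assumed

class BalanceTeacher(MaintainGoal):
    def __init__(self):
        # Keep the "orientation" sensor at the target value
        super().__init__(
            "orientation",                   # sensor to maintain (assumed)
            "Maintain a level orientation",  # description (assumed)
            target=0,
            stop_distance=0.2,               # tolerance around the target (assumed)
        )
```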
The parameters you can use for goals are:
You can also use more than one goal for a single skill using the CoordinatedGoal teacher class. This is useful when your agent needs to behave in a way that balances two goals that are both important.
Just like rules guide training and behavior for humans, providing rules for the agent to follow can guide agent decision-making more quickly to success. Rules guide the behavior of an agent based on expertise and constraints.
compute_action_mask Function
The compute_action_mask teaching function expresses rules that trainable agents must follow.
The compute_action_mask teaching function works only for discrete action spaces (where the actions are integers or categories), not for continuous action spaces (where decision actions are decimal numbers). If you specify a mask for a skill whose actions are continuous, the platform will ignore the action mask.
The function returns a list of 0 and 1 values: zero means that the action is forbidden by the rule; one means that the action is allowed. The function may change the returned values after each decision, which allows complex logic to express nuanced rules.
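For example (the method signature is illustrative):

```python
def compute_action_mask(self, transformed_obs, action):
    # One entry per discrete action: 0 = forbidden, 1 = allowed.
    # Here the first action is forbidden for the next decision.
    return [0, 1, 1]
```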
In the example above, the first action is forbidden for the next decision, but the second and third actions are allowed. The logic in the skill itself (whether learned or programmed) will choose between the allowed second and third actions.
All selectors have a discrete action space (they choose which child skill to activate), so you can always apply the compute_action_mask function to teach them.
As information passes through perceptors, skills, and selectors in the agent, sometimes it needs to change format along the way. You can use three teaching functions to transform sensor and action variables inside agents: transform_sensors, transform_action, and filtered_sensor_space.
To transform sensor variables, use the transform_sensors function to calculate changes to specific sensors, then return the complete set of sensor variables (the observation space).
Two of the most common reasons for transforming sensor variables are conversion and normalization. For example, if a simulator reports temperature values in Fahrenheit, but the agent expects temperature values in Celsius, use the transform_sensors function to convert between the two.
Normalization means transforming variables into different ranges. For example, one sensor variable in your agent might have very large values (in the thousands), while another variable might have small values (in the tenths), so you might use the transform_sensors function to map these disparate sensor values to a range from 0 to 1 so that they can be better compared and used in the agent.
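A sketch of a Fahrenheit-to-Celsius conversion (sensor name and signature are illustrative):

```python
def transform_sensors(self, obs):
    # Convert the temperature sensor, then return the full observation space
    obs["temperature"] = (float(obs["temperature"]) - 32) * 5 / 9
    return obs
```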
You may want to transform action variables for the same reasons as sensor variables.
Use the filtered_sensor_space function to pare down the list of sensor variables needed for a particular skill. Pass only the information that a skill or module needs in order to learn or perform well.
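For example (sensor names are illustrative):

```python
def filtered_sensor_space(self):
    # Only the sensors this skill needs to learn or perform well
    return ["T", "Tc", "Ca", "Cref"]
```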
compute_reward Function
The compute_reward function provides the bulk of the feedback after each agent action about how much that action contributed to the success of the skill. This function returns a number that represents the reward signal the agent will receive for its last decision. Reward functions, as they are called in reinforcement learning, can be tricky to craft.
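A minimal sketch (sensor names and the exact signature are assumptions):

```python
def compute_reward(self, transformed_obs, action, sim_reward):
    # Reward actions that bring concentration close to the reference
    error = abs(float(transformed_obs["Ca"]) - float(transformed_obs["Cref"]))
    return 1.0 / (1.0 + error)
```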
These have the same parameters and work the same way as the teaching functions described above.
Due to the many options available to you for installing Kubernetes clusters, this document will not go into the specifics of setting up the cluster. Rather, it will provide you with guidance and requirements for your cluster.
Depending on whether you want to use GPUs or not, you need the following nodes:
Nodes that are always required:
"main": This means nodes to run the control plane. The Composabl controller does not interact with these nodes, so they should be provisioned as recommended by the Kubernetes distribution you use.
"composabl": This node or nodes will be where the Composabl controller and Historian software are scheduled.
"envrunners": These nodes will handle training workloads. If you're not using GPUs, all training will be done on these nodes. If you are, these nodes will manage the communication with the simulators, and can be reduced in size
"simscpu": These nodes are where the simulators will be scheduled. Sizing depends on the simulator.
If you want to use GPU training, you need an additional node pool: "learners". These nodes with GPUs will accelerate the learning step of the training process.
If your simulator can be accelerated using a GPU, you can add a final node pool: "simsgpu". These nodes will run simulators, assigning a GPU to each.
A note on GPUs: Currently, only Nvidia GPUs are supported. The cluster must have the nvidia-gpu-operator installed for training on GPU to be enabled.
Whether or not you use autoscaling using cluster-autoscaler, each node type must be sized accordingly.
main: As required by your Kubernetes distribution.
composabl: In total, 16GB of memory and 4 CPUs, with at least one node having 8GB of memory.
envrunners: If not using GPUs, we recommend 8 CPUs and 8 or 16 GB of memory. In any case, the number of simulators that can be managed by each envrunner instance depends on the number of CPUs.
simscpu: The sizing of these nodes depends on the resource requirements of your simulator.
learners: These nodes should have 1 Nvidia GPU. Other resources can be limited; 2 CPUs and 8GB of memory is sufficient.
simsgpu: As with simscpu, sizing depends on the simulator requirements.
All groups of nodes must be labeled accordingly. The name given in the sizing guide above is the value you should set for the agentpool label.
You may be able to define this during your cluster setup, but if not, you can use the following commands:
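(Add the learners and simsgpu pools too if you use GPUs.)

```bash
kubectl label node <main-node> agentpool=main
kubectl label node <composabl-node> agentpool=composabl
kubectl label node <envrunner-node> agentpool=envrunners
kubectl label node <simscpu-node> agentpool=simscpu
```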
Replace the values in between <>
with the name of the nodes you'd like to assign to a specific pool.
The components also need access to (semi)persistent, shared storage. This section will detail the types and amount of storage needed.
It needs the following PersistentVolumeClaims in the composabl-train namespace:
pvc-controller-data with a size of ±1Gi and an accessMode of ReadWriteOnce (or better). When using Azure, you will need to set the nobrl mountOption for this PVC, as it is required for the Composabl controller to function.
pvc-training-results with a suitable size; this is where your final agent data will be stored before it is uploaded to the no-code application. Its accessMode needs to be ReadWriteMany (RWX). A good initial size is to match historian-tmp.
historian-tmp is used as temporary storage for historian data. It needs an accessMode of ReadWriteOnce, and its size will depend on the length of your training sessions. We recommend starting with 5Gi.
The sizes of pvc-training-results and historian-tmp depend on the number and size of training jobs you want to run simultaneously on your cluster. If you plan on running long-lived training sessions with many cycles, you may want to increase the capacity of both. An example manifest for the first of these claims is shown below.
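As a reference point, a minimal manifest for pvc-controller-data might look like this (the storage class is left to the cluster default, and the size is an assumption; configure the nobrl mountOption via your storage class on Azure):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-controller-data
  namespace: composabl-train
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```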
If you want to use a private registry for simulator images, you will need to set up this private registry yourself, and make sure the cluster is able to pull images from this registry.
Once your cluster is running and you have verified that your setup is working, you can continue to Installing Composabl.
This guide covers setting up your Composabl training cluster using pulumi, an Infrastructure as Code tool.
This example uses Azure Kubernetes Service, but can be adapted to other supported providers.
An Azure subscription with sufficient permissions to create and update various resources
If you're following along in typescript, a working installation of NodeJS
A new pulumi project, as per the pulumi documentation. You can find the documentation for Azure here
We will be deploying the following resources to your Azure subscription:
Resource group, containing all resources
A container registry, to hold simulator images
An AKS cluster
The resource group will contain all resources. It is also what determines in what Azure location the resources will be deployed.
At the end, we export the name of the resource group (which will be randomized by pulumi) for further use in our definition
The container registry is where you will be able to privately store your simulator docker images, if any.
The cluster is where both the Composabl components and your training will be running. This configuration is more complex, so additional information will be provided as comments in the typescript definition:
If you want to enable GPU training and GPU-enhanced simulators, you will also need to add the following pools.
In addition, you will also need to install the nvidia-gpu-operator on the cluster. This can be done according to the instructions on the project website.
Finally, GPU_ENABLED must be set to true on the Composabl controller deployment, if it hasn't been already.
Autoscaling:
This template enables autoscaling to have the cluster automatically scale to the required size and back down afterward to reduce costs.
You can disable autoscaling by removing the minCount, maxCount, and enableAutoScaling properties, but you'll then have to set the count value accordingly.
vmSize: The VM sizes used above can be adjusted to instance types that better fit your needs.
You can use an LLM as a perceptor to add language and communication capabilities to your agent.
This allows you to create human-like assistants or copilots who can contribute natural language capabilities to your agent. Composabl has several personas for LLM assistants to help structure your agent design.
The analyst interprets sensor data and passes it to an interface that the user can access, allowing real-time monitoring of conditions and the agent's responses.
The executive reads external data sources in text and reports information to the agent, such as trends in the business press that would help to anticipate demand for a product.
The plant manager allows operators to communicate directly with the agent and give it instructions based on information that would not otherwise be available in its sensor space.
LLM perceptors can either:
Output language to the operator about what the agent is doing (ex. the analyst)
Take in inputs in natural language and then transform them into information that the decision-making layer of the agent can use (ex. the executive and plant manager)
From the CLI, when logged in to Composabl, type composabl perceptor new. You will be prompted for a location to save your new perceptor, and then a new directory containing your perceptor will be created. This will include the pyproject.toml file that will allow you to publish the perceptor to the UI once it is created.
Within the perceptor.py file, create the API call and prompt for the LLM.
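A sketch of an analyst-style perceptor; the perceptor base class and compute() signature are assumptions, and the OpenAI client is only one possible LLM backend:

```python
from openai import OpenAI  # any LLM client works; OpenAI is only an example

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

class AnalystPerceptor:
    async def compute(self, obs_spec, obs):
        # Ask the LLM to narrate current conditions for the operator
        prompt = f"Summarize these sensor readings for the operator: {dict(obs)}"
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
        )
        print(response.choices[0].message.content)  # surface to the operator
        # The analyst reports to humans only, so it adds no decision signal
        return {"analyst": 0}
```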
The analyst displays information to the human user, but doesn't send information to the decision-making layer of the agent, so the perceptor returns 0.
The executive in this sample related to the industrial mixer use case automatically queries a chemical engineering LLM for advice about control actions to take.
The perceptor returns an action that it recommends the decision-making layer of the agent take. This becomes a new sensor variable that the skill teacher(s) will take into account when training the agent in simulation.
See full code samples and more examples.
Composabl agents can include text fields in perceptors, but these must be transformed or filtered out in the teacher.py file before training with DRL. For any text variables that are not transformed into a different data type, use the filtered_sensor_space method of the teacher to remove them.
Publish the perceptor to the UI.
Navigate to the folder above your perceptor. Then publish your perceptor with composabl perceptor publish. Select the organization and project to add your perceptor to, then refresh your Agent Builder Studio to see the perceptor and add it to agents.
After training a multi-agent system in Composabl, the system automatically runs a series of standardized tests to evaluate its performance. This benchmarking process:
Places the system in controlled testing environments
Records detailed metrics at each step of operation
Aggregates results to provide comprehensive performance statistics
The output of this testing process is compiled into a structured benchmark.json file, which contains rich performance data that can be analyzed to assess effectiveness, identify improvement opportunities, and compare different design approaches. This file is a performance record and a valuable analytics resource for optimizing your agentic systems.
To download benchmark data for further analysis:
Navigate to the "Training Sessions" page
Click the artifacts dropdown in the top right of the page for a trained system
Select "Benchmark"
The benchmark.json file will be saved to your local machine
The benchmark.json file contains structured data about the performance of a trained agent system. Here's how to interpret this file:
Scenario Data: Contains reference values for the scenario
Episode Data: Array of state-action pairs showing how the agent performed in each step
Aggregate Statistics: Summary statistics for the entire benchmark
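As an illustration, a few lines of Python are enough to start exploring the file. The sections above describe its contents, but the specific key names used below are assumptions — inspect your own benchmark.json to confirm them:

```python
import json

# Load a benchmark file downloaded from the Training Sessions page.
with open("benchmark.json") as f:
    benchmark = json.load(f)

# Start by listing the top-level keys to see how your file is organized.
print(list(benchmark.keys()))

# Hypothetical example: trace one sensor through the recorded steps.
# The "episodes"/"steps"/"obs" names are assumptions -- replace them
# with the keys you find in your own file.
for episode in benchmark.get("episodes", []):
    temperatures = [step["obs"]["T"] for step in episode["steps"]]
    print(f"steps: {len(temperatures)}, max T: {max(temperatures)}")
```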
This page covers manually installing the Composabl components onto the cluster.
To install Composabl on your cluster, you'll need access to the Composabl No-Code UI and a valid license key.
Additionally, to run the installation script, you need kubectl set up with access to the cluster.
Log in to the No-Code app
Navigate to the Clusters page
Click "Create"
In the pop-up, choose a display name for your cluster
Select the cloud vendor you're using, or "custom" if none.
Copy the command
Run the command in a terminal where you have kubectl access to the cluster
When the command exits without errors, the installation is complete.
Refresh the clusters page. Your custom cluster should now say "Connected" and list its specifications.
You can now select the cluster from the dropdown menu when training.
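As an additional sanity check, you can confirm the installation from the terminal: the installer creates the composabl-train and composabl-sims namespaces described later in this guide, so running kubectl get namespaces against the cluster should list both.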
You can find the Composabl Training Cluster Azure application here
This will set up a fully functional Composabl Training Cluster on your Azure tenant. To use it, you will need:
An active Azure subscription
Quota for the supported VM sizes (see later during setup)
If using GPU training, you'll need quota for GPU-enabled nodes (such as the NCasT4_v3 series)
Setting up your cluster consists of the following steps:
Registering a new cluster in Composabl
Running through the wizard in Azure
Head to Composabl and navigate to the clusters page.
Choose "Create Cluster"
Select "Azure Cluster"
Take note of the Cluster ID and License key - you'll need them during the setup wizard on Azure, then click "Create on Azure"
Visit the Microsoft Azure Marketplace page here
Click "Create" to start the setup wizard
On the first step, select a resource group to create the cluster in. We recommend you use a new Resource Group made specifically for this purpose. Give the resource a descriptive name.
On the second step, fill in the following:
Cluster name - this is a technical name and should be unique for all AKS clusters in your Azure subscription
Kubernetes version - select the latest version, unless a policy prevents you from doing so. Only supported versions are listed.
Container registry name - this is the name of the container registry you'll be pushing your simulator images to. As with the Cluster name, this name has to be unique for all Azure Container Registries in your Azure subscription.
GPU Training - Check this if you want to use GPUs to accelerate training.
Various VM sizes - See VM sizing and types for additional information on each of these
Scaling settings - Whether to enable autoscaling, and limits for the autoscaling, if enabled. Note that we recommend you leave autoscaling enabled, as this can significantly reduce costs.
On the final step, fill in the Cluster ID and License key you received during your Cluster Registration
The advanced settings should not be used unless you require specific features to be enabled for testing or support purposes.
Finally, click "Review + Create" to review the settings and create your training cluster.
A Composabl cluster consists of up to six different node types:
Main nodes - these are Kubernetes system nodes, used for running the Kubernetes master processes
Composabl nodes - these nodes run the Composabl software, e.g. the Composabl Controller and the Historian processor
Environment runner nodes - the Composabl SDK manages the simulator instances from these nodes. If GPU training is disabled, the training process runs on these nodes as well.
Simulator nodes - these nodes are where the simulator(s) will be running
If you have enabled GPU training and/or GPU simulators, you will also have:
Learner nodes (GPU training enabled) - these GPU-enabled nodes perform the actual training on GPU after samples are collected from the simulators
GPU simulator nodes (GPU training and GPU simulators enabled) - these nodes run GPU-enabled simulators
For most cases, we recommend leaving the default sizing. If you don't have quota available for a default size, you can change any of these to a similar node size.
The strategy pattern is one of the key design patterns of Machine Teaching. When you use the strategy pattern, you break down the task into specific skills that each handle one aspect of the process to be controlled. This allows you to "teach" the agent using subject matter expertise.
In the strategy pattern, each skill is either trained using deep reinforcement learning or controlled with a programmed algorithm. Then a special skill called a selector decides which skill should make the decision based on the current conditions.
In the industrial mixer problem, the process is divided into three skills based on the phase of the process. All three action skills and the selector are trained with DRL: each skill practices in the conditions it will face and learns to control its part of the process by experimenting over time.
Think of the strategy pattern as like a math class with three students. Student A loves fractions, Student B is great at decimal problems, and Student C thinks in percentages. The selector is their teacher. She reads each question, sees what kind of problem it is, and then assigns it to the student who can solve it best, because of their own special math talent.
Let's get started configuring this agent!
This agent has three skills, called start_reaction, control_transition, and produce_product. To publish them to your project, open your favorite code editor and terminal. In your terminal, navigate to the skills folder and use the publish command with the Composabl CLI.
Return to the agent builder studio and refresh the page. The skills will appear in the skills menu on the left of your page.
All skills, perceptors, and selectors contain a minimum of two files: a Python file with the code that the agent will use, and a config file.
pyproject.toml, a config file with the following information.
A Python file. For this agent, we use teacher functions with the following code and explanations in the comments inline.
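The tutorial's code block is not reproduced here, but as a rough sketch, a teacher for one of these skills implements reward and termination logic along these lines. The import path and method signatures follow recent SDK templates and may differ in your version, and the reward shaping is an illustrative assumption rather than the tutorial's exact code:

```python
from composabl import Teacher  # import path may vary by SDK version

class StartReactionTeacher(Teacher):
    """Rewards tracking of the reference concentration during startup
    (illustrative reward shaping, not the tutorial's exact code)."""

    async def transform_sensors(self, obs, action):
        return obs  # no sensor preprocessing in this sketch

    async def transform_action(self, transformed_obs, action):
        return action  # pass actions through unchanged

    async def filtered_sensor_space(self):
        # Expose only numeric sensors to the DRL algorithm.
        return ["T", "Tc", "Ca", "Cref", "Tref"]

    async def compute_reward(self, transformed_obs, action, sim_reward):
        # Higher reward the closer concentration is to its reference.
        error = abs(transformed_obs["Ca"] - transformed_obs["Cref"])
        return 1.0 / (1.0 + error)

    async def compute_termination(self, transformed_obs, action):
        # Cut the episode short if temperature nears thermal runaway.
        return transformed_obs["T"] >= 400

    async def compute_success_criteria(self, transformed_obs, action):
        return abs(transformed_obs["Ca"] - transformed_obs["Cref"]) < 0.1

    async def compute_action_mask(self, transformed_obs, action):
        return None  # no action masking for this skill
```

Each of the three skills would get its own teacher of this shape, differing mainly in its reward logic and the scenarios it practices on.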
File Structure
Drag the skills start_reaction, control_transition, and produce_product, which you can now see on the left-hand side of your project, onto the skills layer. Drag the skills from the side in the order you would like them to be used.
The green diamond that appears when you place multiple skills alongside each other is the selector. This is the "math teacher" skill that makes the decision about which of the action skills should be chosen to make each decision.
Click on the selector to configure it. In this case, the default configurations are most likely correct.
The goals of the top-level selector in an agent should be the same as the goals of the agent as a whole. When the UI automatically creates a selector, it adds the project-level goals by default.
Make sure that the goals for your project are set up. The selector goals should then automatically be set to:
Maximize eps_yield
Avoid T = 400K
For a phased process like the industrial mixer reaction, a fixed-order sequence is appropriate. That means that the selector has the agent apply the skills one at a time, rather than switching back and forth between skills.
Scenarios are a key piece of successfully training an agent with the strategy pattern. Scenarios are different possible conditions represented within the simulation. Skills train to specialize in the different scenarios - for example, the Start Reaction skill specializes in controlling the reaction when the temperature and concentration levels are those found at the beginning of the reaction.
This is what allows the skills to differentiate from each other. The three specialized skills practice only on their designated phase of the process and learn to succeed in their own specific conditions. The selector practices with the whole process so that it knows which skill to choose at any point.
Go to the Scenarios page using the left-hand navigation menu. There, click Add Scenario to create a new scenario for your agent to use in training.
When you are building an agent for your own use case, you will define the scenarios based on your knowledge of the task and process. In this case, we provide the values that define the phases of the chemical manufacturing process. Create these scenarios for your agent:
Full reaction: Cref Is 8.57, Tref Is 311
Startup: Cref Is 8.5698, Tref Is 311.2612
Transition: Cref Is 8.56, Tref Is 311, Is 22
Production: Cref Is 2, Tref Is 373.1311
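For reference, if you configure the same agent through the SDK rather than the UI, each scenario is expressed as a plain dictionary of sensor reference values. A sketch, assuming the SDK's Scenario.from_dict helper and the skill's add_scenario method (names may differ by version), and reusing the hypothetical StartReactionTeacher from the earlier sketch:

```python
from composabl import Scenario, Skill  # names may vary by SDK version

# Each scenario pins the reference values for one phase of the process.
startup = Scenario.from_dict({"Cref": 8.5698, "Tref": 311.2612})

start_reaction = Skill("start_reaction", StartReactionTeacher)
start_reaction.add_scenario(startup)
```

In the UI flow of this tutorial, the equivalent step is the scenario checkboxes described below.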
Scenario flows allow you to connect scenarios that have a sequential relationship to ensure that your agent gets practice in navigating the different conditions in the order in which they will occur.
For this problem, you do not need to create sequential connections between the scenarios. Drag all the scenarios to the first column to make them available to your skills and selectors.
Once you have your scenarios set up and connected with scenario flows, you can add them to skills and selectors to tell the skills and selectors what conditions they need to practice in. This helps them to develop their specialized expertise.
In the Agent Builder Studio, click on each skill and the selector in turn. For each, click on Scenarios and then click the dropdown arrows to show the available scenarios. Check the box for each scenario to apply to the skill.
Start reaction: Startup
Transition: Transition
Produce product: Production
Selector: Full reaction
Now you are ready to train your agent and see the results. We suggest running 50 training cycles. You will see the skills training one at a time, and each skill will train for the selected number of cycles.
When the training has been completed, you can view your results in the training sessions tab in the UI. This will show you information on how well the agent is learning.
You will likely see a steep learning curve as the agent experiments with different control strategies and learns from the results. When the learning curve plateaus, that usually means that the skill is trained.
We tested this fully trained agent and plotted the results.
Conversion rate: 92% | Thermal runaway risk: Low
This agent performance is not perfect, but it stays closer to the benchmark line than either of the two single-skill agents. It just needs some help avoiding thermal runaway. We can provide that by adding a perception layer.
If you use Windows Subsystem for Linux (WSL) on a corporate machine with internal security tools, such as Zscaler, you may encounter issues installing the SDK due to certificate replacement during HTTPS traffic inspection. Below is a method to resolve this issue by importing the appropriate certificates into WSL.
Identify the Certificates
Click on the padlock symbol next to the URL to view the site certificates. This method may vary slightly depending on your browser.
Export the certificates in the chain needed to trust the "app.composabl.com" certificate, which may be replaced by Zscaler or similar security tools. The intermediate and root certificates are often issued by your organization's IT department.
Export the Certificates
Export the certificates in Base 64 encoded X.509 (PEM) format.
Make sure the exported certificate files have a .crt
extension.
Copy Certificates to WSL
Copy the exported certificates to /usr/local/share/ca-certificates
in WSL. You can use any file transfer method between Windows and WSL.
Update the Certificate Store
Run the following command to update the certificate store in WSL: `sudo update-ca-certificates`
Verify the Certificate
You can verify if the certificates are correctly recognized by running: openssl s_client -connect api.composabl.com:443 -prexit
This command should connect and print the certificate chain and other relevant information.
Removing Certificates
If you need to remove the certificates, delete them from /usr/local/share/ca-certificates
and then run: `sudo update-ca-certificates --fresh`
This command will rebuild the certificate bundle from the remaining individual certificates.
Alternative Method: Export from Windows Certificate Store
The intermediate/root certificates may already be available in the Windows certificate store. You can identify and export the required certificates from there if needed.
By following these steps, you should be able to configure your WSL environment to trust certificates replaced by corporate security tools, enabling the installation of the SDK without issues.
To train Composabl agents on a cluster, you need to install the Composabl controller onto a compatible Kubernetes cluster.
This document will guide you through setting up this cluster using various options:
Azure: Set up a cluster through our Azure Marketplace Composabl Training Cluster offer
Automated: Set up a compatible Kubernetes cluster on Azure through the use of Pulumi (IaC)
Manually: Set up a compatible Kubernetes cluster through any tool and infrastructure of your choice
The entire Composabl controller installation consists of two main components:
Controller: This component connects to the Composabl No-Code UI and allows you to initiate training on your cluster.
Historian: The historian component records detailed statistics about the training process of your agent. After training, these are available alongside the agent in the No-Code UI.
The installation of Composabl introduces two namespaces, composabl-train and composabl-sims, for training workers and simulators respectively.
For normal operation, a series of permissions is required by the ServiceAccounts of the Composabl installation.
We currently have the following guides to setting up your own cluster: