Anatomy of a Multi-Agent System
Multi-agent systems have structure, just like our brains. Different parts perform different functions.
This modularity is key to building high-performing agentic systems quickly and effectively. Most complex systems, from factories to rockets to software programs, are broken into modular pieces to separate concerns, reuse design patterns, and integrate pre-built components. Like these systems, decision-making AI works best when decisions are separated into modular concerns. Modular structure makes intelligent agents easier to build, test, and maintain.
Multi-agent systems work through a feedback loop. The system gets information from the simulation or real environment and then makes a decision to adjust one or more of its control variables in some way. This creates a change that is reflected in new sensor information that the agent can then process and use to decide on the next adjustment.
While agents are being built and trained, they are connected to a simulator that gives them feedback about how their actions affect the system they are learning to control. Once an agent is deployed, the simulator is replaced by an equivalent connection to the real system.
The simulator and the multi-agent system function in a continuous feedback loop, with the system outputting actions that affect the simulation environment, and the simulator returning new information to the agent based on every decision.
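This loop can be sketched in a few lines of Python. The agent and simulator below are toy stand-ins: the decide()/step() method names and the thermostat dynamics are illustrative assumptions, not a specific framework's API.

```python
class ThermostatAgent:
    """Toy decision-maker: pushes heater power up when the room is cold."""
    TARGET = 21.0

    def decide(self, sensors):
        error = self.TARGET - sensors["temperature"]
        # Proportional control, clamped to the heater's 0-1 range.
        return {"heater_power": max(0.0, min(1.0, error * 0.5))}

class RoomSimulator:
    """Toy environment: the heater warms the room, heat leaks out each step."""
    def __init__(self):
        self.temperature = 15.0

    def reset(self):
        self.temperature = 15.0
        return {"temperature": self.temperature}

    def step(self, action):
        self.temperature += action["heater_power"] * 1.0 - 0.1
        return {"temperature": self.temperature}

agent, sim = ThermostatAgent(), RoomSimulator()
sensors = sim.reset()
for _ in range(100):                 # each pass is one iteration
    action = agent.decide(sensors)   # decision from current sensor readings
    sensors = sim.step(action)       # simulator reflects the change back
```

Each pass through the loop is one decision; the agent never sees the simulator's internals, only the sensor readings it returns.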
Sensors are the parts of the multi-agent system that take in information from the simulator or the real environment about conditions and about the results of the agent's actions. They are the equivalent of human senses – the eyes and ears. Sensor readings come from specific variables in the simulation or control system that measure various aspects of the state space.
Actions are the system's outputs, adjusting one or more of the control variables to control the process dynamically throughout the performance of the task. Each decision during training is known as an iteration, while a series of iterations to complete the task is called an episode.
The team of agents is organized in layers, with information and decisions passing through each layer in sequence.
The skill layer, or decision-making layer, is the heart of a multi-agent system. This is where the system makes its control decisions. When a specific skill agent is activated, it determines the control action the system should take.
Multi-agent systems contain multiple agents connected together according to design. Skill agents can work together in groups, sequences, hierarchies, or coordinated teams. This modularity allows the system to learn more efficiently and perform better.
You can imagine skill agents being like students on a math team who are working together to complete a set of problems. Each student performs best solving a particular kind of problem: one is good at fractions and one at decimals. Depending on the type of problem, the appropriate student will use their expertise to solve the problem and produce the answer for the team. Sometimes one student might handle the entire problem, and for other problems more than one student might need to work together.
Just as different students’ capabilities make them able to solve particular problems, different skill agents may make use of different technologies. Some types of decisions are best approached through agents that can be programmed with mathematical calculations, rules, or optimization algorithms. Others that are more complex and nonlinear can be trained using deep reinforcement learning.
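The idea of different technologies behind one shared interface can be sketched as follows. The Skill base class and both implementations are hypothetical, and the learned skill is only a stub standing in for a policy trained with reinforcement learning.

```python
from abc import ABC, abstractmethod

class Skill(ABC):
    """Common interface: every skill maps sensor readings to an action."""
    @abstractmethod
    def decide(self, sensors: dict) -> dict:
        ...

class RuleBasedSkill(Skill):
    # A programmed rule: a simple calculation is enough for this decision.
    def decide(self, sensors):
        return {"valve": 1.0 if sensors["pressure"] > 100 else 0.0}

class LearnedSkill(Skill):
    # Stub standing in for a trained policy; a real implementation would
    # call its deep reinforcement learning model here.
    def decide(self, sensors):
        return {"valve": 0.5}
```

Because both skills expose the same decide() signature, the rest of the system does not need to know which technology sits behind each decision.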
For an HVAC system regulating temperature in an office building:
Control the system during the day
Control the system at night
For a factory where responses are needed to different types of alarms:
Handle safety critical alarms (programmed with rules)
Handle simple alarms (programmed with calculations)
Handle complex alarms (learned with reinforcement learning)
For a drone autopilot:
Stabilize
Get to landing zone
Land
Avoid obstacles
For a robotic arm used to grab and stack objects:
Reach (extend the robot arm from the "elbow" and "wrist")
Move (move the arm laterally using the "shoulder")
Orient (turn the "wrist" to position the "hand")
Grasp (manipulate the "fingers" to clamp down)
Stack (move laterally while grasping)
Orchestrator skills are the supervisors for your agent. In the math team analogy, the orchestrator would be like the teacher. The teacher assesses the type of problem and assigns the right student.
In a multi-agent system, an orchestrator uses information from the sensors and perceptors to understand the scenario and then determine which skill agent is needed. Once the skill agent is called into service, it makes the decision.
For example, for HVAC control, an orchestrator would determine whether day or night control is needed, and then pass control to the appropriate skill agent. In the safety alarm example, the orchestrator determines the type of alarm and then passes the decision to the right skill agent. In the drone and robotic arm examples, the skills need to be performed in sequence. In these cases, the orchestrator assigns the appropriate skill agent as needed for each step in the process.
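Both orchestration styles can be sketched in a few lines. All class and method names here (the skills, decide(), done()) are illustrative assumptions, not a specific framework's API.

```python
class DaySkill:
    def decide(self, sensors):
        return {"setpoint": 21.0}   # comfort temperature during work hours

class NightSkill:
    def decide(self, sensors):
        return {"setpoint": 17.0}   # energy-saving temperature overnight

class HvacOrchestrator:
    """Condition-based orchestration: pick the skill that fits the scenario."""
    def __init__(self):
        self.day, self.night = DaySkill(), NightSkill()

    def decide(self, sensors):
        skill = self.day if 7 <= sensors["hour"] < 19 else self.night
        return skill.decide(sensors)

# Sequential orchestration, as in the drone and robotic arm examples:
class Reach:
    def done(self, sensors): return sensors["arm_extended"]
    def decide(self, sensors): return {"elbow_torque": 1.0}

class Grasp:
    def done(self, sensors): return sensors["object_held"]
    def decide(self, sensors): return {"finger_force": 1.0}

def sequential_orchestrator(skills, sensors):
    """Hand control to the first skill whose sub-task is not yet complete."""
    for skill in skills:
        if not skill.done(sensors):
            return skill.decide(sensors)
    return {}  # every sub-task complete: the episode is finished
```

In both cases the orchestrator never computes a control action itself; it only decides which skill agent gets to make the decision at each step.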
A perception layer is an optional enhancement layer. Perceptors process and translate sensor data into a format that can be used to make decisions. In other words, the perception layer inputs the sensor variables and outputs new variables deduced by calculation, machine learning, or other programming.
For example, if we design a team of agents as an autopilot for a drone, we might have sensor variables that measure pitch, yaw, roll (position of the drone), velocity in each of those three directions, and acceleration in each of those three directions. But what about stability? Stability is an important characteristic to understand while flying a drone, but there is no sensor variable that describes stability. It is too complex to be captured by a single sensor.
The perception layer allows us to create a variable for stability. It can be calculated using dynamics equations or trained with supervised machine learning. The new variable then becomes accessible to the rest of the multi-agent system along with the other sensor variables.
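A minimal sketch of such a perceptor, assuming made-up angular-rate sensor names and a deliberately simplified stability formula (real drone dynamics would be far more involved):

```python
import math

def stability_perceptor(sensors: dict) -> dict:
    """Derive a 'stability' variable the sensors don't provide directly."""
    # Combine the three angular-rate readings into one magnitude...
    rate = math.sqrt(sensors["pitch_rate"] ** 2 +
                     sensors["yaw_rate"] ** 2 +
                     sensors["roll_rate"] ** 2)
    # ...and map it to a 0-1 score, where 1.0 means perfectly steady.
    derived = dict(sensors)
    derived["stability"] = 1.0 / (1.0 + rate)
    return derived  # the original sensor variables plus the new one

readings = {"pitch_rate": 0.0, "yaw_rate": 0.0, "roll_rate": 0.0}
# stability_perceptor(readings)["stability"] == 1.0 when fully steady
```

Downstream skill agents and the orchestrator can then treat "stability" exactly like any other sensor variable.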
Computer Vision: A camera sensor passes image or video feeds into a perceptor module that identifies object types and locations
Auditory Perception: A microphone sensor passes machine sounds to a perceptor module that identifies which state the machine is in based on the sounds that it is making
Prediction: A perceptor module inputs quality measurements and past agent actions and predicts whether current actions will lead to acceptable quality measurements
Anomaly Detection: A perceptor module inputs market variables and detects when the market is changing regimes.
Classification and Clustering: A perceptor module inputs machine and process data and classifies which of several conditions a manufacturing line is currently in.
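As an illustration, the anomaly-detection case can be sketched with a simple statistical rule; a production perceptor might instead use a trained model, and the threshold and minimum history length here are arbitrary assumptions.

```python
class AnomalyPerceptor:
    """Flag a reading as anomalous when it sits far from the running mean."""

    def __init__(self, threshold: float = 3.0):
        self.history: list[float] = []
        self.threshold = threshold

    def process(self, value: float) -> bool:
        """Return True if the new reading looks anomalous."""
        if len(self.history) >= 10:
            mean = sum(self.history) / len(self.history)
            var = sum((x - mean) ** 2 for x in self.history) / len(self.history)
            std = (var ** 0.5) or 1e-9           # avoid dividing by zero
            anomalous = abs(value - mean) / std > self.threshold
        else:
            anomalous = False                    # not enough history yet
        self.history.append(value)
        return anomalous
```

As with the stability example, the perceptor's output (here a boolean "anomaly" flag) becomes one more variable the skill layer and orchestrator can use.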