Anatomy of an Agent
Intelligent autonomous agents have structure, just like our brains. Different parts perform different functions.
This modularity is key to building high-performing agents quickly and effectively. Most complex systems, from factories to rockets to software programs, are broken into modular pieces to separate concerns, reuse design patterns, and integrate pre-built components. Like these systems, decision-making AI also works best when decisions are separated into modular concerns. Modular structure makes intelligent agents easier to build, test, and maintain.
Agents work through a feedback loop. The agent gets information from the system and then makes a decision to adjust the system in some way. This creates a change that is reflected in new sensor information that the agent can then process and use to decide on the next adjustment.
While agents are being built and trained, they are connected to simulators that give them feedback about how their actions affect the system they are learning to control. Once an agent is deployed, the simulator is replaced by an equivalent connection to the real system.
The simulator and agent function in a continuous feedback loop, with the agent outputting actions that affect the simulation environment, and the simulator returning new information to the agent based on every decision.
Sensors are the part of the agent that takes in information from the simulator or the real system about conditions within the system and about the results of the agent's actions. They are the equivalent of human senses – the eyes and ears of the agent. Sensors are tied to specific variables in the simulation that measure various aspects of the state space.
Decisions are the agent’s outputs – the way the agent acts on the system. Decisions are tied to control variables that change at the agent’s direction. The agent makes decisions at a set cadence, adjusting the system dynamically throughout the performance of the task. Each decision during training is known as an iteration, while a series of iterations to complete the task is called an episode.
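The feedback loop described above can be sketched in a few lines of Python. This is a toy illustration only – the `Simulator` and `Agent` classes and their methods are hypothetical stand-ins, not part of any specific platform:

```python
# Minimal sketch of the sensor -> decision -> action feedback loop.
# Simulator and Agent are hypothetical stand-ins, not a real API.

class Simulator:
    """Toy system: a room temperature that drifts warmer over time."""
    def __init__(self):
        self.temperature = 30.0  # degrees C

    def sensors(self):
        # Sensor variables describe the current state of the system.
        return {"temperature": self.temperature}

    def step(self, action):
        # The agent's action (cooling power, 0-1) changes the system.
        self.temperature += 0.5 - 2.0 * action["cooling"]

class Agent:
    def decide(self, sensors):
        # A trivial decision rule: cool hard whenever it is too warm.
        return {"cooling": 1.0 if sensors["temperature"] > 22.0 else 0.0}

sim, agent = Simulator(), Agent()
for iteration in range(50):          # each pass through the loop is one iteration
    action = agent.decide(sim.sensors())
    sim.step(action)                 # new sensor readings arrive on the next pass
# The full series of iterations that completes the task is one episode.
```

Each pass through the loop is one iteration: the agent reads the sensors, decides, and the simulator reflects the effect of that decision back as new sensor information.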
The inside of an agent is composed of layers, with information and decisions passing through each layer in sequence.
The skills layer, or decision-making layer, is the heart of an agent. This is where your agent will make decisions about how to control the system. When a specific skill is activated, it will determine the control action the system should take.
Most successful agents contain multiple skills that you can combine according to your design. Skills can work together in groups, sequences, hierarchies, or coordinated teams. This modularity allows the agent to learn more efficiently and perform better.
You can imagine skills being like students on a math team who are working together to complete a set of problems. Each student performs best solving a particular kind of problem: one is good at fractions and one at decimals. Depending on the type of problem, the appropriate student will use their expertise to solve the problem and produce the answer for the team. Sometimes one student might handle the entire problem, and for other problems more than one student might need to work together.
Just as different students’ capabilities make them able to solve particular problems, different skills may make use of different technologies. Some types of decisions are best approached through skills that can be programmed with mathematical calculations, rules, or optimization algorithms. Others that are more complex and nonlinear can be trained using deep reinforcement learning.
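To make this concrete, here is a minimal Python sketch of two skills sharing one interface – one implemented as a programmed rule, the other wrapping a stand-in for a trained policy. All names here are illustrative assumptions, not a real API:

```python
# Two skills behind one common interface. RuleSkill is programmed;
# LearnedSkill would wrap a trained RL policy in a real agent.

class RuleSkill:
    """Programmed skill: a simple thermostat rule."""
    def decide(self, sensors):
        return {"cooling": 1.0 if sensors["temperature"] > 22.0 else 0.0}

class LearnedSkill:
    """Learned skill: delegates the decision to a trained model."""
    def __init__(self, policy):
        self.policy = policy  # in practice, a trained neural network
    def decide(self, sensors):
        return {"cooling": self.policy(sensors)}

# A placeholder "policy" standing in for a trained model: respond
# proportionally to how far the temperature is above 20 degrees.
learned = LearnedSkill(
    policy=lambda s: min(1.0, max(0.0, (s["temperature"] - 20.0) / 10.0))
)
print(RuleSkill().decide({"temperature": 25.0}))  # {'cooling': 1.0}
print(learned.decide({"temperature": 25.0}))      # {'cooling': 0.5}
```

Because both skills expose the same `decide` interface, the rest of the agent does not need to know which technology is behind each one.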
For an HVAC system regulating temperature in an office building:
Control the system during the day
Control the system at night
For a factory where responses are needed to different types of alarms:
Handle safety-critical alarms (programmed with rules)
Handle simple alarms (programmed with calculations)
Handle complex alarms (learned with reinforcement learning)
For a drone autopilot:
Stabilize
Get to landing zone
Land
Avoid obstacles
For a robotic arm used to grab and stack objects:
Reach (extend the robot arm from the "elbow" and "wrist")
Move (move the arm laterally using the "shoulder")
Orient (turn the "wrist" to position the "hand")
Grasp (Manipulate the "fingers" to clamp down)
Stack (Move laterally while grasping)
Selector skills are the supervisors for your agent. In the math team analogy, the selector would be like the teacher. The teacher assesses the type of problem and assigns the right student.
In an intelligent agent, a selector module will use information from the sensors and perceptors to understand the scenario and then determine which skill is needed. Once the selected skill is called into service, it makes the decision for the agent.
For example, for HVAC control, a selector skill would determine whether day or night control is needed, and then pass control to the appropriate skill. In the safety alarm example, the selector determines the type of alarm and then passes the decision to the right skill. In the drone and robotic arm examples, the skills need to be performed in sequence. In these cases, the selector determines where in the process the agent is and assigns the appropriate skill as needed.
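A selector can be as simple as a conditional that routes each decision to the right skill. Here is a hypothetical sketch for the HVAC example (class names, setpoints, and the day/night cutoff hours are illustrative assumptions):

```python
# Selector sketch for the HVAC example: route each decision to a day
# skill or a night skill based on the hour. Names are illustrative.

class DaySkill:
    def decide(self, sensors):
        # Keep the building comfortable while it is occupied.
        return {"setpoint": 21.0}

class NightSkill:
    def decide(self, sensors):
        # Save energy while the building is empty.
        return {"setpoint": 17.0}

class Selector:
    def __init__(self):
        self.skills = {"day": DaySkill(), "night": NightSkill()}

    def decide(self, sensors):
        # The selector reads the sensors, picks the skill, and delegates.
        mode = "day" if 7 <= sensors["hour"] < 19 else "night"
        return self.skills[mode].decide(sensors)

selector = Selector()
print(selector.decide({"hour": 14}))  # {'setpoint': 21.0}
print(selector.decide({"hour": 23}))  # {'setpoint': 17.0}
```

The same pattern extends to sequential tasks like the drone or robotic arm: the selector's condition checks where the agent is in the process rather than the time of day.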
A perception layer is an optional enhancement layer for agents. Perceptors process and translate sensor data into a format that can be used to make decisions. In other words, the perception layer of the agent inputs the sensor variables and outputs new variables derived by calculation, machine learning, or other programming.
For example, if we design an agent as an autopilot for a drone, you might have sensor variables that measure pitch, yaw, and roll (the drone's orientation), velocity in each of those three directions, and acceleration in each of those three directions. But what about stability? Stability is an important characteristic to understand while flying a drone, but there is no sensor variable that describes stability. It is too complex to be captured by a single sensor.
The perception layer allows us to create a variable for stability. It can be calculated using dynamics equations or trained with supervised machine learning. The new variable then becomes accessible to the rest of the agent along with the other sensor variables.
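As a sketch, a calculated perceptor might derive a stability value from the raw angular-rate sensors. The formula below is a made-up illustration of the pattern, not real flight dynamics:

```python
import math

# Perceptor sketch: derive a new "stability" variable from raw sensor
# variables. The formula is illustrative, not real flight dynamics.

def stability_perceptor(sensors):
    # Combine the angular rates into a single wobble magnitude...
    wobble = math.sqrt(
        sensors["pitch_rate"] ** 2
        + sensors["yaw_rate"] ** 2
        + sensors["roll_rate"] ** 2
    )
    # ...and map it to a 0-1 stability score (1.0 = perfectly steady).
    return {"stability": 1.0 / (1.0 + wobble)}

raw = {"pitch_rate": 0.0, "yaw_rate": 0.0, "roll_rate": 0.0}
enriched = {**raw, **stability_perceptor(raw)}
print(enriched["stability"])  # 1.0 for a perfectly steady drone
```

The derived variable is merged in alongside the raw sensor variables, so downstream skills and selectors can use it exactly as if a stability sensor existed.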
Computer Vision: A camera sensor passes image or video feeds into a perceptor module that identifies object types and locations.
Auditory Perception: A microphone sensor passes machine sounds to a perceptor module that identifies which state the machine is in based on the sounds that it is making.
Prediction: A perceptor module inputs quality measurements and past agent actions and predicts whether current actions will lead to acceptable quality measurements.
Anomaly Detection: A perceptor module inputs market variables and detects when the market is changing regimes.
Classification and Clustering: A perceptor module inputs machine and process data and classifies which of several conditions a manufacturing line is currently in.