Submit a Training Job through the UI
Composabl agents use Kubernetes clusters to train at scale. A cluster is a collection of computers that work on large tasks simultaneously. This provides enough compute to complete large training tasks as efficiently as possible.
Composabl offers two options for cluster training:
Use Composabl's Training as a Service offering to train on our clusters
Use your own compute clusters through Azure, AWS or another provider
Before you submit your job for training on a cluster, make sure that your agent is fully configured and all the parameters have been set. That means checking all the agent components:
Goals
Perceptors
Selectors, including goals for learned selectors and scenarios
Skills, including goals for learned skills
Scenarios, including scenario flows
Any component of the agent with a warning sign is not fully configured and not ready for training. Go back to edit that agent component and make sure that all of the fields are filled out.
You can train on your own cluster or on Composabl’s clusters using training as a service (TaaS) credits. If you want to use Composabl’s clusters, ensure that you have credits available.
To train on your own cluster, make sure that you have set your cluster up and installed Composabl successfully.
Click Train and then choose the cluster option in the menu. You will then have the option to configure your training session.
Training session configuration options are the same whether you’re using TaaS or training on your own cluster.
A training cycle is a complete pass through the entire task, with the agent continuing until it reaches success or another stopping criterion. Your agent trains its skills one at a time, each for the selected number of training cycles, starting from the bottom of the agent design.
A training cycle typically involves about 1,000 agent decisions. Depending on the complexity of the task, agents may need to complete anywhere between 100 and several thousand training cycles to become proficient.
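To get a feel for the scale involved, the figures above can be turned into a rough estimate. The following is a minimal sketch, not part of the Composabl SDK: the function name `estimated_decisions` and the constant are hypothetical, and the 1,000-decisions figure is only the approximate value quoted above.

```python
# Approximate number of agent decisions in one training cycle (figure quoted above).
DECISIONS_PER_CYCLE = 1_000


def estimated_decisions(training_cycles: int,
                        decisions_per_cycle: int = DECISIONS_PER_CYCLE) -> int:
    """Rough estimate of the total agent decisions made over a training run."""
    if training_cycles < 0 or decisions_per_cycle < 0:
        raise ValueError("cycle and decision counts must be non-negative")
    return training_cycles * decisions_per_cycle


if __name__ == "__main__":
    # Compare a short run with longer ones.
    for cycles in (100, 1_000, 5_000):
        print(f"{cycles} cycles: about {estimated_decisions(cycles):,} decisions")
```

Real decision counts vary with the task and the stopping criteria, so treat the result as an order-of-magnitude guide only.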
You can run multiple simulators in parallel to speed up training. If you run more than one simulator during a training session, the number of training cycles you select is multiplied by the number of simulators. For example, 5 training cycles with 3 simulators gives 15 training cycles in total.
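The simulator arithmetic can be sketched in a few lines. This is an illustration of the multiplication described above, not Composabl code; the function name `total_training_cycles` is hypothetical.

```python
def total_training_cycles(selected_cycles: int, simulators: int) -> int:
    """Total training cycles when several simulators run in parallel."""
    if selected_cycles < 0:
        raise ValueError("selected_cycles must be non-negative")
    if simulators < 1:
        raise ValueError("at least one simulator is required")
    return selected_cycles * simulators


if __name__ == "__main__":
    # 5 selected training cycles with 3 simulators
    print(total_training_cycles(5, 3))  # 15
```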
You can use the Advanced Configuration to choose how powerful each machine running a simulator should be. If you choose Small, each training cycle you select results in one completed training cycle. If you choose GPU, you get 4 completed training cycles for each training cycle you select.
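The effect of the machine size can be sketched as a simple lookup. This is a minimal illustration, not Composabl code: the dictionary and function names are hypothetical, and only the two sizes with stated multipliers (Small and GPU) are included. How this multiplier combines with the number of simulators is not modeled here.

```python
# Completed training cycles per selected training cycle, for the sizes described above.
# Other machine sizes are deliberately left out because their multipliers are not stated.
CYCLES_PER_SELECTED_CYCLE = {
    "Small": 1,
    "GPU": 4,
}


def cycles_for_machine_size(selected_cycles: int, machine_size: str) -> int:
    """Training cycles completed for a given selection and machine size."""
    if selected_cycles < 0:
        raise ValueError("selected_cycles must be non-negative")
    try:
        multiplier = CYCLES_PER_SELECTED_CYCLE[machine_size]
    except KeyError:
        raise ValueError(f"no known multiplier for machine size {machine_size!r}") from None
    return selected_cycles * multiplier


if __name__ == "__main__":
    print(cycles_for_machine_size(10, "Small"))  # 10
    print(cycles_for_machine_size(10, "GPU"))    # 40
```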
More training cycles running simultaneously will speed up training but also increase costs. How long your training takes also depends on the complexity of your agent and your simulator.
When you have configured your settings correctly, click Start Training.
You will then be taken to the Training Sessions page. There you can follow the agent training progress by viewing the real-time plots or the console output.
Note that it will take a few minutes for the visualization to begin.