Syskit Runtime Behavior
This part of the documentation deals with the Syskit "runtime model", that is, how Syskit manages a system at runtime, the underlying mechanisms of execution, and how a user can interact with a Syskit system.
Later on this page, I will give a high-level explanation of the video we've seen at the end of the Basics section. We will then get deeper into each part of Syskit that handles its runtime behavior: an overview of Syskit execution, of error representation, and of common types of errors.
This is the first part that will deal with runtime aspects. For more advanced related topics, one may want to also read all about coordination.
Actions and Jobs
In the Basics part, we've seen how to define networks of components on a profile, and that this profile could then be exposed on an action interface, which allowed us in the last part to control the system. An action is an abstract concept that represents one thing the system can do. To actually have it executed, one starts a job. Once a job has been started, it can be dropped, that is, one tells Syskit that this particular job is no longer part of the system's current goals.
All job-related commands are processed in batches: they are queued, and sent to the Syskit app and processed by it all at once. We will see the reason for this later in this section.
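To make the batching idea concrete, here is a minimal sketch in plain Ruby. It is a toy model, not the Syskit API: the JobBatch class and its start, drop and process methods are hypothetical names chosen for illustration. It only shows that commands accumulate in a queue and reach the receiver together, in a single call.

```ruby
# Toy model of batched job commands. This is NOT the Syskit API:
# JobBatch, start, drop and process are hypothetical names.
class JobBatch
  attr_reader :pending

  def initialize
    @pending = []
  end

  # Queue a request to start a job for the given action.
  def start(action_name, **arguments)
    @pending << [:start, action_name, arguments]
    self
  end

  # Queue a request to drop a running job.
  def drop(job_name)
    @pending << [:drop, job_name, {}]
    self
  end

  # Hand every queued command to the receiver in one call, then clear the queue.
  def process(receiver)
    commands = @pending
    @pending = []
    receiver.call(commands)
  end
end

# Usage: two commands are queued, and the receiver sees them as one batch.
received = []
batch = JobBatch.new
batch.drop("arm_safe_position_def")
batch.start("arm_cartesian_constant_control_def")
batch.process(->(commands) { received << commands })
```

In this sketch the receiver is called exactly once with both commands, which is the property that matters for the rest of this page.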
Scheduling and Garbage Collection
Starting the first two jobs from the IDE seemed to be a very transparent process. However, behind the scenes, even these seemingly simple actions require a number of steps, such as (in no particular order):
- connecting to the Gazebo task that handles our arm model
- creating the joint constant generator task
- connecting the two
- configuring and starting the components
The sequencing of these different actions is controlled by Syskit's scheduler. While the overall scheduler could be in principle arbitrary, Syskit internally relies on services that are currently provided by the temporal scheduler (that you had to set in the initial bundle setup). Startup of the arm_safe_position_def job looks like this:
Now, let's look at stopping things.
Suppose we have the two initial jobs running (ur10_fixed_dev and arm_safe_position_def), and we stop the latter first and then the former. Focus on the right panel, which shows the state of the "real" components (i.e. not the compositions).
When arm_safe_position_def was stopped, only the setpoint generator joint_position_setpoint was stopped. The ModelTask gazebo:empty_world:ur10_fixed is still running, because the ur10_fixed_dev action is still running and this action "depends on" the component. When we stop ur10_fixed_dev, this component is stopped as well. Finally, starting arm_safe_position_def by itself starts both components, and stopping it stops both.
Syskit maintains a set of components and compositions that are currently in use by its goals. Everything else is "not useful" and is stopped. This relies on two things: the internal relationships between compositions and components, which track the "usefulness" of a task, and a garbage collection mechanism that stops and removes not-useful tasks.
In a nutshell, so far:
- "starting" or "killing" a job is actually either adding a new goal or removing an existing goal from Syskit's goal set
- the scheduler is what actually starts things based on this goal set
- the garbage collector is what actually stops things based on this goal set
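The summary above can be sketched as a small reachability computation. What follows is a toy model in plain Ruby, not Syskit's implementation: ToyPlan and all of its methods are hypothetical names, and the dependency edges are assumptions based on the behavior described on this page (both jobs are assumed to depend on the Gazebo ModelTask). A task is treated as useful when it can be reached from a goal by following "depends on" edges.

```ruby
require "set"

# Toy model of usefulness tracking. NOT Syskit's implementation:
# the class and method names are hypothetical.
class ToyPlan
  attr_reader :running

  def initialize
    @goals = Set.new                               # jobs currently requested
    @depends_on = Hash.new { |h, k| h[k] = [] }    # task -> tasks it needs
    @running = Set.new
  end

  def depends_on(task, child)
    @depends_on[task] << child
  end

  def add_goal(job)
    @goals << job
  end

  def remove_goal(job)
    @goals.delete(job)
  end

  # Everything reachable from a goal through "depends on" is useful.
  def useful_tasks
    useful = Set.new
    queue = @goals.to_a
    until queue.empty?
      task = queue.shift
      next unless useful.add?(task) # skip tasks that were already visited
      queue.concat(@depends_on[task])
    end
    useful
  end

  # Scheduler role: start useful tasks that are not running yet.
  def schedule
    @running.merge(useful_tasks)
  end

  # Garbage collector role: stop running tasks that are no longer useful.
  def garbage_collect
    @running &= useful_tasks
  end
end

plan = ToyPlan.new
plan.depends_on("ur10_fixed_dev", "gazebo:empty_world:ur10_fixed")
plan.depends_on("arm_safe_position_def", "joint_position_setpoint")
plan.depends_on("arm_safe_position_def", "gazebo:empty_world:ur10_fixed")

plan.add_goal("ur10_fixed_dev")
plan.add_goal("arm_safe_position_def")
plan.schedule

# Dropping arm_safe_position_def only stops the setpoint generator.
plan.remove_goal("arm_safe_position_def")
plan.garbage_collect
after_first_drop = plan.running.to_a.sort

# Dropping ur10_fixed_dev stops the remaining tasks.
plan.remove_goal("ur10_fixed_dev")
plan.garbage_collect
after_second_drop = plan.running.to_a.sort
```

In this sketch the jobs themselves are stored alongside the components they depend on, which roughly mirrors the distinction between compositions and "real" components. The real scheduler also orders configuration and startup, which this model leaves out entirely.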
Transitions
One of Syskit's most important features is its ability to transparently transform the component network to build one or a combination of behaviors. We saw this interactively in the video at the end of the Basics section: the system was maintaining the arm_safe_position_def and we transitioned it into a parametrized arm_cartesian_constant_control_def to move its tip to a given Cartesian position. This entailed changing the network from a simple joint command to a network that can do Cartesian arm control. What we saw was that the transition happened smoothly: the arm remained under control during the change of system configuration.
The same mechanisms are key to autonomously transitioning between behaviors. This is how one can build coordination models.
When we transitioned from the joint control to the Cartesian control, we first queued the action start and the action drop, and then processed them at once. When we clicked Process, the two changes were processed together. That is, Syskit could understand that the intent was to stop an action and start a new one at the same time, which it handled as a transition. Generally speaking, Syskit's execution engine acts as an event loop, in which all events that are received at the same time are processed as if they happened at the same time.
What if we dropped the action first, and only then started arm_cartesian_constant_control_def? Syskit would have applied the kill first and then the start. We would have had basically the same effect as in the video we just saw, with the arm falling uncontrolled.
What if we started arm_cartesian_constant_control_def and only then dropped the existing job?
Ouch … The start command failed. This is because we tried to run two different control chains that control the same device at the same time. This is impossible, and the request is therefore rejected by Syskit's network generation.
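The three orderings discussed in this section can be compared with a small toy model in plain Ruby. This is not Syskit's network generation: ConflictError, DEVICE_OF and apply_batch are hypothetical names, and the model assumes that each job simply declares the single device it controls (here a made-up "ur10" label). The point is only that a batch is validated as a whole, so a drop and a start that arrive together never coexist.

```ruby
# Toy model of batch application. NOT Syskit's network generation:
# all names are hypothetical, and each job declares which device it controls.
class ConflictError < StandardError; end

DEVICE_OF = {
  "arm_safe_position_def" => "ur10",
  "arm_cartesian_constant_control_def" => "ur10"
}.freeze

# Apply one batch of [:start or :drop, job] commands to the list of running jobs.
# The resulting job list is checked as a whole before it is accepted.
def apply_batch(running, commands)
  dropped = commands.select { |kind, _| kind == :drop }.map(&:last)
  started = commands.select { |kind, _| kind == :start }.map(&:last)
  candidate = (running - dropped) + started

  # Reject the batch if two jobs would control the same device.
  candidate.group_by { |job| DEVICE_OF.fetch(job) }.each do |device, jobs|
    next if jobs.size == 1
    raise ConflictError, "#{jobs.join(' and ')} both control #{device}"
  end
  candidate
end

safe = "arm_safe_position_def"
cartesian = "arm_cartesian_constant_control_def"

# Drop and start in the same batch: handled as a transition.
together = apply_batch([safe], [[:drop, safe], [:start, cartesian]])

# Drop first, start later: nothing controls the arm in between.
in_between = apply_batch([safe], [[:drop, safe]])

# Start first, drop later: the first batch is rejected.
rejected = begin
  apply_batch([safe], [[:start, cartesian]])
  false
rescue ConflictError
  true
end
```

In this sketch, only the combined batch leaves exactly one job in control of the arm at every step. The real system resolves such conflicts on the generated component network rather than on a lookup table.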