circuit/use/circuit.) These addresses can be used in conjunction with a low-level API to establish communication with or kill the underlying workers. The same API provides mechanisms for spawning new workers, verifying they run healthily (at least at process level), and returning their newly-allocated or user-assigned IDs.
Furthermore, an abstract notion of host locality is supported, whereby hosts represent abstract execution sites. In our experience, locality awareness is necessary both in low-level apps, such as a distributed database, and in high-level data processing and analysis apps.
In the circuit user interface, hosts are identified by opaque strings. The internal implementation of the responsible circuit module transparently assigns physical meaning to these host strings. In addition, of course, the user may assign their own out-of-band domain-specific meaning to them.
The default implementation included with the circuit treats abstract hosts as Internet host names. Therefore, when a new worker is requested to spawn on a given host, it is executed on the corresponding Internet host (e.g. another machine in the datacenter) and an efficient TCP-based communication channel is established between the participating parties.
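The mapping from opaque host strings to Internet hosts can be pictured with a minimal sketch. The names Host, Resolver and InternetResolver, as well as the port number, are assumptions made for illustration only; they are not taken from the circuit code base.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// Host is an opaque host identifier, as a circuit user would see it.
// (Hypothetical type, for illustration only.)
type Host string

// Resolver assigns physical meaning to a Host.
// (Hypothetical interface, not the circuit's API.)
type Resolver interface {
	Addr(h Host) (string, error)
}

// InternetResolver treats every Host as an Internet host name and pairs it
// with a fixed TCP port, yielding an address that could be dialed.
type InternetResolver struct {
	Port int
}

// Addr returns the "host:port" form of h, bracketing IPv6 literals.
func (r InternetResolver) Addr(h Host) (string, error) {
	if h == "" {
		return "", fmt.Errorf("empty host")
	}
	return net.JoinHostPort(string(h), strconv.Itoa(r.Port)), nil
}

func main() {
	var r Resolver = InternetResolver{Port: 7000}
	addr, err := r.Addr("db1.example.com")
	fmt.Println(addr, err)
}
```

A different Resolver could map the same strings to, say, rack names or cloud instance IDs without the calling code changing, which is the point of keeping host strings opaque.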
To be precise, two circuit modules are responsible for the entire process of spawning a worker and talking to it. The worker module (found in circuit/sys/worker) is responsible for logging into the remote host and executing the worker binary. A separate transport module (found in circuit/sys/transport) takes care of maintaining efficient, persistent and multiplexed connections between worker processes.
In this document we explain the operation of the default spawning mechanism. Knowledge of this mechanism is helpful in administering and debugging circuit applications, and is a starting point for developers who would like to substitute their own logic.
For example, by modifying the existing worker module only slightly, one is able to provision user spawn requests on newly allocated, on-demand Amazon spot instances and furthermore release these instances as soon as workers die.
The spawning algorithm is illustrated above. The worker initiating the spawn sequence is the parent worker and the worker to be spawned is the child worker. The spawn sequence proceeds as follows:
ssh session with the child host.
The parent worker does not start the child worker process directly (through the ssh session) because the kicker, being implemented in Go, is significantly better equipped to manage and verify correct execution of the child worker.
The main administrative requirement imposed by the spawning mechanism is:
For any pair of participating parent and child host, the parent must be pre-configured so as to be able to log into the child using a password-less