GO’CIRCUIT
A project by Petar Maymounkov.

Underlying spawning mechanism

 
The main linguistic tool in the circuit is starting a goroutine, called a worker function, in a new remote worker process. Underlying this is a lower-level mechanism for spawning a remote worker process and establishing contact with it. The design of this spawning mechanism is the topic of this chapter.

Design

Worker and host abstractions

Within the runtime exposed to the user, the circuit presents live workers in a clean, uniform abstraction. Workers are identified by unique 64-bit IDs called addresses. (See type Addr in circuit/use/circuit.) These addresses can be used in conjunction with a low-level API to establish communication with the underlying workers or to kill them. The same API provides mechanisms for spawning new workers, verifying that they are running healthily (at least at the process level), and returning their newly allocated or user-assigned IDs.

Furthermore, an abstract notion of host locality is supported, whereby hosts represent abstract execution sites. Our practice shows that locality awareness is necessary both in low-level apps, such as a distributed database, and in high-level data processing and analysis apps.

In the circuit user interface, hosts are identified by opaque strings. The internal implementation of the responsible circuit module transparently assigns physical meaning to these host strings. In addition, of course, the user may assign their own out-of-band domain-specific meaning to them.
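
The abstraction described above can be pictured with a toy, in-memory model. This is only a sketch under stated assumptions: the names WorkerID, Host, registry, Spawn, Kill and HostOf are hypothetical and are not the circuit's actual API (in particular, WorkerID is not the Addr type from circuit/use/circuit). It shows only the shape of the idea: workers carry 64-bit IDs, hosts are opaque strings, and spawn and kill operate on those two notions.

```go
package main

import (
	"errors"
	"fmt"
)

// WorkerID is a hypothetical stand-in for a 64-bit worker address.
// It is not the circuit's Addr type.
type WorkerID uint64

// Host is an opaque execution-site name. This sketch assigns it no
// physical meaning; an implementation or the user may do so.
type Host string

// registry is a toy, in-memory model of the low-level worker API.
type registry struct {
	next    WorkerID
	workers map[WorkerID]Host
}

func newRegistry() *registry {
	return &registry{next: 1, workers: make(map[WorkerID]Host)}
}

// Spawn records a new worker on host h and returns its newly allocated ID.
func (r *registry) Spawn(h Host) WorkerID {
	id := r.next
	r.next++
	r.workers[id] = h
	return id
}

// HostOf reports the host a live worker was spawned on.
func (r *registry) HostOf(id WorkerID) (Host, bool) {
	h, ok := r.workers[id]
	return h, ok
}

// Kill removes a worker, or fails if the ID is not live.
func (r *registry) Kill(id WorkerID) error {
	if _, ok := r.workers[id]; !ok {
		return errors.New("no such worker")
	}
	delete(r.workers, id)
	return nil
}

func main() {
	r := newRegistry()
	id := r.Spawn("rack-7") // the host string is opaque to the caller
	if h, ok := r.HostOf(id); ok {
		fmt.Printf("worker %d runs on host %q\n", id, h)
	}
	if err := r.Kill(id); err != nil {
		fmt.Println("kill failed:", err)
	}
}
```

Keeping the host a plain string in the interface is what lets a different implementation map the same name to a machine, a rack, or a cloud instance without changing user code.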

Implementation

The default implementation included with the circuit treats abstract hosts as Internet host names. When a new worker is requested to spawn on a given host, it is therefore executed on the corresponding Internet host (e.g. another machine in the datacenter), and an efficient TCP-based communication channel is established between the participating parties.

To be precise, two circuit modules are responsible for the entire process of spawning a worker and talking to it. The worker module (found in circuit/sys/worker) is responsible for logging into the remote host and executing the worker binary. A separate transport module (found in circuit/sys/transport) takes care of maintaining efficient, persistent and multiplexed connections between worker processes.

In this document we explain the operation of the default spawning mechanism. Knowledge of this mechanism is helpful in administering and debugging circuit applications, and is a starting point for developers who would like to substitute their own logic.

For example, by modifying the existing worker module only slightly, one is able to provision user spawn requests on newly allocated, on-demand Amazon spot instances and furthermore release these instances as soon as workers die.

 

Spawning mechanism

Illustration of spawning mechanism.

The spawning algorithm is illustrated above. The worker initiating the spawn sequence is the parent worker and the worker to be spawned is the child worker. The spawn sequence proceeds as follows:

  1. The parent worker establishes an ssh session with the child host.
  2. Within this session, the parent worker executes the worker binary (already installed on the child host by a prior deploy) in what we call kicker mode. We call the resulting process the kicker.
  3. The kicker is a short-lived process, whose purpose is to start the worker process, which is housed in the same executable file as the kicker for sheer convenience.

    The parent worker does not start the child worker process directly (through the ssh session) because the kicker, being implemented in Go, is significantly better equipped to manage and verify correct execution of the child worker.

  4. Once the child worker is running and ready to accept circuit network interactions, the kicker returns various operational parameters, such as the child worker's address, back to the parent worker. Immediately afterwards, the kicker dies.
  5. Having obtained the child worker's operating parameters, the parent worker establishes a circuit-specific communication channel to the child worker (using the transport module).
  6. Subsequently, the language-level runtimes of the parent and child engage in high-level interaction.

 


Administrative requirements

The main administrative requirement imposed by the spawning mechanism is:

For any pair of participating parent and child hosts, the parent must be pre-configured so as to be able to log into the child using a password-less ssh session.
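
One way to verify this requirement before deploying is to attempt a non-interactive login from each parent host to each child host. The sketch below is an illustrative helper, not part of the circuit: it shells out to the OpenSSH client with BatchMode=yes, which makes ssh fail instead of prompting for a password, and runs the trivial remote command true. The function names and the 5-second timeout are assumptions.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// sshCheckArgs builds the ssh arguments for a non-interactive login test.
// BatchMode=yes disables password prompts, so the command fails unless
// key-based (password-less) login is already configured.
func sshCheckArgs(host string) []string {
	return []string{
		"-o", "BatchMode=yes",
		"-o", "ConnectTimeout=5",
		host, "true",
	}
}

// canLogin returns nil if a password-less ssh session to host succeeds.
func canLogin(host string) error {
	return exec.Command("ssh", sshCheckArgs(host)...).Run()
}

func main() {
	// Usage: run on a parent host, passing the child hosts as arguments.
	for _, host := range os.Args[1:] {
		if err := canLogin(host); err != nil {
			fmt.Printf("%s: password-less login failed: %v\n", host, err)
			continue
		}
		fmt.Printf("%s: ok\n", host)
	}
}
```

Because any worker may act as a parent, the check is most useful when run from every host that may spawn workers, against every host that may receive them.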