GO’CIRCUIT
A project by Petar Maymounkov.

Programming guide

 

Table of contents

Introduction
Related things
Programming environment
The future
Basics
Activate the circuit runtime
Import API packages
Spawn
Worker functions
Spawn semantics
Daemonizing worker functions
Communicating across workers
Values
Cross-interfaces
Making cross-interfaces
Using cross-interfaces
Recursive cross-values*
Cross-runtime garbage collection*
Permanent cross-interfaces
Persisting cross-interfaces
Cross-services
Built-in services
Handling errors
Runtime linking and version compatibility*

*Sections marked with a star are advanced and can be skipped on first reading.

Introduction

The circuit can be described as a distributed operating system that sits on top of the traditional OS on multiple machines in a datacenter deployment. It provides a clean and uniform abstraction for treating an entire hardware cluster as a single, monolithic compute resource.

The circuit has two interfaces: one for application developers and low-level engineers, and one for system administrators and operations managers.

The first interface is embodied in a full integration with the Go programming language, whereby it is possible to write a distributed, multi-process application — dubbed a circuit program — within a single programming environment. This is the topic of the manual at hand. (Nothing precludes adding integration with other languages, as the circuit rests on a small and very simple networking protocol.)

The second interface is embodied in a collection of command-line utilities that enable deep, on-the-fly introspection and control over running processes. This interface can be likened to the command shell in traditional operating systems. It is documented in Command-line toolkit.

Virtually every aspect of the circuit's internal mechanics and protocols — i.e. the logic behind the linguistic facilities exposed to application developers — is easily customizable and instrumentable. For example, it takes minimal effort to tie in services like Google's Dapper for distributed call tracing, or to enable inter-connected, globally-idempotent data processing applications.

If one were to position a circuit system in a traditional datacenter stack — consisting of (1) a hardware provisioning layer, (2) a service deploy, execute and monitor layer, and (3) an application services layer — it would belong right on top of (1), obviating the need for (2). And while a circuit deployment could benefit from a complementary watchdog like monit, typically found in (2), implementing equivalent tools inside the circuit is considerably easier and less error-prone. A basic monitoring system, which can be flexibly customized to the needs of specific applications, and a basic alerting/notification system are included in the circuit distribution.

Programming environment

Before we dive into specifics, I'd like to say a word about how the circuit is integrated into the Go programming environment of the application or low-level systems engineer. The current version of the circuit is no more than a Go package library. Once it is imported in your source, your executables will automatically start a circuit runtime in the background and will therefore automatically be part of a globally-monitored and instrumented system. In fact, just importing the main circuit package (this is explained in the next section) is sufficient to turn any legacy Go application into a circuit-managed application, albeit one that does not take advantage of the full spectrum of circuit benefits.

Accessing the powerful programming tools of the circuit is currently achieved by importing and using any of a set of specialized circuit packages. We have gone to great lengths to ensure that circuit functionality flows as smoothly as possible with traditional Go semantics and that the circuit has a minimal API. We have nearly achieved this goal by heavy use of Go's flexible and performant reflection mechanism. There are very few places where a nominal amount of boilerplate is necessary. Most importantly, though, the engineer need never leave the Go programming environment to build, or even maintain, complex distributed applications.

Even the minimal boilerplate required by the circuit programmer pales in size in comparison to the amount of code usually written to achieve similar goals using traditional language stacks. The circuit provides savings across the board — starting from protocol definitions (not needed inside the circuit) and going all the way to the oodles of scripts usually written to maintain an end-to-end cloud application.

The future

That said, we are in the process of implementing a Circuit Compiler which will perform a source-to-source transform of the “circuit language” to native Go. The Circuit Language is precisely equal to the Go language with the addition of a new “spawn” operator, which behaves identically to the go operator but with the added ability to fork new goroutines on desired remote machines. This is somewhat similar to the way things are done in Erlang. We hope that the notable success of Erlang in its domain of application, and the growing popularity of Go in the Internet/datacenter systems domain, are a promising sign for the usefulness of the circuit design.

Basics

Activate the circuit runtime

Importing the package circuit/load has the side-effect of starting the Circuit Runtime as soon as your program executes and before your code starts running. This package exposes no public resources, and therefore it should always be imported unnamed:

import _ "circuit/load/cmd"

Importing this package at multiple places is OK, but not necessary. Typically, and in good style, only the main program package should make this import, thereby saying “this program is a circuit (application)”.

Import API packages

Throughout its execution, your code has access to various facilities provided by the Circuit Runtime. These facilities are exposed by the package circuit/use/circuit, like so

import "circuit/use/circuit"

There are a few other circuit/use/… packages, which provide additional access to various components of the Circuit Runtime, like network, etc. They are discussed elsewhere.

Spawn

The main operation provided by the Circuit Runtime is spawning a function on a remote (or local) host, in a new “worker” OS-process — in other words, in a new Go runtime. Spawning is performed by the library function circuit.Spawn, that is declared as follows:

Spawn(host circuit.Host, anchor []string, f circuit.Func, in ...interface{}) (out []interface{}, addr circuit.Addr, err error)

Spawn executes the function f inside a new goroutine, residing inside a new OS-process, called a worker, on the desired host machine.

  1. host indicates the physical machine where f is to be executed. The type circuit.Host is an abstract interface, intended to accommodate different notions of a “host” that can house worker processes. One implementation of circuit.Host, which we mention later, represents hosts by their TCP addresses.
  2. anchor specifies a list of directories in the anchor file system, under which the new worker will register. Briefly, the anchor file system is akin to Linux' procfs, except that in our case it keeps track of all live workers across machines involved in a circuit application. Furthermore, unlike procfs, the spawner can choose multiple paths under which the new worker will be registered. This enables complex dynamic service discovery patterns, while avoiding the semantically-fragile notion of symbolic (or other) links.
  3. f identifies a Go language function which is to be executed on the remote worker. We call it a worker function. The way in which worker functions are specified linguistically — i.e. what is the nature of the values of type circuit.Func — is explained in section Worker functions.
  4. in is a list of arguments to be passed on to f, which the user must ensure are of the correct types. All type errors will be caught at runtime, and often as soon as the program is executed (as opposed to when the worker functions are spawned).

Spawn blocks while f is being executed on the remote worker. When the worker function completes, Spawn returns the following values:

  1. The return values of f are placed in out. Their count and types will be exactly as the signature of the worker function specifies.
  2. The circuit address of the newly spawned worker is stored in addr. This is an opaque value that can be used in conjunction with various circuit API calls to introspect into, control or otherwise utilize the worker process while it is alive, and to inspect its remains post mortem using the circuit command-line toolkit.
  3. If any error occurs during execution, it is returned as err, in which case the first two return values are undefined.

Worker functions

Functions that are intended to be executed remotely using the Spawn command are called worker functions. To define a worker function, one must create a public type with a singleton public method, containing the worker function logic. (This redundant boilerplate convention is needed at the moment, since the current implementation of the circuit avoids having a dedicated circuit compiler in favor of using the Go reflection machinery. The need for boilerplate will be removed in an upcoming Circuit Compiler.)

There are no restrictions on the signature of the singleton method. The rules regarding how Go values are serialized and sent to remote worker processes are discussed in detail in chapter Values. Perhaps the most exciting part is that, via the arguments and return values of a worker function, we can not only send and receive any concrete Go type, but also send and receive “interfaces” — a way of exchanging APIs, if you will. This powerful feature allows us to leverage the full power of Go's concurrency semantics in synchronizing with functions executing on remote processes. These notions are detailed in section Cross-interfaces.

Once implemented, the worker function container type must be registered with the runtime type system of the circuit by calling circuit.RegisterFunc during program initialization. Take this worker function definition as an example:

import "circuit/use/circuit"

type MyWorkerFunc struct{}

func (MyWorkerFunc) SingletonMethod(greeting string) (response string) {
	return "Hello " + greeting
}

func init() {
	circuit.RegisterFunc(MyWorkerFunc{})
}

It is in good style to include the call to circuit.RegisterFunc in a dedicated init function, defined at the source site of the type definition. This rule is enabled by Go's ability to have multiple init definitions in the same file and package. Henceforth, the worker function container type might be referred to as the worker function type or simply as the worker function.
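
Putting the pieces together, here is a minimal sketch of spawning MyWorkerFunc from another worker. The host value and the anchor path are illustrative assumptions, and the worker function is passed as an instance of its container type, mirroring the RegisterFunc call above; consult the circuit/use/circuit godoc for how to obtain a circuit.Host value:

retrn, addr, err := circuit.Spawn(host, []string{"/tutorial/greeters"}, MyWorkerFunc{}, "world")
if err != nil {
	…	// Spawning or remote execution failed
}
response := retrn[0].(string)	// "Hello world", per the worker function's signature
_ = addr	// Circuit address of the new worker, usable with other circuit API calls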

Spawn semantics

The spawning mechanism provides flexible ways to execute both short- and long-lived worker functions, while providing clear semantics regarding the lifespan of the worker processes hosting them. We explain these here.

When Spawn is invoked,

(1) A new worker process is started on the desired host and a connection is established between the parent and child circuit runtimes.
(2) The worker (an OS-process) proceeds to:
(2a) Execute the desired worker function,
(2b) Send its return values back to the parent process, and
(2c) Die immediately after.
(3) The calling process receives the return values (or detects failure) and returns (unblocks) from the call to Spawn.

Most notably, (2c), the worker process dies as soon as the worker function completes. This simple semantic is ideal for running short-lived jobs remotely. In order to run long-lived jobs without blocking the caller, one might wrap the blocking call to Spawn in a goroutine:

done := make(chan int)
go func() {
	circuit.Spawn("host25.datacenter.net", nil, LongLivedFunc, …)
	done <- 1
}()
…
<-done // Wait for remote long-lived job to complete

This technique can do the job, but it is frequently too clunky. One rarely spawns a long-running process only to forget about it until it dies. A more common pattern — we find — is to spawn a long-running process which confirms successful startup to its spawner, and often communicates back some sort of dynamic operational parameters, before it proceeds into its long-running logic.

In order to facilitate this pattern, the spawning mechanism allows for an exception to rule (2c) which, while not strictly necessary, can simplify implementation in such cases quite a bit. Circumventing rule (2c) is embodied in the mechanism of “daemonizing” worker functions, described in the next section.

Daemonizing worker functions

Consider the following worker function definition:

import "circuit/use/circuit"

type StartServer struct{}

func (StartServer) Main() (hostport string, err error) {
	server, err := StartLocalWebServer()   // Bind a web server
	if err != nil {
		return "", err
	}
	go func() {                            // Start accepting requests in new goroutine
		for { server.AcceptNext() }
	}()
	return server.HostPort(), nil          // Communicate the URL of the server to the caller
}

func init() {
	circuit.RegisterFunc(StartServer{})
}

This is a natural programming pattern: The worker function starts a network server in a separate thread of execution, so that it can return to the caller with a description of how to find the new server. As written above, this worker function wouldn't behave as desired. As soon as execution reaches the return statement at the end, the return values will be communicated to the caller and the hosting worker process will be killed, as per line (2c) from the spawn semantics. This will kill the web server running in the anonymous goroutine, thereby defeating our goal to install a long-running web server.

To remedy this, we need to substitute the go statement with the following:

	…
	circuit.RunInBack(func() {
		for { server.AcceptNext() }
	})
	…

circuit.RunInBack(f func()) is a circuit API call, which can only be called from within the execution of a worker function. Like a go statement, it has the effect of forking the function f into a new goroutine. In addition, it instructs the circuit runtime that the hosting worker process should not be killed, as per semantic (2c), until both (i) the calling worker function returns and (ii) the argument function f returns.
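
For completeness, here is a sketch of the corrected worker function in full, combining the two snippets above (StartLocalWebServer is the same assumed helper as before):

func (StartServer) Main() (hostport string, err error) {
	server, err := StartLocalWebServer()   // Bind a web server
	if err != nil {
		return "", err
	}
	circuit.RunInBack(func() {             // Serve requests in a new goroutine, keeping the worker alive
		for { server.AcceptNext() }
	})
	return server.HostPort(), nil          // Return the server's address to the caller before the server exits
}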

Communicating across workers

There are two situations, collectively called cross calls or cross invocations, in which data is being transported out of a worker process, across the network and into another worker process: When spawning worker functions and when invoking methods of objects that live on remote workers. We have already discussed the former in section Spawn. The latter will be covered in section Cross-interfaces. Here we are going to see how the declared types of functional arguments and return values determine the manner in which Go values are sent across workers.

When performing cross calls, typed data is implicitly serialized and transported over a network, both when sending the functional arguments and when receiving the return values. It is important to remember that sent data (which, from a programming point of view, comprises typed Go values) arrives in a different runtime environment where, for example, pointers from the original environment do not make sense. To prevent programming mistakes of this nature, the circuit provides two ways of sending data to other workers, which are discussed in the following sections.

This discussion benefits from two more terms. Often it is more convenient to talk about the “sender” and “receiver” of data, so let's see how they relate to the caller and the callee in the functional (i.e. linguistic) context. Data travels across workers (and across the network) only during cross-calls. The calling worker is the sender of the functional arguments and the receiver of the return values, whereas the worker being called is the receiver of the functional arguments and the sender of the return values.

Values

If a functional argument or return value is declared of a concrete (non-interface) type, then whenever this function is cross-called, the supplied Go value will be transported across the network by value. In other words, it will be flattened recursively, serialized, and subsequently reconstructed in the same form in the receiving worker's Go runtime environment.

Let's look at an example method declaration:

func (MyReceiver) MyMethod(a *MyStruct, b int, c map[int]string, d ...string) (bool, []byte) {
	…
}

All arguments as well as the return values are of concrete types (non-interfaces). If MyMethod is cross-called, all arguments and return values will be transported by value. Most notably:

Values are transported using Go's gob package. Consequently, values that are sent during cross-calls must be gob-encodable, otherwise a panic is raised.

(If you are a gob expert, don't confuse the semantics of gob's Encode and Decode methods with those of cross-call functional arguments and return values: nil values are properly transported by the circuit. If a function f takes a pointer argument (like *T, for example) and you pass in a nil value when cross-calling f, the receiver will correctly see a nil value for the corresponding argument. In contrast, a nil value passed to gob.Encode, for example, will produce an error or panic. The same applies to slices and maps.)
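
As a hedged illustration of this requirement, a struct like the following could not be passed by value in a cross-call, because channels are not gob-encodable:

type BadArg struct {
	Results chan int // Channels (and funcs) cannot be gob-encoded; passing this by value panics
}

If you need to hand a remote worker a way to send you results, pass a cross-interface instead, as described in the next section.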

Cross-interfaces

One of the key features that makes Go so flexible is the notion of interfaces. Passing an interface is like passing a functionality (or an API). In fact, it is easy to see that passing interfaces subsumes passing functional or channel types. Indeed, one can emulate passing a function by passing an interface whose singleton method implements the desired function. Or, instead of passing a channel, one can pass an interface with appropriately defined send, receive and close methods, for example,

type ChanInt interface {
	Send(value int)
	Receive() (value int, success bool)
	Close()
}

Passing interfaces is a powerful tool. For example, one can use it to “pass open files” across different goroutines. One of the key accomplishments of the circuit's programming environment is the ability to pass interfaces in cross-calls without sacrificing any of the flexibility one is accustomed to within the Go programming environment. So let us see what this means and how it is done.

In Go, an interface is a value which refers to another underlying value, and also specifies a list of callable methods of the underlying value. In fact, at a semantic level, one might say that a non-nil interface value is simply a list of callable functions with a common underlying state.

The Circuit augments the Go language with an additional type, dubbed a cross-interface, that has the same semantic interpretation as a normal interface, with the difference that it can refer to an underlying object that lives in a process other than the currently executing one. Within the programming environment, we use the type circuit.X to hold cross-interface values. So, for example, a function that requires a cross-interface argument and returns one, would be declared like so

func (MyReceiver) MyMethod(a string, b circuit.X) (circuit.X, bool) {
	…
}

The type circuit.X is a Go interface type. It provides methods for interacting with the underlying object, discussed later. Semantically, one can think of circuit.X as “the interface{} of cross-interfaces” — the broadest cross-interface type, having no required methods. (It is currently not possible to define cross-interface types that require specific methods. This feature will be introduced with the upcoming Circuit Compiler.)

We have now seen how to declare a function that uses cross-interfaces. The next section explains how to send (pass arguments or return values) cross-interface values, as well as how to use them on the receiving end.

Making cross-interfaces

Any Go type with public methods — all of whose arguments and return values are either gob-serializable or interfaces — can be used as the underlying object of a cross-interface. The function circuit.Ref is used to create a new cross-interface with a given underlying native Go value. Usually, native values are converted to cross-interfaces right before they are passed as arguments to cross-calls or returned as functional results.

Suppose we need to implement a file system type, whose singleton method opens a local file and returns an object representing the open file. A typical Go implementation would look something like this:

func (fs *FileSystem) Open(name string) (*File, error) {
    …
    return file, nil
}

Now suppose we would like to be able to cross-call Open. If the implementation is left as is, the returned *File value will be flattened and serialized back to the caller. This is not what we want. Instead, we need to create a cross-interface to the local file object and return that cross-interface to the remote caller. The implementation requires a tiny change:

func (fs *FileSystem) Open(name string) (circuit.X, error) {
    …
    return circuit.Ref(file), nil
}

Using cross-interfaces

The native Go type circuit.X — which represents a cross-interface — provides a Call method for invoking the methods of the object underlying the cross-interface, wherever (remotely) it might be located. The signature of Call is as follows:

Call(proc string, in ...interface{}) []interface{}

The first argument proc holds the string name of the method (of the underlying object) you desire to invoke. Following it are any number of values that will be passed on as arguments. It is your responsibility to make sure the types of the arguments supplied match (or are assignable to) the types of the arguments expected. If you fail to do so, your mistake will be caught at runtime and reported with a panic.

Call blocks until the function completes and transports the return values over the network back to the invocation site, or an external (network, machine, etc.) failure condition occurs.

You will notice that Call does not return an error value. If an error condition occurs, it is returned in the form of a panic. It is up to you to recover from it, if it makes sense in your application. If a panic is thrown, there is no guarantee as to whether the invoked method was called at the destination worker. The principles behind our choice to handle errors in cross-calls via panics are explained in more detail in Handling errors.

On success, the values returned by the method invocation are placed inside the returned slice of Go interfaces. Their types will exactly match those in the method definition, so you can type-assert them confidently, unless they could be nil within the normal logic of your application.
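
Continuing the FileSystem example from the previous section, a caller holding a cross-interface xfs to a remote FileSystem could open a file like this (a sketch; the file name is illustrative, and recovering from cross-call panics is discussed in Handling errors):

retrns := xfs.Call("Open", "/var/log/app.log")
if retrns[1] != nil {
	err := retrns[1].(error)	// The second return value of Open
	…
}
xfile := retrns[0].(circuit.X)	// A cross-interface to the remote open file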

Recursive cross-values*

Thus far in this chapter (on Communicating across workers) we have seen that the circuit environment supports two types of functional arguments (and return values): concrete types and cross-interfaces. In fact, the circuit supports a “hybrid” as well. Consider this type definition:

type Hybrid struct {
	ValueField     []int
	CrossInterface circuit.X
}

When a value of type Hybrid (or *Hybrid) is passed as a functional argument, the struct itself will be sent (and received) by value, while the CrossInterface field will be handled as a cross-interface, and the receiver will be able to use it as such (and call its methods).

This logic applies recursively also for slice and array element types, as well as for map value types (map key types cannot be cross-interfaces). Here's another example to reinforce this:

type SliceElm struct {
	XInterface        circuit.X
	SliceOfXInterface []circuit.X
	OtherStuff        string
}

type TopLevel []*SliceElm
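
For instance, a worker function could accept the Hybrid type above directly; a hedged sketch, in which the Report method on the embedded cross-interface is hypothetical:

func (MyReceiver) Process(h Hybrid) bool {
	sum := 0
	for _, v := range h.ValueField {	// ValueField arrived by value
		sum += v
	}
	h.CrossInterface.Call("Report", sum)	// CrossInterface still points back to an object on the sender
	return true
}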

Cross-runtime garbage collection*

At this point in the document, the careful reader should be asking:

What happens to the underlying objects that are solely referenced by cross-interfaces at remote workers that have died?

The short answer is: The right thing happens. The objects underlying cross-interfaces behave exactly the same way other (native) Go objects behave — they are garbage-collected when no more references to them remain. The circuit tracks all workers holding a reference to a particular underlying object. It sends regular heartbeats (in an efficient manner, piggybacking on other useful network calls) and makes note of dead workers. Live workers, on the other hand, are responsible for reporting back if the cross-interface values are garbage-collected locally.

In some advanced cases (discussed in Runtime linking and version compatibility*) it is beneficial to “anchor” an object underlying a cross-interface forever. In other words, we don't want that object garbage-collected throughout the life of the worker that is hosting it. For this purpose, the circuit supports a variation on cross-interfaces, described next.

Permanent cross-interfaces

A permanent cross-interface is a cross-interface that, when created for an underlying object, marks the object as permanent, ensuring it will never be garbage-collected at its hosting worker and that outstanding permanent cross-interfaces held at other workers will always be able to find it. In particular, permanent cross-interfaces can be serialized to disk and revived later.

Permanent cross-interfaces are held inside values of the circuit.XPerm interface, which is narrower than and compatible with the circuit.X interface. In other words, a circuit.X variable can hold a circuit.XPerm value, but not the other way around. The function circuit.PermRef, applied to a Go object, returns a permanent cross-interface pointing to that object. In all other respects, circuit.XPerm behaves similarly to circuit.X.
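
As a brief sketch, continuing the file example from Making cross-interfaces:

xperm := circuit.PermRef(file)	// A permanent cross-interface; file is never garbage-collected on this worker
var x circuit.X = xperm	// A circuit.XPerm value can be used wherever circuit.X is expected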

Persisting cross-interfaces

Since the objects that underlie permanent cross-interfaces are always valid — during the life of their hosting worker — they can be persisted by one worker and re-used later by another (as long as the underlying object is still alive). Persistence is enabled by two functions — circuit.Export and circuit.Import.

The former is declared as

Export(payload ...interface{}) (exported interface{})

It accepts a list of type-unrestricted arguments and returns a (Go-exportable) native Go object that can be serialized by any Go encoding package of your choosing, like encoding/json, encoding/gob and so forth. The same rules apply to the arguments passed to Export that apply to those passed in function cross-calls. The single exception is that non-permanent cross-interfaces, circuit.X, cannot be present in any of Export's arguments or within them at any recursive level.

The inverse of Export is declared as

Import(exported interface{}) (payload []interface{}, stackTrace string, err error)

Its interface is self-explanatory except perhaps for the second return value. The stackTrace return value, resulting from a successful nil-error invocation to Import, holds the stack trace of the worker that called Export to persist the payload, at the time of the call.
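
A hedged sketch of the round trip follows; the choice of encoding, the variable names and the assertion to circuit.XPerm are illustrative assumptions:

// On the worker that owns the object:
exported := circuit.Export(circuit.PermRef(file))
…	// Serialize exported with encoding/gob or encoding/json and write it to durable storage

// Later, possibly on a different worker:
…	// Read and deserialize the stored value back into exported
payload, senderStack, err := circuit.Import(exported)
if err != nil {
	…	// The revival failed
}
xfile := payload[0].(circuit.XPerm)	// The revived permanent cross-interface
_ = senderStack	// Stack trace of the worker that called Export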

Cross-services

Permanent and non-permanent cross-interfaces alone are a sufficient abstraction for building complex distributed applications with temporally complex behavior. For example, using techniques (described later in Runtime linking and version compatibility*), one can implement systems that can be upgraded incrementally and with no downtime, eventually replacing all running workers from one generation of the system to the next. In such applications, permanent cross-interfaces can be used to persist the system-dependent state of a worker across a restart/update cycle, for example. (In fact, one could even achieve the same result by avoiding persistence and permanent cross-interfaces altogether, by utilizing only non-permanent cross-interfaces.)

Using cross-interfaces for communicating “pointers” to remote objects from one worker to another is not only powerful, but it also enables very precise information flow control (for debugging, profiling, tracing, security, etc.) due to the integration with the programming language itself (Go in this case). In contrast, traditional mechanisms based on global URLs, as in zookeeper3.datacenter.net:2081, conceal this information flow from the programming environment.

Nevertheless, we find it practical to support a circuit mechanism that allows for using traditional service semantics. We describe this mechanism next. First, it is worth noting that nothing prevents a circuit worker from listening on a POSIX network port via the standard net package and, indeed, we often do this in practice. This technique is typically used to interface between circuit applications and external technologies.

The circuit supports an additional technique for building service semantics. This technique is applicable only for circuit-to-circuit communication, which allows it to benefit from type-safety and other features of the circuit programming environment. (In traditional IP-based services, type checking has to be programmed explicitly by the application developers, in addition to a long line of other boilerplate code that almost always needs manual attention — much unlike anything in the circuit.) A circuit worker can listen for requests incoming to a service identified by a string name. To start listening on a service, use circuit.Listen, which is declared as

Listen(serviceName string, receiver interface{})

The first parameter to Listen is the service name. The second is an instance of a native Go type, whose public methods are the API of the service. To connect to a service, use circuit.Dial, which is declared as

Dial(workerAddr circuit.Addr, serviceName string) (serviceReceiver circuit.X)

The first argument, workerAddr, specifies the worker we would like to connect to. Worker addresses are obtained as return results of circuit.Spawn or, alternatively, they can be discovered by reflecting on the circuit's anchor file system that, similarly to Linux' procfs, maintains knowledge of all live workers. The second argument specifies the service name. On success, Dial returns a cross-interface to the receiver object that is listening on that service name.

Failure is communicated to the programmer via a panic thrown out of Dial, similarly to the way it is done in cross-calls. Since dialing services is usually done in higher-level application code, on occasion it is more convenient (in terms of writing error-handling code) to return dialing errors in the conventional way. For this purpose, we have included the utility function circuit.TryDial, declared as

TryDial(workerAddr circuit.Addr, serviceName string) (serviceReceiver circuit.X, err error)

TryDial behaves identically to Dial except that it never panics. Instead it returns error conditions in err.
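
As a sketch of both sides of a cross-service, where the Calculator type, its Add method and the service name "calculator" are hypothetical:

type Calculator struct{}

func (Calculator) Add(a, b int) int {
	return a + b
}

// On the serving worker (e.g. inside its worker function):
circuit.Listen("calculator", Calculator{})

// On a client worker, given the serving worker's address addr:
x, err := circuit.TryDial(addr, "calculator")
if err != nil {
	…	// The worker is unreachable or dead
}
sum := x.Call("Add", 3, 4)[0].(int)	// sum == 7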

Built-in services

By default, every circuit worker exposes a built-in service called acid, after the Plan 9 debugger. For example, you could access it using something like:

…
acidXInterface := circuit.Dial(workerAddr, "acid")
…

The acid receiver currently provides a few basic health/debugging/profiling facilities. They are documented in the godoc for package circuit/sys/acid. Here we mention some of them:

Ping()

Ping is the most basic one. A successful call to ping indicates that the destination worker was alive and healthy at some point between the cross-invocation of Ping and the time when it returned.

RuntimeProfile(name string, debug int) ([]byte, error)

RuntimeProfile exposes the runtime profile information provided by the runtime/pprof Go package at the destination worker. This facility is immensely helpful. It has enabled us to partially debug in-production applications and discover the origin of some bugs in seconds, which otherwise would have required multiple re-deploys and weeks to find.

For example, “silent hangs” are some of the hardest bugs to find without on-the-fly instrumentation. They don't panic your process with a convenient trace to the bug's expression. Being able to inspect the stack traces of all processes comprising your application in a snappy and convenient manner saves a non-trivial amount of time. The circuit distribution includes a few command-line tools (discussed in Command-line toolkit) that utilize the acid service to provide convenient interactive bug chasing.
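
A hedged sketch of fetching a goroutine dump from a remote worker via the acid service; the profile name "goroutine" and the debug level follow runtime/pprof conventions:

acid := circuit.Dial(workerAddr, "acid")
retrns := acid.Call("RuntimeProfile", "goroutine", 1)
if retrns[1] != nil {
	…	// Profiling failed on the remote worker
}
fmt.Printf("%s", retrns[0].([]byte))	// Stack traces of all goroutines on the remote worker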

CPUProfile(duration time.Duration) ([]byte, error)

CPUProfile is a proxy to the CPU profiling mechanism of package runtime/pprof. It starts CPU profiling for the specified duration and returns the resulting CPU profile in the format expected by the pprof visualization tool. (Using pprof with profiles generated by Go programs is discussed in this article.) The command-line tool 4cpu uses CPUProfile to enable remote on-the-fly profiling of in-production services without any permanent effects on their performance.

The acid service provides a growing set of other facilities as well. Its current offerings are found in the godoc for package circuit/sys/acid. Like most aspects of the circuit, it is very easy to extend the built-in interface with functions that fit your needs. You could do this by using the acid service code as a starting point.

Handling errors

A founding principle in the design of the circuit was to make cross-worker calls nearly identical, in linguistic semantics, to traditional in-process calls. Making them entirely identical is not possible and not desirable.

The key semantic difference between in-process and out-of-process (cross-) calls is that, from the point of view of the calling thread, the in-process callee will always return, albeit possibly with a software panic. The invocation of an out-of-process callee can end in one additional way: either the callee completes — returning values or panicking, just as an in-process call would — or an external failure (a network outage, a machine crash, the death of the remote worker) interrupts the call before an outcome can be obtained.

The common case is the former. For that reason, we favored the following linguistic/semantic principles for cross-calls.

Semantically, cross-calls should be identical to traditional Go function calls. A traditional Go call looks like this

retrn1, retrn2 := Receiver.Method(arg1, arg2)

A cross-call, under the current implementation of the circuit, looks like this:

retrns := XInterface.Call("Method", arg1, arg2)

Syntactically, they differ superficially. Semantically, they are the same. In particular, if the remote method returns normally, its return values are delivered at the invocation site; and if the remote method panics, the panic propagates out of Call at the invocation site.

For these two outcomes, traditional and cross-calls behave identically. For the remaining outcome, available only with cross-calls — an external error that interferes with the cross-call network protocol — the invocation of Call also panics, however it does so with a panic object that is unique to this condition.

This design offers some natural benefits. We find that typically we structure our code so that monolithic blocks of code, consisting of possibly complicated or recursive functional paths, are responsible for cross-calling into a remote worker. It is then quite convenient to recover from external panic conditions at a single high-level point. This relieves deep, application-specific code from error handling logic pertaining to external circumstances.
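
A hedged sketch of this pattern, using the standard recover mechanism (the Stats method and the error message are illustrative):

func fetchRemoteStats(x circuit.X) (stats []byte, err error) {
	defer func() {
		if p := recover(); p != nil {	// Catch worker death or other cross-call failures
			err = fmt.Errorf("cross-call failed: %v", p)
		}
	}()
	retrns := x.Call("Stats")	// May panic if the remote worker is lost
	return retrns[0].([]byte), nil
}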

This programming pattern has motivated one additional cross-call semantic: once a cross-call to a worker fails with an external error, all subsequent cross-calls to that worker fail as well.

In other words, once connectivity to a worker residing on a remote host is lost, that worker is considered permanently dead, regardless of whether this is indeed the case. This makes the programming paradigm of the circuit easy to reason about.

This semantic does not actually preclude handling network outages gracefully, without restarting worker processes on all involved hosts. We postulate that the correct way to manage network outages is outside of the programming environment and inside the circuit runtime. The circuit kernel does not implement its own physical networking. Instead it requires a transport driver to provide a circuit.Transport interface which exposes facilities for listening on and dialing to opaquely-named endpoints. For example, the current circuit distribution comes with a “default” transport layer, based on persistent TCP connections.

We envision that the correct way to handle network outages would be to conceal the failing (or failing to reconnect) TCP connection from the circuit programming environment by blocking the outstanding read/write calls, thereby delaying a declaration of a failed worker. Outstanding calls would then unblock either when the connection is successfully re-established or when the transport finally gives up and declares the worker failed.

Runtime linking and version compatibility*

We have taken great care to ensure that differently-versioned circuit binaries — and sometimes altogether different binaries — are “compatible”, in the sense that they can contact each other and invoke each other's cross-callable functions without violating the meaningful semantics of the programming environments that produced each binary, even though these environments (i.e. the source code) may not, for the most part, be aware of each other at the time of writing.

But first, let us understand in what situations this concern applies. Take as an example the circuit command-line tool 4stk. It (i) finds the worker named in its command-line arguments using the anchor file system, (ii) dials the acid service on that worker, and (iii) fetches the worker's current stack trace by calling the RuntimeProfile function of the acid service. To be able to do this, 4stk is built as a circuit application itself.

Note that the binaries of 4stk and your circuit application workers are not built from the same code base. Granted, they both utilize the circuit packages, but they might be based on different versions of the circuit sources. Nevertheless, we would like to be able to build a tool like 4stk once and use it going forward against our own, as well as third-party, circuit applications without having to rebuild it.

To make this possible, the circuit networking protocol utilizes a convention for encoding function identities that is not tied to — and, in fact, is entirely independent of — the particular build of any one binary, while at the same time being very space-efficient. In particular, every cross-callable function is uniquely determined by a 64-bit identifier that depends only on the functional signature. The signature includes the Go method name, as well as the types and the order of its arguments and return values.

This way a legacy 4stk binary will still be able to dial into a newer worker and invoke the acid service's RuntimeProfile function as long as the newer runtime has retained that function without changing its signature, while possibly changing its implementation.

From a programming standpoint, the concern of interacting with a worker whose binary might be built from a different code base can occur in three cases:

  1. When dialing into a cross-service, using circuit.Dial: This is because the dial destination — which includes a worker address and a service name — can be saved to disk when the destination worker was born, for example, and later revived and used by a newer worker compiled post factum.
  2. When cross-calling the methods of a permanent cross-interface: In this case also, the destination worker could persist a permanent cross-interface to disk, later to be revived by a newer worker.
  3. When cross-calling the methods of a non-permanent cross-interface: This case is a little different. Here if a connection to a differently versioned worker was established once (using one of the above two methods), then one could imagine, say, obtaining a non-permanent cross-interface from that worker with an underlying object on that same worker. Consequently, cross-calling the cross-interface would be interacting with a differently versioned worker.

The main importance of this observation is that it suggests a good rule for implementing circuit applications that play well in heterogeneous environments. Specifically, it suffices to handle version incompatibility errors during interactions of type (1) and (2), leaving invocations of type (3) unchecked for panics. This is because in order to be in position (3), either (1) or (2) must have happened prior.