Erlang style processes in Common Lisp

Mon Aug 3 15:12:36 UTC 2015

Creators of Erlang have a Lisp background, and one feature of the Erlang 
VM (BEAM) that I'd like back-ported into Common Lisp is their process.

An Erlang "process" is cheap to create, cheap to destroy, cheap when 
blocked, and upon exit performs bulk gc of its allocated memory; e.g., 
munmap().

Handling tens of thousands of requests per second per node isn't 
uncommon, and these often have *several* workers per request or 
connection: hundreds of thousands of processes.  Under such scenarios, 
anything less than this approach to lightweight processes might suffer 
from stalls during long gc runs that would be avoided or significantly 
reduced under Erlang's model.

How might we get equivalent cheap ephemeral processes into a 
contemporary Common Lisp implementation?

For those unfamiliar with it, what Erlang means by "process" stems from 
an implementation of Actor Model with caveats.  Each process is more 
co-routine than thread, and many may run within the same OS thread 
managed by the Erlang scheduler.  The BEAM VM pre-empts processes via 
"reduction" counting, which may be understood as unit of VM time.  Their 
famed tag line, "Let it crash," may be loosely understood in CL terms as 
an implicit HANDLER-CASE.

The open question here is to address a stated non-goal of CL-MUPROC, "we 
rely on the CL implementation's MP system" and "considerably heavier 
than Erlang processes".  [See presentation link from 
https://common-lisp.net/project/cl-muproc/ ]

Some Erlang-on-CL packages use OS threads or in the case of 
Erlang-in-Lisp, fork().  While easier for a library author to implement, 
these contradict the definition of cheap here.  Other such packages punt 
the issue altogether and instead focus only on pattern-matching aspects 
of Erlang the language.

Then there's Lisp-Flavoured-Erlang, and Elixir offers hygienic macros 
that manipulate the full AST properly-- all play nice on same Erlang VM 
at runtime-- for people simply looking to avoid Erlang syntax or 
semantics.  But let's focus on cheap Erlang style processes in Common 
Lisp here, please.

1. Semantics of library elements are straight-forward enough (e.g., 
SPAWN, EXIT) and may be borrowed wholesale and safely named within their 
own package.

2. Memory allocation & garbage collection:

Erlang BEAM VM doesn't begin to reclaim memory until very late, so an 
initial implementation here might assume ephemeral worker processes and 
omit gc until exit of a worker.  However, some processes are long-lived 
in practice.

One compromise might be acceptable: one nursery per process, but 
anything promoted to higher generations gets handled however your gc 
does it now.

This states nothing about use of shared versus multiple heaps across OS 
threads, so such matters may continue to be implementation-dependent.

3. Co-routines:

For something similar to Erlang's reductions, there would need to be a 
measure of runtime complexity per process.

However, I've used single thread co-routines for near realtime systems 
in C and CL with a loop of function pointers and ensuring that each 
referenced function executes under some threshold determined through 
experimentation and confirmed via test cases.  No pre-empting needed.  
While fragile and not easily portable across different hardware 
(including same architecture), this may be acceptable for a preliminary 
draft.

Using CL-MUPROC routines as an example and extending MUPROCN: perhaps 
its body becomes this top-level list of functions from which 
interleaving across processes may occur. Then add variations for playing 
well with DO or LOOP semantics.

4. Message-passing:

SBCL's sb-concurrency extension with Queue and Mailbox (implementation 
of "Optimistic FIFO Queue") can be the solution here too.  More aligned 
with CL principles, allowing for multiple mailboxes-- and therefore 
priority messaging-- would be a welcome advancement beyond Erlang.  
(Erlang allows only one mailbox per process: sequential but with pattern 
matching, nothing out-of-band...)

An important implementation detail that simplifies gc and increases 
cache locality across NUMA nodes: messages get duplicated when 
delivered-- each process only references its own copy!

5. (Almost) nothing shared:

Erlang enforces strict prohibition against shared memory, and common 
practice is to use an in-memory database (ETS) as key/value store in 
lieu of globals.  Scala allows but discourages shared memory.

A CL-inspired take on this might be to use SBCL's approach with thread 
creation: upon creating a new process, you get: A) global special 
values, but modify at own risk... B) LET bindings are local to each 
process; C) threads don't inherit dynamic bindings from parent.  i.e., 
http://sbcl.org/manual/index.html#Special-Variables

6. Scheduling processes on OS threads:

This is delicate in Erlang, and I've experienced specialized use cases 
interfering with their scheduler when under heavy load. Instead for our 
purposes, let the programmer handle the mapping of processes to OS 
threads.  Less is more here.

7. Finally, gen_server & OTP fiends may be implemented as their own 
packages... or skipped entirely!

Thoughts on feasibility?

Thanks,
-Daniel