[rucksack-devel] GC problems

Edi Weitz edi at agharta.de
Mon Jul 24 15:40:54 UTC 2006


Yeah, I realize I'm soliloquizing... :)

OK, I've thought about how to solve this a bit but I haven't succeeded
yet.  Here are two sketches of approaches I've tried but discarded.
Maybe someone else has an idea based on them.

In both cases I declare a mixin PERSISTENT-THING for PERSISTENT-DATA,
PERSISTENT-OBJECT, and PROXY.  The mixin provides the OBJECT-ID and
RUCKSACK slots.

1. Hook into the host (Lisp) GC (in an implementation-dependent way,
   obviously): Whenever a PERSISTENT-THING object is created, record
   its object ID in an IDS-IN-USE structure (for example a hash table)
   belonging to its rucksack.  During (Rucksack) GC, add the IDs in
   this structure to the roots.  When such an object is finalized by
   the host (Lisp), remove its ID from IDS-IN-USE.

   The idea is that (live) transient references to persistent objects
   are also traversed during the scanning phase of the GC.

2. Similar, but portable and coupled to transactions: Whenever a
   PERSISTENT-THING object is created, record its object ID in an
   IDS-IN-USE structure (for example a hash table) belonging to the
   current transaction.  During (Rucksack) GC, add the IDs in this
   structure to the roots.  IDS-IN-USE will obviously vanish once the
   transaction is finished.

   The idea is that transient references to persistent objects which
   were created in the current transaction are also traversed during
   the scanning phase of the GC.
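
To make approach 2 a bit more concrete, here is a minimal sketch in
portable Common Lisp.  It is not Rucksack code: the TRANSACTION class,
*TRANSACTION*, GC-ROOTS, and WITH-SKETCH-TRANSACTION are hypothetical
names made up for illustration, and object IDs are assumed to be plain
integers.  Only PERSISTENT-THING, OBJECT-ID, RUCKSACK, and IDS-IN-USE
come from the description above.

```lisp
;; Hypothetical stand-ins throughout; none of this is Rucksack's API.

(defclass transaction ()
  ((ids-in-use :initform (make-hash-table)
               :reader ids-in-use
               :documentation "IDs of things created in this transaction.")))

(defvar *transaction* nil
  "The current transaction, or NIL outside of one.")

(defclass persistent-thing ()
  ((object-id :initarg :object-id :reader object-id)
   (rucksack :initarg :rucksack :initform nil :reader rucksack))
  (:documentation "Mixin providing the OBJECT-ID and RUCKSACK slots."))

(defmethod initialize-instance :after ((thing persistent-thing) &key)
  ;; Record the new object's ID with the current transaction, if any.
  (when (and *transaction* (slot-boundp thing 'object-id))
    (setf (gethash (object-id thing) (ids-in-use *transaction*)) t)))

(defun gc-roots (persistent-roots)
  "Return PERSISTENT-ROOTS plus the IDs in use by the current transaction."
  (let ((roots (copy-list persistent-roots)))
    (when *transaction*
      (loop for id being the hash-keys of (ids-in-use *transaction*)
            do (pushnew id roots)))
    roots))

(defmacro with-sketch-transaction (&body body)
  "Run BODY inside a fresh transaction.  Its IDS-IN-USE table becomes
unreachable, and so vanishes, once BODY has finished."
  `(let ((*transaction* (make-instance 'transaction)))
     ,@body))
```

Inside WITH-SKETCH-TRANSACTION, creating a thing with
(make-instance 'persistent-thing :object-id 42) makes 42 show up in
the result of (gc-roots '(0)); outside of a transaction GC-ROOTS
simply returns a copy of its argument.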

Both "solutions" fail because objects are, in a way, written to disk
too late: even if an ID belongs to the root set examined by Rucksack's
GC, that does not necessarily imply that its descendants will be
marked, because references held by a still-dirty object may not have
reached the disk yet.  At the very least, one would have to change the
way the GC scanning phase works, perhaps taking the dirty queue into
account somehow.
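
The "written too late" problem can be shown with a toy model.  This is
only a sketch under simplifying assumptions, not how Rucksack's GC is
implemented: the "disk" and the "dirty queue" are modelled as hash
tables from an object ID to a list of child IDs, and MARK-FROM-ROOTS
and DEMO-LATE-WRITE are hypothetical names.  Letting pending writes
shadow the disk contents is just one conceivable reading of "taking
the dirty queue into account".

```lisp
;; Toy model, not Rucksack code: DISK maps an object ID to the list of
;; child IDs that were last written out for that object.

(defun mark-from-roots (roots disk)
  "Return the sorted list of IDs reachable from ROOTS via DISK."
  (let ((marked (make-hash-table))
        (stack (copy-list roots)))
    (loop while stack
          do (let ((id (pop stack)))
               (unless (gethash id marked)
                 (setf (gethash id marked) t)
                 ;; Only references that have reached DISK are followed.
                 (dolist (child (gethash id disk))
                   (push child stack)))))
    (sort (loop for id being the hash-keys of marked collect id) #'<)))

(defun demo-late-write ()
  "Return two values: IDs marked from disk alone, and IDs marked when
pending writes are overlaid on the disk contents."
  (let ((disk (make-hash-table))
        (dirty (make-hash-table)))
    ;; Objects 1 and 2 are both on disk, neither with any children.
    (setf (gethash 1 disk) '()
          (gethash 2 disk) '())
    ;; In memory, object 1 now refers to object 2, but that change is
    ;; still waiting in the dirty queue and has not been written yet.
    (setf (gethash 1 dirty) '(2))
    (values
     ;; Marking from disk alone misses object 2 although 1 is a root.
     (mark-from-roots '(1) disk)
     ;; Overlay the pending writes on the disk contents and mark again.
     (let ((view (make-hash-table)))
       (maphash (lambda (id children) (setf (gethash id view) children))
                disk)
       (maphash (lambda (id children) (setf (gethash id view) children))
                dirty)
       (mark-from-roots '(1) view)))))
```

In the first result object 2 is missing even though its parent is a
root, so a sweep would reclaim it; in the second result it survives.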

In addition, the first solution fails because TRANSACTION-COMMIT is
usually not called in the lexical contour that declared the transient
references.  I have no idea whether, or how, this could be tackled.

Again, all this gets even more complicated if we think about
concurrent transactions...

Some "meta-ideas" about the GC:

A. We could abandon the idea of a concurrent GC and only run the GC
   when there's no transaction, similar to what Plob! does.

B. We could abandon the idea of roots and reachability and instead
   explicitly delete objects, similar to what AllegroCache does.

Hmmm....


