[rucksack-devel] GC problems
Edi Weitz
edi at agharta.de
Mon Jul 24 15:40:54 UTC 2006
Yeah, I realize I'm soliloquizing... :)
OK, I've thought about how to solve this a bit but I haven't succeeded
yet. Here are two sketches of approaches I've tried but discarded.
Maybe someone else has an idea based on them.
In both cases I declare a mixin PERSISTENT-THING for PERSISTENT-DATA,
PERSISTENT-OBJECT, and PROXY. The mixin provides the OBJECT-ID and
RUCKSACK slots.
1. Hook into the host (Lisp) GC (in an implementation-dependent way,
obviously): Whenever a PERSISTENT-THING object is created, record
its object ID in an IDS-IN-USE structure (for example a hash table)
belonging to its rucksack. During (Rucksack) GC, add the IDs in
this structure to the roots. When such an object is finalized by
the host (Lisp), remove its ID from IDS-IN-USE.
The idea is that (live) transient references to persistent objects
are also traversed during the scanning phase of the GC.
2. Similar, but portable and coupled to transactions: Whenever a
PERSISTENT-THING object is created, record its object ID in an
IDS-IN-USE structure (for example a hash table) belonging to the
current transaction. During (Rucksack) GC, add the IDs in this
structure to the roots. IDS-IN-USE will obviously vanish once the
transaction is finished.
The idea is that transient references to persistent objects which
were created in the current transaction are also traversed during
the scanning phase of the GC.
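To make the two bookkeeping schemes above concrete, here is a rough toy model. It is in Python rather than Lisp and is not Rucksack code: the Rucksack, Transaction, and PersistentThing classes and the gc_roots method are hypothetical stand-ins, and weakref.finalize plays the role of the implementation-dependent host finalizer from approach 1.

```python
import gc
import weakref


class Rucksack:
    """Toy stand-in for a rucksack: only tracks IDs and extra GC roots."""

    def __init__(self):
        self.next_id = 0
        self.persistent_roots = set()
        self.ids_in_use = set()  # approach 1: one set per rucksack

    def gc_roots(self, transaction=None):
        # Roots for the persistent GC: the real roots plus transient IDs.
        roots = self.persistent_roots | self.ids_in_use
        if transaction is not None:
            roots |= transaction.ids_in_use
        return roots


class Transaction:
    """Approach 2: the ID set lives and dies with the transaction."""

    def __init__(self):
        self.ids_in_use = set()


class PersistentThing:
    """Mixin-like base providing the object ID and rucksack slots."""

    def __init__(self, rucksack, transaction=None):
        self.rucksack = rucksack
        self.object_id = rucksack.next_id
        rucksack.next_id += 1
        if transaction is None:
            # Approach 1: register the ID, and unregister it when the
            # host GC finalizes this in-memory object.
            rucksack.ids_in_use.add(self.object_id)
            weakref.finalize(self, rucksack.ids_in_use.discard,
                             self.object_id)
        else:
            # Approach 2: register with the current transaction only.
            transaction.ids_in_use.add(self.object_id)


rs = Rucksack()
thing = PersistentThing(rs)   # ID 0 is now an extra root
live_roots = rs.gc_roots()
del thing                     # drop the only transient reference
gc.collect()
roots_after_finalization = rs.gc_roots()
```

In approach 1 the extra root disappears as soon as the host collector finalizes the transient object; in approach 2 it stays until the Transaction object itself is dropped.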
Both "solutions" fail because objects are in a way written to disk too
late - even if an ID belongs to the root set examined by Rucksack's
GC, that does not necessarily imply that its descendants will be
marked. At the very least, one would have to change the way the GC
scanning phase works, perhaps taking the dirty queue into account
somehow.
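The failure mode, and one guess at what "taking the dirty queue into account" might look like, can be shown with a small mark phase. This is only a sketch under assumptions: the disk and dirty dictionaries and the mark function are invented for illustration and do not reflect how Rucksack's scanning phase is actually written.

```python
def mark(roots, disk, dirty=None):
    """Return the set of object IDs reachable from roots.

    disk  maps object ID -> child IDs as last written to disk.
    dirty maps object ID -> child IDs of objects changed in memory but
          not yet flushed (a stand-in for the dirty queue).
    """
    dirty = dirty or {}
    marked = set()
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid in marked:
            continue
        marked.add(oid)
        # Prefer the in-memory version when one exists.
        children = dirty[oid] if oid in dirty else disk.get(oid, [])
        stack.extend(children)
    return marked


# Object 1 was created in memory and refers to object 2, but object 1
# has not been written yet; on disk only object 2 exists.
disk = {2: []}
dirty = {1: [2]}

on_disk_only = mark({1}, disk)       # object 2 is never reached
with_dirty = mark({1}, disk, dirty)  # object 2 is kept alive
```

Even with ID 1 in the root set, a scan that reads only the on-disk graph cannot find the reference to 2, which is the "written too late" problem.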
In addition, the first solution fails because TRANSACTION-COMMIT is
usually not called in the lexical contour that declared the transient
references. I have no idea whether or how this could be tackled.
Again, all this gets even more complicated if we think about
concurrent transactions...
Some "meta-ideas" about the GC:
A. We could abandon the idea of a concurrent GC and only run the GC
when there's no transaction, similar to what Plob! does.
B. We could abandon the idea of roots and reachability and instead
explicitly delete objects, similar to what AllegroCache does.
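Idea A boils down to a gate in front of the collector. A minimal sketch, assuming a simple count of open transactions; GcGate and its methods are hypothetical names, and this is not meant to describe how Plob! actually does it.

```python
import threading


class GcGate:
    """Lets the collector run only while no transaction is open."""

    def __init__(self):
        self._lock = threading.Lock()
        self._open_transactions = 0

    def begin_transaction(self):
        with self._lock:
            self._open_transactions += 1

    def end_transaction(self):
        with self._lock:
            self._open_transactions -= 1

    def try_collect(self, collect):
        # Holding the lock while collecting also keeps new transactions
        # from starting in the middle of a collection.
        with self._lock:
            if self._open_transactions:
                return False
            collect()
            return True
```

The price is that a long-running transaction postpones collection indefinitely, and every transaction start blocks while a collection is in progress.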
Hmmm....