[elephant-cvs] CVS update: elephant/NOTES

blee at common-lisp.net
Sun Sep 19 17:39:59 UTC 2004


Update of /project/elephant/cvsroot/elephant
In directory common-lisp.net:/tmp/cvs-serv26826

Modified Files:
	NOTES 
Log Message:
updates

Date: Sun Sep 19 19:39:59 2004
Author: blee

Index: elephant/NOTES
diff -u elephant/NOTES:1.4 elephant/NOTES:1.5
--- elephant/NOTES:1.4	Mon Aug 30 23:37:36 2004
+++ elephant/NOTES	Sun Sep 19 19:39:59 2004
@@ -3,10 +3,10 @@
 GENERAL
 -------
 
-this has been optimized for use with CMUCL.  it has been
-tested and somewhat optimized for allegro.  SBCL and OpenMCL
-are definitely also desired targets.  Lispworks is a target
-as well but less so: i don't have access to it.
+this has been optimized for use with CMUCL / SBCL.  it has
+been tested and somewhat optimized for allegro.  OpenMCL is
+definitely also a target.  Lispworks is a target as well but
+less so: i don't have access to it.
 
 Theoretically one can port this to any lisp with a decent
 FFI and MOP.  However since those are two of the less
@@ -46,6 +46,10 @@
 slot-boundp-using-class inside of shared-initialize, which
 necessitates some work.
 
+CMUCL doesn't do non-standard allocation types correctly, so
+we've created our own slot definition keyword :transient.
+In the future this will change.
+
 Andrew will add some notes here in the future.
 
 -----------
@@ -89,8 +93,25 @@
 over ordinary hash-tables from the point of view of
 persistence.
 
-TODO: programmatic way to create secondary indicies
-(probably Lisp-level, since FFI callbacks are nasty.)
+There is a separate table for BTrees.  This is because we
+use a hand coded C function for sorting, which understands a
+little of the serialized data.  It can handle numbers (up to
+64-bit bignums -- they are approximated by floats) and
+strings (case-insensitive for 8-bit, code-point-order for
+16-bit Unicode.)  It should be fast but we don't want a
+performance penalty on objects.
+
+Secondary indices are mostly handled on the lisp side,
+because of our weird table layout (see below) and to avoid
+crossing FFI boundaries.  Some unscientific microbenchmarks
+indicated that there was no performance benefit on CMUCL /
+SBCL, and only minor benefit (asymptotically nil) on
+OpenMCL.  They have a separate table.  Actually two handles
+are opened on this table: one which is plain, and one which
+is associated to the primary btree table by a no-op indexing
+function.  Since we maintain the secondary keys ourselves,
+the associated handle is good for gets / cursor traversals.
+We use the unassociated handle for updates.
 
 ----------
 CONTROLLER
@@ -142,13 +163,15 @@
 
 OID + Slot ID
 
-Collections use
+Collections use 2 tables, one for primaries and one for
+secondaries (which supports duplicates.)  They are keyed on
 
 OID + key
 
-the root object is a btree with OID = 0.  Since keys are
+The root object is a btree with OID = 0.  Since keys are
 lexicographically ordered, this will create cache locality
-for items in the same persistent object / collection.
+for items in the same persistent object / collection.  We
+use a custom C sorter for the btree tables.
 
 Other layout options:
 
@@ -214,7 +237,7 @@
 CMUCL's consing dpb/ldb arithmetic means serializing bignums
 conses (but they shouldn't have to!)  Serializing everything
 else should not cons (with the exception of maybe symbols
-and pathnames.)
+and pathnames.)  SBCL seems much better with this.
 
 Deserialization of fixnums is non-consing.  floats appear to
 cons on CMUCL, i'm not sure if this is just because of
@@ -300,15 +323,17 @@
 pointer-arithmetic is bignum and therefore consing.
 
 TODO: write faster, lispier versions of the
-pointer-arithmetic functions.  (Definitely possible under
-OpenMCL; maybe possible using SAP arithmetic under CMUCL.
-Dunno about Allegro, Lispworks.)
+pointer-arithmetic functions.  This is done for CMUCL /
+SBCL.  (Definitely possible under OpenMCL.  Dunno about
+Allegro, Lispworks.)
 
 CMUCL et al can't do dynamic-extent buffers, so we use
 globals bound to specials, which should be thread-safe if
 properly initialized.  While we provide functions talk to
-the DB using strings, Elephant itself only uses foreign char
-buffers.
+the DB using strings, Elephant itself only uses
+"buffer-streams", which are structures which have a
+stream-like interface to foreign char buffers for reading /
+writing C datatypes.
 
 Lispworks is much happier passing back and forth statically
 allocated lisp arrays.  since the general string will almost
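
Illustration (not part of the commit): the NOTES text in the diff above
describes a hand-coded C sort function that compares numbers (bignums
approximated by floats) and 8-bit strings case-insensitively, with
collection entries keyed on OID + key. A minimal sketch of such a
comparator follows. The struct layout, the tag values, the rule that
numbers sort before strings, and all names (demo_key, demo_key_compare,
cmp_string8_nocase) are assumptions made for illustration; they are not
Elephant's actual serialization format or code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, already-decoded key. Elephant's real sorter works on
 * serialized bytes; this layout is an assumption for illustration. */
enum key_tag { KEY_NUMBER = 1, KEY_STRING8 = 2 };

struct demo_key {
    uint32_t     oid;      /* OID of the owning btree / collection */
    enum key_tag tag;      /* which of the fields below is meaningful */
    double       number;   /* numeric keys, compared as floats */
    const char  *text;     /* 8-bit string keys (not NUL-terminated) */
    size_t       text_len; /* length of text in bytes */
};

/* Case-insensitive comparison of two 8-bit strings of given lengths.
 * A shorter string that is a prefix of a longer one sorts first. */
static int cmp_string8_nocase(const char *a, size_t alen,
                              const char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;
    for (size_t i = 0; i < n; i++) {
        int ca = tolower((unsigned char)a[i]);
        int cb = tolower((unsigned char)b[i]);
        if (ca != cb)
            return ca < cb ? -1 : 1;
    }
    if (alen == blen)
        return 0;
    return alen < blen ? -1 : 1;
}

/* Order by OID first, so all entries of one collection are adjacent
 * (the cache-locality point in the notes), then by key. Returns a
 * negative, zero, or positive value like strcmp. */
int demo_key_compare(const struct demo_key *a, const struct demo_key *b)
{
    assert(a != NULL && b != NULL);
    if (a->oid != b->oid)
        return a->oid < b->oid ? -1 : 1;
    if (a->tag != b->tag)              /* assumed: numbers before strings */
        return a->tag < b->tag ? -1 : 1;
    if (a->tag == KEY_NUMBER) {
        if (a->number < b->number) return -1;
        if (a->number > b->number) return 1;
        return 0;
    }
    return cmp_string8_nocase(a->text, a->text_len, b->text, b->text_len);
}
```

Comparing the OID before the key is what keeps a collection's entries
contiguous in the table; the key comparison only decides the order
within one collection.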




