[rucksack-cvs] CVS rucksack/doc

alemmens alemmens at common-lisp.net
Mon Feb 11 13:00:12 UTC 2008


Update of /project/rucksack/cvsroot/rucksack/doc
In directory clnet:/tmp/cvs-serv24432/doc

Added Files:
	do.txt done.txt glossary.txt internals.txt notes.txt 
	talk-eclm2006.txt 
Log Message:
Moved documentation files to doc directory.


--- /project/rucksack/cvsroot/rucksack/doc/do.txt	2008/02/11 13:00:12	NONE
+++ /project/rucksack/cvsroot/rucksack/doc/do.txt	2008/02/11 13:00:12	1.1
DO: 

- Make Rucksack crash proof.  (Use a copying GC?)

- Make sure that the GC gets rid of all obsolete object versions.

- Add export/import to s-expression format.  This is necessary
  for migrating existing rucksacks to a new version of Rucksack.

- Give each transaction its own commit file (the name can be generated
  from the transaction id).  That's one step towards avoiding locks
  on transaction commit.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
* MAYBE LATER
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

- Maybe signal a continuable error when the in-memory class definition does
  not correspond to the most recent schema.  If the user decides to
  continue, UPDATE-PERSISTENT-INSTANCE-... will be called when necessary.

- Think about non-persistent slots.  Should we initialize them during
  LOAD-OBJECT?  Do we need them at all?

- I'm not sure that :INCLUDE-SUBCLASSES NIL makes sense for
  RUCKSACK-MAP-SLOT.  Think about this.

- Deal with CHANGE-CLASS: call UPDATE-PERSISTENT-INSTANCE-FOR-DIFFERENT-CLASS
  when necessary.  (Maybe it's never necessary and we can just use the
  existing UPDATE-INSTANCE-FOR-DIFFERENT-CLASS mechanism?)
--- /project/rucksack/cvsroot/rucksack/doc/done.txt	2008/02/11 13:00:12	NONE
+++ /project/rucksack/cvsroot/rucksack/doc/done.txt	2008/02/11 13:00:12	1.1
* 2008-02-11 - version 0.1.16

Created new doc directory and added tutorial by Brad Beveridge.

Added P-PUSH and P-POP.

Improved btree efficiency by switching to a different data structure
for the bindings.  Instead of using a persistent cons for each key/
value pair, we now put the keys and values directly into the bnode
vector.  This speeds up most btree operations because it reduces
persistent consing when adding new values and it reduces indirections
when searching for keys.

Renamed BTREE-NODE to BNODE, BTREE-NODE-INDEX to BNODE-BINDINGS,
BTREE-NODE-INDEX-COUNT to BNODE-NR-BINDINGS, FIND-BINDING-IN-NODE to
FIND-KEY-IN-NODE.

Fix a missing argument bug in REMOVE-CLASS-INDEX.

Added a LAZY-CACHE which just clears the entire hash table whenever
the cache gets full.  This improves memory usage, because the normal
cache queue kept track of a lot of objects that for some reason
couldn't be cleaned up by the implementation's garbage collector.

Added the convenience macros RUCKSACK-DO-CLASS and RUCKSACK-DO-SLOT.

Made RUCKSACK-DELETE-OBJECT an exported symbol of the RUCKSACK
package.

Fix a bug in TEST-NON-UNIQUE-BTREE: it should call
CHECK-NON-UNIQUE-CONTENTS instead of CHECK-CONTENTS.


* 2008-02-02 - version 0.1.15

Fixed a garbage collector bug reported by Sean Ross. When the garbage
collector deletes object ids from the object table (because the
objects are dead and we may want to reuse their ids later for other
objects), it should also remove that object from the cache.  If it
doesn't, there's a possibility that the object id will be reused later
for a new object and the cache wil still refer to the old in-memory
object.


* 2008-01-31 - version 0.1.14

Class and slot indexes now map directly to objects instead of
object-ids.  This fixes a bug where the garbage collector forgot to
add all indexed objects to the root set. (Suggested by Sean Ross.)

Increase default cache size to 100,000 objects.



* 2008-01-23 - version 0.1.13

Add Brad Beveridge's basic unit test suite (modified to work with
lisp-unit instead of 5am).

Add Chris Riesbeck's lisp-unit library to help with creating
unit test suites.

Move all tests to their own directory.

Add P-NREVERSE and P-POSITION for persistent lists.

Fix bugs in P-REPLACE and P-MAPCAR.


* 2008-01-22 - version 0.1.12

Use (ARRAY-DIMENSION buffer 0) instead of LENGTH in LOAD-BUFFER,
because we want to ignore the fill pointer here.  Thanks to Sean Ross.


* 2008-01-22 - version 0.1.11

Fix bug caused by LEAF-DELETE-KEY.  Reported and fixed by Brad
Beveridge.

Fix some typos (:VALUE should be :VALUE=) in index.lisp.


* 2008-01-16 - version 0.1.10

When deleting a key from a btree, use the BTREE-KEY= function (not
P-EQL) to determine the position of the key.  Reported and fixed
by Leonid Novikov.


* 2007-08-12 - version 0.1.9

Fix btree bug during btree-delete: if we're deleting the biggest key
from a leaf, we should update the parents so they'll use the key that
has now become the biggest.  (Henrik Hjelte.)

Try to signal an error when an incompatible value is given to indexed
slots, e.g. trying to put a string into a slot with a :symbol-index.
(Takehiko Abe)

Signal an error during when putting duplicate values into a slot for
which duplicate values are not allowed.  (Takehiko Abe)

Use BTREE-VALUE-TYPE, not BTREE-KEY-TYPE, when type checking a value
during BTREE-INSERT.  (Takehiko Abe)

Wrap COMPILE-FILE calls in a WITH-COMPILATION-UNIT to prevent
superfluous warnings about undefined functions.


* 2007-03-13 - version 0.1.8

Fix a bug in LEAF-DELETE-KEY (thanks to Henrik Hjelte).

Add RUCKSACK-DELETE-OBJECT, RUCKSACK-DELETE-OBJECTS and
RUCKSACK-ROOT-P (suggested by Henrik Hjelte).  I haven't tested these
functions yet.


* 2007-01-22 - version 0.1.7

Get rid of two SBCL compiler warnings. (Reported by Cyrus Harmon.)


* 2007-01-21 - version 0.1.6

Added serializing/deserializing of structures.  Only works on SBCL.
(Thanks to Levente Mészáros.)


* 2006-11-30

- FLET MAP-INDEXES should be LABELS MAP-INDEXES (thanks to Cyrus Harmon).

- The :EQUAL parameter for MAP-INDEX-DATA wasn't handled correctly
  for indexes with non-unique keys (reported by Cyrus Harmon).


* 2006-09-04

- Take care of some differences between the MOP implementations of Lispworks
  and SBCL.  Lispworks doesn't call (setf slot-value-using-class) in
  SHARED-INITIALIZE, but SBCL does.  Lispworks calls FINALIZE-INHERITANCE
  after a class is redefined and a new instance is created, but SBCL
  doesn't.  All tests now work for Lispworks (5.0) and SBCL (0.9.16).

- Some work on a copying GC.

* 2006-09-03

- Handle updates of in-memory persistent objects by writing a method
  for Lisp's UPDATE-INSTANCE-FOR-REDEFINED-CLASS that marks the object
  as dirty and calls Rucksack's
  UPDATE-PERSISTENT-INSTANCE-FOR-REDEFINED-CLASS.


* 2006-09-01

- Get rid of the Lispworks-specific PROCESS-A-SLOT-OPTION stuff and handle
  the slot options in a way that's compatible with AMOP.

- Removed INITARGS argument for UPDATE-PERSISTENT-INSTANCE-FOR-REDEFINED-CLASS,
  because it turns out not to be necessary (see details in notes.txt).

- Add explanation to test-index-1a.lisp about the use of
    (eval-when (:compile-toplevel :load-toplevel :execute) ...)

- Replace *RUCKSACK* by RS in test-*.lisp.


* 2006-08-31

- Write test cases for schema updates and user defined methods for
  UPDATE-PERSISTENT-INSTANCE-FOR-REDEFINED-CLASS.

- Indexing: compare the specified slot/class indexes to the indexes that
  exist in the Rucksack, *not* to the indexes specified in the previous
  version of the class definition.  Otherwise we get inconsistencies
  when we recompile class definitions from scratch with a Rucksack that
  already exists.

- Write test case for slots with redefined indexes.  This also tests
  the default method for UPDATE-PERSISTENT-INSTANCE-FOR-REDEFINED-CLASS.

* 2006-08-30

- FINALIZE-INHERITANCE: Compute slot diffs for obsolete schemas.

- More work on UPDATE-PERSISTENT-INSTANCE-FOR-REDEFINED-CLASS.

* 2006-08-29

- Partial implementation of UPDATE-PERSISTENT-INSTANCE-FOR-REDEFINED-CLASS
  & friends.

* 2006-08-29

- Example-1: indexing should still work after recompiling.

- RUCKSACK-UPDATE-SLOT-INDEXES: Remove indexes for old slots that
  don't exist anymore.

- Some work on schema updates.

- Compute persistent slots at the right moment.


* 2006-08-26

- Make sure that indexing works correctly with subclasses.
- Fix some more indexing bugs.


* 2006-08

- The class and slot indexes were normal hash tables, but they should be
  persistent objects like everything else: I replaced them by btrees.

- Get process-lock and process-unlock working on SBCL (thanks to Geoff Cant).


* 2006-08

- Save and load the index tables when closing/opening a rucksack.

- Implement the :UNIQUE slot option.

- Improve predefined slot index specs.


* 2006-08

- Add a SERIAL-TRANSACTION-RUCKSACK class that allows for only one transaction
  at a time (by using a transaction lock).  This allows for a fast track
  towards a working Rucksack implementation.  Then parallel transactions
  can be added later.

- Don't do any GC at all while a transaction is writing objects to disk.
  Instead we keep track of the amount of disk space allocated by the committing
  transaction.  Then we do a (partial) GC immediately after committing the
  transaction.
--- /project/rucksack/cvsroot/rucksack/doc/glossary.txt	2008/02/11 13:00:12	NONE
+++ /project/rucksack/cvsroot/rucksack/doc/glossary.txt	2008/02/11 13:00:12	1.1
;; $Header: /project/rucksack/cvsroot/rucksack/doc/glossary.txt,v 1.1 2008/02/11 13:00:10 alemmens Exp $

* block

A free list block on disk.  Each block has a fixed size header
(currently 8 octets).  The header is followed by a serialized integer:
if this integer is positive, it is the id of the object whose contents
are serialized in this block.  If the integer is negative, the block
belongs to a free list and is not in use; the integer's absolute value
is the size of the block (the sweep phase of the garbage collector
needs this block size).

Also used as an abbreviation for a block's heap position.


* class designator

Either a class name (i.e. a symbol) or a class.  See the CLHS glossary.


* compatible object version

The object version that's compatible with a transaction T is the most
recent version that's not younger than T.

* index spec

A non-keyword symbol (the name of an indexing class) or a list
starting with a symbol (the name of an indexing class) followed by a
plist of keywords and values (initargs for the indexing class).

Examples: BTREE, (BTREE :KEY< <  :VALUE= P-EQL).


* index spec designator

Either an index spec or the name (i.e. a keyword) of an index spec
that has been defined with DEFINE-INDEX-SPEC.

Example: :STRING-INDEX.


* object version list

The list with committed object versions.  The list is ordered by
transaction timestamp of the transaction that created/modified the
object.  The ordering is most recent transaction first.


* open transaction

A transaction that hasn't rolled back or committed yet.


* partial transaction

This is shorthand for 'partially committed transaction', i.e. a
transaction that has started a commit operation but hasn't finished it
yet.

* root object

An object that's part of the root set.

* root set

The root set for a garbage collector is the set of objects from which
all other live objects can be reached.  Any object that can not be
reached from a root object is considered dead: its disk space may be
reused by another object if necessary.


* slot designator

Either a symbol (a slot name) or a slot-definition metaobject.

--- /project/rucksack/cvsroot/rucksack/doc/internals.txt	2008/02/11 13:00:12	NONE
+++ /project/rucksack/cvsroot/rucksack/doc/internals.txt	2008/02/11 13:00:12	1.1
RUCKSACK INTERNALS


* Free list heaps

A free-list-heap starts with an 8-byte address ('disk pointer') that
points to the end of the heap.  This is followed by as many
'disk-pointers' as the heap has free lists: each disk pointer points
to the first free block on that free list.

* Object table

The object table is a free-list-heap with exactly one free list, so it
contains one free list pointer.


* The real heap

The 'real' heap contains 32 free lists, with a smallest block size of
16 (i.e. 2^4) and a largest block size of 2^(4+31), i.e.  32 GB.


* Blocks and objects
 
The heap contains blocks of different sizes (currently the block sizes
are powers of 2; starting with blocks of 16 bytes).  Each block starts
with an 8-byte header.  If the block is unoccupied, the header
contains a pointer to the next block in the free list; otherwise it
contains the size of the block.
 
The header is followed by a serialized value which is either NIL, a
positive integer or a negative integer.  If it's NIL, the block is
occupied by an object of which there is exactly one version.  If it's
a positive integer, the block is occupied by an object and the integer
is a pointer to (the heap position of) the previously saved version of
the object.  If it's negative, the block belongs to a free list and is
not in use; the integer's absolute value is the size of the block (the
sweep phase of the garbage collector needs this block size).
 
[OCCUPIED BLOCK]:
  0- 8: block size
  8-15: pointer to previous version (nil or an integer)
   .. : transaction id
   .. : object id
   .. : nr of slots
   .. : schema id
   ...: serialized slots
   ...: maybe some free space
 
[FREE BLOCK]:
  0- 8: pointer to next free block
  ..  : the negative of the block size
  ... : free space

[9 lines skipped]
--- /project/rucksack/cvsroot/rucksack/doc/notes.txt	2008/02/11 13:00:12	NONE
+++ /project/rucksack/cvsroot/rucksack/doc/notes.txt	2008/02/11 13:00:12	1.1

[102 lines skipped]
--- /project/rucksack/cvsroot/rucksack/doc/talk-eclm2006.txt	2008/02/11 13:00:12	NONE
+++ /project/rucksack/cvsroot/rucksack/doc/talk-eclm2006.txt	2008/02/11 13:00:12	1.1

[1293 lines skipped]



More information about the rucksack-cvs mailing list