[elephant-cvs] CVS elephant/doc
ieslick
ieslick at common-lisp.net
Sun Apr 1 20:22:24 UTC 2007
Update of /project/elephant/cvsroot/elephant/doc
In directory clnet:/tmp/cvs-serv22287
Modified Files:
tutorial.texinfo user-guide.texinfo
Log Message:
Documentation changes, mostly to transaction section of tutorial
--- /project/elephant/cvsroot/elephant/doc/tutorial.texinfo 2007/04/01 14:33:29 1.11
+++ /project/elephant/cvsroot/elephant/doc/tutorial.texinfo 2007/04/01 20:22:24 1.12
@@ -732,6 +732,8 @@
transaction that performs all the updates atomically and thus
enforcing consistency.
+@subsection Why do we need Transactions?
+
Most real applications will need to use explicit transactions rather
than relying on the primitives alone because you will want multiple
read-modify-update operations act as an atomic unit. A good example
@@ -815,6 +817,8 @@
And presto, we have an ACID compliant, thread-safe, persistent banking
system!
+@subsection Using @code{with-transaction}
+
What is @code{with-transaction} really doing for us? It first starts
a new transaction, attempts to execute the body, and if successful
commit the transaction. If anywhere along the way there is a deadlock
@@ -823,14 +827,145 @@
to retry the transaction a fixed number of times by re-executing the
whole body.
-The other value transactions provide is the capability to delay
-flushing dirty data to disk. The most time-intensive part of
-persistent operations is flushing newly written data to disk. Using
-the default auto-commit behavior requires a flush on every operation
-which can become very expensive. Because a transaction caches values,
-all the values read or written are cached in memory until the
-transaction completes, dramatically decreasing the number of flushes
-and the total time taken.
+And this brings us to two important caveats: nested transactions and
+idempotent side-effects.
+
+@subsection Nesting Transactions
+
+In general, you want to avoid nesting @code{with-transaction}
+statements. Nested transactions are valid for some data stores
+(namely Berkeley DB), but typically only a single transaction can be
+active at a time. The purpose of a nested transaction, in data stores
+that provide it, is to break a long transaction into chunks. This way if
+there is contention on a given subset of variables, only the inner
+transaction is restarted while the larger transaction can continue.
+When inner transactions commit, their results become part of the outer
+transaction until it in turn commits.
+
+If you have transaction protected primitive operations (such as
+@code{deposit} and @code{withdraw}) and you want to perform a group of
+such transactions, for example a transfer between accounts, you can
+use the macro @code{ensure-transaction} instead of @code{with-transaction}.
+
+@lisp
+(defun deposit (account amount)
+  "Wrap the balance read and the setf with the new balance"
+  (ensure-transaction ()
+    (let ((balance (balance account)))
+      (setf (balance account)
+            (+ balance amount)))))
+
+(defun deposit (account amount)
+  "A more concise version with incf doing both read and write"
+  (ensure-transaction ()
+    (incf (balance account) amount)))
+
+(defun withdraw (account amount)
+  (ensure-transaction ()
+    (decf (balance account) amount)))
+
+(defun transfer (src dst amount)
+  "There are four primitive read/write operations
+   grouped together in this transaction"
+  (with-transaction ()
+    (withdraw src amount)
+    (deposit dst amount)))
+@end lisp
+
+@code{ensure-transaction} is exactly like @code{with-transaction}
+except that it reuses an existing transaction if there is one and
+creates a new one otherwise. There is no harm, in fact, in using
+this macro all the time.
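+
+To make the reuse rule concrete, here is a minimal sketch in plain
+Common Lisp. It is not Elephant's implementation: @code{*current-txn*},
+@code{toy-with-transaction} and @code{toy-ensure-transaction} are
+hypothetical names, and the toy transaction is just a fresh symbol.
+
+@lisp
+(defvar *current-txn* nil
+  "Hypothetical stand-in for the store's notion of the active transaction.")
+
+(defmacro toy-with-transaction ((&rest options) &body body)
+  "Always start a fresh toy transaction around BODY."
+  (declare (ignore options))
+  `(let ((*current-txn* (gensym "TXN")))
+     ,@body))
+
+(defmacro toy-ensure-transaction ((&rest options) &body body)
+  "Run BODY in the active toy transaction, starting one only if needed."
+  (declare (ignore options))
+  (let ((thunk (gensym "BODY")))
+    `(flet ((,thunk () ,@body))
+       (if *current-txn*
+           (,thunk)
+           (toy-with-transaction () (,thunk))))))
+
+;; The inner form joins the transaction opened by the outer form.
+(toy-with-transaction ()
+  (let ((outer *current-txn*))
+    (toy-ensure-transaction ()
+      (eq outer *current-txn*))))
+=> T
+@end lisp
+
+Called on its own, @code{toy-ensure-transaction} finds no active
+transaction and opens one, which is why it is safe to use everywhere.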
+
+Notice the use of @code{decf} and @code{incf} above. The primary
+reason to use Lisp is that it is good at hiding complexity using
+shorthand constructs just like this. This means it is also going
+to be good at hiding data dependencies that should be captured in a
+transaction!
+
+@subsection Idempotent Side Effects
+
+Within the body of a @code{with-transaction}, any non-database operations
+need to be @emph{idempotent}. That is, the side effects of the body
+must be the same no matter how many times the body is executed. This
+is done automatically for side effects on the database, but not for
+side effects like pushing a value on a lisp list, or creating a new
+standard object.
+
+@lisp
+(defparameter *transient-objects* nil)
+
+(defun load-transients (n)
+  "This is the wrong way!"
+  (with-transaction ()
+    (loop for i from 0 upto n do
+          (push (get-from-root i) *transient-objects*))))
+@end lisp
+
+In this contrived example we are pulling a set of standard objects
+from the database using an integer key and pushing them onto a list
+for later use. However, if there is a conflict where some other
+process writes a key-value pair to a matching key, the whole
+transaction will abort and the loop will be run again. In a heavily
+contended system you might see results like the following.
+
+@lisp
+(defun test-list ()
+  (setf *transient-objects* nil)
+  (load-transients 2) ; keys 0, 1 and 2, so we expect three objects
+  (length *transient-objects*))
+
+(test-list)
+=> 3
+
+(test-list)
+=> 5
+
+(test-list)
+=> 4
+@end lisp
+
+The solution is to update the Lisp variable exactly once, and only
+after the transaction has completed successfully.
+
+@lisp
+(defun load-transients (n)
+  "This is a better way"
+  (setq *transient-objects*
+        (with-transaction ()
+          (loop for i from 0 upto n collect
+                (get-from-root i)))))
+@end lisp
+
+Note that @code{collect} returns the instances in key order, whereas
+the @code{push} version built the list in reverse; use
+@code{nreverse} if you depend on the old ordering. The best rule of
+thumb is that transaction bodies should be purely functional as above,
+except for side effects to the persistent store (persistent
+slot writes, adding to btrees, etc.).
+
+If you do need side effects to lisp memory, such as writes to
+transient slots, make sure they are idempotent and that other
+processes will not be reading the written values until the transaction
+completes.
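+
+The effect of a retry on Lisp-side state can be reproduced without a
+database. The sketch below is plain Common Lisp and only imitates a
+retry: @code{with-simulated-retry}, @code{*seen*}, @code{wrong-way}
+and @code{better-way} are hypothetical names, and the simulated
+conflict always strikes after the body has run to completion.
+
+@lisp
+(defmacro with-simulated-retry ((&key (failures 1)) &body body)
+  "Run BODY and discard its value FAILURES times, as if a conflict had
+aborted each attempt, then run BODY once more and return that value."
+  (let ((n (gensym "N")))
+    `(progn
+       (dotimes (,n ,failures) ,@body)
+       ,@body)))
+
+(defparameter *seen* nil)
+
+(defun wrong-way ()
+  "Pushes inside the body survive the aborted attempt."
+  (setf *seen* nil)
+  (with-simulated-retry (:failures 1)
+    (dolist (x '(a b c)) (push x *seen*)))
+  (length *seen*))
+
+(defun better-way ()
+  "The body is purely functional; only the final value is stored."
+  (setf *seen*
+        (with-simulated-retry (:failures 1)
+          (loop for x in '(a b c) collect x)))
+  (length *seen*))
+
+(wrong-way)
+=> 6
+
+(better-way)
+=> 3
+@end lisp
+
+A real conflict can abort the body part way through, which is why the
+counts seen in practice vary from run to run.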
+
+@subsection Transactions and Performance
+
+By now transactions almost look like more work than they are worth!
+Well, there are still some significant benefits to be had. Part of how
+transactions are implemented is that they gather together all the
+writes that are supposed to be made to the database, store them until
+the transaction commits, and then write them atomically.
+
+The most time-intensive part of persistent operations is flushing
+newly written data to disk. Using the default auto-committing
+behavior requires a flush for every primitive write operation. This
+can become very expensive! Because all the values read or written are
+cached in memory until the transaction completes, the number of
+flushes can be dramatically reduced.
+
+But don't take my word for it. Run the following statements and see
+for yourself the visceral impact transactions can have on system
+performance.
@lisp
(defpclass test ()
@@ -872,52 +1007,42 @@
thumb is to keep the number of objects touched in a transaction well
under 1000.
-And this brings us to the last caveat we'll introduce in this
-introductory tutorial: nested transactions.
-
-In general, avoid nesting transactions. Nested transactions are valid
-for some data stores (namely Berkeley DB), but typically only a single
-transaction is valid at a time. The purpose of a nested transaction
-is to allow a long transaction to be broken up into chunks. This way
-if there is contention on a given subset of variables, only the
-subtransaction is restarted while the larger transaction can continue.
-Subtransactions commit their results and they become part of the
-outer transaction until it in turn commits.
-
-If you have transaction protected primitive operations (such as
-@code{deposit} and @code{withdraw}) and you want to perform a group of
-such transactions, for example a transfer between accounts, you can
-use the macro @code{ensure-transaction} instead of @code{with-transaction}.
-
-@lisp
-(defun deposit (account amount)
- (ensure-transaction ()
- (let ((balance (balance account)))
- (setf (balance account)
- (+ balance amount)))))
-
-(defun withdraw (account amount)
- (ensure-transaction ()
- (decf (balance account) amount)))
-
-(defun transfer (src dst amount)
- (with-transaction ()
- (withdraw src amount)
- (deposit dst amount)))
-@end lisp
-
-@code{ensure-transaction} is exactly like @code{with-transaction}
-except it will reuse an existing transaction, if there is one, or
-create a new one. There is no harm, in fact, in using this macro all
-the time.
+@subsection Transactions and Applications
Designing and tuning a transactional architecture can become quite
-complicated. The best strategy at the beginning is a conservative
-one, break things up into the smallest logical sets of primitive
-operations and only wrap higher level functions in transactions when
-they absolutely have to commit together. See @ref{Transaction Details}
-for the full details and @pxref{Usage Scenarios} for more examples of
-how systems can be designed and tuned using transactions.
+complex. Moreover, bugs in your system can be very difficult to find
+as they only show up when transactions are interleaved within a
+larger, multi-threaded application.
+
+In many cases, however, you can ignore transactions, for example
+when you don't have any other concurrent processes running. In this
+case all operations are sequential and there is no chance of
+conflicts. You would only want to use transactions for write
+performance.
+
+You can also ignore transactions if your application can guarantee
+that concurrency won't generate any conflicts. For example, a web app
+that guarantees only one thread will write to objects in a particular
+session can avoid transactions altogether. Be careful about making
+these assumptions, though. In the above example, a
+reporting function that iterates over sessions, users or other objects
+may still see partial updates (e.g. a user's id was written prior to
+the query, but not the name). If you don't care about these
+infrequent glitches, you can still do without transactions.
+
+If these cases don't apply to your application, or you aren't sure,
+you will fare best by programming defensively. Break your system into
+the smallest logical sets of primitive operations
+(e.g. @code{withdraw} and @code{deposit}) using
+@code{ensure-transaction}, and then wrap the highest level calls made
+to your system in @code{with-transaction} when the operations absolutely have
+to commit together or you need the extra performance. Try not to have
+more than two levels of transactional accesses, with the top using
+@code{with-transaction} and the bottom using @code{ensure-transaction}.
+
+@xref{Transaction Details}, for more details and @ref{Usage
+Scenarios} for examples of how systems can be designed and tuned using
+transactions.
@node Advanced Topics
@comment node-name, next, previous, up
--- /project/elephant/cvsroot/elephant/doc/user-guide.texinfo 2007/04/01 14:33:29 1.5
+++ /project/elephant/cvsroot/elephant/doc/user-guide.texinfo 2007/04/01 20:22:24 1.6
@@ -23,26 +23,6 @@
* Performance Tuning:: How to get the most from Elephant.
@end menu
-@node Persistent objects
-@comment node-name, next, previous, up
-@section Persistent Objects
-
-Finally, if you for some reason make an instance with a specified OID
-which already exists in the database, @code{initargs} take precedence
-over values in the database, which take precedences over
-@code{initforms}.
-
-Also currently there is a bug where
-@code{initforms} are always evaluated, so beware.
-(What is the current model here?)
-
-Readers, writers, accessors, and @code{slot-value-using-class} are
-employed in redirecting slot accesses to the database, so override
-these with care. Because @code{slot-value, slot-boundp,
-slot-makunbound} are not generic functions, they are not guaranteed by
-the specification to work properly with persistent slots. However the
-proper behavior has been verified on SBCL, Allegro and Lispworks.
-
@node The Store Controller
@comment node-name, next, previous, up
@section The Store Controller
@@ -90,6 +70,26 @@
Empty.
+@node Persistent objects
+@comment node-name, next, previous, up
+@section Persistent Objects
+
+Finally, if you for some reason make an instance with a specified OID
+which already exists in the database, @code{initargs} take precedence
+over values in the database, which take precedence over
+@code{initforms}.
+
+Also, there is currently a bug where
+@code{initforms} are always evaluated, so beware.
+(What is the current model here?)
+
+Readers, writers, accessors, and @code{slot-value-using-class} are
+employed in redirecting slot accesses to the database, so override
+these with care. Because @code{slot-value, slot-boundp,
+slot-makunbound} are not generic functions, they are not guaranteed by
+the specification to work properly with persistent slots. However, the
+proper behavior has been verified on SBCL, Allegro and Lispworks.
+
@node Class Indices
@comment node-name, next, previous, up
@section Class Indices
@@ -141,6 +141,111 @@
@comment node-name, next, previous, up
@section Querying persistent instances
+
+
+A SQL select-like interface is in the works, but for now queries are
+limited to manual mapping over class instances or doing small queries
+with @code{get-instances-*} functions. One advantage of this is that
+it is easy to estimate the performance costs of your queries and to
+choose standard and derived indices that give you the ordering and
+performance you want.
+
+There is, however, a quick and dirty query API example that is not
+officially supported in the release but is intended to invite comment.
+This is an example of a full query system that would automatically
+perform joins, use the appropriate indices and perhaps even adaptively
+suggest or add indices to facilitate better performance on common
+queries.
+
+There are two functions @ref{Function elephant:get-query-instances}
+and @ref{Function elephant:map-class-query} which accept a set of
+constraints instead of the familiar value or range arguments.
+
+We'll use the classes @code{person} and @code{department} to
+illustrate how to perform queries over a set of objects that may be
+constrained by their relationships to other objects.
+
+@lisp
+(defpclass person ()
+  ((name :initarg :name :index t)
+   (salary :initarg :salary :index t)
+   (department :initarg :department)))
+
+(defmethod print-object ((p person) stream)
+  (format stream "#<PERS: ~A>" (slot-value p 'name)))
+
+(defun print-name (inst)
+  (format t "Name: ~A~%" (slot-value inst 'name)))
+
+(defpclass department ()
+  ((name :initarg :name)
+   (manager :initarg :manager)))
+
+(defmethod print-object ((d department) stream)
+  (format stream "#<DEPT ~A, mgr = ~A>"
+          (slot-value d 'name)
+          (when (slot-boundp d 'manager)
+            (slot-value (slot-value d 'manager) 'name))))
+@end lisp
+
+Here we have a simple employee database with managers (also of type
+person) and departments. This simple system will provide fodder for
+some reasonably complex constraints. Let's create a few departments.
+
+@lisp
+(setf marketing (make-instance 'department :name "Marketing"))
+(setf engineering (make-instance 'department :name "Engineering"))
+(setf sales (make-instance 'department :name "Sales"))
+@end lisp
+
+And managers (instances of @code{person}) for the departments.
+
+@lisp
+(make-instance 'person :name "George" :salary 140000 :department marketing)
+(setf (slot-value marketing 'manager) *)
+
+(make-instance 'person :name "Sally" :salary 140000 :department engineering)
+(setf (slot-value engineering 'manager) *)
+
+(make-instance 'person :name "Freddy" :salary 180000 :department sales)
+(setf (slot-value sales 'manager) *)
+@end lisp
+
+And of course we need some folks to manage:
+
+@lisp
+(defparameter *names*
+  '("Jacob" "Emily" "Michael" "Joshua" "Andrew" "Olivia" "Hannah" "Christopher"))
+
+(defun random-element (list)
+  "Choose a random element from the list and return it"
+  (nth (random (length list)) list))
+
+(with-transaction ()
+  (loop for i from 0 upto 40 do
+        (make-instance 'person
+                       :name (format nil "~A~A" (random-element *names*) i)
+                       :salary (floor (+ (* (random 1000) 100) 30000))
+                       :department (case (random 3)
+                                     (0 marketing)
+                                     (1 engineering)
+                                     (2 sales)))))
+@end lisp
+
+In the following examples, your results will differ from those shown
+due to the random allocation of employee names, etc. However, these
+examples are illustrative of what you should see if you run the same code.
+
+
+
+For those familiar with SQL, if an instance of @code{person} has a
+pointer to an instance of @code{department} then that relation can be
+used to perform a join. Of course joins in the object world won't
+return a table, instead they will return conjunctions of objects that
+satisfy a mutual set of constraints.
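+
+Until the query interface settles, such a join can be written by hand.
+The following is only a sketch: it assumes an open store, the
+@code{person} and @code{department} classes defined above, and that
+@code{get-instances-by-range} takes a class, an indexed slot name and
+two bounds. The helper @code{people-in-salary-range} is a hypothetical
+name and not part of Elephant.
+
+@lisp
+(defun people-in-salary-range (dept-name low high)
+  "Return persons earning between LOW and HIGH in the department DEPT-NAME."
+  (remove-if-not
+   (lambda (p)
+     ;; Follow the pointer from the person to its department.
+     (and (slot-boundp p 'department)
+          (string= (slot-value (slot-value p 'department) 'name)
+                   dept-name)))
+   ;; The salary index narrows the candidates before the join test.
+   (get-instances-by-range 'person 'salary low high)))
+
+(mapc #'print-name (people-in-salary-range "Sales" 100000 200000))
+@end lisp
+
+The index does the range selection and the Lisp predicate does the
+join, so the cost is easy to estimate: one index scan plus one slot
+read per candidate.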
+
+
@node Using BTrees
@comment node-name, next, previous, up
@section Using BTrees
@@ -174,6 +279,14 @@
@comment node-name, next, previous, up
@section Transaction Details
+You can trace @code{elephant::execute-transaction} to see the sequence
+of calls to @code{execute-transaction} that occur dynamically and
+detect where transactions are and are not happening. We may add some
+transaction diagnosis and tracing tools in the future, such as
+throwing a condition when @code{with-transaction} forms are nested
+dynamically.
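+
+A session along these lines is one way to try it. It is a sketch that
+assumes an open store; the key @code{"some-key"} is a placeholder, and
+how many calls the trace reports depends on the release you are
+running.
+
+@lisp
+(trace elephant::execute-transaction)
+
+;; An outer transaction with a nested ensure-transaction form.
+(with-transaction ()
+  (ensure-transaction ()
+    (get-from-root "some-key")))
+
+(untrace elephant::execute-transaction)
+@end lisp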
+
+
;; Transaction architecture:
;;
;; User and designer considerations: