From nikodemus at random-state.net Wed May 17 15:54:09 2006
From: nikodemus at random-state.net (Nikodemus Siivola)
Date: Wed, 17 May 2006 18:54:09 +0300
Subject: [rucksack-devel] Re: Rucksack, ECLM
In-Reply-To: (Arthur Lemmens's message of "Fri, 12 May 2006 17:20:57 +0200")
References: <87lktk96iz.fsf@logxor.random-state.net>
Message-ID: <878xp0zly6.fsf@logxor.random-state.net>

Getting back to this after pulling my head out of my rectum...

"Arthur Lemmens" writes:

> Now that I think about it, maybe I don't understand you correctly when
> you say that the younger transaction should be aborted. I thought that
> you meant that it should be aborted at step 2 in your scenario, but
> maybe you're saying that it should only be aborted at step 3? *That*
> sounds like it could be a reasonable solution. It makes use of multiple
> versions as long as possible and comes up with a simple non-locking
> conflict resolution when there really is no other choice left.
> I think I'll go for that, unless you, Luke or Edi has a good reason why
> I shouldn't. (Luke, do you think this also solves your left-to-right
> vs. right-to-left scenario? I think that was basically the same thing
> that Nikodemus describes here, but maybe I'm missing something?

I said earlier that this was what I meant, but that is wrong.

Scenario:

 C (counter) has value 0.

 A (transaction) enters.
 A increments C -> 1.
 B (transaction) enters.
 B sees the pristine C (as A hasn't committed yet), and increments it C -> 1.

At this point only one of the two can commit, as their views aren't
consistent anymore. After both have successfully committed C should be
2.

Since rollbacks are expensive the abort should happen as soon as
possible, and it needs to be B so that younger transactions cannot
starve older ones. Ergo, B should be aborted when it tries to
increment C.

So in this scenario copy-on-write doesn't buy much. :/

(If this was what _you_ meant, then everything is dandy and I was just
being blind.)

Other stuff:

* Rucksacks: There are currently provisions for multiple Rucksacks,
  which I find good, but a single object can only belong to a single
  Rucksack (which I find good too).

  I am, however, a bit confused about what the intention is in the
  following cases:

  (defclass x ()
    ((y :persistence t :accessor y-of))
    (:metaclass persistent-class))

  (let ((x (make-instance 'x)))
    ;; OK so far -- X is potentially persistent, but not reachable
    ;; from any root-set or index, so no problems.
    (with-rucksack (r *my-rucksack*)
      (with-transaction ()
        (add-rucksack-root x r)))
    ;; Still OK -- X has been saved, but that's all there is to it...
    ;; Until we try the following:
    (setf (y-of x) 'y)
    ;; If this works, then I believe the instance in *my-rucksack* should be
    ;; updated -- otherwise persistence is a very precarious thing!
    ;;
    ;; If this fails, then what we have is a "dangling object"... not nice.
    ;;
    ;; In either case, another hairy case follows:
    (with-rucksack (r *other-rucksack*)
      (with-transaction ()
        (add-rucksack-root x r)))
    ;; ...either this should signal an error because the rucksack is wrong,
    ;; transport the object from *my-rucksack* to *other-rucksack*, or
    ;; multiple Rucksacks per object should be allowed -- in which case
    ;; later pulling X from *my-rucksack* and updating it should also update
    ;; the X in *other-rucksack*.
    ;;
    ;; A combination of the above:
    (with-rucksack (r *my-rucksack*)
      (let (x1)
        (with-transaction ()
          (add-rucksack-root (setf x1 (make-instance 'x)) r))
        (with-rucksack (r *other-rucksack*)
          (with-transaction ()
            ;; In which Rucksack is X1?
            (add-rucksack-root (make-instance 'x :y x1) r))))))

  I think there are many valid answers to the questions posed above
  (and the implicit questions about P-EQL in the above cases), but I
  think they boil down mostly to a meta-question:

   Is a single Rucksack a "universe" or a "container"?

  If a Rucksack is a universe, then it would make sense to enforce
  certain restrictions on that universe: eg.
  you cannot assign objects belonging to other universes to
  persistent slots.

  If a Rucksack is seen as a container, then giving both Rucksacks
  their own copies would be fine, but so (I think) would be allowing
  cross-rucksack references (so that *my-rucksack* could hold a
  pointer to an object in *other-rucksack*).

  In the universe case there is a relatively simple option that might
  untangle some things: make Rucksack a property of a class.

   (defclass foo-class (persistent-class) ()
     (:default-initargs :rucksack "/var/rucksacks/foo/"))

   (defclass foo () ()
     (:metaclass foo-class))

  or

   (defclass bar () ()
     (:metaclass persistent-class)
     (:rucksack ...))

  This would have implications on the whole API of course, but most
  that come to mind seem like simplifications -- not all though:
  eg. "What to do with PERSISTENT-DATA?"

* Default transactions: are transactions intended to be explicit,
  or is the intention that eventually a simple (setf some-slot) would
  generate its own transaction unless there already was one?

  There is a pleasing clarity to explicit transactions, but they
  also mean that persistent objects cannot be manipulated by functions
  that don't know they are persistent.

  I think in at least 90% of the cases it would be natural for the
  caller to provide a surrounding transaction context, but I'm not
  sure how universal that is.

* GC of live-in-lisp but dead-on-disk objects. I should just read the
  code, I suppose, but what is the intended result here:

  (with-rucksack ...
    (let (x)
      (with-transaction ()
        (setf x (make-instance 'persistent-thing)))
      ... do enough stuff to cause X to be GC'd from the file...
      (add-rucksack-root x ...)))

  Does X get a new object-id, or what?

I realize this is long and long-winded. No rush on any of this: I
mostly just wanted to get this downloaded from my head to list
archives. ;-)

Cheers,

  -- Nikodemus              Schemer: "Buddha is small, clean, and serious."
                   Lispnik: "Buddha is big, has hairy armpits, and laughs."

From alemmens at xs4all.nl Wed May 17 18:47:52 2006
From: alemmens at xs4all.nl (Arthur Lemmens)
Date: Wed, 17 May 2006 20:47:52 +0200
Subject: [rucksack-devel] Re: Rucksack, ECLM
In-Reply-To: <878xp0zly6.fsf@logxor.random-state.net>
References: <87lktk96iz.fsf@logxor.random-state.net>
 <878xp0zly6.fsf@logxor.random-state.net>
Message-ID:

[Luke: I keep you on CC now, but if you want to keep following this
 you'd better subscribe to rucksack-devel at common-lisp.net, which
 was created yesterday.]

Nikodemus wrote:

> Getting back to this after pulling my head out of my rectum...

I'm not sure what this refers to?

> Scenario:
>
> C (counter) has value 0.
>
> A (transaction) enters.
> A increments C -> 1.
> B (transaction) enters.
> B sees the pristine C (as A hasn't committed yet), and increments it C -> 1.
>
> At this point only one of the two can commit, as their views aren't
> consistent anymore. After both have successfully committed C should be
> 2.
>
> Since rollbacks are expensive the abort should happen as soon as
> possible, and it needs to be B so that younger transactions cannot
> starve older ones. Ergo, B should be aborted when it tries to
> increment C.

No, with my multiple versioning scheme this is not true. The
important point here is that it does not (and should not) make any
difference whether A has committed or not. The only difference
between a committed transaction and an 'open' (not yet committed)
transaction is that you can be sure that the committed transaction
has written its changes to disk.

'Multiple versioning' is used both in memory and on disk, and the
rule for fetching values is the same. (Although the implementation
is quite different: in memory, you find a dirty object from its ID
by finding the relevant transaction first and then looking the
object up in the transaction's dirty-objects hash-table. On disk
you find the most recently committed object version and then follow
pointers to previously committed object versions if necessary.)

The rule for fetching values ('objects' is probably a better word
here) is: use the object version that has been modified by the
youngest transaction that is older than (or the same as) the current
transaction.

So let's do your scenario again:

> C (counter) has value 0.
>
> A (transaction) enters.
> A increments C -> 1.
> B (transaction) enters.
> B sees the pristine C (as A hasn't committed yet)

No, this part is wrong. According to the rule above, B will see
the version of C modified by A.

> and increments it C -> 1.

No, it will increment it to 2. But A should still see C=1; if A
tries to change C again (after B has changed it), B will be
aborted (like we discussed earlier).

Now that we're talking about this, I realize that there's a bug
in my implementation of all this: at the moment I don't actually
make in-memory copies of modified objects; I only do that for
committed (on disk) versions. So I should revisit CACHE-TOUCH-OBJECT
(and the functions that call CACHE-TOUCH-OBJECT) to make sure that
this happens at the right moment.

That's the scheme I had in mind. Does that sound reasonable to you?

Arthur

(I'll reply to your other points in another email. This one is long
enough as it is.)

From alemmens at xs4all.nl Wed May 17 19:16:26 2006
From: alemmens at xs4all.nl (Arthur Lemmens)
Date: Wed, 17 May 2006 21:16:26 +0200
Subject: [rucksack-devel] Re: Rucksack, ECLM
In-Reply-To: <878xp0zly6.fsf@logxor.random-state.net>
References: <87lktk96iz.fsf@logxor.random-state.net>
 <878xp0zly6.fsf@logxor.random-state.net>
Message-ID:

Nikodemus wrote:

> * Default transactions: are transactions intended to be explicit,
> or is the intention that eventually a simple (setf some-slot) would
> generate its own transaction unless there already was one?

My plan is to add some kind of AUTO-COMMIT flag (with default value NIL)
to rucksacks.
Then I intend to implement something like this:

  IF a persistent object is modified AND we're not inside a transaction
  THEN IF the auto-commit flag is T
       THEN create a transaction
            change the object
            commit the transaction
       ELSE signal an error.

> There is a pleasing clarity to explicit transactions, but they
> also mean that persistent objects cannot be manipulated by functions
> that don't know they are persistent.

Yep.

> I think in at least 90% of the cases it would be natural for the
> caller to provide a surrounding transaction context, but I'm not
> sure how universal that is.

Does the above sound like a reasonable solution?

Arthur

From alemmens at xs4all.nl Wed May 17 19:47:59 2006
From: alemmens at xs4all.nl (Arthur Lemmens)
Date: Wed, 17 May 2006 21:47:59 +0200
Subject: [rucksack-devel] Re: Rucksack, ECLM
In-Reply-To: <878xp0zly6.fsf@logxor.random-state.net>
References: <87lktk96iz.fsf@logxor.random-state.net>
 <878xp0zly6.fsf@logxor.random-state.net>
Message-ID:

Nikodemus wrote:

> * GC of live-in-lisp but dead-on-disk objects. I should just read the
> code, I suppose, but what is the intended result here:
>
> (with-rucksack ...
>   (let (x)
>     (with-transaction ()
>       (setf x (make-instance 'persistent-thing)))
>     ... do enough stuff to cause X to be GC'd from
>     the file...
>     (add-rucksack-root x ...)))
>
> Does X get a new object-id, or what?

Good point; I discussed something similar with Martin Simmons during
the ECLM. The short answer is that X doesn't get collected yet.
Here's a longer answer:

* Object table entries

One byte of each entry in the object table is reserved for garbage
collector information. This garbage collector byte contains one of
the following serializer markers:

** free-block (#xB0)

This means that the corresponding object id is not in use at the
moment, and the block 'belongs to' the free list.

** live-object (#xB1)

This means that the corresponding object can be reached from one of
the garbage collector roots.

** dead-object (#xB2)

In the mark phase of a mark-and-sweep garbage collection, all live
objects are temporarily marked as dead. If the scanner can reach the
object from one of the roots, it will be marked as alive during the
scan phase. Otherwise, it will remain marked as dead until the
sweeper reaches the block; at that point, the block will be returned
to the free list and it will be marked as a free-block.

** reserved-object (#xB3)

Used for entries in the object table that belong to objects that
haven't been committed to disk yet.

The X in your example is a RESERVED-OBJECT as long as the transaction
hasn't committed yet, so it won't be collected by the GC. When the
transaction commits, the marker will change to LIVE-OBJECT. (At
least that's the plan; I think I haven't written the part that changes
the marker to LIVE-OBJECT at the moment.)

Only at the start phase of the *following* garbage collection round
will X be treated like a normal object: seen as alive if it can be
traced from the roots and as dead otherwise.

(It wouldn't surprise me if there were other problems similar to this
one that I haven't thought about yet. This correspondence between
in-memory and on-disk stuff is rather tricky, and it's easy to forget
some subtle point.)

Arthur

From alemmens at xs4all.nl Wed May 17 20:49:14 2006
From: alemmens at xs4all.nl (Arthur Lemmens)
Date: Wed, 17 May 2006 22:49:14 +0200
Subject: [rucksack-devel] Re: Rucksack, ECLM
In-Reply-To: <878xp0zly6.fsf@logxor.random-state.net>
References: <87lktk96iz.fsf@logxor.random-state.net>
 <878xp0zly6.fsf@logxor.random-state.net>
Message-ID:

Nikodemus wrote:

> I am, however, a bit confused about what the intention is in the
> following cases:
>
> (defclass x ()
>   ((y :persistence t :accessor y-of))
>   (:metaclass persistent-class))
>
> (let ((x (make-instance 'x)))
>   ;; OK so far -- X is potentially persistent, but not reachable
>   ;; from any root-set or index, so no problems.

Actually, I think this is not OK so far.
I think you shouldn't
MAKE-INSTANCE a persistent object unless there is an open current-rucksack
(which is created by WITH-RUCKSACK, or maybe by
(SETF (CURRENT-RUCKSACK) (OPEN-RUCKSACK ...)).

When X is created, the RUCKSACK slot of X is bound to this rucksack.
And Rucksack should probably signal an error at this point if there is no
current rucksack.

>   (with-rucksack (r *my-rucksack*)
>     (with-transaction ()
>       (add-rucksack-root x r)))
>   ;; Still OK

If R is the same rucksack as the RUCKSACK slot of X, there's no problem
here. Otherwise, I think an error should be signaled at this point.

>   ;; In either case, another hairy case follows:
>   (with-rucksack (r *other-rucksack*)
>     (with-transaction ()
>       (add-rucksack-root x r)))
>   ;; ...either this should signal an error because the rucksack is wrong,

Yes, I think that's what should happen (see above).

>   ;; transport the object from *my-rucksack* to *other-rucksack*

I do want to provide for copying objects from one rucksack to another,
but I don't think that this should happen automatically, because it's
not a trivial thing to do. Do we want a deep copy (so we could copy an
entire rucksack by just copying the roots) or a shallow copy, for example?

We'd probably want a function like

  (defgeneric rucksack-import-object (rucksack persistent-object &key deep-copy))

But this is not on the top of my list.

>   ;; multiple Rucksacks per object should be allowed -- in which case
>   ;; later pulling X from *my-rucksack* and updating it should also update
>   ;; the X in *other-rucksack*.

That sounds like a nightmare to me. But maybe that's just because I haven't
thought about scenarios like that yet.

> Is a single Rucksack a "universe" or a "container"?
>
> If a Rucksack is a universe, then it would make sense to enforce
> certain restrictions on that universe: eg. you cannot assign objects
> belonging to other universes to persistent slots.

Yes, that's what I have in mind.
I do think that it should be possible
to work with multiple rucksacks at the same time, but I think there
shouldn't be persistent connections between rucksacks. You *can* have
non-persistent connections, for example a non-persistent object that
contains objects from different rucksacks. And at a later stage I want
to provide for some functions to copy/migrate objects from one rucksack
to another.

> In the universe case there is a relatively simple option that might
> untangle some things: make Rucksack a property of a class.
>
> (defclass foo-class (persistent-class) ()
>   (:default-initargs :rucksack "/var/rucksacks/foo/"))

Hehe. Interesting idea. But then you can't migrate or copy objects
between rucksacks anymore, can you?

Interesting questions, these. I'm not saying that my answers are the
best ones, I'm just explaining what I had in mind for Rucksack. If you
think that another approach would work better, please let me know.

By the way: something that I *would* like to do sooner or later
(probably later ;-)) is to have distributed rucksacks: one rucksack
distributed over more than one machine.

Arthur

From nikodemus at random-state.net Wed May 17 23:03:42 2006
From: nikodemus at random-state.net (Nikodemus Siivola)
Date: Thu, 18 May 2006 02:03:42 +0300
Subject: [rucksack-devel] Re: Rucksack, ECLM
In-Reply-To: (Arthur Lemmens's message of "Wed, 17 May 2006 20:47:52 +0200")
References: <87lktk96iz.fsf@logxor.random-state.net>
 <878xp0zly6.fsf@logxor.random-state.net>
Message-ID: <87fyj8z229.fsf@logxor.random-state.net>

"Arthur Lemmens" writes:

> I'm not sure what this refers to?

Me being obtuse and not seeing clearly, due to my head being stuck
(metaphorically speaking) in an unpleasant location. ;)

> No, with my multiple versioning scheme this is not true. The
> important point here is that it does not (and should not) make any
> difference whether A has committed or not.
> The only difference
> between a committed transaction and an 'open' (not yet committed)
> transaction is that you can be sure that the committed transaction
> has written its changes to disk.

So in Rucksack transactions are about consistency and durability, not
isolation and atomicity? If so then the rest is moot.

> The rule for fetching values ('objects' is probably a better word
> here) is: use the object version that has been modified by the
> youngest transaction that is older than (or the same as) the current
> transaction.

> No, this part is wrong. According to the rule above, B will see
> the version of C modified by A.
>
>> and increments it C -> 1.
>
> No, it will increment it to 2. But A should still see C=1; if A
> tries to change C again (after B has changed it), B will be
> aborted (like we discussed earlier).

I'm almost with you there. Assuming that transactions are supposed to
be isolated and atomic, what about this:

 (A and B are transactions as before, C is a counter from 0.)

 Scenario 1.

  A enters.
  A increments C -> 1.
  B enters.
  B increments C -> 2.
  A aborts (rollback).
  B commits.

 Scenario 2.

  A enters.
  A increments C -> 1.
  B enters.
  B increments C -> 2.
  B commits.
  A aborts (rollback).

Are these possible timelines, or is there a conflict somewhere? If
they are possible, what is the value of C? I'm fairly sure it should
be 1, but I don't see how you can guarantee that in both cases.

Am I just missing the obvious here?

Cheers,

  -- Nikodemus              Schemer: "Buddha is small, clean, and serious."
                   Lispnik: "Buddha is big, has hairy armpits, and laughs."

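A minimal sketch of the fetch rule quoted above (use the version
written by the youngest transaction that is not younger than the
current one), replayed on the counter example. This is a toy in-memory
model in portable Common Lisp, not Rucksack code: transaction ages are
plain integers (a smaller id means an older transaction), and
VERSION-FOR, WRITE-VERSION and the alist representation are made-up
names for illustration. It leaves out commit, rollback and conflict
detection entirely, so it says nothing about scenarios 1 and 2.

```lisp
;; Toy model: the versions of one object are kept as an alist of
;; (transaction-id . value) pairs. Transaction ids are integers handed
;; out in start order, so a smaller id means an older transaction.
;; All names here are hypothetical; none of this is Rucksack's API.

(defun version-for (versions txn-id)
  "Return the value written by the youngest transaction whose id is
<= TXN-ID, or NIL if there is no such version."
  (let ((best nil))
    (dolist (version versions)
      (when (and (<= (car version) txn-id)
                 (or (null best) (> (car version) (car best))))
        (setf best version)))
    (cdr best)))

(defun write-version (versions txn-id value)
  "Return a new alist in which TXN-ID's version of the object is VALUE."
  (acons txn-id value (remove txn-id versions :key #'car)))

(defun replay-counter-scenario ()
  "Replay the counter example; return the values seen by A and by B."
  (let ((c (list (cons 0 0)))  ; initial version: value 0, written by 'transaction 0'
        (a 1)                  ; A enters first, so it is the older one
        (b 2))                 ; B enters later
    ;; A increments C -> 1.
    (setf c (write-version c a (1+ (version-for c a))))
    ;; B reads A's version (1), not the pristine 0, and increments -> 2.
    (setf c (write-version c b (1+ (version-for c b))))
    ;; A still sees its own version.
    (list (version-for c a) (version-for c b))))
```

Under these assumptions (replay-counter-scenario) returns (1 2): B
builds on A's uncommitted version, while A keeps seeing C=1.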
From nikodemus at random-state.net Wed May 17 23:04:25 2006
From: nikodemus at random-state.net (Nikodemus Siivola)
Date: Thu, 18 May 2006 02:04:25 +0300
Subject: [rucksack-devel] Re: Rucksack, ECLM
In-Reply-To: (Arthur Lemmens's message of "Wed, 17 May 2006 21:16:26 +0200")
References: <87lktk96iz.fsf@logxor.random-state.net>
 <878xp0zly6.fsf@logxor.random-state.net>
Message-ID: <87ac9gz212.fsf@logxor.random-state.net>

"Arthur Lemmens" writes:

> My plan is to add some kind of AUTO-COMMIT flag (with default value NIL)
> to rucksacks. Then I intend to implement something like this:
>
>   IF a persistent object is modified AND we're not inside a transaction
>   THEN IF the auto-commit flag is T
>        THEN create a transaction
>             change the object
>             commit the transaction
>        ELSE signal an error.

> Does the above sound like a reasonable solution?

Perfectly.

Cheers,

  -- Nikodemus              Schemer: "Buddha is small, clean, and serious."
                   Lispnik: "Buddha is big, has hairy armpits, and laughs."

From nikodemus at random-state.net Thu May 18 07:30:32 2006
From: nikodemus at random-state.net (Nikodemus Siivola)
Date: Thu, 18 May 2006 10:30:32 +0300
Subject: [rucksack-devel] Re: Rucksack, ECLM
References: <87lktk96iz.fsf@logxor.random-state.net>
 <878xp0zly6.fsf@logxor.random-state.net>
Message-ID: <87ac9fiycn.fsf@logxor.random-state.net>

"Arthur Lemmens" writes:

>> * GC of live-in-lisp but dead-on-disk objects. I should just read the
>> code, I suppose, but what is the intended result here:

> ** reserved-object (#xB3)
>
> Used for entries in the object table that belong to objects that
> haven't been committed to disk yet.
>
> The X in your example is a RESERVED-OBJECT as long as the transaction
> hasn't committed yet, so it won't be collected by the GC. When the
> transaction commits, the marker will change to LIVE-OBJECT. (At
> least that's the plan; I think I haven't written the part that changes
> the marker to LIVE-OBJECT at the moment.)
>
> Only at the start phase of the *following* garbage collection round
> will X be treated like a normal object: seen as alive if it can be
> traced from the roots and as dead otherwise.

Good. This answers one of the things I forgot to ask about. ;-)

Unfortunately I think I messed up my example, as it was about a
different case. Like so:

 (with-rucksack (s ...)
   (let (x)

     (with-transaction ()
       (setf x (make-instance 'my-persistent-thing)))

     ;; Transaction is now over, and committed -- so if X is live it is
     ;; being retained without a connection to the roots.

     (do-stuff-to-cause-a-gc-of-dead-x)

     (with-transaction ()
       (add-rucksack-root x s))))

> (It wouldn't surprise me if there were other problems similar to this
> one that I haven't thought about yet. This correspondence between
> in-memory and on-disk stuff is rather tricky, and it's easy to forget
> some subtle point.)

I got an immediate headache thinking about it. ,-)

It seems to me that for persistent objects (once they are actually
stored on disk!) the "real" definition of identity is the object
on disk, and the objects in memory are just "echoes".

...unless of course there are both persistent and non-persistent slots,
in which case you really have two instances sharing the same shadowy
body: for the persistent slots the definition of identity that matters
is mediated by the Rucksack, but for non-persistent slots it is the
plain old EQ -- which is a matter of luck depending on that cache.

Interesting.

Cheers,

  -- Nikodemus              Schemer: "Buddha is small, clean, and serious."
                   Lispnik: "Buddha is big, has hairy armpits, and laughs."

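The object-table markers listed earlier in the thread, and the way a
collection round treats them, can be sketched roughly as follows. The
four marker values are the ones given in the thread; everything else
(a plain vector as the object table, the function names, passing the
reachable ids in as a list) is an assumption made for illustration and
not how Rucksack stores or scans anything.

```lisp
;; Toy object table: a vector of marker bytes indexed by object id.
;; Only the marker values come from the thread; the table layout and
;; the names COLLECT and COMMIT-RESERVED are hypothetical.

(defconstant +free-block+      #xB0)
(defconstant +live-object+     #xB1)
(defconstant +dead-object+     #xB2)
(defconstant +reserved-object+ #xB3)

(defun collect (table reachable-ids)
  "Run one mark-and-sweep round over TABLE, destructively.
REACHABLE-IDS lists the object ids the scanner can reach from the roots."
  ;; Mark phase: every live object is provisionally marked dead.
  (dotimes (id (length table))
    (when (= (aref table id) +live-object+)
      (setf (aref table id) +dead-object+)))
  ;; Scan phase: whatever is reachable from the roots becomes live again.
  (dolist (id reachable-ids)
    (when (= (aref table id) +dead-object+)
      (setf (aref table id) +live-object+)))
  ;; Sweep phase: objects still marked dead go back to the free list.
  (dotimes (id (length table))
    (when (= (aref table id) +dead-object+)
      (setf (aref table id) +free-block+)))
  table)

(defun commit-reserved (table id)
  "At commit time a reserved entry becomes an ordinary live object."
  (when (= (aref table id) +reserved-object+)
    (setf (aref table id) +live-object+))
  table)
```

With a table of two live entries and one reserved entry, a round in
which only id 0 is reachable frees id 1 but leaves the reserved entry
alone. Once COMMIT-RESERVED has turned that entry into a live object,
a later round with the same roots frees it too, which is the situation
in the corrected example above.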
From nikodemus at random-state.net Thu May 18 07:56:08 2006
From: nikodemus at random-state.net (Nikodemus Siivola)
Date: Thu, 18 May 2006 10:56:08 +0300
Subject: [rucksack-devel] Re: Rucksack, ECLM
In-Reply-To: (Arthur Lemmens's message of "Wed, 17 May 2006 22:49:14 +0200")
References: <87lktk96iz.fsf@logxor.random-state.net>
 <878xp0zly6.fsf@logxor.random-state.net>
Message-ID: <874pznix5z.fsf@logxor.random-state.net>

"Arthur Lemmens" writes:

> Actually, I think this is not OK so far. I think you shouldn't
> MAKE-INSTANCE a persistent object unless there is an open current-rucksack
> (which is created by WITH-RUCKSACK, or maybe by
> (SETF (CURRENT-RUCKSACK) (OPEN-RUCKSACK ...)).

Fair enough.

> We'd probably want a function like
>
> (defgeneric rucksack-import-object (rucksack persistent-object &key deep-copy))

Sounds like a good plan. I assume this would be accompanied by a provided
function IMPORT-OBJECT-COPYING-SLOTS, but users would need to explicitly
defer to that? (Like with MAKE-LOAD-FORM.)

>> Is a single Rucksack a "universe" or a "container"?
>>
>> If a Rucksack is a universe, then it would make sense to enforce
>> certain restrictions on that universe: eg. you cannot assign objects
>> belonging to other universes to persistent slots.
>
> Yes, that's what I have in mind. I do think that it should be possible
> to work with multiple rucksacks at the same time, but I think there
> shouldn't be persistent connections between rucksacks. You *can* have

Right. This gets back to the question of identity: universe (rucksack)
is what decides whether A bleeds when B is cut.

>> In the universe case there is a relatively simple option that might
>> untangle some things: make Rucksack a property of a class.

> Hehe. Interesting idea. But then you can't migrate or copy objects
> between rucksacks anymore, can you?

It would be mildly tricky at least.
;-) But the class still needs to
at least know to which Rucksacks its instances belong: otherwise it
cannot update schemas of multiple rucksacks when it is redefined.

...and then you have another nightmare scenario:

 Rucksacks A and B open.
 Persistent Class FOO is defined.
 Instances of FOO are stored in Rucksack A.
 Other instances of FOO are stored in Rucksack B.
 Rucksack A closes.
 FOO is redefined, schema in B updates.
 Rucksack B closes.
 Lisp exits.

 New lisp session.

 Rucksack A is opened and instances of FOO are fetched.
 Rucksack B is opened and instances of FOO are fetched.

Which definition of FOO is in effect? Which instances are updated
and when? If this is something that users should not do, then the
Rucksack seems to me to effectively be a property of the class
already!

The only way to avoid that seems to be to prohibit the redefinition of
a class unless all interested Rucksacks are open... which will then
require that Rucksacks know about each other, so that the requirement
can be propagated to the next session.

> Interesting questions, these. I'm not saying that my answers are the
> best ones, I'm just explaining what I had in mind for Rucksack. If you
> think that another approach would work better, please let me know.

The reason for all this pestering is mostly to find out what you have
in mind. ;-)

> By the way: something that I *would* like to do sooner or later
> (probably later ;-)) is to have distributed rucksacks: one rucksack
> distributed over more than one machine.

Cool stuff! So you'd be effectively building a distributed object
system on top of Rucksack?

Cheers,

  -- Nikodemus              Schemer: "Buddha is small, clean, and serious."
                   Lispnik: "Buddha is big, has hairy armpits, and laughs."

From alemmens at xs4all.nl Thu May 18 08:15:25 2006
From: alemmens at xs4all.nl (Arthur Lemmens)
Date: Thu, 18 May 2006 10:15:25 +0200
Subject: [rucksack-devel] Re: Rucksack, ECLM
In-Reply-To: <87ac9fiycn.fsf@logxor.random-state.net>
References: <87lktk96iz.fsf@logxor.random-state.net>
 <878xp0zly6.fsf@logxor.random-state.net>
 <87ac9fiycn.fsf@logxor.random-state.net>
Message-ID:

[Taking Luke out of CC. He can read the archives if he wants to.]

Nikodemus wrote:

> (with-rucksack (s ...)
>   (let (x)
>
>     (with-transaction ()
>       (setf x (make-instance 'my-persistent-thing)))
>
>     ;; Transaction is now over, and committed -- so if X is live it is
>     ;; being retained without a connection to the roots.
>
>     (do-stuff-to-cause-a-gc-of-dead-x)
>
>     (with-transaction ()
>       (add-rucksack-root x s))))

Hmm. This sounds like a "don't do that then" scenario to me. I suppose
we *could* automatically create a new object ID for X in the second
transaction and commit it to disk, but I don't think that's a good idea.
Changing object IDs behind the scenes sounds like asking for big trouble.

Probably the best approach would be to have the GC automatically mark
the in-memory copy of X as dead. Then we could signal an error (how
about NECROPHILIA-NOT-ALLOWED-HERE?) for code that tries to do something
with a dead object.

So yes, in a sense Rucksack still can't eliminate all dangling pointers.
I don't think there's much we can do about this specific case, except
document the kind of things that you shouldn't do.

> It seems to me that for persistent objects (once they are actually
> stored on disk!) the "real" definition of identity is the object
> on disk, and the objects in memory are just "echoes".

Yes, that's the basic philosophy.

> ...unless of course there are both persistent and non-persistent slots,
> in which case you really have two instances sharing the same shadowy
> body: for the persistent slots the definition of identity that matters
> is mediated by the Rucksack, but for non-persistent slots it is the
> plain old EQ -- which is a matter of luck depending on that cache.

Yes. I'm not big on mixing persistent with non-persistent slots. I'm
inclined to see non-persistent slots mostly as a way to cache some
computations based on the values of persistent slots. But I would be
interested in seeing other useful scenarios for non-persistent slots.

Arthur

From alemmens at xs4all.nl Thu May 18 08:49:16 2006
From: alemmens at xs4all.nl (Arthur Lemmens)
Date: Thu, 18 May 2006 10:49:16 +0200
Subject: [rucksack-devel] Re: Rucksack, ECLM
In-Reply-To: <874pznix5z.fsf@logxor.random-state.net>
References: <87lktk96iz.fsf@logxor.random-state.net>
 <878xp0zly6.fsf@logxor.random-state.net>
 <874pznix5z.fsf@logxor.random-state.net>
Message-ID:

Nikodemus wrote:

>> (defgeneric rucksack-import-object (rucksack persistent-object &key deep-copy))
>
> Sounds like a good plan. I assume this would be accompanied by a provided
> function IMPORT-OBJECT-COPYING-SLOTS, but users would need to explicitly
> defer to that? (Like with MAKE-LOAD-FORM.)

Yes, I suppose so. But I'm not going to think very hard about that until
I've got the basics working.

> Rucksacks A and B open.
> Persistent Class FOO is defined.
> Instances of FOO are stored in Rucksack A.
> Other instances of FOO are stored in Rucksack B.
> Rucksack A closes.
> FOO is redefined, schema in B updates.
> Rucksack B closes.
> Lisp exits.
>
> New lisp session.
>
> Rucksack A is opened and instances of FOO are fetched.
> Rucksack B is opened and instances of FOO are fetched.

I *knew* you'd come up with something like that when I wrote my reply ;-)

OK, let's see...

I think the basic guideline should be that rucksacks never need to know
about each other.
And another assumption is that the most recent class
definition comes from your program source, not from a rucksack. The
schemas in a rucksack are just a way to make sure that Rucksack can adapt
old instances to the current class definition in your program.

So when a rucksack is opened and fetches an instance of FOO for the first
time, Rucksack should probably do the same thing as when a class is
redefined: compare the current class definition with the most recent
schema, and create a new schema if there's a difference between the two.
The new schema will be added to the current rucksack; other rucksacks are
irrelevant at this point.

Back to your scenario: when A is opened and fetches the first instance of
FOO, Rucksack will add a new FOO schema to A. Afterwards, it will run the
UPDATE-PERSISTENT-OBJECT-FOR-REDEFINED-CLASS (is that a new long name
record?) function whenever A fetches an instance of FOO.

When B is opened, Rucksack compares B's most recent FOO schema to the
current class definition in the program. In your scenario, the schema
probably doesn't differ from the class definition, so B doesn't need to
create a new schema. Whenever B fetches an instance of FOO, it just
creates a FOO according to the current class definition; it doesn't need
to run UPDATE-PERSISTENT-OBJECT-FOR-REDEFINED-CLASS.

So all in-memory instances of FOO will always match the class definition
that comes from your program. And rucksacks don't need to know about each
other.

Does that sound reasonable, or am I missing something?

> The only way to avoid that seems to prohibit the redefinition of class
> unless all interested Rucksacks are open

I hope I've shown that this prohibition is not necessary.

>> By the way: something that I *would* like to do sooner or later
>> (probably later ;-)) is to have distributed rucksacks: one rucksack
>> distributed over more than one machine.
>
> Cool stuff! So you'd be effectively building a distributed object
> system on top of Rucksack?

That would be the idea. But I haven't thought about this in any detail
yet, so I'm not sure that Rucksack's current design would actually allow
for this.

Arthur

From nikodemus at random-state.net Thu May 18 08:56:11 2006
From: nikodemus at random-state.net (Nikodemus Siivola)
Date: Thu, 18 May 2006 11:56:11 +0300
Subject: [rucksack-devel] Re: Rucksack, ECLM
References: <87lktk96iz.fsf@logxor.random-state.net>
 <878xp0zly6.fsf@logxor.random-state.net>
 <87ac9fiycn.fsf@logxor.random-state.net>
Message-ID: <87lkszhftg.fsf@logxor.random-state.net>

"Arthur Lemmens" writes:

> Hmm. This sounds like a "don't do that then" scenario to me. I suppose
> we *could* automatically create a new object ID for X in the second
> transaction and commit it to disk, but I don't think that's a good idea.
> Changing object IDs behind the scenes sounds like asking for big trouble.

I was about to say something about Lisp and Rucksack identities of
objects, but I just realized what the _real_ trouble with this is:
slot values.

If X was initialized as

 (make-instance 'my-p-thing :slot (make-instance 'my-p-thing-2)),

then by the time we tried to add the root neither Lisp nor Rucksack
would have the P-THING-2 around...

> Probably the best approach would be to have the GC automatically mark
> the in-memory copy of X as dead. Then we could signal an error (how
> about NECROPHILIA-NOT-ALLOWED-HERE?) for code that tries to do something
> with a dead object.

I assume you mean "committing the transaction", not GC? Otherwise the
legality of the add-root bit depends on the amount of garbage collected
in between.

But given the trouble with slot-values, yes, I think you are right.

I wonder if there should be a way to keep transient roots, though
(eg. P-LET, which informs Rucksack about the variables it binds).

Cheers,

  -- Nikodemus              Schemer: "Buddha is small, clean, and serious."
                   Lispnik: "Buddha is big, has hairy armpits, and laughs."

From nikodemus at random-state.net Thu May 18 09:10:39 2006
From: nikodemus at random-state.net (Nikodemus Siivola)
Date: Thu, 18 May 2006 12:10:39 +0300
Subject: [rucksack-devel] Re: Rucksack, ECLM
In-Reply-To: (Arthur Lemmens's message of "Thu, 18 May 2006 10:49:16 +0200")
References: <87lktk96iz.fsf@logxor.random-state.net>
 <878xp0zly6.fsf@logxor.random-state.net>
 <874pznix5z.fsf@logxor.random-state.net>
Message-ID: <87fyj7hf5c.fsf@logxor.random-state.net>

"Arthur Lemmens" writes:

>> Rucksacks A and B open.
>> Persistent Class FOO is defined.
>> Instances of FOO are stored in Rucksack A.
>> Other instances of FOO are stored in Rucksack B.
>> Rucksack A closes.
>> FOO is redefined, schema in B updates.
>> Rucksack B closes.
>> Lisp exits.
>>
>> New lisp session.
>>
>> Rucksack A is opened and instances of FOO are fetched.
>> Rucksack B is opened and instances of FOO are fetched.

> I think the basic guideline should be that rucksacks never need to know
> about each other. And another assumption is that the most recent class
> definition comes from your program source, not from a rucksack. The
> schemas in a rucksack are just a way to make sure that Rucksack can adapt
> old instances to the current class definition in your program.

Right. What if there is no definition for class FOO in the new session?
A LOAD-YOUR-STUFF-RIGHT-NOW-ERROR is signalled, I assume.

> So when a rucksack is opened and fetches an instance of FOO for the first
> time, Rucksack should probably do the same thing as when a class is
> redefined: compare the current class definition with the most recent

Makes sense.

> Does that sound reasonable, or am I missing something?

I can't think of anything offhand. ;-)

The only point remaining is objects whose Rucksack has been closed: I
assume they are effectively dead, and touching them isn't allowed?

> I hope I've shown that this prohibition is not necessary.

Yes.
I'll do a short writeup on the desired semantics (based on this
discussion) -- both so that I have a chance to think this through again,
and so that we can see if what I have understood is what you have
meant.

Cheers,

 -- Nikodemus
    Schemer: "Buddha is small, clean, and serious."
    Lispnik: "Buddha is big, has hairy armpits, and laughs."

From alemmens at xs4all.nl Thu May 18 09:21:20 2006
From: alemmens at xs4all.nl (Arthur Lemmens)
Date: Thu, 18 May 2006 11:21:20 +0200
Subject: [rucksack-devel] Re: Rucksack, ECLM
In-Reply-To: <87fyj7hf5c.fsf@logxor.random-state.net>
References: <87lktk96iz.fsf@logxor.random-state.net>
 <878xp0zly6.fsf@logxor.random-state.net>
 <874pznix5z.fsf@logxor.random-state.net>
 <87fyj7hf5c.fsf@logxor.random-state.net>
Message-ID:

Nikodemus wrote:

>> I think the basic guideline should be that rucksacks never need to know
>> about each other. And another assumption is that the most recent class
>> definition comes from your program source, not from a rucksack. The
>> schemas in a rucksack are just a way to make sure that Rucksack can adapt
>> old instances to the current class definition in your program.
>
> Right. What if there is no definition for class FOO in the new session?
> A LOAD-YOUR-STUFF-RIGHT-NOW-ERROR is signalled, I assume.

Yep.

> The only point remaining is objects whose Rucksack has been closed: I
> assume they are effectively dead, and touching them isn't allowed?

I see no reason to prohibit reading them, but yes: changing them
is not allowed. (I should probably add a check for that.)

> I'll do a short writeup on the desired semantics (based on this
> discussion) -- both so that I have a chance to think this through again,
> and so that we can see if what I have understood is what you have
> meant.

Great.
Arthur From nikodemus at random-state.net Thu May 18 09:36:54 2006 From: nikodemus at random-state.net (Nikodemus Siivola) Date: Thu, 18 May 2006 12:36:54 +0300 Subject: [rucksack-devel] Re: Rucksack, ECLM In-Reply-To: (Arthur Lemmens's message of "Thu, 18 May 2006 11:21:20 +0200") References: <87lktk96iz.fsf@logxor.random-state.net> <878xp0zly6.fsf@logxor.random-state.net> <874pznix5z.fsf@logxor.random-state.net> <87fyj7hf5c.fsf@logxor.random-state.net> Message-ID: <873bf7ad3d.fsf@logxor.random-state.net> "Arthur Lemmens" writes: > I see no reason to prohibit reading them, but yes: changing them > is not allowed. (I should probably add a check for that.) Reading them is effectively prohibited anyways: if the Rucksack is closed the proxy cannot fetch the value... Cheers, -- Nikodemus Schemer: "Buddha is small, clean, and serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." From mb at bese.it Thu May 18 09:34:30 2006 From: mb at bese.it (Marco Baringer) Date: Thu, 18 May 2006 11:34:30 +0200 Subject: [rucksack-devel] Re: Rucksack, ECLM In-Reply-To: (Arthur Lemmens's message of "Thu, 18 May 2006 11:21:20 +0200") References: <87lktk96iz.fsf@logxor.random-state.net> <878xp0zly6.fsf@logxor.random-state.net> <874pznix5z.fsf@logxor.random-state.net> <87fyj7hf5c.fsf@logxor.random-state.net> Message-ID: "Arthur Lemmens" writes: >> The only point remaining is objects whose Rucksack has been closed: I >> assume they are effectively dead, and touching them isn't allowed? > > I see no reason to prohibit reading them, but yes: changing them > is not allowed. (I should probably add a check for that.) i don't think you should be able to read those objects. 1) what happens if the rucksack is modified by someone else who has, at a later time, opened it? 2) what happens when i want to read a "forward-pointer"? i think it would be better, simpler to implement and reason about, to require all interaction with persistent objects to happen within a transaction. p.s. 
i may be jumping into this discussion late and i may have missed something in one of the previous mails. -- -Marco Ring the bells that still can ring. Forget the perfect offering. There is a crack in everything. That's how the light gets in. -Leonard Cohen From alemmens at xs4all.nl Thu May 18 09:30:40 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Thu, 18 May 2006 11:30:40 +0200 Subject: [rucksack-devel] CVS repository for Rucksack Message-ID: Hi, There's a CVS repository for Rucksack on common-lisp.net now. See the CVS download instructions at common-lisp.net, or just select "Download tarball" at http://common-lisp.net/cgi-bin/viewcvs.cgi/?root=rucksack (I haven't tested either of these; if you have problems, let me know.) According to http://common-lisp.net/project-intro.shtml, any CVS commits will be sent to rucksack-cvs at common-lisp.net, but that doesn't seem to work at the moment. Erik, could you tell me if I need to do something to enable this? Thanks. If you want write access to the CVS repository, let me know and I'll try to figure out how to enable this. Arthur From alemmens at xs4all.nl Thu May 18 09:44:30 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Thu, 18 May 2006 11:44:30 +0200 Subject: [rucksack-devel] Re: Rucksack, ECLM In-Reply-To: <87lkszhftg.fsf@logxor.random-state.net> References: <87lktk96iz.fsf@logxor.random-state.net> <878xp0zly6.fsf@logxor.random-state.net> <87ac9fiycn.fsf@logxor.random-state.net> <87lkszhftg.fsf@logxor.random-state.net> Message-ID: Nikodemus wrote: > I was about to say something about Lisp and Rucksack identities of objects, > but I just realized what the _real_ trouble with this is: slot values. I'm not sure that I understand what you mean here. > If X was initialized as (make-instance 'my-p-thing :slot > (make-instance 'my-p-thing-2)), then by the time we tried to add the > root neither Lisp nor Rucksack would have the P-THING-2 around... Well... 
Lisp would still have the P-THING-2 around, if I understand your example correctly. >> Probably the best approach would be to have the GC automatically mark >> the in-memory copy of X as dead. Then we could signal an error (how >> about NECROPHILIA-NOT-ALLOWED-HERE?) for code that tries to do something >> with a dead object. > > I assume you mean "committing the transaction", not GC? No, I meant GC. The transaction has no way of knowing if X is dead or not. The only way to know that for sure is by doing the complete trace phase of the garbage collector (just checking the roots is not enough). That's obviously not something you want to do for each transaction commit. > If legality of the add-root bit depends on the amount of garbage > collected in between. Yeah, that's unfortunate but there's nothing we can do about that. That's why I really think it's a "don't do that then" scenario. We can try to do some cheap checks to prevent this kind of stuff from happening, but we can't give any guarantee that you're protected from this. It's just too expensive. Unless I'm missing something, of course. > I wonder if there should be a way to keep transient roots, though (eg. > P-LET, which informs Rucksack about the variables it binds). Interesting idea. You mean we'd have to keep some kind of P-LET binding stack for a rucksack? And all objects on that stack would be part of the root set? Sounds like a neat hack, but I'd like to see some kind of example where this is actually useful before implementing this. 
Arthur

From alemmens at xs4all.nl Thu May 18 09:58:27 2006
From: alemmens at xs4all.nl (Arthur Lemmens)
Date: Thu, 18 May 2006 11:58:27 +0200
Subject: [rucksack-devel] Re: Rucksack, ECLM
In-Reply-To: <873bf7ad3d.fsf@logxor.random-state.net>
References: <87lktk96iz.fsf@logxor.random-state.net>
 <878xp0zly6.fsf@logxor.random-state.net>
 <874pznix5z.fsf@logxor.random-state.net>
 <87fyj7hf5c.fsf@logxor.random-state.net>
 <873bf7ad3d.fsf@logxor.random-state.net>
Message-ID:

Nikodemus wrote:

>> I see no reason to prohibit reading them, but yes: changing them
>> is not allowed. (I should probably add a check for that.)
>
> Reading them is effectively prohibited anyways: if the Rucksack
> is closed the proxy cannot fetch the value...

Well... in principle something like

  (with-rucksack (...)
    (let (x)
      (with-transaction ()
        (setq x (make-instance 'persistent-something :foo "xyz")))
      (do-something-with-foo)))

might work, because DO-SOMETHING-WITH-FOO wouldn't need to dereference
any proxies in this case. But I agree with Marco that we just shouldn't
allow this, even though it may work in some cases.

Arthur

From alemmens at xs4all.nl Thu May 18 09:53:01 2006
From: alemmens at xs4all.nl (Arthur Lemmens)
Date: Thu, 18 May 2006 11:53:01 +0200
Subject: [rucksack-devel] Re: Rucksack, ECLM
In-Reply-To:
References: <87lktk96iz.fsf@logxor.random-state.net>
 <878xp0zly6.fsf@logxor.random-state.net>
 <874pznix5z.fsf@logxor.random-state.net>
 <87fyj7hf5c.fsf@logxor.random-state.net>
Message-ID:

Marco wrote:

>>> The only point remaining is objects whose Rucksack has been closed: I
>>> assume they are effectively dead, and touching them isn't allowed?
>>
>> I see no reason to prohibit reading them, but yes: changing them
>> is not allowed. (I should probably add a check for that.)
>
> i don't think you should be able to read those objects.
>
> 1) what happens if the rucksack is modified by someone else who has,
> at a later time, opened it?
>
> 2) what happens when i want to read a "forward-pointer"?
> > i think it would be better, simpler to implement and reason about, to > require all interaction with persistent objects to happen within a > transaction. Yes, I think you're right. I'm not sure that we can always enforce this automatically, but we should document that you're not supposed to have any references to persistent objects outside of a transaction. That would also prohibit Nikodemus' trouble scenario. Arthur From nikodemus at random-state.net Thu May 18 10:45:09 2006 From: nikodemus at random-state.net (Nikodemus Siivola) Date: Thu, 18 May 2006 13:45:09 +0300 Subject: [rucksack-devel] Re: Rucksack, ECLM In-Reply-To: (Arthur Lemmens's message of "Thu, 18 May 2006 11:44:30 +0200") References: <87lktk96iz.fsf@logxor.random-state.net> <878xp0zly6.fsf@logxor.random-state.net> <87ac9fiycn.fsf@logxor.random-state.net> <87lkszhftg.fsf@logxor.random-state.net> Message-ID: <87ves38vd6.fsf@logxor.random-state.net> "Arthur Lemmens" writes: >> If X was initialized as (make-instance 'my-p-thing :slot >> (make-instance 'my-p-thing-2)), then by the time we tried to add the >> root neither Lisp nor Rucksack would have the P-THING-2 around... > > Well... Lisp would still have the P-THING-2 around, if I understand > your example correctly. My intention was that the SLOT was proxied -- so Lisp would not have a direct reference to it. >>> Probably the best approach would be to have the GC automatically mark >>> the in-memory copy of X as dead. Then we could signal an error (how >>> about NECROPHILIA-NOT-ALLOWED-HERE?) for code that tries to do something >>> with a dead object. >> >> I assume you mean "committing the transaction", not GC? > > No, I meant GC. The transaction has no way of knowing if X is dead or not. > The only way to know that for sure is by doing the complete trace phase of > the garbage collector (just checking the roots is not enough). That's > obviously not something you want to do for each transaction commit. Right. 
(Unless we maintained back-pointers for all pointers on disk, but that
sounds like a horrible idea.)

>> If legality of the add-root bit depends on the amount of garbage
>> collected in between.
>
> Yeah, that's unfortunate but there's nothing we can do about that. That's
> why I really think it's a "don't do that then" scenario. We can try to
> do some cheap checks to prevent this kind of stuff from happening, but
> we can't give any guarantee that you're protected from this. It's just
> too expensive. Unless I'm missing something, of course.

I'm not sure. Starting from the top: to do anything at all with a
persistent object (ignoring non-persistent slots) you need to

 1. Have the rucksack the object belongs to open.

 2. Be in a transaction.

Could this "anything" be extended to holding a reference? So that when
a transaction exits, all in-memory objects with that transaction-id
become invalid, and trying to use them in the context of another
transaction would signal an error.

Then, the only way another transaction can legally get a hold of the
same object is by getting it afresh from the Rucksack.

That provides clear semantics and determinism, but is somewhat
inconvenient. No problem: indexed objects can be exempted from this
invalidation, as they are always reachable (and it is cheap to check
if an object is directly indexed). For other cases P-LET (call it
WITH-ROOTS) can provide the same guarantee and convenience:

  (WITH-ROOTS (X Y) ...)
  =>
  (LET (X Y)
    ;; LET* rather than LET, so that the LAMBDA can see TAIL-THUNK.
    (LET* ((#:TAIL-THUNK *ROOT-THUNK*)
           (*ROOT-THUNK* (LAMBDA () (LIST* X Y (FUNCALL #:TAIL-THUNK)))))
      ...))

Now, both GC and transactions have access to additional in-memory
roots through the *ROOT-THUNK*. GC can use them just as regular roots.
Transactions just need to check if an object with an old
transaction-id is in the in-memory roots: if so, then using it is
fine. If not, an error is signalled.

Does that make sense?
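
[Editor's sketch, not part of the original message.] The WITH-ROOTS
expansion above can be written out as a macro. This is a minimal sketch
under stated assumptions: *ROOT-THUNK* and WITH-ROOTS are the names
proposed in the message, not part of any released Rucksack API, and the
default value of *ROOT-THUNK* (a thunk returning the empty list) is an
assumption made here so that the outermost WITH-ROOTS has a tail to call.

```lisp
;; Hypothetical: the chain of in-memory roots, represented as a thunk
;; that returns the list of currently registered root objects.
(defvar *root-thunk* (lambda () '()))

(defmacro with-roots ((&rest vars) &body body)
  "Bind VARS (initially NIL) and register their current values as
additional in-memory roots for the dynamic extent of BODY."
  (let ((tail (gensym "TAIL-THUNK")))
    `(let ,vars
       ;; LET* so the new thunk closes over the saved outer thunk.
       (let* ((,tail *root-thunk*)
              (*root-thunk*
                (lambda () (list* ,@vars (funcall ,tail)))))
         ,@body))))
```

Because the thunk closes over the variable bindings rather than their
values, it reports whatever X and Y hold at the moment it is called,
which is exactly the single-assignment concern raised in the message.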
(Strictly speaking, I think WITH-ROOTS needs to jump through a couple
of extra hoops to ensure single-assignment semantics, as otherwise you
can still get non-deterministic behaviour by removing an object from
the roots for the duration of GC and then putting it back. That won't
be too hard, though: nothing a symbol-macro can't handle.)

Cheers,

 -- Nikodemus
    Schemer: "Buddha is small, clean, and serious."
    Lispnik: "Buddha is big, has hairy armpits, and laughs."

From alemmens at xs4all.nl Thu May 18 10:43:56 2006
From: alemmens at xs4all.nl (Arthur Lemmens)
Date: Thu, 18 May 2006 12:43:56 +0200
Subject: [rucksack-devel] Re: Rucksack, ECLM
In-Reply-To: <87fyj8z229.fsf@logxor.random-state.net>
References: <87lktk96iz.fsf@logxor.random-state.net>
 <878xp0zly6.fsf@logxor.random-state.net>
 <87fyj8z229.fsf@logxor.random-state.net>
Message-ID:

Nikodemus wrote:

>> No, with my multiple versioning scheme this is not true. The
>> important point here is that it does not (and should not) make any
>> difference whether A has committed or not. The only difference
>> between a committed transaction and an 'open' (not yet committed)
>> transaction is that you can be sure that the committed transaction
>> has written its changes to disk.
>
> So in Rucksack transactions are about consistency and durability, not
> isolation and atomicity?

I'm not totally sure why you put it like that. Let me try to explain
what I expect from transactions:

- Once a transaction has committed, you should have a guarantee that
  your changes are made persistent: when you reboot Rucksack after
  a commit, it should find all those changes. That's the durability
  part.

- Transactions should be all-or-nothing. If an error occurs during a
  transaction and a transaction is aborted (rolled back), all of the
  changes made by that transaction should be undone. That's the
  atomicity part.

- Parallel transactions should behave as if they happened sequentially.
  They should have the same effect as if the oldest transaction has
  started and completed its job before all younger transactions. That's
  the isolation part I think. (Maybe it's also the consistency part;
  but it always seemed to me that consistency must be guaranteed by
  the application, not just by the persistence library.)

> Scenario 1.
>
> A enters.
> A increments C -> 1.
> B enters.
> B increments C -> 2.
> A aborts (rollback).
> B commits.

Good point. I think B shouldn't be allowed to commit as is, because it
depends on a value that was modified by A. The simplest solution seems
to be to abort B when A is aborted. In general, we could abort a
transaction if it depends on a value that was created/modified by a
transaction that's being aborted. I think this is doable in practice;
Rucksack already keeps track of all the information that's necessary
to do this.

This seems to be consistent with my transaction requirements above.

> Scenario 2.
>
> A enters.
> A increments C -> 1.
> B enters.
> B increments C -> 2.
> B commits.
> A aborts (rollback).

This one is more problematic. After B commits, the application should
have the guarantee that B's values won't change anymore. But when A
aborts, the correct solution seems to be to abort B too (because B has
a value that depends on A). The only solution I can come up with is
to prevent B from committing until A has committed. That means it would
have to block until A has committed, but that may destroy part of the
performance advantage that we could get from multiple versioning.

Hmmm. Do you see other options here?
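
[Editor's sketch, not part of the original message.] The two rules
proposed above (abort a transaction when one it depends on is aborted;
do not let a transaction commit while it still depends on an open one)
can be modelled in a few lines. This is a toy model under stated
assumptions: the TXN structure and all function names are hypothetical
and do not come from Rucksack, and a real implementation would block in
COMMIT-TXN instead of returning :MUST-WAIT.

```lisp
(defstruct txn
  id
  (state :open)       ; :OPEN, :COMMITTED or :ABORTED
  (depends-on '())    ; transactions whose uncommitted writes we read
  (dependents '()))   ; transactions that read our uncommitted writes

(defun note-dependency (reader writer)
  "Record that READER saw a value written by the still-open WRITER."
  (when (eq (txn-state writer) :open)
    (pushnew writer (txn-depends-on reader))
    (pushnew reader (txn-dependents writer))))

(defun abort-txn (txn)
  "Abort TXN and, recursively, every open transaction depending on it."
  (when (eq (txn-state txn) :open)
    (setf (txn-state txn) :aborted)
    (mapc #'abort-txn (txn-dependents txn)))
  txn)

(defun commit-txn (txn)
  "Return the resulting state of TXN, or :MUST-WAIT if TXN still
depends on a transaction that has not committed yet."
  (cond ((not (eq (txn-state txn) :open))
         (txn-state txn))
        ((every (lambda (dep) (eq (txn-state dep) :committed))
                (txn-depends-on txn))
         (setf (txn-state txn) :committed))
        (t :must-wait)))
```

In Scenario 1, B reads A's uncommitted counter, so aborting A also
aborts B. In Scenario 2, B's commit attempt yields :MUST-WAIT until A
has committed, which is the blocking behaviour whose cost is discussed
above.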
Arthur

From nikodemus at random-state.net Thu May 18 11:23:51 2006
From: nikodemus at random-state.net (Nikodemus Siivola)
Date: Thu, 18 May 2006 14:23:51 +0300
Subject: [rucksack-devel] Re: Rucksack, ECLM
In-Reply-To: (Arthur Lemmens's message of "Thu, 18 May 2006 12:43:56 +0200")
References: <87lktk96iz.fsf@logxor.random-state.net>
 <878xp0zly6.fsf@logxor.random-state.net>
 <87fyj8z229.fsf@logxor.random-state.net>
Message-ID: <87psib8tko.fsf@logxor.random-state.net>

"Arthur Lemmens" writes:

> I'm not totally sure why you put it like that. Let me try to explain
> what I expect from transactions:

Right. Fits my expectations too.

>> Scenario 1.
>>
>> A enters.
>> A increments C -> 1.
>> B enters.
>> B increments C -> 2.
>> A aborts (rollback).
>> B commits.
>
> Good point. I think B shouldn't be allowed to commit as is, because it
> depends on a value that was modified by A. The simplest solution seems
> to be to abort B when A is aborted.

> This seems to be consistent with my transaction requirements above.

Right.

>> Scenario 2.
>>
>> A enters.
>> A increments C -> 1.
>> B enters.
>> B increments C -> 2.
>> B commits.
>> A aborts (rollback).
>
> This one is more problematic. After B commits, the application should
> have the guarantee that B's values won't change anymore. But when A
> aborts, the correct solution seems to be to abort B too (because B has
> a value that depends on A). The only solution I can come up with is
> to prevent B from committing until A has committed.

That would work, I think.

> That means it would have to block until A has committed, but that
> may destroy part of the performance advantage that we could get from
> multiple versioning.

What if transactions were allowed to choose the isolation level they
wanted?

 a) Uncommitted changes from other transactions visible. The moment
    B sees an uncommitted change from A, it becomes dependent on A,
    as above (even if it doesn't modify C).
b) Only committed changes from other transactions visible. No dependencies as above, but more conflicts (B needs to be aborted when it tries to increment C). Better parallelism in some cases. Cheers, -- Nikodemus Schemer: "Buddha is small, clean, and serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." From nikodemus at random-state.net Thu May 18 11:32:21 2006 From: nikodemus at random-state.net (Nikodemus Siivola) Date: Thu, 18 May 2006 14:32:21 +0300 Subject: [rucksack-devel] Re: [admin] CVS repository for Rucksack In-Reply-To: (Arthur Lemmens's message of "Thu, 18 May 2006 11:30:40 +0200") References: Message-ID: <87iro38t6i.fsf@logxor.random-state.net> "Arthur Lemmens" writes: > (I haven't tested either of these; if you have problems, let me know.) Anoncvs is not there yet, but developer CVS seems to work fine. (Any clnet registered developer has read-only access the developer CVS.) > If you want write access to the CVS repository, let me know and I'll try > to figure out how to enable this. The usual clnet thing is to add people to the rucksack group (and keep the CVS g+w). I'm fine without write-access for now -- but once you get tired of merging my patches I'm fine with it too. ;-) Cheers, -- Nikodemus Schemer: "Buddha is small, clean, and serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." From mb at bese.it Thu May 18 11:46:49 2006 From: mb at bese.it (Marco Baringer) Date: Thu, 18 May 2006 13:46:49 +0200 Subject: [rucksack-devel] Re: Rucksack, ECLM In-Reply-To: (Arthur Lemmens's message of "Thu, 18 May 2006 12:43:56 +0200") References: <87lktk96iz.fsf@logxor.random-state.net> <878xp0zly6.fsf@logxor.random-state.net> <87fyj8z229.fsf@logxor.random-state.net> Message-ID: "Arthur Lemmens" writes: > I'm not totally sure why you put it like that. 
> Let me try to explain
> what I expect from transactions:
>
> - Once a transaction has committed, you should have a guarantee that
> your changes are made persistent: when you reboot Rucksack after
> a commit, it should find all those changes. That's the durability
> part.
>
> - Transactions should be all-or-nothing. If an error occurs during a
> transaction and a transaction is aborted (rolled back), all of the
> changes made by that transaction should be undone. That's the
> atomicity part.
>
> - Parallel transactions should behave as if they happened sequentially.
> They should have the same effect as if the oldest transaction has
> started and completed its job before all younger transactions. That's
> the isolation part I think. (Maybe it's also the consistency part;
> but it always seemed to me that consistency must be guaranteed by
> the application, not just by the persistence library.)

this is the isolation part. the consistency part is generally used
when talking about foreign key references, which i don't think really
apply to rucksack (except for the constraint that if object A has a
reference to object B then B must exist).

however i disagree with "They should have the same effect as if the
oldest transaction has started and completed its job before all
younger transactions" i believe it should read:

"They should have the same effect as if all the operations happened in
the same instant as the commits, and they should have no effect at all
in the case of an abort or a rollback."

the difference is that an older transaction has no implicit priority
over a younger one, the only thing that matters is the order of the
commits (or aborts). the problem of a long running transaction getting
starved by a constant flow of short transactions remains, as does the
problem of a long running transaction blocking all other transactions.
this is a well known problem and has various solutions (atm i can't
remember all the details), the one solution i believe is best is to
randomly choose the victim (but giving lower kill probabilities to old
transactions).

>> Scenario 1.
>>
>> A enters.
>> A increments C -> 1.
>> B enters.
>> B increments C -> 2.

this is wrong.

>> A aborts (rollback).
>> B commits.
>
> Good point. I think B shouldn't be allowed to commit as is, because it
> depends on a value that was modified by A. The simplest solution seems
> to be to abort B when A is aborted. In general, we could abort a
> transaction if it depends on a value that was created/modified by a
> transaction that's being aborted. I think this is doable in practice;
> Rucksack already keeps track of all the information that's necessary
> to do this.

At the time of B's commit we have a history of:

  read(C,A); write(C,A); read(C,B); write(C,B); abort(A); commit(B)

This is a serializable history, and it must be equivalent to:

  read(C,B); write(C,B); commit(B)

due to the isolation constraint. so we need to end up with C -> 1.
since write(C,A) creates a new C, thanks to multiple versioning, I
don't see a problem with implementing this.

> This seems to be consistent with my transaction requirements above.

except for the fact that C should be 1 (in both B and A's
transactions) I agree.

>> Scenario 2.
>>
>> A enters.
>> A increments C -> 1.
>> B enters.
>> B increments C -> 2.
>> B commits.
>> A aborts (rollback).
>
> This one is more problematic. After B commits, the application should
> have the guarantee that B's values won't change anymore. But when A
> aborts, the correct solution seems to be to abort B too (because B has
> a value that depends on A). The only solution I can come up with is
> to prevent B from committing until A has committed. That means it would
> have to block until A has committed, but that may destroy part of the
> performance advantage that we could get from multiple versioning.
nb: as above I believe B should have set C to 1, not 2.

since transactions occur in complete isolation we don't look at
conflicts until the transactions commit and only in that instant
should you decide who wins and who loses. therefore this history isn't
any more problematic than the first and A needs to abort.

Since A and B would conflict it may be necessary to block B's commit
for a certain amount of time so that we keep the option of aborting B
(assuming A is in an important, but long running, transaction) open.

I hope this makes sense and i hope i'm not making too great of a fool
of myself.

-- 
-Marco
Ring the bells that still can ring.
Forget the perfect offering.
There is a crack in everything.
That's how the light gets in.
     -Leonard Cohen

From nikodemus at random-state.net Thu May 18 12:23:32 2006
From: nikodemus at random-state.net (Nikodemus Siivola)
Date: Thu, 18 May 2006 15:23:32 +0300
Subject: [rucksack-devel] Rucksack philosophy
Message-ID: <87d5eb8qt7.fsf@logxor.random-state.net>

Taking the high-level view here (describing the ideal state of
things).

Semantic constraints

 * A persistent object always belongs to exactly one rucksack, which
   must be open for the instance to be valid.

 * The rucksack the object belongs to is the sole arbiter of its
   identity: EQ is meaningless when talking about persistent objects.

 * References to non-root persistent objects outside transactions are
   forbidden. Root-objects are (1) objects in RUCKSACK-ROOTS, (2)
   indexed objects [maybe (3) lexical roots declared with WITH-ROOTS].

 * Reading and writing a persistent slot outside a transaction is
   forbidden.

 * Transactions provide basic ACID properties, though the level of
   isolation may be at variance with the usual definition.

Implementation

 * Enforcement of semantic constraints is preferable when possible
   without extreme costs.

 * Correctness is paramount.

 * Flexibility is more important than extreme speed.

 * Portable Common Lisp. (Accepted extras limited to threading & MOP.)
Am I on the right page?

Open questions

 * Nested transactions: do we want them? (My gut feeling would be to
   prohibit them initially: they can always be added later.)

 * Is the serialization API part of Rucksack's public interface?

Cheers,

 -- Nikodemus
    Schemer: "Buddha is small, clean, and serious."
    Lispnik: "Buddha is big, has hairy armpits, and laughs."

From mb at bese.it Thu May 18 12:33:13 2006
From: mb at bese.it (Marco Baringer)
Date: Thu, 18 May 2006 14:33:13 +0200
Subject: [rucksack-devel] openmcl patch
Message-ID:

the attached, trivial, patch adds openmcl support.

i have one question re serialization of pathnames:

rucksack used to use host-namestring for grabbing the host part of a
pathname. however on openmcl this returns "" which, when used with
make-pathname :host, creates a logical pathname. pathname-host on the
other hand returns :unspecified which can be passed back to
make-pathname to get an object which is equal to the original (all the
other tests passed without modification). is there a reason for using
host-namestring i'm not aware of?

-- 
-Marco
Ring the bells that still can ring.
Forget the perfect offering.
There is a crack in everything.
That's how the light gets in.
     -Leonard Cohen

From nikodemus at random-state.net Thu May 18 12:39:06 2006
From: nikodemus at random-state.net (Nikodemus Siivola)
Date: Thu, 18 May 2006 15:39:06 +0300
Subject: [rucksack-devel] openmcl patch
In-Reply-To: (Marco Baringer's message of "Thu, 18 May 2006 14:33:13 +0200")
References:
Message-ID: <873bf78q39.fsf@logxor.random-state.net>

Marco Baringer writes:

> rucksack used to use host-namestring for grabbing the host part of a
> pathname. however on openmcl this returns "" which, when used with
> make-pathname :host, creates a logical pathname. pathname-host on the
> other hand returns :unspecified which can be passed back to
> make-pathname to get an object which is equal to the original (all the
> other tests passed without modification).
> is there a reason for using
> host-namestring i'm not aware of?

Sorry about that. I did it for SBCL: SBCL returns # which isn't
serializable. This is, I think, a bug in SBCL, but I haven't
double-checked it yet. If I'm correct it can be reverted to
pathname-host, and I'll patch SBCL.

Cheers,

 -- Nikodemus
    Schemer: "Buddha is small, clean, and serious."
    Lispnik: "Buddha is big, has hairy armpits, and laughs."

From nikodemus at random-state.net Thu May 18 12:39:35 2006
From: nikodemus at random-state.net (Nikodemus Siivola)
Date: Thu, 18 May 2006 15:39:35 +0300
Subject: [rucksack-devel] openmcl patch
In-Reply-To: (Marco Baringer's message of "Thu, 18 May 2006 14:33:13 +0200")
References:
Message-ID: <87wtcj7bi0.fsf@logxor.random-state.net>

Marco Baringer writes:

> the attached, trivial, patch adds openmcl support.

Uh... no attachment. ;-)

Cheers,

 -- Nikodemus
    Schemer: "Buddha is small, clean, and serious."
    Lispnik: "Buddha is big, has hairy armpits, and laughs."

From alemmens at xs4all.nl Thu May 18 12:37:40 2006
From: alemmens at xs4all.nl (Arthur Lemmens)
Date: Thu, 18 May 2006 14:37:40 +0200
Subject: [rucksack-devel] openmcl patch
In-Reply-To:
References:
Message-ID:

Marco wrote:

> the attached, trivial, patch adds openmcl support.

Except that it's not attached ;-)

> rucksack used to use host-namestring for grabbing the host part of a
> pathname. however on openmcl this returns "" which, when used with
> make-pathname :host, creates a logical pathname. pathname-host on the
> other hand returns :unspecified which can be passed back to
> make-pathname to get an object which is equal to the original (all the
> other tests passed without modification). is there a reason for using
> host-namestring i'm not aware of?

I'll let Nikodemus answer this one. I used to have pathname-host, but
Nikodemus changed it to host-namestring. I was too lazy to ask Nikodemus
why and just merged his patch.
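
[Editor's sketch, not part of the original message.] The round trip
discussed above (take a pathname apart with PATHNAME-HOST and friends,
then rebuild it with MAKE-PATHNAME) might look like the following. The
two function names are hypothetical and this is not Rucksack's actual
serializer; it only illustrates the component round trip. Whether the
host component itself can be written to disk is implementation
dependent, which is exactly the SBCL issue mentioned in this thread.

```lisp
(defun pathname->plist (pathname)
  "Split PATHNAME into its six standard components."
  (list :host (pathname-host pathname)
        :device (pathname-device pathname)
        :directory (pathname-directory pathname)
        :name (pathname-name pathname)
        :type (pathname-type pathname)
        :version (pathname-version pathname)))

(defun plist->pathname (plist)
  "Rebuild a pathname from the list made by PATHNAME->PLIST."
  (apply #'make-pathname plist))
```

Using HOST-NAMESTRING in place of PATHNAME-HOST in the first function
would feed a string back into MAKE-PATHNAME, which is what produced a
logical pathname on openmcl according to Marco's report.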
Arthur From mb at bese.it Thu May 18 12:42:45 2006 From: mb at bese.it (Marco Baringer) Date: Thu, 18 May 2006 14:42:45 +0200 Subject: [rucksack-devel] openmcl patch In-Reply-To: (Arthur Lemmens's message of "Thu, 18 May 2006 14:37:40 +0200") References: Message-ID: "Arthur Lemmens" writes: > Marco wrote: > >> the attached, trivial, patch adds openmcl support. > > Except that it's not attached ;-) ... -------------- next part -------------- A non-text attachment was scrubbed... Name: openmcl.patch Type: text/x-patch Size: 3888 bytes Desc: not available URL: -------------- next part -------------- -- -Marco Ring the bells that still can ring. Forget the perfect offering. There is a crack in everything. That's how the light gets in. -Leonard Cohen From alemmens at xs4all.nl Thu May 18 12:58:01 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Thu, 18 May 2006 14:58:01 +0200 Subject: [rucksack-devel] openmcl patch In-Reply-To: References: Message-ID: Marco wrote: > the attached, trivial, patch adds openmcl support. Thanks. Your changes are in CVS now. (That includes reverting to pathname-host, so Nikodemus will have to patch SBCL.) Arthur From alemmens at xs4all.nl Thu May 18 13:14:01 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Thu, 18 May 2006 15:14:01 +0200 Subject: [rucksack-devel] Re: Rucksack, ECLM In-Reply-To: References: <87lktk96iz.fsf@logxor.random-state.net> <878xp0zly6.fsf@logxor.random-state.net> <87fyj8z229.fsf@logxor.random-state.net> Message-ID: Marco wrote: > this is the isolation part. the consistency part is generally used > when talking about foreign key references Ah, right. > which i don't think really apply to rucksack (except for the > constraint that if object A has a reference to object B then > B must exist). In Rucksack, it's impossible for A to have a reference to B if B does not exist. 
> however i disagree with "They should have the same effect as if the > oldest transaction has started and completed its job before all > younger transactions" i believe it should read: > > "They should have the same effect as if all the operations happend in > the same instant as the commits That's not the approach I've taken (see my answer to Nikodemus in http://common-lisp.net/pipermail/rucksack-devel/2006-May/000001.html), but maybe the approach you describe is better. I need to think about the consequences. I'll reply to the rest of your message later. Arthur From mb at bese.it Thu May 18 13:36:30 2006 From: mb at bese.it (Marco Baringer) Date: Thu, 18 May 2006 15:36:30 +0200 Subject: [rucksack-devel] Re: Rucksack, ECLM In-Reply-To: (Arthur Lemmens's message of "Thu, 18 May 2006 15:14:01 +0200") References: <87lktk96iz.fsf@logxor.random-state.net> <878xp0zly6.fsf@logxor.random-state.net> <87fyj8z229.fsf@logxor.random-state.net> Message-ID: "Arthur Lemmens" writes: >> however i disagree with "They should have the same effect as if the >> oldest transaction has started and completed its job before all >> younger transactions" i believe it should read: >> >> "They should have the same effect as if all the operations happend in >> the same instant as the commits > > That's not the approach I've taken (see my answer to Nikodemus in > http://common-lisp.net/pipermail/rucksack-devel/2006-May/000001.html), > but maybe the approach you describe is better. I need to think about > the consequences. i'd missed that bit. so rucksack would be just a persistent lisp heap, as opposed to transactional lisp object database, works for me :) considering that the approach i described is difficult to implement and slower, it may very well not be worth it. > I'll reply to the rest of your message later. the rest of my message was based on the assumption that rucksack needed to provide pure level-4 isolation. 
read-commited isolation is fine and if that's the route rucksack takes there's no need to respond to the rest of my mail. -- -Marco Ring the bells that still can ring. Forget the perfect offering. There is a crack in everything. That's how the light gets in. -Leonard Cohen From erik.enge at gmail.com Thu May 18 14:01:58 2006 From: erik.enge at gmail.com (Erik Enge) Date: Thu, 18 May 2006 10:01:58 -0400 Subject: [rucksack-devel] Re: [admin] CVS repository for Rucksack In-Reply-To: References: Message-ID: <58f839b70605180701k68b65343keca2569758785711@mail.gmail.com> On 5/18/06, Arthur Lemmens wrote: > According to http://common-lisp.net/project-intro.shtml, any CVS commits > will be sent to rucksack-cvs at common-lisp.net, but that doesn't seem to > work at the moment. Erik, could you tell me if I need to do something > to enable this? Thanks. Sorry, not sure why this had broken all of a sudden. I have fixed it manually in your repository and will find the real bug. > If you want write access to the CVS repository, let me know and I'll try > to figure out how to enable this. Just email admin at common-lisp.net with their common-lisp.net username or fullname and GPG key if they don't have one. Thanks, Erik. 
From nikodemus at random-state.net Thu May 18 14:34:57 2006
From: nikodemus at random-state.net (Nikodemus Siivola)
Date: Thu, 18 May 2006 17:34:57 +0300
Subject: [rucksack-devel] Persistent identity
Message-ID: <87u07nl7u6.fsf@logxor.random-state.net>

Here's a small identity example / test-case:

 (defclass p-test ()
   ((slot :persistence t :initarg :slot :accessor slot-of))
   (:metaclass persistent-class))

 (let (result)
   (with-rucksack (r "/tmp/test-rucksack/" :if-exists :supersede)
     (let (a)
       (with-transaction ()
         (add-rucksack-root (setf a (make-instance 'p-test :slot 0)) r))
       (with-transaction ()
         (incf (slot-of a)))
       (with-transaction ()
         (incf (slot-of (car (rucksack-roots r)))))
       (with-transaction ()
         (setf result (slot-of a)))))
   result)

I believe RESULT should be 2 in the end. Is that right? (Right now this breaks at the first INCF. If you comment that one out the second one works, but the result is 0.)

If so, then I believe (SETF SLOT-VALUE-USING-CLASS) will need to set up proxies, and not just deserialization.

(Of course, this violates the constraint of holding a reference to an object outside transaction. If that is accepted as a strict limit, then the case is meaningless. I'm working on the assumption that holding on to a root object is legal.)

Cheers,

 -- Nikodemus
Schemer: "Buddha is small, clean, and serious."
Lispnik: "Buddha is big, has hairy armpits, and laughs."

From alemmens at xs4all.nl Thu May 18 14:30:22 2006
From: alemmens at xs4all.nl (Arthur Lemmens)
Date: Thu, 18 May 2006 16:30:22 +0200
Subject: [rucksack-devel] Re: [admin] CVS repository for Rucksack
In-Reply-To: <58f839b70605180701k68b65343keca2569758785711@mail.gmail.com>
References: <58f839b70605180701k68b65343keca2569758785711@mail.gmail.com>
Message-ID:

Erik wrote:

> Sorry, not sure why this had broken all of a sudden. I have fixed it
> manually in your repository and will find the real bug.

OK, thanks.
>> If you want write access to the CVS repository, let me know and I'll try >> to figure out how to enable this. > > Just email admin at common-lisp.net with their common-lisp.net username > or fullname and GPG key if they don't have one. Noted, thanks. Arthur From edi at agharta.de Thu May 18 14:44:11 2006 From: edi at agharta.de (Edi Weitz) Date: Thu, 18 May 2006 16:44:11 +0200 Subject: [rucksack-devel] Webpage Message-ID: Hi! Unfortunately, I don't have enough time right now to deal with the technical details of Rucksack (although I'd really like to), but I just wanted to note that - given the fact that Rucksack is still called "vaporware" in some circles - it's probably a good idea to link to the CVS repository from the homepage. Some people might be surprised... :) Cheers, Edi. From edi at agharta.de Thu May 18 15:26:12 2006 From: edi at agharta.de (Edi Weitz) Date: Thu, 18 May 2006 17:26:12 +0200 Subject: [rucksack-devel] GC bug? Message-ID: I /think/ there's a bug in the garbage collector where MAX-HEAP-END sometimes isn't initialized. At least it broke when I ran some tests. Maybe the following is an acceptable fix. I don't understand enough of Rucksack yet to really be sure about it. Cheers, Edi. -------------- next part -------------- A non-text attachment was scrubbed... Name: rucksack.diff Type: text/x-patch Size: 804 bytes Desc: not available URL: From edi at agharta.de Thu May 18 15:28:20 2006 From: edi at agharta.de (Edi Weitz) Date: Thu, 18 May 2006 17:28:20 +0200 Subject: [rucksack-devel] Webpage In-Reply-To: (Edi Weitz's message of "Thu, 18 May 2006 16:44:11 +0200") References: Message-ID: On Thu, 18 May 2006 16:44:11 +0200, Edi Weitz wrote: > it's probably a good idea to link to the CVS repository from the > homepage And to the mailing list. From alemmens at xs4all.nl Thu May 18 15:47:18 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Thu, 18 May 2006 17:47:18 +0200 Subject: [rucksack-devel] GC bug? 
In-Reply-To: References: Message-ID: Edi wrote: > I /think/ there's a bug in the garbage collector where MAX-HEAP-END > sometimes isn't initialized. At least it broke when I ran some tests. > > Maybe the following is an acceptable fix. Looks reasonable to me, thanks. Committed. > I don't understand enough of Rucksack yet to really be sure about it. I don't either ;-) Arthur From alemmens at xs4all.nl Thu May 18 16:57:41 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Thu, 18 May 2006 18:57:41 +0200 Subject: [rucksack-devel] Webpage In-Reply-To: References: Message-ID: Edi wrote: >> it's probably a good idea to link to the CVS repository from the >> homepage > > And to the mailing list. Done. Arthur From edi at agharta.de Thu May 18 18:56:48 2006 From: edi at agharta.de (Edi Weitz) Date: Thu, 18 May 2006 20:56:48 +0200 Subject: [rucksack-devel] Another GC patch Message-ID: Here's another one I'm not sure about: In SWEEP-SOME-HEAP-BLOCKS the comments explicitely say that BLOCK-START can be NIL. However, the code after the "Reclaim dead blocks" comment seems to assume that BLOCK-START is a positive integer when it calls BLOCK-ALIVE-P. Now, what does it mean if BLOCK-START is NIL? Does it mean that the block is not dead? In that case my tiny patch might be correct. Otherwise, it's up to someone else... :) Cheers, Edi. -------------- next part -------------- A non-text attachment was scrubbed... Name: rucksack.diff Type: text/x-patch Size: 596 bytes Desc: not available URL: From alemmens at xs4all.nl Thu May 18 21:31:36 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Thu, 18 May 2006 23:31:36 +0200 Subject: [rucksack-devel] Another GC patch In-Reply-To: References: Message-ID: Edi wrote: > Here's another one I'm not sure about: In SWEEP-SOME-HEAP-BLOCKS the > comments explicitely say that BLOCK-START can be NIL. 
However, the
> code after the "Reclaim dead blocks" comment seems to assume that
> BLOCK-START is a positive integer when it calls BLOCK-ALIVE-P.

Yes. Like I said to Nikodemus, there are some places in the garbage collector that need to be adapted to the 'new' object layout (the old one didn't have pointers to previous versions and transaction ids for each object). This is probably one such place.

Here's a description of blocks:

** Blocks

The heap contains blocks of different sizes (currently the block sizes are powers of 2; starting with blocks of 16 bytes). Each block starts with an 8-byte header. If the block is unoccupied, the header contains a pointer to the next block in the free list; otherwise it contains the size of the block.

The header is followed by a serialized value which is either NIL, a positive integer or a negative integer. If it's NIL, the block is occupied by an object of which there is exactly one version. If it's a positive integer, the block is occupied by an object and the integer is a pointer to (the heap position of) the previously saved version of the object. If it's negative, the block belongs to a free list and is not in use; the integer's absolute value is the size of the block (the sweep phase of the garbage collector needs this block size).

 [OCCUPIED BLOCK]:
    0- 8: block size
    8-15: pointer to previous version (nil or an integer)
    ..  : transaction id
    ..  : object id
    ..  : nr of slots
    ..  : schema id
    ... : serialized slots
    ... : maybe some free space

 [FREE BLOCK]:
    0- 8: pointer to next free block
    ..  : the negative of the block size
    ... : free space

> Now, what does it mean if BLOCK-START is NIL?

BLOCK-START points to byte 8 of the block. So the block is occupied by the one and only version of an object.

> Does it mean that the block is not dead?

It's not really related to live or dead: that's indicated by the garbage collector info byte in the object table. If you have an object id, you can find that byte.
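
The three cases for the value that follows the block header can be sketched as a small classifier. This is only an illustration of the description above: CLASSIFY-BLOCK-MARKER is a hypothetical name, not a function in Rucksack, and it takes the already deserialized value instead of reading it from a heap.

```lisp
;; Hypothetical helper, not part of Rucksack: interprets the serialized
;; value that follows the 8-byte block header, as described above.
(defun classify-block-marker (marker)
  "Return the kind of block and, where applicable, the related integer."
  (cond ((null marker)
         ;; Occupied by the one and only version of an object.
         (values :single-version nil))
        ((and (integerp marker) (plusp marker))
         ;; Occupied; MARKER is the heap position of the previous version.
         (values :has-previous-version marker))
        ((and (integerp marker) (minusp marker))
         ;; On a free list; the absolute value is the block size.
         (values :free (abs marker)))
        (t
         (error "Unexpected block marker: ~S" marker))))
```

Note that none of the three cases says whether an occupied block is live or dead; as explained above, that information sits in the object table, not in the block.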
> In that case my tiny patch might be correct. It probably isn't (I haven't looked closely). And even if it is correct, it's not enough: the function BLOCK-ALIVE-P needs to be adapted too. It's still based on the old block layout, where BLOCK-START pointed to an object id. Arthur From edi at agharta.de Thu May 18 21:47:12 2006 From: edi at agharta.de (Edi Weitz) Date: Thu, 18 May 2006 23:47:12 +0200 Subject: [rucksack-devel] Queue patch Message-ID: This probably won't affect the rest Rucksack but QUEUE-PEEK seems to have bugs. Patch attached. Cheers, Edi. -------------- next part -------------- A non-text attachment was scrubbed... Name: rucksack.diff Type: text/x-patch Size: 1105 bytes Desc: not available URL: From edi at agharta.de Thu May 18 22:11:15 2006 From: edi at agharta.de (Edi Weitz) Date: Fri, 19 May 2006 00:11:15 +0200 Subject: [rucksack-devel] Another GC patch In-Reply-To: (Arthur Lemmens's message of "Thu, 18 May 2006 23:31:36 +0200") References: Message-ID: On Thu, 18 May 2006 23:31:36 +0200, "Arthur Lemmens" wrote: > It probably isn't (I haven't looked closely). And even if it is > correct, it's not enough: the function BLOCK-ALIVE-P needs to be > adapted too. It's still based on the old block layout, where > BLOCK-START pointed to an object id. Ah, OK, thanks for the explanation. So, if I've understood that correctly, then maybe this patch does it. -------------- next part -------------- A non-text attachment was scrubbed... Name: rucksack.diff Type: text/x-patch Size: 1008 bytes Desc: not available URL: From alemmens at xs4all.nl Thu May 18 22:17:36 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Fri, 19 May 2006 00:17:36 +0200 Subject: [rucksack-devel] Queue patch In-Reply-To: References: Message-ID: Edi wrote: > This probably won't affect the rest Rucksack but QUEUE-PEEK seems to > have bugs. Patch attached. Thanks. Committed (but I had to add END to the WITH-SLOTS). 
(No, QUEUE-PEEK isn't used anywhere so it doesn't affect Rucksack at the moment.) Arthur From edi at agharta.de Thu May 18 22:24:52 2006 From: edi at agharta.de (Edi Weitz) Date: Fri, 19 May 2006 00:24:52 +0200 Subject: [rucksack-devel] Queue patch In-Reply-To: (Arthur Lemmens's message of "Fri, 19 May 2006 00:17:36 +0200") References: Message-ID: On Fri, 19 May 2006 00:17:36 +0200, "Arthur Lemmens" wrote: > (but I had to add END to the WITH-SLOTS). Ugh, OK. The next time I'll test it before I send a patch... :) From alemmens at xs4all.nl Thu May 18 22:36:36 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Fri, 19 May 2006 00:36:36 +0200 Subject: [rucksack-devel] Another GC patch In-Reply-To: References: Message-ID: Edi wrote: >> It probably isn't (I haven't looked closely). And even if it is >> correct, it's not enough: the function BLOCK-ALIVE-P needs to be >> adapted too. It's still based on the old block layout, where >> BLOCK-START pointed to an object id. > > Ah, OK, thanks for the explanation. So, if I've understood that > correctly, then maybe this patch does it. That's a lot better, thanks. In fact, this patch was good enough to let the TEST-CREATE test run without crashing the GC. (It's still not good enough though; the test in BLOCK-ALIVE-P is too simple, because it assumes that all versions except the most recently saved one are dead. That will work for simple cases, but I think there are cases where this assumption is not correct.) If you feel like following this path, you could now try to keep fixing TEST-LOAD until it doesn't crash anymore. And then TEST-UPDATE ;-) Committed (you can subscribe to rucksack-cvs if you want the follow the CVS commits, by the way). 
Arthur From nikodemus at random-state.net Fri May 19 00:20:46 2006 From: nikodemus at random-state.net (Nikodemus Siivola) Date: Fri, 19 May 2006 03:20:46 +0300 Subject: [rucksack-devel] Two patches Message-ID: <874pzmj25d.fsf@logxor.random-state.net> * clos.patch ** implements merging of slot-properties in compute-effective-slot-definition. (If any ancestor is persistent the new one is too, if exactly one ancestor is indexed the new one is too.) ** adds a common ancestor PERSISTENT for PERSISTENT-DATA and PERSISTENT-OBJECT. ** refactors PRINT-OBJECT and P-EQL slighly. * cache.patch ** refactors cache-touch-object to accept objects instead of object-ids, which means cache-touch-object on clean in-memory objects can work before the object has been deserialized. -------------- next part -------------- A non-text attachment was scrubbed... Name: clos.patch Type: text/x-patch Size: 7981 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cache.patch Type: text/x-patch Size: 3821 bytes Desc: not available URL: -------------- next part -------------- Cheers, -- Nikodemus Schemer: "Buddha is small, clean, and serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." From alemmens at xs4all.nl Fri May 19 10:09:11 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Fri, 19 May 2006 12:09:11 +0200 Subject: [rucksack-devel] transactions and distributed objects Message-ID: [I should be doing real work, but this is too much fun...] I've been thinking about our discussion about transactions this week. In a sense, the current implementation of Rucksack contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of the NAMOS system by David Reed. (See section 7.12.4 ("Time Domain Addressing") of the "Transaction Processing" book by Gray and Reuter, and Reed's original thesis at http://www.lcs.mit.edu/publications/specpub.php?id=773.) 
So maybe it's a good idea to go all the way from the start. I've just started reading Reed's thesis (highly recommended, by the way; an interesting detail from his bio is that he was involved with the implementation of MACLISP), so it's not very clear to me yet what the implementation strategy should be. The two big advantages would be that Rucksack would get complete isolation (instead of just protection from 'lost updates') and that it would be possible to have distributed rucksacks. Disadvantages are that it's probably slower and that the implementation may be more complicated. But that may be worth it in the long run. I'd be interested in your opinions about this. Arthur From nikodemus at random-state.net Fri May 19 10:55:56 2006 From: nikodemus at random-state.net (Nikodemus Siivola) Date: Fri, 19 May 2006 13:55:56 +0300 Subject: [rucksack-devel] transactions and distributed objects In-Reply-To: (Arthur Lemmens's message of "Fri, 19 May 2006 12:09:11 +0200") References: Message-ID: <877j4i2shv.fsf@logxor.random-state.net> "Arthur Lemmens" writes: > The two big advantages would be that Rucksack would get complete > isolation (instead of just protection from 'lost updates') and that > it would be possible to have distributed rucksacks. Disadvantages > are that it's probably slower and that the implementation may be > more complicated. But that may be worth it in the long run. IF the transaction processing can be mediated by generic functions dispatching on the class of the transaction, then it seems to me that having multiple transaction classes would be "simple". In any case, I think complete isolation to be worth quite a bit of trouble. Cheers, -- Nikodemus Schemer: "Buddha is small, clean, and serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." 
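
The idea of mediating transaction processing through generic functions that dispatch on the transaction class could look roughly like the sketch below. Every name in it (BASIC-TRANSACTION, SERIALIZABLE-TRANSACTION, ISOLATION-LEVEL) is an assumption made up for illustration; none of them is taken from Rucksack, and the keywords returned are just labels, not an implementation of either isolation level.

```lisp
;; All names below are hypothetical; this is not Rucksack's API.
(defclass basic-transaction ()
  ((id :initarg :id :reader transaction-id)))

;; A stricter isolation level modelled as a subclass.
(defclass serializable-transaction (basic-transaction)
  ())

(defgeneric isolation-level (transaction)
  (:documentation "Return a keyword naming the isolation TRANSACTION provides."))

(defmethod isolation-level ((transaction basic-transaction))
  ;; Only protection from lost updates.
  :no-lost-updates)

(defmethod isolation-level ((transaction serializable-transaction))
  ;; Complete isolation.
  :serializable)
```

Code that handles reads, writes and commits would then call generic functions like this one on the current transaction, so that a second isolation level becomes a matter of adding a subclass and a few methods.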
From mb at bese.it Fri May 19 11:03:00 2006 From: mb at bese.it (Marco Baringer) Date: Fri, 19 May 2006 13:03:00 +0200 Subject: [rucksack-devel] transactions and distributed objects In-Reply-To: <877j4i2shv.fsf@logxor.random-state.net> (Nikodemus Siivola's message of "Fri, 19 May 2006 13:55:56 +0300") References: <877j4i2shv.fsf@logxor.random-state.net> Message-ID: Nikodemus Siivola writes: > IF the transaction processing can be mediated by generic > functions dispatching on the class of the transaction, > then it seems to me that having multiple transaction > classes would be "simple". i really don't think you want user-extensible transactions. i believe that implementing a transaction object will require enough knowledge of rucksack's internals that it will be indistinguishable from a change to rucksack itself. should rucksack offer different isolation levels? that's a completely different question (and the answer is probably yes). -- -Marco Ring the bells that still can ring. Forget the perfect offering. There is a crack in everything. That's how the light gets in. -Leonard Cohen From nikodemus at random-state.net Fri May 19 11:19:28 2006 From: nikodemus at random-state.net (Nikodemus Siivola) Date: Fri, 19 May 2006 14:19:28 +0300 Subject: [rucksack-devel] Et tu, prevalence? Message-ID: <87wtci1cu7.fsf@logxor.random-state.net> I've been trying to wrap my head around persistent object identity in relation to parallelism and transactions. Everywhere I look, I see endless pain for allowing references to persistent objects outside transactions. * Assume references to the same (EQ) persistent objects in two different transactions. Assume transactions update any instances referred to within their scope on first access. BANG -- the other transaction breaks as the object is EQ. * Assume two P-EQL but not EQ persistent objects, X0 and X1. Assume transactions _don't_ update objects on first access.
First a transaction sets a slot in X0 to a different (non P-EQL) object, and commits. Then another transaction starts, and reads the same slot from X1, which it didn't get from cache/disk, but through a reference held outside a transaction. BANG -- the value is not the same one committed, as the proxy in X1 points to the wrong object-id. Damned if you do, damned if you don't. Workarounds can be built (instead of proxying the slot value proxy a cell that holds the value to fix case 2, abort transaction in case 1) -- but I'm not sure the pain is worth the gain. I think these are symptoms of me trying to see two different things in Rucksack at once -- (1) a solid transactional system suitable for parallelism and distribution, and (2) a persistent lisp-heap. I'd really like to have both. They match different needs, but they also _have_ different needs. Here's the idea: Perhaps what is needed is prevalent-class/object in addition to persistent-class/object? (Referring to "lightweight" in-memory prevalence with write-ahead logging and recovery: it isn't as suitable for server/client division or distribution as "proper persistence", and isn't as good with parallelism... but you can hold on to references outside transactions, and they are always EQ.) I can certainly understand if this is deemed a bad idea / outside the scope of Rucksack, but I for one would find such a pairing natural along with providing a low-level serialization API. What do you think? (In any case, I've come to my senses and have no desire to hold references to persistent-objects outside transactions anymore.) Cheers, -- Nikodemus Schemer: "Buddha is small, clean, and serious." Lispnik: "Buddha is big, has hairy armpits, and laughs."
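
For readers unfamiliar with the term, "in-memory prevalence with write-ahead logging and recovery" can be sketched in a few lines. This is a toy under stated assumptions, not a proposal for Rucksack's design: the store is a plain hash table, changes are lists such as (:set key value) or (:remove key), and APPLY-CHANGE, LOG-AND-APPLY and REPLAY-LOG are hypothetical names.

```lisp
;; Hypothetical sketch of prevalence: all data lives in memory, and every
;; change is appended to a log before it is applied to the store.
(defun apply-change (store change)
  "CHANGE is a list of the form (:set key value) or (:remove key)."
  (ecase (first change)
    (:set (setf (gethash (second change) store) (third change)))
    (:remove (remhash (second change) store)))
  store)

(defun log-and-apply (store log-stream change)
  ;; Write ahead: the change reaches the log before the in-memory store.
  (with-standard-io-syntax
    (prin1 change log-stream)
    (terpri log-stream))
  (apply-change store change))

(defun replay-log (log-stream)
  "Rebuild a fresh store by re-applying every logged change in order."
  (let ((store (make-hash-table :test #'equal)))
    (with-standard-io-syntax
      (let ((*read-eval* nil))          ; never evaluate #. forms from a log
        (loop for change = (read log-stream nil :eof)
              until (eq change :eof)
              do (apply-change store change))))
    store))
```

Recovery is then nothing more than calling REPLAY-LOG on the saved log when the store is opened, which is why references to in-memory objects can stay EQ for the lifetime of the image.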
From nikodemus at random-state.net Fri May 19 11:25:28 2006 From: nikodemus at random-state.net (Nikodemus Siivola) Date: Fri, 19 May 2006 14:25:28 +0300 Subject: [rucksack-devel] transactions and distributed objects In-Reply-To: (Marco Baringer's message of "Fri, 19 May 2006 13:03:00 +0200") References: <877j4i2shv.fsf@logxor.random-state.net> Message-ID: <87r72q1ck7.fsf@logxor.random-state.net> Marco Baringer writes: > i really don't think you want user-extensible transactions. i believe > that implementing a transaction object will require enough knowledge > of rucksack's internals that it will be indistinguishable from a > change to rucksack itself. > > should rucksack offer different isolation levels? that's a completely > different question (and the answer is probably yes). To clarify, I didn't mean user-extensible transactions. I meant that (1) if dispatching on the class of the transactions is a workable implementation strategy, then different isolation levels should be "simple" to achieve, and (2) if we only have a single isolation level, then I'd prefer it to be fully serializable. Cheers, -- Nikodemus Schemer: "Buddha is small, clean, and serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." From alemmens at xs4all.nl Fri May 19 11:40:44 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Fri, 19 May 2006 13:40:44 +0200 Subject: [rucksack-devel] transactions and distributed objects In-Reply-To: <87r72q1ck7.fsf@logxor.random-state.net> References: <877j4i2shv.fsf@logxor.random-state.net> <87r72q1ck7.fsf@logxor.random-state.net> Message-ID: Nikodemus wrote: > To clarify, I didn't mean user-extensible transactions. I agree with Marco that user-extensible transactions probably don't make much sense. > I meant that (1) if dispatching on the class of the transactions > is a workable implementation strategy, then different isolation > levels should be "simple" to achieve Erm, yes. Dispatching on the transaction class is probably OK.
Of course the interesting question is how to *implement* isolation levels, not how to find them. Reed's strategy looks like a very elegant solution to me, but I'd need to work out the details. One 'detail' is that he doesn't seem to mention garbage collection anywhere, which is a bit odd because his approach 'conses' like hell. > (2) if we only have a single isolation level, then I'd perfer it to > be to be fully serializable. Yes, I think I agree. Arthur From alemmens at xs4all.nl Fri May 19 11:59:24 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Fri, 19 May 2006 13:59:24 +0200 Subject: [rucksack-devel] Et tu, prevalence? In-Reply-To: <87wtci1cu7.fsf@logxor.random-state.net> References: <87wtci1cu7.fsf@logxor.random-state.net> Message-ID: Nikodemus wrote: > Perhaps what is needed is prevalent-class/object in addition to > persistent-class/object? (Referring to "lightweight" in-memory > prevalence with write-ahead logging and recovery: it isn't as suitable > for server/client division or distribution like "proper persistence", > and isn't as good with parallelism... but you can hold on to > references outside transactions, and they are always EQ.) Would these prevalent objects be in the same rucksack as persistent objects or would prevalence vs. persistence be a choice that you make when you create a rucksack? > I can certainly understand if this is deemed a bad idea / outside > the scope of Rucksack, but I for one would find such a pairing > natural along with providing a low-level serialization API. > > What do you think? Sounds OK to me. I built a prevalence-like system before I started working on Rucksack, and I think it would be relatively easy to add prevalence to Rucksack (as long as prevalent and persistent objects don't need to coexist in the same rucksack). I wouldn't want to reuse my existing prevalence library for this because some of the design choices don't combine well with Rucksack. 
But with the serializer and MOP hooks already in place, it wouldn't be too much work to serialize all changes to a log file and 'replay' the logfile when loading a prevalent rucksack. My library didn't handle 'schema evolution' very well: it just ignored references to slots that didn't exist anymore in the current class definition and gave new slots a value based on their initform. If you want to implement some kind of UPDATE-PREVALENT-OBJECT-FOR-REDEFINED-CLASS protocol (which would be the proper way to handle class changes, I think), you'd have to save class definitions in the log file too. > (In any case, I've come to my senses and have no desire to hold > references to persistent-objects outside transactions anymore.) Good ;-) Arthur From nikodemus at random-state.net Fri May 19 12:16:38 2006 From: nikodemus at random-state.net (Nikodemus Siivola) Date: Fri, 19 May 2006 15:16:38 +0300 Subject: [rucksack-devel] Et tu, prevalence? In-Reply-To: (Arthur Lemmens's message of "Fri, 19 May 2006 13:59:24 +0200") References: <87wtci1cu7.fsf@logxor.random-state.net> Message-ID: <87lksy1a6x.fsf@logxor.random-state.net> "Arthur Lemmens" writes: > Would these prevalent objects be in the same rucksack as persistent > objects or would prevalence vs. persistence be a choice that you > make when you create a rucksack? I was thinking of making the stores fundamentally different, but (create-rucksack ... :type :prevalent) seems nicer. > Sounds OK to me. I built a prevalence-like system before I started > working on Rucksack, and I think it would be relatively easy to > add prevalence to Rucksack (as long as prevalent and persistent > objects don't need to coexist in the same rucksack). What issues do you see with having both kinds in a single rucksack? (I assume shared object-id's would create problems, but I'm not sure.) > definition and gave new slots a value based on their initform. 
If > you want to implement some kind of UPDATE-PREVALENT-OBJECT-FOR-REDEFINED-CLASS > protocol (which would be the proper way to handle class changes, I > think), you'd have to save class definitions in the log file too. Right. Cheers, -- Nikodemus Schemer: "Buddha is small, clean, and serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." From alemmens at xs4all.nl Fri May 19 14:54:34 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Fri, 19 May 2006 16:54:34 +0200 Subject: [rucksack-devel] Et tu, prevalence? In-Reply-To: <87lksy1a6x.fsf@logxor.random-state.net> References: <87wtci1cu7.fsf@logxor.random-state.net> <87lksy1a6x.fsf@logxor.random-state.net> Message-ID: Nikodemus wrote: > What issues do you see with having both kinds in a single rucksack? (I > assume shared object-id's would create problems, but I'm not sure.) I feel uneasy about references between prevalent and persistent objects, for one thing. Another thing is: if you just specify a :prevalent flag when creating a rucksack and can't have both kinds in one rucksack, you wouldn't even need to make separate classes for prevalent objects; instances of persistent-object could just be treated differently for prevalent rucksacks. So you wouldn't have to have separate (defclass foo () () (:metaclass persistent-class)) and (defclass bar () () (:metaclass prevalent-class)) definitions in your program. And you wouldn't need to change from one to the other when you decide to change from persistence to prevalence or vice versa. The third thing is: I don't see a good reason for wanting to have both kinds of objects in the same rucksack. I have no intention of actually writing this prevalent variant, by the way. But I think it would be a useful addition to Rucksack and if you feel like writing it, I'd be happy to integrate it.
Arthur From alemmens at xs4all.nl Fri May 19 15:06:26 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Fri, 19 May 2006 17:06:26 +0200 Subject: [rucksack-devel] Two patches In-Reply-To: <874pzmj25d.fsf@logxor.random-state.net> References: <874pzmj25d.fsf@logxor.random-state.net> Message-ID: Nikodemus wrote: > ** implements merging of slot-properties in > compute-effective-slot-definition. (If any ancestor is > persistent the new one is too, if exactly one ancestor is > indexed the new one is too.) Do you have a reason for doing it this way? I'm not too sure about the indexed slots, for example. Indexing is very expensive, so I can (vaguely) imagine situations where you'd want a slot of a parent class to be indexed but you don't want that for the same slot of a child class. I suppose we'd have to write out some example cases (as realistic as possible) to get a feeling for the best solution here. At the moment I'm not convinced that your new way is better than my old way. I'm not saying it's worse either; I just don't know. > ** adds a common ancestor PERSISTENT for PERSISTENT-DATA and > PERSISTENT-OBJECT. Hehe. I actually used to have it this way, but split them up at a certain point. Unfortunately, I can't remember why I did that; I only remember that I thought I had a good reason for doing it at the time. I'll try to find if I've written down an explanation somewhere. > ** refactors cache-touch-object to accept objects instead of > object-ids That looks OK to me... > which means cache-touch-object on clean in-memory objects can > work before the object has been deserialized. ...but I don't really understand what you're saying here. Do you mean 'serialized' or do you really mean 'deserialized'? Why would you want to touch a clean in-memory object? If it's clean, you shouldn't touch it, right? 
Arthur

From alemmens at xs4all.nl Fri May 19 15:13:06 2006
From: alemmens at xs4all.nl (Arthur Lemmens)
Date: Fri, 19 May 2006 17:13:06 +0200
Subject: [rucksack-devel] Rucksack philosophy
In-Reply-To: <87d5eb8qt7.fsf@logxor.random-state.net>
References: <87d5eb8qt7.fsf@logxor.random-state.net>
Message-ID:

Nikodemus wrote:

> * A persistent object always belongs to exactly one rucksack,
>   which must be open for the instance to be valid.

Yes, I agree.

> * The rucksack the object belongs to is the sole arbiter of
>   its identity: EQ is meaningless when talking about persistent
>   objects.

Yes, I agree.

> * References to non-root persistent objects outside transactions are
>   forbidden. Root-objects are (1) objects in RUCKSACK-ROOTS, (2)
>   indexed objects [maybe (3) lexical roots declared with WITH-ROOTS].

Like you wrote in another mail today, I think it's better not to have
any references to persistent objects outside transactions at all.

> * Reading and writing a persistent slot outside a transaction is
>   forbidden.

Yes.

> * Transactions provide basic ACID properties, though the level of
>   isolation may be at variance with the usual definition.

Yes.

> * Enforcement of semantic constraints is preferable when possible
>   without extreme costs.

Yes.

> * Correctness is paramount.

Yes.

> * Flexibility is more important than extreme speed.

Yes.

> * Portable Common Lisp. (Accepted extras limited to threading & MOP.)

Yes.

> Am I on the right page?

Yes.

> * Nested transactions: do we want them? (My gut feeling would be
>   to prohibit them initially: they can always be added later.)

I agree.

> * Is the serialization API part of Rucksack's public interface?

Hmm... My idea was that you'd have different levels of 'public
interface'. The highest level is for programmers who just want to use
the damn thing and don't want to be bothered with details about indexes,
garbage collectors, caches or whatever. At a lower level you'd have
protocols for extending Rucksack.
Defining a new kind of index, for example. Or maybe creating a persistent sparse matrix as a subclass of persistent-data. Or whatever. At this level, I suppose you may want to use some of the serialization functions. So yes, I think that may be useful. Arthur From alemmens at xs4all.nl Fri May 19 15:19:23 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Fri, 19 May 2006 17:19:23 +0200 Subject: [rucksack-devel] Re: Rucksack, ECLM In-Reply-To: <87ves38vd6.fsf@logxor.random-state.net> References: <87lktk96iz.fsf@logxor.random-state.net> <878xp0zly6.fsf@logxor.random-state.net> <87ac9fiycn.fsf@logxor.random-state.net> <87lkszhftg.fsf@logxor.random-state.net> <87ves38vd6.fsf@logxor.random-state.net> Message-ID: Nikodemus wrote: > Starting from the top: to do anything at all with a persistent object > (ignoring non-persistent slots) you need to > > 1. Have the rucksack the object belongs to open. > 2. Be in a transaction. > > Could this "anything" be extended to holding a reference? > > So that when a transaction exits, all in-memory objects with that > transaction-id become invalid, and trying to use them in the context > of another transaction would signal an error. Yes, the transaction knows which objects it has committed so I think it could invalidate those in some way as part of the commit (by setting a flag or something). > Then, the only way another transaction can legally get a hold of the > same object is by getting it afresh from the Rucksack. Yep. > That provides for the clear semantics and determinism, but is somewhat > inconvenient. I'm not sure why you think that's inconvenient. But you've given up on the idea of having references to persistent objects outside of transactions, right? Then I'll skip the details here. 
Arthur From alemmens at xs4all.nl Fri May 19 15:22:59 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Fri, 19 May 2006 17:22:59 +0200 Subject: [rucksack-devel] Persistent identity In-Reply-To: <87u07nl7u6.fsf@logxor.random-state.net> References: <87u07nl7u6.fsf@logxor.random-state.net> Message-ID: Nikodemus wrote: > (with-rucksack (r "/tmp/test-rucksack/" :if-exists :supersede) > (let (a) > (with-transaction () > (add-rucksack-root (setf a (make-instance 'p-test :slot 0)) > r)) > (with-transaction () > (incf (slot-of a))) > (with-transaction () > (incf (slot-of (car (rucksack-roots r))))) > (with-transaction () > (setf result (slot-of a))))) > result) So can I skip this example too by just saying "don't do that then"? Or do you still think that this example is relevant? Arthur From nikodemus at random-state.net Fri May 19 19:36:07 2006 From: nikodemus at random-state.net (Nikodemus Siivola) Date: Fri, 19 May 2006 22:36:07 +0300 Subject: [rucksack-devel] Two patches In-Reply-To: (Arthur Lemmens's message of "Fri, 19 May 2006 17:06:26 +0200") References: <874pzmj25d.fsf@logxor.random-state.net> Message-ID: <87fyj524ew.fsf@logxor.random-state.net> "Arthur Lemmens" writes: > Nikodemus wrote: >> which means cache-touch-object on clean in-memory objects can >> work before the object has been deserialized. > > ...but I don't really understand what you're saying here. Do > you mean 'serialized' or do you really mean 'deserialized'? Why > would you want to touch a clean in-memory object? If it's > clean, you shouldn't touch it, right? That was in response to retaining the object outside a transaction: in the second transaction it was clean, but not in cache -- so you tried to touch it, it wasn't in the cache, which broke things. 
By passing objects instead of object-ids to the cache-touch object it was no longer necessary to get the object from the cache in order to mark it dirty in the transaction, so the internal-error which I hit due to the out-of-transaction reference could be avoided. So my motivation wasn't quite correct there -- but I think the patch should still be correct. If you wish the same consistency checking as prior, the internal-error needs to be kept. Cheers, -- Nikodemus Schemer: "Buddha is small, clean, and serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." From nikodemus at random-state.net Fri May 19 19:37:37 2006 From: nikodemus at random-state.net (Nikodemus Siivola) Date: Fri, 19 May 2006 22:37:37 +0300 Subject: [rucksack-devel] Re: Rucksack, ECLM In-Reply-To: (Arthur Lemmens's message of "Fri, 19 May 2006 17:19:23 +0200") References: <87lktk96iz.fsf@logxor.random-state.net> <878xp0zly6.fsf@logxor.random-state.net> <87ac9fiycn.fsf@logxor.random-state.net> <87lkszhftg.fsf@logxor.random-state.net> <87ves38vd6.fsf@logxor.random-state.net> Message-ID: <87ac9d24ce.fsf@logxor.random-state.net> "Arthur Lemmens" writes: > I'm not sure why you think that's inconvenient. But you've given up on > the idea of having references to persistent objects outside of transactions, > right? Then I'll skip the details here. Right. Cheers, -- Nikodemus Schemer: "Buddha is small, clean, and serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." From nikodemus at random-state.net Fri May 19 19:43:50 2006 From: nikodemus at random-state.net (Nikodemus Siivola) Date: Fri, 19 May 2006 22:43:50 +0300 Subject: [rucksack-devel] Et tu, prevalence? 
In-Reply-To: (Arthur Lemmens's message of "Fri, 19 May 2006 16:54:34 +0200") References: <87wtci1cu7.fsf@logxor.random-state.net> <87lksy1a6x.fsf@logxor.random-state.net> Message-ID: <874pzl2421.fsf@logxor.random-state.net> "Arthur Lemmens" writes: > I feel uneasy about references between prevalent and persistent objects, > for one thing. Potential quicksand there, I can believe. > Another thing is: if you just specify a :prevalent flag when creating a > rucksack and can't have both kinds in one rucksack, you wouldn't even > need to make separate classes for prevalent objects; instances of > persistent-object could just be treated differently for prevalent > rucksacks. So you wouldn't have to have separate > (defclass foo () () (:metaclass persistent-class) and > (defclass bar () () (:metaclass prevalent-class) definitions in your > program. And you wouldn't need to change from one to the other when > you decide to change from persistence to prevalence or vice versa. The reason why I think it is important to have them as separate metaclasses (with their own root-classes) is that they will have somewhat different semantics. * holding on to a reference to a prevalent-object outside a transaction is not a problem. * reading a slot of a prevalent-object outside a transaction could be permitted. * two prevalent-objects with same object-id's are always EQ. Making the distinction at class-level makes things easier for user-level code, and probably keeps the implementation cleaner too. > I have no intentions of actually writing this prevalent variant, by the > way. But I think it would be a useful addition to Rucksack and if you > feel like writing it, I'd be happy to integrate it. Already started. ,-) Cheers, -- Nikodemus Schemer: "Buddha is small, clean, and serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." 
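The third semantic difference listed above (two prevalent objects with the same object-id are always EQ, while EQ is meaningless for persistent objects) can be illustrated with a small sketch. This is not Rucksack code: the classes DEMO-OBJECT, DEMO-PREVALENT-OBJECT and DEMO-PERSISTENT-OBJECT, the table *PREVALENT-OBJECTS*, and the functions FIND-PREVALENT-OBJECT, LOAD-PERSISTENT-OBJECT and SAME-OBJECT-P are all hypothetical names, and the "fresh copy per load" behaviour of persistent objects is an assumption made only to show the contrast.

```lisp
;; Hypothetical names throughout; this is not Rucksack's implementation.
(defclass demo-object ()
  ((object-id :initarg :object-id :reader object-id)))

(defclass demo-prevalent-object (demo-object) ())
(defclass demo-persistent-object (demo-object) ())

(defvar *prevalent-objects* (make-hash-table)
  "Maps an object-id to the single in-memory instance for that id.")

(defun find-prevalent-object (id)
  "Return the one in-memory instance for ID, creating it on first use."
  (or (gethash id *prevalent-objects*)
      (setf (gethash id *prevalent-objects*)
            (make-instance 'demo-prevalent-object :object-id id))))

(defun load-persistent-object (id)
  "Assumed behaviour: each load hands out a fresh in-memory copy."
  (make-instance 'demo-persistent-object :object-id id))

(defun same-object-p (a b)
  "Identity by object-id, which is meaningful for both kinds."
  (eql (object-id a) (object-id b)))
```

With these definitions, (eq (find-prevalent-object 1) (find-prevalent-object 1)) holds, whereas two calls to LOAD-PERSISTENT-OBJECT with the same id are only equal under SAME-OBJECT-P. That difference is one reason user code might want to see the distinction at the class level.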
From nikodemus at random-state.net Fri May 19 19:44:30 2006 From: nikodemus at random-state.net (Nikodemus Siivola) Date: Fri, 19 May 2006 22:44:30 +0300 Subject: [rucksack-devel] Persistent identity In-Reply-To: (Arthur Lemmens's message of "Fri, 19 May 2006 17:22:59 +0200") References: <87u07nl7u6.fsf@logxor.random-state.net> Message-ID: <87y7wxztnl.fsf@logxor.random-state.net> "Arthur Lemmens" writes: > So can I skip this example too by just saying "don't do that then"? > Or do you still think that this example is relevant? Dead and buried, with relief. Cheers, -- Nikodemus Schemer: "Buddha is small, clean, and serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." From alemmens at xs4all.nl Fri May 19 19:55:27 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Fri, 19 May 2006 21:55:27 +0200 Subject: [rucksack-devel] Et tu, prevalence? In-Reply-To: <874pzl2421.fsf@logxor.random-state.net> References: <87wtci1cu7.fsf@logxor.random-state.net> <87lksy1a6x.fsf@logxor.random-state.net> <874pzl2421.fsf@logxor.random-state.net> Message-ID: Nikodemus wrote: > The reason why I think it is important to have them as separate metaclasses > (with their own root-classes) is that they will have somewhat different > semantics. > > * holding on to a reference to a prevalent-object outside a > transaction is not a problem. > > * reading a slot of a prevalent-object outside a transaction > could be permitted. > > * two prevalent-objects with same object-id's are always EQ. > > Making the distinction at class-level makes things easier for > user-level code, and probably keeps the implementation cleaner too. OK, fine. So you'll define a different prevalent metaclass, but you don't assume that one rucksack can contain both prevalent and persistent object, right? Sounds reasonable to me. 
Arthur

From nikodemus at random-state.net Fri May 19 21:30:20 2006
From: nikodemus at random-state.net (Nikodemus Siivola)
Date: Sat, 20 May 2006 00:30:20 +0300
Subject: [rucksack-devel] Et tu, prevalence?
In-Reply-To: (Arthur Lemmens's message of "Fri, 19 May 2006 21:55:27 +0200")
References: <87wtci1cu7.fsf@logxor.random-state.net> <87lksy1a6x.fsf@logxor.random-state.net> <874pzl2421.fsf@logxor.random-state.net>
Message-ID: <87sln5zor7.fsf@logxor.random-state.net>

"Arthur Lemmens" writes:

> OK, fine. So you'll define a different prevalent metaclass, but you
> don't assume that one rucksack can contain both prevalent and persistent
> object, right? Sounds reasonable to me.

Right.

Cheers,

 -- Nikodemus
Schemer: "Buddha is small, clean, and serious."
Lispnik: "Buddha is big, has hairy armpits, and laughs."

From nikodemus at random-state.net Sat May 20 10:13:58 2006
From: nikodemus at random-state.net (Nikodemus Siivola)
Date: Sat, 20 May 2006 13:13:58 +0300
Subject: [rucksack-devel] WITH-TRANSACTION
Message-ID: <87zmhdypeh.fsf@logxor.random-state.net>

The current WITH-TRANSACTION looks wrong to me: non-local exits leave the
transaction hanging. Below is a sketch for a more robust one that also,
as a convenience, returns the primary value of the BODY as its second
value.

(defmacro with-transaction ((&rest args &key (rucksack '(current-rucksack))
                             &allow-other-keys)
                            &body body)
  (let ((committed (gensym "COMMITTED"))
        (transaction (gensym "TRANSACTION"))
        (result (gensym "RESULT")))
    `(let ((,transaction nil))
       (loop named ,transaction
          (with-simple-restart (retry "Retry ~S" ,transaction)
            (let ((,committed nil)
                  (,result nil))
              (unwind-protect
                   (progn
                     ;; Use a local variable for the transaction so that nothing
                     ;; can replace it from underneath us, and only then bind
                     ;; it to *TRANSACTION*.
                     (setf ,transaction (transaction-start :rucksack ,rucksack
                                                           ,@args))
                     (let ((*transaction* ,transaction))
                       (with-simple-restart (abort "Abort ~S" ,transaction)
                         (setf ,result ,@body)
                         (transaction-commit ,transaction)
                         (setf ,committed t)))
                     ;; Normal exit from the WITH-SIMPLE-RESTART above -- either
                     ;; everything went well or we aborted -- the ,COMMITTED will tell
                     ;; us. In either case we jump out of the RETRY loop.
                     (return-from ,transaction (values ,committed ,result)))
                (unless ,committed
                  (transaction-rollback ,transaction)))))
          ;; Normal exit from the above block -- we selected the RETRY restart.
          ))))

Cheers,

 -- Nikodemus
Schemer: "Buddha is small, clean, and serious."
Lispnik: "Buddha is big, has hairy armpits, and laughs."

From alemmens at xs4all.nl Sat May 20 10:52:01 2006
From: alemmens at xs4all.nl (Arthur Lemmens)
Date: Sat, 20 May 2006 12:52:01 +0200
Subject: [rucksack-devel] WITH-TRANSACTION
In-Reply-To: <87zmhdypeh.fsf@logxor.random-state.net>
References: <87zmhdypeh.fsf@logxor.random-state.net>
Message-ID:

Nikodemus wrote:

> The current WITH-TRANSACTION looks wrong to me: non-local exits leave the
> transaction hanging. Below is a sketch for a more robust one

Thanks. Looks good to me. Committed with two minor changes:

- I put the (REMF ARGS :RUCKSACK) line back in. (Doesn't really make
  any difference, but feels a bit cleaner to me.)

- I added a DO after
    (LOOP NAMED ,TRANSACTION
  Without the DO it's not standard CL, I think. (Or, if it is, Lispworks
  doesn't recognize it as such.)
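For readers who want to try the control flow, here is a hedged reconstruction of the macro with the two changes listed above applied (the DO after LOOP NAMED and the REMF of :RUCKSACK). It is not the committed Rucksack source. TRANSACTION-START, TRANSACTION-COMMIT, TRANSACTION-ROLLBACK and CURRENT-RUCKSACK are replaced by hypothetical stubs that only record what happened in *LOG*, so the block is self-contained. One further difference is an editorial assumption rather than something stated in the thread: BODY is wrapped in PROGN, because (SETF place form1 form2 ...) would treat a multi-form body as place/value pairs.

```lisp
;; Hypothetical stand-ins for Rucksack's transaction API, just enough to
;; exercise the macro. They only record calls in *LOG*.
(defvar *transaction* nil)
(defvar *log* '())

(defun current-rucksack () :demo-rucksack)

(defun transaction-start (&key rucksack &allow-other-keys)
  (push :start *log*)
  (list :transaction rucksack))

(defun transaction-commit (transaction)
  (declare (ignore transaction))
  (push :commit *log*))

(defun transaction-rollback (transaction)
  (declare (ignore transaction))
  (push :rollback *log*))

(defmacro with-transaction ((&rest args &key (rucksack '(current-rucksack))
                             &allow-other-keys)
                            &body body)
  (let ((committed (gensym "COMMITTED"))
        (transaction (gensym "TRANSACTION"))
        (result (gensym "RESULT")))
    ;; :RUCKSACK is passed explicitly below, so drop it from the
    ;; pass-through arguments. Copy first: REMF is destructive.
    (setf args (copy-list args))
    (remf args :rucksack)
    `(let ((,transaction nil))
       (loop named ,transaction do
         (with-simple-restart (retry "Retry ~S" ,transaction)
           (let ((,committed nil)
                 (,result nil))
             (unwind-protect
                  (progn
                    (setf ,transaction
                          (transaction-start :rucksack ,rucksack ,@args))
                    (let ((*transaction* ,transaction))
                      (with-simple-restart (abort "Abort ~S" ,transaction)
                        ;; Assumption: PROGN so that multi-form bodies work.
                        (setf ,result (progn ,@body))
                        (transaction-commit ,transaction)
                        (setf ,committed t)))
                    ;; Committed or aborted: leave the retry loop either way.
                    (return-from ,transaction (values ,committed ,result)))
               ;; Runs on every exit, including THROW and other non-local exits.
               (unless ,committed
                 (transaction-rollback ,transaction)))))))))
```

A normal run logs :START then :COMMIT and returns T plus the body's primary value. Leaving the body with THROW, or invoking the ABORT restart from a handler, logs :START then :ROLLBACK, which is the "no hanging transaction" property the original message asks for.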
Arthur From alemmens at xs4all.nl Sat May 20 15:15:16 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Sat, 20 May 2006 17:15:16 +0200 Subject: [rucksack-devel] Two patches In-Reply-To: <87fyj524ew.fsf@logxor.random-state.net> References: <874pzmj25d.fsf@logxor.random-state.net> <87fyj524ew.fsf@logxor.random-state.net> Message-ID: Nikodemus wrote: > That was in response to retaining the object outside a transaction: in > the second transaction it was clean, but not in cache -- so you tried > to touch it, it wasn't in the cache, which broke things. Ah, I see. > but I think the patch should still be correct. Yes, I think so too. Committed. Arthur From edi at agharta.de Sat May 20 15:49:41 2006 From: edi at agharta.de (Edi Weitz) Date: Sat, 20 May 2006 17:49:41 +0200 Subject: [rucksack-devel] Two patches In-Reply-To: (Arthur Lemmens's message of "Sat, 20 May 2006 17:15:16 +0200") References: <874pzmj25d.fsf@logxor.random-state.net> <87fyj524ew.fsf@logxor.random-state.net> Message-ID: On Sat, 20 May 2006 17:15:16 +0200, "Arthur Lemmens" wrote: > Committed. Looks like you didn't apply the patch fully. See attachment. -------------- next part -------------- A non-text attachment was scrubbed... Name: rucksack.diff Type: text/x-patch Size: 465 bytes Desc: not available URL: From alemmens at xs4all.nl Sat May 20 20:34:07 2006 From: alemmens at xs4all.nl (Arthur Lemmens) Date: Sat, 20 May 2006 22:34:07 +0200 Subject: [rucksack-devel] Two patches In-Reply-To: References: <874pzmj25d.fsf@logxor.random-state.net> <87fyj524ew.fsf@logxor.random-state.net> Message-ID: Edi wrote: > Looks like you didn't apply the patch fully. Oops. Hopefully I did it right this time. Arthur From nikodemus at random-state.net Sun May 21 11:41:12 2006 From: nikodemus at random-state.net (Nikodemus Siivola) Date: Sun, 21 May 2006 14:41:12 +0300 Subject: [rucksack-devel] Transactions: abort early? 
Message-ID: <87verzvc4n.fsf@logxor.random-state.net>

In doing the prevalent stuff, I've gotten to the point of fiddling with
alternative transaction implementations: (I'm not taking the classic
write-ahead logging route, since I think what I have is nicer and likely
to perform better).

Situation:

 Transaction T1 starts.
 T1 does stuff, but doesn't read or write object O.
 Transaction T2 starts.
 T2 touches O.
 T2 commits.
 T1 reads O.

Option 1a:

 * We no longer have access to O that matches the rest of the T1.
   Let T1 read O, but make it fail on commit due to the inconsistency.

Option 1b:

 * We no longer have access to O that matches the rest of the T1.
   Abort T1 immediately.

Option 2:

 * We _have_ a copy of O that matches T1. Let T1 read O, but make it
   fail on commit as its assumptions are no longer true.

My personal preference is 1b, and I was under the impression that
aborting early was in the spirit of Rucksack generally, so I assume
this is OK?

Cheers,

 -- Nikodemus
Schemer: "Buddha is small, clean, and serious."
Lispnik: "Buddha is big, has hairy armpits, and laughs."

From alemmens at xs4all.nl Sun May 21 20:17:47 2006
From: alemmens at xs4all.nl (Arthur Lemmens)
Date: Sun, 21 May 2006 22:17:47 +0200
Subject: [rucksack-devel] Transactions: abort early?
In-Reply-To: <87verzvc4n.fsf@logxor.random-state.net>
References: <87verzvc4n.fsf@logxor.random-state.net>
Message-ID:

Nikodemus wrote:

> In doing the prevalent stuff, I've gotten to the point of fiddling with
> alternative transaction implementations:

This may sound strange coming from me, but for prevalent objects I'd
seriously consider using two-phase locking (acquire a lock for each object
when you access it and release all locks at the end of the transaction).
Two-phase locking is relatively easy to implement I think (I've never done it).
And because prevalent transactions only use in-memory objects, they can be
*much* faster than 'persistent transactions', so the chance that two-phase
locking slows down the system to an unacceptable degree will be much smaller
than for persistent transactions.

> Transaction T1 starts.
> T1 does stuff, but doesn't read or write object O.
> Transaction T2 starts.
> T2 touches O.
> T2 commits.
> T1 reads O.
>
> Option 1a:
>
> * We no longer have access to O that matches the rest of the T1.
>   Let T1 read O, but make it fail on commit due to the inconsistency.
>
> Option 1b:
>
> * We no longer have access to O that matches the rest of the T1.
>   Abort T1 immediately.
>
> Option 2:
>
> * We _have_ a copy of O that matches T1. Let T1 read O, but make it
>   fail on commit as its assumptions are no longer true.
>
> My personal preference is 1b, and I was under the impression that
> aborting early was in the spirit of Rucksack generally, so I assume
> this is OK?

If you decide not to do two-phase locking, option 1b sounds good to me.
(Not that that means very much; this whole transaction implementation
business is quite new to me.)

Arthur

From nikodemus at random-state.net Mon May 22 05:56:49 2006
From: nikodemus at random-state.net (Nikodemus Siivola)
Date: Mon, 22 May 2006 08:56:49 +0300
Subject: [rucksack-devel] Transactions: abort early?
In-Reply-To: (Arthur Lemmens's message of "Sun, 21 May 2006 22:17:47 +0200")
References: <87verzvc4n.fsf@logxor.random-state.net>
Message-ID: <877j4evbz2.fsf@logxor.random-state.net>

"Arthur Lemmens" writes:

> This may sound strange coming from me, but for prevalent objects I'd
> seriously consider using two-phase locking (acquire a lock for each object
> when you access it and release all locks at the end of the transaction).

>> Transaction T1 starts.
>> T1 does stuff, but doesn't read or write object O.
>> Transaction T2 starts.
>> T2 touches O.
>> T2 commits.
>> T1 reads O.
>> Option 1b:
>>
>> * We no longer have access to O that matches the rest of the T1.
>>   Abort T1 immediately.

If I understand correctly two-phase locking comes down to this,
actually. At the time when T1 reads O it has been changed by a younger
transaction, and no longer matches the rest of T1. (Assuming serializable
transactions are the goal.)

Cheers,

 -- Nikodemus
Schemer: "Buddha is small, clean, and serious."
Lispnik: "Buddha is big, has hairy armpits, and laughs."

From alemmens at xs4all.nl Mon May 22 19:32:47 2006
From: alemmens at xs4all.nl (Arthur Lemmens)
Date: Mon, 22 May 2006 21:32:47 +0200
Subject: [rucksack-devel] Transactions: abort early?
In-Reply-To: <877j4evbz2.fsf@logxor.random-state.net>
References: <87verzvc4n.fsf@logxor.random-state.net> <877j4evbz2.fsf@logxor.random-state.net>
Message-ID:

Nikodemus wrote:

> Transaction T1 starts.
> T1 does stuff, but doesn't read or write object O.
> Transaction T2 starts.
> T2 touches O.
> T2 commits.
> T1 reads O.
>
> Option 1b:
>
> * We no longer have access to O that matches the rest of the T1.
>   Abort T1 immediately.
>
> If I understand correctly two-phase locking comes down to this,
> actually.

I don't think so. As far as I understand it, the basic idea of two-phase
locking is that a transaction should acquire a shared (read-only) lock for
every object it wants to read and an exclusive (read/write) lock for each
object it wants to change. And it may not release those locks until it
commits. (It's called two-phase because there's a phase when it only
acquires locks and a phase where it releases locks, and those two phases
may not interleave.)

So I think your scenario would go like this:

1.  T1 starts: not really relevant
2.  T1 does stuff, but doesn't read or write object O: not relevant
3.  T2 starts: not relevant
4a. T2 acquires an exclusive lock on O
4b. T2 writes O
5.  T2 commits, releasing the exclusive lock on O
6a. T1 acquires a shared lock on O
6b. T1 reads O.

I don't think there's a problem here.
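The locking discipline described above can be sketched as a toy lock table. This is a single-threaded illustration under stated assumptions, not code from Rucksack: *LOCKS*, TRY-LOCK and RELEASE-LOCKS are hypothetical names, transactions are represented by plain keywords, and TRY-LOCK returns NIL where a real implementation would block or abort (and would also need deadlock handling, which is omitted here).

```lisp
;; Hypothetical single-threaded model of two-phase locking.
(defvar *locks* (make-hash-table :test #'eq)
  "Maps an object to a cons (MODE . HOLDERS), MODE being :SHARED or :EXCLUSIVE.")

(defun try-lock (txn object mode)
  "Growing phase: return T if TXN gets the lock, NIL if it would have to wait."
  (let ((entry (gethash object *locks*)))
    (cond ((null entry)
           ;; Nobody holds a lock on OBJECT yet.
           (setf (gethash object *locks*) (cons mode (list txn)))
           t)
          ((equal (cdr entry) (list txn))
           ;; TXN is the sole holder: re-entry, or upgrade to exclusive.
           (when (eq mode :exclusive)
             (setf (car entry) :exclusive))
           t)
          ((and (eq mode :shared) (eq (car entry) :shared))
           ;; Several readers may share the lock.
           (pushnew txn (cdr entry))
           t)
          (t nil))))

(defun release-locks (txn)
  "Shrinking phase: drop every lock held by TXN, at commit or abort only."
  (let ((freed '()))
    (maphash (lambda (object entry)
               (setf (cdr entry) (remove txn (cdr entry)))
               (when (null (cdr entry))
                 (push object freed)))
             *locks*)
    (dolist (object freed)
      (remhash object *locks*))))
```

Replaying steps 4a to 6b with this model: T2's exclusive TRY-LOCK on O succeeds, a shared TRY-LOCK by T1 would return NIL while T2 still holds it, and after (release-locks :t2) T1 gets its shared lock and can later upgrade it.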
T1 is now free to change O and/or commit, and there still won't be a problem. But maybe I'm missing something? Arthur From nikodemus at random-state.net Tue May 23 06:25:09 2006 From: nikodemus at random-state.net (Nikodemus Siivola) Date: Tue, 23 May 2006 09:25:09 +0300 Subject: [rucksack-devel] Transactions: abort early? In-Reply-To: (Arthur Lemmens's message of "Mon, 22 May 2006 21:32:47 +0200") References: <87verzvc4n.fsf@logxor.random-state.net> <877j4evbz2.fsf@logxor.random-state.net> Message-ID: <87zmh9cl6i.fsf@logxor.random-state.net> "Arthur Lemmens" writes: > I don't think there's a problem here. T1 is now free to change O and/or > commit, and there still won't be a problem. > > But maybe I'm missing something? I don't think so -- I see where my think'o was. I was assuming there were some (invisible to the transaction system) constraints between O and the rest of the world, that would be violated if T1 saw the changed O -- which is nonsense. (Eg: every time you touch O increment C.) If the constraints have been violated, there is nothing the transaction system can do about it. If they have not been violated, then everything is fine. (And if both C and O are prevalent then the conflict is real, of course.) Cheers, -- Nikodemus Schemer: "Buddha is small, clean, and serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." From aycan.irican at core.gen.tr Sat May 27 01:07:15 2006 From: aycan.irican at core.gen.tr (Aycan iRiCAN) Date: Sat, 27 May 2006 04:07:15 +0300 Subject: [rucksack-devel] rucksack-map-class Message-ID: <87lksotgvw.fsf@core.gen.tr> Hi, First of all, thank you all for this great project. Your approach is similar to elephant project that I was using but lots of nice and new features that I'm waiting for. I believe a persistence mechanism is needed by all CL developers and rucksack is going to serve this important purpose. I'm new to rucksack's internals and I have a little question. 
I created a persistent class and added it to rucksack root.

CDB> (with-coredb (rucksack-roots ru))
T
(#>)

After that, I tried to describe all the instances of this class.

(WITH-RUCKSACK (RU *DB-LOCATION*)
  (WITH-TRANSACTION NIL
    (RUCKSACK-MAP-CLASS RU (FIND-CLASS 'META-CORETAL)
                        (LAMBDA (X) (DESCRIBE X)))))

But I'm getting an error:

There is no applicable method for the generic function # when called
with arguments (#).
   [Condition of type SIMPLE-ERROR]

I think this is because rucksack-map-class applies CACHE to a
standard-rucksack instead of getting the cache that belongs to the
standard-rucksack. Here is a tiny patch.

Index: rucksack.lisp
===================================================================
RCS file: /project/rucksack/cvsroot/rucksack/rucksack.lisp,v
retrieving revision 1.6
diff -r1.6 rucksack.lisp
552c552
<         (cache (cache rucksack)))
---
>         (cache (rucksack-cache rucksack)))

Kind Regards.

-- 
Aycan iRiCAN
C0R3 Computer Security Group
http://www.core.gen.tr

From alemmens at xs4all.nl Sun May 28 11:33:28 2006
From: alemmens at xs4all.nl (Arthur Lemmens)
Date: Sun, 28 May 2006 13:33:28 +0200
Subject: [rucksack-devel] rucksack-map-class
In-Reply-To: <87lksotgvw.fsf@core.gen.tr>
References: <87lksotgvw.fsf@core.gen.tr>
Message-ID:

Aycan iRiCAN wrote:

> I think this is because rucksack-map-class applies CACHE to a
> standard-rucksack instead of getting the cache that belongs to the
> standard-rucksack.

Yes, that's right. (I haven't tested the class and slot indexing parts
of Rucksack yet, so it's very likely that you'll find other bugs. Please
let me know if you do.)

> Here is a tiny patch.

Committed. Thanks.

Arthur Lemmens
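The kind of error fixed by this patch is easy to reproduce in isolation. The sketch below is not Rucksack's code: DEMO-CACHE, DEMO-TRANSACTION and DEMO-RUCKSACK are hypothetical classes, and the assumption that CACHE is a reader defined for some other class while RUCKSACK-CACHE is the reader for the rucksack itself is inferred only from the error message and the one-line diff above.

```lisp
;; Hypothetical classes; only the accessor names mirror the patch.
(defclass demo-cache () ())

(defclass demo-transaction ()
  ;; Assumption: CACHE has a method for this class, not for rucksacks.
  ((cache :initarg :cache :reader cache)))

(defclass demo-rucksack ()
  ((cache :initform (make-instance 'demo-cache) :reader rucksack-cache)))

(defun cache-via-wrong-reader (rucksack)
  "Calls CACHE on a rucksack; returns the error message instead of signalling."
  (handler-case (cache rucksack)
    (error (condition) (princ-to-string condition))))

(defun cache-via-right-reader (rucksack)
  "Uses the reader that actually has a method for rucksacks."
  (rucksack-cache rucksack))
```

Calling CACHE-VIA-WRONG-READER on a DEMO-RUCKSACK yields a "no applicable method" style message (the exact wording is implementation-dependent), while CACHE-VIA-RIGHT-READER returns the DEMO-CACHE instance.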