[rucksack-devel] rucksack performance

Cyrus Harmon ch-rucksack at bobobeach.com
Thu Jan 11 22:15:11 UTC 2007


On Jan 11, 2007, at 12:00 PM, Arthur Lemmens wrote:

> Cyrus Harmon wrote:
>
>> Following up on my post from yesterday, even when I can coerce the
>> "lots of transactions" into working, at some point performance breaks
>> down rather severely.
>
> I haven't looked at this in detail, but my first guess would be that
> the garbage collector settings and/or performance is critical here.

Hmm... back to the

The value
   #<RUCKSACK:PERSISTENT-ARRAY #178397 in #<STANDARD-CACHE of size 10000, heap #P"/Users/sly/projects/cyrusharmon.org/cl-bio/rucksack/heap" and 7481 objects in memory.>>
is not of type
   (OR NULL RUCKSACK:PERSISTENT-CONS).
    [Condition of type TYPE-ERROR]

error, which is interesting: it looks like we're trying to do a p-car
of an array, and it's getting that array by accessing the last element
in the other (legitimate) array. But I'm getting distracted...

>> yes, along the way there is some fluctuation, as, I imagine, the
>> indices and caches grow, etc... but we reach a threshold where it
>> takes roughly .5 sec and 3M of consing for every object.
>
> 3M of consing per object is ridiculously much of course.  I'm pretty
> sure that it should be possible to reduce this a lot by tracing some
> of the garbage collector routines and looking at how much work they do.
>
> One thing you could consider is to turn the garbage collector off
> during the phase where you're creating very many objects (initializing
> your database maybe?).  In fact, you could just turn it off, period.
> As long as your disk is big enough, of course...
>
> Let me know if turning the GC off doesn't help.

How do I do this? I commented out the collect-some-garbage in  
transaction, but that didn't seem to fix the problem.


>> And it would be nice if this approach worked as well.
>
> Yes.  Having a separate transaction for each created object is not the
> most efficient way and should not be necessary, but obviously it
> should work and it shouldn't be ridiculously slow.

Agreed. Of course I only went down this route as performance became  
unacceptable with a really big transaction too, so perhaps the  
transaction/gc thing is a red herring.
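
One way to tell the two cases apart would be to record the time taken by
each object creation and compare early calls with late ones. Below is a
minimal sketch using only standard Common Lisp. The helper names
time-each-call and report-slowdown are made up for illustration, and
make-one-object in the usage note is a hypothetical stand-in for whatever
creates and stores a single persistent object; none of them are Rucksack
functions.

```lisp
;; Sketch only: these helpers are illustrative and not part of Rucksack.
(defun time-each-call (thunk count)
  "Call THUNK COUNT times; return a list of per-call elapsed times in seconds."
  (loop repeat count
        collect (let ((start (get-internal-real-time)))
                  (funcall thunk)
                  ;; Elapsed wall-clock time as an exact rational number of seconds.
                  (/ (- (get-internal-real-time) start)
                     internal-time-units-per-second))))

(defun report-slowdown (timings &optional (window 100))
  "Return two values: mean seconds per call over the first and the last
WINDOW entries of TIMINGS."
  (let* ((n (min window (length timings)))
         (head (subseq timings 0 n))
         (tail (subseq timings (- (length timings) n))))
    (flet ((mean (xs)
             (if xs (/ (reduce #'+ xs) (length xs)) 0)))
      (values (float (mean head)) (float (mean tail))))))
```

Something like (report-slowdown (time-each-call #'make-one-object 10000))
would then show whether the per-object cost grows over the run. Consing
is not covered here, since the standard has no portable way to read it;
the statistics printed by the TIME macro are implementation-dependent
but often include bytes allocated.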

Cyrus



