[rucksack-devel] rucksack performance

Cyrus Harmon ch-rucksack at bobobeach.com
Thu Jan 11 23:13:08 UTC 2007


You probably already knew this, but it seems as though most of the
allocation is happening down in btree-insert.
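
For anyone who wants to check where the consing comes from, here is a minimal sketch of one way to measure the bytes allocated around a form. It assumes SBCL (sb-ext:get-bytes-consed); on other implementations the helper just reports 0. BYTES-CONSED and MEASURE-CONSING are hypothetical names made up for this illustration and are not part of rucksack. The make-list call is only a stand-in for the real workload, such as a call that ends up in btree-insert.

```lisp
;; Sketch only: BYTES-CONSED and MEASURE-CONSING are hypothetical helpers,
;; not part of rucksack.

(defun bytes-consed ()
  "Total bytes allocated so far.  SBCL-specific; returns 0 elsewhere."
  #+sbcl (sb-ext:get-bytes-consed)
  #-sbcl 0)

(defmacro measure-consing (&body body)
  "Run BODY.  Return two values: BODY's primary value and the number of
bytes consed while running it."
  (let ((start (gensym "START"))
        (result (gensym "RESULT")))
    `(let* ((,start (bytes-consed))
            (,result (progn ,@body)))
       (values ,result (- (bytes-consed) ,start)))))

;; Usage: replace the MAKE-LIST form with the code under suspicion.
(multiple-value-bind (result bytes)
    (measure-consing (make-list 100000))
  (declare (ignore result))
  (format t "~D bytes consed~%" bytes))
```

Wrapping successively smaller pieces of the insert path this way should narrow down which call is responsible for most of the allocation.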

Cyrus

On Jan 11, 2007, at 2:15 PM, Cyrus Harmon wrote:

>
> On Jan 11, 2007, at 12:00 PM, Arthur Lemmens wrote:
>
>> Cyrus Harmon wrote:
>>
>>> Following up on my post from yesterday, even when I can coerce the
>>> "lots of transactions" approach into working, at some point
>>> performance breaks down rather severely.
>>
>> I haven't looked at this in detail, but my first guess would be that
>> the garbage collector settings and/or performance is critical here.
>
> Hmm... back to the
>
> The value
>   #<RUCKSACK:PERSISTENT-ARRAY #178397 in #<STANDARD-CACHE of size 10000, heap #P"/Users/sly/projects/cyrusharmon.org/cl-bio/rucksack/heap" and 7481 objects in memory.>>
> is not of type
>   (OR NULL RUCKSACK:PERSISTENT-CONS).
>    [Condition of type TYPE-ERROR]
>
> error, which is interesting as it looks like we're trying to do a
> p-car of an array, and it's getting the array by accessing the last
> element in the other (legitimate) array, but I'm getting distracted...
>
>>> yes, along the way there is some fluctuation, as, I imagine, the
>>> indices and caches grow, etc... but we reach a threshold where it
>>> takes roughly .5 sec and 3M of consing for every object.
>>
>> 3M of consing per object is ridiculously much of course.  I'm pretty
>> sure that it should be possible to reduce this a lot by tracing some
>> of the garbage collector routines and looking at how much work they
>> do.
>>
>> One thing you could consider is to turn the garbage collector off
>> during the phase where you're creating very many objects
>> (initializing your database maybe?).  In fact, you could just turn
>> it off, period.  As long as your disk is big enough, of course...
>>
>> Let me know if turning the GC off doesn't help.
>
> How do I do this? I commented out the collect-some-garbage in  
> transaction, but that didn't seem to fix the problem.
>
>
>>> And it would be nice if this approach worked as well.
>>
>> Yes.  Having a separate transaction for each created object is not
>> the most efficient way and should not be necessary, but obviously it
>> should work and it shouldn't be ridiculously slow.
>
> Agreed. Of course I only went down this route as performance became  
> unacceptable with a really big transaction too, so perhaps the  
> transaction/gc thing is a red herring.
>
> Cyrus



