[rucksack-devel] rucksack performance

Cyrus Harmon ch-rucksack at bobobeach.com
Fri Jan 12 00:03:39 UTC 2007


Ok, well there are probably still some issues here, but removing the
string-index on the slot that had about 12 distinct values (over .5M
objects) really seems to help performance! Careful with that axe,
Eugene!

crossing my fingers,

Cyrus
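
One way to see why an index on a slot with only about 12 distinct values over .5M objects buys so little: each key ends up pointing at tens of thousands of objects. The sketch below is a toy model in Python, not Rucksack's btree code. The names build_index and average_bucket_size are hypothetical, and it assumes an index that keeps a per-key collection of object ids.

```python
from collections import defaultdict


def build_index(values):
    """Map each distinct slot value to the list of object ids holding it."""
    index = defaultdict(list)
    for object_id, value in enumerate(values):
        index[value].append(object_id)
    return index


def average_bucket_size(index):
    """Average number of object ids stored under one key."""
    return sum(len(ids) for ids in index.values()) / len(index)


n_objects = 500_000

# Hypothetical data: one slot with 12 distinct values, one with unique values.
low_cardinality = [f"category-{i % 12}" for i in range(n_objects)]
high_cardinality = [f"name-{i}" for i in range(n_objects)]

low = build_index(low_cardinality)
high = build_index(high_cardinality)

print(len(low), average_bucket_size(low))
print(len(high), average_bucket_size(high))
```

With 12 keys the average bucket holds roughly 40,000 ids, while the unique slot has one id per key. Whether that bucket size is what drives the allocation seen here is an assumption, not something measured in this thread.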

On Jan 11, 2007, at 3:13 PM, Cyrus Harmon wrote:

> You probably already knew this, but it seems as though most of the
> allocation is happening down in btree-insert.
>
> Cyrus
>
> On Jan 11, 2007, at 2:15 PM, Cyrus Harmon wrote:
>
>>
>> On Jan 11, 2007, at 12:00 PM, Arthur Lemmens wrote:
>>
>>> Cyrus Harmon wrote:
>>>
>>>> Following up on my post from yesterday, even when I can coerce the
>>>> "lots of transactions" into working, at some point performance
>>>> breaks down rather severely.
>>>
>>> I haven't looked at this in detail, but my first guess would be that
>>> the garbage collector settings and/or performance is critical here.
>>
>> Hmm... back to the
>>
>> The value
>>   #<RUCKSACK:PERSISTENT-ARRAY #178397 in #<STANDARD-CACHE of size 10000, heap #P"/Users/sly/projects/cyrusharmon.org/cl-bio/rucksack/heap" and 7481 objects in memory.>>
>> is not of type
>>   (OR NULL RUCKSACK:PERSISTENT-CONS).
>>    [Condition of type TYPE-ERROR]
>>
>> error, which is interesting as it looks like we're trying to do a
>> p-car of an array, and it's getting the array by accessing the
>> last element in the other (legitimate) array, but I'm getting
>> distracted...
>>
>>>> yes, along the way there is some fluctuation, as, I imagine, the
>>>> indices and caches grow, etc... but we reach a threshold where it
>>>> takes roughly .5 sec and 3M of consing for every object.
>>>
>>> 3M of consing per object is ridiculously much of course.  I'm pretty
>>> sure that it should be possible to reduce this a lot by tracing some
>>> of the garbage collector routines and looking at how much work they
>>> do.
>>>
>>> One thing you could consider is to turn the garbage collector off
>>> during the phase where you're creating very many objects
>>> (initializing your database maybe?).  In fact, you could just turn
>>> it off, period.  As long as your disk is big enough, of course...
>>>
>>> Let me know if turning the GC off doesn't help.
>>
>> How do I do this? I commented out the collect-some-garbage in  
>> transaction, but that didn't seem to fix the problem.
>>
>>
>>>> And it would be nice if this approach worked as well.
>>>
>>> Yes.  Having a separate transaction for each created object is not
>>> the most efficient way and should not be necessary, but obviously
>>> it should work and it shouldn't be ridiculously slow.
>>
>> Agreed. Of course I only went down this route as performance  
>> became unacceptable with a really big transaction too, so perhaps  
>> the transaction/gc thing is a red herring.
>>
>> Cyrus
>>
>> _______________________________________________
>> rucksack-devel mailing list
>> rucksack-devel at common-lisp.net
>> http://common-lisp.net/cgi-bin/mailman/listinfo/rucksack-devel