[rucksack-devel] rucksack performance

Cyrus Harmon ch-rucksack at bobobeach.com
Thu Jan 11 23:40:18 UTC 2007


so it looks like the p-find call down inside leaf-insert is slow and  
conses a lot. what's the point of doing a sequential scan every time  
we add something to the index? sounds like something is screwy here.
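
To illustrate why that would hurt, here is a minimal sketch. It is not
Rucksack's actual code: the leaf is modelled as a plain sorted vector of
integer keys, and linear-find and binary-find are made-up names. It just
counts comparisons for a front-to-back scan versus a binary search on
the same keys. Whether a binary search is even possible in leaf-insert
depends on how the leaf bindings are stored, which I'm only assuming here.

```lisp
;; Hypothetical sketch: a "leaf" is a plain sorted vector of integer keys.
;; None of these names are part of Rucksack's API.

(defun linear-find (key keys)
  "Scan KEYS from the front. Return (values index comparisons)."
  (let ((comparisons 0))
    (loop for i from 0 below (length keys)
          do (incf comparisons)
          when (= key (aref keys i))
            do (return-from linear-find (values i comparisons)))
    ;; Key not present: every element was compared.
    (values nil comparisons)))

(defun binary-find (key keys)
  "Binary search over sorted KEYS. Return (values index comparisons)."
  (let ((low 0)
        (high (1- (length keys)))
        (comparisons 0))
    (loop while (<= low high)
          do (let* ((mid (floor (+ low high) 2))
                    (candidate (aref keys mid)))
               (incf comparisons)
               (cond ((= key candidate)
                      (return-from binary-find (values mid comparisons)))
                     ((< key candidate) (setf high (1- mid)))
                     (t (setf low (1+ mid))))))
    ;; Key not present.
    (values nil comparisons)))

;; 1000 sorted even keys: 0, 2, 4, ..., 1998.
(defparameter *keys*
  (coerce (loop for i from 0 below 1000 collect (* 2 i)) 'vector))
```

Looking up the last key, (linear-find 1998 *keys*) touches all 1000
elements, while (binary-find 1998 *keys*) needs about 10 comparisons.
If every insert pays the linear cost, filling an index gets slower as
each node fills up.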

Cyrus

On Jan 11, 2007, at 3:13 PM, Cyrus Harmon wrote:

> you probably already knew this, but it seems as though most of the  
> allocation is happening down in btree-insert.
>
> Cyrus
>
> On Jan 11, 2007, at 2:15 PM, Cyrus Harmon wrote:
>
>>
>> On Jan 11, 2007, at 12:00 PM, Arthur Lemmens wrote:
>>
>>> Cyrus Harmon wrote:
>>>
>>>> Following up on my post from yesterday, even when I can coerce the
>>>> "lots of transactions" into working, at some point performance
>>>> breaks down rather severely.
>>>
>>> I haven't looked at this in detail, but my first guess would be that
>>> the garbage collector settings and/or performance is critical here.
>>
>> Hmm... back to the
>>
>> The value
>>   #<RUCKSACK:PERSISTENT-ARRAY #178397 in #<STANDARD-CACHE of size 10000,
>>   heap #P"/Users/sly/projects/cyrusharmon.org/cl-bio/rucksack/heap"
>>   and 7481 objects in memory.>>
>> is not of type
>>   (OR NULL RUCKSACK:PERSISTENT-CONS).
>>    [Condition of type TYPE-ERROR]
>>
>> error, which is interesting as it looks like we're trying to do a
>> p-car of an array, and it's getting the array by accessing the
>> last element in the other (legitimate) array, but I'm getting
>> distracted...
>>
>>>> yes, along the way there is some fluctuation, as, I imagine, the
>>>> indices and caches grow, etc... but we reach a threshold where it
>>>> takes roughly .5 sec and 3M of consing for every object.
>>>
>>> 3M of consing per object is ridiculously much of course.  I'm pretty
>>> sure that it should be possible to reduce this a lot by tracing some
>>> of the garbage collector routines and looking at how much work they do.
>>>
>>> One thing you could consider is to turn the garbage collector off
>>> during the phase where you're creating very many objects
>>> (initializing your database maybe?).  In fact, you could just turn
>>> it off, period.  As long as your disk is big enough, of course...
>>>
>>> Let me know if turning the GC off doesn't help.
>>
>> How do I do this? I commented out the collect-some-garbage in  
>> transaction, but that didn't seem to fix the problem.
>>
>>
>>>> And it would be nice if this approach worked as well.
>>>
>>> Yes.  Having a separate transaction for each created object is not
>>> the most efficient way and should not be necessary, but obviously it
>>> should work and it shouldn't be ridiculously slow.
>>
>> Agreed. Of course I only went down this route as performance  
>> became unacceptable with a really big transaction too, so perhaps  
>> the transaction/gc thing is a red herring.
>>
>> Cyrus
>>
>> _______________________________________________
>> rucksack-devel mailing list
>> rucksack-devel at common-lisp.net
>> http://common-lisp.net/cgi-bin/mailman/listinfo/rucksack-devel



