[elephant-devel] Lisp Btrees: design considerations

Ian Eslick eslick at media.mit.edu
Fri May 16 15:33:31 UTC 2008


I checked, Rucksack writes dirty objects to disk on each commit - so  
should have the same performance profile as our existing backends, of  
course with different constants...

Ian

On May 16, 2008, at 11:26 AM, Ian Eslick wrote:

>>
>
> I just wanted to clarify a few points in this interesting discussion.
>
>> but if we are using high-level language like Lisp, it's much more  
>> likely that CPU overhead will be large comparing to memory latency.
>> and as compression, even simpiest/fastest one will only add to CPU  
>> overhead, i'm very skeptical about it's benefits.
>
> I doubt that using a Lisp with a modern compiler affects CPU time in  
> any interesting way; I've observed that algorithm implementation  
> with reasonable data structure choices is within 10's of % points of  
> the same algorithm in C.  I don't care how fast your CPU is, it's  
> the data access patterns that dominate time unless you're working on  
> small or highly local datasets, then the inner loop speed _might_  
> matter.
>
>> by the way, interesting fact: some time ago i've found that even  
>> with cache enabled in elehant/postmodern
>> slots reads are relatively slow -- on 10 thousands per second scale  
>> or so. it was quite surprising
>> that abusing function is form-slot-key (that is used in CL-SQL too):
>
> At least in Allegro, the profiler only counts thread time and that  
> much of the delay shown by 'time' vs. the profiler is taken up  
> waiting on cache-miss IO.  It may be that thread time is only a  
> handful of percentage points of the total time?
>
>> (defun form-slot-key (oid name)
>> (format nil "~A ~A" oid name))
>
> Format is notoriously slow.  Still, I find it hard to believe that  
> it matters that much.  (See above)
>
>> and even more surprising was to find out that there is no portable  
>> and fast way to convert interger to string.
>> so i had to use SBCL's internal function called sb-impl::quick- 
>> integer-to-string -- apparently SBCL
>> developers knew about this problem so they made this function for  
>> themselves.
>
> For fun, I did a test with princ and for integers only it's 2x  
> faster, for the above idiom it's only about 50% faster, but some of  
> that is the with-string-stream idea.   If you do it manually as they  
> did and write it to an array one character at a time it can get  
> pretty fast, but probably only 3x-4x.
>
>> so i think that subtle effects like I/O bandwidth will be only seen  
>> in we'll hardcore optimize elephant and applications themselves.
>> but as it is now, storage backend won't be of a big influence of  
>> overal performance and just needs to be moderately good
>
> I agree with this in general.  However I feel like this whole  
> discussion is premature optimization being done while blindfolded.   
> I do think that over time it would be nice to give people tools like  
> compression to enable/disable for their particular applications, but  
> it's probably not worth all the debate until we solve some of the  
> bigger problems in our copious free time.
>
> By the way, if you are caching your data in memory and using the  
> most conservative transaction model of BDB (blocking commits) you  
> are write bandwidth limited.  At the end of each little transaction,  
> you have to write a log entry to disk, then flush a 16kb page to  
> disk, then wait for the controller to say OK, then write the log  
> again, then return to your regularly scheduled programming.  In  
> prevalence, you only write the log and occasionally dump to disk.  I  
> suspect that the SQL back ends function similarly.  When you finish  
> a transaction you have to wait for this to all happen over a socket...
>
> Prevalence is much, much faster because you don't have to flush data  
> structures on each commit, so cl-prevalence performance with  
> Elephant's data and transaction abstractions would be a really nice  
> design point.  I wonder if we would get some of this benefit from a  
> Rucksack adaptation?
>
> Ian
>
>
> _______________________________________________
> elephant-devel site list
> elephant-devel at common-lisp.net
> http://common-lisp.net/mailman/listinfo/elephant-devel




More information about the elephant-devel mailing list