[elephant-devel] Stored procedures

Wed Apr 11 14:01:56 UTC 2007

I suspect the raw IO costs are going to dominate the query compilation
time, and for straight scans and point reads, which I imagine to be the
majority of the queries, the optimizers will find a good plan trivially.
An interesting experiment would be to take a simple benchmark that
stores a million objects of size x, and compare it to the same benchmark
storing objects of size (* 2 x).
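
A minimal sketch of that experiment, written in Python against an on-disk
SQLite file purely as a stand-in (it is not Elephant and not any of the
backends discussed here). The names store_objects and compare, the table
layout, and the default count and size are all assumptions made for
illustration. If raw I/O dominates, the second timing should grow roughly
with the payload size; if fixed per-statement overhead dominates, the two
timings should stay close.

```python
import os
import sqlite3
import tempfile
import time


def store_objects(count, size):
    """Store `count` blobs of `size` bytes in a fresh on-disk SQLite
    database and return (elapsed_seconds, rows_stored).

    Hypothetical helper: SQLite is only a stand-in for a real backend.
    """
    payload = os.urandom(size)
    with tempfile.TemporaryDirectory() as tmp:
        conn = sqlite3.connect(os.path.join(tmp, "bench.db"))
        try:
            conn.execute(
                "CREATE TABLE objects (id INTEGER PRIMARY KEY, data BLOB)"
            )
            start = time.perf_counter()
            # One parameterized statement is reused for every row, so the
            # SQL text is not re-parsed per object.
            conn.executemany(
                "INSERT INTO objects (id, data) VALUES (?, ?)",
                ((i, payload) for i in range(count)),
            )
            conn.commit()
            elapsed = time.perf_counter() - start
            rows = conn.execute("SELECT COUNT(*) FROM objects").fetchone()[0]
        finally:
            conn.close()
    return elapsed, rows


def compare(count=10_000, size=256):
    """Time the same workload with objects of size x and size 2x."""
    time_x, _ = store_objects(count, size)
    time_2x, _ = store_objects(count, 2 * size)
    return time_x, time_2x
```

Calling compare(count=1_000_000, size=x) would match the million-object
run described above; the smaller default only keeps a trial run short.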

For the way that I'm using Elephant, it's just plain raw I/O --- and
that is why BerkeleyDB is faster.  Of course, in my personal usage, I
use DCM to do a huge amount of data caching...

One of the things that interests me is the inclusion of a performance
benchmark that we can share with the relational people.  For example, we
might be able to support osdb: http://osdb.sourceforge.net/.  (Part of
this is driven by my desire, which Ian shares, to have a powerful
example application as part of our documentation.)

The real problem here is that Elephant is so different, and so much more
powerful and convenient than a relational system, that one really needs
to use an "application" benchmark, not a "relational" benchmark --- if
we use a relational benchmark, we are tying our hands behind our back.

Once we release 0.9, the integration of the "postmodern" backend is
going to be my highest priority.  The model I will move towards is a
CLSQL backend for general databases and as a starting point for
writing database-specific backends, plus (hopefully) a collection of
database-specific backends that can provide better performance.

The one thing I don't ever want to do is give up backend-independence.
I hope I'm not the only person who sees tremendous benefit in being able
to switch your backend implementation choice at any time.

On Wed, 2007-04-11 at 08:16 -0400, Ian Eslick wrote:

> It's interesting that there is only a little performance advantage.   
> Have you done some profiling to see where the time is going?  It  
> sounds like either the queries were simple enough that the  
> compilation step was trivial or that we're seeing Amdahl's law and  
> the SQL costs are swamped by some other activity.
> 
> On Apr 11, 2007, at 4:30 AM, Henrik Hjelte wrote:
> 
> > Regarding stored procedures, I agree with Ian that the main
> > performance advantage that comes from them is that the query
> > planning is prepared in advance. This is also done if you use
> > prepared SQL statements, so they give the same advantage. Stored
> > procedures can, however, be faster if they involve several steps,
> > because then you won't have to send intermediate results to the
> > client and then back to the server. What you should avoid for
> > performance reasons is repeatedly sending strings to parse and
> > execute.
> >
> > I have really tried to optimize the postmodern backend for speed;
> > still, it is slower than BerkeleyDB. The postmodern backend uses
> > prepared statements for almost everything "simple", and I could not
> > measure any performance advantage from using stored procedures for
> > this. There is one stored procedure left because it involves several
> > steps, so in theory it can be faster (compared to a couple of
> > prepared statements), but I haven't actually measured whether it is,
> > or by how much.
> >
> > Negative: stored procedures for the clsql backend will definitely
> > remove portability between databases. Positive: a little faster.
> > But I am totally convinced that stored procedures will not bring
> > clsql even close to the performance of BerkeleyDB.
> >
> > /Henrik Hjelte
> >
> >
> > On Tue, 2007-04-03 at 19:31 +0200, Pierre THIERRY wrote:
> >> Robert L. Read wrote on 03/04/2007 at 11:08:
> >>> Stored procedures tend to not be very portable; therefore to put
> >>> them in the current "postgres" backend, which should really be
> >>> called a "clsql" backend, would make it less likely to work with
> >>> MySQL.
> >>
> >> I was thinking of having some PostgreSQL-specific bits within the
> >> clsql backend. That would apply to MySQL or any other DB that can
> >> use stored procedures to make some queries faster.
> >>
> >>> However, this raises an interesting question:  Is performance a
> >>> significant problem (at least for the Postgres users)?  If you had
> >>> a "wish list" for Elephant features, would better performance be
> >>> at the top?
> >>
> >> I just don't want to be limiting. The only way to go seemed to me
> >> to be to benchmark various uses of stored procedures. On the other
> >> hand, having a cache for read queries, as was discussed earlier,
> >> could well make the stored procedure useless. Or not. Well, we need
> >> to measure.
> >>
> >> Doubtfully,
> >> Pierre
> >> _______________________________________________
> >> elephant-devel site list
> >> elephant-devel at common-lisp.net
> >> http://common-lisp.net/mailman/listinfo/elephant-devel