<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 TRANSITIONAL//EN">

<HTML>

<HEAD>

  <META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">

  <META NAME="GENERATOR" CONTENT="GtkHTML/3.3.2">

</HEAD>

<BODY>

I suspect raw I/O costs will dominate query compilation time, and for straight scans and point reads, which I imagine make up the majority of queries, the optimizer will find a good plan trivially.  An interesting experiment would be to take a simple benchmark that stores a million objects of size x and compare it with the same benchmark storing objects of size (* 2 x).<BR>
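<BR>
A minimal sketch of that experiment, assuming Python with the standard sqlite3 module as a stand-in store (it is not Elephant, and the names time_store and compare are made up for illustration). It writes the same number of objects at size x and at size 2x through one reused statement, so any change in elapsed time should come from data volume rather than from query compilation.<BR>

```python
import os
import sqlite3
import tempfile
import time


def time_store(n_objects, size_bytes):
    """Store n_objects blobs of size_bytes each; return elapsed seconds."""
    payload = b"x" * size_bytes
    with tempfile.TemporaryDirectory() as tmp:
        conn = sqlite3.connect(os.path.join(tmp, "bench.db"))
        try:
            conn.execute(
                "CREATE TABLE objects (id INTEGER PRIMARY KEY, data BLOB)"
            )
            start = time.perf_counter()
            # One fixed statement reused for every row, so parsing and
            # planning cost is the same for both payload sizes.
            conn.executemany(
                "INSERT INTO objects (id, data) VALUES (?, ?)",
                ((i, payload) for i in range(n_objects)),
            )
            conn.commit()
            elapsed = time.perf_counter() - start
        finally:
            conn.close()
    return elapsed


def compare(n_objects=100_000, size_bytes=256):
    """Time the same workload at size x and at size 2x."""
    t_x = time_store(n_objects, size_bytes)
    t_2x = time_store(n_objects, 2 * size_bytes)
    return {"x": t_x, "2x": t_2x, "ratio": t_2x / t_x}


if __name__ == "__main__":
    print(compare())
```

If the ratio comes out near 2, the time is tracking bytes written, which would point at raw I/O; a ratio near 1 would point at per-operation overhead instead. Timings from a small run are noisy, so this only indicates a direction.<BR>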

<BR>

<BR>

For the way that I'm using Elephant, it's just plain raw I/O, and that is why BerkeleyDB is faster.  Of course, in my personal usage, I use DCM to do a huge amount of data caching...<BR>

<BR>

One thing that interests me is including a performance benchmark that we can share with the relational people.  For example, we might be able to support osdb: <A HREF="http://osdb.sourceforge.net/">http://osdb.sourceforge.net/</A>.  (Part of this is driven by my desire, which Ian shares, to have a powerful example application as part of our documentation.)<BR>

<BR>

The real problem here is that Elephant is so different from, and so much more powerful and convenient than, a relational system that one really needs to use an "application" benchmark, not a "relational" benchmark.  If we use a relational benchmark, we are tying our hands behind our backs.<BR>

<BR>

Once we release 0.9, the integration of the "postmodern" backend is going to be my highest priority.  The model I will move towards is: a CLSQL backend for general databases and as a starting point for writing database-specific backends, plus (hopefully) a collection of database-specific backends that can provide better performance.<BR>

<BR>

The one thing I don't ever want to do is give up backend-independence.  I hope I'm not the only person who sees tremendous benefit in being able to switch your backend implementation choice at any time.<BR>
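<BR>
To make the model concrete, here is a rough sketch in Python rather than Lisp. Every name in it (Backend, GenericSqlBackend, TunedBackend, open_store) is hypothetical and none of it is Elephant's actual API; dictionaries stand in for real databases. The point it shows is only that application code talks to one interface, so a database-specific backend can replace the generic one without the application changing.<BR>

```python
from abc import ABC, abstractmethod


class Backend(ABC):
    """Hypothetical minimal interface that every backend implements."""

    @abstractmethod
    def put(self, key, value):
        """Store value under key."""

    @abstractmethod
    def get(self, key):
        """Return the value stored under key."""


class GenericSqlBackend(Backend):
    """Stand-in for the portable backend; a dict replaces the database."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = value

    def get(self, key):
        return self._rows[key]


class TunedBackend(GenericSqlBackend):
    """Stand-in for a database-specific backend.

    It starts from the generic code and would override only the
    operations where the specific database offers a faster path.
    """

    def put(self, key, value):
        # A real backend would use a database-specific fast path here.
        super().put(key, value)


# Hypothetical registry mapping a backend name to its implementation.
REGISTRY = {"generic": GenericSqlBackend, "tuned": TunedBackend}


def open_store(kind):
    """Return the named backend, falling back to the generic one."""
    return REGISTRY.get(kind, GenericSqlBackend)()


def application(store):
    """Application code that never names a concrete backend."""
    store.put("greeting", "hello")
    return store.get("greeting")
```

Calling application(open_store("generic")) and application(open_store("tuned")) returns the same value, which is the property that backend-independence is meant to preserve.<BR>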

<BR>

On Wed, 2007-04-11 at 08:16 -0400, Ian Eslick wrote:

<BLOCKQUOTE TYPE=CITE>

<PRE>

<FONT COLOR="#000000">It's interesting that there is only a little performance advantage.   </FONT>

<FONT COLOR="#000000">Have you done some profiling to see where the time is going?  It  </FONT>

<FONT COLOR="#000000">sounds like either the queries were simple enough that the  </FONT>

<FONT COLOR="#000000">compilation step was trivial or that we're seeing Amdahl's law and  </FONT>

<FONT COLOR="#000000">the SQL costs are swamped by some other activity.</FONT>


<FONT COLOR="#000000">On Apr 11, 2007, at 4:30 AM, Henrik Hjelte wrote:</FONT>


<FONT COLOR="#000000">> Regarding stored procedures, I agree with Ian that the main  </FONT>

<FONT COLOR="#000000">> performance</FONT>

<FONT COLOR="#000000">> advantage that come from them is that the query planning is  </FONT>

<FONT COLOR="#000000">> prepared in</FONT>

<FONT COLOR="#000000">> advance. This is also done if you use prepared sql statements, so they</FONT>

<FONT COLOR="#000000">> give the same advantage. Stored procedures can however be faster if  </FONT>

<FONT COLOR="#000000">> they</FONT>

<FONT COLOR="#000000">> involve several steps, then you won't have to send intermediate  </FONT>

<FONT COLOR="#000000">> results</FONT>

<FONT COLOR="#000000">> to the client and then back to the server. What you should avoid for</FONT>

<FONT COLOR="#000000">> performance reasons is repeatedly sending strings to parse and  </FONT>

<FONT COLOR="#000000">> execute.</FONT>

<FONT COLOR="#000000">></FONT>

<FONT COLOR="#000000">> I have really tried to optimize the postmodern backend for speed,  </FONT>

<FONT COLOR="#000000">> still</FONT>

<FONT COLOR="#000000">> it is slower than BerkeleyDB. The postmodern backend uses prepared</FONT>

<FONT COLOR="#000000">> statements for almost everything "simple", I could not measure any</FONT>

<FONT COLOR="#000000">> performance advantage with using stored procedures for this. There is</FONT>

<FONT COLOR="#000000">> one stored procedure left because it involves several steps, so in</FONT>

<FONT COLOR="#000000">> theory it can be faster (compared to a couple of prepared statements),</FONT>

<FONT COLOR="#000000">> but I haven't actually measured if and how much faster.</FONT>

<FONT COLOR="#000000">></FONT>

<FONT COLOR="#000000">> Negative: stored procedures for the clsql backend will definitely  </FONT>

<FONT COLOR="#000000">> remove</FONT>

<FONT COLOR="#000000">> portability between databases. Positive: a little faster. But I am</FONT>

<FONT COLOR="#000000">> totally convinced that stored procedures will not bring clsql even  </FONT>

<FONT COLOR="#000000">> close</FONT>

<FONT COLOR="#000000">> to the performance of BerkeleyDB.</FONT>

<FONT COLOR="#000000">></FONT>

<FONT COLOR="#000000">> /Henrik Hjelte</FONT>

<FONT COLOR="#000000">></FONT>

<FONT COLOR="#000000">></FONT>

<FONT COLOR="#000000">> On Tue, 2007-04-03 at 19:31 +0200, Pierre THIERRY wrote:</FONT>

<FONT COLOR="#000000">>> Scribit Robert L. Read dies 03/04/2007 hora 11:08:</FONT>

<FONT COLOR="#000000">>>> Stored procedures tend to not be very portable; therefore to put  </FONT>

<FONT COLOR="#000000">>>> them</FONT>

<FONT COLOR="#000000">>>> in the current "postgres" backend, which should really be called a</FONT>

<FONT COLOR="#000000">>>> "clsql" backend, would make it less likely to work with MySQL.</FONT>

<FONT COLOR="#000000">>></FONT>

<FONT COLOR="#000000">>> I was thinking at having some PostgreSQL-specific bits within the  </FONT>

<FONT COLOR="#000000">>> clsql</FONT>

<FONT COLOR="#000000">>> backend. That would apply to MySQL or any other DB that can use  </FONT>

<FONT COLOR="#000000">>> stored</FONT>

<FONT COLOR="#000000">>> procedures to make some queries faster.</FONT>

<FONT COLOR="#000000">>></FONT>

<FONT COLOR="#000000">>>> However, this raises an interesting question:  Is performance a</FONT>

<FONT COLOR="#000000">>>> significant problem (at least for the Postgres users?)  If you had a</FONT>

<FONT COLOR="#000000">>>> "wish list" for Elephant features, would better performance be at  </FONT>

<FONT COLOR="#000000">>>> the</FONT>

<FONT COLOR="#000000">>>> top?</FONT>

<FONT COLOR="#000000">>></FONT>

<FONT COLOR="#000000">>> I just don't want to be limiting. The only way to go seemed to me  </FONT>

<FONT COLOR="#000000">>> to be</FONT>

<FONT COLOR="#000000">>> to benchmark various uses of stored procedures. On the other hand,</FONT>

<FONT COLOR="#000000">>> having a cache for read queries, as was discussed earlier, could well</FONT>

<FONT COLOR="#000000">>> make the stored procedure useless. Or not. Well, we need to measure.</FONT>

<FONT COLOR="#000000">>></FONT>

<FONT COLOR="#000000">>> Doubtfully,</FONT>

<FONT COLOR="#000000">>> Pierre</FONT>

<FONT COLOR="#000000">>> _______________________________________________</FONT>

<FONT COLOR="#000000">>> elephant-devel site list</FONT>

<FONT COLOR="#000000">>> <A HREF="mailto:elephant-devel@common-lisp.net">elephant-devel@common-lisp.net</A></FONT>

<FONT COLOR="#000000">>> <A HREF="http://common-lisp.net/mailman/listinfo/elephant-devel">http://common-lisp.net/mailman/listinfo/elephant-devel</A></FONT>




</PRE>

</BLOCKQUOTE>

</BODY>

</HTML>