[elephant-devel] Querying Advice

Ian Eslick eslick at csail.mit.edu
Mon Nov 13 12:57:16 UTC 2006


Performance and functionality of search/sort will depend a great deal on
the model we have of how data is stored, how it is cached, how it is
indexed and what kind of queries we want to support.  Robert's approach
is more than adequate if everything is maintained in lists in memory; it
is more expensive if we're searching by de-serializing slot values from
disk one at a time.  An index can be used to clean that up, but
sequencing the indexes can have an enormous impact on performance and
can perform queries that a naive approach would fail on because of
exhausting memory.

I think a macro system makes sense when we find we want to have a query
syntax (a simpler and more elegant version of what SQL tries to do) that
concisely expresses the class of operations we want to perform and
behind which we need to put a query sequencer or compiler to optimize
retrieval.

I do think that Pierre is right that we need to list a set of queries we
want to perform and not worry about performance up front.  We can worry
about sequencing, etc when we have examples of slow queries that start
to rise to the level of user perception.  Let's build up this
infrastructure incrementally so we can figure out what the problems are
at low cost.

Also, I'm going to promote a faster serializer shortly that should speed
up both SQL and BDB backends which should take some modest constant
factors off of query runtime from disk (vs. DCM).

Cheers,
Ian

Pierre THIERRY wrote:
> Scribit Daniel Salama dies 12/11/2006 hora 10:28:
>   
>> I guess we would have to sequentially navigate thru the  results in  
>> order to "manually" select each record based on all the other  
>> possible search arguments. I suppose, in a way, this can be done  
>> relatively painless by using macros
>>     
>
> I don't think custom macros are needed here. The search could be made
> with the various searching functions of Lisp and a predicate built from
> the constraints. Here is an example of a predicate building function for
> an application about real estate:
>
> (defun search-predicate (&key min-price max-price min-surface max-surface)
>   (lambda (x)
>     (and
>       (if min-price (>= (price x) min-price) t)
>       (if max-price (<= (price x) max-price) t)
>       (if min-surface (>= (surface x) min-surface) t)
>       (if max-surface (<= (surface x) max-surface) t))))
>
> Macros could be used to return an optimized closure where some useless
> branching is avoided, but some implementation may already do that
> optimization by themselves (I think at least SBCL removes unreachable
> code when compiling).
>
> It should be pretty easy to write some macros to avoid writing the
> predicated building function by hand, and we may provide some more as
> some sort of query language for the most common cases (min and max for
> the result of a function, and so on). I think we should keep the ability
> for the user to provide it's own predicate code (it would be so
> unlispy not to...).
>
>   
>> Then we have the issue of the sorting. I suppose it falls into a
>> similar situation: once we get the matching resultset from the
>> previous step, we would have to perform some "efficient" sort
>> algorithm on the data set dynamically based on the user's sorting
>> desire. I also suppose we could create a macro for this as well.
>>     
>
> I'm sure sorting is an absolutely pure functional thing. It's "just" a
> matter of finding an efficient algorithm here. Beware of premature
> optimization though. I think we should build a correct solution first,
> and check the state of the art of database implementations if profiling
> show us a bad performance.
>
> Functionally,
> Nowhere man
>   
> ------------------------------------------------------------------------
>
> _______________________________________________
> elephant-devel site list
> elephant-devel at common-lisp.net
> http://common-lisp.net/mailman/listinfo/elephant-devel



More information about the elephant-devel mailing list