[elephant-devel] Is it possible to query by 2 index ranges at the same time

Ian Eslick eslick at csail.mit.edu
Fri Jun 1 03:07:27 UTC 2007


Ignas,

The easiest way to do this is to follow Robert's suggestion, and  
declare slots x and y to be indexed (assuming your parameters are  
slot values of a persistent objects) and then say:

(remove-duplicates
   (nconc (get-instances-by-range 'my-class 'x 10 20)
          (get-instances-by-range 'my-class 'y -5 15)))

Of course you might do a little better, performance-wise, if you  
filter the y value as you traverse the x range (or vice-versa).   
Since get-instances-by-range is just a wrapper around map-inverted- 
index that collects visited objects, this could be up to twice as  
fast, although in practice I'd expect the benefit to be marginal:

(let ((matches nil))
   (map-inverted-index (lambda (obj)
                         (when (and (> y -5) (< y 15))
                           (push obj matches)))
	              'my-class
         	      'x
	              10
                       20)
   matches)

This version avoids a second map-inverted-index operation (called by  
get-instances-by-range) by only collecting objects if their y  
coordinate is in the range.  You can wrap all this in a function like  
(get-objects-in-region x1 y1 x2 y2).  Map over the x or y based on  
which query is smaller, or which parameter is sparser.  If you don't  
know, then it doesn't really matter which you choose!

If you get some performance comparisons of these two scenarios,  
please post them to the list!

Regards,
Ian

PS - It would be interesting to see an R-tree (or one of the well- 
known variants such as R* or Priority R-Trees) in Elephant.  Instead  
of using indexed classes, you would directly implement the R-tree  
nodes as persistent instances and write the construction and  
retrieval algorithms as if they were operating on in-memory objects.   
Graphs are a very natural structure to implement in Elephant and the  
only special provision I can think of might be maintaining a  
persistent set of free nodes for the dynamic case.

On May 31, 2007, at 10:17 PM, Robert L. Read wrote:

> I'm assuming x and y are properites of a data object, which has  
> some other component z which
> you with to retrieve, and you query is that you want to find all  
> the records (x,y,z) such that
> (10 < x < 20) and ( -5 < y < 15).
>
> There is a spectrum of solutions to this problem.  However, in the  
> general case Elephant will
> not compute this with the best possible asymptotic complexity,  
> although it may be better than
> a relational database at doing so.
>
> Here is the most naive solution:
>
> Examine every object in the database, and report those that meet  
> that above condition.
>
> Although this may seem silly, don't knock it till you try it....it  
> may be perfectly reasonable.
>
> A second solution is to make sure that you have specified :index on  
> the x component
> (or, without loss of generality, the y component), and then using  
> the "get-instance-by-range"
> feature to get all of the instances within the x range, and use a  
> simple lisp function to discard
> those for which the y component does not match.
>
> This will be relatively fast if the x range excludes a lot of  
> objects and the y range doesn't.
>
> It's not obvious to me that one can do better than that without  
> some significant coding
> (For example, a "grid file" is a datastructure designed to answer  
> such queries efficiently.
> An R-Tree is another geometric structure.)
>
> A relational database will not do better, in the worst case.  A  
> relational database will do
> a better job at statistical sampling---that is, a query optimizer  
> should, in a huge database,
> be able to decide whether it should use the "x" index or the "y"  
> index first based on the
> selectivity of those clauses in the boolean expression.  However,  
> fundamentally it can't
> do any better than to pick the best index, use that only examine  
> some of the objects, and then
> look at the values of the other ones.
>
> On Fri, 2007-06-01 at 03:48 +0300, Ignas Mikalajunas wrote:
>> Hi, i have an interesting usecase and i would like to ask whether  
>> i can solve it by using Elephant. I would like to index objects  
>> placed on a discreet 2d grid of an arbitarry size, which would  
>> require queries like (10 < x < 20) AND (-5 < y < 15). Is it  
>> possible to perform such a query with elephant? I am not sure i  
>> have looked in the right place, but i could not find such example  
>> in the manual. Ignas Mikalajūnas  
>> _______________________________________________ elephant-devel  
>> site list elephant-devel at common-lisp.net http://common-lisp.net/ 
>> mailman/listinfo/elephant-devel
> _______________________________________________
> elephant-devel site list
> elephant-devel at common-lisp.net
> http://common-lisp.net/mailman/listinfo/elephant-devel




More information about the elephant-devel mailing list