[elephant-devel] Collection paging

Fri Oct 19 13:04:45 UTC 2007

Can we come up with a list of functions like this that people would  
like to see?
Right now I have:
- feature (difficulty)
- size of btree (moderate)
- size of sub-range?  (harder)
- go to N values from btree starting at the Kth key in btree

Ian

On Oct 3, 2007, at 11:07 PM, lists at infoway.net wrote:

> I have to say that Mariano is hitting some of the issues we will be  
> facing soon as our quest to learn Lisp and Elephant continues and  
> we continue working on migrating some of our SQL-based applications  
> over. This particular need of his is also a real need we have since  
> it's something we offer to our application users. For example, in  
> the application we are working on migrating, we have a table with  
> over 7 million rows. This obviously has many thousands of 50-row  
> pages to navigate through. Our user interface offers the "usual"  
> search, sort by any column(s), and page navigation (First, Last,  
> Next, Previous, or manually input a page number).
>
> The way we handle this in our code is something like SELECT COUNT 
> (*) FROM table_name. From the count, figure out the number of  
> pages. Then compute an offset based on the page the user wants to  
> view (e.g. assuming 50 rows per page and wanting to view page 90,  
> the offset would be (50 * 90) - 1 = 4499) and formulate the SQL  
> query as something like SELECT * FROM table_name ORDER BY  
> {sort_order} OFFSET {computed_offset} LIMIT 50 (note that all this  
> assuming a 50-row page size, and the user also has the ability to  
> change the page size via the web interface)
>
> From a SQL data manipulation language perspective, it's pretty  
> straight forward. From a SQL internal execution path, I really have  
> no idea how it's implemented and don't know if it does any linear  
> scanning to return the results. The fact is that our application  
> allows you to navigate through the 7+ million row table in under 2  
> seconds per page no matter which page you wish to view or sort  
> order. From a user perspective, 2 seconds for a browser-based  
> screen refresh is more than acceptable. Will Elephant allow to  
> "refresh" as quickly if in the current model it needs to do a  
> linear scan? We haven't gotten there yet, but maybe someone can  
> comment on that.
>
> Thanks
>
> On Oct 3, 2007, at 9:13 PM, Ian S Eslick wrote:
>
>> When you say indexes are not sequential, do you mean UIDs are not  
>> sequentially allocated?  I think there is a BDB sequence issue  
>> that I've never worried about that jumps to the nearest 100 when  
>> you reconnect.  However, if you create anything other than a user  
>> object, you will also have gaps in the UID sequence so that's a  
>> fundamental issue.  Don't assume anything about UIDs other than  
>> the fact that they are unique.
>>
>> You could create and index your own field which is a sequential ID  
>> for creation ordering, but it sounds like you probably want to  
>> return a sublist based on some sort order like alphabetical by  
>> name or by date.  In this case, at least doing the last page is  
>> easy, map from end and count the # of users you want before you  
>> terminate, but to find an element that is N elements away from the  
>> first or last element in less than O(n) time isn't possible with  
>> the underlying B-Trees we're using.
>>
>> The first question is whether you database is guaranteed to be so  
>> big that you can't just do this linear time.  When you start to  
>> face performance issues, then you can look at building that  
>> additional data structure.
>>
>> Otherwise, you will have to implement a data structure that  
>> maintains this information on top of the Elephant infrastructure.
>>
>> The first idea that occurs to me is to drop the idea of using an  
>> indexed class or standalone btrees and just build a red-black tree  
>> using object slots (you can inherit from a base class that  
>> implements the RB tree functionality).  This simultaneously solves  
>> the count problem and the access element # N problem.  The O(log  
>> (base 2) N) lookup time will have a higher fixed cost per level  
>> traversal, but if you start getting really large dbs (1000's to  
>> 10k's?) then it will certainly beat a linear map-index approach.   
>> i.e.
>>
>> http://en.wikipedia.org/wiki/Red-black_tree
>>
>> There is a lisp example of this data structure here:
>>
>> http://www.aviduratas.de/lisp/progs/rb-trees.lisp
>>
>> Now there is a problem that you'll need one of these for each  
>> sorted order which for a list sorted many different ways is a  
>> problem.  Anyone know how SQL query systems implement this?
>>
>> Just remember that premature optimization is one of the four  
>> horseman of the apocalypse for the effective programmer.
>>
>> Ian
>> ----- Original Message -----
>> From: Mariano Montone
>> To: Elephant bugs and development
>> Sent: Wednesday, October 03, 2007 6:57 PM
>> Subject: [elephant-devel] Collection paging
>>
>> Hello, it's me again :S.
>>
>> I would like to know how I can access persistent collection pages  
>> efficiently.
>>
>> What I'm trying to do is making work a web list component with  
>> elephant. The list component is supposed to support well known  
>> navigation commands, like look at the collection in pages, support  
>> for first, last, next, previous buttons, and display of collection  
>> size.
>>
>> The collection size problem was treated here:  http://common- 
>> lisp.net/pipermail/elephant-devel/2007-October/001162.html.
>>
>> But now I have a problem with building the pages.
>>
>> My first try was:
>>   (let*
>>       ((start (* (current-page self) (page-size self)))
>>        (end (+ start (page-size self)))
>>        )
>>         (<:ul
>>          (elephant:map-btree #'(lambda (key elem) (declare (ignore  
>> key))
>>                        (let ((elem-text (make-elem-text self elem)))
>>                          (<:li
>>                           (if (slot-value self 'selectable)
>>                           (<ucw:a :action (answer elem)  (<:as- 
>> html elem-text))
>>                           (<:a (<:as-html elem-text))))))
>>                  (model self) :start start :end end)
>>          )
>>
>> with start and end previously fixed based in the current page  
>> number and size.
>>
>> But I realized indexes were not sequential when I created new  
>> objects, as this shows:
>>
>> ASKIT> (with-btree-cursor (cursor (find-class-index 'user))
>>   (iter
>>     (for (values exists? k v) = (cursor-next cursor))
>>     (while exists?)
>>     (format *standard-output* "~A -> ~A ~%" k v)))
>> 2 -> #<USER name: dssdf {B043379}>
>> 3 -> #<USER name: ttttt {B045C69}>
>> 5 -> #<USER name: ff {B048179}>
>> 6 -> #<USER name: other {B04A451}>
>> 7 -> #<USER name: guest {AD61271}>
>> 100 -> #<USER name: qqq {B053001}>
>> 101 -> #<USER name:  {B055721}>
>> 102 -> #<USER name:  {B057E01}>
>> 103 -> #<USER name:  {B05A529}>
>> 104 -> #<USER name:  {B05CCF1}>
>> 105 -> #<USER name:  {B05F579}>
>> 106 -> #<USER name:  {B063E91}>
>> 107 -> #<USER name: qqq {B066851}>
>> 200 -> #<USER name:  {B069519}>
>> 201 -> #<USER name:  {B06C009}>
>> 300 -> #<USER name:  {B06EBA1}>
>> 301 -> #<USER name: aaa {B0717D1}>
>> NIL
>>
>> I don't think this is a bug, it must have to do with how Elephant  
>> manages btrees; but then how am I supposed to access through pages?
>> I would like to have to access all the objects from the beggining  
>> just to discard them instantly (imagine a large collection and the  
>> user wanting to see the last page).
>>
>> Thank you again :)
>>
>> Mariano
>>
>>
>> _______________________________________________
>> elephant-devel site list
>> elephant-devel at common-lisp.net
>> http://common-lisp.net/mailman/listinfo/elephant-devel
>> _______________________________________________
>> elephant-devel site list
>> elephant-devel at common-lisp.net
>> http://common-lisp.net/mailman/listinfo/elephant-devel
>
> _______________________________________________
> elephant-devel site list
> elephant-devel at common-lisp.net
> http://common-lisp.net/mailman/listinfo/elephant-devel