<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 TRANSITIONAL//EN">

<HTML>

<HEAD>

  <META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">

  <META NAME="GENERATOR" CONTENT="GtkHTML/3.3.2">

</HEAD>

<BODY>

On Wed, 2006-07-26 at 17:36 -0400, Daniel Salama wrote:<BR>

<BLOCKQUOTE TYPE=CITE>

    <TT><FONT COLOR="#000000">The other approach I thought would be to model it similarly as to how  </FONT></TT><BR>

    <TT><FONT COLOR="#000000">I would do it in a relational database. Basically, I would create  </FONT></TT><BR>

    <TT><FONT COLOR="#000000">separate collections of objects representing the tables I would have  </FONT></TT><BR>

    <TT><FONT COLOR="#000000">in the relational database. Then, within each object, e.g. a customer  </FONT></TT><BR>

    <TT><FONT COLOR="#000000">object, I would create a reference to a collection that holds a  </FONT></TT><BR>

    <TT><FONT COLOR="#000000">subset of invoices, for example. This would allow me to simply query  </FONT></TT><BR>

    <TT><FONT COLOR="#000000">the invoices collection of a customer and that's it. At the same  </FONT></TT><BR>

    <TT><FONT COLOR="#000000">time, I would be able to query the entire invoices collection.</FONT></TT><BR>

</BLOCKQUOTE>

Dear Daniel,<BR>

    I think this approach is much better than creating a very large object.<BR>

<BR>

    Personally, I hold an opinion that a lot of people disagree with: I use the "prevalence" model,<BR>

in which I keep all of the objects in memory and, whenever I change something, I <BR>

write the change back to the data store.   This pretty much makes your reporting efficiency issues <BR>

go away, because any report can be computed from memory very quickly.<BR>
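<BR>

    As a rough illustration of the prevalence idea (in Python rather than Lisp, and not DCM's actual code),<BR>

here is a minimal sketch that assumes a JSON file stands in for the data store.  The names PrevalentStore,<BR>

put and report_total are hypothetical.<BR>

<PRE>
```python
import json
import os
import tempfile


class PrevalentStore:
    """Hypothetical write-through store: reads hit memory, writes also go to disk."""

    def __init__(self, path):
        self.path = path
        self.objects = {}  # key -> plain dict, the in-memory working set
        if os.path.exists(path):
            with open(path) as f:
                self.objects = json.load(f)

    def put(self, key, obj):
        # Update memory first, then write the change back to the data store.
        # Rewriting the whole file is a simplification for this sketch.
        self.objects[key] = obj
        with open(self.path, "w") as f:
            json.dump(self.objects, f)

    def report_total(self, field):
        # A report is just a scan over memory; no I/O is involved.
        return sum(obj[field] for obj in self.objects.values())


path = os.path.join(tempfile.mkdtemp(), "invoices.json")
store = PrevalentStore(path)
store.put("inv-1", {"customer": "c-1", "amount": 120})
store.put("inv-2", {"customer": "c-2", "amount": 80})
total = store.report_total("amount")

# Reopening the store reloads the same objects from disk.
reloaded = PrevalentStore(path)
```
</PRE>

    A real store would log individual changes instead of rewriting everything, but the shape is the same.<BR>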

<BR>

    I have checked into the "contrib" directory a package called DCM, for Data Collection Management,<BR>

that does the in-memory management --- the responsibility of the user is to call "register-object" whenever<BR>

an object needs to be written back.  DCM also addresses the "reference" problem that you mention --- that is,<BR>

instead of putting a whole object into a slot, you put its key there and look it up in a separate object.<BR>
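<BR>

    To make the key-instead-of-object idea concrete, here is a small Python sketch.  It is only an<BR>

illustration under my own assumptions: Customer, Invoice, add_invoice and invoices_of are hypothetical<BR>

names, and plain dictionaries stand in for the persistent collections.<BR>

<PRE>
```python
from dataclasses import dataclass, field


@dataclass
class Invoice:
    key: int
    amount: float


@dataclass
class Customer:
    key: int
    name: str
    # The slot holds invoice keys, not the Invoice objects themselves.
    invoice_keys: list = field(default_factory=list)


# Separate collections, one per kind of object, indexed by key.
invoices = {}
customers = {}


def add_invoice(customer, invoice):
    # Store the invoice once, globally, and record only its key on the customer.
    invoices[invoice.key] = invoice
    customer.invoice_keys.append(invoice.key)


def invoices_of(customer):
    # Resolve the stored keys against the global invoices collection.
    return [invoices[k] for k in customer.invoice_keys]


alice = Customer(key=1, name="Alice")
customers[alice.key] = alice
add_invoice(alice, Invoice(key=10, amount=120.0))
add_invoice(alice, Invoice(key=11, amount=80.0))
```
</PRE>

    You can then ask for invoices_of(alice) or scan the whole invoices collection, which is the pair of<BR>

queries you described.<BR>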

<BR>

    In this model, you would objectify each class of object (these are very similar to the "tables" in the<BR>

relational model or the "entities" in the entity-relationship model).  Each such class gets a "director", and<BR>

you operate against the director when you do something.  One of the advantages of this approach is <BR>

that you can choose the "strategy" for each director --- so you can choose to cache the objects in <BR>

memory, or to use the database store directly, or even to use a generational system.<BR>
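<BR>

    A sketch of the director-plus-strategy arrangement, again in Python and again hypothetical: the<BR>

class names, the lookup method and the signature of register_object are my assumptions for illustration<BR>

and do not reproduce DCM's interface.  A dictionary stands in for the database store.<BR>

<PRE>
```python
class DirectStrategy:
    """Every lookup goes to the backing store."""

    def __init__(self, backing):
        self.backing = backing

    def get(self, key):
        return self.backing[key]

    def put(self, key, obj):
        self.backing[key] = obj


class CachedStrategy:
    """Lookups are served from memory; writes go to both memory and the store."""

    def __init__(self, backing):
        self.backing = backing
        self.cache = dict(backing)  # load everything up front

    def get(self, key):
        return self.cache[key]

    def put(self, key, obj):
        self.cache[key] = obj
        self.backing[key] = obj


class Director:
    """One director per class of object; callers never touch the store directly."""

    def __init__(self, strategy):
        self.strategy = strategy

    def register_object(self, key, obj):
        self.strategy.put(key, obj)

    def lookup(self, key):
        return self.strategy.get(key)


customer_db = {}  # stand-ins for the persistent store
invoice_db = {}
customers = Director(CachedStrategy(customer_db))
invoices = Director(DirectStrategy(invoice_db))

customers.register_object(1, "Alice")
invoices.register_object(10, 120.0)
```
</PRE>

    The calling code is the same for both directors; only the strategy passed in at construction differs.<BR>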

<BR>

I think the tests of DCM could be considered a little bit of pseudocode for what you want.<BR>

<BR>

    In considering whether or not things should be kept in memory, one should do the math: the <BR>

maximum number of objects * their size vs. the free memory.  Memories are big and getting bigger.<BR>
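<BR>

    For example, with made-up numbers (one million objects of about 500 bytes each against 2 GiB of free<BR>

memory; all three figures are assumptions for illustration), the arithmetic looks like this:<BR>

<PRE>
```python
def fits_in_memory(max_objects, bytes_per_object, free_bytes):
    # Worst-case footprint: every object resident at once.
    needed = max_objects * bytes_per_object
    return needed, free_bytes >= needed


# Illustrative figures only, not measurements.
needed, ok = fits_in_memory(1_000_000, 500, 2 * 1024**3)
```
</PRE>

    Here the objects need 500,000,000 bytes, which fits comfortably in 2 GiB.<BR>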

<BR>

    Let me know if this addresses your ideas or if you have any other questions; Ian and I had<BR>

previously agreed that the lack of a big example application is one of the biggest weaknesses in <BR>

Elephant right now.<BR>

<BR>

</BODY>

</HTML>