[elephant-devel] BDB vs postmodern
Ian Eslick
eslick at media.mit.edu
Thu Feb 21 04:39:46 UTC 2008
On Feb 20, 2008, at 10:55 PM, lists at infoway.net wrote:
> Ian,
>
> Thanks for the great feedback.
>
> I understand your point about the postmodern performance issue.
> Hopefully that is something that could be overcome in the future.
>
Probably not entirely, especially if you're talking over a socket.
We're implementing BTrees on top of Tables on top of BTrees; you can't
get around that overhead.
> I do happen to prefer BDB for some reason and I like the whole
> elephant system in general for the reasons you mention and others.
>
Of course! ORM is really annoying for rapid development, especially
if you are evolving a running system.
> However, I'm a bit confused about your comment on BDB vs SQL/ORM
> performance. I would certainly hope that using Elephant with BDB
> would be faster, for most practical cases, than SQL/ORM, as long as
> the application is properly designed to leverage Elephant and BDB's
> architecture. Now, the part that confused me was "complex
> queries and joins". I was certainly under the impression that
> properly designing your object model could be more beneficial than
> the relational model and joins of the SQL world. It might not have
> been directly mentioned on the Elephant project, but it has certainly
> been mentioned by the folks at AllegroCache and, in essence, the two
> projects seem to have a lot in common.
>
It depends on your query and on how much work you're willing to do on
the Elephant side to replicate, for your query types, the functionality
that you get from a well-implemented SQL query engine.
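
As a rough illustration of that work, here is a minimal sketch of a join
written by hand over two in-memory collections. It is in Python rather
than Lisp, it is not Elephant's API, and every name in it (authors,
posts, hash_join) is hypothetical. A SQL engine would plan and run the
equivalent of this for you, and would also pick among join strategies.

```python
from collections import defaultdict

# Hypothetical data; in a real object store these would be persistent
# collections rather than lists of dicts.
authors = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Grace"},
    {"id": 3, "name": "Edsger"},
]
posts = [
    {"id": 10, "author_id": 1, "title": "Notes"},
    {"id": 11, "author_id": 2, "title": "Compilers"},
    {"id": 12, "author_id": 1, "title": "Engines"},
]


def hash_join(left, right, left_key, right_key):
    """Join two lists of dicts on left[left_key] == right[right_key].

    Build phase: index the left side by its key.
    Probe phase: look up each right-side row in that index.
    """
    index = defaultdict(list)
    for row in left:
        index[row[left_key]].append(row)

    joined = []
    for row in right:
        # .get() avoids creating empty entries for keys with no match.
        for match in index.get(row[right_key], []):
            joined.append((match, row))
    return joined


# Roughly: SELECT a.name, p.title
#          FROM authors a JOIN posts p ON p.author_id = a.id
pairs = [
    (a["name"], p["title"])
    for a, p in hash_join(authors, posts, "id", "author_id")
]
```

Each new query shape (outer joins, range predicates, ordering) needs
more code of this kind, which is the cost being weighed here.
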
> One of the things that AllegroCache has that I haven't seen in
> Elephant is the "oid" keyword parameter to many of their functions.
> As per their documentation: oid - if true return the object id
> instead of the object. So, this could be used as a way to speed up
> certain queries in Elephant, such as getting a COUNT without having
> to generate the entire object.
>
The serializer currently can't do this, but I don't think it's too
hard. There are hooks and vaporware plans for an Elephant query
infrastructure that would make this easier for most users, but alas
time caught up to me before I was able to really push through the bulk
of the work.
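
To make the idea concrete, here is a minimal sketch of answering a
COUNT-style query from a secondary index of object ids, without
rebuilding any object. It is in Python rather than Lisp and does not
reflect Elephant's or AllegroCache's actual API; TinyStore and all of
its methods are hypothetical names chosen for illustration.

```python
import pickle


class TinyStore:
    """Hypothetical key/value store holding serialized objects.

    Keys play the role of object ids (oids); values are pickled bytes.
    Simplification: overwriting an oid does not remove stale index entries.
    """

    def __init__(self):
        self._rows = {}        # oid -> pickled object
        self._by_field = {}    # (field, value) -> set of oids (secondary index)
        self.deserialized = 0  # number of objects rebuilt, for illustration

    def put(self, oid, obj, indexed_fields=()):
        self._rows[oid] = pickle.dumps(obj)
        for field in indexed_fields:
            self._by_field.setdefault((field, obj[field]), set()).add(oid)

    def get(self, oid):
        """Rebuild the full object; this is the expensive path."""
        self.deserialized += 1
        return pickle.loads(self._rows[oid])

    def oids_where(self, field, value):
        """Return matching oids only; no object is deserialized."""
        return sorted(self._by_field.get((field, value), ()))

    def count_where(self, field, value):
        """COUNT-style query answered from the index alone."""
        return len(self._by_field.get((field, value), ()))


store = TinyStore()
store.put(1, {"name": "a", "status": "open"}, indexed_fields=("status",))
store.put(2, {"name": "b", "status": "closed"}, indexed_fields=("status",))
store.put(3, {"name": "c", "status": "open"}, indexed_fields=("status",))

open_count = store.count_where("status", "open")  # answered without get()
open_oids = store.oids_where("status", "open")
```

The counter stays at zero until get() is called, which is the saving an
"oid"-style option is after.
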
> It is true about the architectural performances of Postgres. I'm
> personally not very familiar with it (I'm more of a MySQL person)
> but I know that MySQL does have their issues when it comes to
> clustering, replication, etc.
>
PostgreSQL is great, and while MySQL has come a long way in recent
years, I still think PG is the superior platform (stability,
performance, suitability for complex applications, etc).
> As for the replication and distributed transactions in BDB, well, I
> don't know. I have a feeling that's more in the BDB side of the
> world rather than in Elephant, but I'm not very knowledgeable on
> that either. My impression is that BDB is designed to work in single
> systems rather than in a distributed environment and it has been the
> job of the DBMSs to leverage BDB and enhance it with all those other
> bells and whistles. If that's the case, then your point is correct
> and I would anticipate there would be an enormous amount of work to
> be done in Elephant to support that. Maybe one day.
>
> Thanks again,
> Waldo
>
> On Feb 20, 2008, at 10:05 PM, Ian Eslick wrote:
>
>> I doubt that Elephant on postmodern is going to be faster than
>> using CL-SQL to do direct ORM against Postgresql. Despite the
>> great work that Henrik and others have done with Postmodern, there
>> are too many layers of abstraction in the architecture to
>> overcome. The real point of using Elephant is ease of use, rapid
>> prototyping and the programming API, not absolute performance.
>>
>> I do suspect that raw performance (using BDB) may be comparable to or
>> better than any SQL/ORM solution for some simple access patterns, but
>> for complex queries and joins SQL is probably the better way to
>> go. We have already seen various comments on the list about common
>> features of SQL databases that Elephant is missing (like getting a
>> COUNT without generating all the objects).
>>
>>
>> As for using BDB to do replication and distributed transactions -
>> it's possible but no one has tried it. Elephant would need some
>> serious modifications as the transaction protocol is different and
>> I think that you'd have to build your own Global Transaction
>> Manager either in Lisp or as an additional C daemon.
>>
>> It's also possible that there are architectural or performance
>> issues in using Postgresql across multiple machines of which I am
>> unaware - hopefully one of the PG experts will comment.
>>
>> Ian
>>
>> On Feb 20, 2008, at 9:47 PM, lists at infoway.net wrote:
>>
>>> Well, this is certainly interesting, since this would allow me to
>>> decouple the storage system from the lisp environment allowing the
>>> possibility of setting up a cluster of lisp machines to handle
>>> application logic. Isn't there a way to achieve this on BDB?
>>>
>>> We prefer to deploy our systems on clusters of "inexpensive"
>>> machines in order to tolerate hardware failures and it seems that
>>> scalability of BDB/Lisp applications is achieved by scaling a
>>> single machine.
>>>
>>> Now, postmodern being 4x slower could be an issue. However, how
>>> that compares to regular CL-SQL and relational queries is a
>>> different story. If postmodern is about the same speed as hitting
>>> Postgres with CL-SQL and just using plain SQL instead of elephant,
>>> at the end of the day, that's the performance our users are
>>> getting anyway :)
>>>
>>> Thanks,
>>> Waldo
>>>
>>>>>
>>>>> Since Postgres does allow for features such as replication,
>>>>> clustering, and fail-over with multiple active simultaneous
>>>>> client connections, does this mean that I could have multiple
>>>>> (separate) lisp clients using elephant connecting to a separate
>>>>> Postgres cluster with no concurrency issues?
>>>>
>>>> Yes.
>>>>
>>>> You can do this on BDB, but only on the same system as it relies
>>>> on shared memory locks between processes. This helps for multi-
>>>> CPU systems (one lisp process per CPU) but not for distributing
>>>> Elephant across
>>>>
>>>>> Thanks,
>>>>> Waldo
>>>>> _______________________________________________
>>>>> elephant-devel site list
>>>>> elephant-devel at common-lisp.net
>>>>> http://common-lisp.net/mailman/listinfo/elephant-devel