[vivace-graph-devel] elephant

Kevin Raison raison at chatsubo.net
Tue Sep 13 16:15:59 UTC 2011


Dan, I actually already tried using elephant at a very early stage in 
the development of VG.  While elephant is an excellent library (which I 
have used in many projects), it is simply too slow to be of use here. 
One of the goals of VG is to be fast and to be able to handle billions 
of triples.  Elephant slows down very quickly because of a number of 
factors, including its use of BerkeleyDB rather than a native Lisp 
back-end store, as well as its complexity.  VG does not need the level 
of complexity or abstraction that you get with elephant's indexes and 
class redefinition logic.  In our case, we are dealing with one class, 
the triple, and as such, we can be very specific about how we store and 
index it as well as how we deal with it in memory.  Standard b-trees 
simply won't efficiently handle the fanout of a large triple store;  we 
need something specifically tuned to our purpose.  I am fairly certain 
that linear hashing for triple storage combined with b-tries or fb-trees 
for indexing would do much better.  There are other graph dbs out there 
that use this strategy.  See 
http://blog.directededge.com/2009/02/27/on-building-a-stupidly-fast-graph-database/ 
for some good discussion.

Another goal of mine is to develop a native Lisp back-end that projects 
like elephant might be able to take advantage of;  not relying on 
external, non-Lisp libraries is a good thing, especially BerkeleyDB, 
given its terrible licensing terms (thanks, Oracle).

You mention that you had some further thoughts after reading the fractal 
pre-fetching b-trees paper;  care to share?

-Kevin


On 09/13/2011 07:44 AM, Dan Lentz wrote:
> I've been thinking about persistent index strategies, and have read
> through the paper on fpb+trees, and have had a few thoughts.
>
> The first and simplest is to make use of elephant.  Its not very exotic
> or course but it would allow a model in which triples can be first class
> objects, yet leverage a reasonably performant back end (bdb).  In
> addition, the set-valued slots and association slots are nice
> abstractions on top of which to build the rdf semantics (properties,
> extensions) on top of a real clos mop.
>
> I figured I'd shoot the idea onto the mailing list to get a feel for the
> degree and nature of agreement/disagreement.
>
>
> _______________________________________________
> vivace-graph-devel mailing list
> vivace-graph-devel at common-lisp.net
> http://lists.common-lisp.net/cgi-bin/mailman/listinfo/vivace-graph-devel




More information about the vivace-graph-devel mailing list