[vivace-graph-devel] elephant
Kevin Raison
raison at chatsubo.net
Tue Sep 13 16:15:59 UTC 2011
Dan, I actually already tried using elephant at a very early stage in
the development of VG. While elephant is an excellent library (which I
have used in many projects), it is simply too slow to be of use here.
One of the goals of VG is to be fast and to be able to handle billions
of triples. Elephant slows down very quickly because of a number of
factors, including its use of BerkeleyDB rather than a native Lisp
back-end store, as well as its complexity. VG does not need the level
of complexity or abstraction that you get with elephant's indexes and
class redefinition logic. In our case, we are dealing with one class,
the triple, and as such, we can be very specific about how we store and
index it as well as how we deal with it in memory. Standard b-trees
simply won't efficiently handle the fanout of a large triple store; we
need something specifically tuned to our purpose. I am fairly certain
that linear hashing for triple storage combined with b-tries or fb-trees
for indexing would do much better. There are other graph dbs out there
that use this strategy. See
http://blog.directededge.com/2009/02/27/on-building-a-stupidly-fast-graph-database/
for some good discussion.
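
For anyone following along who has not run into linear hashing before, here is a minimal in-memory sketch of the idea in Python. It is not VG code (VG is Common Lisp) and says nothing about VG's actual on-disk layout; the class name LinearHashTable, its parameters, and the load-factor threshold are all made up for illustration. The point is only that the table grows one bucket at a time by splitting the bucket at a moving pointer, so there is never a stop-the-world rehash of the whole store.

```python
class LinearHashTable:
    """Toy linear hash table holding (subject, predicate, object) tuples.

    Hypothetical illustration only; not the VG storage format.
    """

    def __init__(self, initial_buckets=4, max_load=2.0):
        self.n0 = initial_buckets   # bucket count at level 0
        self.level = 0              # completed doubling rounds
        self.split = 0              # index of the next bucket to split
        self.max_load = max_load    # average entries per bucket before a split
        self.buckets = [[] for _ in range(initial_buckets)]
        self.count = 0

    def _address(self, triple):
        h = hash(triple)
        size = self.n0 << self.level        # buckets at the start of this round
        idx = h % size
        if idx < self.split:                # bucket already split this round,
            idx = h % (size * 2)            # so use the next-level hash
        return idx

    def __contains__(self, triple):
        return triple in self.buckets[self._address(triple)]

    def add(self, triple):
        bucket = self.buckets[self._address(triple)]
        if triple in bucket:
            return False
        bucket.append(triple)
        self.count += 1
        # Grow by exactly one bucket when the average load is too high.
        if self.count / len(self.buckets) > self.max_load:
            self._split_one()
        return True

    def _split_one(self):
        size = self.n0 << self.level
        old = self.buckets[self.split]
        keep, move = [], []
        for t in old:
            # Under the next-level hash an entry either stays at `split`
            # or moves to the new bucket at index `split + size`.
            (keep if hash(t) % (size * 2) == self.split else move).append(t)
        self.buckets[self.split] = keep
        self.buckets.append(move)           # lands at index split + size
        self.split += 1
        if self.split == size:              # round finished: table has doubled
            self.level += 1
            self.split = 0


table = LinearHashTable()
for i in range(1000):
    table.add(("node%d" % i, "knows", "node%d" % (i + 1)))
```

Note that the bucket being split is the one at the pointer, not necessarily the one that overflowed; that is what keeps the addressing rule down to two modulo operations.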
Another goal of mine is to develop a native Lisp back-end that projects
like elephant might be able to take advantage of; not relying on
external, non-Lisp libraries is a good thing, especially BerkeleyDB,
given its terrible licensing terms (thanks, Oracle).
You mention that you had some further thoughts after reading the fractal
pre-fetching b-trees paper; care to share?
-Kevin
On 09/13/2011 07:44 AM, Dan Lentz wrote:
> I've been thinking about persistent index strategies, and have read
> through the paper on fpb+trees, and have had a few thoughts.
>
> The first and simplest is to make use of elephant. It's not very exotic
> of course, but it would allow a model in which triples can be first class
> objects, yet leverage a reasonably performant back end (bdb). In
> addition, the set-valued slots and association slots are nice
> abstractions on top of which to build the rdf semantics (properties,
> extensions) on top of a real clos mop.
>
> I figured I'd shoot the idea onto the mailing list to get a feel for the
> degree and nature of agreement/disagreement.