[vivace-graph-devel] elephant

Kevin Raison raison at chatsubo.net
Wed Sep 14 05:30:53 UTC 2011


And one more paper on linear hashing.

On 09/13/2011 10:15 PM, Kevin Raison wrote:
> A good paper on linear hashing to disk is attached. I will respond to
> Mon key's comments after some sleep...
>
> On 09/13/2011 09:15 AM, Kevin Raison wrote:
>> Dan, I actually already tried using elephant at a very early stage in
>> the development of VG. While elephant is an excellent library (which I
>> have used in many projects), it is simply too slow to be of use here.
>> One of the goals of VG is to be fast and to be able to handle billions
>> of triples. Elephant slows down very quickly because of a number of
>> factors, including its use of BerkeleyDB rather than a native Lisp
>> back-end store, as well as its complexity. VG does not need the level of
>> complexity or abstraction that you get with elephant's indexes and class
>> redefinition logic. In our case, we are dealing with one class, the
>> triple, and as such, we can be very specific about how we store and
>> index it as well as how we deal with it in memory. Standard b-trees
>> simply won't efficiently handle the fanout of a large triple store; we
>> need something specifically tuned to our purpose. I am fairly certain
>> that linear hashing for triple storage combined with b-tries or fb-trees
>> for indexing would do much better. There are other graph dbs out there
>> that use this strategy. See
>> http://blog.directededge.com/2009/02/27/on-building-a-stupidly-fast-graph-database/
>>
>> for some good discussion.
>>
>> Another goal of mine is to develop a native Lisp back-end that projects
>> like elephant might be able to take advantage of; not relying on
>> external, non-Lisp libraries is a good thing, especially BerkeleyDB,
>> given its terrible licensing terms (thanks, Oracle).
>>
>> You mention that you had some further thoughts after reading the fractal
>> pre-fetching b-trees paper; care to share?
>>
>> -Kevin
>>
>>
>> On 09/13/2011 07:44 AM, Dan Lentz wrote:
>>> I've been thinking about persistent index strategies, and have read
>>> through the paper on fpb+trees, and have had a few thoughts.
>>>
>>> The first and simplest is to make use of elephant. Its not very exotic
>>> or course but it would allow a model in which triples can be first class
>>> objects, yet leverage a reasonably performant back end (bdb). In
>>> addition, the set-valued slots and association slots are nice
>>> abstractions on top of which to build the rdf semantics (properties,
>>> extensions) on top of a real clos mop.
>>>
>>> I figured I'd shoot the idea onto the mailing list to get a feel for the
>>> degree and nature of agreement/disagreement.
>>>
>>>
>>> _______________________________________________
>>> vivace-graph-devel mailing list
>>> vivace-graph-devel at common-lisp.net
>>> http://lists.common-lisp.net/cgi-bin/mailman/listinfo/vivace-graph-devel
>>
>> _______________________________________________
>> vivace-graph-devel mailing list
>> vivace-graph-devel at common-lisp.net
>> http://lists.common-lisp.net/cgi-bin/mailman/listinfo/vivace-graph-devel
>
>
> _______________________________________________
> vivace-graph-devel mailing list
> vivace-graph-devel at common-lisp.net
> http://lists.common-lisp.net/cgi-bin/mailman/listinfo/vivace-graph-devel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: p195-ellis.pdf
Type: application/pdf
Size: 1883801 bytes
Desc: not available
URL: <https://mailman.common-lisp.net/pipermail/vivace-graph-devel/attachments/20110913/06d3e76a/attachment.pdf>


More information about the vivace-graph-devel mailing list