[vivace-graph-devel] nodes, fixnums/upper bounds, and multi-constituent indices

Wed Sep 14 15:25:41 UTC 2011

I am still reading through all the homework recommended in recent posts :)
Really good stuff.  I hope my questions are not a distraction from the
important topics at hand, but contribute to the general discussion and to
(at least my) understanding of the project, its goals, how I can use VG,
and how I might, in some way, contribute to the effort.

part 1

Another topic I have been looking at, related to the indexing and UUIDs, is
the representation (reification?) of nodes, or the lack thereof.  One
difference between vivace-graph and the other tstores I've played with is
that the others can reference nodes as first-class "things".  In wilbur this
is called a "node", and it is represented by a simple object composed of the
canonical identifier (uri-namestring) and a flag to indicate "resolution".
For wilbur, that flag indicates identification against a short/long
namespace mapping, but I think the concept could be extended to cover
hashing or other deferrable operations.  In the Directed Edge model, nodes
are apparently considered "Items" and have a somewhat richer archetype.

In VG, this is not the case?  Triples are (currently) identified by
time-based UUIDs, as previously discussed, and nodes themselves are not
hashed and indexed.  Maybe this will change naturally in the course of
moving to v5 UUIDs?
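
To make the question a bit more concrete, here is a rough sketch (in Python
rather than Lisp, purely for illustration) of what I mean by a first-class
node whose id is a v5, name-based UUID.  The Node class, its fields, and
the triple_id helper are all made up for this example; none of it is VG or
wilbur code.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical first-class node: canonical identifier plus a flag."""
    uri: str                 # canonical identifier (uri-namestring)
    resolved: bool = False   # "resolution" flag; its meaning is up to the store

    @property
    def node_id(self) -> uuid.UUID:
        # v5 UUIDs are name-based, so the same URI always yields the same id.
        return uuid.uuid5(uuid.NAMESPACE_URL, self.uri)


def triple_id(s: Node, p: Node, o: Node) -> uuid.UUID:
    # Assumption for illustration: a triple id derived from its node ids,
    # instead of a time-based UUID assigned at insertion.
    name = "|".join(str(n.node_id) for n in (s, p, o))
    return uuid.uuid5(uuid.NAMESPACE_URL, name)
```

With something like this, a node can be hashed and indexed on its own,
independent of any triple it appears in.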

part 2

This sort of blends into another indexing-model question, about the
current model, which (as I understand it) is based on a hierarchical index
structure.  Couldn't additional speed be achieved through multi-constituent
indexing?  I.e. an SP index, a PO index, etc., in which multiple nodes of a
single triple are hashed in the aggregate to allow for direct lookup.  This
would of course lower the upper bound on the number of triples previously
discussed if everything were housed in a single-rooted index structure, as
there would (eventually) be collisions between these incongruent indexing
schemes.  So maybe a multi-rooted index strategy is something that should
be considered and incorporated early on.  I think this is already partially
implemented as spogi, gsopi, etc., but those are still "single-constituent"
hierarchical indexes?  As a concrete example -- in case my question has
been as clear as mud :) -- I'd cite the cassandra-spoc-index-mediator of
de.setf.resource, which leverages multi-constituent indexes extensively.
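
A minimal sketch of the multi-rooted idea, again in Python and only as an
illustration: each constituent pair gets its own root table, so an (s, p)
key can never collide with a (p, o) key.  The class and method names are
hypothetical and are not taken from VG or de.setf.resource.

```python
from collections import defaultdict


class MultiRootedIndex:
    """Hypothetical store with one independent root per constituent pair."""

    def __init__(self):
        self.sp = defaultdict(set)  # root keyed on (subject, predicate)
        self.po = defaultdict(set)  # root keyed on (predicate, object)
        self.so = defaultdict(set)  # root keyed on (subject, object)

    def add(self, s, p, o):
        triple = (s, p, o)
        # Each pair is hashed in the aggregate as a single tuple key.
        self.sp[(s, p)].add(triple)
        self.po[(p, o)].add(triple)
        self.so[(s, o)].add(triple)

    def lookup(self, s=None, p=None, o=None):
        # Direct lookup: one hash probe, no descent through nested levels.
        if s is not None and p is not None and o is None:
            return set(self.sp.get((s, p), ()))
        if p is not None and o is not None and s is None:
            return set(self.po.get((p, o), ()))
        if s is not None and o is not None and p is None:
            return set(self.so.get((s, o), ()))
        raise ValueError("exactly two constituents must be bound")


idx = MultiRootedIndex()
idx.add("dan", "knows", "bob")
idx.add("dan", "knows", "alice")
idx.add("alice", "knows", "bob")
print(idx.lookup(s="dan", p="knows"))
```

The obvious cost is that every triple is written once per root, which is
the space/time trade-off I am asking about.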

Apologies (as usual) if I am missing something obvious or distracting from
more useful conversation.

Dan