Nick and Koert summarized it pretty well. Just to clarify and give some
concrete examples:
If you want to start with a specific vertex and follow some path, it is
probably easier and faster to use a key-value store, or even MySQL or a
graph database.
If you want to count the average length of
Likely neither will give real-time performance for full-graph traversal, no.
And once in memory, GraphX would definitely be faster for "breadth-first"
traversal.
But for "vertex-centric" traversals (starting from a vertex and traversing
edges from there, such as "friends of friends" queries etc) then Titan is
it all depends on what kind of traversing. if it's point traversing then
something based on random access would be great.
if it's more scan-like traversal then spark will fit
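
To make the point-vs-scan distinction concrete, here is a minimal plain-Python sketch. It is not GraphX or Titan code; the toy graph and the function names are hypothetical, made up only for illustration. The first function is a "point" traversal that touches only the neighbourhood of one starting vertex, the second is a "scan-like" breadth-first traversal that ends up visiting everything reachable.

```python
from collections import deque

# Hypothetical toy graph (assumption for illustration): vertex -> neighbours.
GRAPH = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}


def friends_of_friends(graph, start):
    """Point traversal: only reads the adjacency of `start` and its friends."""
    direct = graph.get(start, set())
    two_hop = set()
    for friend in direct:
        # One random-access lookup per direct friend.
        two_hop |= graph.get(friend, set())
    # Exclude the start vertex and anyone who is already a direct friend.
    return two_hop - direct - {start}


def bfs_distances(graph, source):
    """Scan-like traversal: visits every vertex reachable from `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        vertex = queue.popleft()
        for neighbour in graph.get(vertex, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[vertex] + 1
                queue.append(neighbour)
    return dist
```

The first pattern needs a handful of keyed lookups, which is what a random-access store is good at; the second touches most of the graph, which is closer to the bulk scans Spark is built for.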
On Tue, Apr 8, 2014 at 4:56 PM, Evan Chan wrote:
I doubt Titan would be able to give you traversal of billions of nodes in
real-time either. In-memory traversal is typically much faster than
Cassandra-based tree traversal, even including in-memory caching.
On Tue, Apr 8, 2014 at 1:23 PM, Nick Pentreath wrote:
GraphX, like Spark, will not typically be "real-time" (where by "real-time"
here I assume you mean on the order of tens to hundreds of milliseconds, up
to a few seconds).
Spark can in some cases approach the upper boundary of this definition (a
second or two, possibly less) when data is cached in memory and the