Hi,

(1): do we think this is interesting/useful and something we can help them
with?

-> I think this would be a very nice feature.

(2): do we think it makes sense to "host" the FlinkGraphComputer on the
Flink codebase?

-> We are currently discussing extracting connectors and dependencies on
external systems into a separate repository. Gremlin-Flink might fit in
there as well.

Cheers, Fabian

2015-12-16 20:21 GMT+01:00 James Thornton <james.thorn...@gmail.com>:

> Hello all -
>
> Yes, Cypher could run on the Gremlin Traversal Machine (GTM), and in some
> ways it already does.
>
> The GTM is like the JVM for graphs -- see this paper by Marko Rodriguez...
>
>    - "The Gremlin Graph Traversal Machine and Language"
>    http://arxiv.org/pdf/1508.03843v1.pdf
>
> And for a high-level overview of the GTM, see this blog post by Marko:
>
>    - "The Benefits of the Gremlin Graph Traversal Machine"
>    http://www.datastax.com/dev/blog/the-benefits-of-the-gremlin-graph-traversal-machine
>
> You can already run SPARQL on Gremlin:
> https://github.com/dkuppitz/sparql-gremlin
>
> And this week Ted Wilmes released a SQL-Gremlin compiler
> https://groups.google.com/d/topic/gremlin-users/npncDyVQJSU/discussion
>
> There is interest in GraphQL and Datalog on Gremlin so you'd get all of
> this for free via the Gremlin FlinkGraphComputer.
>
> - James
>
>
> On Wed, Dec 16, 2015 at 12:54 PM, Vasiliki Kalavri <
> vasilikikala...@gmail.com> wrote:
>
> > Hey,
> >
> > I think I might have confused you, so let me try to explain :)
> >
> > First, Gremlin is a language similar to Cypher, but it is also a traversal
> > machine, which also supports distributed traversals. For distributed
> > traversals, Gremlin uses a "graph computer", which runs the Gremlin
> > traversals using the BSP model. Essentially, vertices receive traversers as
> > messages and execute the traverser's step as the update function (for more
> > info see section 5 in [1]).
> >
> > Thus, Tinkerpop has a GiraphGraphComputer to run on top of Giraph, a
> > SparkGraphComputer to run on top of Spark, etc.
> >
> > The Tinkerpop community has offered to work on a FlinkGraphComputer, which,
> > similarly to the existing graph computers, will use one of the Flink/Gelly
> > iteration abstractions.
> >
> > Now, there are 2 questions for the Flink community:
> > (1): do we think this is interesting/useful and something we can help
> them
> > with?
> > (2): do we think it makes sense to "host" the FlinkGraphComputer on the
> > Flink codebase?
> >
> >
> > Neo4j/Cypher on Flink is a separate discussion in my opinion. As far as I
> > understand, Cypher could run on Gremlin, but there is no compiler for it
> > yet. I have been discussing with people from Neo4j and we have jointly
> > written a description for a thesis project regarding OpenCypher on Flink.
> > The idea is to collaboratively supervise/help the student(s). Of course, if
> > anyone else is interested in this (not necessarily a student) we can always
> > use more help, so just let me know!
> >
> > Thanks,
> > -Vasia.
> >
> > [1]: http://arxiv.org/pdf/1508.03843v1.pdf
> >
> >
> > On 16 December 2015 at 19:21, Stephan Ewen <se...@apache.org> wrote:
> >
> > > I am not very familiar with Gremlin, but I remember a brainstorming session
> > > with Martin Neumann on porting Cypher (the Neo4j query language) to Flink.
> > > We looked at Cypher queries for filtering and traversing the graph.
> > >
> > > It looked like it would work well. I remember we could even model
> > > recursive conditions on traversals pretty well with delta iterations.
> > >
> > > If Gremlin's use cases are anything like Cypher's, I could ping Martin and
> > > see if we can collect some of those ideas again.
> > >
> > > Stephan
> > >
> > >
> > >
> > > On Tue, Dec 15, 2015 at 5:35 PM, Vasiliki Kalavri <
> > > vasilikikala...@gmail.com
> > > > wrote:
> > >
> > > > Hi Dr. Fabian,
> > > >
> > > > thanks a lot for your answer!
> > > >
> > > >
> > > > On 15 December 2015 at 15:42, Fabian Hueske <fhue...@gmail.com>
> wrote:
> > > >
> > > > > Hi Vasia,
> > > > >
> > > > > I agree, Gremlin definitely looks like an interesting API for
> Flink.
> > > > > I'm not sure how it relates to Gelly. I guess Gelly would
> (initially)
> > > be
> > > > > more tightly integrated with the DataSet API whereas Gremlin would
> > be a
> > > > > connector for other languages. Any ideas on this?
> > > > >
> > > >
> > > > The idea is to provide a FlinkGraphComputer which will use Gelly's
> > > > iterations to compile the Gremlin query language to Flink.
> > > > In my previous email, I linked to our discussion over at the
> Tinkerpop
> > > > mailing list, where you can find more details on this. By adding the
> > > > FlinkGraphComputer, we basically get any graph query language that
> > > compiles
> > > > to the Gremlin VM for free.
> > > >
> > > >
> > > > >
> > > > > > Another question would be whether the connector should go into
> > > > > > Flink or Tinkerpop. For example, the Spark, Giraph, and Neo4J
> > > > > > connectors are all included in Tinkerpop.
> > > > > This should be discussed with the Tinkerpop community.
> > > > >
> > > > >
> > > > I'm copying from the Tinkerpop mailing list thread (link for full
> > thread
> > > in
> > > > my previous email):​
> > > >
> > > >
> > > > > *In the past, TinkerPop used to be a "dumping ground" for all
> > > > implementations, but we decided for TinkerPop3 that we would only
> have
> > > > "reference implementations" so users can play, system providers can
> > > learn,
> > > > and ultimately, system providers would provide TinkerPop support in
> > their
> > > > distribution. As such, we would like to have FlinkGraphComputer
> > > distributed
> > > > with Flink. If that sounds like something your project would be
> > > comfortable
> > > > with, I think we can provide a JIRA/PR for FlinkGraphComputer (as
> well
> > as
> > > > any necessary documentation). We can start with a JIRA ticket to get
> > > things
> > > > going. Thoughts?*
> > > >
> > > >
> > > > ​This is why I brought the conversation over here, so I hear the
> > opinions
> > > > of the Flink community on this :)​
> > > >
> > > >
> > > >
> > > > > Best, Fabian
> > > > >
> > > >
> > > >
> > > > -Vasia.​
> > > >
> > > >
> > > >
> > > > >
> > > > >
> > > > > 2015-12-14 18:33 GMT+01:00 Vasiliki Kalavri <
> > vasilikikala...@gmail.com
> > > >:
> > > > >
> > > > > > Ping squirrels! Any thoughts/opinions on this?
> > > > > >
> > > > > > On 9 December 2015 at 20:40, Vasiliki Kalavri <
> > > > vasilikikala...@gmail.com
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hello squirrels,
> > > > > > >
> > > > > > > I have been discussing with the Apache Tinkerpop [1] community
> > > > > regarding
> > > > > > > an integration with Flink/Gelly.
> > > > > > > You can read our discussion in [2].
> > > > > > >
> > > > > > > Tinkerpop has a graph traversal machine called Gremlin, which
> > > > supports
> > > > > > > many high-level graph processing languages and runs on top of
> > > > different
> > > > > > > systems (e.g. Giraph, Spark, Graph DBs). You can read more in
> > this
> > > > > great
> > > > > > > blog post [3].
> > > > > > >
> > > > > > > The idea is to provide a FlinkGraphComputer implementation,
> which
> > > > will
> > > > > > add
> > > > > > > Gremlin support to Flink.
> > > > > > >
> > > > > > > I believe Tinkerpop is a great project and I would love to see
> an
> > > > > > > integration with Gelly.
> > > > > > > Before we move forward, I would like your input!
> > > > > > > To me, it seems that this addition would nicely fit in
> > > flink-contrib,
> > > > > > > where we also have connectors to other projects.
> > > > > > > If you agree, I will go ahead and open a JIRA about it.
> > > > > > >
> > > > > > > Thank you!
> > > > > > > -Vasia.
> > > > > > >
> > > > > > > [1]: https://tinkerpop.incubator.apache.org/
> > > > > > > [2]:
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://mail-archives.apache.org/mod_mbox/incubator-tinkerpop-dev/201511.mbox/%3ccanva_a390l7g169r8sn+ej1-yfkbudlnd4td6atwnp0uza-...@mail.gmail.com%3E
> > > > > > > [3]:
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://www.datastax.com/dev/blog/the-benefits-of-the-gremlin-graph-traversal-machine
> > > > > > >
> > > > > > > On 25 November 2015 at 16:54, Vasiliki Kalavri <
> > > > > > vasilikikala...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > >> Hi James,
> > > > > > >>
> > > > > > >> I've just subscribed to the Tinkerpop dev mailing list. Could
> > you
> > > > > please
> > > > > > >> send a reply to the thread, so then I can reply to it?
> > > > > > >> I'm not sure how I can reply to the thread otherwise...
> > > > > > >> I also saw that there is a grafos.ml project thread. I could
> > also
> > > > > > >> provide some input there :)
> > > > > > >>
> > > > > > >> Thanks!
> > > > > > >> -Vasia.
> > > > > > >>
> > > > > > >> On 25 November 2015 at 15:09, James Thornton <
> > > > > james.thorn...@gmail.com>
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >>> Hi Vasia -
> > > > > > >>>
> > > > > > >>> Yes, a FlinkGraphComputer should be a straight-forward first
> > > step.
> > > > > > Also,
> > > > > > >>> on
> > > > > > >>> the Apache Tinkerpop dev mailing list, Marko thought it might
> > be
> > > > cool
> > > > > > if
> > > > > > >>> there was a "Graph API" similar to the "Table API" -- hooking
> > in
> > > > > > Gremlin
> > > > > > >>> to
> > > > > > >>> Flink's fluent API would give Flink users a full graph query
> > > > > language.
> > > > > > >>>
> > > > > > >>> Stephen Mallette is a TinkerPop core contributor, and he has
> > > > already
> > > > > > >>> started working on a FlinkGraphComputer. There is a
> > > Flink/Tinkerpop
> > > > > > >>> thread
> > > > > > >>> on the TinkerPop dev list -- it would be great to have you
> part
> > > of
> > > > > the
> > > > > > >>> conversation there too as we work on the integration:
> > > > > > >>>
> > > > > > >>>
> > > > http://mail-archives.apache.org/mod_mbox/incubator-tinkerpop-dev/
> > > > > > >>>
> > > > > > >>> Thanks, Vasia.
> > > > > > >>>
> > > > > > >>> - James
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> On Mon, Nov 23, 2015 at 10:28 AM, Vasiliki Kalavri <
> > > > > > >>> vasilikikala...@gmail.com> wrote:
> > > > > > >>>
> > > > > > >>> > Hi James,
> > > > > > >>> >
> > > > > > >>> > thank you for your e-mail and your interest in Flink :)
> > > > > > >>> >
> > > > > > >>> > I've recently taken a _quick_ look into Apache TinkerPop
> and
> > I
> > > > > think
> > > > > > >>> it'd
> > > > > > >>> > be very interesting to integrate with Flink/Gelly.
> > > > > > >>> > Are you thinking about something like a Flink
> GraphComputer,
> > > > > similar
> > > > > > to
> > > > > > >>> > Giraph and Spark GraphComputer's?
> > > > > > >>> > I believe such an integration should be straight-forward to
> > > > > > implement.
> > > > > > >>> You
> > > > > > >>> > can start by looking into Flink iteration operators [1] and
> > > Gelly
> > > > > > >>> iteration
> > > > > > >>> > abstractions [2].
> > > > > > >>> >
> > > > > > >>> > Regarding Apache Geode, I'm not familiar with the project, but
> > > > > > >>> > I'll try to take a look in the following days!
> > > > > > >>> >
> > > > > > >>> > Cheers,
> > > > > > >>> > -Vasia.
> > > > > > >>> >
> > > > > > >>> >
> > > > > > >>> > [1]:
> > > > > > >>> > https://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#iteration-operators
> > > > > > >>> > [2]:
> > > > > > >>> > https://ci.apache.org/projects/flink/flink-docs-master/libs/gelly_guide.html#iterative-graph-processing
> > > > > > >>> >
> > > > > > >>> >
> > > > > > >>> > On 20 November 2015 at 08:32, James Thornton <
> > > > > > james.thorn...@gmail.com
> > > > > > >>> >
> > > > > > >>> > wrote:
> > > > > > >>> >
> > > > > > >>> > > Hi -
> > > > > > >>> > >
> > > > > > >>> > > This is James Thornton (espeed) from the Apache Tinkerpop
> > > > > project (
> > > > > > >>> > > http://tinkerpop.incubator.apache.org/).
> > > > > > >>> > >
> > > > > > >>> > > The Flink iterators should pair well with Gremlin's Graph
> > > > > > >>> > > Traversal Machine (
> > > > > > >>> > > http://www.datastax.com/dev/blog/the-benefits-of-the-gremlin-graph-traversal-machine
> > > > > > >>> > > )
> > > > > > >>> > > -- it would be good to coordinate on creating an
> > integration.
> > > > > > >>> > >
> > > > > > >>> > > Also, Apache Geode made a splash today on HN (
> > > > > > >>> > > https://news.ycombinator.com/item?id=10596859) --
> > connecting
> > > > > Geode
> > > > > > >>> and
> > > > > > >>> > > Flink would be killer. Here's the Geode/Spark connector for
> > > > > > >>> > > reference:
> > > > > > >>> > >
> > > > > > >>> > > https://github.com/apache/incubator-geode/tree/develop/gemfire-spark-connector
> > > > > > >>> > >
> > > > > > >>> > > - James
> > > > > > >>> > >
