Hello from the Apache Ignite community! Last year there was an interesting thread [1] about integrating Apache Ignite and Apache Flink. Unfortunately there has been little follow-through, so let's try to fix that in 2016 ;-)

I'm sure a lot has changed in the Flink community, with the recent graduation and 1.0 release, so I'd like to put together a new, updated list of the synergies and areas of integration I can think of:

+++ *Ignite as a bidirectional Connector* +++

The first and most obvious integration point is Ignite as a source and a sink for Flink. An Ignite contributor has already sent a pull request [2] to serve as a sink into Ignite Queues, but I feel this integration can be deeper and more functional. Moreover, it should be hosted in the Flink source tree as a Connector (like the Kafka or ES connectors).

In particular, we could offer these features:

* As a Flink sink => inject data directly into a cache via a DataStreamer.
* As a Flink source => run a continuous query against one or multiple caches [4].

+++ *Ignite as a state backend* +++

Either natively [5] or via the IGFS (Ignite Filesystem) interface, which can run as a Hadoop Filesystem [6]. This would allow Flink to store intermediate state in Ignite. I believe this is what you called "distributed backup for Streaming Operator State" in the initial exchange, is that right?

+++ *Ignite as a DataSet API connector* +++

The ability to use Ignite as a source for batch pipelines, by executing Ignite SQL queries [7] against a cache and feeding the results into a Flink pipeline. Basically, a batch counterpart to the streaming continuous query idea above.

+++ *Ignite as an execution backend* +++

You already mentioned this in [1], and I think it makes for a perfect synergy between both projects, through Ignite's Compute API. Do you still agree with this? Are there any changes since last year that I should take into account?

+++ *Ignite as a parameter server* +++

This was in the initial proposal [1], but it's not clear to me. I have found references to the idea of a Parameter Server in Flink, but only as proposed ideas. Was this feature finally implemented, or is it on the future roadmap?

~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is just a newer, updated proposal from my side, but I'm sure that both communities can, and will want to, chime in!

Cheers,

[1] https://mail-archives.apache.org/mod_mbox/flink-dev/201504.mbox/%3CCANC1h_u__KgsdOo2SZ4M=8jf3zomozs3xbekq0erjj9p4wf...@mail.gmail.com%3E
[2] https://issues.apache.org/jira/browse/IGNITE-813
[3] https://ignite.apache.org/features/streaming.html
[4] http://apacheignite.gridgain.org/v1.5/docs/continuous-queries
[5] https://apacheignite-fs.readme.io/docs/igfs
[6] https://apacheignite-fs.readme.io/docs/file-system
[7] https://apacheignite.readme.io/docs/sql-queries

*Raúl Kripalani*
PMC & Committer @ Apache Ignite, Apache Camel | Integration, Big Data and Messaging Engineer
http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
Blog: raul.io | twitter: @raulvk <https://twitter.com/raulvk>