Hi Dipanjan,
Gelly is built on top of the DataSet API, a batch-only API that is
slowly being phased out.
It is not possible to connect a DataStream API program with a DataSet
API program unless you go through a connector such as CSV in between.
On 10.09.21 09:09, Dipanjan Mazumder wrote:
Hi Jing,
Thanks for the input. Another question I had: can Gelly be used to
process a graph that Flink receives through Kafka? Using Gelly, I
would decompose the graph into its nodes and edges, process them
individually through substreams, and then write the final output of
processing the graph somewhere.
I saw that Gelly is for batch processing, but if it supports the
above, it will solve my entire use case.
On Friday, September 10, 2021, 09:50:08 AM GMT+5:30, JING ZHANG
<beyond1...@gmail.com> wrote:
Hi Dipanjan,
Based on your description, I think Flink could handle this use case.
Don't worry about whether Flink can handle this scale of data; Flink
is a distributed engine. As long as data skew is carefully avoided,
the input throughput can be handled by provisioning appropriate
resources.
On Friday, September 10, 2021 at 11:11 AM, Dipanjan Mazumder
<java...@yahoo.com> wrote:
I am working on a use case and am thinking of using Flink for it.
The use case is that I will have many large resource graphs. I need
to parse each graph into its nodes and edges and evaluate each one
of them against some Siddhi rules. The implementation for
evaluating individual entities with Flink and Siddhi is already in
place, but I am in a dilemma about whether I should do the graph
processing in Flink as well.
So this is what I am planning to do:
From Kafka I will fetch the graph, decompose it into nodes and
edges, fetch additional metadata for each node and edge from
different REST APIs, and then pass the individual nodes and edges,
which are resources, to different substreams that are already in
place. Rules will work on the individual substreams to process
individual nodes and edges, and finally they will emit the rule
output into a stream. I will collate all of it based on the graph
ID from that stream using another operator and send the final result
to an output stream.
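A rough sketch of the decompose-and-collate part of this plan, in
plain Java rather than Flink code. All names here (GraphCollateSketch,
Element, RuleResult, Collator, evaluate) are hypothetical, the rule
evaluation is a stand-in for the Siddhi rules, and the Kafka and REST
steps are left out. It assumes each element carries the total element
count of its graph, which is one possible way for the collating
operator to know when a graph is complete:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class GraphCollateSketch {

    /** One node or edge, tagged with its graph ID and the graph's total element count. */
    static final class Element {
        final String graphId;
        final String elementId;
        final int totalInGraph;

        Element(String graphId, String elementId, int totalInGraph) {
            this.graphId = graphId;
            this.elementId = elementId;
            this.totalInGraph = totalInGraph;
        }
    }

    /** Hypothetical rule output for a single element. */
    static final class RuleResult {
        final String graphId;
        final String elementId;
        final int totalInGraph;
        final boolean passed;

        RuleResult(Element source, boolean passed) {
            this.graphId = source.graphId;
            this.elementId = source.elementId;
            this.totalInGraph = source.totalInGraph;
            this.passed = passed;
        }
    }

    /** Decompose: emit one Element per node and per edge, each carrying the expected total. */
    static List<Element> decompose(String graphId, List<String> nodes, List<String> edges) {
        int total = nodes.size() + edges.size();
        List<Element> out = new ArrayList<>();
        for (String n : nodes) out.add(new Element(graphId, n, total));
        for (String e : edges) out.add(new Element(graphId, e, total));
        return out;
    }

    /** Stand-in for the rule evaluation: every element passes. */
    static RuleResult evaluate(Element element) {
        return new RuleResult(element, true);
    }

    /** Buffers results per graph ID and releases them once all have arrived. */
    static final class Collator {
        private final Map<String, List<RuleResult>> pending = new HashMap<>();

        /** Returns the full result list when the last element of a graph arrives. */
        Optional<List<RuleResult>> add(RuleResult result) {
            List<RuleResult> buffer =
                    pending.computeIfAbsent(result.graphId, k -> new ArrayList<>());
            buffer.add(result);
            if (buffer.size() == result.totalInGraph) {
                return Optional.of(pending.remove(result.graphId));
            }
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        List<Element> elements = decompose("g1", List.of("n1", "n2"), List.of("n1->n2"));
        Collator collator = new Collator();
        for (Element element : elements) {
            collator.add(evaluate(element)).ifPresent(results ->
                    System.out.println(element.graphId + " complete with "
                            + results.size() + " results"));
        }
    }
}
```

In an actual Flink job the buffer would presumably live in keyed
state, keyed by graph ID, rather than in an in-memory map, and a
timeout would be needed for graphs whose elements never all arrive.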
This is what I am thinking. I now need input from all of you on
whether this is a fair use case for Flink, and whether Flink will be
able to handle this level of processing at scale and volume.
Any input will ease my understanding and help me go ahead with this
idea.