Hi Dipanjan,

Gelly is built on top of the DataSet API, which is a batch-only API that is slowly being phased out.

It is not possible to connect a DataStream API program with a DataSet API program unless you go through a connector such as CSV in between.
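
For example, a minimal sketch of such a file-based hand-off (the paths and the tuple layout are just placeholders): the DataSet job writes its result as CSV, and a separate DataStream job reads that path.

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CsvHandOff {
    public static void main(String[] args) throws Exception {
        // Batch (DataSet) side, e.g. the result of a Gelly computation.
        ExecutionEnvironment batchEnv = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Tuple2<Long, Long>> edges =
                batchEnv.fromElements(Tuple2.of(1L, 2L), Tuple2.of(2L, 3L));
        edges.writeAsCsv("file:///tmp/gelly-out");
        batchEnv.execute("batch side");

        // Streaming (DataStream) side; in practice this would be a separate
        // program that starts after the batch job has finished.
        StreamExecutionEnvironment streamEnv =
                StreamExecutionEnvironment.getExecutionEnvironment();
        streamEnv.readTextFile("file:///tmp/gelly-out").print();
        streamEnv.execute("stream side");
    }
}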

Regards,
Timo


On 10.09.21 09:09, Dipanjan Mazumder wrote:
Hi Jing,

    Thanks for the input. Another question I had: can Gelly be used to process the graphs that Flink receives through Kafka? I would use Gelly to decompose each graph into its nodes and edges, process them individually through substreams, and then write the final output of the graph processing somewhere.

I saw that Gelly is for batch processing, but if it supports the above, it will solve my entire use case.

Regards
Dipanjan

On Friday, September 10, 2021, 09:50:08 AM GMT+5:30, JING ZHANG <beyond1...@gmail.com> wrote:


Hi Dipanjan,
Based on your description, I think Flink can handle this use case.
Don't worry that Flink can't handle data at this scale: Flink is a distributed engine, and as long as data skew is carefully avoided, the input throughput can be handled with appropriate resources.
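
For the skew point, one common pattern is a two-stage (salted) aggregation; a rough sketch, assuming an existing stream of (key, count) pairs (the field names and the bucket count of 8 are made up):

import java.util.concurrent.ThreadLocalRandom;

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

// events is an existing DataStream<Tuple2<String, Long>> of (key, 1L) pairs.
DataStream<Tuple2<String, Long>> counts = events
    // Stage 1: salt the key so one hot key is spread over 8 buckets.
    .map(t -> Tuple2.of(t.f0 + "#" + ThreadLocalRandom.current().nextInt(8), t.f1))
    .returns(Types.TUPLE(Types.STRING, Types.LONG))
    .keyBy(t -> t.f0)
    .window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
    .reduce((a, b) -> Tuple2.of(a.f0, a.f1 + b.f1))
    // Stage 2: strip the salt and combine the partial counts per real key.
    .map(t -> Tuple2.of(t.f0.substring(0, t.f0.lastIndexOf('#')), t.f1))
    .returns(Types.TUPLE(Types.STRING, Types.LONG))
    .keyBy(t -> t.f0)
    .window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
    .reduce((a, b) -> Tuple2.of(a.f0, a.f1 + b.f1));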

Best,
JING ZHANG

Dipanjan Mazumder <java...@yahoo.com> wrote on Friday, September 10, 2021 at 11:11 AM:

    Hi,

    I am working on a use case and am thinking of using Flink for it.
    I will be receiving many large resource graphs; I need to parse each
    graph for its nodes and edges and evaluate each one of them against
    some Siddhi rules. The implementation for evaluating individual
    entities with Flink and Siddhi is already in place, but I am in a
    dilemma about whether I should do the graph processing in Flink as
    well.
    So this is what I am planning to do:

    From Kafka I will fetch the graph, decompose it into nodes and
    edges, and fetch additional metadata for each node and edge from
    different REST APIs. I will then pass the individual nodes and edges
    (which are resources) to the different substreams that are already in
    place; rules will run on the individual substreams to process the
    nodes and edges, and finally they will emit the rule output into a
    stream. From that stream I will collate all of the results by graph
    id using another operator and send the final result to an output
    stream. A rough DataStream sketch of this flow follows below.
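
    Here is that sketch (the toy graph format, the placeholder rule step,
    and the session-window collation are assumptions, not working logic):

    import org.apache.flink.api.common.functions.FlatMapFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.windowing.assigners.ProcessingTimeSessionWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.util.Collector;

    public class GraphPipelineSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            // Stand-in for the Kafka source; each line is a serialized
            // graph of the toy form "graphId:n1,n2|e1,e2".
            DataStream<String> graphs = env.socketTextStream("localhost", 9999);

            // Decompose each graph into nodes and edges, each tagged with
            // the id of the graph it belongs to.
            DataStream<Tuple2<String, String>> elements = graphs.flatMap(
                new FlatMapFunction<String, Tuple2<String, String>>() {
                    @Override
                    public void flatMap(String graph,
                                        Collector<Tuple2<String, String>> out) {
                        String[] idAndBody = graph.split(":", 2);
                        for (String element : idAndBody[1].split("[,|]")) {
                            out.collect(Tuple2.of(idAndBody[0], element));
                        }
                    }
                });

            // The per-element REST enrichment and Siddhi rule evaluation
            // would slot in here (e.g. async I/O plus a ProcessFunction);
            // the identity assignment below is only a placeholder.
            DataStream<Tuple2<String, String>> ruleOutput = elements;

            // Collate all rule results that share a graph id and emit one
            // combined record per graph to the output stream.
            ruleOutput.keyBy(t -> t.f0)
                      .window(ProcessingTimeSessionWindows.withGap(Time.seconds(30)))
                      .reduce((a, b) -> Tuple2.of(a.f0, a.f1 + ";" + b.f1))
                      .print();

            env.execute("graph decomposition sketch");
        }
    }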

    This is what I am thinking. Now I need input from all of you on
    whether this is a fair use case for Flink, and whether Flink will be
    able to handle this level of processing at scale and volume.

    Any input will ease my understanding and help me go ahead with this
    idea.

    Regards,
    Dipanjan

