Is writing DataStream2 to a Kafka topic and reading it from the other job an option?
2016-09-07 19:07 GMT+02:00 Chakravarthy varaga <chakravarth...@gmail.com>:

> Hi Fabian,
>
> Thanks for your response. These DataStreams (Job1-DataStream1 and
> Job2-DataStream2) belong to different Flink applications running within
> the same cluster. DataStream2 (from Job2) applies transformations and
> updates a 'cache' that Job1 needs to work on. Our intention is to avoid
> an external key/value store and keep the cache local to the cluster.
> Is there a way to do this?
>
> Best Regards
> CVP
>
> On Wed, Sep 7, 2016 at 5:00 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>
>> Hi,
>>
>> Flink does not provide shared state. However, you can broadcast a
>> stream to a CoFlatMapFunction, so that each operator instance has its
>> own local copy of the state.
>>
>> If that does not work for you because the state is too large, and if
>> it is possible to partition the state (and both streams), you can use
>> keyBy instead of broadcast.
>>
>> Finally, you can use an external system such as a key/value store or
>> an in-memory store like Apache Ignite to hold your distributed
>> collection.
>>
>> Best, Fabian
>>
>> 2016-09-07 17:49 GMT+02:00 Chakravarthy varaga <chakravarth...@gmail.com>:
>>
>>> Hi Team,
>>>
>>> Can someone help me here? Appreciate any response!
>>>
>>> Best Regards
>>> Varaga
>>>
>>> On Mon, Sep 5, 2016 at 4:51 PM, Chakravarthy varaga <
>>> chakravarth...@gmail.com> wrote:
>>>
>>>> Hi Team,
>>>>
>>>> I'm working on a Flink streaming application. The data is ingested
>>>> through Kafka connectors at roughly 100K events/sec; each event
>>>> payload is a string. Let's call this DataStream1. The application
>>>> also uses a second stream, DataStream2, which consumes events from
>>>> another Kafka topic. The elements of DataStream2 go through a
>>>> transformation that updates a HashMap (a java.util collection).
>>>> This HashMap needs to be shared across the Flink cluster so that
>>>> the DataStream1 application can check the state of the values in
>>>> it. Is there a way to do this in Flink?
>>>>
>>>> I don't see any shared collection available within the cluster.
>>>>
>>>> Best Regards
>>>> CVP
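For readers following this thread: the broadcast/CoFlatMapFunction pattern Fabian describes (within a single job) can be modeled in plain Java to show why no cluster-wide shared collection is needed. The sketch below is a conceptual model, not real Flink API code: each "parallel instance" keeps its own copy of the map, and because the control stream is broadcast, every instance receives every update. The class and method names (`BroadcastCacheModel`, `onCacheUpdate`, `onEvent`) are hypothetical, chosen for illustration; in Flink itself they would correspond to the `flatMap2` and `flatMap1` callbacks of a `CoFlatMapFunction` applied to `stream1.connect(stream2.broadcast())`.

```java
import java.util.HashMap;
import java.util.Map;

// Conceptual, non-Flink model of the broadcast + CoFlatMapFunction
// pattern: each parallel operator instance holds its OWN local copy
// of the cache. Broadcasting the update stream means every instance
// sees every update, so no shared cluster-wide collection is needed.
public class BroadcastCacheModel {

    // Local state of one parallel operator instance.
    private final Map<String, String> cache = new HashMap<>();

    // Analogue of flatMap2: an element from the broadcast (control)
    // stream, e.g. DataStream2, updates the local copy of the cache.
    public void onCacheUpdate(String key, String value) {
        cache.put(key, value);
    }

    // Analogue of flatMap1: an element from the main event stream,
    // e.g. DataStream1, is enriched by a lookup in the local cache.
    public String onEvent(String eventKey) {
        return cache.getOrDefault(eventKey, "unknown");
    }

    public static void main(String[] args) {
        // Two "parallel instances"; the broadcast update reaches both,
        // so lookups are consistent without any shared state.
        BroadcastCacheModel instance1 = new BroadcastCacheModel();
        BroadcastCacheModel instance2 = new BroadcastCacheModel();
        instance1.onCacheUpdate("key-42", "ONLINE");
        instance2.onCacheUpdate("key-42", "ONLINE");
        System.out.println(instance1.onEvent("key-42"));
        System.out.println(instance2.onEvent("key-42"));
        System.out.println(instance1.onEvent("key-99"));
    }
}
```

Note that this only works within one job; across two separate jobs (as in this thread), the state still has to travel through some medium, such as the Kafka topic Fabian suggests above.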