Hi,

Once you use Redis, why would you still need to worry about eliminating duplicate data? If you write with the same key, Redis overwrites the existing entry, so each entry is stored only once.
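
To make the idea concrete, here is a minimal sketch in Python. It rests on some assumptions: `InMemoryKeyValueStore` is a hypothetical stand-in for a Redis-like store (it is not the Redis client API), and `enrich` and `is_known` are made-up names for the two operators that need the same data. It only shows that writing the same key twice leaves one entry, and that both operators can read from that single copy.

```python
# Sketch only: InMemoryKeyValueStore is a hypothetical stand-in for a
# Redis-like external store. It is not the Redis client API.
class InMemoryKeyValueStore:
    def __init__(self):
        self._data = {}

    def set(self, key, value):
        # Writing an existing key overwrites it, so no duplicate is created.
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def size(self):
        return len(self._data)


def load_reference_data(store, entries):
    """Write each (key, value) pair; repeating the load adds no entries."""
    for key, value in entries:
        store.set(key, value)


def enrich(store, event_key):
    """Hypothetical first operator: attaches the looked-up value."""
    return (event_key, store.get(event_key))


def is_known(store, event_key):
    """Hypothetical second operator: filters on the same shared data."""
    return store.get(event_key) is not None


store = InMemoryKeyValueStore()
entries = [("device-1", "profile-A"), ("device-2", "profile-B")]

# The same data is loaded twice, e.g. by two parallel tasks.
load_reference_data(store, entries)
load_reference_data(store, entries)

enriched = enrich(store, "device-1")
known = is_known(store, "device-3")
entry_count = store.size()
```

With a real external store, every `get` would be a network round trip, so a small local cache in front of it (like the lazy-loading cache mentioned further down the thread) is probably still worth keeping.
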

Best,
Congxian

On Wed, Oct 2, 2019 at 5:30 PM Fabian Hueske <fhue...@gmail.com> wrote:

> Hi,
>
> State is always associated with a single task in Flink.
> The state of a task cannot be accessed by other tasks of the same operator
> or by tasks of other operators.
> This is true for every type of state, including broadcast state.
>
> Best, Fabian
>
> On Tue, Oct 1, 2019 at 8:22 AM Navneeth Krishnan
> <reachnavnee...@gmail.com> wrote:
>
>> Hi,
>>
>> I can use Redis, but I'm still having a hard time figuring out how I can
>> eliminate duplicate data. Today, without broadcast state in 1.4, I'm using
>> a cache to lazy-load the data. I thought broadcast state would be similar
>> to Kafka Streams, where I have read access to the state across the
>> pipeline. That would indeed solve a lot of problems. Is there some way I
>> can do the same with Flink?
>>
>> Thanks!
>>
>> On Mon, Sep 30, 2019 at 10:36 PM Congxian Qiu <qcx978132...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Could you use a cache system such as HBase or Redis to store this
>>> data, and query the cache when needed?
>>>
>>> Best,
>>> Congxian
>>>
>>> On Tue, Oct 1, 2019 at 10:15 AM Navneeth Krishnan
>>> <reachnavnee...@gmail.com> wrote:
>>>
>>>> Thanks Oytun. The problem with doing that is that the same data will
>>>> have to be stored multiple times, wasting memory. In my case there will
>>>> be around a million entries that need to be used by at least two
>>>> operators for now.
>>>>
>>>> Thanks
>>>>
>>>> On Mon, Sep 30, 2019 at 5:42 PM Oytun Tez <oy...@motaword.com> wrote:
>>>>
>>>>> This is how we currently use broadcast state. Our states are re-usable
>>>>> (code-wise): every operator that wants to consume the data basically
>>>>> keeps the same descriptor state locally by processBroadcastElement'ing
>>>>> into a local state.
>>>>>
>>>>> I am open to suggestions. Is this a hard drawback of dataflow
>>>>> programming, or of the Flink framework?
>>>>>
>>>>> ---
>>>>> Oytun Tez
>>>>>
>>>>> *M O T A W O R D*
>>>>> The World's Fastest Human Translation Platform.
>>>>> oy...@motaword.com - www.motaword.com
>>>>>
>>>>> On Mon, Sep 30, 2019 at 8:40 PM Oytun Tez <oy...@motaword.com> wrote:
>>>>>
>>>>>> You can re-use the broadcast state (along with its descriptor) that
>>>>>> comes into your KeyedBroadcastProcessFunction in another operator
>>>>>> downstream. That basically duplicates the broadcast state in
>>>>>> whichever operator you want to use it, every time.
>>>>>>
>>>>>> ---
>>>>>> Oytun Tez
>>>>>>
>>>>>> On Mon, Sep 30, 2019 at 8:29 PM Navneeth Krishnan
>>>>>> <reachnavnee...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> Is it possible to access a broadcast state across the pipeline? For
>>>>>>> example, say I have a KeyedBroadcastProcessFunction that adds the
>>>>>>> incoming data to state, and I have a downstream operator where I need
>>>>>>> the same state as well. Would I be able to just read the broadcast
>>>>>>> state with a read-only view? I know this is possible in Kafka Streams.
>>>>>>>
>>>>>>> Thanks