By using Redis, you can store all data in one job in one single Redis, no need one slot one Redis, what do you think?
Best, Congxian Navneeth Krishnan <reachnavnee...@gmail.com> 于2019年10月18日周五 上午4:47写道: > Ya, there will not be a problem of duplicates. But what I'm trying to > achieve is if there a large static state which needs to be present just one > per node rather than storing it per slot that would be ideal. The reason > being is that the state is quite large around 100GB of mostly static data > and it is not needed at per slot level. It can be at per instance level > where each slot can read from this shared memory. > > Thanks > > On Wed, Oct 9, 2019 at 12:13 AM Congxian Qiu <qcx978132...@gmail.com> > wrote: > >> Hi, >> >> After using Redis, why there need to care about eliminate duplicated >> data, if you specify the same key, then Redis will do the deduplicate >> things. >> >> Best, >> Congxian >> >> >> Fabian Hueske <fhue...@gmail.com> 于2019年10月2日周三 下午5:30写道: >> >>> Hi, >>> >>> State is always associated with a single task in Flink. >>> The state of a task cannot be accessed by other tasks of the same >>> operator or tasks of other operators. >>> This is true for every type of state, including broadcast state. >>> >>> Best, Fabian >>> >>> >>> Am Di., 1. Okt. 2019 um 08:22 Uhr schrieb Navneeth Krishnan < >>> reachnavnee...@gmail.com>: >>> >>>> Hi, >>>> >>>> I can use redis but I’m still having hard time figuring out how I can >>>> eliminate duplicate data. Today without broadcast state in 1.4 I’m using >>>> cache to lazy load the data. I thought the broadcast state will be similar >>>> to that of kafka streams where I have read access to the state across the >>>> pipeline. That will indeed solve a lot of problems. Is there some way I can >>>> do the same with flink? >>>> >>>> Thanks! >>>> >>>> On Mon, Sep 30, 2019 at 10:36 PM Congxian Qiu <qcx978132...@gmail.com> >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> Could you use some cache system such as HBase or Reids to storage this >>>>> data, and query from the cache if needed? >>>>> >>>>> Best, >>>>> Congxian >>>>> >>>>> >>>>> Navneeth Krishnan <reachnavnee...@gmail.com> 于2019年10月1日周二 上午10:15写道: >>>>> >>>>>> Thanks Oytun. The problem with doing that is the same data will be >>>>>> have to be stored multiple times wasting memory. In my case there will >>>>>> around million entries which needs to be used by at least two operators >>>>>> for >>>>>> now. >>>>>> >>>>>> Thanks >>>>>> >>>>>> On Mon, Sep 30, 2019 at 5:42 PM Oytun Tez <oy...@motaword.com> wrote: >>>>>> >>>>>>> This is how we currently use broadcast state. Our states are >>>>>>> re-usable (code-wise), every operator that wants to consume basically >>>>>>> keeps >>>>>>> the same descriptor state locally by processBroadcastElement'ing into a >>>>>>> local state. >>>>>>> >>>>>>> I am open to suggestions. I see this as a hard drawback of dataflow >>>>>>> programming or Flink framework? >>>>>>> >>>>>>> >>>>>>> >>>>>>> --- >>>>>>> Oytun Tez >>>>>>> >>>>>>> *M O T A W O R D* >>>>>>> The World's Fastest Human Translation Platform. >>>>>>> oy...@motaword.com — www.motaword.com >>>>>>> >>>>>>> >>>>>>> On Mon, Sep 30, 2019 at 8:40 PM Oytun Tez <oy...@motaword.com> >>>>>>> wrote: >>>>>>> >>>>>>>> You can re-use the broadcasted state (along with its descriptor) >>>>>>>> that comes into your KeyedBroadcastProcessFunction, in another operator >>>>>>>> downstream. that's basically duplicating the broadcasted state >>>>>>>> whichever >>>>>>>> operator you want to use, every time. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> --- >>>>>>>> Oytun Tez >>>>>>>> >>>>>>>> *M O T A W O R D* >>>>>>>> The World's Fastest Human Translation Platform. >>>>>>>> oy...@motaword.com — www.motaword.com >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Sep 30, 2019 at 8:29 PM Navneeth Krishnan < >>>>>>>> reachnavnee...@gmail.com> wrote: >>>>>>>> >>>>>>>>> Hi All, >>>>>>>>> >>>>>>>>> Is it possible to access a broadcast state across the pipeline? >>>>>>>>> For example, say I have a KeyedBroadcastProcessFunction which adds the >>>>>>>>> incoming data to state and I have downstream operator where I need >>>>>>>>> the same >>>>>>>>> state as well, would I be able to just read the broadcast state with a >>>>>>>>> readonly view. I know this is possible in kafka streams. >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> >>>>>>>>