With Redis, you can keep all the data for a job in a single Redis instance;
there is no need for one Redis instance per slot. What do you think?
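
To make the "one copy per process instead of one per slot" idea concrete,
here is a minimal sketch. It assumes that all slots of a TaskManager run as
threads in the same JVM and load this class through the same class loader.
SharedLookup, SharedLookupDemo and the loader are hypothetical names for
illustration only, not Flink or Redis APIs; the loader stands in for
whatever fetches the data (for example, a read from Redis).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical holder: one lazily loaded, read-only map shared by all threads. */
final class SharedLookup {
    // One reference per JVM (more precisely, per class loader that loads this class).
    private static volatile Map<String, String> data;

    private SharedLookup() {}

    /** Returns the shared map; the loader runs at most once per class loader. */
    static Map<String, String> get(Supplier<Map<String, String>> loader) {
        Map<String, String> local = data;
        if (local == null) {
            synchronized (SharedLookup.class) {
                local = data;
                if (local == null) {
                    // Expose a read-only view so callers cannot modify the shared data.
                    local = Collections.unmodifiableMap(loader.get());
                    data = local;
                }
            }
        }
        return local;
    }
}

/** Simulates four slots (threads) reading from the same shared map. */
final class SharedLookupDemo {
    public static void main(String[] args) throws InterruptedException {
        AtomicInteger loads = new AtomicInteger();
        // Stand-in for an expensive fetch of the static data.
        Supplier<Map<String, String>> loader = () -> {
            loads.incrementAndGet();
            Map<String, String> m = new HashMap<>();
            m.put("user-1", "gold");
            return m;
        };

        Thread[] slots = new Thread[4];
        for (int i = 0; i < slots.length; i++) {
            slots[i] = new Thread(() -> SharedLookup.get(loader).get("user-1"));
            slots[i].start();
        }
        for (Thread t : slots) {
            t.join();
        }
        System.out.println("loads = " + loads.get()); // prints "loads = 1"
    }
}
```

In an operator, each parallel instance would call SharedLookup.get(...) from
its open() method and keep the returned reference. Whether the heap of a
single process can actually hold the data is a separate question that this
sketch does not address.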

Best,
Congxian


Navneeth Krishnan <reachnavnee...@gmail.com> wrote on Fri, Oct 18, 2019 at 4:47 AM:

> Yes, duplicates will not be a problem. What I'm trying to achieve is this:
> if there is a large static state, it would be ideal to keep it just once
> per node rather than once per slot. The reason is that the state is quite
> large, around 100 GB of mostly static data, and it is not needed at the
> slot level. It can live at the instance level, where each slot reads from
> this shared memory.
>
> Thanks
>
> On Wed, Oct 9, 2019 at 12:13 AM Congxian Qiu <qcx978132...@gmail.com>
> wrote:
>
>> Hi,
>>
>> If you use Redis, why do you need to worry about eliminating duplicate
>> data? If you write with the same key, Redis will deduplicate it for
>> you.
>>
>> Best,
>> Congxian
>>
>>
>> Fabian Hueske <fhue...@gmail.com> wrote on Wed, Oct 2, 2019 at 5:30 PM:
>>
>>> Hi,
>>>
>>> State is always associated with a single task in Flink.
>>> The state of a task cannot be accessed by other tasks of the same
>>> operator or tasks of other operators.
>>> This is true for every type of state, including broadcast state.
>>>
>>> Best, Fabian
>>>
>>>
>>> On Tue, Oct 1, 2019 at 08:22, Navneeth Krishnan <
>>> reachnavnee...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I can use Redis, but I'm still having a hard time figuring out how to
>>>> eliminate duplicate data. Today, without broadcast state in 1.4, I'm using
>>>> a cache to lazy-load the data. I thought broadcast state would be similar
>>>> to Kafka Streams, where I have read access to the state across the
>>>> pipeline. That would indeed solve a lot of problems. Is there some way I can
>>>> do the same with Flink?
>>>>
>>>> Thanks!
>>>>
>>>> On Mon, Sep 30, 2019 at 10:36 PM Congxian Qiu <qcx978132...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Could you use a cache system such as HBase or Redis to store this
>>>>> data, and query the cache when needed?
>>>>>
>>>>> Best,
>>>>> Congxian
>>>>>
>>>>>
>>>>> Navneeth Krishnan <reachnavnee...@gmail.com> wrote on Tue, Oct 1, 2019 at 10:15 AM:
>>>>>
>>>>>> Thanks Oytun. The problem with doing that is that the same data will
>>>>>> have to be stored multiple times, wasting memory. In my case there will
>>>>>> be around a million entries, which need to be used by at least two
>>>>>> operators for now.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On Mon, Sep 30, 2019 at 5:42 PM Oytun Tez <oy...@motaword.com> wrote:
>>>>>>
>>>>>>> This is how we currently use broadcast state. Our states are
>>>>>>> re-usable (code-wise): every operator that wants to consume the
>>>>>>> broadcast data keeps the same state descriptor and builds its own
>>>>>>> local copy in processBroadcastElement.
>>>>>>>
>>>>>>> I am open to suggestions. I see this as a hard drawback of dataflow
>>>>>>> programming, or perhaps of the Flink framework?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ---
>>>>>>> Oytun Tez
>>>>>>>
>>>>>>> *M O T A W O R D*
>>>>>>> The World's Fastest Human Translation Platform.
>>>>>>> oy...@motaword.com — www.motaword.com
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Sep 30, 2019 at 8:40 PM Oytun Tez <oy...@motaword.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> You can re-use the broadcast state (along with its descriptor)
>>>>>>>> that comes into your KeyedBroadcastProcessFunction in another operator
>>>>>>>> downstream. That basically duplicates the broadcast state in every
>>>>>>>> operator that uses it.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Sep 30, 2019 at 8:29 PM Navneeth Krishnan <
>>>>>>>> reachnavnee...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> Is it possible to access broadcast state across the pipeline?
>>>>>>>>> For example, say I have a KeyedBroadcastProcessFunction that adds the
>>>>>>>>> incoming data to state, and a downstream operator that needs the same
>>>>>>>>> state as well. Would I be able to just read the broadcast state through a
>>>>>>>>> read-only view? I know this is possible in Kafka Streams.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>
