Hi,

Once you use Redis, why would you still need to worry about eliminating duplicate data?
If you write with the same key, Redis overwrites the existing entry, so the deduplication is done for you.
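
To make the point concrete, here is a minimal sketch in plain Java. It is not Redis client code: a HashMap stands in for the Redis keyspace, and the names (SharedStoreSketch, put, get, size, the "device:42" key) are made up for illustration. It only shows the semantics I mean: a second write to the same key replaces the first, so the store holds one entry per key no matter how many operators write it.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedStoreSketch {
    // Stand-in for the Redis keyspace (assumption: plain string keys and values).
    private final Map<String, String> store = new HashMap<>();

    // Behaves like a Redis SET: writing an existing key replaces its value.
    public void put(String key, String value) {
        store.put(key, value);
    }

    // Behaves like a Redis GET: returns null when the key is absent.
    public String get(String key) {
        return store.get(key);
    }

    // Number of distinct keys currently stored.
    public int size() {
        return store.size();
    }

    public static void main(String[] args) {
        SharedStoreSketch redis = new SharedStoreSketch();
        redis.put("device:42", "v1"); // written by one operator
        redis.put("device:42", "v2"); // same key written again by another operator
        System.out.println("entries = " + redis.size());
        System.out.println("device:42 = " + redis.get("device:42"));
    }
}
```

Both operators would then query this one store instead of each keeping its own copy in memory.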

Best,
Congxian


Fabian Hueske <fhue...@gmail.com> wrote on Wed, Oct 2, 2019 at 5:30 PM:

> Hi,
>
> State is always associated with a single task in Flink.
> The state of a task cannot be accessed by other tasks of the same operator
> or tasks of other operators.
> This is true for every type of state, including broadcast state.
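
A small plain-Java model of what this means for broadcast state. It does not use the Flink API: TaskSketch, onBroadcastElement, broadcast and totalEntries are hypothetical names, and each TaskSketch instance stands in for one parallel task. Every task receives every broadcast element and stores it in its own private map, so N tasks hold N copies and no task can read another task's map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TaskLocalStateSketch {
    // Stand-in for one parallel task together with its private state.
    public static class TaskSketch {
        private final Map<String, String> localState = new HashMap<>();

        // Roughly the role of processBroadcastElement: apply the update locally.
        public void onBroadcastElement(String key, String value) {
            localState.put(key, value);
        }

        // A task can only look things up in its own copy.
        public String lookup(String key) {
            return localState.get(key);
        }

        public int entries() {
            return localState.size();
        }
    }

    // Deliver one broadcast element to every task, as a broadcast stream would.
    public static void broadcast(List<TaskSketch> tasks, String key, String value) {
        for (TaskSketch task : tasks) {
            task.onBroadcastElement(key, value);
        }
    }

    // Total number of entries held across all tasks (shows the duplication).
    public static int totalEntries(List<TaskSketch> tasks) {
        int total = 0;
        for (TaskSketch task : tasks) {
            total += task.entries();
        }
        return total;
    }

    public static void main(String[] args) {
        List<TaskSketch> tasks = new ArrayList<>();
        tasks.add(new TaskSketch());
        tasks.add(new TaskSketch());
        broadcast(tasks, "rule-1", "A");
        System.out.println("entries across tasks = " + totalEntries(tasks));
    }
}
```

With around a million entries and two operators, this model is why the memory cost grows with the number of tasks that need the data.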
>
> Best, Fabian
>
>
> On Tue, Oct 1, 2019 at 08:22, Navneeth Krishnan <
> reachnavnee...@gmail.com> wrote:
>
>> Hi,
>>
>> I can use Redis, but I’m still having a hard time figuring out how I can
>> eliminate duplicate data. Today, without broadcast state in 1.4, I’m using
>> a cache to lazy-load the data. I thought broadcast state would be similar
>> to Kafka Streams, where I have read access to the state across the
>> pipeline. That would solve a lot of problems. Is there some way I can
>> do the same with Flink?
>>
>> Thanks!
>>
>> On Mon, Sep 30, 2019 at 10:36 PM Congxian Qiu <qcx978132...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Could you use a cache system such as HBase or Redis to store this
>>> data, and query the cache when needed?
>>>
>>> Best,
>>> Congxian
>>>
>>>
>>> Navneeth Krishnan <reachnavnee...@gmail.com> wrote on Tue, Oct 1, 2019 at 10:15 AM:
>>>
>>>> Thanks Oytun. The problem with doing that is that the same data will have
>>>> to be stored multiple times, wasting memory. In my case there will be around
>>>> a million entries, which need to be used by at least two operators for now.
>>>>
>>>> Thanks
>>>>
>>>> On Mon, Sep 30, 2019 at 5:42 PM Oytun Tez <oy...@motaword.com> wrote:
>>>>
>>>>> This is how we currently use broadcast state. Our states are reusable
>>>>> (code-wise): every operator that wants to consume the state keeps the same
>>>>> state descriptor locally and fills its own local copy of the state in
>>>>> processBroadcastElement.
>>>>>
>>>>> I am open to suggestions. Is this a hard drawback of dataflow
>>>>> programming, or of the Flink framework?
>>>>>
>>>>>
>>>>>
>>>>> ---
>>>>> Oytun Tez
>>>>>
>>>>> *M O T A W O R D*
>>>>> The World's Fastest Human Translation Platform.
>>>>> oy...@motaword.com — www.motaword.com
>>>>>
>>>>>
>>>>> On Mon, Sep 30, 2019 at 8:40 PM Oytun Tez <oy...@motaword.com> wrote:
>>>>>
>>>>>> You can re-use the broadcasted state (along with its descriptor) that
>>>>>> comes into your KeyedBroadcastProcessFunction in another operator
>>>>>> downstream. That basically means duplicating the broadcasted state in
>>>>>> every operator where you want to use it.
>>>>>>
>>>>>>
>>>>>>
>>>>>> ---
>>>>>> Oytun Tez
>>>>>>
>>>>>> *M O T A W O R D*
>>>>>> The World's Fastest Human Translation Platform.
>>>>>> oy...@motaword.com — www.motaword.com
>>>>>>
>>>>>>
>>>>>> On Mon, Sep 30, 2019 at 8:29 PM Navneeth Krishnan <
>>>>>> reachnavnee...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> Is it possible to access a broadcast state across the pipeline? For
>>>>>>> example, say I have a KeyedBroadcastProcessFunction which adds the
>>>>>>> incoming data to state, and I have a downstream operator where I need
>>>>>>> the same state as well. Would I be able to just read the broadcast
>>>>>>> state with a read-only view? I know this is possible in Kafka Streams.
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>
