Thanks for the solution, Fab. My map would be quite large, so I wouldn't
want to replicate it in each operator. I will probably add a Redis cache
layer and use it in the streaming process. Do you foresee any problems
with that?

Thanks,
Sandeep
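
One way to picture the Redis idea is a small bounded cache inside each operator that falls back to the remote store on a miss, so the full map is never replicated. The following is only a sketch under assumptions: `CachedLookup` and `nameFor` are hypothetical names, the `remote` function stands in for whatever Redis client call would be used, and the class is not thread-safe (it assumes a single thread calls it per operator instance).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Bounded LRU cache in front of a slower remote look-up (hypothetical sketch). */
class CachedLookup {
    // Stand-in for the remote call, e.g. a Redis GET; not a real client API.
    private final Function<String, String> remote;
    private final Map<String, String> cache;

    CachedLookup(Function<String, String> remote, final int maxEntries) {
        this.remote = remote;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                // Evict the least recently used entry once the bound is exceeded.
                return size() > maxEntries;
            }
        };
    }

    /** Returns the name for an ID, asking the remote store only on a cache miss. */
    String nameFor(String id) {
        String name = cache.get(id);
        if (name == null) {
            name = remote.apply(id);
            if (name != null) {
                cache.put(id, name);
            }
        }
        return name;
    }
}
```

With this shape, each parallel task holds at most `maxEntries` pairs in memory, and the cost of a miss is one round trip to the external store.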

On Wed, Dec 21, 2016 at 9:52 AM, Fabian Hueske <fhue...@gmail.com> wrote:

> You could read the map from a file in the open method of a RichMapFunction.
> The open method is called before the first record is processed and can put
> data into the operator state.
>
> The downside of this approach is that the data is replicated in each
> operator, i.e., each operator holds a full copy of the map.
> On the other hand, you do not need to shuffle the data because each
> parallel task can do the look-up.
> If your <id, name> map is small, this would be the preferred approach.
>
> Best, Fabian
>
> 2016-12-21 18:46 GMT+01:00 Meghashyam Sandeep V <vr1meghash...@gmail.com>:
>
>> As a follow up question, can we populate the operator state from an
>> external source?
>>
>> My use case is as follows: I have a Flink streaming process with Kafka as
>> a source. The Kafka messages contain only IDs. My look-ups (<id,name>),
>> which form a static map, come from a different source. I would like to
>> use those look-ups while applying operators to the stream from Kafka.
>>
>> Thanks,
>> Sandeep
>>
>> On Wed, Dec 21, 2016 at 6:17 AM, Fabian Hueske <fhue...@gmail.com> wrote:
>>
>>> OK, I see. Yes, you can do that with Flink. It's actually a very common
>>> use case.
>>>
>>> You can store the names in operator state and Flink takes care of
>>> checkpointing the state and restoring it in case of a failure.
>>> In fact, the operator state is persisted in the state backends you
>>> mentioned before.
>>>
>>> Best, Fabian
>>>
>>> 2016-12-21 15:02 GMT+01:00 Meghashyam Sandeep V <vr1meghash...@gmail.com>:
>>>
>>>> Hi Fabian,
>>>>
>>>> I meant look-ups like IDs to names. For example, I have IDs coming
>>>> through the stream and I want to replace them with the corresponding
>>>> names stored in a cache or somewhere within Flink.
>>>>
>>>> Thanks,
>>>> Sandeep
>>>>
>>>> On Dec 21, 2016 12:35 AM, "Fabian Hueske" <fhue...@gmail.com> wrote:
>>>>
>>>>> Hi Sandeep,
>>>>>
>>>>> I'm sorry but I think I do not understand your question.
>>>>> What do you mean by static or dynamic look ups? Do you want to access
>>>>> an external data store and cache data?
>>>>>
>>>>> Can you give a bit more detail about your use case?
>>>>>
>>>>> Best, Fabian
>>>>>
>>>>> 2016-12-20 23:07 GMT+01:00 Meghashyam Sandeep V <vr1meghash...@gmail.com>:
>>>>>
>>>>>> Hi there,
>>>>>>
>>>>>> I know that there are various state backends to persist state. Is
>>>>>> there a similar way to persist static/dynamic look ups and use them while
>>>>>> streaming the data in Flink?
>>>>>>
>>>>>> Thanks,
>>>>>> Sandeep
>>>>>>
>>>>>>
>>>>>
>>>
>>
>
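
The approach Fabian describes in the thread (load the map once in the open method, then look up each record locally) can be sketched in plain Java. This is a stand-in, not Flink code: `IdToNameMapper` is a hypothetical class, the one-pair-per-line `id,name` input format is an assumption, and in a real job the class would extend `RichMapFunction` and read its source inside Flink's own `open` method.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

/** Plain-Java stand-in for a map function that loads a static look-up table once. */
class IdToNameMapper {
    private final Map<String, String> names = new HashMap<>();

    /** Called once before the first record; reads "id,name" lines (assumed format). */
    void open(Reader source) {
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                int comma = line.indexOf(',');
                if (comma > 0) {
                    // Text before the first comma is the ID, the rest is the name.
                    names.put(line.substring(0, comma).trim(),
                              line.substring(comma + 1).trim());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Called per record; IDs without an entry pass through unchanged. */
    String map(String id) {
        return names.getOrDefault(id, id);
    }
}
```

Because every parallel instance runs `open` itself, each one ends up with a full copy of the map, which is the replication trade-off mentioned in the thread.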
