Thanks for the solution, Fab. My map would be substantially large, so I wouldn't want to replicate it in each operator. I will probably add a Redis cache layer and use it in the streaming process. Do you foresee any problems with that?

Thanks,
Sandeep

On Wed, Dec 21, 2016 at 9:52 AM, Fabian Hueske <fhue...@gmail.com> wrote:

> You could read the map from a file in the open method of a RichMapFunction.
> The open method is called before the first record is processed and can put
> data into the operator state.
>
> The downside of this approach is that the data is replicated in each
> operator, i.e., each operator holds a full copy of the map.
> On the other hand, you do not need to shuffle the data because each
> parallel task can do the look-up.
> If your <id, name> map is small, this would be the preferred approach.
>
> Best, Fabian
>
> 2016-12-21 18:46 GMT+01:00 Meghashyam Sandeep V <vr1meghash...@gmail.com>:
>
>> As a follow up question, can we populate the operator state from an
>> external source?
>>
>> My use case is as follows: I have a flink streaming process with Kafka as
>> a source. I only have ids coming from kafka messages. My look ups
>> (<id,name>) which is a static map come from a different source. I would
>> like to use those lookups while applying operators on stream from Kafka.
>>
>> Thanks,
>> Sandeep
>>
>> On Wed, Dec 21, 2016 at 6:17 AM, Fabian Hueske <fhue...@gmail.com> wrote:
>>
>>> OK, I see. Yes, you can do that with Flink. It's actually a very common
>>> use case.
>>>
>>> You can store the names in operator state and Flink takes care of
>>> checkpointing the state and restoring it in case of a failure.
>>> In fact, the operator state is persisted in the state backends you
>>> mentioned before.
>>>
>>> Best, Fabian
>>>
>>> 2016-12-21 15:02 GMT+01:00 Meghashyam Sandeep V <vr1meghash...@gmail.com>:
>>>
>>>> Hi Fabian,
>>>>
>>>> I meant look ups like IDs to names. For example if I have IDs coming
>>>> through the stream and if I want to replace them with corresponding names
>>>> stored in cache or somewhere within flink.
>>>>
>>>> Thanks,
>>>> Sandeep
>>>>
>>>> On Dec 21, 2016 12:35 AM, "Fabian Hueske" <fhue...@gmail.com> wrote:
>>>>
>>>>> Hi Sandeep,
>>>>>
>>>>> I'm sorry but I think I do not understand your question.
>>>>> What do you mean by static or dynamic look ups? Do you want to access
>>>>> an external data store and cache data?
>>>>>
>>>>> Can you give a bit more detail about your use?
>>>>>
>>>>> Best, Fabian
>>>>>
>>>>> 2016-12-20 23:07 GMT+01:00 Meghashyam Sandeep V <vr1meghash...@gmail.com>:
>>>>>
>>>>>> Hi there,
>>>>>>
>>>>>> I know that there are various state backends to persist state. Is
>>>>>> there a similar way to persist static/dynamic look ups and use them while
>>>>>> streaming the data in Flink?
>>>>>>
>>>>>> Thanks,
>>>>>> Sandeep
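
For readers following the thread, the open-method approach Fabian describes can be sketched roughly as below. This is only an illustration under stated assumptions, not code from the thread: the class name IdToNameMapper, the "id,name" line format, and the file path argument are all hypothetical, and the class is plain Java standing in for a Flink RichMapFunction<String, String> so that it runs without Flink on the classpath. In a real job the class would extend RichMapFunction and the loading would happen in its open method.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for a Flink RichMapFunction<String, String> (assumption:
// the real class would extend RichMapFunction and override open and map).
class IdToNameMapper {
    private final String lookupFile;    // hypothetical path to a file of "id,name" lines
    private Map<String, String> names;  // loaded once per instance, i.e. one full copy per parallel task

    IdToNameMapper(String lookupFile) {
        this.lookupFile = lookupFile;
    }

    // Mirrors the open method: runs once, before the first record is processed.
    void open() {
        try {
            names = parse(Files.readAllLines(Paths.get(lookupFile)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Parses "id,name" lines into a map; lines without a comma are skipped.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> result = new HashMap<>();
        for (String line : lines) {
            int comma = line.indexOf(',');
            if (comma < 0) {
                continue;
            }
            result.put(line.substring(0, comma).trim(), line.substring(comma + 1).trim());
        }
        return result;
    }

    // Mirrors the map method: replaces an id with its name, passing unknown ids through unchanged.
    String map(String id) {
        return names.getOrDefault(id, id);
    }
}
```

Because every parallel instance loads the whole file, memory use grows with both the map size and the parallelism, which is the replication cost mentioned in the thread. For a map too large to replicate, the same open hook is where a client connection to an external store such as Redis would typically be created instead, at the cost of a network round trip per lookup.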