What if we copy the big data set to HDFS on start of the cluster (e.g. EMR if using
AWS) and then use that to build distributed operator state in Flink instead of
calling the external store?
How do the Flink contributors feel about that?
Thanks
Ankit
On 5/14/17, 8:17 PM, "yunfan123" wrote:
Flink 1.2.0 is released. Can you give an example of the asynchronous operations
feature?
That approach should work as well.
The upcoming Flink 1.2.0 release will feature a function for asynchronous
operations, i.e., you can have multiple concurrent Redis requests, without
losing the fault tolerance guarantees.
Another alternative is to store the map in key-partitioned operator state.
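A minimal sketch of how the async API looks in the 1.2 line (the collector interface was renamed ResultFuture in later releases), using Jedis purely as an illustrative Redis client; host, thread-pool size, timeout, and capacity are placeholder values:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;
import org.apache.flink.streaming.api.functions.async.collector.AsyncCollector;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;

import java.util.Collections;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AsyncRedisLookup extends RichAsyncFunction<String, Tuple2<String, String>> {

    private transient JedisPool pool;
    private transient ExecutorService executor;

    @Override
    public void open(Configuration parameters) {
        pool = new JedisPool("localhost", 6379);      // placeholder host/port
        executor = Executors.newFixedThreadPool(10);  // side threads for the blocking client
    }

    @Override
    public void asyncInvoke(String id, AsyncCollector<Tuple2<String, String>> collector) {
        // Hand the blocking Redis call to a separate thread and complete the collector
        // from there; Flink checkpoints the in-flight input records and re-issues them
        // on recovery, which is what preserves the fault-tolerance guarantees.
        executor.submit(() -> {
            try (Jedis jedis = pool.getResource()) {
                String name = jedis.get(id);
                collector.collect(Collections.singletonList(Tuple2.of(id, name)));
            } catch (Exception e) {
                collector.collect(e);
            }
        });
    }

    @Override
    public void close() {
        if (executor != null) executor.shutdown();
        if (pool != null) pool.close();
    }
}

Wiring it in, with idStream standing in for whatever DataStream<String> of ids the job already has:

// Up to 100 concurrent lookups, 1 second timeout per request (illustrative values).
DataStream<Tuple2<String, String>> enriched =
    AsyncDataStream.unorderedWait(idStream, new AsyncRedisLookup(), 1000, TimeUnit.MILLISECONDS, 100);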
Thanks for the solution, Fabian. My map would be substantially large, so I
wouldn't want to replicate it in each operator. I will probably add a Redis
cache layer and use it in the streaming process. Do you foresee any problems
with that?
Thanks,
Sandeep
On Wed, Dec 21, 2016 at 9:52 AM, Fabian Hueske
You could read the map from a file in the open method of a RichMapFunction.
The open method is called before the first record is processed and can put
data into the operator state.
The downside of this approach is that the data is replicated in each
operator, i.e., each operator holds a full copy of the map.
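A minimal sketch of that approach; the file path, CSV format, and class name are illustrative, and the file is assumed to be readable from every worker (e.g. on a shared file system):

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.HashMap;
import java.util.Map;

public class FileBackedLookup extends RichMapFunction<String, Tuple2<String, String>> {

    private transient Map<String, String> lookup;

    @Override
    public void open(Configuration parameters) throws Exception {
        // Runs once per parallel instance before the first record is processed,
        // so every instance loads its own full copy of the map.
        lookup = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(new FileReader("/path/to/lookup.csv"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split(",", 2);
                lookup.put(parts[0], parts[1]);
            }
        }
    }

    @Override
    public Tuple2<String, String> map(String id) {
        // Replace the incoming id with an (id, name) pair.
        return Tuple2.of(id, lookup.getOrDefault(id, "UNKNOWN"));
    }
}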
As a follow-up question, can we populate the operator state from an
external source?
My use case is as follows: I have a Flink streaming process with Kafka as a
source. I only have IDs coming from the Kafka messages. My lookups, which
form a static map, come from a different source. I would like to use that
map to enrich the stream.
OK, I see. Yes, you can do that with Flink. It's actually a very common use
case.
You can store the names in operator state and Flink takes care of
checkpointing the state and restoring it in case of a failure.
In fact, the operator state is persisted in the state backends you
mentioned before.
Best, Fabian
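A minimal sketch of keeping the mapping in key-partitioned, checkpointed state, assuming the (id, name) pairs arrive as a second stream; class and stream names are illustrative:

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction;
import org.apache.flink.util.Collector;

public class IdToNameJoin
        extends RichCoFlatMapFunction<String, Tuple2<String, String>, Tuple2<String, String>> {

    // One entry per key (id); Flink checkpoints this state and restores it on failure.
    private transient ValueState<String> nameState;

    @Override
    public void open(Configuration parameters) {
        nameState = getRuntimeContext().getState(
                new ValueStateDescriptor<>("id-to-name", String.class));
    }

    // First input: the ids coming from Kafka.
    @Override
    public void flatMap1(String id, Collector<Tuple2<String, String>> out) throws Exception {
        String name = nameState.value();
        out.collect(Tuple2.of(id, name != null ? name : "UNKNOWN"));
    }

    // Second input: (id, name) records from the lookup source; just update the state.
    @Override
    public void flatMap2(Tuple2<String, String> update, Collector<Tuple2<String, String>> out) throws Exception {
        nameState.update(update.f1);
    }
}

// Wiring (both inputs keyed by the id so the state is partitioned the same way):
//   idStream.keyBy(id -> id)
//           .connect(nameStream.keyBy(t -> t.f0))
//           .flatMap(new IdToNameJoin());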
Hi Fabian,
I meant lookups like IDs to names. For example, if I have IDs coming
through the stream, I want to replace them with the corresponding names
stored in a cache or somewhere within Flink.
Thanks,
Sandeep
On Dec 21, 2016 12:35 AM, "Fabian Hueske" wrote:
> Hi Sandeep,
>
> I'm sorry but
Hi Sandeep,
I'm sorry but I think I do not understand your question.
What do you mean by static or dynamic lookups? Do you want to access an
external data store and cache data?
Can you give a bit more detail about your use case?
Best, Fabian
2016-12-20 23:07 GMT+01:00 Meghashyam Sandeep V :
> Hi t