Hi Federico,
would it help to buffer events first and perform batches of insertions
for better throughput? I saw some similar work recently here:
https://tech.signavio.com/2017/postgres-flink-sink
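A minimal, language-agnostic sketch of that buffering idea: collect incoming events and flush them as one batched insert once the buffer is full or a flush interval has elapsed. The names here (`BufferingSink`, `execute_batch`, the size/interval parameters) are hypothetical stand-ins, not part of any Flink or Hive API:

```python
import time

class BufferingSink:
    """Buffers events and hands them to a batched write in one call.

    `execute_batch` is a hypothetical stand-in for the actual
    batched Hive insert; it receives a list of buffered events.
    """

    def __init__(self, execute_batch, max_batch_size=100, flush_interval=1.0):
        self.execute_batch = execute_batch
        self.max_batch_size = max_batch_size
        self.flush_interval = flush_interval
        self.buffer = []
        self.last_flush = time.monotonic()

    def invoke(self, event):
        # Per-event path stays cheap: just append, flush only when
        # the batch is full or the interval has passed.
        self.buffer.append(event)
        if (len(self.buffer) >= self.max_batch_size
                or time.monotonic() - self.last_flush >= self.flush_interval):
            self.flush()

    def flush(self):
        if self.buffer:
            # One round trip covers many events instead of one each.
            self.execute_batch(self.buffer)
            self.buffer = []
        self.last_flush = time.monotonic()
```

In a real RichSinkFunction the flush would also have to happen on checkpoint/close so buffered events are not lost.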
But I would first try the AsyncIO approach, because this is actually
the kind of use case it was made for.
Regards,
Timo
On 10/2/17 at 11:53 AM, Federico D'Ambrosio wrote:
Hi, I've implemented a sink for Hive as a RichSinkFunction, but once
I integrated it into my current Flink job, I noticed that the
processing of events slowed down badly, I guess because of some
blocking calls that need to be made when interacting with the Hive
streaming API.
So, what can be done so that throughput doesn't get hurt by these
calls? I guess increasing the parallelism of the sink operator (by a
lot) could be a solution, but I don't think it's a really good one.
Maybe using the AsyncFunction API? Decoupling the sink into a buffer
that sends the data and the operations to be performed to the
asyncInvoke method of the AsyncFunction?
Any suggestion is appreciated.
Kind regards,
Federico D'Ambrosio