Hello,

I have an interesting problem that I'm having a hard time modeling on Flink,
I'm not sure if it's the right tool for the job. 

I have a stream of messages in Kafka that I need to group and send them to
an external web service but I have some concerns that need to be addressed: 

1. Rate Limited requests => Only tens of requests per minute. If the limit
is exceeded the system has to stop making requests for a few minutes.
2. Crash handling => I'm using savepoints

My first (naive) solution was to implement on a Sink function but the
requests may take a long time to return (up to minutes) so blocking the
thread will interfere with the savepoint mechanism (see  here
<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Rate-limit-processing-td11174.html>
 
). Because of this implementing the limit on the sink and relying on
backpressure to slow down the flow will get in the way of savepointing. I'm
not sure how big of a problem this will be but on my tests I'm reading
thousands of messages before the backpressure mechanism starts and
savepointing is taking around 20 minutes.

My second implementation was sleeping on the Fetcher for the Kafka Consumer
but the ws requests time have a huge variance so I ended up implementing a
communication channel between the sink and the source - an object with
mutable state. Not great.  

So my question is if there is a nice way to limit the flow of messages on
the system according to the rate given by a sink function? Is there any
other way I could make this work on Flink?

Thank you



--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-requesting-external-web-service-with-rate-limited-requests-tp11952.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at 
Nabble.com.

Reply via email to