Hi,

I assume the savepoints are slow because the checkpoint barriers, which ensure the consistency of the savepoint, get stuck behind the records that are buffered due to backpressure. At some point the savepoint might get cancelled because it does not seem to make progress. You can reduce the amount of data that is buffered due to backpressure by reducing the number of network buffers (taskmanager.network.numberOfBuffers) [1]. This will help the barriers reach the operators faster.
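
For illustration, the setting goes into conf/flink-conf.yaml on the TaskManagers. This is only a sketch: the value 1024 is an assumption picked for the example, not a recommendation. As far as I know the default in 1.2 is 2048, so anything noticeably lower should reduce the amount of in-flight data.

```
# conf/flink-conf.yaml
# Fewer network buffers means less data queued in front of the checkpoint barriers.
# 1024 is an illustrative value; tune it for your parallelism and cluster size.
taskmanager.network.numberOfBuffers: 1024
```

The TaskManagers have to be restarted for the change to take effect. If the value is too low for the job's parallelism, the job will most likely fail at startup with an error about an insufficient number of network buffers, so lower it step by step.
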
I don't think there is a ready-to-go way to integrate Kafka offsets with a webserver response. You can of course always implement your own source function, but that's a bit of work.

Best, Fabian

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/config.html#jobmanager-amp-taskmanager

2017-03-01 0:58 GMT+01:00 Giuliano Caliari <giuliano.cali...@gmail.com>:

> Hey Fabian,
>
> One of my solutions implements the AsyncFunction but I'm still unable to
> savepoint because Flink reads the backed up records, thousands of historical
> records, right off the bat and when I issue a savepoint request it has to
> wait for all those records to be processed which takes a couple of hours. So
> I'm still getting the error when savepointing.
> Alternatively I could wait for the backed up records to be processed and
> issue savepoints afterwards but there is a risk of failures and I would have
> to restart the whole process.
>
> Another idea would be if we could commit the Kafka offset only after we get
> a positive response from the external web service. There would be some
> duplication in case of errors but that's acceptable. Is there any easy way
> we can do this?
>
> Cheers,
>
> --
> View this message in context:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-requesting-external-web-service-with-rate-limited-requests-tp11952p11977.html
> Sent from the Apache Flink User Mailing List archive at Nabble.com.