That's true. It works in Flink because a slow downstream operator back-pressures its upstream operator, which then slows down as well. The implementation relies on the fact that Flink uses a bounded pool of network buffers. A sending operator writes data into network buffers, and each buffer becomes free for reuse once its data has been sent. If a downstream operator is slow to process the network buffers it receives, the upstream operator blocks until network buffers become available again.
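To make the blocking behaviour concrete, here is a minimal sketch of the idea in plain Java. It is not Flink's actual network stack: the class name `BackpressureSketch`, the `run` method, and the use of two `ArrayBlockingQueue`s to stand in for the buffer pool and the network channel are all assumptions made for illustration. The point is only that a fixed number of buffers caps how far a fast sender can run ahead of a slow receiver.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: a toy model of a bounded buffer pool, not Flink code.
public class BackpressureSketch {

    /** Runs a fast sender against a slow receiver and returns the peak number of buffers in use. */
    static int run(int poolSize, int records) throws InterruptedException {
        // Hypothetical stand-ins: the pool of free buffers and the "network" channel.
        BlockingQueue<int[]> freeBuffers = new ArrayBlockingQueue<>(poolSize);
        BlockingQueue<int[]> channel = new ArrayBlockingQueue<>(poolSize);
        for (int i = 0; i < poolSize; i++) {
            freeBuffers.put(new int[1]);
        }

        AtomicInteger inUse = new AtomicInteger();
        AtomicInteger peakInUse = new AtomicInteger();

        // Slow downstream operator: processes a buffer, then recycles it.
        Thread receiver = new Thread(() -> {
            try {
                for (int i = 0; i < records; i++) {
                    int[] buffer = channel.take();
                    Thread.sleep(1);            // simulate slow processing
                    inUse.decrementAndGet();
                    freeBuffers.put(buffer);    // buffer is free for reuse
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        receiver.start();

        // Fast upstream operator: blocks in take() once the pool is exhausted.
        for (int i = 0; i < records; i++) {
            int[] buffer = freeBuffers.take();
            buffer[0] = i;
            peakInUse.accumulateAndGet(inUse.incrementAndGet(), Math::max);
            channel.put(buffer);
        }

        receiver.join();
        return peakInUse.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("peak buffers in use: " + run(4, 100));
    }
}
```

However fast the sender loop is, the peak never exceeds the pool size, because `freeBuffers.take()` blocks the sender until the receiver hands a buffer back. Nothing is dropped and nothing grows without bound; the sender simply runs at the receiver's pace.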

Cheers,
Aljoscha

On Fri, 2 Sep 2016 at 21:57 rss rss <rssde...@gmail.com> wrote:

> Hi,
>
> some time ago I found a problem with backpressure in Spark and prepared
> a simple test to check it and compare with Flink.
> https://github.com/rssdev10/spark-kafka-streaming
>
> +
> https://mail-archives.apache.org/mod_mbox/spark-user/201607.mbox/%3CCA+AWphp=2VsLrgSTWFFknw_KMbq88fZhKfvugoe4YYByEt7a=w...@mail.gmail.com%3E
>
> In case of Flink it works. In case of Spark it works if you setup
> limitations of input rates per data sources. See source code an example.
> And actually backpressure detector in Spark works very bad.
>
> Best regards
>
> 2016-09-02 15:07 GMT+03:00 jiecxy <253441...@qq.com>:
>
>> For an operator, the input stream is faster than its output stream, so its
>> input buffer will block the previous operator's output thread that
>> transfers the data to this operator. Right?
>>
>> Do the Flink and the Spark both handle the backpressure by blocking the
>> thread? So what's the difference between them?
>>
>> For the data source, it is continuously producing the data, what if its
>> output thread is blocked? Would the buffer overflow?
>>
>> --
>> View this message in context:
>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Apache-Flink-How-does-it-handle-the-backpressure-tp8866.html
>> Sent from the Apache Flink User Mailing List archive. mailing list
>> archive at Nabble.com.