Does the same thing happen if you're using only the direct stream plus backpressure, without the receiver stream?
On Sep 9, 2016 6:41 PM, "Jeff Nadler" <jnad...@srcginc.com> wrote:

> Maybe this is a pretty esoteric implementation, but I'm seeing some bad
> behavior with backpressure plus multiple Kafka streams / direct streams.
>
> Here's the scenario:
> We have 1 Kafka topic using the reliable receiver (4 receivers, union the
> result). In the same app, we consume another Kafka topic using a direct
> stream.
>
> This may seem strange, but it's necessary in my application to work around
> another problem: Maxrate is set globally in SparkConf. IMO It would be
> more flexible if we could set maxrate for each stream independently.
> Since directstream uses a different config parameter for maxrate, we get
> the desired result.
>
> A bit hacky I know.
>
> Anyway, we recently turned on backpressure. It works as expected for the
> receiver-based stream. For the direct stream, it starts out at the
> maxrate (as expected) on the first batch. Then it ratchets down the
> consumption until it is eventually consuming 1 record / second / partition.
>
> This happens even though there's no scheduling delay, and the
> receiver-based stream does not appear to be throttled.
>
> Anyone ever see anything like this?
>
> Thanks!
>
> Jeff Nadler
> Aerohive Networks
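
For anyone following along, here is a rough sketch of the configuration described in the quoted message, assuming the two rate limits in question are `spark.streaming.receiver.maxRate` (receiver-based streams) and `spark.streaming.kafka.maxRatePerPartition` (direct streams). The helper names and the numeric rates are hypothetical placeholders; the thread does not give the actual values.

```python
# Sketch only: plain Python, no Spark dependency. The rate values passed in
# below are hypothetical placeholders, not numbers from the thread.

def streaming_rate_settings(receiver_max_rate, direct_max_rate_per_partition):
    """Build the Spark Streaming settings discussed in the thread.

    receiver.maxRate caps each receiver (records/second);
    kafka.maxRatePerPartition caps each Kafka partition of a direct stream.
    """
    return {
        "spark.streaming.backpressure.enabled": "true",
        "spark.streaming.receiver.maxRate": str(receiver_max_rate),
        "spark.streaming.kafka.maxRatePerPartition": str(
            direct_max_rate_per_partition
        ),
    }


def to_submit_args(settings):
    """Render the settings as spark-submit style --conf arguments."""
    args = []
    for key, value in sorted(settings.items()):
        args += ["--conf", f"{key}={value}"]
    return args


# Hypothetical limits: 1000 records/s per receiver, 5000 per Kafka partition.
settings = streaming_rate_settings(1000, 5000)
submit_args = to_submit_args(settings)
```

The same key/value pairs could be set on a `SparkConf` instead of being passed on the command line. Because the two limits use different keys, the receiver-based stream and the direct stream get different caps, which is the workaround the message describes.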