Hi,

Thanks for the tip, man. I tried playing with this. I was changing fetch.message.max.bytes (I still have Kafka 0.8) and also socket.receive.buffer.bytes. With some optimal settings I was able to get to 1.2 million reads per second, so a 50% increase. Unfortunately, that does not hold once I enable the HBase sink again, which means backpressure kicks in and HBase writing is the limiting factor here. I will try to tweak this a bit more; if I find something, I will share it.
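For reference, a minimal sketch of the settings I was tweaking on the 0.8 consumer (the broker/ZooKeeper addresses, group id, and exact byte values below are illustrative, not my actual config):

```java
import java.util.Properties;

public class Kafka08TuningSketch {

    // Builds consumer properties with larger fetch and socket buffer sizes.
    // Addresses, group id, and sizes are illustrative assumptions.
    static Properties tunedProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");  // placeholder brokers
        props.setProperty("zookeeper.connect", "localhost:2181");  // required by the 0.8 consumer
        props.setProperty("group.id", "benchmark-consumer");       // hypothetical group id

        // Maximum message bytes fetched per request (0.8-era consumer setting).
        props.setProperty("fetch.message.max.bytes", Integer.toString(8 * 1024 * 1024));
        // Larger TCP receive buffer for the consumer sockets.
        props.setProperty("socket.receive.buffer.bytes", Integer.toString(4 * 1024 * 1024));
        return props;
    }

    public static void main(String[] args) {
        // The same Properties object would be handed to the Flink source, e.g.:
        // new FlinkKafkaConsumer08<>("topic", new SimpleStringSchema(), tunedProps());
        System.out.println(tunedProps().getProperty("fetch.message.max.bytes"));
    }
}
```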
Cheers,
Kamil

On Thu, Mar 30, 2017 at 12:45 PM, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:

> I'm wondering what I can tweak further to increase this. I was reading in
> this blog: https://data-artisans.com/extending-the-yahoo-streaming-benchmark/
> about 3 million per second with only 20 partitions. So I'm sure I should be
> able to squeeze more out of it.
>
> Not really sure if it is relevant in the context of your case, but you
> could perhaps try tweaking the maximum size of Kafka records fetched on
> each poll on the partitions.
> You can do this by setting a higher value for "max.partition.fetch.bytes"
> in the provided config properties when instantiating the consumer; that
> will directly configure the internal Kafka clients.
> Generally, all Kafka settings are applicable through the provided config
> properties, so you can perhaps take a look at the Kafka docs to see what
> else there is to tune for the clients.
>
> On March 30, 2017 at 6:11:27 PM, Kamil Dziublinski
> (kamil.dziublin...@gmail.com) wrote:
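As a minimal sketch of Gordon's suggestion for the newer (0.9+) consumer — the broker address, group id, and the 8 MiB value below are illustrative assumptions, not recommended settings:

```java
import java.util.Properties;

public class FetchSizeTuningSketch {

    // Builds consumer properties with a raised per-partition fetch cap.
    // Broker address, group id, and the size are illustrative assumptions.
    static Properties tunedProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "benchmark-consumer");
        // Maximum bytes fetched per partition per poll (Kafka 0.9+ consumer setting).
        props.setProperty("max.partition.fetch.bytes", Integer.toString(8 * 1024 * 1024));
        return props;
    }

    public static void main(String[] args) {
        // The Properties object is passed straight to the Flink Kafka source, e.g.:
        // new FlinkKafkaConsumer09<>("topic", new SimpleStringSchema(), tunedProps());
        System.out.println(tunedProps().getProperty("max.partition.fetch.bytes"));
    }
}
```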