Revin,

We instrument our data pipeline and Kafka applications using wall time, both within our consumers and stream processors and in a separate application that measures end-to-end latency across our data processing pipelines. We report these metrics to a metric aggregator, which in our case is Datadog. Using these metrics, we perform isolated performance experiments on production-volume data.
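As a rough illustration of the wall-time approach, here is a minimal Python sketch, not our actual implementation. It assumes each message carries the producer's send time in epoch milliseconds and that producer and consumer clocks are reasonably well synchronized. `LatencyRecorder`, `record`, `summary`, and `percentile` are hypothetical names chosen for this example, not part of any Kafka or Datadog API, and forwarding the summary to a metric aggregator is left out.

```python
import math
import time


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


class LatencyRecorder:
    """Hypothetical helper: collects end-to-end latencies from wall-clock time."""

    def __init__(self):
        self.latencies_ms = []
        self.started_at = time.time()

    def record(self, produced_at_ms, consumed_at_ms=None):
        # Default to the current wall-clock time at the moment of consumption.
        if consumed_at_ms is None:
            consumed_at_ms = time.time() * 1000.0
        # Clamp at zero: clock skew between hosts can make the difference negative.
        self.latencies_ms.append(max(0.0, consumed_at_ms - produced_at_ms))

    def summary(self, elapsed_s=None):
        # elapsed_s defaults to the wall time since the recorder was created.
        if elapsed_s is None:
            elapsed_s = time.time() - self.started_at
        ordered = sorted(self.latencies_ms)
        if not ordered or elapsed_s <= 0:
            return {"count": len(ordered)}
        return {
            "count": len(ordered),
            "throughput_per_s": len(ordered) / elapsed_s,
            "mean_ms": sum(ordered) / len(ordered),
            "p50_ms": percentile(ordered, 50),
            "p99_ms": percentile(ordered, 99),
        }
```

In a consumer loop, one would call `record(...)` with each message's producer timestamp and report `summary()` at the end of every test run, so that runs with different settings can be compared side by side.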
I would be very happy to hear about your results as well.

Best,
Håkon

On Sat, Sep 16, 2017 at 02:13, Matt Andruff <matt.andr...@gmail.com> wrote:

> Look, I'm a huge fan of sending identical data and using plain old 'wall
> time' and averaging a couple of runs to make sure you remove any whoops.
>
> You can use fancy tools for reporting, but in the real world wall time is
> still the most critical factor. And let's face it, it's also simple to
> measure.
>
> I personally would love to hear about your results, if you'd be willing to
> share.
>
> On Fri, Sep 15, 2017, 15:01 Revin Chalil <rcha...@expedia.com> wrote:
>
> > Any thoughts on the below will be appreciated. Thanks.
> >
> > On 9/13/17, 5:00 PM, "Revin Chalil" <rcha...@expedia.com> wrote:
> >
> >     We are testing Kafka’s performance with real prod data and plan to
> >     test things like the below. We would have producers publishing and
> >     consumers processing production data on a separate non-prod Kafka
> >     cluster.
> >
> >       * Impact of number of Partitions per Topic on throughput and
> >         latency on Producer & Consumer
> >       * Impact of scaling up Brokers on throughput and latency
> >       * Adding more Brokers Vs adding more Disk on existing Brokers. How
> >         does the network interface usage differ?
> >       * Cost of Replication on throughput and latency
> >       * Impact of Broker vm.swappiness = 60 Vs vm.swappiness = 1
> >       * Partitions on a Broker pointing to a single Disk Vs multiple
> >         Disks
> >       * EXT4 Vs XFS Filesystem on Broker
> >       * Behavior when Broker “num.io.threads” is increased from 8 to a
> >         higher value
> >       * Behavior when Broker “num.network.threads” is increased from 3
> >         to a higher value
> >       * Behavior when the data segment size is increased from 1 GB
> >         (current setting)
> >       * Producer “acks = 1” Vs “acks = all” (current setting) impact on
> >         throughput and latency
> >       * Producer sending with Compression enabled (snappy?) Vs sending
> >         without Compression
> >       * Setting producer batch-size (memory based) Vs record-count
> >         (current setting) per batch sent to Kafka
> >       * Impact of message size on throughput
> >       * Consumers fetching records from page-cache Vs fetching records
> >         from Disk
> >
> >     Ideally, the metrics we would like to compare for each test are
> >     (please let us know if there is anything else to be compared):
> >
> >       * Producer write Throughput
> >       * Producer write Latency (ms)
> >       * Consumption Throughput
> >       * Consumption Latency (ms)
> >       * End-to-end Latency
> >
> >     What would be the right tools to collect and compare the above
> >     metrics across the different tests? I have set up kafka-monitor but
> >     couldn’t find how to track the throughput and latency.
> >     Kafka-web-console seems to have some of these available?
> >     Kafka-Manager? Burrow? Anything else? Thank you.
> >
> >     Since we are going to use our own producers and consumers, I do not
> >     think it makes sense to use tools like kafka-consumer-perf-test.sh or
> >     kafka-producer-perf-test.sh, but please correct me if I am wrong.
> >
> >     Thanks,
> >     Revin