[ https://issues.apache.org/jira/browse/KAFKA-545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jun Rao closed KAFKA-545.
-------------------------

> Add a Performance Suite for the Log subsystem
> ---------------------------------------------
>
>                 Key: KAFKA-545
>                 URL: https://issues.apache.org/jira/browse/KAFKA-545
>             Project: Kafka
>          Issue Type: New Feature
>    Affects Versions: 0.8
>            Reporter: Jay Kreps
>            Assignee: Jay Kreps
>            Priority: Blocker
>              Labels: features
>             Fix For: 0.8
>
>         Attachments: KAFKA-545-draft.patch, KAFKA-545.patch,
>                      KAFKA-545-v2.patch, KAFKA-545-v3.patch
>
>
> We have had several performance concerns and potential improvements for the
> logging subsystem. To evaluate these in a data-driven way, it would be good to
> have a single-machine performance test that isolates the performance of the
> log.
>
> The performance optimizations we would like to evaluate include:
> - Special-casing appends in a follower that already have the correct offset,
>   to avoid decompression and recompression
> - Memory mapping either all or some of the segment files to improve the
>   performance of small appends and lookups
> - Supporting multiple data directories and avoiding RAID
>
> Having a standalone tool is nice because it isolates the component and makes
> profiling more intelligible.
>
> This test would drive load against Log/LogManager, controlled by a set of
> command-line options. This command-line program could then be scripted into a
> suite of tests covering variations in message size, message set size,
> compression, number of partitions, etc.
>
> Here is a proposed usage for the tool:
>
> ./bin/kafka-log-perf-test.sh
> Option           Description
> ------           -----------
> --partitions     The number of partitions to write to
> --dir            The directory in which to write the log
> --message-size   The size of the messages
> --set-size       The number of messages per write
> --compression    The compression algorithm
> --messages       The number of messages to write
> --readers        The number of reader threads reading the data
>
> The tool would capture latency and throughput for the append() and read()
> operations.
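For readers who want a concrete picture of the measurement loop described above, here is a minimal sketch. It is not the code from the attached patches and does not touch Kafka's Log/LogManager: it is a Python stand-in that appends fixed-size message sets to plain files, one file per partition, and times each append. The function name `run_append_test`, the `partition-N.log` file naming, and the keys of the returned dictionary are all hypothetical. The parameters mirror --dir, --partitions, --message-size, --set-size and --messages; compression and reader threads are left out to keep the sketch short.

```python
import os
import tempfile
import time


def run_append_test(log_dir, partitions, message_size, set_size, messages):
    """Append `messages` fixed-size messages in sets of up to `set_size`,
    round-robin across `partitions` files, timing every append.

    Hypothetical stand-in: plain files play the role of the log segments."""
    if min(partitions, message_size, set_size, messages) <= 0:
        raise ValueError("all sizes and counts must be positive")

    payload = b"x" * message_size
    files = [open(os.path.join(log_dir, f"partition-{p}.log"), "ab")
             for p in range(partitions)]
    latencies = []
    written = 0
    start = time.perf_counter()
    try:
        append_no = 0
        while written < messages:
            n = min(set_size, messages - written)  # last set may be short
            f = files[append_no % partitions]
            t0 = time.perf_counter()
            f.write(payload * n)  # one "message set" per write
            f.flush()             # push to the OS; no fsync in this sketch
            latencies.append(time.perf_counter() - t0)
            written += n
            append_no += 1
    finally:
        for f in files:
            f.close()
    elapsed = time.perf_counter() - start

    latencies.sort()
    p99_index = min(len(latencies) - 1, int(len(latencies) * 0.99))
    return {
        "messages": written,
        "appends": len(latencies),
        "bytes": written * message_size,
        "elapsed_s": elapsed,
        "msgs_per_s": written / elapsed if elapsed > 0 else float("inf"),
        "p50_ms": latencies[len(latencies) // 2] * 1000,
        "p99_ms": latencies[p99_index] * 1000,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(run_append_test(d, partitions=2, message_size=100,
                              set_size=10, messages=1000))
```

Scripting a suite on top of this amounts to calling the function in a loop over the parameter combinations of interest and collecting the returned dictionaries. The absolute numbers from a sketch like this say nothing about Kafka itself; only the shape of the harness carries over.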