[ https://issues.apache.org/jira/browse/KAFKA-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15705475#comment-15705475 ]
Eno Thereska edited comment on KAFKA-4405 at 11/29/16 4:02 PM:
---------------------------------------------------------------

[~guozhang], [~hachikuji] I can confirm that I saw a 50% increase in performance with max.poll.records set to 1, using [~ysysberserk]'s suggestion in KafkaConsumer::poll(). It is a one-line change. For testing, I made fetcher.maxPollRecords public:

    if (records.size() < fetcher.maxPollRecords) {
        fetcher.sendFetches();
        client.pollNoWakeup();
    }

Details: I ran SimpleBenchmark.java (for streams) and measured the performance of benchmark.processStreamWithStateStore only. I commented out everything except benchmark.produce and the above (produce is needed to send data to a topic). In createKafkaStreamsWithStateStore I added a line to configure max.poll.records:

    props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 1);

I got 10MB/s and 15MB/s as results. I did 2 runs to check.

> Kafka consumer improperly send prefetch request
> -----------------------------------------------
>
>                 Key: KAFKA-4405
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4405
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.10.0.1
>            Reporter: ysysberserk
>
> The Kafka consumer now has max.poll.records to limit the number of messages
> returned by poll().
> According to KIP-41, to implement max.poll.records, the prefetch request
> should only be sent when the total number of retained records is less than
> max.poll.records.
> But in the 0.10.0.1 code, the consumer sends a prefetch request whenever it
> has retained any records, and never checks whether the total number of
> retained records is less than max.poll.records.
> If max.poll.records is set to a count much smaller than the number of messages
> fetched, the poll() loop will send many more requests than expected, and more
> and more records will be fetched and stored in memory before they can be
> consumed.
> So before sending a prefetch request, the consumer must check whether the
> total number of retained records is less than max.poll.records.
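
To make the effect described in the issue concrete, here is a minimal toy model of the poll() loop. It does not use the real Kafka client: the class PrefetchGateSketch, the helper shouldPrefetch, and the parameters polls and fetchSize are all hypothetical names introduced for illustration, and the model assumes every fetch returns exactly fetchSize records immediately. It only compares an ungated prefetch against one gated on the number of retained records.

```java
public class PrefetchGateSketch {

    /** Hypothetical helper: the check the issue asks for before a prefetch. */
    static boolean shouldPrefetch(int retainedRecords, int maxPollRecords) {
        return retainedRecords < maxPollRecords;
    }

    /**
     * Toy simulation of a poll() loop (not the real KafkaConsumer).
     * Assumes each fetch adds exactly fetchSize records to the buffer.
     * Returns {fetch requests sent, peak number of retained records}.
     */
    static int[] simulate(int polls, int fetchSize, int maxPollRecords, boolean gated) {
        int retained = 0;
        int fetches = 0;
        int peak = 0;
        for (int i = 0; i < polls; i++) {
            if (retained == 0) {
                // Nothing buffered, so a fetch is needed before returning records.
                retained += fetchSize;
                fetches++;
                peak = Math.max(peak, retained);
            }
            // poll() hands at most maxPollRecords to the caller.
            retained -= Math.min(retained, maxPollRecords);

            // Prefetch step: ungated always sends, gated checks the buffer first.
            if (!gated || shouldPrefetch(retained, maxPollRecords)) {
                retained += fetchSize;
                fetches++;
                peak = Math.max(peak, retained);
            }
        }
        return new int[] {fetches, peak};
    }

    public static void main(String[] args) {
        int[] ungated = simulate(10, 5, 1, false);
        int[] gated = simulate(10, 5, 1, true);
        System.out.println("ungated: fetches=" + ungated[0] + ", peak retained=" + ungated[1]);
        System.out.println("gated:   fetches=" + gated[0] + ", peak retained=" + gated[1]);
    }
}
```

In this model, with max.poll.records much smaller than the fetch size, the ungated variant sends a request on every poll and its buffer keeps growing, while the gated variant only fetches once the buffer has drained below the limit.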