> On Jan 5, 2017, at 8:23 AM, Hoang Bao Thien <hbthien0...@gmail.com> wrote:
>
> Yes, the problem was in the producer configuration, and James Cheng has
> told me how to fix it.
> However, I still get another problem with a large file:
>
> org.apache.kafka.common.errors.TimeoutException: Batch containing 36 record(s) expired due to timeout while requesting metadata from brokers for MyTopic-0
>
kafka-console-producer.sh defaults to retries=0. If there is a timeout, as
that error indicates, I think it drops the messages it was trying to send.

As a test, try setting retries to something high by passing
"--producer-property retries=<somebignumber>".

See the description of "retries" at
http://kafka.apache.org/documentation/#producerconfigs

-James

> Best regards,
>
> On Thu, Jan 5, 2017 at 10:23 AM, Protoss Hu <hbprot...@yahoo.com.invalid>
> wrote:
>
>> You mean the messages were lost on the way to the broker, before the
>> broker actually received them?
>>
>> Protoss Hu
>> Blog: http://hbprotoss.github.io/
>> Weibo: http://weibo.com/hbprotoss
>>
>> On Jan 5, 2017, at 4:53 PM +0800, James Cheng <wushuja...@gmail.com> wrote:
>>> kafka-console-producer.sh defaults to acks=0, which means that the
>>> producer essentially throws messages at the broker and doesn't
>>> wait/retry to make sure they are properly received.
>>>
>>> In the kafka-console-producer.sh usage text:
>>>
>>> --request-required-acks <Integer: request required acks>
>>>     The required acks of the producer requests (default: 0)
>>>
>>> Try re-running your test with "--request-required-acks -1" or
>>> "--request-required-acks all" (they are equivalent). This will tell the
>>> broker to wait for messages to be fully saved to all replicas before
>>> returning an acknowledgement to the producer. You can read more about
>>> acks in the producer configuration section of the Kafka docs
>>> (http://kafka.apache.org/documentation/#producerconfigs).
>>>
>>> -James
>>>
>>>> On Jan 4, 2017, at 1:25 AM, Hoang Bao Thien <hbthien0...@gmail.com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I have a problem with losing messages from Kafka.
>>>> The situation is as follows: I put a CSV file with 286701 rows
>>>> (size = 110MB) into the Kafka producer with the command:
>>>>
>>>> $ cat test.csv | kafka-console-producer.sh --broker-list localhost:9092 --topic MyTopic > /dev/null
>>>>
>>>> and then count the number of lines from the Kafka consumer
>>>> (kafka-console-consumer.sh --zookeeper localhost:2181 --topic MyTopic --from-beginning).
>>>> However, I only get about 260K-270K, and this number of received
>>>> messages changes for each test.
>>>>
>>>> My configuration in "config/server.properties" has some minor changes
>>>> compared to the original file:
>>>>
>>>> log.retention.check.interval.hours=24
>>>> log.retention.hours=168
>>>> delete.topic.enable = true
>>>>
>>>> The remaining settings are left at their default values.
>>>>
>>>> Could you please explain why the messages were lost in Kafka, and how
>>>> to fix this problem?
>>>>
>>>> Thanks a lot.
>>>>
>>>> Best regards,
>>>> Alex
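
For reference, the two suggestions in this thread (waiting for acks and allowing retries) could be combined into one re-run of the original command, roughly as sketched below. This is only an illustration: the variable names BROKERS, TOPIC and RETRIES and the helper producer_args are made up here, and retries=10 is an arbitrary value, not a recommendation. The flags themselves are the ones quoted above.

```shell
#!/bin/sh
# Sketch only: combines the acks and retries suggestions from this thread.
# BROKERS, TOPIC, RETRIES and producer_args are hypothetical names;
# retries=10 is an arbitrary example value.
BROKERS="${BROKERS:-localhost:9092}"
TOPIC="${TOPIC:-MyTopic}"
RETRIES="${RETRIES:-10}"

# Print the console-producer arguments, one per line, so they can be
# inspected before anything is sent.
producer_args() {
  printf '%s\n' \
    --broker-list "$BROKERS" \
    --topic "$TOPIC" \
    --request-required-acks -1 \
    --producer-property "retries=$RETRIES"
}

# To actually send the file, uncomment the next line:
# cat test.csv | kafka-console-producer.sh $(producer_args) > /dev/null
```

Running producer_args on its own prints the arguments without contacting the broker, which makes it easy to check what will be passed before re-running the test and counting the consumed lines again.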