It works perfectly with retries > 0. Thanks a lot, James.
Best regards

On Thu, Jan 5, 2017 at 10:51 PM, James Cheng <[email protected]> wrote:

> > On Jan 5, 2017, at 8:23 AM, Hoang Bao Thien <[email protected]> wrote:
> >
> > Yes, the problem is from the producer configuration. And James Cheng has
> > told me how to fix it.
> > However, I still get another problem with a large file:
> >
> >   org.apache.kafka.common.errors.TimeoutException: Batch containing 36
> >   record(s) expired due to timeout while requesting metadata from brokers
> >   for MyTopic-0
>
> kafka-console-producer.sh defaults to retries=0. If there is a timeout, as
> that error indicates, I think it drops the messages it was trying to send.
>
> As a test, try setting retries to something high, by doing
> "--producer-property retries=<somebignumber>"
>
> See the description of "retries" at
> http://kafka.apache.org/documentation/#producerconfigs
>
> -James
>
> > Best regards,
> >
> > On Thu, Jan 5, 2017 at 10:23 AM, Protoss Hu <[email protected]> wrote:
> >
> >> You mean the messages were lost on the way to the broker, before the
> >> broker actually received them?
> >>
> >> Protoss Hu
> >> Blog: http://hbprotoss.github.io/
> >> Weibo: http://weibo.com/hbprotoss
> >>
> >> On Jan 5, 2017, at 4:53 PM +0800, James Cheng <[email protected]> wrote:
> >>
> >>> kafka-console-producer.sh defaults to acks=0, which means that the
> >>> producer essentially throws messages at the broker and doesn't
> >>> wait/retry to make sure they are properly received.
> >>>
> >>> In the kafka-console-producer.sh usage text:
> >>>
> >>>   --request-required-acks <Integer: request required acks>
> >>>     The required acks of the producer requests (default: 0)
> >>>
> >>> Try re-running your test with "--request-required-acks -1" or
> >>> "--request-required-acks all" (they are equivalent). This will tell the
> >>> broker to wait for messages to be fully saved to all replicas before
> >>> returning an acknowledgement to the producer.
> >>> You can read more about acks in the producer configuration section of
> >>> the Kafka docs:
> >>> http://kafka.apache.org/documentation/#producerconfigs
> >>>
> >>> -James
> >>>
> >>>> On Jan 4, 2017, at 1:25 AM, Hoang Bao Thien <[email protected]> wrote:
> >>>>
> >>>> Hi all,
> >>>>
> >>>> I have a problem with losing messages from Kafka.
> >>>> The situation is as follows: I put a csv file with 286701 rows (size =
> >>>> 110MB) into the Kafka producer with this command:
> >>>>
> >>>>   $ cat test.csv | kafka-console-producer.sh --broker-list localhost:9092 \
> >>>>       --topic MyTopic > /dev/null
> >>>>
> >>>> and then count the number of lines from the Kafka consumer:
> >>>>
> >>>>   kafka-console-consumer.sh --zookeeper localhost:2181 --topic MyTopic \
> >>>>       --from-beginning
> >>>>
> >>>> However, I only get about 260K-270K lines, and this number of received
> >>>> messages changes for each test.
> >>>>
> >>>> My configuration in "config/server.properties" has some minor changes
> >>>> compared to the original file:
> >>>>
> >>>>   log.retention.check.interval.hours=24
> >>>>   log.retention.hours=168
> >>>>   delete.topic.enable=true
> >>>>
> >>>> The remaining configurations keep their default values.
> >>>>
> >>>> Could you please explain why the messages were lost in Kafka? And how
> >>>> can I fix this problem?
> >>>>
> >>>> Thanks a lot.
> >>>>
> >>>> Best regards,
> >>>> Alex
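Putting the advice from this thread together, a sketch of the original test rerun with stronger delivery settings. This assumes a running broker and ZooKeeper on localhost; the topic name, file, and the retries value of 10000 are illustrative, matching the poster's setup rather than recommended production values:

```shell
# Produce with acknowledgements from all in-sync replicas (-1 == all) and a
# high retry count, so transient metadata timeouts are retried instead of
# the batch being silently dropped.
cat test.csv | kafka-console-producer.sh \
    --broker-list localhost:9092 \
    --topic MyTopic \
    --request-required-acks -1 \
    --producer-property retries=10000 > /dev/null

# Count what actually landed in the topic; --timeout-ms makes the consumer
# exit once no further messages arrive, so the count can be compared with
# `wc -l test.csv`.
kafka-console-consumer.sh --zookeeper localhost:2181 --topic MyTopic \
    --from-beginning --timeout-ms 10000 | wc -l
```

With acks=-1 and retries raised, the consumer-side line count should match the input file on every run, which is what the original poster confirms at the top of the thread.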
