Btw, it appears the missing msgs are at the end of the CSV file, so maybe
the producer doesn't properly flush when it gets EOF on stdin?
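
To make that guess concrete, here is a toy model of the failure mode I have in mind. It is only a sketch and not Kafka's actual code: ToyBatchingProducer, run and the batch size of 100 are made-up names and values, and the 330 records just mirror the LICENSE line count from the original report. It shows how a sender that ships records in batches loses the tail of its input if it exits at EOF without flushing the last partial batch.

```python
class ToyBatchingProducer:
    """Hypothetical stand-in for a batching producer (not Kafka's implementation)."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.pending = []    # records accepted by send() but not yet shipped
        self.delivered = []  # records that reached the pretend broker

    def send(self, record):
        # Buffer the record; ship only once a full batch has accumulated.
        self.pending.append(record)
        if len(self.pending) >= self.batch_size:
            self._ship()

    def _ship(self):
        self.delivered.extend(self.pending)
        self.pending = []

    def close(self, flush=True):
        # Without a flush, whatever is still buffered is silently dropped.
        if flush:
            self._ship()
        self.pending = []


def run(lines, batch_size, flush_on_eof):
    """Feed every line to the toy producer, then close it as if stdin hit EOF."""
    producer = ToyBatchingProducer(batch_size)
    for line in lines:
        producer.send(line)
    producer.close(flush=flush_on_eof)
    return producer.delivered


lines = [f"record-{i}" for i in range(330)]
print(len(run(lines, batch_size=100, flush_on_eof=False)))  # 300: last 30 records lost
print(len(run(lines, batch_size=100, flush_on_eof=True)))   # 330: nothing lost
```

In this model the lost records are always the final ones, which matches what I see with the CSV file. It does not explain why the count varies between runs.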

On Wed, Jun 15, 2016 at 11:21 AM, Dean Arnold <renodino...@gmail.com> wrote:

> I'm seeing similar issues with 0.9.0.1.
>
> I'm feeding CSV records (65536 total, 1 record per msg) to the console
> producer; they are consumed via a sink connector (using connect-standalone
> and a single partition). The sink occasionally reports flushing fewer than
> 65536 msgs via the sink flush(). Restarting the sink connector with a
> forced reset to offset 0 (i.e., replaying all the msgs on the topic) shows
> that the messages are still missing (i.e., no gaps in offsets), so I assume
> the msgs must be lost by the producer?
>
>
> On Wed, Jun 15, 2016 at 1:29 AM, Radu Radutiu <rradu...@gmail.com> wrote:
>
>> Hi,
>>
>> I was following the Quickstart guide and noticed that ConsoleProducer
>> does not publish all messages. The number of messages published differs
>> from one run to another, and the problem happens mostly on a freshly
>> started broker.
>> Version: kafka_2.11-0.10.0.0
>> OS: Linux (Ubuntu 14.04, CentOS 7.2)
>> JDK: java version "1.7.0_101"
>> OpenJDK Runtime Environment (IcedTea 2.6.6) (7u101-2.6.6-0ubuntu0.14.04.1),
>> openjdk version "1.8.0_91"
>> OpenJDK Runtime Environment (build 1.8.0_91-b14)
>>
>>
>> How to reproduce:
>> - start zookeeper:
>> ~/work/kafka_2.11-0.10.0.0$ bin/zookeeper-server-start.sh config/zookeeper.properties &
>>
>> - start kafka:
>> ~/work/kafka_2.11-0.10.0.0$ bin/kafka-server-start.sh config/server.properties &
>>
>> - start console consumer (topic test1 is already created):
>> ~/work/kafka_2.11-0.10.0.0$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 -topic test1 --zookeeper localhost:2181
>>
>> - in another terminal, start console producer with the LICENSE file in the
>> kafka directory as input:
>> ~/work/kafka_2.11-0.10.0.0$ bin/kafka-console-producer.sh --topic test1 --broker-list localhost:9092 < LICENSE
>>
>> For the first few runs of the console producer, the last line in the
>> console consumer output is not the last line of the LICENSE file. If I use
>> the --old-producer parameter, all the lines in the LICENSE file are
>> published (and appear in the console consumer output). Different runs of
>> the console producer with the same input file publish a different number
>> of lines (sometimes all, sometimes only 182 lines out of 330). I've noticed
>> that if the kafka server was started a long time ago, the console producer
>> publishes all lines.
>> I have checked the kafka binary log file (in my case
>> /tmp/kafka-logs/test1-0/00000000000000000000.log) and confirmed that the
>> missing messages were never written to it (the console consumer receives
>> every message that is in the log).
>>
>> Is there an explanation for this behavior?
>>
>> Best regards,
>> Radu
>>
>
>
