Just in case it is a metrics bug, I will add a step to do my own counting in 
the Flink job.

Michael

> On Apr 25, 2018, at 9:52 AM, TechnoMage <mla...@technomage.com> wrote:
> 
> I have another java program reading the topic to monitor the test.  It 
> receives 60,000 records on the “travel” topic, while the kafka consumer only 
> reports 4,138.  That and the incongruity of the source to the maps are what 
> seems very weird.  I have several other topics where the source is built with 
> the exact same code, and where there are multiple maps following, and the 
> numbers all match expectations.
> 
> Michael
> 
>> On Apr 25, 2018, at 12:59 AM, Tzu-Li (Gordon) Tai <tzuli...@apache.org 
>> <mailto:tzuli...@apache.org>> wrote:
>> .
>> Hi,
>> 
>> Just to clarify your observation here:
>> 
>> Is the problem the fact that map operators after the “source: travel” Kafka 
>> topic source do not receive all records from the source?
>> This does seem weird, but as of now I don’t really have ideas yet of how 
>> this could maybe be Flink related.
>> 
>> One other thing to be sure of - have you verified that the outputs of your 
>> test are incorrect?
>> Or are you assuming that it is incorrect based on the weird metric numbers 
>> shown on the web ui?
>> 
>> Cheers,
>> Gordon
>> 
>> On 25 April 2018 at 6:13:07 AM, TechnoMage (mla...@technomage.com 
>> <mailto:mla...@technomage.com>) wrote:
>> 
>>> Still trying to figure out what is happening with this kafka topic.
>>> 
>>> 1) No exceptions in the task manager log or the UI exceptions tab.
>>> 2) Topic partitions being reset to offset 0 confirmed in log.
>>> 3) Other topics in this and other tests show full consumption of messages 
>>> (all JSON format text).
>>> 4) The source shows more records output than are received by 2 of the 3 
>>> following stages.
>>> 
>>> The diagram for the job is below, as is the GUI showing tasks and record 
>>> counts.
>>> 
>>> <38f1e887-8f89-453c-8289-442ae2af1...@hsd1.co.comcast.net 
>>> <mailto:38f1e887-8f89-453c-8289-442ae2af1...@hsd1.co.comcast.net>.>
>>> 
>>>> On Apr 23, 2018, at 11:23 AM, TechnoMage <mla...@technomage.com 
>>>> <mailto:mla...@technomage.com>> wrote:
>>>> 
>>>> I have been using the kafka connector sucessfully for a while now.  But, 
>>>> am getting weird results in one case.
>>>> 
>>>> I have a test that submits 3 streams to kafka topics, and monitors them on 
>>>> a separate process.  The flink job has a source for each topic, and one 
>>>> such is fed to 3 separate map functions that lead to other operators.  
>>>> This topic only shows 6097 out of 30000 published, and the map functions 
>>>> following the source only show a fraction of that as received.  The 
>>>> consumer is configured to start at the begining and in other cases the 
>>>> same code receives all messages published.  The parallelism is 6 if that 
>>>> makes a difference, as is the partitioning on the topics.
>>>> 
>>>> The code for creating the topic is below.
>>>> 
>>>> Any suggestions on why it is missing so many messages would be welcome.
>>>> 
>>>> Michael
>>>> 
>>>>       String topic = a.kafkaTopicName(ets);
>>>>       Properties props = new Properties();
>>>>       props.setProperty("bootstrap.servers", servers);
>>>>       props.setProperty("group.id <http://group.id/>", 
>>>> UUID.randomUUID().toString());
>>>>       props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
>>>>       DataStream<String> ds = consumers.get(a.eventType);
>>>>       if (ds == null) {
>>>>         FlinkKafkaConsumer011<String> cons = new 
>>>> FlinkKafkaConsumer011<String>(
>>>>             topic, new SimpleStringSchema(), props);
>>>>         cons.setStartFromEarliest();
>>>>         ds = env.addSource(cons).name(et.name).rebalance();
>>>>         consumers.put(a.eventType, ds);
>>>>       }

Reply via email to