Hi Niclas,

Flink's Kafka consumer does not apply any deduplication; AFAIK, no such "feature" is implemented. Do you produce into the topic that you want to read, or is the data in the topic static? If you do not produce into the topic while the consuming application is running, this might be an issue with the start position of the consumer [1].
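If the consumer resumes from committed group offsets on restart, it will skip everything it has already read, which can look like messages being "swallowed". Here is a minimal sketch of how to control the start position, assuming Flink 1.4, a topic named "events", and plain String records (the topic, broker, and group id are placeholders, adjust to your setup):

    import java.util.Properties;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;

    public class ReadAllFromKafka {
      public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.setProperty("group.id", "my-consumer-group");       // placeholder group id

        FlinkKafkaConsumer011<String> consumer =
            new FlinkKafkaConsumer011<>("events", new SimpleStringSchema(), props);

        // Re-read the topic from the beginning instead of resuming from the
        // committed group offsets. Alternatives: setStartFromLatest(), or the
        // default setStartFromGroupOffsets().
        consumer.setStartFromEarliest();

        DataStream<String> stream = env.addSource(consumer);
        stream.print();

        env.execute("read-all-from-kafka");
      }
    }

Note that when the job is restored from a checkpoint or savepoint, the offsets stored there take precedence over these start position settings.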
Best, Fabian

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/connectors/kafka.html#kafka-consumers-start-position-configuration

2018-02-18 8:14 GMT+01:00 Niclas Hedhman <nic...@apache.org>:
> Hi,
> I am pretty new to Flink, I like what I see, and I have started to build
> my first application using it.
> I must be missing something very fundamental. I have a
> FlinkKafkaConsumer011, followed by a handful of filter, map, and flatMap
> functions, terminated with the standard CassandraSink. I have try..catch
> on all my own maps/filters. The first message in the queue is processed
> after start-up, but any additional messages are ignored, i.e. they never
> reach the first map(). Any additional messages are swallowed (i.e.
> consumed but not forwarded).
>
> I suspect that some type of de-duplication is going on, related to the
> (test) producer of these messages. The producer provides different values
> on each message, but there is no "key" being passed to the KafkaProducer.
>
> Is that required? And if so, why? Can I tell Flink or Flink's
> KafkaConsumer to ingest all messages, and not try to de-duplicate them?
>
> Thanks
>
> --
> Niclas Hedhman, Software Developer
> http://zest.apache.org - New Energy for Java