Hi,
The current version of the Kafka producer provides at-least-once semantics, so duplicates may occur in the stream due to producer retries. The idempotent producer is still under development; KAFKA-4815 tracks implementation progress for KIP-98:
https://issues.apache.org/jira/browse/KAFKA-4815
https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging

When using a streaming cube, Kylin may receive duplicated messages and produce unexpected results. Does anyone have experience dealing with this problem? I think this is more about Kafka itself, but since no idempotent producer is available at the moment, could I get some advice on how to work around it on the Kylin side? Thanks.
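
To make the question concrete, the kind of workaround in question would be dropping repeated messages by a unique ID before they reach the cube. Below is a minimal sketch in Python, not an existing Kylin or Kafka feature. It assumes each message carries a unique `event_id` field set by the application (a hypothetical name), and that duplicates arrive close enough together that a bounded memory of recent IDs is sufficient. `SeenIds` and `dedupe` are illustrative names as well.

```python
from collections import OrderedDict


class SeenIds:
    """Bounded memory of recently seen message IDs (oldest evicted first)."""

    def __init__(self, capacity=100_000):
        self.capacity = capacity
        self._ids = OrderedDict()

    def first_time(self, msg_id):
        """Return True the first time msg_id is seen, False for a repeat."""
        if msg_id in self._ids:
            self._ids.move_to_end(msg_id)  # keep recently repeated IDs alive
            return False
        self._ids[msg_id] = None
        if len(self._ids) > self.capacity:
            self._ids.popitem(last=False)  # evict the oldest ID
        return True


def dedupe(messages, seen, id_field="event_id"):
    """Yield only messages whose ID has not been seen recently."""
    for msg in messages:
        if seen.first_time(msg[id_field]):
            yield msg


# Hypothetical batch; the third message repeats the first, as a retry might.
batch = [
    {"event_id": "a1", "amount": 10},
    {"event_id": "a2", "amount": 5},
    {"event_id": "a1", "amount": 10},
]
unique = list(dedupe(batch, SeenIds()))
```

A filter like this only catches duplicates that fall within the remembered window and within a single process, so it reduces duplicates rather than guaranteeing exactly-once results.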
