[ 
https://issues.apache.org/jira/browse/NIFI-14820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Himanshu sahu  Ksolve NiFi Expert reassigned NIFI-14820:
--------------------------------------------------------

    Assignee: Himanshu sahu  Ksolve NiFi Expert

> Add option to support original message ordering in ConsumeKafka
> ---------------------------------------------------------------
>
>                 Key: NIFI-14820
>                 URL: https://issues.apache.org/jira/browse/NIFI-14820
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Alaksiej Ščarbaty
>            Assignee: Himanshu sahu  Ksolve NiFi Expert
>            Priority: Major
>
> When using schema registries, messages with different schemas can land in 
> the same topic, e.g. during a rolling update that changes the schema on the 
> producer, or simply when different message types are written to the same 
> topic.
> In several cases, preserving message ordering (within a single partition) in 
> a pipeline is important.
> For OLTP workloads, messages can share causal dependencies, so they must be 
> processed in a particular order. Some downstream processors may also rely on 
> messages arriving in the exact order in which they were received from Kafka.
> Currently {{ConsumeKafka}} groups records in flow files by their 
> topic-partitions *as well as schemas* 
> ([source|https://github.com/apache/nifi/blob/main/nifi-extension-bundles/nifi-kafka-bundle/nifi-kafka-processors/src/main/java/org/apache/nifi/kafka/processors/consumer/convert/AbstractRecordStreamKafkaMessageConverter.java#L128]).
> As a result of this grouping, messages may be passed downstream out of 
> order.
> The processor should support both _Roll FlowFile_ (new) and _Group Records By 
> Schema_ (existing) strategies, as in [ConsumeKinesis 
> (WIP)|https://github.com/apache/nifi/pull/10053/files#diff-5e86a490a55dd29cce20638adb0f41401095aee8d94a734bf862c02f4ecf7fa8].
>  To preserve backward compatibility, _Group Records By Schema_ should be 
> chosen by default.
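
The difference between the two strategies can be illustrated with a small, self-contained sketch. This is not NiFi code: the `Msg` class, the `OrderingSketch` class, and both method names are hypothetical stand-ins, and each inner list plays the role of one flow file holding record offsets from a single partition. It assumes that _Roll FlowFile_ means starting a new flow file whenever the schema changes, which is how the description above reads.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OrderingSketch {

    /** Hypothetical stand-in for a Kafka record read from one partition. */
    static final class Msg {
        final String schema;
        final long offset;

        Msg(String schema, long offset) {
            this.schema = schema;
            this.offset = offset;
        }
    }

    /** "Group Records By Schema": one group per schema, so offsets may interleave across groups. */
    static List<List<Long>> groupBySchema(List<Msg> batch) {
        Map<String, List<Long>> groups = new LinkedHashMap<>();
        for (Msg m : batch) {
            groups.computeIfAbsent(m.schema, k -> new ArrayList<>()).add(m.offset);
        }
        return new ArrayList<>(groups.values());
    }

    /** "Roll FlowFile" (assumed semantics): start a new group whenever the schema changes. */
    static List<List<Long>> rollOnSchemaChange(List<Msg> batch) {
        List<List<Long>> groups = new ArrayList<>();
        String current = null;
        for (Msg m : batch) {
            if (groups.isEmpty() || !m.schema.equals(current)) {
                groups.add(new ArrayList<>());
                current = m.schema;
            }
            groups.get(groups.size() - 1).add(m.offset);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Msg> batch = List.of(new Msg("v1", 0), new Msg("v2", 1), new Msg("v1", 2));
        System.out.println(groupBySchema(batch));      // [[0, 2], [1]]
        System.out.println(rollOnSchemaChange(batch)); // [[0], [1], [2]]
    }
}
```

With grouping by schema, offset 2 ends up in the first group ahead of offset 1, so a downstream consumer reading the groups in sequence sees 0, 2, 1. Rolling on each schema change yields more, smaller groups, but reading them in sequence reproduces the original partition order.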



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
