[ 
https://issues.apache.org/jira/browse/FLINK-30413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufan Sheng updated FLINK-30413:
--------------------------------
    Summary: Drop Shared and Key_Shared subscription support in Pulsar 
connector  (was: Drop Share and Key_Shared subscription support in Pulsar 
connector)

> Drop Shared and Key_Shared subscription support in Pulsar connector
> -------------------------------------------------------------------
>
>                 Key: FLINK-30413
>                 URL: https://issues.apache.org/jira/browse/FLINK-30413
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / Pulsar
>    Affects Versions: 1.17.0
>            Reporter: Yufan Sheng
>            Assignee: Yufan Sheng
>            Priority: Critical
>             Fix For: pulsar-4.0.0
>
>
> A lot of Pulsar connector test unstable issues are related to {{Shared}} and 
> {{Key_Shared}} subscription. Because this two subscription is designed to 
> consume the records in an unordered way. And we can support multiple 
> consumers in same topic partition. But this feature lead to some drawbacks in 
> connector.
> 1. Performance
> Flink is a true stream processor with high correctness support. But support 
> multiple consumer will require higher correctness which depends on Pulsar 
> transaction. But the internal implementation of Pulsar transaction on source 
> is record the message one by one and stores all the pending ack status in 
> client side. Which is slow and memory inefficient.
> This means that we can only use {{Shared}} and {{Key_Shared}} on Flink with 
> low throughput. This against our intention to support these two subscription. 
> Because adding multiple consumer to same partition can increase the consuming 
> speed.
> 2. Unstable
> Pulsar transaction acknowledge the messages one by one in an internal 
> Pulsar's topic. But it's not stable enough to get it works. A lot of pending 
> issues in Flink JIRA are related to Pulsar transaction and we don't have any 
> workaround.
> 3. Complex
> Support {{Shared}} and {{Key_Shared}} subscription make the connector's code 
> more complex than we expect. We have to make every part of code into ordered 
> and unordered way. Which is hard to understand for the maintainer.
> 4. Necessary
> The current implementation on {{Shared}} and {{Key_Shared}} is completely 
> unusable to use in Production environment. For the user, this function is not 
> necessary. Because there is no bottleneck in consuming data from Pulsar, the 
> bottleneck is in processing the data, which we can achieve by increasing the 
> parallelism of the processing operator.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to