Hi Yufan, I'm also looping in the Dev mailing list for awareness.
From my perspective, it makes more sense to drop the Shared and Key_Shared subscriptions. If they're currently unstable or unusable and there are alternatives that you can leverage with Flink (like increased parallelism or the keyBy() functionality), I would drop them. If improvements are made in the future, they can always be re-implemented, especially with the connector being externalized and not tied to an actual Flink release.

Best regards,

Martijn

On Wed, Dec 14, 2022 at 2:03 PM 盛宇帆 <syh...@gmail.com> wrote:

> Hi Zili,
>
> Thanks for picking up this discussion. Here is my answer:
>
> I agree with your first point. If the problems are related to
> Pulsar, they should be reported to the Pulsar repo. But these flaky
> tests only occur on the Shared or Key_Shared subscription with
> transactions, and I can't reproduce them on my local machine. I don't
> know how to submit issues for them.
>
> The performance issue is due to the internal implementation of
> Pulsar transactions. Pulsar has to log the ack status in an individual
> topic, which makes performance extremely slow at high throughput.
>
> The only reason I can recall for supporting the Shared subscription
> in the first place is that we can have more consumers on the same
> partition to increase the processing speed. But Flink can increase
> performance by increasing the parallelism of the backend operators.
> The bottleneck isn't consuming messages from Pulsar with an Exclusive
> subscription. This means that we don't have to support the Shared
> subscription for performance.
>
> The Key_Shared subscription is only used to distribute messages by
> key hash across the different consumers in Pulsar, which can be
> achieved by using Flink's keyBy(). If we want to consume a subset of
> the key hashes, we have to use an Exclusive subscription with key
> ranges. This makes support for Key_Shared meaningless.
>
> So I prefer to remove them to get better support for Pulsar in Flink.
>
> Best,
> Yufan
>
> On Wed, Dec 14, 2022 at 8:49 PM Zili Chen <ti...@apache.org> wrote:
> >
> > Hi Yufan,
> >
> > Thanks for starting this discussion. My two cents:
> >
> > 1. It can help the upstream fix the transaction issues if we also
> > submit the instability and performance issues to the Pulsar repo.
> > 2. Could you elaborate on whether and (if so) why we should drop the
> > Shared and Key_Shared subscription support on Flink?
> >
> > Best,
> > tison.
> >
> > On 2022/12/14 10:00:56 盛宇帆 wrote:
> > > Hi, I'm the maintainer of flink-connector-pulsar. I would like to
> > > start a survey on a function change proposal in
> > > flink-connector-pulsar.
> > >
> > > I have created a ticket
> > > <https://issues.apache.org/jira/browse/FLINK-30413> on JIRA and pasted
> > > its description here:
> > >
> > > Many of the unstable-test issues in the Pulsar connector are related
> > > to the Shared and Key_Shared subscriptions. These two subscription
> > > types are designed to consume records in an unordered way, and they
> > > allow multiple consumers on the same topic partition. But this
> > > feature leads to some drawbacks in the connector.
> > >
> > > 1. Performance
> > >
> > > Flink is a true stream processor with strong correctness guarantees.
> > > But supporting multiple consumers requires a higher level of
> > > correctness, which depends on Pulsar transactions. The internal
> > > implementation of Pulsar transactions on the source records the
> > > messages one by one and stores all the pending ack status on the
> > > client side, which is slow and memory inefficient.
> > >
> > > This means that we can only use Shared and Key_Shared on Flink with
> > > low throughput. This goes against our intention in supporting these
> > > two subscriptions, which was that adding multiple consumers to the
> > > same partition can increase the consuming speed.
> > >
> > > 2. Unstable
> > >
> > > Pulsar transactions acknowledge the messages one by one in an
> > > internal Pulsar topic. But this is not stable enough to work
> > > reliably. A lot of pending issues in Flink JIRA are related to
> > > Pulsar transactions, and we don't have any workaround.
> > >
> > > 3. Complex
> > >
> > > Supporting the Shared and Key_Shared subscriptions makes the
> > > connector's code more complex than we expected. We have to implement
> > > every part of the code in both an ordered and an unordered way,
> > > which is hard for maintainers to understand.
> > >
> > > 4. Necessary
> > >
> > > The current implementation of Shared and Key_Shared is completely
> > > unusable in a production environment. For the user, this function
> > > is not necessary, because there is no bottleneck in consuming data
> > > from Pulsar. The bottleneck is in processing the data, which we can
> > > address by increasing the parallelism of the processing operator.