Hi Neng, Thanks for reminding me of the need to explain the lack of functionality. I have checked the missing functions with the customer and teammates. And we found that the only function we can't support after removing the Shared and Key_Shared subscription support is delayed message delivery https://pulsar.apache.org/docs/2.10.x/concepts-messaging/#delayed-message-delivery.
Because it's based on the Shared subscription. If the end user needs this, they may need to add the message in Flink's state and register an event timer to achieve such ability on Flink. But I don't think we should use delayed message delivery on Flink. In general, we always depend on the watermark on Flink to handle the out of orderliness messages. This is why we have delayed message delivery on Pulsar. Best regards, Yufan On Fri, Dec 16, 2022 at 10:49 AM Neng Lu <nl...@apache.org> wrote: > > Hi Yufan, > > In general, I think it's okay to remove these features. > > But could you elaborate If there will be missing functionality after we > remove these two subscriptions support? > > > On 2022/12/14 13:01:53 盛宇帆 wrote: > > Hi Zili, > > > > Thanks for picking up this discussion. Here is my answer: > > > > I agreed with your first question. If the problems are related to > > Pulsar, it should be redelivered to the Pulsar repo. But these flaky > > tests only occur on the Shared or Key_Shared subscription with the > > transaction and I can’t reproduce it on my local machine. I don’t know > > how to submit issues. > > > > The performance issue is due to the internal implementation of the > > Pulsar transaction. Pulsar has to log the ack status in an individual > > topic which makes the performance extremely slow for large throughput. > > > > The only reason I can recall when I started to support the Shared > > subscription is that we can have more consumers on the same partition > > to increase the processing speed. But Flink can increase the > > performance by increasing the parallelism of the backend operators. > > The bottle neck isn’t the consuming message from Pulsar with exclusive > > subscription. This means that we don’t have to support the Shared > > subscription for performance. > > > > The Key_Shared subscription is only used to distribute the messages by > > its key hash for the different consumers in Pulsar which can be > > achieved by using Flink’s keyBy(). If we want to consume a subset of > > key hash. We have to use an Exclusive subscription with a key ranges. > > This makes the support for Key_Shared meaningless. > > > > So I prefer to remove them to get a better support of Pulsar in Flink. > > > > Best, > > Yufan > > > > On Wed, Dec 14, 2022 at 8:49 PM Zili Chen <ti...@apache.org> wrote: > > > > > > Hi Yufan, > > > > > > Thanks for starting this discussion. My two coins: > > > > > > 1. It can help the upstream to fix the transaction issues by submitting > > > the instability and performance issues to the pulsar repo also. > > > 2. Could you elaborate on whether and (if so) why we should drop the > > > Shared and Key_Share subscription support on Flink? > > > > > > Best, > > > tison. > > > > > > On 2022/12/14 10:00:56 盛宇帆 wrote: > > > > Hi, I'm the maintainer of flink-connector-pulsar. I would like to > > > > start a survey on a function change proposal in > > > > flink-connector-pulsar. > > > > > > > > I have created a ticket > > > > <https://issues.apache.org/jira/browse/FLINK-30413> on JIRA and paste > > > > its description here: > > > > > > > > A lot of Pulsar connector test unstable issues are related to Shared > > > > and Key_Shared subscription. Because this two subscription is designed > > > > to consume the records in an unordered way. And we can support > > > > multiple consumers in same topic partition. But this feature lead to > > > > some drawbacks in connector. > > > > > > > > 1. Performance > > > > > > > > Flink is a true stream processor with high correctness support. But > > > > support multiple consumer will require higher correctness which > > > > depends on Pulsar transaction. But the internal implementation of > > > > Pulsar transaction on source is record the message one by one and > > > > stores all the pending ack status in client side. Which is slow and > > > > memory inefficient. > > > > > > > > This means that we can only use Shared and Key_Shared on Flink with > > > > low throughput. This against our intention to support these two > > > > subscription. Because adding multiple consumer to same partition can > > > > increase the consuming speed. > > > > > > > > 2. Unstable > > > > > > > > Pulsar transaction acknowledge the messages one by one in an internal > > > > Pulsar's topic. But it's not stable enough to get it works. A lot of > > > > pending issues in Flink JIRA are related to Pulsar transaction and we > > > > don't have any workaround. > > > > > > > > 3. Complex > > > > > > > > Support Shared and Key_Shared subscription make the connector's code > > > > more complex than we expect. We have to make every part of code into > > > > ordered and unordered way. Which is hard to understand for the > > > > maintainer. > > > > > > > > 4. Necessary > > > > > > > > The current implementation on Shared and Key_Shared is completely > > > > unusable to use in Production environment. For the user, this function > > > > is not necessary. Because there is no bottleneck in consuming data > > > > from Pulsar, the bottleneck is in processing the data, which we can > > > > achieve by increasing the parallelism of the processing operator. > > > > > >