Hi Dave,

Thanks for your support. We have completed the relevant documents:
https://github.com/apache/pulsar-site/pull/712

PIP: https://github.com/apache/pulsar/pull/21114

Please help me take a look when you have time.

Thanks,
zhangheng

---- Replied Message ----
From: Xiangying Meng <xiangy...@apache.org>
Date: 9/26/2023 09:40
To: <dev@pulsar.apache.org>
Subject: Re: [DISSCUSS] PIP-298: Consumer supports specifying consumption isolation level

Hi Dave,

Thanks for your support. I also think this should only be for the master branch.

Thanks,
Xiangying

On Tue, Sep 26, 2023 at 9:34 AM Dave Fisher <wave4d...@comcast.net> wrote:

Hi - OK. I’ll agree, but I think the PIP ought to include documentation. There should also be clear communication about this use case and how to use it.

On Sep 25, 2023, at 6:23 PM, Xiangying Meng <xiangy...@apache.org> wrote:

Hi Dave,

The uncommitted transactions do not impact actual users' bank accounts. Business Processing System E only reads committed transactional messages and operates on users' accounts. It needs exactly-once semantics. Real-time Monitoring System D reads uncommitted transactional messages. It does not need exactly-once semantics. They use different subscriptions and choose different isolation levels. One needs transactions, one does not.

In general, multiple subscriptions of the same topic do not all require transaction guarantees. Some want low latency without the exactly-once guarantee, and some require the exactly-once guarantee. We just provide a new option for different subscriptions. This should not be a breaking change, right?

Not a breaking change, but it does add to the API. It should be discussed whether this PIP is only for master (3.2), or whether it may be cherry-picked to current versions.

Looking forward to your reply.

Thank you,
Dave

Thanks,
Xiangying

On Tue, Sep 26, 2023 at 4:09 AM Dave Fisher <w...@apache.org> wrote:

On Sep 20, 2023, at 12:50 AM, Xiangying Meng <xiangy...@apache.org> wrote:

Hi, all,

Let's consider another example:

**System**: Financial Transaction System

**Operations**: Large volume of deposit and withdrawal operations, a small number of transfer operations.

**Roles**:
- **Client A1**
- **Client A2**
- **User Account B1**
- **User Account B2**
- **Request Topic C**
- **Real-time Monitoring System D**
- **Business Processing System E**

**Client Operations**:
- **Withdrawal**: Client A1 decreases the deposit amount from User Account B1 or B2.
- **Deposit**: Client A1 increases the deposit amount in User Account B1 or B2.
- **Transfer**: Client A2 decreases the deposit amount from User Account B1 and increases it in User Account B2. Or vice versa.

**Real-time Monitoring System D**: Obtains the latest data from Request Topic C as quickly as possible to monitor transaction data and changes in bank reserves in real-time. This is necessary for the timely detection of anomalies and real-time decision-making.

**Business Processing System E**: Reads data from Request Topic C, then actually operates User Accounts B1, B2.

**User Scenario**: Client A1 sends a large number of deposit and withdrawal requests to Request Topic C. Client A2 writes a small number of transfer requests to Request Topic C.

In this case, Business Processing System E needs a read-committed isolation level to ensure operation consistency and Exactly Once semantics.

The real-time monitoring system does not care if a small number of transfer requests are incomplete (dirty data). What it cannot tolerate is a situation where a large number of deposit and withdrawal requests cannot be presented in real time due to a small number of transfer requests (the current situation is that uncommitted transaction messages can block the reading of committed transaction messages).

So you are willing to let uncommitted transactions impact actual users' bank accounts? Are you sure that there is not another way to bypass uncommitted records? Letting uncommitted records through is not exactly-once. Are you ready to rewrite Pulsar’s documentation to explain how normal users can avoid allowing this?

Best,
Dave

In this case, it is necessary to set different isolation levels for different consumers/subscriptions.

Thanks,
Xiangying

On Tue, Sep 19, 2023 at 11:35 PM 杨国栋 <yangguodong1...@gmail.com> wrote:

Hi Dave and Xiangying,

Thanks for all your support. Let me add some background.

Apache Paimon uses a message queue as an external log system, and Paimon's changelog can also be consumed from the message queue. By default, the changelog in the message queue is visible to consumers only after a snapshot. A snapshot has the same life cycle as a message queue transaction. However, users can consume the changelog immediately by reading uncommitted messages, without waiting for the next snapshot. This behavior reduces changelog latency, but it relies on reading uncommitted messages in Kafka or another message queue. So we hope Pulsar can support the Read Uncommitted isolation level.

Putting aside Paimon's application scenarios, let's discuss the Read Uncommitted isolation level itself. Read Uncommitted isolation brings certain security risks, but it also makes messages immediately readable. Reading committed data ensures accuracy, and reading uncommitted data ensures real-time performance (there may be some repeated or dirty messages). Real-time performance is what users need. How to handle dirty messages should be decided by the application side. We can still get complete and accurate data from the Read Committed isolation level.

Sincerely yours.
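
To make the two isolation levels discussed in this thread concrete, here is a minimal sketch in Python. It is a toy model and not Pulsar code: the `Message` class, the `read` function, the constant names, and the rule "a read-committed reader stops at the first message of a still-open transaction" are all assumptions made for illustration, based only on the behavior described above (uncommitted transaction messages blocking the reading of later committed messages).

```python
from dataclasses import dataclass
from typing import List, Optional, Set

# Hypothetical names for the two isolation levels (not the Pulsar API).
READ_COMMITTED = "read_committed"
READ_UNCOMMITTED = "read_uncommitted"


@dataclass(frozen=True)
class Message:
    position: int
    payload: str
    txn_id: Optional[str] = None  # None means a non-transactional message


def read(log: List[Message], open_txns: Set[str], isolation: str) -> List[str]:
    """Return the payloads a subscription can see at the given isolation level."""
    visible = []
    for msg in log:
        # Assumed rule: a read-committed reader cannot pass the first
        # message that belongs to a transaction that is still open.
        if isolation == READ_COMMITTED and msg.txn_id in open_txns:
            break
        # A read-uncommitted reader sees everything, including dirty data.
        visible.append(msg.payload)
    return visible


# Request Topic C: many deposits/withdrawals, one transfer in open txn "t1".
topic_c = [
    Message(0, "deposit B1 +100"),
    Message(1, "transfer B1->B2 50", txn_id="t1"),
    Message(2, "withdraw B2 -20"),
    Message(3, "deposit B1 +10"),
]

# Business Processing System E: blocked behind the open transfer.
system_e = read(topic_c, {"t1"}, READ_COMMITTED)
# Real-time Monitoring System D: sees all requests immediately.
system_d = read(topic_c, {"t1"}, READ_UNCOMMITTED)
# Once "t1" commits, System E catches up.
system_e_after_commit = read(topic_c, set(), READ_COMMITTED)
```

In this model, System E and System D read the same topic through separate subscriptions, and only System D accepts the risk of seeing the transfer before it commits. Aborted transactions are deliberately left out to keep the sketch short.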