Re: [DISCUSS] Release Pulsar 2.10.4
+1 There will be ~70 commits compared to 2.10.3 which I think it's a good amount of changes. Thanks, Nicolò Boschi Il giorno gio 2 feb 2023 alle ore 07:53 Haiting Jiang < jianghait...@gmail.com> ha scritto: > +1, It's about 3 months since the discussion of the 2.10.3 release. > > Haiting > > On Wed, Feb 1, 2023 at 11:26 AM Xiangying Meng > wrote: > > > > Hello, Pulsar community: > > > > I'd like to propose releasing Apache Pulsar 2.10.4. It's been about one > > month since 2.10.3 was released. > > > > There are 45 PRs [0] needed to cherry-pick in branch-2.10. I will > > cherry-pick these PRs for branch-2.10. Exclude some PRs that merge > directly > > into branch-2.10. > > > > There are 21 PRs [1] opened. I'll follow up on each of those PRs to see > if > > they will be completed soon or will need to be pushed to 2.10.4 > > > > If you have any important fixes or any questions, please reply to this > > email, and we will evaluate whether to include them in 2.10.4 > > > > Thanks, > > Xiangying > > [0] - > > > https://github.com/apache/pulsar/pulls?q=is%3Amerged+is%3Apr+label%3Arelease%2F2.10.4+-label%3Acherry-picked%2Fbranch-2.10+ > > [1] - > > > https://github.com/apache/pulsar/pulls?q=is%3Aopen+is%3Apr+label%3Arelease%2F2.10.4+-label%3Acherry-picked%2Fbranch-2.10+ >
Re: [DISCUSS] Topic name restriction
Hi, Yong > How about using a blacklist way to restrict it? It's a great idea. Maybe we can define the rules like `keyword` with regular expressions. Best, Mattison On Feb 2, 2023, 10:52 +0800, Yong Zhang , wrote: > Mattison, > > I agree with you about restricting the topic name. > > How about using a blacklist way to restrict it? > > We can have a blacklist on the topic name restriction and make it > configurable. Add the keywords you mentioned in the default configuration. > That would have a more general way to block a topic name creation. > If we have more restrictions on the topic name in the future, this way > can make it easy to fit them without changing any code. > > Thanks, > Yong > > On Thu, 2 Feb 2023 at 07:33, wrote: > > > Hi, All > > > > In the current implementation, pulsar didn't support topic name > > restriction. It's a good chance to discuss it. > > > > I think this discussion aims to identify what types of topic names we all > > need to restrict. > > > > I know three topic names that need to be restricted at the moment. > > > > 1. The `-partition-` keyword. > > 2. Topic name characters validation. > > 3. System topic prefix `__`. > > > > > > Please feel free to leave your comments. > > I will keep this discussion for a week. If there are no more new types of > > restrictions, I will refine the previous PIP-242[0] to explain more details. > > > > If we have other restrictions behind this discussion. We can draft a new > > PIP to add it directly. > > Thanks to Michael's opinion[1], we can expand the PIP-242 scopes to help > > pulsar have a good topic name restriction. > > > > Best, > > Mattison > > > > [0] https://github.com/apache/pulsar/issues/19239 > > [1] https://lists.apache.org/thread/dd1kxhodjvovtb8yyojkk209st4o0ft2 > >
Re: [DISCUSS] Topic name restriction
Hi, Yunze > The topic name character validation is already done by`NamedEntity#checkName` As Michael mentioned, the `NamedEntity#checkName` just checked the tenant and namespace. > But I have a concern that whether we shouldtreat all topics that start with > the long underscore ("__") as systemtopics? Users might have defined their > own "system topics" that startwith the long underscore for special uses.If > yes, how would you like to allow users to access the system topics?In Kafka, > there are a few (only two, IIRC) special topics that onlyallow non-admin > users to read, which means users cannot write to thesetopics or delete them. We can add the warn log when the user creates a topic with the `__` start. To tell users that we will mark this as the system topic prefix keyword. It will probably be banned in the future. Or we can just add the existent system topic name as the keywords and introduce a new complex system topic rule. Best, Mattison On Feb 2, 2023, 11:11 +0800, Yunze Xu , wrote: > The topic name character validation is already done by > `NamedEntity#checkName`. And I agree that the system topic should be > taken carefully as well. But I have a concern that whether we should > treat all topics that start with the long underscore ("__") as system > topics? Users might have defined their own "system topics" that start > with the long underscore for special uses. > > If yes, how would you like to allow users to access the system topics? > In Kafka, there are a few (only two, IIRC) special topics that only > allow non-admin users to read, which means users cannot write to these > topics or delete them. > > Thanks, > Yunze > > On Thu, Feb 2, 2023 at 10:52 AM Yong Zhang wrote: > > > > Mattison, > > > > I agree with you about restricting the topic name. > > > > How about using a blacklist way to restrict it? > > > > We can have a blacklist on the topic name restriction and make it > > configurable. Add the keywords you mentioned in the default configuration. > > That would have a more general way to block a topic name creation. > > If we have more restrictions on the topic name in the future, this way > > can make it easy to fit them without changing any code. > > > > Thanks, > > Yong > > > > On Thu, 2 Feb 2023 at 07:33, wrote: > > > > > > Hi, All > > > > > > > > In the current implementation, pulsar didn't support topic name > > > > restriction. It's a good chance to discuss it. > > > > > > > > I think this discussion aims to identify what types of topic names we > > > > all > > > > need to restrict. > > > > > > > > I know three topic names that need to be restricted at the moment. > > > > > > > > 1. The `-partition-` keyword. > > > > 2. Topic name characters validation. > > > > 3. System topic prefix `__`. > > > > > > > > > > > > Please feel free to leave your comments. > > > > I will keep this discussion for a week. If there are no more new types > > > > of > > > > restrictions, I will refine the previous PIP-242[0] to explain more > > > > details. > > > > > > If we have other restrictions behind this discussion. We can draft > > > > > > a new > > > > PIP to add it directly. > > > > Thanks to Michael's opinion[1], we can expand the PIP-242 scopes to help > > > > pulsar have a good topic name restriction. > > > > > > > > Best, > > > > Mattison > > > > > > > > [0] https://github.com/apache/pulsar/issues/19239 > > > > [1] https://lists.apache.org/thread/dd1kxhodjvovtb8yyojkk209st4o0ft2 > > > >
Re: [DISCUSS] Topic name restriction
Hi, Asaf We are using the regular expression to check the name. "^[-=:.\\w]*$" The \w means [A-Za-z0-9_] , which includes underscores. Best, Mattison On Feb 2, 2023, 23:01 +0800, Asaf Mesika , wrote: > NamedEntity is not allowing underscores - does it make sense? > > > > On Thu, Feb 2, 2023 at 8:35 AM Michael Marshall > wrote: > > > Thanks for starting this thread, Mattison. > > > > > > The topic name character validation is already done by > > > > `NamedEntity#checkName`. > > > > Based on my reading of the code, only the tenant and the namespace > > names are validated using that method. There is a call [0] that looks > > like it validates topic names, but that method is only called by > > tests. > > > > > > But I have a concern that whether we should > > > > treat all topics that start with the long underscore ("__") as system > > > > topics? > > > > This is a reasonable concern, and my primary motivation in proposing > > this change is to make it easier for the broker to handle system > > topics, which often get unique treatment. > > > > I wrote on this topic in several replies on this thread from a year ago > > [1]. > > > > In the context of PIP 242, we're introducing a config to optionally > > enforce strict topic names. As such, we could rely on the config to > > either use the "cheap" check to see if the topic starts with __ or we > > could use the more expensive check to determine if the topic name is > > one of many possible system topic names. Because we want to maintain > > backwards compatibility, we cannot completely get rid of the old > > logic. I like self describing names because they are elegant and > > efficient. > > > > > > If yes, how would you like to allow users to access the system topics? > > > > I proposed some ideas at the end of that thread [1]. We should have a > > clear definition of system topics and how they are or are not accessed by > > users. Ultimately, we continue to create new system topics without > > reserving a designated naming structure and without defining how these > > topics ought to be interacted with, as Yunze points out. Note that any > > system topic we introduce could conflict with existing user topics, so > > proactively reserving a set of names makes it easier for forwards > > compatibility. > > > > Thanks, > > Michael > > > > [0] > > https://github.com/apache/pulsar/blob/b880b1d240ade864181935aa360bfca03a5aa67f/pulsar-common/src/main/java/org/apache/pulsar/common/naming/NamespaceName.java#L159 > > [1] https://lists.apache.org/thread/pj4n4wzm3do8nkc52l7g7obh0sktzm17 > > > > > > On Wed, Feb 1, 2023 at 11:28 PM r...@apache.org > > wrote: > > > > > > > > Hi Mattison: > > > > > > > > Agree with Yong's idea. We can expose `disallowed topic` as a > > configuration > > > > to the user side, and a more flexible way is to expose it as a > > > > namespace-level policy. This can ensure that there is no need to do > > special > > > > processing on customized keywords in the future, and the expected effect > > > > can be achieved by modifying the configuration. > > > > > > > > Think Yunze's concerns are justified for the system topic. Is it okay if > > we > > > > use hard code? Because the identification of any keyword is likely to be > > > > hit by the user. The hard code method is used to filter out system > > > > topics > > > > and not allow users to operate during delete and create operations. > > > > > > > > -- > > > > Thanks > > > > Xiaolong Ran > > > > > > > > Dave Fisher 于2023年2月2日周四 11:26写道: > > > > > > > > > > > > > > > > > > > > > > Sent from my iPhone > > > > > > > > > > > > > > On Feb 1, 2023, at 6:52 PM, Yong Zhang > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > Mattison, > > > > > > > > > > > > > > > > I agree with you about restricting the topic name. > > > > > > > > > > > > > > > > How about using a blacklist way to restrict it? > > > > > > > > > > > > If we do then please call it by another name like “disallowed”. > > > > > > > > > > > > > > > > > > > > > > We can have a blacklist on the topic name restriction and make > > > > > > > > it > > > > > > > > configurable. Add the keywords you mentioned in the default > > > > > > configuration. > > > > > > > > That would have a more general way to block a topic name > > > > > > > > creation. > > > > > > > > If we have more restrictions on the topic name in the future, > > > > > > > > this > > way > > > > > > > > can make it easy to fit them without changing any code. > > > > > > > > > > > > Is there anyone asking for this feature? > > > > > > > > > > > > Best, > > > > > > Dave > > > > > > > > > > > > > > > > Thanks, > > > > > > > > Yong > > > > > > > > > > > > > > > > >> On Thu, 2 Feb 2023 at 07:33, wrote: > > > > > > > > >> > > > > > > > > >> Hi, All > > > > > > > > >> > > > > > > > > >> In the current implementation, pulsar didn't support topic > > > > > > > > >> name > > > > > > > > >> restriction. It's a good chance to discuss it. > > > > > > > > >> > > > > > > > > >> I
Re: [DISCUSS] PIP-186: Introduce two phase deletion protocol based on system topic
> If you persisted the message successfully by the producer and the broker > was terminated before being able to delete the ledger from the metadata > service? If the broker is terminated, the consumer won't ack the message, the message will be re-consume later. > I recommend having the logic to delete the ledger be done in the message > consumer side: > - if the ledger exists in the MD store, delete it. > - send delete command to BK > Both as I understand are idempotent. If for some reason one action was done > but the other wasn't - ZK down, BK down, broker hosting the consumer > terminated - either the message will not be acknowledged, or you negatively > acknowledge. We send a delete command to the broker, it will connect to the corresponding broker which loads the topic. The corresponding broker received the command, then passes the command to ManagedLedger, the ManagedLedger does the actual delete operation. If the consumer does the delete operation, it's a little unreasonable. The ledger manager should be `ManagedLedger`, let it do the delete will be better. > General question: When a ledger is persisted to ZK, where is the ledger > metadata persisted in ZK (more specifically it's metadata, which includes > the component? > Is it also used when building out the key (path) in ZK? https://github.com/apache/bookkeeper/blob/901f76ce4c4f9f771363424dbb60da4d590ad122/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadataImpl.java#L74 It's the content in the zk node. when creating a ledger by bookkeeper, it will create a path like `/ledgers/00//L`, the path value is an instance of LedgerMetadataImpl. The bookkeeper LedgerMetadataImpl, the customMetadata stores the user's metadata. If the ledger is for Ledger, the customMetadata store: `key:application, value:pulsar` `key:component, value:managed-ledger` `key:pulsar/managed-ledger, value: ledgerName` If the ledger is for Cursor, the customMetadata store: `key:application, value:pulsar` `key:component, value:managed-ledger` `key:pulsar/managed-ledger, value: ledgerName` `key:pulsar/cursor, value: curSorName` If the ledger is for schema, the customMetadata store: `key:application, value:pulsar` `key:component, value:schema` `key:pulsar/schemaId, value: schemaId` So when we get the ledger metadata from bookkeeper, we can get the ledger source. > Isn't the type saved together with the ledger in ZK? We need to differ it, the same ledger may store both on the bk side and the offload side. If a ledger want to delete the bk data and the offload data, it should publishes two message to the system topic. The broker needs it to determine whether to delete offload or bk. > It's for the offloaded ledger, when we want to delete the offload ledger, > > we need offloadedContextUuid, here we can simplify it to offloadContextUuid. > Sounds much better. Maybe offloadedLedgerUUID? (why context?) Agree. > > Are you encoding all that extra info besides the ledger ID and its source > to avoid reading it again from ZK when deleting it? No, only encoding the data which is useful for deletion. > > It's for extended. > > > Can't really understand from that short sentence what you mean. Can you > please elaborate? Maybe we can delete it, I just want to didn't change the class when we want to add new property. Put the new property as key-value to extend. > > > In https://github.com/apache/pulsar/issues/16569. The first step section > > and second step section process flow picture are detailed. > > > I'm sorry but you didn't answer all the questions I wrote. I'll paste them > here: > Can you explain the starting point? How does deletion work in general? > > When? What happens? ... I understand there are time based triggers, and > > sometimes used based triggers. They are somehow marked in metadata. > In ManagedLedgerImpl, the method trimConsumedLedgersInBackground will trigger the delete operation. It will find the slowest cursor read position in the topic, and find the ledger which is before the slowest cursor read position. Then check the ledgers if `isLedgerRetentionOverSizeQuota` or `hasLedgerRetentionExpired`. If so, we think the ledger should be delete, append it to deleteLedgerList, we also check if the ledger `isOffloadedNeedsDelete`, if so, append it to deleteOffloadLedgerList. Then iterate `deleteLedgerList` and `deleteOffloadLedgerList`, build `PendingDeleteLedgerInfo` , and send it to systemtopic. If send succeed, I will remove the ledger from the ledgerList, then persist the ledgerList. If send failed, didn't do anything. Example: there are ledger [1,2,3,4,5], and we want to delete 1,2,3. And we send 1,3 to the system topic succeed, send 2 failed. We remove 1,3. And persist [2,4,5] to the zk metadata store. There are some cases to trigger it. 1. A cursor be removed. 2. Close the current ledger and create a new ledger. 3. Consumer ack the message, the slowest cursor move forward. 4. User trigger t
Re: [DISCUSS] PIP-186: Introduce two phase deletion protocol based on system topic
Heesung, thanks. That's a good point for multi steps works. But I'm afraid it will increase the zk pressure and it also needs to handle some corner cases. I prefer to use a system topic to handle it. On 2023/01/31 19:05:34 Heesung Sohn wrote: > On Tue, Jan 31, 2023 at 6:43 AM Yan Zhao wrote: > > > > - Have we considered a metadata store to persist and dedup deletion > > > requests instead of the system topic? Why is the system topic the better > > > choice than a metadata store for this problem? > > If we use the metadata store to store the middle step ledger, we need to > > operate the metadata store after deletion every time. > > > > > > > And we need a trigger to trigger deletion. In the broker, it may have lots > > of topics, the ledger deletion is also much. Using the metadata store to > > store it may be a bottleneck. > > Using pub/sub is easy to implement, and it is a good trigger to trigger > > deletion. > > > > > We can group the multiple resource deletions to a single record in the > metadata store. Also, we can use the metadata store watcher to trigger the > deletion. > > I can see that a similar transactional operation(using metadata store) can > be done like the following. > > Alternatively, > 1. A broker receives a resource(ledger) deletion request from a client. > 2. If the target resource is available, the broker persists a transaction > lock(/transactions/broker-id/delete_ledger/ledger_id) into a metadata > store(state:pending, createdAt:now). > 2.1 If there is no target resource, error > out(ResourceDoesNotExistException). > 2.2 If the lock already exists, error out(OperationInProgressExeception). > 3. The broker returns success to the client. > 4. The transaction watcher(metadata store listener) on the same broker-id > is notified. > 5. The transaction watcher runs the deletion process with an x min timeout. > 5.1 The transaction watcher updates the lock state (state: running, > startedAt: now) > 5.2 Run step 1 ... n (periodically update the lock state and > updatedAt:now every x secs) > 5.3 Delete the lock. > 6. The orphan transaction monitor runs any orphan jobs by retrying step 5. > (If the watcher fails in the middle at step 5, the lock state will be > orphan(state:running and startedAt : > x min)) > 7. The leader monitor(on the leader broker) manages orphan jobs if brokers > are gone or unavailable. > > We can have multiple types of transaction locks(or generic lock) depending > on the operations types. This will reduce the number of locks to > create/update if there are multiple target resources to operate on for a > single transaction. > > - Single ledger deletion: /transactions/broker-id/delete_ledger/ledger_id > - Mult-ledger deletion: /transactions/broker-id/delete_ledgers/ledgers : > {ledger_ids[a,b,c,d], last_deleted_ledger_index:3} > //last_deleted_ledger_index could be periodically updated every min. This > can help to resume the deletion when retrying. > - Topic deletion : /transactions/broker-id/delete_topic/topic_name > > > > > > - How does Pulsar deduplicate deletion requests(error out to users) while > > > the deletion request is running? > > The user only can invoke `truncateTopic`, it's not for a particular > > ledger. The note: "The truncate operation will move all cursors to the end > > of the topic and delete all inactive ledgers." > > It's just a trigger for the user. > > > > What if the admin concurrently requests `truncateTopic` many times for the > same topic while one truncation job is running? How does Pulsar currently > deduplicate these requests? And how does this proposal handle this > situation? > > > > > > > - How do users track async deletion flow status? (do we expose any > > > describeDeletion API to show the deletion status?) > > Why need to track the async deletion flow status? The ledger deletion is > > transparent for pulsarClient. In the broker, deleting a ledger will print > > the log `delete ledger xx successfully `. > > If delete failed, it print the log `delete ledger xxx failed.` > > > > IMHO, relying on logs to check the system state is not a good practice. > Generally, every async user/admin API(long-running async workflow API) > needs the corresponding describe* API to return the current running state. > > > Regards, > Heesung >
Re: [DISCUSS] PIP-186: Introduce two phase deletion protocol based on system topic
> If we don't want to add more pressure on the metadata store from this > feature as a requirement, > I think we can use a system topic as well. Just to confirm, is this the > direction we are agreeing on? The design is too generic, I'm afraid not it is not suitable for ledger deletion. > > Using a system topic, > 1. A broker receives an async operation request on a target resource from a > client. > 2. If the target resource is available, the broker publishes an operation > request message to the system topic. >2.1 If there is no target resource, > error out(ResourceDoesNotExistException). It's more generic, I'm afraid it not suitable for ledger deletion. In ManagedLedgerImpl, the method trimConsumedLedgersInBackground will trigger the delete operation. It will find the slowest cursor read position in the topic, and find the ledger which is before the slowest cursor read position. Then check the ledgers if `isLedgerRetentionOverSizeQuota` or `hasLedgerRetentionExpired`. If so, we think the ledger should be delete, append it to deleteLedgerList, we also check if the ledger `isOffloadedNeedsDelete`, if so, append it to deleteOffloadLedgerList. Then iterate `deleteLedgerList` and `deleteOffloadLedgerList`, build `PendingDeleteLedgerInfo` , and send it to systemtopic. If send succeed, I will remove the ledger from the ledgerList, then persist the ledgerList. If send failed, didn't do anything. Example: there are ledger [1,2,3,4,5], and we want to delete 1,2,3. And we send 1,3 to the system topic succeed, send 2 failed. We remove 1,3. And persist [2,4,5] to the zk metadata store. There are some cases to trigger it. 1. A cursor be removed. 2. Close the current ledger and create a new ledger. 3. Consumer ack the message, the slowest cursor move forward. 4. User trigger truncateTopic by admin. Case 4 is related to it, if the topic id not exists, throw ResourceDoesNotExistException. >2.2 If the target resource's state is pending or running, error > out(InvalidResourceStateException). If we use the system topic to store the pending deleted ledger, we can't know it. Unless we read all the messages to check it, it's not realistic. > 3. The broker returns success(messageId) to the client. ledger deletion may trigger by the broker self. Same above reason. > 4. A system topic consumer(Shared Subscription Mode?) consumes the > operation request message. Yes. > 5. The consumer runs the operation workflow with an x-min timeout. > 5.1 The consumer gets the target resource data from the metadata. > 5.1.1 If there is no targeted resource, complete the operation. Go > to 5.4. > 5.2 The consumer updates the target resource state to running. (state: > running, startedAt: now) > 5.3 Run the workflow step 1 ... n. > 5.4 Acknowledge the message. The consumer received message, the message contains the delete ledger info. Didn't need to get the target resource from metadata. It send delete command to the corresponding broker like produce message. The broker received delete command, it check if the ledger is exists in the ledgerList, if exists, do nothing. If not exists, delete the bk data or offload data. If the topic already be delete, when send delete command, it will throw TopicNotFoundException, then consumer will delete the ledger locally. > > > * I guess we will enable deduplication for the system topic for possibly > duplicated operation requests. That's not essential. The deletion is idempotent. > > * Here, Target Resource means the last metadata to update in the metadata > store. > For example, a workflow updates multiple metadata places like the > following. In this case, the target resource is the 3rd one. > 1. sub-resource-1 > 2. sub-resource-2 > 3. target-resource. e.g. ledger, topic metadata. > > * For Step 5.3, all steps should be idempotent. Also, the consumer > periodically updates the operation state in the target resource's metadata > every x secs. Any retry on a long-running job should resume the operation > closer to where it failed instead of running it from the beginning. > * We could have describe* apis to check any inflight operations state by > querying the target resource's operation state, but this will show some > eventual consistency because of the delay between step 3 and step 5. > > > Thanks, > Heesung Thanks for your good idea, but it may not suitable the ledger deletion. And in this pip, I didn't want to operate the ledger metadata too much. But your idea may be a good solution for other cases. Maybe we can use it to draft a new pip.
Re: [DISCUSS] PIP-186: Introduce two phase deletion protocol based on system topic
There are some cases to trigger it. 1. A cursor be removed. 2. Close the current ledger and create a new ledger. 3. Consumer ack the message, the slowest cursor move forward. 4. User trigger truncateTopic by admin. I see that this pip is for the internal ledger deletion cases(1-3 above), and it appears that such internal deletion operations do not require pre-validation or status checks(and no additional iops on the metadata store). I agree that we need a separate pip for async admin operations. On Fri, Feb 3, 2023 at 9:16 AM Yan Zhao wrote: > > If we don't want to add more pressure on the metadata store from this > > feature as a requirement, > > I think we can use a system topic as well. Just to confirm, is this the > > direction we are agreeing on? > The design is too generic, I'm afraid not it is not suitable for ledger > deletion. > > > > > Using a system topic, > > 1. A broker receives an async operation request on a target resource > from a > > client. > > 2. If the target resource is available, the broker publishes an operation > > request message to the system topic. > >2.1 If there is no target resource, > > error out(ResourceDoesNotExistException). > It's more generic, I'm afraid it not suitable for ledger deletion. > > In ManagedLedgerImpl, the method trimConsumedLedgersInBackground will > trigger the delete operation. It will find the slowest cursor read position > in the topic, and find the ledger which is before the slowest cursor read > position. Then check the ledgers if `isLedgerRetentionOverSizeQuota` or > `hasLedgerRetentionExpired`. If so, we think the ledger should be delete, > append it to deleteLedgerList, we also check if the ledger > `isOffloadedNeedsDelete`, if so, append it to deleteOffloadLedgerList. > Then iterate `deleteLedgerList` and `deleteOffloadLedgerList`, build > `PendingDeleteLedgerInfo` , and send it to systemtopic. If send succeed, I > will remove the ledger from the ledgerList, then persist the ledgerList. If > send failed, didn't do anything. > Example: there are ledger [1,2,3,4,5], and we want to delete 1,2,3. And we > send 1,3 to the system topic succeed, send 2 failed. We remove 1,3. And > persist [2,4,5] to the zk metadata store. > > There are some cases to trigger it. > 1. A cursor be removed. > 2. Close the current ledger and create a new ledger. > 3. Consumer ack the message, the slowest cursor move forward. > 4. User trigger truncateTopic by admin. > > Case 4 is related to it, if the topic id not exists, throw > ResourceDoesNotExistException. >2.2 If the target resource's state is pending or running, error > > out(InvalidResourceStateException). > If we use the system topic to store the pending deleted ledger, we can't > know it. Unless we read all the messages to check it, it's not realistic > 3. The broker returns success(messageId) to the client. > ledger deletion may trigger by the broker self. > Same above reason. > > > 4. A system topic consumer(Shared Subscription Mode?) consumes the > > operation request message. > Yes. > > 5. The consumer runs the operation workflow with an x-min timeout. > > 5.1 The consumer gets the target resource data from the metadata. > > 5.1.1 If there is no targeted resource, complete the operation. > Go > > to 5.4. > > 5.2 The consumer updates the target resource state to running. > (state: > > running, startedAt: now) > > 5.3 Run the workflow step 1 ... n. > > 5.4 Acknowledge the message. > The consumer received message, the message contains the delete ledger > info. Didn't need to get the target resource from metadata. It send delete > command to the corresponding broker like produce message. The broker > received delete command, it check if the ledger is exists in the > ledgerList, if exists, do nothing. If not exists, delete the bk data or > offload data. > If the topic already be delete, when send delete command, it will throw > TopicNotFoundException, then consumer will delete the ledger locally. > > > > > > > * I guess we will enable deduplication for the system topic for possibly > > duplicated operation requests. > That's not essential. The deletion is idempotent. > > > > > * Here, Target Resource means the last metadata to update in the metadata > > store. > > For example, a workflow updates multiple metadata places like the > > following. In this case, the target resource is the 3rd one. > > 1. sub-resource-1 > > 2. sub-resource-2 > > 3. target-resource. e.g. ledger, topic metadata. > > > > * For Step 5.3, all steps should be idempotent. Also, the consumer > > periodically updates the operation state in the target resource's > metadata > > every x secs. Any retry on a long-running job should resume the operation > > closer to where it failed instead of running it from the beginning. > > > * We could have describe* apis to check any inflight operations state by > > querying the target resource's operation state, but this will show some > > eventual consistency because
Re: [DISCUSS] PIP-186: Introduce two phase deletion protocol based on system topic
For these internal requesters, 1. A cursor be removed. 2. Close the current ledger and create a new ledger. 3. Consumer ack the message, the slowest cursor move forward. How do we prevent these callers from requesting duplicate requests for the same resources(ledgers) (how do we handle racing conditions between the requesters and consumers? ) It seems like there could be duplicate requests. Are we relying on these methods to dedup messages? 1. the idempotency of the deletion operations 2. enough delay in the requesters' scan cycles 3. enable dedup in the system topic On Fri, Feb 3, 2023 at 10:14 AM Heesung Sohn wrote: > > > There are some cases to trigger it. > 1. A cursor be removed. > 2. Close the current ledger and create a new ledger. > 3. Consumer ack the message, the slowest cursor move forward. > 4. User trigger truncateTopic by admin. > > I see that this pip is for the internal ledger deletion cases(1-3 above), > and it appears that such internal deletion operations do not require > pre-validation or status checks(and no additional iops on the metadata > store). I agree that we need a separate pip for async admin operations. > > > On Fri, Feb 3, 2023 at 9:16 AM Yan Zhao wrote: > >> > If we don't want to add more pressure on the metadata store from this >> > feature as a requirement, >> > I think we can use a system topic as well. Just to confirm, is this the >> > direction we are agreeing on? >> The design is too generic, I'm afraid not it is not suitable for ledger >> deletion. >> >> > >> > Using a system topic, >> > 1. A broker receives an async operation request on a target resource >> from a >> > client. >> > 2. If the target resource is available, the broker publishes an >> operation >> > request message to the system topic. >> >2.1 If there is no target resource, >> > error out(ResourceDoesNotExistException). >> It's more generic, I'm afraid it not suitable for ledger deletion. >> >> In ManagedLedgerImpl, the method trimConsumedLedgersInBackground will >> trigger the delete operation. It will find the slowest cursor read position >> in the topic, and find the ledger which is before the slowest cursor read >> position. Then check the ledgers if `isLedgerRetentionOverSizeQuota` or >> `hasLedgerRetentionExpired`. If so, we think the ledger should be delete, >> append it to deleteLedgerList, we also check if the ledger >> `isOffloadedNeedsDelete`, if so, append it to deleteOffloadLedgerList. >> Then iterate `deleteLedgerList` and `deleteOffloadLedgerList`, build >> `PendingDeleteLedgerInfo` , and send it to systemtopic. If send succeed, I >> will remove the ledger from the ledgerList, then persist the ledgerList. If >> send failed, didn't do anything. >> Example: there are ledger [1,2,3,4,5], and we want to delete 1,2,3. And >> we send 1,3 to the system topic succeed, send 2 failed. We remove 1,3. And >> persist [2,4,5] to the zk metadata store. >> >> There are some cases to trigger it. >> 1. A cursor be removed. >> 2. Close the current ledger and create a new ledger. >> 3. Consumer ack the message, the slowest cursor move forward. >> 4. User trigger truncateTopic by admin. >> >> Case 4 is related to it, if the topic id not exists, throw >> ResourceDoesNotExistException. > > >2.2 If the target resource's state is pending or running, error >> > out(InvalidResourceStateException). >> If we use the system topic to store the pending deleted ledger, we can't >> know it. Unless we read all the messages to check it, it's not realistic > > > 3. The broker returns success(messageId) to the client. >> ledger deletion may trigger by the broker self. >> Same above reason. >> >> > 4. A system topic consumer(Shared Subscription Mode?) consumes the >> > operation request message. >> Yes. >> > 5. The consumer runs the operation workflow with an x-min timeout. >> > 5.1 The consumer gets the target resource data from the metadata. >> > 5.1.1 If there is no targeted resource, complete the operation. >> Go >> > to 5.4. >> > 5.2 The consumer updates the target resource state to running. >> (state: >> > running, startedAt: now) >> > 5.3 Run the workflow step 1 ... n. >> > 5.4 Acknowledge the message. >> The consumer received message, the message contains the delete ledger >> info. Didn't need to get the target resource from metadata. It send delete >> command to the corresponding broker like produce message. The broker >> received delete command, it check if the ledger is exists in the >> ledgerList, if exists, do nothing. If not exists, delete the bk data or >> offload data. >> If the topic already be delete, when send delete command, it will throw >> TopicNotFoundException, then consumer will delete the ledger locally. >> >> > >> > >> > * I guess we will enable deduplication for the system topic for possibly >> > duplicated operation requests. >> That's not essential. The deletion is idempotent. >> >> > >> > * Here, Target Resource means the last metadata to update in the
Re: [VOTE] Reactive Java client for Apache Pulsar 0.2.0 Candidate 2
+1 (binding) - validated 1 signature - validated 1 checksum - validated source equality from git using diff to observe the following differences: $ diff -r pulsar-client-reactive-0.2.0 pulsar-client-reactive-0.2.0-candidate-2-source/ Only in pulsar-client-reactive-0.2.0-candidate-2-source/: .asf.yaml Only in pulsar-client-reactive-0.2.0-candidate-2-source/: .git Only in pulsar-client-reactive-0.2.0-candidate-2-source/: .gitattributes Only in pulsar-client-reactive-0.2.0-candidate-2-source/: .github Only in pulsar-client-reactive-0.2.0-candidate-2-source/: .gitignore Only in pulsar-client-reactive-0.2.0-candidate-2-source/gradle: wrapper Only in pulsar-client-reactive-0.2.0-candidate-2-source/: gradlew Only in pulsar-client-reactive-0.2.0-candidate-2-source/: gradlew.bat - Ran ./gradlew clean build on source code tag, which is equivalent to code in source release and includes a rat check. Thanks, Michael On Thu, Feb 2, 2023 at 10:28 AM Dave Fisher wrote: > > +1 (binding) > > - validated signatures and checksum > - RAT check passes. > > Thanks Christophe! > > Best, > Dave > > > On Feb 2, 2023, at 8:00 AM, Enrico Olivelli wrote: > > > > +1 (binding) > > - built from sources on jdk11 on Mac M1 > > - all tests are passing > > - gradle check passes (it runs RAT) > > - verified signatures and checksum > > > > > > Thank you Christophe for running the release > > > > Enrico > > > > Il giorno gio 2 feb 2023 alle ore 00:04 Christophe Bornet > > ha scritto: > >> > >> This is the release candidate 2 for the Reactive Java client for Apache > >> Pulsar, version 0.2.0. > >> > >> *** Please download, test and vote on this release. This vote will stay > >> open for at least 72 hours *** > >> > >> Note that we are voting upon the source (tag). Binaries in the Maven > >> repository are provided for convenience. > >> > >> Source package: > >> https://dist.apache.org/repos/dist/dev/pulsar/pulsar-client-reactive-0.2.0-candidate-2/ > >> > >> SHA-512 checksums: > >> 1498a63dd1021108baab61baae8f6f494530118a53c92bc316320878d5850de61313219255010885a9a04762a54077865a2724669f0294ec950db5b5460bedaa > >> pulsar-client-reactive-0.2.0-src.tar.gz > >> > >> Maven staging repo: > >> https://repository.apache.org/content/repositories/orgapachepulsar-1206/ > >> > >> The tag to be voted upon: > >> v0.2.0-candidate-2 (8c676db0f70604e1abb0978f966c0916d9ea3aa9) > >> https://github.com/apache/pulsar-client-reactive/releases/tag/v0.2.0-candidate-2 > >> > >> Please download the source package, and follow detailed instructions for > >> pulsar-client-reactive release validation at > >> https://github.com/apache/pulsar-client-reactive/wiki/Release-process#release-validation > >> . > >> > >> Best regards > >> > >> Christophe >