Shawyeok commented on PR #25042: URL: https://github.com/apache/pulsar/pull/25042#issuecomment-4020463970
@TakaHiR07 @mattisonchao **This pr would reintroduce a silent, unrecoverable failure mode.** Here is why: The root cause of #25041 is that the update-partition-count operation is not atomic — it can fail mid-way with a persistent side effect. The key difference between the old and new ordering: **Old order (pre-3.0.14, what this PR reverts to):** 1. `tryCreatePartitionsAsync(N)` — writes managed-ledger ZK nodes for new partitions 2. `createSubscriptions(N)` — ⚠️ failure here leaves orphaned ZK nodes 3. Update partition metadata When step 2 fails (e.g. due to bundle not yet assigned, causing 307 redirect exhaustion), ZK nodes for the new partitions already exist but the metadata still shows the old count. These orphaned nodes appear as non-partitioned topics, causing any retry to permanently fail with: > "Already have non partition topic … could cause conflict" **New order (3.0.14+, what this PR reverts):** 1. `updatePartitionedTopicAsync(N)` — update metadata first 2. `tryCreatePartitionsAsync(N)` — idempotent ZK writes 3. `createSubscriptionAsync(...)` — `ConflictException` is tolerated With metadata updated first, any mid-operation failure leaves the system in a **self-consistent, retryable state** — the metadata already reflects N partitions, so no orphaned nodes exist outside the valid range. Although the operator can still retry if the endpoint returns an exception, the old ordering makes retries **permanently broken** rather than transiently failing. This was verified on 2.8.1 clusters in my company. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
