Yukang-Lian opened a new pull request, #65196:
URL: https://github.com/apache/doris/pull/65196
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary:
A cloud schema change job in the WAITING_TXN state uses
`checkFailedPreviousLoadAndAbort()` to abort failed conflict transactions so
that it can make progress. If a conflict transaction has been concurrently
cleaned up, is already aborted or visible, or is otherwise no longer abortable,
`abortTransaction()` can throw `UserException`. Before this patch,
`runWaitingTxnJob()` propagated that exception as `AlterCancelException`, and
`AlterJobV2.run()` cancelled the schema change job.
This patch makes aborting conflict transactions best-effort for schema change:
an abort failure keeps the job in WAITING_TXN, and the next scheduler round
re-queries the transaction state. The docker regression case now also reports
the actual schema change state and fails immediately on CANCELLED instead of
falling back to opaque `assertEquals(1,2)` calls.
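The best-effort behavior described above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `BestEffortAbortSketch`, `TxnAborter`, and `runWaitingTxnRound` are hypothetical names, `UserException` is a local stand-in for the Doris class, and the state transition is simplified to "stay in WAITING_TXN unless every abort succeeded".

```java
import java.util.List;

public class BestEffortAbortSketch {
    /** Local stand-in for Doris's UserException (assumption: only the checked-exception shape matters here). */
    public static class UserException extends Exception {
        public UserException(String msg) {
            super(msg);
        }
    }

    /** Simplified subset of schema change job states. */
    public enum JobState { WAITING_TXN, RUNNING, CANCELLED }

    /** Hypothetical abstraction over the transaction manager's abort call. */
    public interface TxnAborter {
        void abortTransaction(long txnId) throws UserException;
    }

    /**
     * One scheduler round for a job in WAITING_TXN.
     * An abort failure is swallowed instead of being rethrown, so the job is
     * never cancelled here; it stays in WAITING_TXN and is retried next round.
     */
    public static JobState runWaitingTxnRound(List<Long> failedConflictTxnIds, TxnAborter aborter) {
        boolean allAborted = true;
        for (long txnId : failedConflictTxnIds) {
            try {
                aborter.abortTransaction(txnId);
            } catch (UserException e) {
                // Best-effort: the transaction may already be cleaned up,
                // aborted, or visible. Remember the failure and keep going.
                allAborted = false;
            }
        }
        return allAborted ? JobState.RUNNING : JobState.WAITING_TXN;
    }
}
```

In this sketch, a caller would invoke `runWaitingTxnRound` once per scheduler round and only advance the job when it returns `RUNNING`; a `WAITING_TXN` result simply means "try again after re-querying transaction state".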
### Release note
None
### Check List (For Author)
- Test: Unit Test
- `./run-fe-ut.sh --run org.apache.doris.alter.CloudIndexTest#testSchemaChangeWaitsWhenConflictTxnAbortFails`
- `./run-fe-ut.sh --run org.apache.doris.alter.CloudIndexTest`
- `git diff --check -- fe/fe-core/src/main/java/org/apache/doris/alter/SchemaChangeJobV2.java fe/fe-core/src/test/java/org/apache/doris/alter/CloudIndexTest.java regression-test/suites/schema_change_p0/test_abort_txn_by_fe.groovy`
- Behavior changed: No
- Does this need documentation: No
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]