liutang123 opened a new pull request, #64619:
URL: https://github.com/apache/doris/pull/64619

   Bug
   ---
   On the meta-service, start_compaction_job only rejected a new job when its 
type strictly equalled an in-flight job's type. This left two races:
   
   1. EMPTY_CUMULATIVE was treated as a different type from CUMULATIVE. While a 
real CUMULATIVE [v_lo, v_hi] was still running, an EMPTY_CUMULATIVE could be 
accepted and committed, advancing cumulative_point past v_hi. A subsequent BASE 
compaction could then pull rowsets in [v_lo, v_hi] as input and race with the 
in-flight CUMULATIVE on the same rowsets.
   2. With check_input_versions_range=true, BASE and CUMULATIVE were never 
cross-checked against each other, so overlapping input ranges across the two 
types could be accepted concurrently.
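
To make the two races concrete, here is a minimal sketch of a strict type-equality admission rule. It is not the actual start_compaction_job code: `JobType`, `Job`, and `old_rule_rejects` are hypothetical names, and the sketch ignores the check_input_versions_range mode for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for the meta-service job types.
enum class JobType { BASE, CUMULATIVE, EMPTY_CUMULATIVE, FULL };

struct Job {
    JobType type;
    std::int64_t lo;  // first input version (inclusive)
    std::int64_t hi;  // last input version (inclusive)
};

// Assumed pre-fix rule: reject the incoming job only when some in-flight
// job has exactly the same type. Version ranges are never consulted, and
// EMPTY_CUMULATIVE is a different enum value from CUMULATIVE.
bool old_rule_rejects(const Job& incoming, const std::vector<Job>& active) {
    for (const Job& a : active) {
        if (a.type == incoming.type) return true;
    }
    return false;
}
```

With a CUMULATIVE [5, 10] in flight, this rule accepts both an EMPTY_CUMULATIVE and a BASE [0, 10], which correspond to races 1 and 2 above.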
   
   Fix
   ---
   * Normalize EMPTY_CUMULATIVE to CUMULATIVE for conflict detection so they 
belong to the same conflict family.
   * Extend the version-range conflict check to the whole rowset compaction 
family (BASE / CUMULATIVE / EMPTY_CUMULATIVE / FULL) instead of same-type only. 
Non-overlapping ranges across types are still allowed.
   * Keep version_in_compaction notification scoped to the same family so BE 
retry semantics are unchanged.
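
The first two bullets can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `JobType`, `Job`, `normalize`, `overlaps`, and `conflicts` are hypothetical names, the split between the two modes is assumed from the description, and STOP_TOKEN, idempotent same-id retries, and the version_in_compaction notification are left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins for the meta-service job types.
enum class JobType { BASE, CUMULATIVE, EMPTY_CUMULATIVE, FULL };

struct Job {
    JobType type;
    std::int64_t lo;  // first input version (inclusive)
    std::int64_t hi;  // last input version (inclusive)
};

// EMPTY_CUMULATIVE joins the CUMULATIVE conflict family.
constexpr JobType normalize(JobType t) {
    return t == JobType::EMPTY_CUMULATIVE ? JobType::CUMULATIVE : t;
}

// Two inclusive version ranges overlap when neither ends before the
// other begins.
constexpr bool overlaps(const Job& a, const Job& b) {
    return a.lo <= b.hi && b.lo <= a.hi;
}

// Returns true when the incoming job should get JOB_TABLET_BUSY.
// Assumption: with range checking on, any pair in the rowset compaction
// family conflicts exactly when the input ranges overlap; with it off,
// jobs conflict when their normalized types match.
constexpr bool conflicts(const Job& incoming, const Job& active,
                         bool check_input_versions_range) {
    if (check_input_versions_range) {
        return overlaps(incoming, active);
    }
    return normalize(incoming.type) == normalize(active.type);
}
```

Under these assumptions, `conflicts(Job{JobType::BASE, 0, 7}, Job{JobType::CUMULATIVE, 5, 10}, true)` is true, while the disjoint BASE [0, 4] against the same CUMULATIVE is still accepted.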
   
   Behaviour matrix (new -> active, OK = accept, BUSY = JOB_TABLET_BUSY)
   ---------------------------------------------------------------------
                            before            after
   EMPTY_CUMU vs CUMU       OK   (race)       BUSY
   CUMU       vs EMPTY_CUMU OK                BUSY
   BASE  vs CUMU  overlap   OK   (race)       BUSY
   CUMU  vs BASE  overlap   OK   (race)       BUSY
   BASE  vs CUMU  disjoint  OK                OK   (unchanged)
   same-type / FULL / STOP_TOKEN / idempotent same-id : unchanged
   
   Tests
   -----
   * EmptyCumulativeBlockedByCumulativeTest
   * BaseCumulativeCrossTypeConflictTest
   
   ### What problem does this PR solve?
   
   Issue Number: close #xxx
   
   Related PR: #xxx
   
   Problem Summary:
   
   ### Release note
   
   None
   
   ### Check List (For Author)
   
   - Test <!-- At least one of them must be included. -->
       - [ ] Regression test
       - [x] Unit Test
       - [ ] Manual test (add detailed scripts or steps below)
       - [ ] No need to test or manual test. Explain why:
           - [ ] This is a refactor/code format and no logic has been changed.
           - [ ] Previous test can cover this change.
           - [ ] No code files have been changed.
           - [ ] Other reason <!-- Add your reason?  -->
   
   - Behavior changed:
       - [ ] No.
       - [ ] Yes. <!-- Explain the behavior change -->
   
   - Does this need documentation?
       - [ ] No.
       - [ ] Yes. <!-- Add document PR link here. eg: 
https://github.com/apache/doris-website/pull/1214 -->
   
   ### Check List (For Reviewer who merge this PR)
   
   - [ ] Confirm the release note
   - [ ] Confirm test cases
   - [ ] Confirm document
   - [ ] Add branch pick label <!-- Add branch pick label that this PR should 
merge into -->
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

