cshuo opened a new pull request, #18434:
URL: https://github.com/apache/hudi/pull/18434

   …ootstrap RLI completely
   
   ### Describe the issue this Pull Request addresses
   
   Flink streaming write with record-level index (RLI) bootstrap can get stuck 
in an incomplete recovery state when there are pending instants that need to be 
recommitted after failover or job restart. In that case, the coordinator can 
restore timeline state, but the bootstrap operator may continue running with 
stale in-memory index state and never fully reload the recommitted timeline.
   
   This change makes coordinator recovery explicitly recommit pending instants 
for RLI bootstrap flows and then trigger a global failover so the bootstrap 
operator restarts against the refreshed timeline. That closes the gap between 
timeline recovery and index bootstrap state reloading.
   
   ### Summary and Changelog
   - Move pending-instant recovery logic into `StreamWriteOperatorCoordinator` 
so restored events can be filtered and recommitted during `start()`, 
`resetToCheckpoint(...)`, and `subtaskReset(...)`.
   - Track coordinator lifecycle and whether pending instants were recommitted, 
then trigger a failover for RLI bootstrap so the bootstrap operator reloads 
index state from the recommitted timeline.
   - Add test coverage for RLI bootstrap recovery and failover behavior across `TestWriteMergeOnRead`, `MockStateSnapshotContext`, `StreamWriteFunctionWrapper`, and `TestWriteBase`.
   - Delete the outdated event-buffer wait tests.
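
The recovery flow described above can be sketched roughly as follows. This is a minimal, self-contained model rather than the actual change: `RecoverySketch`, `Timeline`, `FailoverTrigger`, and `Coordinator.recover()` are hypothetical names invented for illustration, and the real `StreamWriteOperatorCoordinator` additionally filters restored events before recommitting. The sketch only shows the ordering: recommit pending instants first, then request a global failover if RLI bootstrap is in use and something was actually recommitted.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative model only; these are not Hudi or Flink classes. */
class RecoverySketch {

  /** Hypothetical stand-in for the timeline's pending and completed instants. */
  static class Timeline {
    final List<String> pending = new ArrayList<>();
    final List<String> completed = new ArrayList<>();
  }

  /** Hypothetical hook through which the coordinator requests a global failover. */
  interface FailoverTrigger {
    void failGlobally(String reason);
  }

  /** Hypothetical coordinator that models only the recovery decision. */
  static class Coordinator {
    private final Timeline timeline;
    private final boolean rliBootstrap;
    private final FailoverTrigger trigger;

    Coordinator(Timeline timeline, boolean rliBootstrap, FailoverTrigger trigger) {
      this.timeline = timeline;
      this.rliBootstrap = rliBootstrap;
      this.trigger = trigger;
    }

    /**
     * Recommits every pending instant, then requests a global failover when
     * RLI bootstrap is enabled and at least one instant was recommitted.
     * Returns true if a failover was requested.
     */
    boolean recover() {
      boolean recommitted = !timeline.pending.isEmpty();
      timeline.completed.addAll(timeline.pending);
      timeline.pending.clear();
      if (recommitted && rliBootstrap) {
        // The restarted bootstrap operator would reload index state
        // from the timeline that now contains the recommitted instants.
        trigger.failGlobally("pending instants recommitted; reload RLI bootstrap state");
        return true;
      }
      return false;
    }
  }

  public static void main(String[] args) {
    Timeline timeline = new Timeline();
    timeline.pending.add("001");
    List<String> failovers = new ArrayList<>();
    Coordinator coordinator = new Coordinator(timeline, true, failovers::add);
    System.out.println(coordinator.recover()); // true: "001" was recommitted
    System.out.println(coordinator.recover()); // false: nothing left pending
  }
}
```

In this model the second `recover()` call requests no failover, which reflects the intent that the restarted job does not loop: once the timeline has no pending instants, recovery is a no-op.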
   
   ### Impact
   - **Functional impact**: Fixes RLI bootstrap recovery in Flink so pending 
instants are recommitted and index bootstrap is completed after 
recovery-triggered failover.
   
   ### Risk Level
   low
   
   ### Documentation Update
   
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Enough context is provided in the sections above
   - [ ] Adequate tests were added if applicable
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
