cshuo opened a new pull request, #18434:
URL: https://github.com/apache/hudi/pull/18434
…ootstrap RLI completely
### Describe the issue this Pull Request addresses
A Flink streaming write with record-level index (RLI) bootstrap can get stuck
in an incomplete recovery state when pending instants need to be
recommitted after a failover or job restart. In that case, the coordinator
restores the timeline state, but the bootstrap operator may keep running with
stale in-memory index state and never fully reload the recommitted timeline.
This change makes coordinator recovery explicitly recommit pending instants
for RLI bootstrap flows and then trigger a global failover so the bootstrap
operator restarts against the refreshed timeline. That closes the gap between
timeline recovery and index bootstrap state reloading.
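The recovery flow described above can be sketched roughly as follows. This is a simplified, self-contained model and not the actual Hudi code: `RliRecoverySketch`, `Timeline`, `Coordinator`, and the `failoversTriggered` counter are hypothetical names used for illustration only, and the real `StreamWriteOperatorCoordinator` works against the Hudi timeline and Flink's coordinator context rather than in-memory lists.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the recovery flow; not Hudi's real classes.
class RliRecoverySketch {

  /** Stand-in for the table timeline: pending vs. completed instants. */
  static final class Timeline {
    final List<String> pending = new ArrayList<>();
    final List<String> completed = new ArrayList<>();
  }

  /** Stand-in for the coordinator's recovery path. */
  static final class Coordinator {
    private final Timeline timeline;
    private final boolean rliBootstrapEnabled; // assumed flag for RLI bootstrap flows
    private boolean started = false;
    private boolean recommitted = false;
    private int failoversTriggered = 0;

    Coordinator(Timeline timeline, boolean rliBootstrapEnabled) {
      this.timeline = timeline;
      this.rliBootstrapEnabled = rliBootstrapEnabled;
    }

    /** On start, recommit whatever is still pending, then decide on failover. */
    void start() {
      recommitPendingInstants();
      started = true;
      maybeTriggerFailover();
    }

    /** Moves every pending instant to completed and records that it happened. */
    private void recommitPendingInstants() {
      if (timeline.pending.isEmpty()) {
        return;
      }
      timeline.completed.addAll(timeline.pending);
      timeline.pending.clear();
      recommitted = true;
    }

    /**
     * Requests a failover only when RLI bootstrap is in use and instants were
     * actually recommitted, so the bootstrap operator restarts and reloads
     * its index state from the refreshed timeline.
     */
    private void maybeTriggerFailover() {
      if (started && rliBootstrapEnabled && recommitted) {
        recommitted = false; // avoid requesting the same failover twice
        failoversTriggered++; // the real coordinator would fail the job here
      }
    }

    int failoversTriggered() {
      return failoversTriggered;
    }
  }

  public static void main(String[] args) {
    Timeline timeline = new Timeline();
    timeline.pending.add("001");
    timeline.pending.add("002");

    Coordinator coordinator = new Coordinator(timeline, true);
    coordinator.start();

    System.out.println("pending=" + timeline.pending);
    System.out.println("completed=" + timeline.completed);
    System.out.println("failovers=" + coordinator.failoversTriggered());
  }
}
```

In this model, a coordinator without RLI bootstrap, or one that finds nothing pending, recommits as needed but never requests a failover, which mirrors the intent of limiting the restart to the RLI bootstrap case.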
### Summary and Changelog
- Move pending-instant recovery logic into `StreamWriteOperatorCoordinator`
so restored events can be filtered and recommitted during `start()`,
`resetToCheckpoint(...)`, and `subtaskReset(...)`.
- Track coordinator lifecycle and whether pending instants were recommitted,
then trigger a failover for RLI bootstrap so the bootstrap operator reloads
index state from the recommitted timeline.
- Add coverage in `TestWriteMergeOnRead`, `MockStateSnapshotContext`,
`StreamWriteFunctionWrapper`, and `TestWriteBase` for RLI bootstrap recovery
and failover behavior, while deleting outdated event-buffer wait tests.
### Impact
- **Functional impact**: Fixes RLI bootstrap recovery in Flink so that pending
instants are recommitted and the index bootstrap completes after the
recovery-triggered failover.
### Risk Level
low
### Documentation Update
### Contributor's checklist
- [ ] Read through [contributor's
guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Enough context is provided in the sections above
- [ ] Adequate tests were added if applicable
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.