Hastyshell opened a new pull request, #61387:
URL: https://github.com/apache/doris/pull/61387
## Proposed changes
Backport of #61386 to `branch-4.0`.
The three-tier retry logic in `Segment::_open()` (static method) was
structured as `if-else-if`, so when `open_file()` succeeded but
`_parse_footer()` returned `CORRUPTION`, the retry branch was unreachable.
### Root cause
```cpp
// Before: if-else-if structure
auto st = fs->open_file(path, &file_reader, &reader_options);
if (st) { // open_file succeeded (almost
always)
segment->_file_reader = ...;
st = segment->_open(stats); // _parse_footer() → CORRUPTION
stored in st
} else if (st.is<CORRUPTION>() && ...) { // UNREACHABLE: already entered
`if` branch
// Tier 1: clear cache, retry
// Tier 2: bypass cache, read remote directly
}
RETURN_IF_ERROR(st); // CORRUPTION returned without any retry
```
`open_file()` only opens a file handle and rarely returns `CORRUPTION`. The
actual footer checksum validation happens inside `_parse_footer()` (called via
`segment->_open()`). Because the retry was in an `else if` guarded by the same
`st` from `open_file()`, it was never reachable for the common
`_parse_footer()` corruption case.
### Fix
Change `else if` to a separate `if` block, so CORRUPTION from either
`open_file()` or `_parse_footer()` triggers the three-tier retry.
### Issue
Observed in cloud (S3) deployments (`branch-4.0`): schema change fails with
`CORRUPTION: Bad segment file footer checksum not match`. Log analysis
confirmed that no retry log messages were ever emitted, consistent with this
code bug.
## Tests
- Added `TestFooterCorruptionTriggersRetry` to `segment_corruption_test.cpp`
- Uses the existing `Segment::parse_footer:magic_number_corruption` sync
point to corrupt the footer magic number on the first `_parse_footer()` call
only
## Checklist
- [x] Does it affect the original behavior? Yes — previously CORRUPTION from
`_parse_footer()` was never retried; now it correctly follows the three-tier
retry
- [x] Has unit test been added? Yes
- [x] Is this a backport? Yes, backport of #61386
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]