liaoxin01 opened a new pull request, #64755: URL: https://github.com/apache/doris/pull/64755
## Proposed changes

### Problem

When an `INSERT`/`CTAS` (or any query) hits a data-conversion error during scan, the user-facing error is misleading. For example, a strict `CAST(... AS BIGINT)` on an empty string produced by `regexp_extract` returns `INVALID_ARGUMENT parse number fail, string: ''`, and the user sees:

```
[INVALID_ARGUMENT]parse number fail, string: ''failed to initialize storage reader. tablet=421411554, backend=10.228.1.18
```

The `failed to initialize storage reader. tablet=...` suffix makes it look like the tablet/segment is corrupted or missing, when the real cause is a data/expression error.

### Root cause

`OlapScanner::_open_impl` appended `failed to initialize storage reader. tablet=...` to **any** non-OK status returned by `TabletReader::init()`. But `init()` does not merely set up objects: the merge reader eagerly reads the first block of each rowset (`BlockReader::_init_collect_iter` → `VCollectIterator::build_heap` → `Level0Iterator::refresh_current_row` → `RowsetReader::next_batch`) to seed the merge heap. Pushed-down expressions (`common_expr_ctxs`) are evaluated during that first-block read, so a strict-cast failure surfaces inside `init()` and gets wrapped with the storage-reader message.

### Fix

Branch on the error code: only genuine storage-level failures keep the `failed to initialize storage reader` wording. For `INVALID_ARGUMENT` (data/expression errors) the message stays neutral and explicitly notes it is a data/expression error rather than a storage failure, while still reporting tablet/backend for locating the node.

This is purely a message/diagnostics change; control flow and the returned error code are unchanged. The underlying strict-cast semantics issue is tracked separately (see apache/doris#64266).

## Further comments

No behavior change other than the error text; no new tests added.
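
The branching described under "Fix" can be sketched roughly as follows. This is a minimal, standalone illustration and not the code from this PR: `ErrorCode`, `Status`, and `annotate_init_error` are hypothetical stand-ins for Doris's real types, and the neutral message wording is an assumption.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-ins for illustration only; these are not Doris's real types.
enum class ErrorCode { OK, INVALID_ARGUMENT, IO_ERROR, CORRUPTION };

struct Status {
    ErrorCode code = ErrorCode::OK;
    std::string msg;
    bool ok() const { return code == ErrorCode::OK; }
};

// Append locating context to a failed init() status.
// The error code is never changed; only the message text differs per branch.
Status annotate_init_error(Status st, std::int64_t tablet_id, const std::string& backend) {
    if (st.ok()) {
        return st;  // nothing to annotate
    }
    const std::string where =
            " tablet=" + std::to_string(tablet_id) + ", backend=" + backend;
    if (st.code == ErrorCode::INVALID_ARGUMENT) {
        // Data/expression error raised while reading the first block:
        // keep the wording neutral (assumed text, not the PR's exact string).
        st.msg += ". data/expression error during scan, not a storage failure." + where;
    } else {
        // Every other failure keeps the storage-reader wording.
        st.msg += ". failed to initialize storage reader." + where;
    }
    return st;
}
```

With this shape, a strict-cast failure keeps its `INVALID_ARGUMENT` code and still reports the tablet and backend, but no longer carries the storage-reader suffix.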
