liutang123 opened a new issue, #64122: URL: https://github.com/apache/doris/issues/64122
### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.

### Version

3.x, 4.x

### What's Wrong?

Error message:

```
ColStatsData ndv 0 but min/max is not null and nullCount != count. ('1762805389266--1-user_id_int',0,1761079992750,1762805389266,-1,'user_id_int',null,4813747185,0,1513270905,'0','0',6053083623,'2026-06-01 02:04:04')
```

In `ColStatsData.isValid()`, the second guard treats a sampled column statistic as invalid whenever `ndv == 0`, min and max are not both null, and `nullCount != count`:

```
if (ndv == 0 && (!isNull(minLit) || !isNull(maxLit)) && nullCount != count) {
    // -> return false;
}
```

`count` and `nullCount` are not comparable under sampling:

<img width="526" height="433" alt="Image" src="https://github.com/user-attachments/assets/44dcb6ca-6076-40f2-bfd6-bdc9865c2f2a" />

So in the sampling path:

- `count` is a point-in-time metadata snapshot stored in FE.
- `nullCount` is a scaled estimate computed by BE on a subset of tablets, then rounded.

Impact: sampled stats for near-all-NULL columns are dropped, and the column reverts to ColumnStatistic.

### What You Expected?

I think we should accept this stat.

### How to Reproduce?

_No response_

### Anything Else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
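
To illustrate the mismatch described under "What's Wrong?", here is a minimal Java sketch. It is not the Doris implementation: the class name, the helper names, the scaling formula, and all numbers are hypothetical, chosen only to show how a rounded, scaled `nullCount` can differ from a `count` snapshot taken at another moment, so that the quoted guard rejects the statistic.

```java
// Hypothetical sketch; names, formula, and numbers are assumptions, not Doris code.
final class SampledStatsSketch {

    /** Scale a NULL count observed in a sample up to the full table, then round. */
    static long scaleNullCount(long sampledNulls, long sampledRows, long totalRows) {
        if (sampledRows <= 0) {
            return 0L;
        }
        return Math.round((double) sampledNulls * totalRows / sampledRows);
    }

    /** The guard quoted in the issue, reduced to the values it reads. */
    static boolean strictGuardRejects(long ndv, boolean minMaxBothNull,
                                      long nullCount, long count) {
        return ndv == 0 && !minMaxBothNull && nullCount != count;
    }

    public static void main(String[] args) {
        long countSnapshot = 1_000_000L;   // assumed FE metadata snapshot
        long rowsAtScanTime = 1_000_500L;  // assumed row count the estimate was scaled to
        long sampledRows = 10_000L;
        long sampledNulls = 9_999L;        // near-all-NULL sample, one non-null value

        long nullCount = scaleNullCount(sampledNulls, sampledRows, rowsAtScanTime);
        boolean rejected = strictGuardRejects(0L, false, nullCount, countSnapshot);

        System.out.println("nullCount=" + nullCount + " count=" + countSnapshot
                + " rejected=" + rejected);
    }
}
```

Under these assumed inputs the two values come from different sources and different moments, so strict equality between them is not a reliable signal that the statistic is corrupt.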
