suxiaogang223 opened a new pull request, #64025:
URL: https://github.com/apache/doris/pull/64025

   ### What problem does this PR solve?
   
   Issue Number: close #xxx
   
   Related PR: #xxx
   
   Problem Summary: Row group pruning in the new Parquet reader did not use 
Parquet bloom filters, so equality and IN predicates could rely only on 
statistics and dictionary pruning before falling back to reading row groups.
   
   This PR adds conservative bloom-filter row group pruning for the new parquet 
reader by reusing Arrow Parquet bloom filter APIs and adapting Doris file-layer 
predicates to Arrow Parquet hash checks.
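
   To illustrate what "conservative" means here, the following is a minimal 
sketch, not the code in this PR. `ToyBloomFilter` and `can_prune_row_group` are 
hypothetical names, and the toy filter is a stand-in for the Arrow Parquet bloom 
filter, whose hashing and layout differ. The sketch only shows the decision rule 
assumed here: a row group is skipped only when the filter reports every 
candidate value as definitely absent, and it is kept whenever no filter is 
available.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a per-column-chunk bloom filter.
// It is NOT Arrow's parquet::BloomFilter; it only mimics the
// "definitely absent" / "possibly present" contract.
class ToyBloomFilter {
public:
    explicit ToyBloomFilter(std::size_t num_bits) : bits_(num_bits, false) {
        assert(num_bits > 0);
    }

    void insert(int64_t value) {
        for (uint64_t seed = 0; seed < kNumHashes; ++seed) {
            bits_[slot(value, seed)] = true;
        }
    }

    // Returns false only if the value was never inserted (no false negatives).
    // Returns true if the value may be present (false positives are possible).
    bool might_contain(int64_t value) const {
        for (uint64_t seed = 0; seed < kNumHashes; ++seed) {
            if (!bits_[slot(value, seed)]) {
                return false;
            }
        }
        return true;
    }

private:
    static constexpr uint64_t kNumHashes = 3;

    // Simple 64-bit mixing function, chosen for illustration only.
    std::size_t slot(int64_t value, uint64_t seed) const {
        uint64_t x = static_cast<uint64_t>(value) + 0x9E3779B97F4A7C15ULL * (seed + 1);
        x ^= x >> 30;
        x *= 0xBF58476D1CE4E5B9ULL;
        x ^= x >> 27;
        x *= 0x94D049BB133111EBULL;
        x ^= x >> 31;
        return static_cast<std::size_t>(x % bits_.size());
    }

    std::vector<bool> bits_;
};

// Decides whether a row group can be skipped for `col = v` (one-element list)
// or `col IN (v1, v2, ...)`. A null filter models a row group without a bloom
// filter for the column. Any doubt results in keeping the row group.
bool can_prune_row_group(const ToyBloomFilter* filter,
                         const std::vector<int64_t>& in_list) {
    if (filter == nullptr || in_list.empty()) {
        return false;  // nothing to check against: keep the row group
    }
    for (int64_t v : in_list) {
        if (filter->might_contain(v)) {
            return false;  // at least one value may match: keep the row group
        }
    }
    return true;  // every value is definitely absent: safe to skip
}
```

   Because a bloom filter never yields false negatives, skipping in the last 
branch cannot drop matching rows; a false positive only costs an unnecessary 
read.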
   
   ### Release note
   
   None
   
   ### Check List (For Author)
   
   - Test: Unit Test
       - Local: git diff --check
       - Fedora: BUILD_TYPE=DEBUG ./build.sh --be
       - Fedora: ./run-be-ut.sh --run '--filter=ParquetBloomFilterPruningTest.*'
       - Fedora: ./run-be-ut.sh --run '--filter=NewParquetReaderTest.*:ParquetColumnReaderTest.*:ParquetBloomFilterPruningTest.*'
   - Behavior changed: Yes. The new Parquet reader can prune row groups using 
Parquet bloom filters when the feature is enabled and the predicates are 
supported equality or IN-list predicates.
   - Does this need documentation: No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

