[ https://issues.apache.org/jira/browse/HIVE-10252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dong Chen updated HIVE-10252:
-----------------------------
    Attachment: HIVE-10252.patch

> Make PPD work for Parquet in row group level
> --------------------------------------------
>
>                 Key: HIVE-10252
>                 URL: https://issues.apache.org/jira/browse/HIVE-10252
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Dong Chen
>            Assignee: Dong Chen
>         Attachments: HIVE-10252.patch
>
>
> In Hive, predicate pushdown (PPD) extracts the search condition from the
> HQL query, serializes it, and pushes it to the file format. ORC can use the
> predicate to filter stripes. Similarly, Parquet should use the statistics
> saved in each row group to filter out row groups that cannot match. But
> this does not work.
> In {{ParquetRecordReaderWrapper}}, Hive gets splits with all row groups
> (client side) and pushes the filter to Parquet for further processing
> (Parquet side). But in {{ParquetRecordReader.initializeInternalReader()}},
> if the splits have already been selected on the client side, the filter is
> not handled again.
> We should make the behavior consistent in Hive. Maybe we could get the
> splits, filter them, and then pass them to Parquet. This means using the
> client-side strategy.
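
For illustration only, the row-group filtering described in the issue could be sketched roughly as below. This is not the attached patch and it does not use Hive or Parquet classes: {{RowGroupStats}}, {{mayMatchGreaterThan}} and {{selectRowGroups}} are hypothetical names, and the sketch assumes a single numeric column with min/max statistics and a predicate of the form "column > bound".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: none of these types are Hive or Parquet classes. */
public class RowGroupFilterSketch {

    /** Min/max statistics of one column in one row group (illustrative only). */
    static final class RowGroupStats {
        final int index;
        final long min;
        final long max;

        RowGroupStats(int index, long min, long max) {
            this.index = index;
            this.min = min;
            this.max = max;
        }
    }

    /** For "column > bound", a row group can only match if its max exceeds bound. */
    static boolean mayMatchGreaterThan(RowGroupStats stats, long bound) {
        return stats.max > bound;
    }

    /** Returns the indexes of row groups whose statistics do not rule out the predicate. */
    static List<Integer> selectRowGroups(List<RowGroupStats> groups, long bound) {
        List<Integer> kept = new ArrayList<>();
        for (RowGroupStats g : groups) {
            if (mayMatchGreaterThan(g, bound)) {
                kept.add(g.index);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<RowGroupStats> groups = Arrays.asList(
            new RowGroupStats(0, 1, 100),
            new RowGroupStats(1, 101, 200),
            new RowGroupStats(2, 201, 300));
        // Predicate: column > 150. Row group 0 (max 100) cannot match and is skipped.
        System.out.println(selectRowGroups(groups, 150)); // prints [1, 2]
    }
}
```

Under the client-side strategy, only the row groups that survive a check like this would be placed into the split handed to Parquet, so no second filtering pass on the Parquet side would be needed.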