[ 
https://issues.apache.org/jira/browse/HIVE-10252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Chen updated HIVE-10252:
-----------------------------
    Attachment: HIVE-10252.patch

> Make PPD work for Parquet in row group level
> --------------------------------------------
>
>                 Key: HIVE-10252
>                 URL: https://issues.apache.org/jira/browse/HIVE-10252
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Dong Chen
>            Assignee: Dong Chen
>         Attachments: HIVE-10252.patch
>
>
> In Hive, predicate pushdown (PPD) extracts the search condition from the HQL
> query, serializes it, and pushes it to the file format. ORC can use the
> predicate to filter stripes. Similarly, Parquet should use the statistics
> saved in each row group to filter out row groups that cannot match. But this
> does not work.
> In {{ParquetRecordReaderWrapper}}, Hive gets splits with all row groups
> (client side) and pushes the filter to Parquet for further processing
> (Parquet side). But in {{ParquetRecordReader.initializeInternalReader()}}, if
> the splits have already been selected on the client side, Parquet does not
> apply the filter again.
> We should make the behavior consistent in Hive. Maybe we could get the
> splits, filter them, and then pass them to Parquet. This means using the
> client-side strategy.
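
The client-side strategy proposed above can be sketched roughly as follows. This is only an illustration of filtering row groups by their min/max statistics before they are handed to the reader; it is not the attached patch and does not use the real Hive or Parquet APIs. The names RowGroupFilterSketch, RowGroup, mayContainEq and selectRowGroups are hypothetical, and the predicate is assumed to be a simple equality on one long column.

```java
import java.util.ArrayList;
import java.util.List;

public class RowGroupFilterSketch {

    // Hypothetical stand-in for one row group and the min/max statistics
    // Parquet stores for a single column in that row group.
    static final class RowGroup {
        final int id;
        final long min;
        final long max;

        RowGroup(int id, long min, long max) {
            this.id = id;
            this.min = min;
            this.max = max;
        }
    }

    // Returns true when the statistics cannot rule out "column == value".
    // A row group is dropped only if the value lies outside [min, max].
    static boolean mayContainEq(RowGroup group, long value) {
        return value >= group.min && value <= group.max;
    }

    // Client-side selection: keep only the row groups that may match,
    // so the reader never has to scan the others.
    static List<RowGroup> selectRowGroups(List<RowGroup> all, long value) {
        List<RowGroup> kept = new ArrayList<>();
        for (RowGroup group : all) {
            if (mayContainEq(group, value)) {
                kept.add(group);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<RowGroup> all = new ArrayList<>();
        all.add(new RowGroup(0, 1, 100));
        all.add(new RowGroup(1, 101, 200));
        all.add(new RowGroup(2, 201, 300));

        // Assumed predicate: WHERE col = 150
        for (RowGroup group : selectRowGroups(all, 150)) {
            System.out.println("read row group " + group.id);
        }
    }
}
```

With the sample statistics above, only row group 1 survives the filter. Statistics-based filtering is conservative: a kept row group may still contain no matching rows, so the predicate still has to be evaluated on the rows that are read.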



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
