[ 
https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841666#comment-13841666
 ] 

Justin Coffey commented on HIVE-5783:
-------------------------------------

[~appodictic], regarding the support being built into the semantic analyzer, I 
mimicked what was done for ORC support.  I agree that a hard-coded switch 
statement is not the best approach, but I thought a larger refactoring was out 
of scope for this request, and definitely not something to be done against the 
0.11 branch :).  Now, with trunk support for parquet-hive, I suppose we could 
tackle this in a more generic/robust way.
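
For illustration only, one shape a more generic approach might take is a 
registry that maps a storage-format keyword to its class names, so that new 
formats are registered rather than added as switch cases.  This is a minimal 
sketch and not code from the patch or from Hive: StorageFormatRegistry, 
Descriptor, register, lookup, and withDefaults are all hypothetical names.  
Only the three ORC class names refer to existing Hive classes.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry (not part of Hive): keyword -> storage classes. */
final class StorageFormatRegistry {

    /** The class names a storage-format keyword would expand to. */
    static final class Descriptor {
        final String inputFormat;
        final String outputFormat;
        final String serde;

        Descriptor(String inputFormat, String outputFormat, String serde) {
            this.inputFormat = inputFormat;
            this.outputFormat = outputFormat;
            this.serde = serde;
        }
    }

    private final Map<String, Descriptor> formats = new ConcurrentHashMap<>();

    /** Registers a format; keywords are stored case-insensitively. */
    void register(String keyword, Descriptor descriptor) {
        formats.put(keyword.toUpperCase(Locale.ROOT), descriptor);
    }

    /** Returns the descriptor for a keyword, or empty if none is registered. */
    Optional<Descriptor> lookup(String keyword) {
        return Optional.ofNullable(formats.get(keyword.toUpperCase(Locale.ROOT)));
    }

    /** Builds a registry pre-populated with ORC as the only built-in entry. */
    static StorageFormatRegistry withDefaults() {
        StorageFormatRegistry registry = new StorageFormatRegistry();
        registry.register("ORC", new Descriptor(
            "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat",
            "org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat",
            "org.apache.hadoop.hive.ql.io.orc.OrcSerde"));
        return registry;
    }
}
```

Under this assumption, the analyzer would call lookup() with the keyword it 
parsed, and a Parquet integration would call register() with its own class 
names instead of extending a switch statement.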

[~xuefuz], do you mean the actual Parquet input/output formats and SerDe?  If 
so, these are in the parquet-hive project 
(https://github.com/Parquet/parquet-mr/tree/master/parquet-hive).

> Native Parquet Support in Hive
> ------------------------------
>
>                 Key: HIVE-5783
>                 URL: https://issues.apache.org/jira/browse/HIVE-5783
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Justin Coffey
>            Assignee: Justin Coffey
>            Priority: Minor
>             Fix For: 0.11.0
>
>         Attachments: hive-0.11-parquet.patch
>
>
> Problem Statement:
> Hive would be easier to use if it had native Parquet support. Our 
> organization, Criteo, uses Hive extensively. Therefore, we built the Parquet 
> Hive integration and would now like to contribute that integration to Hive.
> About Parquet:
> Parquet is a columnar storage format for Hadoop and integrates with many 
> Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, 
> Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native 
> Parquet integration.
> Change Details:
> Parquet was built with dependency management in mind and therefore only a 
> single Parquet jar will be added as a dependency.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
