[jira] [Commented] (HIVE-11981) ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)

Prasanth Jayachandran (JIRA) Fri, 23 Oct 2015 10:34:22 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-11981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971420#comment-14971420
 ]


Prasanth Jayachandran commented on HIVE-11981:
----------------------------------------------

[~mmccline] As discussed, I feel the reader-side changes to ORC are 
intrusive, and I don't think we need that many changes to null out the 
additional columns that are being read. Your latest patch doesn't seem to 
address this. Ideally, OrcInputFormat should add the additional columns to 
OrcStruct when it creates the RecordReader. RecordReaderImpl should then just 
fill those columns with nulls (for OrcStruct reuse) when reading old files 
that are missing those columns.
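
To illustrate the idea, here is a rough, self-contained sketch. It does not 
use the real Hive classes: Row and fillRow are hypothetical stand-ins for 
OrcStruct and the fill step in RecordReaderImpl, and I am assuming that new 
columns are only ever appended at the end of the schema.

```java
import java.util.Arrays;

public class NullPaddingSketch {

    /** Hypothetical reusable row, sized to the reader (table) schema. */
    static final class Row {
        final Object[] fields;

        Row(int readerColumnCount) {
            fields = new Object[readerColumnCount];
        }
    }

    /**
     * Copies the values the file actually contains into the reused row and
     * nulls every trailing column that the file does not have.
     */
    static Row fillRow(Row reuse, Object[] fileValues) {
        int present = Math.min(fileValues.length, reuse.fields.length);
        System.arraycopy(fileValues, 0, reuse.fields, 0, present);
        // Reset the added columns so values left over from a previously read
        // row do not leak into a row that comes from an older file.
        Arrays.fill(reuse.fields, present, reuse.fields.length, null);
        return reuse;
    }

    public static void main(String[] args) {
        Row row = new Row(3);                      // table now has 3 columns
        fillRow(row, new Object[] {1, "a", 2.5});  // new file: all columns present
        fillRow(row, new Object[] {2, "b"});       // old file: third column missing
        System.out.println(Arrays.toString(row.fields)); // prints [2, b, null]
    }
}
```

The point of the sketch is that the row object is sized once from the table 
schema, so the only reader-side work for an old file is resetting the 
trailing slots on each read.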

> ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)
> ------------------------------------------------------------------
>
>                 Key: HIVE-11981
>                 URL: https://issues.apache.org/jira/browse/HIVE-11981
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Transactions
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>         Attachments: HIVE-11981.01.patch, HIVE-11981.02.patch, 
> HIVE-11981.03.patch, HIVE-11981.05.patch, HIVE-11981.06.patch, 
> HIVE-11981.07.patch, HIVE-11981.08.patch, HIVE-11981.09.patch, ORC Schema 
> Evolution Issues.docx
>
>
> High priority issues with schema evolution for the ORC file format.
> Schema evolution here is limited to adding new columns and a few cases of 
> column type-widening (e.g. int to bigint).
> Renaming columns, deleting columns, moving columns, and other kinds of 
> schema evolution were not pursued due to lack of importance and lack of 
> time.  Also, it appears that much more sophisticated metadata would be 
> needed to support them.
> The biggest issues for users have been adding new columns to ACID tables 
> (HIVE-11421 Support Schema evolution for ACID tables) and vectorization 
> (HIVE-10598 Vectorization borks when column is added to table).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
