[
https://issues.apache.org/jira/browse/HIVE-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144449#comment-13144449
]
Hudson commented on HIVE-2246:
------------------------------
Integrated in Hive-trunk-h0.21 #1059 (See
[https://builds.apache.org/job/Hive-trunk-h0.21/1059/])
HIVE-2366. Metastore upgrade scripts for HIVE-2246 do not migrate indexes
nor rename the old COLUMNS table (Sohan Jain via Ning Zhang)
nzhang : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1197644
Files :
* /hive/trunk/metastore/scripts/upgrade/derby/008-HIVE-2246.derby.sql
* /hive/trunk/metastore/scripts/upgrade/derby/008-REVERT-HIVE-2246.derby.sql
* /hive/trunk/metastore/scripts/upgrade/mysql/008-HIVE-2246.mysql.sql
> Dedupe tables' column schemas from partitions in the metastore db
> -----------------------------------------------------------------
>
> Key: HIVE-2246
> URL: https://issues.apache.org/jira/browse/HIVE-2246
> Project: Hive
> Issue Type: Improvement
> Components: Metastore
> Reporter: Sohan Jain
> Assignee: Sohan Jain
> Fix For: 0.8.0
>
> Attachments: HIVE-2246.2.patch, HIVE-2246.3.patch, HIVE-2246.4.patch,
> HIVE-2246.8.patch
>
>
> Note: this patch proposes a schema change, and is therefore incompatible with
> the current metastore.
> We can re-organize the JDO models to reduce space usage to keep the metastore
> scalable for the future. Currently, partitions are the fastest growing
> objects in the metastore, and the metastore keeps a separate copy of the
> columns list for each partition. We can normalize the metastore db by
> decoupling Columns from Storage Descriptors and not storing duplicate lists
> of the columns for each partition.
> An idea is to create an additional level of indirection with a "Column
> Descriptor" that has a list of columns. A table has a reference to its
> latest Column Descriptor (note: a table may have more than one Column
> Descriptor in the case of schema evolution). Partitions and Indexes can
> reference the same Column Descriptors as their parent table.
> Currently, the COLUMNS table in the metastore has roughly (number of
> partitions + number of tables) * (average number of columns per table) rows.
> We can reduce this to (number of tables) * (average number of columns per
> table) rows, while incurring a small cost proportional to the number of
> tables to store the Column Descriptors.
> Please see the latest review board for additional implementation details.
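The indirection and the row-count estimate in the description above can be sketched as follows. This is a minimal illustration, not the patch's actual JDO model: the class names (ColumnDescriptor, StorageDescriptor, Table) and the helper functions are hypothetical, and the sketch assumes every partition simply shares its parent table's current Column Descriptor.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ColumnDescriptor:
    # Hypothetical stand-in for the proposed "Column Descriptor":
    # a single shared list of (column name, column type) pairs.
    columns: List[Tuple[str, str]]


@dataclass
class StorageDescriptor:
    # Refers to a ColumnDescriptor instead of owning its own column list.
    location: str
    cd: ColumnDescriptor


@dataclass
class Table:
    name: str
    sd: StorageDescriptor
    partitions: List[StorageDescriptor] = field(default_factory=list)

    def add_partition(self, location: str) -> None:
        # A new partition reuses the table's latest Column Descriptor.
        self.partitions.append(StorageDescriptor(location, self.sd.cd))


def column_rows_before(tables: List[Table]) -> int:
    # Old layout: each table and each partition stores its own column copy.
    return sum(
        len(sd.cd.columns)
        for t in tables
        for sd in [t.sd, *t.partitions]
    )


def column_rows_after(tables: List[Table]) -> int:
    # New layout: columns are stored once per distinct Column Descriptor.
    distinct = {
        id(sd.cd): sd.cd
        for t in tables
        for sd in [t.sd, *t.partitions]
    }
    return sum(len(cd.columns) for cd in distinct.values())


if __name__ == "__main__":
    cols = ColumnDescriptor([("id", "bigint"), ("ds", "string"), ("val", "double")])
    t = Table("events", StorageDescriptor("/warehouse/events", cols))
    for i in range(1000):
        t.add_partition(f"/warehouse/events/p={i}")
    print(column_rows_before([t]), column_rows_after([t]))
```

With one table of 3 columns and 1000 partitions, the old layout needs (1000 + 1) * 3 = 3003 column rows, while the shared descriptor needs 3. A table that has gone through schema evolution would hold more than one descriptor, so the second count grows with the number of distinct descriptors rather than with the number of partitions.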
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira