-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1183/#review1176
-----------------------------------------------------------

trunk/metastore/scripts/upgrade/mysql/008-HIVE-2246.mysql.sql
<https://reviews.apache.org/r/1183/#comment2467>

    Is the CHARSET (latin1) the same as for SDS? This will require the user's
    comments to be in latin1, which rules out UTF characters.


trunk/metastore/scripts/upgrade/mysql/008-HIVE-2246.mysql.sql
<https://reviews.apache.org/r/1183/#comment2466>

    Can you also add a migration script for Derby? We support Derby as a
    default metastore RDBMS as well.


trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
<https://reviews.apache.org/r/1183/#comment2468>

    Do you check here whether the 'alter table' command changes the schema
    (the column definitions)? If it just sets a table property, then you
    don't need to create a new ColumnDescriptor, right?

    Also, if a table's schema is changed, a new CD will be created, but the
    old partitions will still have the old CDs. When we query an old
    partition, do we use the old partition's CD or the table's CD?

    Also, in the above case, when you run
    'desc table partition <old_partition>', do you return the old
    partition's CD or the table's CD?


- Ning


On 2011-07-22 05:30:29, Sohan Jain wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/1183/
> -----------------------------------------------------------
>
> (Updated 2011-07-22 05:30:29)
>
>
> Review request for hive, Ning Zhang and Paul Yang.
>
>
> Summary
> -------
>
> This patch tries to make minimal changes to the API while keeping migration
> short and somewhat easy to revert.
>
> The new schema can be described as follows:
> - CDS is a table corresponding to Column Descriptor objects. Currently, it
> only stores a CD_ID.
> - COLUMNS_V2 is a table corresponding to MFieldSchema objects, or columns. A
> Column Descriptor holds a list of columns. COLUMNS_V2 has a foreign key to
> the CD_ID to which it belongs.
> - SDS was modified to reference a Column Descriptor. So SDS now has a foreign
> key to a CD_ID which describes its columns.
>
> During migration, we create Column Descriptors for tables in a
> straightforward manner: their columns are now just wrapped inside a column
> descriptor. The SDS of partitions use their parent table's column
> descriptor, since currently a partition and its table share the same list of
> columns.
>
> When altering or adding a partition, give it its parent table's column
> descriptor IF the columns they describe are the same. Otherwise, create a
> new column descriptor for its columns.
>
> When adding or altering a table, create a new column descriptor every time.
>
> Whenever you drop a storage descriptor (e.g., when dropping tables or
> partitions), check to see if the related column descriptor has any other
> references in the table. That is, check to see if any other storage
> descriptors point to that column descriptor. If none do, then delete that
> column descriptor. This check is in place so we don't have unreferenced
> column descriptors and columns hanging around after schema evolution for
> tables.
>
>
> This addresses bug HIVE-2246.
> https://issues.apache.org/jira/browse/HIVE-2246
>
>
> Diffs
> -----
>
> trunk/metastore/scripts/upgrade/mysql/008-HIVE-2246.mysql.sql PRE-CREATION
> trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
> 1148945
>
> trunk/metastore/src/model/org/apache/hadoop/hive/metastore/model/MColumnDescriptor.java
> PRE-CREATION
>
> trunk/metastore/src/model/org/apache/hadoop/hive/metastore/model/MStorageDescriptor.java
> 1148945
> trunk/metastore/src/model/package.jdo 1148945
>
> Diff: https://reviews.apache.org/r/1183/diff
>
>
> Testing
> -------
>
> Passes Facebook's regression testing and all existing test cases. In one
> instance, before migration, the overhead involved with storage descriptors
> and columns was ~11 GB. After migration, the overhead was ~1.5 GB.
>
>
> Thanks,
>
> Sohan
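
For reference, the sharing and cleanup rules from the quoted summary can be sketched in plain Java roughly as follows. This is only an in-memory illustration under stated assumptions, not code from the patch: the class ColumnDescriptorSketch, its nested types, and the methods addTable, addPartition and dropStorageDescriptor are all hypothetical names, and columns are reduced to strings instead of MFieldSchema objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical in-memory model of the CDS / SDS relationship described in the summary.
// None of these class or method names come from the patch itself.
final class ColumnDescriptorSketch {

    /** Stand-in for a CDS row together with the COLUMNS_V2 rows that point at it. */
    static final class ColumnDescriptor {
        final long id;
        final List<String> columns;

        ColumnDescriptor(long id, List<String> columns) {
            this.id = id;
            this.columns = Collections.unmodifiableList(new ArrayList<>(columns));
        }
    }

    /** Stand-in for an SDS row: each storage descriptor references one column descriptor. */
    static final class StorageDescriptor {
        final String owner; // table or partition name, kept only for readability
        final ColumnDescriptor cd;

        StorageDescriptor(String owner, ColumnDescriptor cd) {
            this.owner = owner;
            this.cd = cd;
        }
    }

    final List<ColumnDescriptor> cds = new ArrayList<>();
    final List<StorageDescriptor> sds = new ArrayList<>();
    private long nextCdId = 1;

    private ColumnDescriptor newColumnDescriptor(List<String> columns) {
        ColumnDescriptor cd = new ColumnDescriptor(nextCdId++, columns);
        cds.add(cd);
        return cd;
    }

    /** Adding a table always creates a fresh column descriptor. */
    StorageDescriptor addTable(String name, List<String> columns) {
        StorageDescriptor sd = new StorageDescriptor(name, newColumnDescriptor(columns));
        sds.add(sd);
        return sd;
    }

    /** A partition reuses its table's descriptor only when the column lists match. */
    StorageDescriptor addPartition(String name, StorageDescriptor table, List<String> columns) {
        ColumnDescriptor cd = table.cd.columns.equals(columns)
                ? table.cd
                : newColumnDescriptor(columns);
        StorageDescriptor sd = new StorageDescriptor(name, cd);
        sds.add(sd);
        return sd;
    }

    /** Dropping a storage descriptor deletes its column descriptor once nothing else uses it. */
    void dropStorageDescriptor(StorageDescriptor sd) {
        sds.remove(sd);
        boolean stillReferenced = false;
        for (StorageDescriptor other : sds) {
            if (other.cd == sd.cd) {
                stillReferenced = true;
                break;
            }
        }
        if (!stillReferenced) {
            cds.remove(sd.cd);
        }
    }

    public static void main(String[] args) {
        ColumnDescriptorSketch store = new ColumnDescriptorSketch();
        StorageDescriptor table = store.addTable("t", Arrays.asList("a int", "b string"));
        store.addPartition("t/ds=1", table, Arrays.asList("a int", "b string"));
        store.addPartition("t/ds=2", table, Arrays.asList("a int"));
        System.out.println("column descriptors in use: " + store.cds.size());
    }
}
```

In this sketch an old partition keeps whatever descriptor it was created with, so the question of which CD to use when querying or describing an old partition still has to be answered by the real ObjectStore code.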