----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24289/#review50047 -----------------------------------------------------------
Please add .q tests for these. Test for partitioned table with more than one partition column on variety of column types and variety of stats type. ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java <https://reviews.apache.org/r/24289/#comment87572> Include example sql statement for which this task is meant for. ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java <https://reviews.apache.org/r/24289/#comment87575> Add a comment saying grammar prohibits more than 1 column, so we are guaranteed to have only 1 element in this lists. ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java <https://reviews.apache.org/r/24289/#comment87576> Is clear() needed here? ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java <https://reviews.apache.org/r/24289/#comment87579> Add else{ throw SemanticException ("Unknown stat"); } add to all of subsequent block. You may also want to reconsider some of this reptition in private method. ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java <https://reviews.apache.org/r/24289/#comment87580> Add else { throw Exception ("Unsupported type"); } ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java <https://reviews.apache.org/r/24289/#comment87574> Copy-paste comments? ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java <https://reviews.apache.org/r/24289/#comment87573> Comments seem out of place. Copy-paste? ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java <https://reviews.apache.org/r/24289/#comment87563> throw new SemanticException ("table " + tbl + "not found"); ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java <https://reviews.apache.org/r/24289/#comment87564> if (colType == null) throw new Semantic Exception ("col not found"); ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java <https://reviews.apache.org/r/24289/#comment87565> There can be multiple partitioning column, in which case this assert will fail. Dont think you want that. ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java <https://reviews.apache.org/r/24289/#comment87566> Instead of this for loop, you want to use Warehouse.makePartName(partSpec, false); ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java <https://reviews.apache.org/r/24289/#comment87567> throw SemanticEx ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java <https://reviews.apache.org/r/24289/#comment87568> check colType != null ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java <https://reviews.apache.org/r/24289/#comment87562> I don't think this if block is required. Further, you need to add a HiveOperation corresponding to this new token. ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java <https://reviews.apache.org/r/24289/#comment87571> Add comment like, work corresponding to statement: alter table t1 partition (p1=c1,p2=c2), update... ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java <https://reviews.apache.org/r/24289/#comment87569> This field doesnt seem to be used. Can be removed. ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java <https://reviews.apache.org/r/24289/#comment87570> Good to implement this. Useful for debugging. - Ashutosh Chauhan On Aug. 5, 2014, 6:40 p.m., pengcheng xiong wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/24289/ > ----------------------------------------------------------- > > (Updated Aug. 5, 2014, 6:40 p.m.) > > > Review request for hive. > > > Repository: hive-git > > > Description > ------- > > This patch provides ability to update certain stats without scanning any data > or without "hacking the backend db". It helps (esp for CBO work) to set up > unit tests quickly and verify both cbo and the stats subsystem. It also helps > when experimenting with the system if you're just trying out hive/hadoop on a > small cluster. Finally it gives you a quick and clean way to fix things when > something went wrong wrt stats in your environment. > Usage: > ALTER TABLE table_name PARTITION partition_spec UPDATE STATISTICS FOR COLUMN > col_name SET col_statistics > For example, > ALTER TABLE src_x_int UPDATE STATISTICS FOR COLUMN key SET > ('numDVs'='101','highValue'='10001.0'); > ALTER TABLE src_p PARTITION(partitionId=1) UPDATE STATISTICS FOR COLUMN key > SET ('numDVs'='100','avgColLen'='1.0001'); > > > Diffs > ----- > > > metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java > c3e2820 > > metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java > 89c31dc > > metastore/src/model/org/apache/hadoop/hive/metastore/model/MTableColumnStatistics.java > 44bbab5 > ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java > PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 24dfed1 > > ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java > 4300145 > ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java > 67a3aa7 > ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g ab1188a > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java > 268920a > ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java > PRE-CREATION > > Diff: https://reviews.apache.org/r/24289/diff/ > > > Testing > ------- > > > Thanks, > > pengcheng xiong > >