-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/633/#review605
-----------------------------------------------------------

trunk/shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java
<https://reviews.apache.org/r/633/#comment1249>

    Talked to Siying offline. The check
    if (split instanceof Hadoop20Shims.InputSplitShim)
    is not needed; it can be replaced by an assert. The same applies in Hadoop20SShims.

    Otherwise looks good.


- namit


On 2011-04-28 08:32:17, Siying Dong wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/633/
> -----------------------------------------------------------
>
> (Updated 2011-04-28 08:32:17)
>
>
> Review request for hive, Ning Zhang and namit jain.
>
>
> Summary
> -------
>
> We need better input sampling to serve at least two purposes:
> 1. let users test their queries against a smaller data set
> 2. understand more about what the data look like without scanning the whole table.
> A simple function that returns a subset of the splits will help in those cases. It
> doesn't have to be strict sampling.
>
> This diff allows a syntax of .. table TABLESAMPLE(n PERCENT), which samples
> input splits with size at least n% of the original inputs.
>
>
> This addresses bug HIVE-2121.
>     https://issues.apache.org/jira/browse/HIVE-2121
>
>
> Diffs
> -----
>
>   trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1096852
>   trunk/conf/hive-default.xml 1096852
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 1096852
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 1096852
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java 1096852
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 1096852
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRUnion1.java 1096852
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 1096852
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinFactory.java 1096852
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1096852
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 1096852
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1096852
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SplitSample.java PRE-CREATION
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 1096852
>   trunk/ql/src/test/queries/clientnegative/split_sample_out_of_range.q PRE-CREATION
>   trunk/ql/src/test/queries/clientnegative/split_sample_wrong_format.q PRE-CREATION
>   trunk/ql/src/test/queries/clientpositive/split_sample.q PRE-CREATION
>   trunk/ql/src/test/results/clientnegative/split_sample_out_of_range.q.out PRE-CREATION
>   trunk/ql/src/test/results/clientnegative/split_sample_wrong_format.q.out PRE-CREATION
>   trunk/ql/src/test/results/clientpositive/bucket1.q.out 1096852
>   trunk/ql/src/test/results/clientpositive/bucket2.q.out 1096852
>   trunk/ql/src/test/results/clientpositive/bucket3.q.out 1096852
>   trunk/ql/src/test/results/clientpositive/bucketmapjoin1.q.out 1096852
>   trunk/ql/src/test/results/clientpositive/sample1.q.out 1096852
>   trunk/ql/src/test/results/clientpositive/sample10.q.out 1096852
>   trunk/ql/src/test/results/clientpositive/sample2.q.out 1096852
>   trunk/ql/src/test/results/clientpositive/sample3.q.out 1096852
>   trunk/ql/src/test/results/clientpositive/sample4.q.out 1096852
>   trunk/ql/src/test/results/clientpositive/sample5.q.out 1096852
>   trunk/ql/src/test/results/clientpositive/sample6.q.out 1096852
>   trunk/ql/src/test/results/clientpositive/sample7.q.out 1096852
>   trunk/ql/src/test/results/clientpositive/sample8.q.out 1096852
>   trunk/ql/src/test/results/clientpositive/sample9.q.out 1096852
>   trunk/shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java 1096852
>   trunk/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 1096852
>   trunk/shims/src/common/java/org/apache/hadoop/hive/shims/HadoopShims.java 1096852
>
> Diff: https://reviews.apache.org/r/633/diff
>
>
> Testing
> -------
>
> TestCliDriver TestNegativeCliDriver, manual tests on real clusters.
>
>
> Thanks,
>
> Siying
>
>
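
For readers following the thread, the "size at least n% of the original inputs" rule in the quoted summary could be sketched roughly as below. This is not the code from the patch. The class name SplitSampleSketch, the method sampleSplits, taking splits in their given order, and the 0 to 100 range check are all assumptions made for illustration; the e-mail does not say which splits are chosen or how out-of-range values are handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not taken from the HIVE-2121 patch.
public class SplitSampleSketch {

    /**
     * Returns the indices of the leading splits whose combined size is
     * at least percent% of the total input size.
     * Assumption: splits are taken in the order given.
     */
    public static List<Integer> sampleSplits(long[] splitSizes, double percent) {
        // Assumed range check; the real bounds are not stated in the e-mail.
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be in [0, 100]");
        }
        long total = 0;
        for (long size : splitSizes) {
            total += size;
        }
        double target = total * percent / 100.0;

        List<Integer> chosen = new ArrayList<>();
        long taken = 0;
        // Keep adding whole splits until the target size is reached.
        for (int i = 0; i < splitSizes.length && taken < target; i++) {
            chosen.add(i);
            taken += splitSizes[i];
        }
        return chosen;
    }

    public static void main(String[] args) {
        long[] sizes = {64, 64, 64, 64};
        // 30% of 256 is 76.8, so two whole splits (128) are needed.
        System.out.println(sampleSplits(sizes, 30)); // prints [0, 1]
    }
}
```

Because whole splits are taken, the sample can be noticeably larger than n% when splits are big, which matches the summary's note that this does not have to be strict sampling.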