[ https://issues.apache.org/jira/browse/HIVE-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Edward Capriolo updated HIVE-3083:
----------------------------------

             Priority: Blocker  (was: Critical)
    Affects Version/s: 0.10.0

> In local mode bucking does not work
> -----------------------------------
>
>                 Key: HIVE-3083
>                 URL: https://issues.apache.org/jira/browse/HIVE-3083
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.7.1, 0.8.1, 0.9.0, 0.10.0
>            Reporter: Edward Capriolo
>            Assignee: Edward Capriolo
>            Priority: Blocker
>
> In local mode hive bucketing does not work. I am willing to bet that since
> none of the bucketing unit tests assert that N files are actually created the
> tests are producing false positives as well.
>
> [edward@tablitha hive-0.9.0-bin]$ bin/hive
> hive> create table numbersflat(number int);
> hive> load data local inpath '/home/edward/numbers' into table numbersflat;
> Copying data from file:/home/edward/numbers
> Copying file: file:/home/edward/numbers
> Loading data to table default.numbersflat
> OK
> Time taken: 0.288 seconds
> hive> select * from numbersflat;
> OK
> 1
> 2
> 3
> 4
> 5
> 6
> 7
> 8
> 9
> 10
> Time taken: 0.274 seconds
> hive> CREATE TABLE numbers_bucketed(number int,number1 int) CLUSTERED BY (number) INTO 3 BUCKETS;
> OK
> Time taken: 0.082 seconds
> hive> set hive.enforce.bucketing = true;
> hive> set hive.exec.reducers.max = 200;
> hive> set hive.merge.mapfiles=false;
> hive>
>     > insert OVERWRITE table numbers_bucketed select number,number+1 as number1 from numbersflat;
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks determined at compile time: 3
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapred.reduce.tasks=<number>
> 12/06/04 00:50:35 WARN conf.HiveConf: hive-site.xml not found on CLASSPATH
> Execution log at: /tmp/edward/edward_20120604005050_e17eb952-af76-4cf3-aee1-93bd59e74517.log
> Job running in-process (local Hadoop)
> Hadoop job information for null: number of mappers: 0; number of reducers: 0
> 2012-06-04 00:50:47,938 null map = 0%, reduce = 0%
> 2012-06-04 00:50:48,940 null map = 100%, reduce = 0%
> 2012-06-04 00:50:49,942 null map = 100%, reduce = 100%
> Ended Job = job_local_0001
> Execution completed successfully
> Mapred Local Task Succeeded . Convert the Join into MapJoin
> Loading data to table default.numbers_bucketed
> Deleted file:/user/hive/warehouse/numbers_bucketed
> Table default.numbers_bucketed stats: [num_partitions: 0, num_files: 1, num_rows: 10, total_size: 43, raw_data_size: 33]
> OK
> Time taken: 16.722 seconds
> hive> dfs -ls /user/hive/warehouse/numbers_bucketed;
> Found 1 items
> -rwxrwxrwx 1 edward edward 43 2012-06-04 00:50 /user/hive/warehouse/numbers_bucketed/000000_0
> hive> dfs -ls /user/hive/warehouse/numbers_bucketed/000000_0;
> Found 1 items
> -rwxrwxrwx 1 edward edward 43 2012-06-04 00:50 /user/hive/warehouse/numbers_bucketed/000000_0
> hive> cat /user/hive/warehouse/numbers_bucketed/000000_0;
> FAILED: Parse Error: line 1:0 cannot recognize input near 'cat' '/' 'user'
> hive> dfs -cat /user/hive/warehouse/numbers_bucketed/000000_0;
> 1 2
> 2 3
> 3 4
> 4 5
> 5 6
> 6 7
> 7 8
> 8 9
> 9 10
> 10 11
> hive>
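
As a rough illustration of the check the report says the bucketing unit tests are missing, here is a minimal Python sketch. It assumes Hive places an int key in bucket (hash & Integer.MAX_VALUE) % numBuckets, with the int's own value used as its hash; that is my understanding of the default behavior, not something stated in this report. The helper names expected_buckets and count_bucket_files are hypothetical, and the directory scanned is a throwaway temp directory standing in for the warehouse path.

```python
import os
import tempfile

INT_MAX = 0x7FFFFFFF  # Java's Integer.MAX_VALUE


def expected_buckets(values, num_buckets):
    """Group int keys by the bucket they would be expected to land in.

    Assumption: bucket = (hash & Integer.MAX_VALUE) % num_buckets, where the
    hash of an int is the int itself.
    """
    buckets = {b: [] for b in range(num_buckets)}
    for v in values:
        buckets[(v & INT_MAX) % num_buckets].append(v)
    return buckets


def count_bucket_files(table_dir):
    """Count data files in a table directory, skipping hidden/marker files."""
    return sum(
        1
        for name in os.listdir(table_dir)
        if not name.startswith(("_", "."))
        and os.path.isfile(os.path.join(table_dir, name))
    )


if __name__ == "__main__":
    # Keys 1..10 into 3 buckets, mirroring numbers_bucketed in the report.
    for bucket, keys in expected_buckets(range(1, 11), 3).items():
        print(bucket, keys)

    # Simulate the observed local-mode result: a single file named 000000_0.
    with tempfile.TemporaryDirectory() as table_dir:
        open(os.path.join(table_dir, "000000_0"), "w").close()
        found = count_bucket_files(table_dir)
        print("files found:", found, "expected:", 3)
```

Under that assumption every one of the three buckets receives at least one key for the input 1..10, so a test asserting that the file count equals the declared bucket count would presumably have flagged the single 000000_0 file seen in the transcript.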