Hello,

In the documentation I read that as many files are created in each partition as there are buckets. In the following sample script, I created 32 buckets, but I only find 2 files in each partition directory. Am I missing something?
In this sample script, I'm trying to load a tab-separated file from disk into the table trades, and then transfer the data into alltrades, based on the example at:
http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL/BucketedTables

BTW, another question: how does one put comments in a hive.q file?

-------- sample script ------------
SET hive.enforce.bucketing=TRUE;

CREATE TABLE trades
  (symbol STRING, time STRING, exchange STRING, price FLOAT, volume INT)
PARTITIONED BY (dt STRING)
CLUSTERED BY (symbol)
SORTED BY (time ASC)
INTO 1 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH 'data/2001-05-22'
INTO TABLE trades
PARTITION (dt='2001-05-22');

CREATE TABLE alltrades
  (symbol STRING, time STRING, exchange STRING, price FLOAT, volume INT)
PARTITIONED BY (dt STRING)
CLUSTERED BY (symbol)
SORTED BY (time ASC)
INTO 32 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

FROM trades
INSERT OVERWRITE TABLE alltrades
PARTITION (dt='2001-05-22')
SELECT symbol, time, exchange, price, volume
WHERE dt='2001-05-22';