I tried to create a skewed table from the GroupLens 100k data set, with the movie rating as the skew column, but I only see one file created. My understanding was that a separate file would be created for each skewed value. Is there anything else that needs to be done?

hive commands:
CREATE TABLE u_data (userid int,movieid int, rating int, unixtime string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' stored as textfile;

CREATE TABLE u_data2 (userid int,movieid int, rating int, unixtime string) skewed by (rating) on (3,4,5);

LOAD DATA LOCAL INPATH './ml-100k.base' OVERWRITE INTO TABLE u_data;

insert into u_data2 select * from u_data;

hadoop fs output:
% hadoop fs -ls /user/hive/warehouse/u_data
Found 1 items
... 1792501 2013-12-26 15:06 /user/hive/warehouse/u_data/ua.base

% hadoop fs -ls /user/hive/warehouse/u_data2
Found 1 items
... 1792501 2013-12-26 15:22 /user/hive/warehouse/u_data2/000000_0
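
For comparison, here is a sketch of the variant described in the Hive list bucketing design, where the STORED AS DIRECTORIES clause asks Hive to write each skewed value into its own subdirectory. The table name u_data3 and the two session settings are assumptions for illustration. This is untested against the setup above and may behave differently depending on the Hive version.

```sql
-- Assumption: subdirectory support may need to be enabled for list bucketing.
SET hive.mapred.supports.subdirectories=true;
SET mapred.input.dir.recursive=true;

-- Hypothetical table u_data3: same schema as u_data2, plus STORED AS DIRECTORIES.
CREATE TABLE u_data3 (userid int, movieid int, rating int, unixtime string)
  SKEWED BY (rating) ON (3,4,5) STORED AS DIRECTORIES;

-- Populate it from the staging table loaded earlier.
INSERT INTO TABLE u_data3 SELECT * FROM u_data;
```

If list bucketing takes effect, the table directory would be expected to hold one subdirectory per listed rating plus a default directory for the remaining values, rather than a single 000000_0 file.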
