Hey Prasanth,

The CTAS for a skewed table doesn't work. Is it a bug?

create table t1(r1 string, r2 string) skewed by (r2) on ('a') stored as directories select r1, r2 from t2;

On Thu, Apr 24, 2014 at 3:03 PM, Mayur Gupta <mayur.gupt...@gmail.com> wrote:

> Thanks a lot Prasanth for the reply. I would have never figured that out
> as the documentation at Hive Wiki DDL page
> <https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-SkewedTables>
> and design page <https://cwiki.apache.org/confluence/display/Hive/ListBucketing>
> doesn't list this.
>
> One additional point: it seems the skewed table doesn't work when the table
> is created as CTAS. The statement below doesn't create separate files. Is
> it a bug or is it by intent?
>
> create table t1(r1 string, r2 string) skewed by (r2) on ('a') stored as
> directories select r1, r2 from t2;
>
>
> On Thu, Apr 24, 2014 at 6:12 AM, Prasanth Jayachandran <
> pjayachand...@hortonworks.com> wrote:
>
>> Hi Mayur,
>>
>> The reason you see a single file is that you have not enabled storing
>> skewed columns/values as directories.
>> You can do the following to enable storing the skewed columns and values
>> as directories:
>>
>> set hive.mapred.supports.subdirectories=true;
>> set mapred.input.dir.recursive=true;
>> create table t1(r1 string, r2 string) skewed by (r2) on ('a') stored as
>> directories;
>>
>> This will enable you to store the skewed columns as directories, as below:
>>
>> /user/hive/warehouse/t1/r2=a/000000_0 (skewed values go here)
>> /user/hive/warehouse/t1/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME/000000_0
>> (all other values go here)
>>
>> With respect to your desc extended question where
>> skewedColValueLocationMaps is empty, it's a bug in the implementation. I just
>> verified that it shows empty for unpartitioned tables, but it shows
>> correctly for partitioned tables.
>> I have created a bug for unpartitioned tables here, which you can track
>> for progress on this issue:
>> https://issues.apache.org/jira/browse/HIVE-6968
>>
>>
>> Thanks
>> Prasanth Jayachandran
>>
>> On Apr 23, 2014, at 6:52 AM, Mayur Gupta <mayur.gupt...@gmail.com> wrote:
>>
>> Below is my skewedInfo
>>
>> skewedInfo:SkewedInfo(skewedColNames:[r2], skewedColValues:[[a]],
>> skewedColValueLocationMaps:{})
>>
>> Any idea why skewedColValueLocationMaps is empty?
>>
>>
>> On Mon, Apr 21, 2014 at 11:19 AM, Mayur Gupta <mayur.gupt...@gmail.com> wrote:
>>
>>> Hey There,
>>>
>>> I was trying to use skewed tables, but I am facing the issue that it is
>>> not creating separate files for the skewed data. Even with a simple example
>>> I am having the same issue. The Hive version is 0.11.
>>>
>>> create table t(col1 string, col2 string);
>>> load data local inpath '/home/hadoop/a.txt' into table t;
>>>
>>> create table t1(r1 string, r2 string) skewed by (r2) on ('a');
>>> insert into table t1 select * from t;
>>>
>>> The contents of a.txt are:
>>> 1 ^Aa
>>> 2^A b
>>> 3 ^Ac
>>> 4 ^Aa
>>> 5 ^Ab
>>> 6 ^Aa
>>>
>>> I see only a single file.
>>>
>>> /user/hive/warehouse/t1/000000_0
>>>
>>> Any pointers on what I am doing wrong?
>>>
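
For reference, the non-CTAS sequence described in the quoted replies can be sketched as below. It only combines statements already in this thread: the two settings and the "stored as directories" clause from Prasanth's reply, and the separate insert from the original post. Whether this sidesteps the CTAS behaviour asked about at the top is an assumption, not something confirmed in the thread.

```sql
-- Settings from Prasanth's reply: allow skewed values to be stored
-- as subdirectories and let readers recurse into them.
set hive.mapred.supports.subdirectories=true;
set mapred.input.dir.recursive=true;

-- Create the skewed table on its own (no CTAS), with list bucketing enabled.
create table t1(r1 string, r2 string)
  skewed by (r2) on ('a')
  stored as directories;

-- Populate it in a separate step, as in the original post.
-- t is the source table loaded from a.txt in that post.
insert into table t1 select * from t;
```

If list bucketing takes effect, the layout given in Prasanth's reply would be expected: rows with r2 = 'a' under /user/hive/warehouse/t1/r2=a/ and all other rows under /user/hive/warehouse/t1/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME/.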