skewed tables

2014-10-09 Thread siva kumar
Hi folks, I'm working on skewed tables but am facing some problems when it has to create separate files. Here is my query: create table t(key string,value int) row format delimited stored as textfile; load data local inpath 'skew.txt' into table t; select * from

Re: Skewed Tables

2014-04-28 Thread Prasanth Jayachandran
d > something. For an example see the beginning of the Skewed Tables section. > Sometimes the version information isn't called out like that, though, it's > just part of the text. And in the CREATE TABLE syntax it's a comment > alongside a clause such as TBLPROPERTIE

Re: Skewed Tables

2014-04-27 Thread Lefty Leverenz
Prasanth, Hive's user docs are wiki-only at this point so there's no version control. We just add notes about which release introduced or changed something. For an example see the beginning of the Skewed Tables<https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#Lang

Re: Skewed Tables

2014-04-27 Thread Prasanth Jayachandran
@Mayur.. I don’t think the initial design considered CTAS for skewed tables. So it might not be supported at all. @Lefty.. I am not sure where/how the docs are maintained. Is it version controlled? Or is it only maintained in the Confluence wiki? If it is the latter can you please provide me access

Re: Skewed Tables

2014-04-26 Thread Mayur Gupta
>> Below is my skewedInfo >> >> skewedInfo:SkewedInfo(skewedColNames:[r2], skewedColValues:[[a]], >> skewedColValueLocationMaps:{}) >> >> Any idea why is the skewedColValueLocationMaps empty? >> >> >> On Mon, Apr 21, 2014 at 11:19 AM, Mayur Gupta w

Re: Skewed Tables

2014-04-26 Thread Lefty Leverenz
doc too, so take your pick among these locations: - DDL doc - Create Table -- Row Format, Storage Format, and SerDe<https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-RowFormat,StorageFormat,andSerDe> - Create Table -- Skewed Tables<https:/

Re: Skewed Tables

2014-04-25 Thread Prasanth Jayachandran
lValueLocationMaps:{}) >> >> Any idea why is the skewedColValueLocationMaps empty? >> >> >> On Mon, Apr 21, 2014 at 11:19 AM, Mayur Gupta >> wrote: >> Hey There, >> >> I was trying to use Skewed tables but I am facing the issue that it is

Re: Skewed Tables

2014-04-24 Thread Lefty Leverenz
ned tables here which you can track >> for progress on this issue >> https://issues.apache.org/jira/browse/HIVE-6968 >> >> >> Thanks >> Prasanth Jayachandran >> >> On Apr 23, 2014, at 6:52 AM, Mayur Gupta wrote: >> >> Below is my skewedIn

Re: Skewed Tables

2014-04-24 Thread Mayur Gupta
hy is the skewedColValueLocationMaps empty? > > > On Mon, Apr 21, 2014 at 11:19 AM, Mayur Gupta wrote: > >> Hey There, >> >> I was trying to use Skewed tables but I am facing the issue that it is >> not creating separate files for the skewed data. Even with a

Re: Skewed Tables

2014-04-23 Thread Prasanth Jayachandran
LocationMaps:{}) > > Any idea why is the skewedColValueLocationMaps empty? > > > On Mon, Apr 21, 2014 at 11:19 AM, Mayur Gupta wrote: > Hey There, > > I was trying to use Skewed tables but I am facing the issue that it is not > creating separate files for the skewe

Re: Skewed Tables

2014-04-23 Thread Mayur Gupta
Below is my skewedInfo skewedInfo:SkewedInfo(skewedColNames:[r2], skewedColValues:[[a]], skewedColValueLocationMaps:{}) Any idea why is the skewedColValueLocationMaps empty? On Mon, Apr 21, 2014 at 11:19 AM, Mayur Gupta wrote: > Hey There, > > I was trying to use Skewed tables but I

Skewed Tables

2014-04-20 Thread Mayur Gupta
Hey There, I was trying to use Skewed tables but I am facing the issue that it is not creating separate files for the skewed data. Even with a simple example I am having the same issue. The hive version is 0.11. create table t(col1 string, col2 string); load data local inpath '/home/h

skewed tables

2013-12-26 Thread Kireet
I tried to create a skewed table using the group lens 100k data set and setting the skew columns to the movie rating, but I only see one file get created. My understanding was that separate files would be created per value. Is there anything else that needs to be done? hive commands: CREATE TA

Re: Hive skewed tables

2013-11-14 Thread Rajesh Balamohan
I mentioned that because it scanned all files, based on HDFS bytes read. The table is not compressed, and the HDFS bytes read matched the data size in the partition. I had bucketing enabled. But somehow when I joined with another table it had a long-tail issue where most of the data went to a single reducer. He

Re: Hive skewed tables

2013-11-13 Thread Nitin Pawar
How did you check that it's looking at all files inside the partition? If you want to further restrict the files to be accessed, you can bucket them as well. That way you really don't have to worry about which data is skewed and can let the framework handle it. On Thu, Nov 14, 2013 at 11:16 AM, Rajesh B

Re: Hive skewed tables

2013-11-13 Thread Rajesh Balamohan
Thanks Nitin. I have only one partition in this table for testing. I thought that within the partition it would scan only certain files based on the skewed fields. However, it is scanning the entire data within the partition. On Nov 14, 2013 9:38 AM, "Nitin Pawar" wrote: > In my understanding, > when

Re: Hive skewed tables

2013-11-13 Thread Nitin Pawar
In my understanding, when you say it is scanning the entire dataset, it is looking at all your partitions because your data has been partitioned by the date column. A skewed table is a table where different files are created for all your skewed keys in all the partitions. So for your query i

Hive skewed tables

2013-11-13 Thread Rajesh Balamohan
Hi All, I have the following skewed table "addresses_1" select id, count(*) c from addresses_1 group by id order by c desc limit 10; 1426246531554806 198477395958492 102641838220181 138947865211331 156483436193429 96411677179771 210082076168033 800174765152421 1391