RE: Difference in number of row observations from distinct and group by

2013-11-25 Thread Mayank Bansal
You probably have 400 rows where col1, col2 and col3 have null values. "count(distinct col1,col2,col3)" will not count those rows. On Thu, Nov 21, 2013 at 7:13 AM, Mayank Bansal <mayank.ban...@mu-sigma.com> wrote: > Hi, > > I have a table whi
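
A minimal HiveQL sketch of how the explanation in this reply could be checked, reusing table_name and col1, col2, col3 from the original question. It assumes the gap comes from NULL keys: Hive's count(DISTINCT ...) with several expressions skips a row when any of the expressions is NULL, while GROUP BY keeps NULL as an ordinary group value. The column aliases are illustrative only.

```sql
-- Rows that count(DISTINCT col1, col2, col3) would skip:
-- any row with a NULL in at least one of the key columns.
SELECT count(*) AS rows_with_null_key
FROM table_name
WHERE col1 IS NULL OR col2 IS NULL OR col3 IS NULL;

-- Group count restricted to fully non-NULL keys. Under the assumption
-- above, this should match count(DISTINCT col1, col2, col3).
SELECT count(*) AS non_null_groups
FROM (
  SELECT col1, col2, col3
  FROM table_name
  WHERE col1 IS NOT NULL AND col2 IS NOT NULL AND col3 IS NOT NULL
  GROUP BY col1, col2, col3
) a;
```

If the first query returns a non-zero count and the second matches the count(distinct ...) result, NULL keys are the likely cause of the difference.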

Difference in number of row observations from distinct and group by

2013-11-21 Thread Mayank Bansal
Hi, I have a table which has 3 columns combined together to form a primary key. If I do Select count(distinct col1,col2,col3) from table_name; And Select count(a.*) from (select col1,col2,col3,count(*) from table_name group by col1,col2,col3)a ; While running the first query, the count of ro

RE: Percentile calculation

2012-10-03 Thread Mayank Bansal
reached, I increased the reducer memory to 12 GB and then I got the above error. Can you please help in solving this problem. Thanks, Mayank -Original Message- From: Mayank Bansal [mailto:mayank.ban...@mu-sigma.com] Sent: Tuesday, October 02, 2012 6:23 PM To: user@hive.apache.org Subject:

RE: Percentile calculation

2012-10-02 Thread Mayank Bansal
, October 02, 2012 8:41 AM To: user@hive.apache.org Subject: Re: Percentile calculation More info, please. On Mon, Oct 1, 2012 at 4:50 PM, Mayank Bansal wrote: > Hi, > > > > I am trying to run the hive udf percentile, I am trying to run it on a > column with something around 116 mil

RE: hive query fails

2012-10-01 Thread Mayank Bansal
Could you give more details? It would be helpful if you could share the log files from the tasktracker for this job. Your tracking URL has details for the process; you could start from there and share the logs. From: Ajit Kumar Shreevastava [mailto:ajit.shreevast...@hcl.com] Sent: Monday, Octobe

Percentile calculation

2012-10-01 Thread Mayank Bansal
Hi, I am trying to run the Hive UDF percentile on a column with around 116 million unique values. The maximum space that I can give to the reducer is 12 GB, but the job keeps failing with a Java heap space error. Is there a way to optimize this, so that I don't en
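
One option that may be worth trying for a question like this, sketched in HiveQL: Hive ships both an exact percentile() and an approximate percentile_approx() aggregate. The thread snippets shown here do not say which answer was given, so this is only a sketch; my_table and my_col are placeholder names, and the percentile 0.5 and accuracy value 10000 are arbitrary illustrations.

```sql
-- my_table and my_col are placeholder names, not from the thread.

-- Exact percentile: works on integral columns and has to keep a count
-- for every distinct value, which is costly with many unique values.
SELECT percentile(CAST(my_col AS BIGINT), 0.5)
FROM my_table;

-- Approximate percentile: the optional third argument trades accuracy
-- for memory (larger values are more accurate but use more memory).
SELECT percentile_approx(CAST(my_col AS DOUBLE), 0.5, 10000)
FROM my_table;
```

Whether the approximation is acceptable depends on the use case; with around 116 million distinct values the exact variant is the one more likely to exhaust the reducer heap.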

RE: Upper case column names

2012-08-15 Thread Mayank Bansal
normalize everything to lower case. On Wed, Aug 15, 2012 at 9:05 AM, Mayank Bansal wrote: > Hi, > > > > I wanted to ask one more thing, will changing the database from derby > to some other help me in getting upper case column names? > > > > Thanks, > >

RE: Upper case column names

2012-08-15 Thread Mayank Bansal
Hi, I wanted to ask one more thing, will changing the database from derby to some other help me in getting upper case column names? Thanks, Mayank From: Mayank Bansal [mailto:mayank.ban...@mu-sigma.com] Sent: Wednesday, August 15, 2012 12:53 PM To: user@hive.apache.org Subject: RE: Upper case

RE: Upper case column names

2012-08-15 Thread Mayank Bansal
lly case sensitive though. > Case sensitive field names as an option certainly would be helpful though. > --travis > > On Tue, Aug 14, 2012 at 8:24 AM, Mayank Bansal <mayank.ban...@mu-sigma.com> wrote: >> >> Hi, >> >> The column n

Upper case column names

2012-08-14 Thread Mayank Bansal
Hi, The column names in Hive are case insensitive by default. I was wondering if there is any way I could make the column names case sensitive? I am running a model on data that is now stored in Hive, and the model refers to columns in camel case. It would require a lot of effort to chang

Loading files from a directory

2012-06-21 Thread Mayank Bansal
Hi, I am trying to create an external table in Hive by pointing it at a directory in Hadoop that contains multiple files of the same type. But Hive gives an error that the path specified is not a filename. Is there a way to load all the files present inside a directory into one
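
For reference, a minimal HiveQL sketch of an external table whose LOCATION is a directory rather than a single file. The table name, columns, delimiter and HDFS path are all placeholders, not details from the thread, and the snippet above does not show the statement that produced the error, so this may or may not match the poster's case.

```sql
-- All names and the path below are placeholders for illustration.
CREATE EXTERNAL TABLE my_external_table (
  id INT,
  value STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
-- LOCATION names a directory; Hive reads every file found in it.
LOCATION '/user/hive/data/my_directory';
```

All files in the directory need to share the layout declared in the table definition, since Hive applies the same schema to each of them at read time.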