Hi,

I was also thinking that this might be the case. For that reason I ran this 
query

Select * from (select col1,col2,col3,count(*)  as val from table_name
group by col1,col2,col3)a where a.val>1 ;

The output that I receive from this query is blank, then I ended up doing 
count(*) and I got the same number of rows as originally in the table. Please 
help me figure this out.

Thanks,
Mayank




-----Original Message-----
From: Thejas Nair [mailto:the...@hortonworks.com]
Sent: Friday, November 22, 2013 1:49 AM
To: <user@hive.apache.org>
Subject: Re: Difference in number of row observstions from distinct and group by

You probably have 400 rows where col1, col2 and col3 have null values.
"count(distinct col1,col2,col3) " will not count those rows.


On Thu, Nov 21, 2013 at 7:13 AM, Mayank Bansal 
<mayank.ban...@mu-sigma.com<mailto:mayank.ban...@mu-sigma.com>> wrote:
> Hi,
>
>
>
> I have a table which has 3 columns combined together to form a primary key.
> If I do
>
>
>
> Select count(distinct col1,col2,col3) from table_name;
>
>
>
> And
>
>
>
> Select count(a.*) from (select col1,col2,col3,count(*) from table_name
> group by col1,col2,col3)a ;
>
>
>
> While running the first query, the count of rows that I get is 400
> less than what I get by the second query.
>
> Can someone please explain to me the difference in number of
> observations from both the queries?
>
>
>
> Thanks
>
> Mayank
>
> This email message may contain proprietary, private and confidential
> information. The information transmitted is intended only for the
> person(s) or entities to which it is addressed. Any review,
> retransmission, dissemination or other use of, or taking of any action
> in reliance upon, this information by persons or entities other than
> the intended recipient is prohibited and may be illegal. If you
> received this in error, please contact the sender and delete the
> message from your system. Mu Sigma takes all reasonable steps to
> ensure that its electronic communications are free from viruses.
> However, given Internet accessibility, the Company cannot accept
> liability for any virus introduced by this e-mail or any attachment and you 
> are advised to use up-to-date virus checking software.

--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader of 
this message is not the intended recipient, you are hereby notified that any 
printing, copying, dissemination, distribution, disclosure or forwarding of 
this communication is strictly prohibited. If you have received this 
communication in error, please contact the sender immediately and delete it 
from your system. Thank You.

Disclaimer: http://www.mu-sigma.com/disclaimer.html

Reply via email to