Re: Hive Join with distinct rows

2013-07-30 Thread Sunita Arvind
Thanks for sharing your experience, Marcin.

Sunita

On Tue, Jul 30, 2013 at 11:54 AM, Marcin Mejran wrote:
> I’ve used a rank udf for this previously, distribute and sort by the
> column then select all rows where rank=1. That should work with a join but
> I never tried it. It’d be an issue if t

RE: Hive Join with distinct rows

2013-07-30 Thread Marcin Mejran
I've used a rank UDF for this previously: distribute and sort by the column, then select all rows where rank=1. That should work with a join, but I never tried it. It'd be an issue if the join outputs a lot of records that are then all dropped, since that'd slow down the query. I've actually forke
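The rank approach described in this message can be sketched outside Hive. The snippet below is a minimal Python emulation, not the rank UDF itself: the row data, the column names `id` and `ts`, and the helper `first_per_key` are all hypothetical. It sorts rows by key and ordering column (roughly the role DISTRIBUTE BY and SORT BY play), numbers the rows within each key, and keeps only those with rank 1.

```python
from itertools import groupby
from operator import itemgetter


def first_per_key(rows, key, order_by):
    """Keep the rank-1 row for each key value (hypothetical helper)."""
    # Sorting by (key, order_by) stands in for DISTRIBUTE BY key SORT BY key, order_by.
    ordered = sorted(rows, key=itemgetter(key, order_by))
    result = []
    for _, group in groupby(ordered, key=itemgetter(key)):
        # enumerate plays the role of the rank counter, restarting for each key.
        for rank, row in enumerate(group, start=1):
            if rank == 1:
                result.append(row)
    return result


# Hypothetical join output in which several rows share the same id.
joined = [
    {"id": 1, "ts": 30, "val": "c"},
    {"id": 1, "ts": 10, "val": "a"},
    {"id": 2, "ts": 20, "val": "b"},
    {"id": 1, "ts": 20, "val": "d"},
]

deduped = first_per_key(joined, key="id", order_by="ts")
print(deduped)
```

As the message points out, every row that ends up dropped is still produced and sorted first, so a join that fans out heavily pays that cost before the rank filter applies.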

Re: Hive Join with distinct rows

2012-11-09 Thread Praveen Kumar K J V S
But I think Hive should support distinct on a single column along with fetching the corresponding data from the other columns mentioned in the query. Something like "Select distinct(col1), col2, col3 from TB1"

For example:

hive> SELECT col1, col2 FROM t1;
1 3
1 3
1 4
2 5
-- Selects distinct col1, col2 tup
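The request is ambiguous as stated, because col1 = 1 has two candidate col2 values (3 and 4) and the query has to say which one to keep. One way to make that choice explicit, offered here as an assumption and not as something proposed in the thread, is GROUP BY with an aggregate. The sketch runs the SQL in SQLite through Python purely for illustration; the table holds the same four rows as the example in this message, and choosing MIN is arbitrary.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (col1 INTEGER, col2 INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?, ?)", [(1, 3), (1, 3), (1, 4), (2, 5)])

# One row per col1; the aggregate decides which col2 value accompanies it.
one_per_col1 = conn.execute(
    "SELECT col1, MIN(col2) FROM t1 GROUP BY col1 ORDER BY col1"
).fetchall()
print(one_per_col1)
```

With several non-key columns, independent aggregates can mix values from different source rows, which is where a rank-style filter becomes the better fit.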

Re: Hive Join with distinct rows

2012-11-09 Thread Praveen Kumar K J V S
Thank you very much, Mark.

Yes, query1 is doing just fine, but using query1 I will not be able to get the data in the other columns of table T1.

On Fri, Nov 9, 2012 at 10:04 PM, Mark Grover wrote:
> I see. I re-read your first email and you would like to query "select all
> the unique ID's in T1 which

Re: Hive Join with distinct rows

2012-11-09 Thread Mark Grover
I see. I re-read your first email and you would like to query "select all the unique ID's in T1 which are not in T2".

Query 1 seems to be doing just fine, so I would say that's the way to go. I personally use the "IS" operator when comparing something with NULLs instead of "=". There are some optimizat
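Query 1 is not reproduced in this excerpt, so the following is only a guess at its general shape: a LEFT OUTER JOIN that keeps the T1 rows with no match in T2. The table contents and the `payload` column are hypothetical, and SQLite (through Python) stands in for Hive. The sketch also shows why `IS NULL` matters: under standard SQL NULL semantics, comparing with `= NULL` never evaluates to true, so that variant returns nothing.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, payload TEXT)")
conn.execute("CREATE TABLE t2 (id INTEGER)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?)",
    [(1, "a"), (1, "b"), (2, "c"), (3, "d"), (3, "e")],
)
conn.executemany("INSERT INTO t2 VALUES (?)", [(2,)])

# Anti-join template; only the NULL test in the WHERE clause varies.
anti_join = """
    SELECT DISTINCT t1.id
    FROM t1 LEFT OUTER JOIN t2 ON t1.id = t2.id
    WHERE t2.id {predicate}
    ORDER BY t1.id
"""

with_is_null = conn.execute(anti_join.format(predicate="IS NULL")).fetchall()
with_equals_null = conn.execute(anti_join.format(predicate="= NULL")).fetchall()

print(with_is_null)      # ids present in t1 but absent from t2
print(with_equals_null)  # empty: "= NULL" is never true
```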

Re: Hive Join with distinct rows

2012-11-09 Thread Bejoy KS
S
Date: Fri, 9 Nov 2012 21:30:28
To:
Reply-To: user@hive.apache.org
Subject: Re: Hive Join with distinct rows

Thanks Mark,

I do understand how Hive works with the Distinct keyword. What I was looking for is a solution for my requirement in Hive. I am not an expert in SQL, hence looking for

Re: Hive Join with distinct rows

2012-11-09 Thread Praveen Kumar K J V S
Thanks Mark,

I do understand how Hive works with the Distinct keyword. What I was looking for is a solution for my requirement in Hive. I am not an expert in SQL, hence looking for suggestions.

On Fri, Nov 9, 2012 at 9:54 AM, Mark Grover wrote:
> Hi Praveen,
> Let's take an example:
> (from
> h

Re: Hive Join with distinct rows

2012-11-08 Thread Mark Grover
Hi Praveen,

Let's take an example (from https://cwiki.apache.org/Hive/languagemanual-select.html#LanguageManualSelect-ALLandDISTINCTClauses ):

-- Print out contents of the table
hive> SELECT col1, col2 FROM t1;
1 3
1 3
1 4
2 5
-- Selects distinct col1, col2 tuple
hive> SELECT DISTINCT col1, col2
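The point of this example is that DISTINCT applies to the whole select list, not to its first column. A small sketch of that behaviour follows, using SQLite through Python as a stand-in for Hive (an assumption for illustration; the four rows are the ones printed in the message).

```python
import sqlite3

# In-memory SQLite database standing in for Hive (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (col1 INTEGER, col2 INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?, ?)", [(1, 3), (1, 3), (1, 4), (2, 5)])

# DISTINCT over two columns removes only fully identical (col1, col2) tuples.
distinct_pairs = sorted(
    conn.execute("SELECT DISTINCT col1, col2 FROM t1").fetchall()
)

# DISTINCT over one column collapses to the unique col1 values.
distinct_col1 = sorted(conn.execute("SELECT DISTINCT col1 FROM t1").fetchall())

print(distinct_pairs)
print(distinct_col1)
```

The value 1 still appears twice in the two-column result because (1, 3) and (1, 4) are different tuples; results are sorted in Python because DISTINCT alone does not guarantee an order.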