I feel the mail at your end has got a different format. Remove * and - in the data what you see...That is my final data....You can correlate with B
2. I need A LeftJoin B On Apr 27, 2016 11:41 PM, "Markovitz, Dudu" <dmarkov...@paypal.com> wrote: > Huston, we have a problem J > > > > 1. > > Data > > > > Here are the tables with tabs replaced by ‘*’ and spaces replaced by ‘-’. > > A is a mismatch of tabs and spaces and B contains only (a very strange > number of) spaces. > > > > So you’re using TRIM with all A columns, but how exactly have you defined > table B? > > > > > > *A* > > 101*---US*------Tax*----X > > 101*---USA*------Tax*----Y > > 101*---CA*------Tax*----Y > > 101*---CN*------Tax*----Y > > 142*---US*------Pus*----S > > 142*---CA*------Pus*----Y > > 142*---CN*------Pus*----Y > > 101*---USA*------Pus*----Y > > 119*---USA*------Tax*----Y > > 129*---USA*------Tax*----Y > > 101*---USA*------STORE*----X > > 101*---USA*------OCP*----X > > 142*---CAN*------Feed*----X > > 101*---USA*------Feed*----X > > 101*---USA*------Win*----X > > 101*---USA*------Sal*----X > > 192*---US*------Mod*----X > > 142*---USA*------MO*----X > > 101*---US*------Retail*X > > 142*---US*------OAS*----X > > 101*---USA*------OAS*----Y > > > > *B* > > All--------All-----------Retail------All----John > > All-------US----------OM----------X------John > > ALL-----US---------OM----------Y------Smith > > 101-----All----------OAS---------All----John > > 142-----All----------OAS---------All----John > > 192-----All----------Mod----------All----Sean > > 101-----All----------Tax-----------X------John > > 101-----All---------Tax------------Y------Smith > > 142-----US---------Pus----------X------John > > 142-----All----------Pus----------Y------Smith-- > > > > > > 2. > > Requirement > > > > “A left join B” guarantees that all the records from A appear in the > result set and that the number of rows in the result set is *at least* > the number of rows of A. > > You should take into consideration that a record in A might have several > matches from B. > > > > Also – > > In your original question you’ve talked about “B LEFT JOIN A” and now it’s > “A LEFT JOIN B”. > > > > 3. > > Scenario 1 > > > > I don’t want to start a lecture at that point, but ‘ON‘ and ‘WHERE’ have a > completely different meaning and use. > > You’re conditions in the WHERE clause turned the LEFT JOIN to INNER JOIN. > > > > 4. > > Scenario 2 > > > > I’ve suggested ‘LIKE’ not ‘=’ (but it won’t work because of Hive > limitations that require equality conditions for JOIN). > > > > 5. > > Scenario 3 > > > > CROSS JOIN does not use ON (Hive lets you do that but it not an SQL > standard and it’s actually an INNER JOIN). > > > > 6. > > CASE > > > > CASE is defined by ANSI/ISO and works in Hive the same way it works in HQL > the same way it works in any other SQL dialect. > > > > In conclusion - > > The solution I gave you in the previous mails works as expected. > > You should check your data and your tables’ definitions. > > > > Just tell me if you want “B LEFT JOIN A” or “A LEFT JOIN B” and I’ll send > you the exact code. > > > > Dudu > > > > *From:* Kishore A [mailto:kishore.atmak...@gmail.com] > *Sent:* Wednesday, April 27, 2016 6:30 PM > *To:* user@hive.apache.org > *Subject:* Re: Question on Implementing CASE in Hive Join > > > > Hello Dudu, > > > > Thank you for previous reply. I tried all the scenarios but nothing is > working out. I need this condition with with minimal steps like by using > case when. > > > > Please find the below > > > > > > TableA > > Code Country Type Flag > > 101 US Tax X > > 101 USA Tax Y > > 101 CA Tax Y > > 101 CN Tax Y > > 142 US Pus S > > 142 CA Pus Y > > 142 CN Pus Y > > 101 USA Pus Y > > 119 USA Tax Y > > 129 USA Tax Y > > 101 USA STORE X > > 101 USA OCP X > > 142 CAN Feed X > > 101 USA Feed X > > 101 USA Win X > > 101 USA Sal X > > 192 US Mod X > > 142 USA MO X > > 101 US Retail X > > 142 US OAS X > > 101 USA OAS Y > > > > TableB > > Code Country Type Flag Value > > All All Retail All John > > All US OM X John > > ALL US OM Y Smith > > 101 All OAS All John > > 142 All OAS All John > > 192 All Mod All Sean > > 101 All Tax X John > > 101 All Tax Y Smith > > 142 US Pus X John > > 142 All Pus Y Smith > > > > > > This is how my data in reality looks like. *Type *column is the fixed > column here > > > > Requirement : TableA LeftJoin TableB , stamp the matching * Value *from > TableB. Total result should have the number of rows as TableA > > > > Scenario 1: > > > > Select * FROM TableA a > > Left Join TableB b ON > > LOWER(TRIM(a.Type)) = LOWER(TRIM(b.Type)) > > WHERE LOWER(TRIM(a.Code)) = (CASE WHEN LOWER(TRIM(b.Code)) = > LOWER('All') THEN LOWER(TRIM(a.Code)) ELSE LOWER(TRIM(b.Code)) END) > > AND LOWER(TRIM(a.country)) = (CASE WHEN LOWER(TRIM(b.country)) = > LOWER('All') THEN LOWER(TRIM(a.country)) ELSE LOWER(TRIM(b.country)) END) > > AND LOWER(TRIM(a.flag)) = (CASE WHEN LOWER(TRIM(b.flag)) = > LOWER('All') THEN LOWER(TRIM(a.flag)) ELSE LOWER(TRIM(b.flag)) END); > > > > on the output the number of rows should be same as the left table, but as > I am apply WHERE data is getting filtered out and is giving only > > matching records > > > > > > > > Scenario 2: Suggested by you > > > > Select * FROM TableA a > > Left Join TableB b ON > > LOWER(TRIM(a.Type)) = LOWER(TRIM(b.Type)) > > AND LOWER(TRIM(a.Code)) = (CASE WHEN LOWER(TRIM(b.Code)) = > LOWER('All') THEN '%' ELSE LOWER(TRIM(b.Code)) END) > > AND LOWER(TRIM(a.country)) = (CASE WHEN LOWER(TRIM(b.country)) = > LOWER('All') THEN '%' ELSE LOWER(TRIM(b.country)) END) > > AND LOWER(TRIM(a.flag)) = (CASE WHEN LOWER(TRIM(b.flag)) = > LOWER('All') THEN '%' ELSE LOWER(TRIM(b.flag)) END); > > > > As per the left join this is giving me the number of rows just as the left > side table but I am not having the matching records as per the > > right table. I have NULLs on the right side. > > > > Scenario 3: Cross Join > > > > Select * FROM TableA a > > CROSS Join TableB b ON > > LOWER(TRIM(a.Type)) = LOWER(TRIM(b.Type)) > > AND LOWER(TRIM(a.Code)) = (CASE WHEN LOWER(TRIM(b.Code)) = > LOWER('All') THEN '%' ELSE LOWER(TRIM(b.Code)) END) > > AND LOWER(TRIM(a.country)) = (CASE WHEN LOWER(TRIM(b.country)) = > LOWER('All') THEN '%' ELSE LOWER(TRIM(b.country)) END) > > AND LOWER(TRIM(a.flag)) = (CASE WHEN LOWER(TRIM(b.flag)) = > LOWER('All') THEN '%' ELSE LOWER(TRIM(b.flag)) END); > > > > Giving same result as Scenario 2 - All the left columns with NULLs on right > > > > I don't know if we still need to fine tune the query as Hive is not > supporting CASE WHEN just as in SQL. Your thoughts are more helpful to me. > > > > Kishore > > > > > > On Wed, Apr 20, 2016 at 12:04 PM, Markovitz, Dudu <dmarkov...@paypal.com> > wrote: > > The second version works as expected (after fixing a typo in the word > ‘indicator’). > > If you don’t get any results you should check your data (maybe the fields > contains trailing spaces or control characters etc.). > > > > If you’re willing to replace the ‘OUTER’ with ‘INNER’, there’s another > option - > > > > *select* * > > > > *from* b > > > > *cross* *join* a > > > > *where* a.*type* = b.*type* > > *and* a.code like *case* b.code *when* 'ALL' *then* '%' > *else* b.code *end* > > *and* a.indicator like *case* b.indicator *when* 'ALL' *then* '%' > *else* b.indicator *end* > > *;* > > > > Dudu > > > > > > *From:* Kishore A [mailto:kishore.atmak...@gmail.com] > *Sent:* Wednesday, April 20, 2016 5:04 PM > > > *To:* user@hive.apache.org > *Subject:* Re: Question on Implementing CASE in Hive Join > > > > Hi Dudu, > > > > Thank you for sending queries around this. > > > > I have run these queries and below are the observations > > > > 1. It did return the same error as before" SemanticException [Error > 10017]: Line 4:4 Both left and right aliases encountered in JOIN 'code'" > > > > 2. Query execution is successful but not retrieving any results out of it. > > > > I am clueless and not able to proceed to next step until this is resolved. > Do you have any other suggestions please? > > > > Kishore > > > > On Tue, Apr 19, 2016 at 6:08 AM, Markovitz, Dudu <dmarkov...@paypal.com> > wrote: > > Please try the following two options. > > Option 2 might be better, performance wise (depending of the data volume > and characteristics). > > > > P.s. > > I didn’t understand the explanation about the LEFT JOIN > > > > > > Dudu > > > > 1. > > > > *select* b.code > > ,b.*value* > > > > *from* b > > > > *left* *join* a > > > > *on* a.*type* = b.*type* > > *and* a.code like *case* b.code *when* > 'ALL' *then* '%' *else* b.code *end* > > *and* a.indicator like *case* b.indicatior *when* > 'ALL' *then* '%' *else* b.indicatior *end* > > *;* > > > > > > > > 2. > > > > *select* b.code > > ,b.*value* > > > > *from* b > > > > *left* *join* a > > > > *on* a.*type* = b.*type* > > *and* a.code = b.code > > *and* a.indicator = b.indicatior > > > > *where* b.code != 'ALL' > > *and* b.indicatior != 'ALL' > > > > *union* *all* > > > > *select* b.code > > ,b.*value* > > > > *from* b > > > > *left* *join* a > > > > *on* a.*type* = b.*type* > > *and* a.indicator = b.indicatior > > > > *where* b.code = 'ALL' > > *and* b.indicatior != 'ALL' > > > > *union* *all* > > > > *select* b.code > > ,b.*value* > > > > *from* b > > > > *left* *join* a > > > > *on* a.*type* = b.*type* > > *and* a.code = b.code > > > > *where* b.code != 'ALL' > > *and* b.indicatior = 'ALL' > > > > *union* *all* > > > > *select* b.code > > ,b.*value* > > > > *from* b > > > > *left* *join* a > > > > *on* a.*type* = b.*type* > > > > *where* b.code = 'ALL' > > *and* b.indicatior = 'ALL' > > *;* > > > > > > *From:* Kishore A [mailto:kishore.atmak...@gmail.com] > *Sent:* Tuesday, April 19, 2016 3:51 PM > *To:* user@hive.apache.org > *Subject:* Re: Question on Implementing CASE in Hive Join > > > > Hi Dudu, > > > > Actually we use both fields from left and right tables, I mentioned right > table just for my convenience to check whether ALL from right table can be > pulled as per join condition match. > > > > One more reason why we use left join is we should not have extra columns > after join. > > > > Kishore > > > > > > > > On Tue, Apr 19, 2016 at 5:46 AM, Markovitz, Dudu <dmarkov...@paypal.com> > wrote: > > Before dealing with the technical aspect, can you please explain what is > the point of using LEFT JOIN without selecting any field from table A? > > > > Thanks > > > > Dudu > > > > *From:* Kishore A [mailto:kishore.atmak...@gmail.com] > *Sent:* Tuesday, April 19, 2016 2:29 PM > *To:* user@hive.apache.org > *Subject:* Question on Implementing CASE in Hive Join > > > > Hi, > > > > I have a scenario to implement to cases in Hive Joins. I need to implement > case on the value on which join condition to be applied. > > > > Table A > > Code// Type// Indicator// Value// > > A 1 XYZ John > > B 1 PQR Smith > > C 2 XYZ John > > C 2 PQR Smith > > D 3 PQR Smith > > E 3 XYZ Smith > > F 4 MNO Smith > > G 3 MNO Smith > > ...