Thanks a lot Mark for your attention. But I think I cannot go for INNER join, the reason behind the fact being that I want all rows of Table3 irrespective of there is any row corresponding to that ID in Table 1 or Table 2. Probably I would have taken care of that while providing the example. The more refined example would be like below.
Look forward for your help. Thanks and regards, Souvik. P.S. Dropped one table from the earlier example and Now we have table 1 and table 2, we are looking for table 4. (Hope so it's more simple and it's exactly we need) *Table 1* ID Value 1 V11 2 V12 3 V13 *Table 2* ID_1 ID_2 Other_Column1 1 3 C1 2 1 C2 3 2 C3 4 9 C4 12 3 C5 1 8 C6 I want to formulate a table which would look like (*Table 4*) ID_1 ID_1_Value ID_2 ID_2_value Other_Column1 1 V11 3 V13 C1 2 V12 1 V11 C2 3 V13 2 V12 C3 4 NULL / Empty 9 NULL / Empty C4 12 NULL / Empty 3 V13 C5 1 V11 8 NULL / Empty C6 On Thu, Dec 27, 2012 at 9:06 PM, Mark Grover <grover.markgro...@gmail.com>wrote: > Souvik, > Let me begin by saying that simplifying the problem goes a long way in > helping us answer your question. You did it really nicely here, so thank > you for doing that. > > Why don't you use INNER JOIN instead of LEFT SEMI JOIN? You can express > the same query as INNER JOIN with no restrictions on what can be included > in the SELECT clause. LEFT SEMI JOIN just implements an "exists" like query > efficiently. If you want columns from the right table in your select list, > just do the regular (aka inefficient way) inner join. > > Of course, you can optimize inner joins as map joins, sorted merge join or > sorted merge bucketed joins depending on your use case. > > Mark > > On Thu, Dec 27, 2012 at 3:16 PM, Souvik Banerjee <souvikbaner...@gmail.com > > wrote: > >> Hi, >> >> I am struggling with a problem described below. >> Any help how to resolve this problem is highly appreciated. >> >> I have got few tables the structure is over simplified for the sake of >> describing the nature of the problem that I am facing. >> >> *Table 1* >> * * >> >> >> >> ID >> >> Value >> >> 1 >> >> V11 >> >> 2 >> >> V12 >> >> 3 >> >> V13 >> >> >> >> *Table 2* >> * * >> >> >> >> ID >> >> Value >> >> 1 >> >> V21 >> >> 2 >> >> V22 >> >> 3 >> >> V23 >> >> >> >> *Table 3* >> * * >> >> >> >> ID_1 >> >> ID_2 >> >> Other_Column1 >> >> 1 >> >> 3 >> >> C1 >> >> 2 >> >> 1 >> >> C2 >> >> 3 >> >> 2 >> >> C3 >> >> >> >> I want to formulate a table which would look like >> >> ID_1 >> >> ID_1_Value >> >> ID_2 >> >> ID_2_value >> >> Other_Column1 >> >> 1 >> >> V11 >> >> 3 >> >> V23 >> >> C1 >> >> 2 >> >> V12 >> >> 1 >> >> V21 >> >> C2 >> >> 3 >> >> V13 >> >> 2 >> >> V22 >> >> C3 >> >> >> >> I am facing problem with this. >> I tried to LEFT SEMI JOIN in Hive. >> >> I tried to do it in two steps (For 3 tables) >> First step I wanted to do a LEFT SEMI JOIN with TABLE 1 and TABLE 3. >> But the problem is that in LEFT SEMI JOIN you can not have columns from >> the right table in the select clause. So after join my new table simply >> looks like TABLE 3. >> >> Can you help me how I can achieve this is HIVE. >> >> Thanks and regards, >> Souvik. >> > >