interesting. don't know the answer but could you change the UNION in the Postgres to UNION ALL? I'd be curious if the default is UNION DISTINCT on that platform. That would at least partially explain postgres behaviour leaving hive the odd man out.
On Wed, Mar 12, 2014 at 6:47 AM, Martin Kudlej <mkud...@redhat.com> wrote: > Hi all, > > I've tried BigTop test for join_filters: > CREATE TABLE myinput1(key int, value int); > LOAD DATA LOCAL INPATH 'seed_data_files/in3.txt' INTO TABLE myinput1; > > where seed_data_files/in3.txt: > 12 35 > NULL 40 > 48 NULL > 100 100 > > I've tried: > SELECT * FROM myinput1 a FULL OUTER JOIN myinput1 b on a.key > 40 AND > a.value > 50 AND a.key = a.value AND b.key > 40 AND b.value > 50 AND b.key > = b.value ORDER BY a.key, a.value, b.key, b.value; > > and expected result in test is: > NULL NULL NULL 40 > NULL NULL NULL 40 > NULL NULL NULL 40 > NULL NULL NULL 40 > NULL NULL 12 35 > NULL NULL 12 35 > NULL NULL 12 35 > NULL NULL 12 35 > NULL NULL 48 NULL > NULL NULL 48 NULL > NULL NULL 48 NULL > NULL NULL 48 NULL > NULL 40 NULL NULL > NULL 40 NULL NULL > NULL 40 NULL NULL > NULL 40 NULL NULL > 12 35 NULL NULL > 12 35 NULL NULL > 12 35 NULL NULL > 12 35 NULL NULL > 48 NULL NULL NULL > 48 NULL NULL NULL > 48 NULL NULL NULL > 48 NULL NULL NULL > 100 100 NULL NULL > 100 100 NULL NULL > 100 100 NULL NULL > 100 100 100 100 > > > but real hive result is: > NULL NULL NULL 40 > NULL NULL 12 35 > NULL NULL 48 NULL > NULL 40 NULL NULL > 12 35 NULL NULL > 48 NULL NULL NULL > 100 100 100 100 > > btw. result from postgresql is: > (SELECT * > FROM myinput1 a > LEFT JOIN > myinput1 b on > a.key > 40 AND > a.value > 50 AND > a.key = a.value AND > b.key > 40 AND > b.value > 50 AND > b.key = b.value ORDER BY a.key, a.value, b.key, b.value) > UNION (SELECT * > FROM myinput1 a > RIGHT JOIN > myinput1 b on > a.key > 40 AND > a.value > 50 AND > a.key = a.value AND > b.key > 40 AND > b.value > 50 AND > b.key = b.value > ORDER BY a.key, a.value, b.key, b.value); > | | 12 | 35 > 12 | 35 | | > | | 48 | > 48 | | | > | 40 | | > | | | 40 > 100 | 100 | 100 | 100 > > so it's the same like in hive. > > What is the right result for this full outer join in HiveQL, please? > > -- > Best Regards, > Martin Kudlej. > MRG/Grid & RHS-Hadoop Senior Quality Assurance Engineer > Red Hat Czech s.r.o. > > Phone: +420 532 294 155 > E-mail:mkudlej at redhat.com > IRC: mkudlej at #brno, #messaging, #grid, #rhs, #distcomp >