Hi Mohan,

The "Left Outer Join of three or more relations with EmptyBagToNullFields" section at https://engineering.linkedin.com/datafu/datafu-10 could probably be useful for you.
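In case it helps, here is a rough sketch of two ways this could be written. Pig's outer JOIN accepts only two relations, so the first variant chains two LEFT OUTER joins; the second follows the COGROUP pattern from the DataFu post. The file paths, the comma delimiter, the field names (prov, cnt), and the jar location are my assumptions, not taken from your script. The loads declare schemas because the relation that may produce nulls needs one.

```pig
-- Sketch only: paths, delimiter, and field names below are assumptions.
PROV   = LOAD 'prov.txt'   USING PigStorage(',') AS (prov:chararray);
COUNT1 = LOAD 'count1.txt' USING PigStorage(',') AS (prov:chararray, cnt:int);
COUNT2 = LOAD 'count2.txt' USING PigStorage(',') AS (prov:chararray, cnt:int);

-- Variant 1: chain two-way left outer joins.
-- J1 fields: $0 = PROV key, $1 = COUNT1 key, $2 = COUNT1 count.
J1 = JOIN PROV BY $0 LEFT OUTER, COUNT1 BY $0;

-- MERGEOP adds: $3 = COUNT2 key, $4 = COUNT2 count.
MERGEOP = JOIN J1 BY $0 LEFT OUTER, COUNT2 BY $0;

-- Keep the PROV key and the two counts; unmatched counts come out as null.
RESULT = FOREACH MERGEOP GENERATE $0 AS prov, $2 AS cnt1, $4 AS cnt2;
DUMP RESULT;

-- Variant 2: COGROUP with DataFu's EmptyBagToNullFields.
-- The jar path is hypothetical; point it at your DataFu jar.
REGISTER 'datafu.jar';
DEFINE EmptyBagToNullFields datafu.pig.bags.EmptyBagToNullFields();

GROUPED = COGROUP PROV BY prov, COUNT1 BY prov, COUNT2 BY prov;

-- Flattening PROV drops keys that are missing from PROV; empty COUNT bags
-- are replaced by a single tuple of nulls so the PROV row survives.
RESULT2 = FOREACH GROUPED GENERATE
    FLATTEN(PROV),
    FLATTEN(EmptyBagToNullFields(COUNT1)),
    FLATTEN(EmptyBagToNullFields(COUNT2));
DUMP RESULT2;
```

With the data you posted, I would expect every PROV entry to appear in either result, with nulls where a count is missing (for example, 12 has no entry in COUNT1 but has 4 in COUNT2).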
Hope this helps.

Regards,
Gufran Pathan | www.mu-sigma.com

-----Original Message-----
From: Serega Sheypak [mailto:[email protected]]
Sent: Wednesday, November 05, 2014 4:33 PM
To: [email protected]
Subject: Re: Unable to Join more than two datasets in PIG through JOIN ??

JOIN (outer): Performs an outer join of *two* relations based on common field values.

2014-11-05 13:22 GMT+03:00 Mohan <[email protected]>:

> Hi There,
>
> I have Three Datasets as below -
>
> PROV:
> (01)
> (02)
> (04)
> (05)
> (06)
> (08)
> (09)
> (10)
> (11)
> (12)
> (13)
> (15)
> (16)
> (17)
> (18)
> (19)
> (20)
>
> COUNT1:
> (01,3)
> (04,1)
> (05,2)
> (06,4)
> (08,2)
> (09,1)
>
> COUNT2:
> (01,2)
> (04,2)
> (05,1)
> (06,5)
> (08,1)
> (12,4)
> (13,2)
>
> COUNT2:
> (01,3)
> (04,1)
> (05,2)
> (06,4)
> (08,2)
> (09,1)
>
> When I try to join these three datasets with below statement,
>
> MERGEOP = JOIN PROV BY $0 LEFT OUTER,COUNT1 BY $0,COUNT2 BY $0; *(I
> would like to carry all the entries in PROV dataset )*
>
> I am getting an error message like -
>
> 2014-11-05 02:12:08,709 [main] ERROR org.apache.pig.tools.grunt.Grunt
> - ERROR 1200: <line 12, column 49> mismatched input ',' expecting
> SEMI_COLON Details at logfile:
> /var/lib/hadoop-hdfs/pig_1415182150127.log
>
> But when I try the same by removing LEFT OUTER it works without
> capturing all the entries in PROV. Could some one assist me on this?
>
>
> Thanks,
> Mohan
