I have two relations: pageview counts keyed by a GUID and URL, and event
counts keyed by the same GUID and URL. I'm trying to join them, but I keep
getting this error: "ERROR 2087: Unexpected problem during optimization.
Found index:0 in multiple LocalRearrange operators." I've Googled it, but I
mostly find the Pig source code rather than an explanation of what the
error means. I've also made sure there are no nulls or empty strings in
the keys.

Anyone have any idea what I'm doing wrong?

Here's some sample data:

describe pv_counts: {group::pv_site_guid:chararray, group::pv_hostname:chararray, pv_count:long}

dump pv_counts:
(bSAw-mF-0r4Q-4acwqm_6r,example-url.com,10)
(bSAw-mF-0r4Q-4acwqm_6r,sports.example-url.com,10)
(bSAw-mF-0r4Q-4acwqm_6r,opinion.example-url.com,10)
(bSAw-mF-0r4Q-4acwqm_6r,newsinfo.example-url.com,10)
(bSAw-mF-0r4Q-4acwqm_6r,lifestyle.example-url.com,10)
.... many more pageviews than events ....
(dZiLDGjsGr3O3zacn9QLBk,example-url2.com.com,10)
(dZiLDGjsGr3O3zacn9QLBk,example-url3.com,10)

describe ev_counts: {group::ev_site_guid:chararray, group::ee_hostname:chararray, ev1count:long, ev2count:long, ev3count:long, ev4count:long, ev5count:long}

dump ev_counts:
(bSAw-mF-0r4Q-4acwqm_6r,example-url.com,29,0,0,0,0)
(bSAw-mF-0r4Q-4acwqm_6r,sports.example-url.com,7,0,0,0,0)
(bSAw-mF-0r4Q-4acwqm_6r,lifestyle.example-url.com,2,0,0,0,0)
.... not as many events as pageviews ....
(dZiLDGjsGr3O3zacn9QLBk,example-url2.com.com,0,0,37,0,0)
(dZiLDGjsGr3O3zacn9QLBk,example-url3.com,0,0,1,0,0)

I can dump both relations just fine, from a Pig script and from Grunt.

When I add this statement, it gets to the very end and dies:

joined_counts = JOIN ev_counts BY ev_site_guid, pv_counts BY pv_site_guid;
dump joined_counts;
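
Since both relations are keyed by GUID and hostname, one variant that might
be worth trying is to project each relation to flat field names first and
then join on both keys. This is only a sketch, not a confirmed fix for
ERROR 2087: the aliases pv_flat, ev_flat and joined_flat and the renamed
ev_hostname field are made up for illustration, and the positional
references ($0, $1) assume the field order shown in the describe output
above.

```pig
-- Hypothetical aliases; field positions follow the describe output above.
pv_flat = FOREACH pv_counts GENERATE
    $0 AS pv_site_guid,
    $1 AS pv_hostname,
    pv_count;

ev_flat = FOREACH ev_counts GENERATE
    $0 AS ev_site_guid,
    $1 AS ev_hostname,
    ev1count, ev2count, ev3count, ev4count, ev5count;

-- Join on the full (GUID, hostname) key rather than on the GUID alone.
joined_flat = JOIN ev_flat BY (ev_site_guid, ev_hostname),
                   pv_flat BY (pv_site_guid, pv_hostname);
dump joined_flat;
```

Joining on the GUID alone pairs every event row with every pageview row
for the same site, so the two-key join would also produce a different
(smaller) result than the statement above.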

It throws the "ERROR 2087: Unexpected problem during optimization. Found
index:0 in multiple LocalRearrange operators." error and an ugly stack
trace. I'm pretty new to Pig, so I've never dug into its internals.

If anyone has any tips or things to try, I'd gladly try them. I've run
the script on Pig 0.10 and 0.11, and both choke at the same point.
We're running on Cloudera's CDH3U3 (Hadoop 0.20.2).

I also posted this question on SO if it's easier to answer there:
http://stackoverflow.com/questions/16699754/apache-pig-join-error-2087-found-index0-in-multiple-localrearrange-operators

Thanks!!!
