I was able to reproduce the bug and opened PIG-2932 to track it.

Cheers,
--
Gianmarco

On Wed, Sep 26, 2012 at 12:07 PM, Gianmarco De Francisci Morales <
[email protected]> wrote:

> Forwarding to pig-dev.
>
> Summary: it looks like we have a regression in trunk.
> We need to investigate it before branching 0.11.
>
> Cheers,
> --
> Gianmarco
>
> ---------- Forwarded message ----------
> From: Allan <[email protected]>
> Date: Wed, Sep 26, 2012 at 11:21 AM
> Subject: Re: e2e tests for Rank function
> To: cheolsoo <[email protected]>, Gianmarco De Francisci Morales <
> [email protected]>
>
> Hi Cheolsoo and Gianmarco,
>
> I double-checked the e2e tests and reproduced the scenario, and it's
> correct... it's failing.
>
> Then, looking for a possible reason, I tried the following script:
>
> SET default_parallel 9;
> A = LOAD 'prerank' using PigStorage(',') as
> (rownumber:long,rankcabd:long,rankbdaa:long,rankbdca:long,rankaacd:long,rankaaba:long,a:int,b:int,c:int,tail:bytearray);
> B = group A by (a, b);
> C = foreach B generate flatten(group),A;
> D = order C by group::a ASC, group::b ASC;
>
> It fails with the same exception message.
>
> Then I tried the same script, but omitting the "SET default_parallel 9;",
> and it works. So I'm really surprised that in local mode it doesn't work
> with parallelism.
>
> The reason for using this script is that the RANK (RANK BY) operator uses
> the same chain of operators: GROUP (B), a flatten (C), SORT (D).
>
> Best regards,
>
> On Sun, Sep 23, 2012 at 10:43 PM, Cheolsoo Park <[email protected]> wrote:
>
>> Hello,
>>
>> The e2e tests for Rank function in trunk do not pass for me when running
>> in local mode. I am wondering whether they all pass for everyone.
>>
>> What I am doing is as follows:
>>
>> ant clean
>> ant -Dhadoopversion=20 ... test-e2e-deploy-local
>> ant -Dhadoopversion=20 ... test-e2e-local -Dtests.to.run="-t Rank"
>>
>> All tests except Rank_4 fail with errors similar to this:
>>
>> java.io.IOException: Illegal partition for Null: false index: 0 (1,7) (1)
>> at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1073)
>> at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
>> at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map.collect(PigGenericMapReduce.java:123)
>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:285)
>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>>
>> I wanted to double check whether I am doing something wrong before I open
>> a jira.
>>
>> Thanks,
>> Cheolsoo
>>
>
> --
>
> Allan Avendaño S.
> Computer Engineer
> SWY22 Participant
> GSOC 2012 Participant
> Rome - Italy
> Gmail: [email protected]
> --
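
One possible reading of the exception in this thread, offered as an assumption rather than a confirmed diagnosis: the map-side collector rejects any partition number that falls outside the range of reduce partitions the job actually has. If local mode effectively runs with a single reduce partition while the partitioner was set up for 9 (from "SET default_parallel 9;"), a key could be assigned a partition the collector refuses. The Python sketch below only mimics that kind of range check. `partition_for`, `collect`, and the sum-based partitioning rule are hypothetical stand-ins, not Hadoop or Pig APIs.

```python
def partition_for(key, assumed_reducers):
    """Hypothetical partitioner: spreads keys over the reducers it assumes exist."""
    return sum(key) % assumed_reducers


def collect(key, partition, num_partitions):
    """Mimic a collector that rejects partitions outside [0, num_partitions)."""
    if partition < 0 or partition >= num_partitions:
        raise IOError(f"Illegal partition for {key} ({partition})")
    return partition


if __name__ == "__main__":
    key = (1, 7)

    # Partitioner and collector agree on one partition: accepted.
    print(collect(key, partition_for(key, 1), 1))

    # Partitioner assumes 9 reducers, collector only has 1: rejected.
    try:
        collect(key, partition_for(key, 9), 1)
    except IOError as err:
        print(err)
```

If this reading is right, it would be consistent with Allan's observation that the script works once the default_parallel setting is removed, since the partitioner and the collector would then agree on the number of partitions.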
