Played around with this some more. Got some interesting results. 

It turns out that having two STORE commands in the script is what is causing it 
to fail. If I comment out either of them, the script will run and produce the 
other result. Because of that, we know that the code used in both of those 
paths is ok.

Also, if I copy the script & run it from the grunt prompt, both outputs work 
fine. I imagine this is because the prompt runs one output at a time.

At this point I'd say this looks like a bug in Pig. An NPE in the stack trace 
is not expected behavior.

From the line numbers in the version of pig that I'm running, it appears to be 
this bit of code (line 789). It looks like operationID is not present in the 
globalCounters map, so get() returns null and calling iterator() on it throws 
the NPE.

 

JobControlCompiler.java, pig 0.11.1
(http://grepcode.com/file/repo1.maven.org/maven2/org.apache.pig/pig/0.11.1/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java?av=f#787)

787      while(operationIDs.hasNext()) {
788          String operationID = operationIDs.next();
789          Iterator<Pair<String, Long>> itPairs = globalCounters.get(operationID).iterator();
790          Pair<String,Long> pair = null;
791          while(itPairs.hasNext()) {
792              pair = itPairs.next();
793              conf.setLong(pair.first, pair.second);
794          }
795      }


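To make the suspected failure mode concrete, here is a minimal standalone 
sketch. It is not Pig's code and not a proposed patch: the class name 
CounterLookupSketch, both helper methods, the "storeA"/"storeB" IDs and the 
plain Map standing in for the job configuration are all made up for 
illustration. It only shows that a lookup shaped like line 789 throws an NPE 
when the key is missing, and what a null check around it could look like.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class CounterLookupSketch {

    // Unguarded lookup, same shape as line 789: if operationID has no entry,
    // get() returns null and the call to iterator() throws the NPE.
    public static Iterator<Map.Entry<String, Long>> unguardedLookup(
            Map<String, List<Map.Entry<String, Long>>> globalCounters,
            String operationID) {
        return globalCounters.get(operationID).iterator();
    }

    // Guarded variant (hypothetical): copies counters into conf and collects
    // the IDs that had no entry instead of failing on them.
    public static List<String> copyCounters(
            Iterator<String> operationIDs,
            Map<String, List<Map.Entry<String, Long>>> globalCounters,
            Map<String, Long> conf) {
        List<String> missing = new ArrayList<>();
        while (operationIDs.hasNext()) {
            String operationID = operationIDs.next();
            List<Map.Entry<String, Long>> pairs = globalCounters.get(operationID);
            if (pairs == null) {
                missing.add(operationID);
                continue;
            }
            for (Map.Entry<String, Long> pair : pairs) {
                conf.put(pair.getKey(), pair.getValue());
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Only "storeA" has counters registered; "storeB" is absent on purpose.
        Map<String, List<Map.Entry<String, Long>>> globalCounters = new HashMap<>();
        List<Map.Entry<String, Long>> storeA = new ArrayList<>();
        storeA.add(new AbstractMap.SimpleEntry<>("records.written", 10L));
        globalCounters.put("storeA", storeA);

        List<String> ids = new ArrayList<>();
        ids.add("storeA");
        ids.add("storeB");

        try {
            unguardedLookup(globalCounters, "storeB");
        } catch (NullPointerException e) {
            System.out.println("NPE on the missing operation ID");
        }

        Map<String, Long> conf = new HashMap<>();
        List<String> missing = copyCounters(ids.iterator(), globalCounters, conf);
        System.out.println("copied=" + conf + " missing=" + missing);
    }
}
```

Whether skipping a missing ID is the right behavior inside Pig is a separate 
question. The sketch only isolates where the NPE would come from.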

-Matt


On Monday, October 6, 2014 at 8:54 PM, Sunil S Nandihalli wrote:

> The input file-directory tarred and gzipped is here 
> (https://transfer.sh/Nmnkk/rawlogs.tgz) . The Jar file which contains all the 
> udfs is here (https://transfer.sh/JpSKg/pigpen.jar)
> 
> On Tue, Oct 7, 2014 at 9:07 AM, Sunil S Nandihalli wrote:
> > Hi Everybody,
> >  The pig script mba.pig (https://gist.github.com/97073ae7bf16d8be5532) is 
> > giving me the following error when run. This is a PigPen generated script. 
> > The log is here (https://gist.github.com/228a84351440f7b15e62). The last 
> > few lines of the stdout are:
> > 
> > 2014-10-07 03:18:14,252 [LocalJobRunner Map Task Executor #0] INFO  
> > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader
> >  - Current split being processed 
> > file:/tmp/temp-923128527/tmp204410789/part-r-00000:0+0
> > 2014-10-07 03:18:14,259 [LocalJobRunner Map Task Executor #0] WARN  
> > org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already 
> > been initialized
> > 2014-10-07 03:18:14,281 [LocalJobRunner Map Task Executor #0] INFO  
> > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map 
> > - Aliases being processed per job phase (AliasName[line,offset]): M: 
> > generate6660[329,15],union6387[332,12],generate6661[336,15],generate6662[341,15],generate6663[349,15]
> >  C:  R: 
> > 2014-10-07 03:18:14,291 [LocalJobRunner Map Task Executor #0] INFO  
> > org.apache.hadoop.mapred.LocalJobRunner - 
> > 2014-10-07 03:18:14,291 [LocalJobRunner Map Task Executor #0] INFO  
> > org.apache.hadoop.mapred.Task - Task:attempt_local710497996_0012_m_000001_0 
> > is done. And is in the process of committing
> > 2014-10-07 03:18:14,294 [LocalJobRunner Map Task Executor #0] INFO  
> > org.apache.hadoop.mapred.LocalJobRunner - 
> > 2014-10-07 03:18:14,294 [LocalJobRunner Map Task Executor #0] INFO  
> > org.apache.hadoop.mapred.Task - Task attempt_local710497996_0012_m_000001_0 
> > is allowed to commit now
> > 2014-10-07 03:18:14,296 [LocalJobRunner Map Task Executor #0] INFO  
> > org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output 
> > of task 'attempt_local710497996_0012_m_000001_0' to 
> > file:/home/hdfs/sunil/mobster-knowledge-clj/hadoop-repl/sunil/output/mba/app-install.clj/_temporary/0/task_local710497996_0012_m_000001
> > 2014-10-07 03:18:14,298 [LocalJobRunner Map Task Executor #0] INFO  
> > org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output 
> > of task 'attempt_local710497996_0012_m_000001_0' to 
> > file:/tmp/temp-923128527/tmp927324561/_temporary/0/task_local710497996_0012_m_000001
> > 2014-10-07 03:18:14,299 [LocalJobRunner Map Task Executor #0] INFO  
> > org.apache.hadoop.mapred.LocalJobRunner - map
> > 2014-10-07 03:18:14,299 [LocalJobRunner Map Task Executor #0] INFO  
> > org.apache.hadoop.mapred.Task - Task 
> > 'attempt_local710497996_0012_m_000001_0' done.
> > 2014-10-07 03:18:14,299 [LocalJobRunner Map Task Executor #0] INFO  
> > org.apache.hadoop.mapred.LocalJobRunner - Finishing task: 
> > attempt_local710497996_0012_m_000001_0
> > 2014-10-07 03:18:14,299 [Thread-147] INFO  
> > org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
> > 2014-10-07 03:18:14,722 [main] INFO  
> > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> >  - 63% complete
> > 2014-10-07 03:18:14,724 [main] WARN  
> > org.apache.pig.tools.pigstats.PigStatsUtil - Failed to get RunningJob for 
> > job job_local710497996_0012
> > 2014-10-07 03:18:14,728 [main] INFO  org.apache.pig.tools.pigstats.JobStats 
> > - using output size reader: 
> > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.FileBasedOutputSizeReader
> > 2014-10-07 03:18:14,731 [main] INFO  
> > org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added 
> > to the job
> > 2014-10-07 03:20:16,529 [main] INFO  
> > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> >  - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> > 2014-10-07 03:20:16,534 [main] INFO  
> > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> >  - Reduce phase detected, estimating # of required reducers.
> > 2014-10-07 03:20:16,535 [main] INFO  
> > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> >  - Setting Parallelism to 1
> > 2014-10-07 03:20:16,547 [main] INFO  
> > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> >  - Setting up multi store job
> > 2014-10-07 03:20:16,561 [main] ERROR org.apache.pig.tools.grunt.Grunt - 
> > ERROR 2017: Internal error creating job configuration.
> > 
> > 
> > Can somebody help me figure out what is happening?
> > Thanks,
> > Sunil.
> > 
> > 
> 
> 
> 
