I am sure it's not that. The ORDER command is what fails. If I remove the ORDER line, the same script runs just fine, except that the result is not in order.
On Sat, Apr 13, 2013 at 4:54 PM, Prasanth J <[email protected]> wrote:

> From the error logs, it seems like the input file doesn't exist or is not
> accessible:
>
>   Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
>   Input path does not exist:
>   file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
>
> Can you please check whether the input path in $LOGS is correct?
>
> Thanks
> -- Prasanth
>
> On Apr 12, 2013, at 11:02 PM, Lei Liu <[email protected]> wrote:
>
>> Hi, I am using Pig to compute the percentage of each UserAgent in an
>> Apache log. The following program fails because of the ORDER command at
>> the very end (the result alias is correct and can be dumped without
>> problems). I am relatively new to Pig and could not figure this out, so
>> I need your help. The program and the error messages follow. Thanks!
>>
>> logs = LOAD '$LOGS' USING ApacheCombinedLogLoader AS (remoteHost, hyphen,
>>     user, time, method, uri, protocol, statusCode, responseSize, referer,
>>     userAgent);
>>
>> uarows = FOREACH logs GENERATE userAgent;
>> total = FOREACH (GROUP uarows ALL) GENERATE COUNT(uarows) AS count;
>> dump total;
>>
>> gpuarows = GROUP uarows BY userAgent;
>> result = FOREACH gpuarows {
>>     subtotal = COUNT(uarows);
>>     GENERATE FLATTEN(group) AS ua, subtotal AS SUB_TOTAL,
>>         100 * (double)subtotal / (double)total.count AS percentage;
>> };
>> orderresult = ORDER result BY SUB_TOTAL DESC;
>> dump orderresult;
>>
>> -- what's weird is that 'dump result' works just fine, so it's the ORDER
>> line that makes trouble
>>
>> Errors:
>>
>> 2013-04-13 10:36:32,409 [Thread-48] INFO org.apache.hadoop.mapred.MapTask
>>   - record buffer = 262144/327680
>> 2013-04-13 10:36:32,437 [Thread-48] WARN
>>   org.apache.hadoop.mapred.LocalJobRunner - job_local_0005
>> java.lang.RuntimeException:
>>   org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path
>>   does not exist:
>>   file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:157)
>>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
>>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:677)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>> Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
>>   Input path does not exist:
>>   file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
>>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigFileInputFormat.listStatus(PigFileInputFormat.java:37)
>>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
>>     at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:177)
>>     at org.apache.pig.impl.io.ReadToEndLoader.<init>(ReadToEndLoader.java:124)
>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:131)
>>     ... 6 more
>> 2013-04-13 10:36:32,525 [main] INFO
>>   org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>   - HadoopJobId: job_local_0005
>> 2013-04-13 10:36:32,526 [main] INFO
>>   org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>   - Processing aliases orderresult
>> 2013-04-13 10:36:32,526 [main] INFO
>>   org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>   - detailed locations: M: orderresult[19,14] C: R:
>> 2013-04-13 10:36:37,536 [main] WARN
>>   org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>   - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig
>>   to stop immediately on failure.
>> 2013-04-13 10:36:37,536 [main] INFO
>>   org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>   - job job_local_0005 has failed! Stop running all dependent jobs
>> 2013-04-13 10:36:37,536 [main] INFO
>>   org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>   - 100% complete
>> 2013-04-13 10:36:37,537 [main] ERROR
>>   org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
>> 2013-04-13 10:36:37,538 [main] INFO
>>   org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:
>>
>> HadoopVersion  PigVersion  UserId  StartedAt            FinishedAt           Features
>> 1.0.4          0.11.0      dliu    2013-04-13 10:35:50  2013-04-13 10:36:37  GROUP_BY,ORDER_BY
>>
>> Some jobs have failed! Stop running all dependent jobs
>>
>> Job Stats (time in seconds):
>> JobId           Maps  Reduces  MaxMapTime  MinMapTIme  AvgMapTime  MedianMapTime  MaxReduceTime  MinReduceTime  AvgReduceTime  MedianReducetime  Alias                   Feature               Outputs
>> job_local_0002  1     1        n/a         n/a         n/a         n/a            n/a            n/a            1-18,logs,total,uarows  MULTI_QUERY,COMBINER
>> job_local_0003  1     1        n/a         n/a         n/a         n/a            n/a            n/a            gpuarows,result         GROUP_BY,COMBINER
>> job_local_0004  1     1        n/a         n/a         n/a         n/a            n/a            n/a            orderresult             SAMPLER
>>
>> Failed Jobs:
>> JobId           Alias        Feature   Message                          Outputs
>> job_local_0005  orderresult  ORDER_BY  Message: Job failed! Error - NA  file:/tmp/temp-1225021115/tmp-62411972,
>>
>> Input(s):
>> Successfully read 0 records from:
>>   "file:///home/dliu/ApacheLogAnalysisWithPig/access.log"
>>
>> Output(s):
>> Failed to produce result in "file:/tmp/temp-1225021115/tmp-62411972"
>>
>> Counters:
>> Total records written : 0
>> Total bytes written : 0
>> Spillable Memory Manager spill count : 0
>> Total bags proactively spilled: 0
>> Total records proactively spilled: 0
>>
>> Job DAG:
>> job_local_0002 -> job_local_0003,
>> job_local_0003 -> job_local_0004,
>> job_local_0004 -> job_local_0005,
>> job_local_0005
>>
>> 2013-04-13 10:36:37,539 [main] INFO
>>   org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>   - Some jobs have failed! Stop running all dependent jobs
>> 2013-04-13 10:36:37,541 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>>   ERROR 1066: Unable to open iterator for alias orderresult
>> Details at logfile:
>>   /home/dliu/ApacheLogAnalysisWithPig/pig_1365820535568.log
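[Editor's note: for readers less familiar with Pig, the data flow the script describes (an overall count via GROUP ALL, per-UserAgent subtotals via GROUP BY, a percentage per group, then a descending sort on the subtotal) can be modeled by the small Python sketch below. The sample data and variable names are made up for illustration; this only mirrors the logic of the script, it is not a workaround for the ORDER failure.]

```python
from collections import Counter

# Hypothetical stand-in for the userAgent column projected by `uarows`
# (not data from the actual access.log).
user_agents = ["Mozilla", "curl", "Mozilla", "Googlebot", "Mozilla"]

# Like `total = FOREACH (GROUP uarows ALL) GENERATE COUNT(uarows)`.
total = len(user_agents)

# Like `GROUP uarows BY userAgent` followed by COUNT per group.
subtotals = Counter(user_agents)

# Like the nested FOREACH: (ua, SUB_TOTAL, percentage) per group.
result = [(ua, n, 100.0 * n / total) for ua, n in subtotals.items()]

# Like `ORDER result BY SUB_TOTAL DESC`.
orderresult = sorted(result, key=lambda row: row[1], reverse=True)

for ua, n, pct in orderresult:
    print(ua, n, pct)
```

In Pig, that last sort is the only step that triggers an extra sampling job (the SAMPLER feature visible in the job stats above), which is why the script succeeds up to `dump result` and only fails at `orderresult`.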
