From the error logs, it seems the input file does not exist or is not accessible.

> Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
> Input path does not exist:
> file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017

Can you please check whether the input path in $LOGS is correct?
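
If it helps, here is a minimal sketch of such a check in Python, run outside Pig. The function name check_input_path is hypothetical (it is not part of Pig or Hadoop), and it only covers local file: paths like the ones in your log, not HDFS. It also flags an empty file, since the job statistics report 0 records read from access.log.

```python
import os


def check_input_path(path):
    """Describe whether a local input path looks usable (hypothetical helper)."""
    # The path must exist on the local filesystem.
    if not os.path.exists(path):
        return "missing"
    # The current user must be allowed to read it.
    if not os.access(path, os.R_OK):
        return "not readable"
    # A zero-byte file loads without error but yields no records.
    if os.path.isfile(path) and os.path.getsize(path) == 0:
        return "empty"
    return "ok"


if __name__ == "__main__":
    import sys

    # Pass the same value you give Pig for the LOGS parameter.
    for p in sys.argv[1:]:
        print(p, "->", check_input_path(p))
```

Save it under any file name and pass the value of LOGS as the argument; anything other than "ok" suggests the problem is with the input rather than with the script.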

Thanks
-- Prasanth

On Apr 12, 2013, at 11:02 PM, Lei Liu wrote:

> Hi, I am using Pig to analyze the percentage of each UserAgent in an
> Apache log. The following program fails at the ORDER command at the very
> end (the result relation is correct and can be dumped correctly). I am
> relatively new to Pig and could not figure it out, so I need your help.
> The program and error message follow. Thanks!
> 
> logs = LOAD '$LOGS' USING ApacheCombinedLogLoader AS (remoteHost, hyphen,
> user, time, method, uri, protocol, statusCode, responseSize, referer,
> userAgent);
> 
> uarows = FOREACH logs GENERATE userAgent;
> total = FOREACH (GROUP uarows ALL) GENERATE COUNT(uarows) as count;
> dump total;
> 
> gpuarows = GROUP uarows BY userAgent;
> result = FOREACH gpuarows {
>       subtotal = COUNT(uarows);
>       GENERATE flatten(group) as ua, subtotal AS SUB_TOTAL,
> 100*(double)subtotal/(double)total.count AS percentage;
>       };
> orderresult = ORDER result BY SUB_TOTAL DESC;
> dump orderresult;
> 
> -- what's weird is that 'dump result' works just fine, so it's the ORDER
> line that causes the trouble
> 
> Errors:
> 2013-04-13 10:36:32,409 [Thread-48] INFO  org.apache.hadoop.mapred.MapTask
> - record buffer = 262144/327680
> 2013-04-13 10:36:32,437 [Thread-48] WARN
> org.apache.hadoop.mapred.LocalJobRunner - job_local_0005
> java.lang.RuntimeException:
> org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path
> does not exist:
> file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:157)
>    at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
>    at
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>    at
> org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:677)
>    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
>    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>    at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
> Input path does not exist:
> file:/home/dliu/ApacheLogAnalysisWithPig/pigsample_259943398_1365820592017
>    at
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigFileInputFormat.listStatus(PigFileInputFormat.java:37)
>    at
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
>    at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:177)
>    at
> org.apache.pig.impl.io.ReadToEndLoader.<init>(ReadToEndLoader.java:124)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:131)
>    ... 6 more
> 2013-04-13 10:36:32,525 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - HadoopJobId: job_local_0005
> 2013-04-13 10:36:32,526 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - Processing aliases orderresult
> 2013-04-13 10:36:32,526 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - detailed locations: M: orderresult[19,14] C:  R:
> 2013-04-13 10:36:37,536 [main] WARN
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to
> stop immediately on failure.
> 2013-04-13 10:36:37,536 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - job job_local_0005 has failed! Stop running all dependent jobs
> 2013-04-13 10:36:37,536 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - 100% complete
> 2013-04-13 10:36:37,537 [main] ERROR
> org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
> 2013-04-13 10:36:37,538 [main] INFO
> org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:
> 
> HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt    Features
> 1.0.4    0.11.0    dliu    2013-04-13 10:35:50    2013-04-13 10:36:37
> GROUP_BY,ORDER_BY
> 
> Some jobs have failed! Stop running all dependent jobs
> 
> Job Stats (time in seconds):
> JobId    Maps    Reduces    MaxMapTime    MinMapTIme    AvgMapTime
> MedianMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime
> MedianReducetime    Alias    Feature    Outputs
> job_local_0002    1    1    n/a    n/a    n/a    n/a    n/a    n/a
> 1-18,logs,total,uarows    MULTI_QUERY,COMBINER
> job_local_0003    1    1    n/a    n/a    n/a    n/a    n/a    n/a
> gpuarows,result    GROUP_BY,COMBINER
> job_local_0004    1    1    n/a    n/a    n/a    n/a    n/a    n/a
> orderresult    SAMPLER
> 
> Failed Jobs:
> JobId    Alias    Feature    Message    Outputs
> job_local_0005    orderresult    ORDER_BY    Message: Job failed! Error -
> NA    file:/tmp/temp-1225021115/tmp-62411972,
> 
> Input(s):
> Successfully read 0 records from:
> "file:///home/dliu/ApacheLogAnalysisWithPig/access.log"
> 
> Output(s):
> Failed to produce result in "file:/tmp/temp-1225021115/tmp-62411972"
> 
> Counters:
> Total records written : 0
> Total bytes written : 0
> Spillable Memory Manager spill count : 0
> Total bags proactively spilled: 0
> Total records proactively spilled: 0
> 
> Job DAG:
> job_local_0002    ->    job_local_0003,
> job_local_0003    ->    job_local_0004,
> job_local_0004    ->    job_local_0005,
> job_local_0005
> 
> 
> 2013-04-13 10:36:37,539 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - Some jobs have failed! Stop running all dependent jobs
> 2013-04-13 10:36:37,541 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> ERROR 1066: Unable to open iterator for alias orderresult
> Details at logfile:
> /home/dliu/ApacheLogAnalysisWithPig/pig_1365820535568.log
