Yep, all you have to do is upgrade to Pig 0.8... This sort of thing is one of
the reasons the Load/Store interfaces were completely redesigned after Pig 0.6.
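In the meantime, a stopgap for the extension question is just to rename the part files after the store. A sketch, demonstrated on a local scratch directory (the `add_lzo_ext` helper and all paths here are made up for illustration; against HDFS you would do the same rename with `hadoop fs -mv`):

```shell
# Stopgap sketch: give existing part files the .lzo extension the loader
# expects. Local demo only; on the cluster, replace mv with "hadoop fs -mv".
add_lzo_ext() {
  for f in "$1"/part-*; do
    [ -e "$f" ] || continue               # no matches: glob stayed literal, skip
    case "$f" in *.lzo) continue ;; esac  # already renamed, skip
    mv "$f" "$f.lzo"
  done
}

# Demo against a throwaway directory standing in for the job output dir.
mkdir -p /tmp/lzodemo
touch /tmp/lzodemo/part-00000 /tmp/lzodemo/part-00001
add_lzo_ext /tmp/lzodemo
ls /tmp/lzodemo
```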
On Wed, Mar 2, 2011 at 8:46 PM, Kris Coward <[email protected]> wrote:
>
> Yep. That did it. Now if you don't mind my asking, is there any way to
> direct LzoTokenizedStorage to put that extension on the part files when
> it's writing them in the first place?
>
> -K
>
> On Wed, Mar 02, 2011 at 03:17:09PM -0800, Dmitriy Ryaboy wrote:
> > Oh. Yea we expect LZO files to have a .lzo extension.
> >
> > D
> >
> > On Wed, Mar 2, 2011 at 12:16 PM, Kris Coward <[email protected]> wrote:
> > >
> > > I might still be missing something useful (we're running elephant-bird
> > > from the gpl-packing distribution, and I've registered most of the
> > > jarfiles from it), but the stack trace has changed a little, so now
> > > it's producing:
> > >
> > > Backend error message during job submission
> > > -------------------------------------------
> > > org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hdfs://master.hadoop:9000/hadooptest/lzofile
> > >         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:269)
> > >         at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
> > >         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
> > >         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
> > >         at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
> > >         at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
> > >         at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
> > >         at java.lang.Thread.run(Thread.java:662)
> > > Caused by: org.apache.pig.PigException: ERROR 0: no files found a path hdfs://master.hadoop:9000/hadooptest/lzofile
> > >         at com.twitter.elephantbird.pig.load.LzoBaseLoadFunc.slice(Unknown Source)
> > >         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:260)
> > >         ... 7 more
> > >
> > > Pig Stack Trace
> > > ---------------
> > > ERROR 2997: Unable to recreate exception from backend error:
> > > org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hdfs://master.hadoop:9000/hadooptest/lzofile
> > >
> > > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias test4
> > >         at org.apache.pig.PigServer.openIterator(PigServer.java:482)
> > >         at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:539)
> > >         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241)
> > >         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
> > >         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
> > >         at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
> > >         at org.apache.pig.Main.main(Main.java:352)
> > > Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to recreate exception from backend error:
> > > org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hdfs://master.hadoop:9000/hadooptest/lzofile
> > >         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:176)
> > >         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:253)
> > >         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:249)
> > >         at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:781)
> > >         at org.apache.pig.PigServer.store(PigServer.java:529)
> > >         at org.apache.pig.PigServer.openIterator(PigServer.java:465)
> > >         ... 6 more
> > >
> > > ================================================================================
> > >
> > > The "ERROR 0: no files found a path hdfs://master.hadoop:9000/hadooptest/lzofile"
> > > message has me really puzzled, because in grunt I can see the files, I
> > > can copy them to local, I can rename them with .lzo on the end,
> > > uncompress them, and see the data that I expect, and I can even load
> > > them with PigLoader (though obviously the data's all wrong when I do
> > > that).
> > >
> > > Any more tips?
> > >
> > > Thanks,
> > > Kris
> > >
> > > On Wed, Mar 02, 2011 at 09:32:47AM -0800, Dmitriy Ryaboy wrote:
> > > > Off the top of my head, I can't think of anything, but you can just grab
> > > > everything in Elephant-Bird's lib/ directory and make sure it's on the
> > > > classpath on all the task trackers and your client machine (you can
> > > > propagate it to the TTs via the register keyword if you don't want to
> > > > bug your hadoop sysadmin and restart things).
> > > >
> > > > D
> > > >
> > > > On Wed, Mar 2, 2011 at 9:25 AM, Kris Coward <[email protected]> wrote:
> > > > >
> > > > > Nope; they're reproduced across all the machines. Does the
> > > > > LzoTokenizedLoader class have any dependencies that LzoTokenizedStorage
> > > > > doesn't (which I may be overlooking)?
> > > > >
> > > > > -K
> > > > >
> > > > > On Tue, Mar 01, 2011 at 07:17:10PM -0500, Kris Coward wrote:
> > > > > >
> > > > > > What's peculiar is that the test script for the loader class that was
> > > > > > run a week ago seems also to be failing with the same error. We've
> > > > > > added nodes to the cluster; maybe the relevant .jar files haven't
> > > > > > been copied over to those nodes. I'll bug our sysadmin about that.
> > > > > >
> > > > > > Thanks,
> > > > > > Kris
> > > > > >
> > > > > > On Tue, Mar 01, 2011 at 02:08:32PM -0800, Dmitriy Ryaboy wrote:
> > > > > > > Kris,
> > > > > > > Check the pig log file. Often "unable to create input slice" is
> > > > > > > caused by errors such as not being able to find your loader class,
> > > > > > > or some dependency of your loader class.
> > > > > > >
> > > > > > > D
> > > > > > >
> > > > > > > On Tue, Mar 1, 2011 at 1:48 PM, Kris Coward <[email protected]> wrote:
> > > > > > > >
> > > > > > > > I get the output:
> > > > > > > >
> > > > > > > > rw-r--r-- 2 kris supergroup 172694 2011-02-25 01:59 /path/to/file/item/ex/subdir
> > > > > > > >
> > > > > > > > -K
> > > > > > > >
> > > > > > > > On Tue, Mar 01, 2011 at 12:46:31PM -0800, Dmitriy Ryaboy wrote:
> > > > > > > > > What happens when you "hadoop fs -lsr" those paths?
> > > > > > > > >
> > > > > > > > > D
> > > > > > > > >
> > > > > > > > > On Sun, Feb 27, 2011 at 7:47 PM, Kris Coward <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > So I finally got a couple of test scripts running on my cluster
> > > > > > > > > > to take a sample data file, load it, do a little processing,
> > > > > > > > > > store it, load it, do a little more processing, and dump the
> > > > > > > > > > results.
> > > > > > > > > >
> > > > > > > > > > Once these were working, I set to parsing and storing some real
> > > > > > > > > > data, but got an "Unable to create input slice" error when
> > > > > > > > > > trying to load this data back out again. This happened with
> > > > > > > > > > each of:
> > > > > > > > > >
> > > > > > > > > > foo = LOAD '/path/to/file/{item,list,glob}/*/subdir' USING
> > > > > > > > > >   com.twitter.elephantbird.pig.load.LzoTokenizedLoader(',') AS (schema:...);
> > > > > > > > > > foo = LOAD '/path/to/file/item/*/subdir' USING
> > > > > > > > > >   com.twitter.elephantbird.pig.load.LzoTokenizedLoader(',') AS (schema:...);
> > > > > > > > > > foo = LOAD '/path/to/file/item/ex/subdir' USING
> > > > > > > > > >   com.twitter.elephantbird.pig.load.LzoTokenizedLoader(',') AS (schema:...);
> > > > > > > > > >
> > > > > > > > > > and yielded the error (the same each time, except for the
> > > > > > > > > > name/glob used):
> > > > > > > > > >
> > > > > > > > > > ERROR 2997: Unable to recreate exception from backend error:
> > > > > > > > > > org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hdfs://master.hadoop:9000//path/to/file/item/ex/subdir
> > > > > > > > > > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias foo
> > > > > > > > > >         at org.apache.pig.PigServer.openIterator(PigServer.java:482)
> > > > > > > > > >         at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:539)
> > > > > > > > > >         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241)
> > > > > > > > > >         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
> > > > > > > > > >         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
> > > > > > > > > >         at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
> > > > > > > > > >         at org.apache.pig.Main.main(Main.java:352)
> > > > > > > > > > Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to recreate exception from backend error:
> > > > > > > > > > org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hdfs://master.hadoop:9000/path/to/file/item/ex/subdir
> > > > > > > > > >         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:176)
> > > > > > > > > >         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:253)
> > > > > > > > > >         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:249)
> > > > > > > > > >         at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:781)
> > > > > > > > > >         at org.apache.pig.PigServer.store(PigServer.java:529)
> > > > > > > > > >         at org.apache.pig.PigServer.openIterator(PigServer.java:465)
> > > > > > > > > >         ... 6 more
> > > > > > > > > >
> > > > > > > > > > Anyone have any suggestions why this may be happening and how
> > > > > > > > > > to fix it?
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Kris
>
> --
> Kris Coward                                     http://unripe.melon.org/
> GPG Fingerprint: 2BF3 957D 310A FEEC 4733 830E 21A4 05C7 1FEB 12B3
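One footnote on the classpath advice upthread: rather than typing a REGISTER line per jar, you can generate them for everything in Elephant-Bird's lib/. A sketch, with an invented helper name and demo paths (`register_jars`, `/tmp/eb-lib`; point it at wherever your distribution unpacked the jars):

```shell
# Emit a Pig REGISTER statement for every jar in a directory, so the loader
# and all of its dependencies get shipped to the task trackers.
# register_jars and the demo paths below are illustrative, not elephant-bird API.
register_jars() {
  for j in "$1"/*.jar; do
    [ -e "$j" ] || continue   # empty directory: glob stayed literal, skip
    echo "REGISTER '$j';"
  done
}

# Demo against a scratch directory standing in for elephant-bird's lib/.
mkdir -p /tmp/eb-lib
touch /tmp/eb-lib/elephant-bird-1.0.jar /tmp/eb-lib/hadoop-lzo.jar
register_jars /tmp/eb-lib > /tmp/registers.pig
cat /tmp/registers.pig
```

Prepend the generated lines to your Pig script (e.g. `cat /tmp/registers.pig myscript.pig > run.pig`) and the jars travel with the job instead of needing a sysadmin to touch every node.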
