+User group

Hi Bhooshan,
By default you should be running in MapReduce mode unless specified otherwise. Are you creating a PigServer object to run your jobs? Can you provide your code here?

Sent from my iPhone

On Apr 12, 2013, at 6:23 PM, Bhooshan Mogal <[email protected]> wrote:

Apologies for the premature send. I have some more information. After I applied the patch and set "pig.use.overriden.hadoop.configs=true", I saw an NPE (stack trace below) and a message saying Pig was running with exectype local:

2013-04-13 07:37:13,758 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: local
2013-04-13 07:37:13,760 [main] WARN  org.apache.hadoop.conf.Configuration - mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
2013-04-13 07:37:14,162 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: Pig script failed to parse: <file test.pig, line 1, column 4> pig script failed to validate: java.lang.NullPointerException

Here is the stack trace:

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing.
Pig script failed to parse: <file test.pig, line 1, column 4> pig script failed to validate: java.lang.NullPointerException
	at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1606)
	at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1549)
	at org.apache.pig.PigServer.registerQuery(PigServer.java:549)
	at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:971)
	at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:190)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
	at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
	at org.apache.pig.Main.run(Main.java:555)
	at org.apache.pig.Main.main(Main.java:111)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:616)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: Failed to parse: Pig script failed to parse: <file test.pig, line 1, column 4> pig script failed to validate: java.lang.NullPointerException
	at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:184)
	at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1598)
	... 14 more
Caused by: <file test.pig, line 1, column 4> pig script failed to validate: java.lang.NullPointerException
	at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:438)
	at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3168)
	at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1291)
	at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:789)
	at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:507)
	at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:382)
	at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:177)
	... 15 more

On Fri, Apr 12, 2013 at 6:16 PM, Bhooshan Mogal <[email protected]> wrote:

> Yes; however, I did not add core-site.xml, hdfs-site.xml, or yarn-site.xml,
> only my-filesystem-site.xml, using both Configuration.addDefaultResource and
> Configuration.addResource.
>
> I see what you are saying, though. Might the patch require users to take
> care of adding the default config resources themselves, apart from their own
> resources?
>
> On Fri, Apr 12, 2013 at 6:06 PM, Prashant Kommireddi <[email protected]> wrote:
>
>> Did you set "pig.use.overriden.hadoop.configs=true" and then add your
>> configuration resources?
>>
>> On Fri, Apr 12, 2013 at 5:32 PM, Bhooshan Mogal <[email protected]> wrote:
>>
>>> Hi Prashant,
>>>
>>> Thanks for your response to my question, and sorry for the delayed
>>> reply. I was not subscribed to the dev mailing list and hence did not get a
>>> notification about your reply. I have copied our thread below so you can
>>> get some context.
>>>
>>> I tried the patch you pointed to; however, with that patch it looks
>>> like Pig is unable to find core-site.xml. It indicates that it is running
>>> the script in local mode despite fs.default.name being defined as
>>> the location of the HDFS namenode.
>>>
>>> Here is what I am trying to do: I have developed my own
>>> org.apache.hadoop.fs.FileSystem implementation and am trying to use it in
>>> my Pig script. This implementation requires its own *-default.xml and
>>> *-site.xml files. I have added the path to these files to PIG_CLASSPATH as
>>> well as HADOOP_CLASSPATH, and I can confirm that Hadoop finds them, since
>>> I am able to read these configurations in my own code. The Pig code,
>>> however, cannot find these configuration parameters. After some debugging
>>> in the Pig code, it seems to me that Pig does not use all the resources
>>> added to the Configuration object, but only certain specific ones such as
>>> hadoop-site.xml, core-site.xml, pig-cluster-hadoop-site.xml,
>>> yarn-site.xml, and hdfs-site.xml (I am looking at HExecutionEngine.java).
>>> Is it possible to have Pig load user-defined resources, say
>>> foo-default.xml and foo-site.xml, while creating the JobConf object? I am
>>> narrowing in on this as the problem because Pig can find my config
>>> parameters if I define them in core-site.xml instead of
>>> my-filesystem-site.xml.
>>>
>>> Let me know if you need more details about the issue.
>>>
>>> Here is our previous conversation:
>>>
>>> Hi Bhooshan,
>>>
>>> There is a patch that addresses what you need; it is part of 0.12
>>> (unreleased). Take a look and see if you can apply the patch to the
>>> version you are using: https://issues.apache.org/jira/browse/PIG-3135
>>>
>>> With this patch, the following property will allow you to override the
>>> defaults and pass in your own configuration:
>>> pig.use.overriden.hadoop.configs=true
>>>
>>> On Thu, Mar 28, 2013 at 6:10 PM, Bhooshan Mogal <[email protected]> wrote:
>>>
>>> > Hi Folks,
>>> >
>>> > I have implemented the Hadoop FileSystem abstract class for a storage
>>> > system at work. This implementation uses some config files that are
>>> > similar in structure to Hadoop config files.
>>> > They have a *-default.xml and a *-site.xml for users to override
>>> > default properties. In the class that implements the Hadoop FileSystem,
>>> > I added these configuration files as default resources in a static
>>> > block, using Configuration.addDefaultResource("my-default.xml") and
>>> > Configuration.addDefaultResource("my-site.xml"). This was working fine,
>>> > and we were able to run the Hadoop FileSystem CLI and map-reduce jobs
>>> > for our storage system. However, when we tried using this storage
>>> > system in Pig scripts, we saw errors indicating that our configuration
>>> > parameters were not available. Upon further debugging, we saw that the
>>> > config files were added to the Configuration object as resources, but
>>> > as part of defaultResources. However, in Main.java in the Pig source,
>>> > the Configuration object is created as "Configuration conf = new
>>> > Configuration(false);", which sets loadDefaults to false in the conf
>>> > object. As a result, properties from the default resources (including
>>> > our config files) were not loaded and hence were unavailable.
>>> >
>>> > We solved the problem by using Configuration.addResource instead of
>>> > Configuration.addDefaultResource, but we still could not figure out why
>>> > Pig does not use default resources.
>>> >
>>> > Could someone on the list explain why this is the case?
>>> >
>>> > Thanks,
>>> > --
>>> > Bhooshan
>>>
>>> --
>>> Bhooshan
>>
>
> --
> Bhooshan

--
Bhooshan
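For readers following the thread, the loadDefaults behavior Bhooshan describes can be sketched as follows. This is a sketch only, assuming hadoop-common on the classpath; the class name MyFileSystemConfig and the my-*.xml file names are hypothetical, taken from the thread rather than from any real project.

```java
import org.apache.hadoop.conf.Configuration;

public class MyFileSystemConfig {

    static {
        // Registers the files as *default* resources. Default resources
        // are only loaded by Configuration instances created with
        // loadDefaults == true, i.e. new Configuration() or
        // new Configuration(true).
        Configuration.addDefaultResource("my-default.xml");
        Configuration.addDefaultResource("my-site.xml");
    }

    public static Configuration pigStyleConf() {
        // Pig's Main.java creates its Configuration with loadDefaults
        // == false, so the default resources registered above are
        // skipped -- the symptom described in the thread.
        return new Configuration(false);
    }

    public static Configuration workaroundConf() {
        // The workaround from the thread: add the files as ordinary
        // per-instance resources, which are loaded regardless of the
        // loadDefaults flag.
        Configuration conf = new Configuration(false);
        conf.addResource("my-default.xml");
        conf.addResource("my-site.xml");
        return conf;
    }
}
```

The design distinction is that addDefaultResource registers a resource globally for all default-loading Configuration instances, while addResource attaches it to one instance; only the latter survives a Configuration(false) construction.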

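For reference, a site file like the my-filesystem-site.xml mentioned in the thread might look like the fragment below. The values and the fs.myfs.impl property name are illustrative assumptions (fs.<scheme>.impl is how Hadoop 1.x maps a URI scheme to a FileSystem implementation); fs.default.name is the property the thread refers to.

```xml
<?xml version="1.0"?>
<!-- Hypothetical my-filesystem-site.xml; values are placeholders. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
  <property>
    <name>fs.myfs.impl</name>
    <value>com.example.MyFileSystem</value>
  </property>
</configuration>
```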