Re: pig + hadoop

Jeremy Hanna Tue, 19 Apr 2011 20:28:55 -0700

Oh yeah, that's what's going on. On the machine I run the pig script from, 
I set the PIG_CONF variable to my HADOOP_HOME/conf directory, and in the 
mapred-site.xml file found there I set the three variables (as the lowercase, 
dotted Hadoop configuration properties).
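
As a rough illustration only, the three properties might look like this inside the <configuration> element of mapred-site.xml. The property names are the ones quoted from contrib/pig/README.txt further down; every value is an assumption to replace with your own cluster's settings.

```xml
<!-- Sketch only: all values below are assumed, not taken from this thread. -->
<property>
  <name>cassandra.thrift.port</name>
  <value>9160</value> <!-- assumed: Cassandra's default Thrift port -->
</property>
<property>
  <name>cassandra.thrift.address</name>
  <value>localhost</value> <!-- assumed: any reachable Cassandra node -->
</property>
<property>
  <name>cassandra.partitioner.class</name>
  <!-- assumed: must match the partitioner your cluster actually uses -->
  <value>org.apache.cassandra.dht.RandomPartitioner</value>
</property>
```

The stack trace quoted below fails in Integer.parseInt inside ConfigHelper.getRpcPort, which is consistent with the port property being absent from the job configuration.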


I don't use environment variables when I run against a cluster.
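
If it helps to confirm that a config file really carries those settings before launching a job, here is a minimal sketch of a check. The helper name check_cassandra_properties and the SAMPLE document are hypothetical and not part of Pig, Hadoop, or Cassandra; the property names are the ones from the README quoted below.

```python
import xml.etree.ElementTree as ET

# Property names as quoted from contrib/pig/README.txt in this thread.
REQUIRED = (
    "cassandra.thrift.port",
    "cassandra.thrift.address",
    "cassandra.partitioner.class",
)


def check_cassandra_properties(xml_text):
    """Return a list of problems found in a Hadoop-style configuration document.

    Hypothetical helper: it only inspects the XML text it is given.
    """
    root = ET.fromstring(xml_text)
    values = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            values[name.strip()] = (prop.findtext("value") or "").strip()

    problems = []
    for name in REQUIRED:
        if not values.get(name):
            problems.append("missing: " + name)

    # The port must parse as an integer on the Hadoop side as well.
    port = values.get("cassandra.thrift.port")
    if port:
        try:
            int(port)
        except ValueError:
            problems.append("not an integer: cassandra.thrift.port=" + port)
    return problems


# Hypothetical sample in which only the address has been set.
SAMPLE = """<configuration>
  <property>
    <name>cassandra.thrift.address</name>
    <value>localhost</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for problem in check_cassandra_properties(SAMPLE):
        print(problem)
```

To check a real file, read its contents and pass the text to the function, for example the mapred-site.xml in the directory that PIG_CONF points to.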

On Apr 19, 2011, at 9:54 PM, Jeffrey Wang wrote:

> Did you set PIG_RPC_PORT in your hadoop-env.sh? I was seeing this error for a 
> while before I added that.
>  
> -Jeffrey
>  
> From: pob [mailto:peterob...@gmail.com] 
> Sent: Tuesday, April 19, 2011 6:42 PM
> To: user@cassandra.apache.org
> Subject: Re: pig + hadoop
>  
> Hey Aaron,
>  
> I read it, and all 3 env variables were exported. The results are the same.
>  
> Best,
> P
> 
> 2011/4/20 aaron morton <aa...@thelastpickle.com>
> Am guessing, but here goes. Looks like the cassandra RPC port is not set. Did 
> you follow these steps in contrib/pig/README.txt?
>  
> Finally, set the following as environment variables (uppercase,
> underscored), or as Hadoop configuration variables (lowercase, dotted):
> * PIG_RPC_PORT or cassandra.thrift.port : the port thrift is listening on 
> * PIG_INITIAL_ADDRESS or cassandra.thrift.address : initial address to 
> connect to
> * PIG_PARTITIONER or cassandra.partitioner.class : cluster partitioner
>  
> Hope that helps. 
> Aaron
>  
>  
> On 20 Apr 2011, at 11:28, pob wrote:
> 
> 
> Hello, 
>  
> I configured the cluster following 
> http://wiki.apache.org/cassandra/HadoopSupport. When I run pig 
> example-script.pig -x local, everything is fine and I get correct results.
>  
> The problem occurs with -x mapreduce.
>  
> I'm getting these errors:
>  
>  
> 2011-04-20 01:24:21,791 [main] ERROR org.apache.pig.tools.pigstats.PigStats - 
> ERROR: java.lang.NumberFormatException: null
> 2011-04-20 01:24:21,792 [main] ERROR 
> org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
> 2011-04-20 01:24:21,793 [main] INFO  org.apache.pig.tools.pigstats.PigStats - 
> Script Statistics: 
>  
> Input(s):
> Failed to read data from "cassandra://Keyspace1/Standard1"
>  
> Output(s):
> Failed to produce result in 
> "hdfs://ip:54310/tmp/temp-1383865669/tmp-1895601791"
>  
> Counters:
> Total records written : 0
> Total bytes written : 0
> Spillable Memory Manager spill count : 0
> Total bags proactively spilled: 0
> Total records proactively spilled: 0
>  
> Job DAG:
> job_201104200056_0005   ->      null,
> null    ->      null,
> null
>  
>  
> 2011-04-20 01:24:21,793 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>  - Failed!
> 2011-04-20 01:24:21,803 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
> 1066: Unable to open iterator for alias topnames. Backend error : 
> java.lang.NumberFormatException: null
>  
>  
>  
> ====
> That's from the job tasks web management page, the error from the task directly:
>  
> java.lang.RuntimeException: java.lang.NumberFormatException: null
> at 
> org.apache.cassandra.hadoop.ColumnFamilyRecordReader.initialize(ColumnFamilyRecordReader.java:123)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initialize(PigRecordReader.java:176)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:418)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:620)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: java.lang.NumberFormatException: null
> at java.lang.Integer.parseInt(Integer.java:417)
> at java.lang.Integer.parseInt(Integer.java:499)
> at org.apache.cassandra.hadoop.ConfigHelper.getRpcPort(ConfigHelper.java:233)
> at 
> org.apache.cassandra.hadoop.ColumnFamilyRecordReader.initialize(ColumnFamilyRecordReader.java:105)
> ... 5 more
>  
>  
>  
> Any suggestions where the problem might be?
>  
> Thanks,
>  
>  
>
