Re: pig + hadoop

pob Wed, 20 Apr 2011 02:20:01 -0700

Hi,

That was the problem! Thanks. You should add this to your
documentation.



Thanks for the help!


Best,
P

2011/4/20 Jeremy Hanna <jeremy.hanna1...@gmail.com>

> Just as an example:
>
>  <property>
>    <name>cassandra.thrift.address</name>
>    <value>10.12.34.56</value>
>  </property>
>  <property>
>    <name>cassandra.thrift.port</name>
>    <value>9160</value>
>  </property>
>  <property>
>    <name>cassandra.partitioner.class</name>
>    <value>org.apache.cassandra.dht.RandomPartitioner</value>
>  </property>
>
>
> On Apr 19, 2011, at 10:28 PM, Jeremy Hanna wrote:
>
> > oh yeah - that's what's going on.  what I do is on the machine that I run
> > the pig script from, I set the PIG_CONF variable to my HADOOP_HOME/conf
> > directory and in my mapred-site.xml file found there, I set the three
> > variables.
> >
> > I don't use environment variables when I run against a cluster.
> >
> > On Apr 19, 2011, at 9:54 PM, Jeffrey Wang wrote:
> >
> >> Did you set PIG_RPC_PORT in your hadoop-env.sh? I was seeing this error
> >> for a while before I added that.
> >>
> >> -Jeffrey
> >>
> >> From: pob [mailto:peterob...@gmail.com]
> >> Sent: Tuesday, April 19, 2011 6:42 PM
> >> To: user@cassandra.apache.org
> >> Subject: Re: pig + hadoop
> >>
> >> Hey Aaron,
> >>
> >> I read it, and all three env variables were exported. The results are
> >> the same.
> >>
> >> Best,
> >> P
> >>
> >> 2011/4/20 aaron morton <aa...@thelastpickle.com>
> >> I'm guessing, but here goes. It looks like the Cassandra RPC port is not
> >> set. Did you follow these steps in contrib/pig/README.txt?
> >>
> >> Finally, set the following as environment variables (uppercase,
> >> underscored), or as Hadoop configuration variables (lowercase, dotted):
> >> * PIG_RPC_PORT or cassandra.thrift.port : the port thrift is listening on
> >> * PIG_INITIAL_ADDRESS or cassandra.thrift.address : initial address to connect to
> >> * PIG_PARTITIONER or cassandra.partitioner.class : cluster partitioner
> >>
> >> Hope that helps.
> >> Aaron
> >>
> >>
> >> On 20 Apr 2011, at 11:28, pob wrote:
> >>
> >>
> >> Hello,
> >>
> >> I configured the cluster following
> >> http://wiki.apache.org/cassandra/HadoopSupport. When I run
> >> pig example-script.pig -x local, everything is fine and I get correct
> >> results.
> >>
> >> The problem occurs with -x mapreduce.
> >>
> >> I'm getting these errors:
> >>
> >>
> >> 2011-04-20 01:24:21,791 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR: java.lang.NumberFormatException: null
> >> 2011-04-20 01:24:21,792 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
> >> 2011-04-20 01:24:21,793 [main] INFO  org.apache.pig.tools.pigstats.PigStats - Script Statistics:
> >>
> >> Input(s):
> >> Failed to read data from "cassandra://Keyspace1/Standard1"
> >>
> >> Output(s):
> >> Failed to produce result in "hdfs://ip:54310/tmp/temp-1383865669/tmp-1895601791"
> >>
> >> Counters:
> >> Total records written : 0
> >> Total bytes written : 0
> >> Spillable Memory Manager spill count : 0
> >> Total bags proactively spilled: 0
> >> Total records proactively spilled: 0
> >>
> >> Job DAG:
> >> job_201104200056_0005   ->      null,
> >> null    ->      null,
> >> null
> >>
> >>
> >> 2011-04-20 01:24:21,793 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
> >> 2011-04-20 01:24:21,803 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias topnames. Backend error : java.lang.NumberFormatException: null
> >>
> >>
> >>
> >> ====
> >> This is from the jobtasks web management page, the error from the task directly:
> >>
> >> java.lang.RuntimeException: java.lang.NumberFormatException: null
> >> at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.initialize(ColumnFamilyRecordReader.java:123)
> >> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initialize(PigRecordReader.java:176)
> >> at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:418)
> >> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:620)
> >> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> >> at org.apache.hadoop.mapred.Child.main(Child.java:170)
> >> Caused by: java.lang.NumberFormatException: null
> >> at java.lang.Integer.parseInt(Integer.java:417)
> >> at java.lang.Integer.parseInt(Integer.java:499)
> >> at org.apache.cassandra.hadoop.ConfigHelper.getRpcPort(ConfigHelper.java:233)
> >> at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.initialize(ColumnFamilyRecordReader.java:105)
> >> ... 5 more
> >>
> >>
> >>
> >> Any suggestions where the problem might be?
> >>
> >> Thanks,
> >>
> >>
> >>
> >
>
>
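
For anyone who finds this thread later: the fix was to put the three
cassandra.* properties into the Hadoop configuration that the pig script
picks up. Below is a minimal sketch of a check for that. It is not part of
Cassandra, Pig or Hadoop; the helper name missing_cassandra_properties and
the SAMPLE text are made up for illustration, and it assumes the settings
live in an ordinary Hadoop <configuration> file such as the mapred-site.xml
mentioned above.

```python
import xml.etree.ElementTree as ET

# Property names as quoted in this thread from contrib/pig/README.txt.
REQUIRED = (
    "cassandra.thrift.address",
    "cassandra.thrift.port",
    "cassandra.partitioner.class",
)


def missing_cassandra_properties(site_xml_text):
    """Return the required property names that are absent or have an empty value.

    Hypothetical helper: it only inspects the XML text it is given.
    """
    root = ET.fromstring(site_xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name:
            found[name] = value
    return [name for name in REQUIRED if not found.get(name)]


# Illustrative config with the port left out, the situation this thread
# traced the NumberFormatException back to.
SAMPLE = """<configuration>
  <property>
    <name>cassandra.thrift.address</name>
    <value>10.12.34.56</value>
  </property>
  <property>
    <name>cassandra.partitioner.class</name>
    <value>org.apache.cassandra.dht.RandomPartitioner</value>
  </property>
</configuration>"""

print(missing_cassandra_properties(SAMPLE))
```

To check a real file, read its contents (for example with
open(path).read()) and pass the text to the helper; the path depends on
your own Hadoop installation. An empty list means all three properties
are present.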
