Re: Reading Cassandra Data From Pig/Hadoop

2014-05-30 Thread Kevin Burton
There's a pig-with-cassandra script somewhere you should be using. It adds the jars, etc. One issue is that you need to call register on the .jars from your pig scripts. Honestly, someone should write an example pig setup with modern hadoop, all the right register commands, real UPDATE queries
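The register step mentioned above could be scripted along these lines. This is only a sketch: the function name `emit_register_lines` and the jar directory are hypothetical, not part of the pig-with-cassandra script or any shipped tool.

```shell
#!/bin/sh
# Print one Pig REGISTER statement per jar found in a directory.
# Assumption: the directory passed in is wherever your Cassandra jars live.
emit_register_lines() {
    jar_dir="$1"
    for jar in "$jar_dir"/*.jar; do
        # Skip the unexpanded glob when the directory holds no jars.
        [ -e "$jar" ] || continue
        printf 'REGISTER %s;\n' "$jar"
    done
}
```

The output could be pasted at the top of a pig script, or redirected to a file, e.g. `emit_register_lines /path/to/jars > register.pig` (path illustrative).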

Re: Reading Cassandra Data From Pig/Hadoop

2014-05-30 Thread James Schappet
To specify your cassandra cluster, you only need to define one node. In your profile or batch command, set and export these variables:
export PIG_HOME=
export PIG_INITIAL_ADDRESS=localhost
export PIG_RPC_PORT=9160
# the partitioner must match your cassandra partitioner
export PIG_PARTITIONER=or

Reading Cassandra Data From Pig/Hadoop

2014-05-30 Thread Alex McLintock
I am reasonably experienced with Hadoop and Pig but less so with Cassandra. I have been banging my head against the wall as all the documentation assumes I know something... I am using Apache's tarball of Cassandra 1.something and I see that there are some example pig scripts and a shell script to

Re: pig + hadoop

2011-04-20 Thread pob
onf >> directory and in my mapred-site.xml file found there, I set the three >> variables. >> > >> > I don't use environment variables when I run against a cluster. >> > >> > On Apr 19, 2011, at 9:54 PM, Jeffrey Wang wrote: >> > >> >>

Re: pig + hadoop

2011-04-20 Thread pob
wrote: > > > >> Did you set PIG_RPC_PORT in your hadoop-env.sh? I was seeing this error > for a while before I added that. > >> > >> -Jeffrey > >> > >> From: pob [mailto:peterob...@gmail.com] > >> Sent: Tuesday, April 19, 2011 6:42 PM

Re: pig + hadoop

2011-04-20 Thread pob
> > >> Did you set PIG_RPC_PORT in your hadoop-env.sh? I was seeing this error > for a while before I added that. > >> > >> -Jeffrey > >> > >> From: pob [mailto:peterob...@gmail.com] > >> Sent: Tuesday, April 19, 2011 6:42 PM > >> To

Re: pig + hadoop

2011-04-19 Thread Jeremy Hanna
On Apr 19, 2011, at 9:54 PM, Jeffrey Wang wrote: > >> Did you set PIG_RPC_PORT in your hadoop-env.sh? I was seeing this error for >> a while before I added that. >> >> -Jeffrey >> >> From: pob [mailto:peterob...@gmail.com] >> Sent: Tuesday, Apr

Re: pig + hadoop

2011-04-19 Thread Jeremy Hanna
To: user@cassandra.apache.org > Subject: Re: pig + hadoop > > Hey Aaron, > > I read it, and all of 3 env variables was exported. The results are same. > > Best, > P > > 2011/4/20 aaron morton > Am guessing but here goes. Looks like the cassandra RPC

RE: pig + hadoop

2011-04-19 Thread Jeffrey Wang
Did you set PIG_RPC_PORT in your hadoop-env.sh? I was seeing this error for a while before I added that. -Jeffrey From: pob [mailto:peterob...@gmail.com] Sent: Tuesday, April 19, 2011 6:42 PM To: user@cassandra.apache.org Subject: Re: pig + hadoop Hey Aaron, I read it, and all of 3 env
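A minimal sketch of the hadoop-env.sh suggestion above, assuming the export simply needs to be present in that file. The helper name `ensure_pig_rpc_port` is hypothetical, and the file path and port value are whatever applies to your installation.

```shell
#!/bin/sh
# Append "export PIG_RPC_PORT=<port>" to an env file unless an export
# for PIG_RPC_PORT is already present. Arguments: file path, port.
ensure_pig_rpc_port() {
    env_file="$1"
    port="$2"
    if grep -q '^export PIG_RPC_PORT=' "$env_file" 2>/dev/null; then
        echo "PIG_RPC_PORT already exported in $env_file"
    else
        printf 'export PIG_RPC_PORT=%s\n' "$port" >> "$env_file"
        echo "added PIG_RPC_PORT=$port to $env_file"
    fi
}
```

For example, `ensure_pig_rpc_port conf/hadoop-env.sh 9160`, where both the relative path and the port are assumptions to adjust for your cluster.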

Re: pig + hadoop

2011-04-19 Thread pob
and one more thing...
2011-04-20 04:09:23,412 INFO org.apache.hadoop.mapred.TaskTracker: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_201104200406_0001/attempt_201104200406_0001_m_02_0/output/file.out in any of the configured local directories

Re: pig + hadoop

2011-04-19 Thread pob
That's from jobtracker:
2011-04-20 03:36:39,519 INFO org.apache.hadoop.mapred.JobInProgress: Choosing rack-local task task_201104200331_0002_m_00
2011-04-20 03:36:42,521 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201104200331_0002_m_00_3: java.lang.NumberFormatExcepti

Re: pig + hadoop

2011-04-19 Thread pob
Ad 2: it works with -x local, so there can't be an issue with pig->DB (Cassandra). I'm using pig-0.8 from the official site + hadoop-0.20.2 from the official site. Thanks. 2011/4/20 aaron morton > Am guessing but here goes. Looks like the cassandra RPC port is not set, > did you follow these steps in contrib/pi

Re: pig + hadoop

2011-04-19 Thread pob
Hey Aaron, I read it, and all 3 env variables were exported. The results are the same. Best, P 2011/4/20 aaron morton > Am guessing but here goes. Looks like the cassandra RPC port is not set, > did you follow these steps in contrib/pig/README.txt > > Finally, set the following as environment va

Re: pig + hadoop

2011-04-19 Thread aaron morton
I'm guessing, but here goes. It looks like the cassandra RPC port is not set; did you follow these steps in contrib/pig/README.txt?
Finally, set the following as environment variables (uppercase, underscored), or as Hadoop configuration variables (lowercase, dotted):
* PIG_RPC_PORT or cassandra.thrift.
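One way to rule out an unset or malformed variable before launching a job is a quick check like the one below. It is a sketch only: the function name `check_pig_env` is hypothetical, and the variable list is taken from the PIG_* names mentioned elsewhere in these threads, which may not match what your README version requires.

```shell
#!/bin/sh
# Report PIG_* variables that are unset or empty, and flag a
# PIG_RPC_PORT that is not purely numeric. Returns 0 if all look sane.
check_pig_env() {
    rc=0
    for name in PIG_INITIAL_ADDRESS PIG_RPC_PORT PIG_PARTITIONER; do
        # Indirect lookup of the variable named in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            rc=1
        fi
    done
    case "${PIG_RPC_PORT:-}" in
        '') ;;  # already reported as missing above
        *[!0-9]*)
            echo "PIG_RPC_PORT is not a number: $PIG_RPC_PORT"
            rc=1
            ;;
    esac
    return $rc
}
```

Note that this only checks the shell it runs in. It says nothing about whether the values reach the tasks that Hadoop launches on other nodes.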

pig + hadoop

2011-04-19 Thread pob
Hello, I configured the cluster following http://wiki.apache.org/cassandra/HadoopSupport. When I run pig example-script.pig -x local, everything is fine and I get correct results. The problem occurs with -x mapreduce. I'm getting these errors:
2011-04-20 01:24:21,791 [main] ERROR org.apache.pig.