There's a pig-with-cassandra script somewhere you should be using. It adds the jars, etc.
One issue is that you need to call register on the .jars from your pig scripts.

Honestly, someone should write an example pig setup with modern hadoop, all the right register commands, real UPDATE queries encoded, and explain the whole thing. It took me about 2 days to get working, and there are also gotchas in your pig scripts.

And the fact that the output from cql is not encoded in tuples but the input must be is insane and maddening and VERY VERY VERY prone to error.

On Fri, May 30, 2014 at 10:10 AM, James Schappet <jschap...@gmail.com> wrote:

> To specify your cassandra cluster, you only need to define one node:
>
> In your profile or batch command, set and export these variables:
>
> export PIG_HOME=<PATH TO PIG INSTALL>
>
> export PIG_INITIAL_ADDRESS=localhost
>
> export PIG_RPC_PORT=9160
>
> # the partitioner must match your cassandra partitioner
> export PIG_PARTITIONER=org.apache.cassandra.dht.Murmur3Partitioner
>
> http://www.schappet.com/pig_cassandra_bulk_load/
>
> -Jimmy
>
> On May 30, 2014, at 11:50 AM, Alex McLintock <a...@owal.co.uk> wrote:
>
> I am reasonably experienced with Hadoop and Pig but less so with
> Cassandra. I have been banging my head against the wall as all the
> documentation assumes I know something...
>
> I am using Apache's tarball of Cassandra 1.something and I see that there
> are some example pig scripts and a shell script to run them with the
> cassandra jars.
>
> What I don't understand is how you tell the pig script which machine the
> cassandra cluster talks to. You only specify the keyspace, right - which
> roughly corresponds to the database/table, but not which cluster.
>
> Can you tell what I have missed? Do the hadoop nodes HAVE to be on the
> same machines as the Cassandra nodes?
>
> I am using CQL storage, I think.
>
> eg
>
> -- CqlStorage
> libdata = LOAD 'cql://libdata/libout' USING CqlStorage();
>
> book_by_mail = FILTER libdata BY C_OUT_TY == 'BM';
>
> etc etc
>
> Thanks all...
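For anyone following along, here is a rough sketch of the environment setup Jimmy describes, plus a small sanity check to run before launching pig. The variable names come from his message; the /opt/pig path and the check_pig_env helper are placeholders made up for illustration, so adjust them to your install.

```shell
# Assumed values: /opt/pig is a hypothetical install path, and localhost
# stands in for any one reachable node of your cassandra cluster.
export PIG_HOME=/opt/pig
export PIG_INITIAL_ADDRESS=localhost
export PIG_RPC_PORT=9160
# The partitioner must match the one your cassandra cluster uses.
export PIG_PARTITIONER=org.apache.cassandra.dht.Murmur3Partitioner

# Hypothetical helper: report any of the four variables that is unset or
# empty. Returns 0 when all are set and 1 otherwise.
check_pig_env() {
    missing=0
    for name in PIG_HOME PIG_INITIAL_ADDRESS PIG_RPC_PORT PIG_PARTITIONER; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_pig_env right before starting pig means a forgotten export shows up immediately rather than as a connection error halfway through a job. It only checks that the variables are set, not that the node is reachable or that the partitioner actually matches the cluster.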
--
Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
Skype: *burtonator*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile <https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>
War is peace. Freedom is slavery. Ignorance is strength. Corporations are people.