That error is coming from the frontend: besides being registered in the script, the jars must also be on the local classpath of the machine that launches Pig. Take a look at how contrib/pig/bin/pig_cassandra sets up $PIG_CLASSPATH.
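A minimal sketch of the idea, not the actual contents of pig_cassandra: the jar paths are the placeholders from the script quoted in the original message, and the variable names THRIFT_JAR and LOADFUNC_JAR are made up for illustration.

```shell
# Hypothetical jar locations; replace them with the real paths on your machine.
THRIFT_JAR=/path/to/libthrift-r917130.jar
LOADFUNC_JAR=/path/to/cassandra_loadfunc.jar

# Put both jars on the local (frontend) classpath, keeping any
# PIG_CLASSPATH value that is already set by appending it at the end.
PIG_CLASSPATH="$THRIFT_JAR:$LOADFUNC_JAR${PIG_CLASSPATH:+:$PIG_CLASSPATH}"
export PIG_CLASSPATH

# Show the result so it can be checked before starting Pig.
echo "PIG_CLASSPATH=$PIG_CLASSPATH"
```

Start pig from the same shell afterwards so that it inherits the exported variable.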
-----Original Message-----
From: "Christian Decker" <>
Sent: Friday, August 13, 2010 11:30am
To:
Subject: Cassandra and Pig

Hi all,

I'm trying to get Pig to read data from a Cassandra cluster, which I thought would be trivial since Cassandra already provides me with the CassandraStorage class. The problem is that once I try executing a simple script like this:

    register /path/to/pig-0.7.0-core.jar;
    register /path/to/libthrift-r917130.jar;
    register /path/to/cassandra_loadfunc.jar;
    rows = LOAD 'cassandra://Keyspace1/Standard1' USING org.apache.cassandra.hadoop.pig.CassandraStorage();
    cols = FOREACH rows GENERATE flatten($1);
    colnames = FOREACH cols GENERATE $0;
    namegroups = GROUP colnames BY $0;
    namecounts = FOREACH namegroups GENERATE COUNT($1), group;
    orderednames = ORDER namecounts BY $0;
    topnames = LIMIT orderednames 50;
    dump topnames;

I just end up with a NoClassDefFoundError:

    ERROR - org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias topnames
        at org.apache.pig.PigServer.openIterator(
        at
        at
        at
        at
        at
        at org.apache.pig.Main.main(
    Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias topnames
        at
        at org.apache.pig.PigServer.openIterator(
        ... 6 more
    Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2117: Unexpected error when launching map reduce job.
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(
        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(
        at org.apache.pig.PigServer.executeCompiledLogicalPlan(
        at
        ... 7 more
    Caused by: java.lang.RuntimeException: Could not resolve error that occured when launching map reduce job: java.lang.NoClassDefFoundError: org/apache/thrift/TBase
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$JobControlThreadExceptionHandler.uncaughtException(
        at java.lang.Thread.dispatchUncaughtException(

I cannot think of a reason why. As far as I understand it, Pig takes the jar files named in the script, unpacks them, creates the execution plan for the script itself, bundles everything into a single jar again, and then submits that jar to HDFS, from where it is executed in Hadoop, right? I also checked that the class in question actually is in the libthrift jar, so what's going wrong?

Regards,
Chris