Hmm, did you build the Cassandra source with ant in the root of your Cassandra directory? It sounds like it can't find that Cassandra class. That build step is required.
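
One way to double-check, as a rough sketch: the loop you ran only looks at jars sitting directly in one lib directory, so it would miss loose .class files in ant's build output and any jar that lives elsewhere in the tree. The `find_class` helper below is a hypothetical name of my own, not something that ships with Cassandra or Pig, and it assumes the JDK's `jar` tool is on your PATH.

```shell
#!/bin/sh
# find_class DIR NAME (hypothetical helper): print every loose .class file
# and every jar under DIR that contains a class called NAME.
find_class() {
    dir=$1
    name=$2
    # Loose class files, for example in a build output directory.
    find "$dir" -type f -name "${name}.class" -print
    # Jars at any depth; 'jar tf' lists the entries without extracting them.
    find "$dir" -type f -name '*.jar' -print | while IFS= read -r jar_file; do
        if jar tf "$jar_file" 2>/dev/null | grep -Eq "(^|/)${name}\.class\$"; then
            printf '%s\n' "$jar_file"
        fi
    done
}
# Example call: find_class "$CASSANDRA_HOME" TypeParser
```

If that prints nothing for $CASSANDRA_HOME, the source build probably did not run or did not finish. If it prints a path, that location still has to be on the classpath Pig runs with.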

On Jun 20, 2011, at 2:05 PM, Sasha Dolgy wrote:

> Hi ... I still have the same problem with pig-0.8.0-cdh3u0...
>
> Maybe I'm doing something wrong. Where does
> org/apache/cassandra/db/marshal/TypeParser exist, or should exist?
>
> It's not in the $CASSANDRA_HOME/libs or
> /usr/local/src/pig-0.8.0-cdh3u0/lib or
> /usr/local/src/apache-cassandra-0.8.0-src/build/lib/jars
>
> for jar in `ls *.jar`
> do
>   jar -tf $jar | grep TypeParser
>   if [ $? -eq 0 ]; then
>     echo $jar
>   fi
> done
>
> Shows me nothing in all the lib dirs....
>
> On Mon, Jun 20, 2011 at 8:44 PM, Jeremy Hanna
> <jeremy.hanna1...@gmail.com> wrote:
>> Try running with cdh3u0 version of pig and see if it has the same problem.
>> They backported the patch (to pig 0.9 which should be out in time for the
>> hadoop summit next week) that adds the updated jackson dependency for avro.
>> The download URL for that is -
>> http://archive.cloudera.com/cdh/3/pig-0.8.0-cdh3u0.tar.gz
>>
>> Alternatively, I believe today brisk beta 2 will be out which has pig
>> integrated. Not sure if that would work for your current environment though.
>>
>> See if that works.
>> On Jun 20, 2011, at 1:09 PM, Sasha Dolgy wrote:
>>
>>> Been trying for the past little bit to try and get the PIG integration
>>> working with Cassandra 0.8.0
>>>
>>> 1. Downloaded the src for 0.8.0 and ran ant build
>>> 2. went into contrib/pig and ran ant ... gives me:
>>> /usr/local/src/apache-cassandra-0.8.0-src/contrib/pig/build/cassandra_storage.jar
>>> and is copied into the lib/ directory
>>> 3. Downloaded pig-0.8.1, modified the ivy/libraries.properties so
>>> that it uses Jackson 1.8.2 .. and ran ant. it compiles and gives me
>>> two jars: pig-0.8.1-SNAPSHOT-core.jar and pig-0.8.1-SNAPSHOT.jar
>>> ----- I did try to run it with Jackson 1.4 as the
>>> contrib/pig/README.txt suggested, but that failed...
>>> The referenced JIRA ticket (PIG-1863) suggests 1.6.0 (still produces the
>>> same results)
>>>
>>> Environment variables are set:
>>> java version "1.6.0_24"
>>>
>>> PIG_INITIAL_ADDRESS=localhost
>>> PIG_HOME=/usr/local/src/pig-0.8.1
>>> PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner
>>> PIG_RPC_PORT=9160
>>> CASSANDRA_HOME=/usr/local/src/apache-cassandra-0.8.0-src
>>>
>>> I then start up cassandra ... no issues. I connect and create a new
>>> keyspace called foo with a column family called bar and a CF called
>>> foo...Inside the CF bar, I create a few rows, with random columns ....
>>> 4 Rows.
>>>
>>> From contrib/pig I run: bin/pig_cassandra -x local ... immediately
>>> get the error:
>>>
>>> [: 45: /usr/local/src/pig-0.8.1/pig-0.8.1-core.jar: unexpected operator
>>>
>>> -- this is a reference to this line: if [ ! -e $PIG_JAR ]; then
>>>
>>> *** Problem here is that $PIG_JAR is a reference to two files ...
>>> pig-0.8.1-core.jar & pig.jar ...
>>>
>>> Changing line 44 to PIG_JAR=$PIG_HOME/pig*core*.jar fixes this ... (or
>>> even referencing $PIG_HOME/build/pig*core*.jar or just pig.jar
>>>
>>> Try again to run: bin/pig_cassandra -x local and everything loads up
>>> nicely:
>>>
>>> 2011-06-21 02:07:23,671 [main] INFO org.apache.pig.Main - Logging error messages to: /usr/local/src/apache-cassandra-0.8.0-src/contrib/pig/pig_1308593243668.log
>>> 2011-06-21 02:07:23,778 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
>>> grunt> register /usr/local/src/pig-0.8.1/pig-0.8.1-core.jar; register
>>> /usr/local/src/pig-0.8.1/pig.jar; register
>>> /usr/local/src/apache-cassandra-0.8.0-src/lib/avro-1.4.0-fixes.jar;
>>> register
>>> /usr/local/src/apache-cassandra-0.8.0-src/lib/avro-1.4.0-sources-fixes.jar;
>>> register /usr/local/src/apache-cassandra-0.8.0-src/lib/libthrift-0.6.jar;
>>> grunt>
>>> grunt> rows = LOAD 'cassandra://foo/bar' USING CassandraStorage();
>>> grunt> STORE rows into 'cassandra://foo/foo' USING CassandraStorage();
>>> 2011-06-21 02:04:53,271 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
>>> 2011-06-21 02:04:53,271 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
>>> 2011-06-21 02:04:53,324 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
>>> 2011-06-21 02:04:53,447 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: rows: Store(cassandra://foo/foo:CassandraStorage) - scope-1 Operator Key: scope-1)
>>> 2011-06-21 02:04:53,458 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
>>> 2011-06-21 02:04:53,477 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
>>> 2011-06-21 02:04:53,477 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
>>> 2011-06-21 02:04:53,480 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:04:53,494 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:04:53,494 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
>>> 2011-06-21 02:04:53,556 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
>>> 2011-06-21 02:04:59,700 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
>>> 2011-06-21 02:04:59,718 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:04:59,719 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
>>> 2011-06-21 02:04:59,948 [Thread-5] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:04:59,960 [Thread-5] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:04:59,980 [Thread-5] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
>>> 2011-06-21 02:05:00,220 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
>>> 2011-06-21 02:05:00,322 [Thread-14] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:05:00,340 [Thread-14] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
>>> 2011-06-21 02:05:00,372 [Thread-14] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:05:00,374 [Thread-14] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:05:00,378 [Thread-14] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:05:00,381 [Thread-14] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:05:00,491 [Thread-14] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
>>> java.lang.NoClassDefFoundError: org/apache/cassandra/db/marshal/TypeParser
>>>     at org.apache.cassandra.hadoop.pig.CassandraStorage.getDefaultMarshallers(Unknown Source)
>>>     at org.apache.cassandra.hadoop.pig.CassandraStorage.columnToTuple(Unknown Source)
>>>     at org.apache.cassandra.hadoop.pig.CassandraStorage.getNext(Unknown Source)
>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:187)
>>>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
>>>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>>> Caused by: java.lang.ClassNotFoundException: org.apache.cassandra.db.marshal.TypeParser
>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>>>     ... 10 more
>>> 2011-06-21 02:05:00,818 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local_0001
>>> 2011-06-21 02:05:05,408 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local_0001 has failed! Stop running all dependent jobs
>>> 2011-06-21 02:05:05,411 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
>>> 2011-06-21 02:05:05,412 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
>>> 2011-06-21 02:05:05,412 [main] INFO org.apache.pig.tools.pigstats.PigStats - Detected Local mode. Stats reported below may be incomplete
>>> 2011-06-21 02:05:05,413 [main] INFO org.apache.pig.tools.pigstats.PigStats - Script Statistics:
>>>
>>> HadoopVersion  PigVersion  UserId  StartedAt            FinishedAt           Features
>>> 0.20.2         0.8.1       root    2011-06-21 02:04:53  2011-06-21 02:05:05  UNKNOWN
>>>
>>> Failed!
>>>
>>> Failed Jobs:
>>> JobId           Alias  Feature   Message               Outputs
>>> job_local_0001  rows   MAP_ONLY  Message: Job failed!  cassandra://foo/foo,
>>>
>>> Input(s):
>>> Failed to read data from "cassandra://foo/bar"
>>>
>>> Output(s):
>>> Failed to produce result in "cassandra://foo/foo"
>>>
>>> Job DAG:
>>> job_local_0001
>>>
>>> 2011-06-21 02:05:05,413 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
>>> 2011-06-21 02:05:05,416 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>> grunt>
>>>
>>> Any help or insight is appreciated ....
>>
>
> --
> Sasha Dolgy
> sasha.do...@gmail.com
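
On the "unexpected operator" error from bin/pig_cassandra quoted above: when an unquoted variable holds a glob that matches two files, `[ ! -e $PIG_JAR ]` receives two file names instead of one, which is consistent with the diagnosis in the original message. A minimal sketch of a more tolerant lookup follows. `pick_jar` is a hypothetical helper, not part of the shipped script, and the `pig*core*.jar` pattern is simply the one suggested in the thread.

```shell
#!/bin/sh
# pick_jar CANDIDATE... (hypothetical helper): print the first argument that
# exists as a file. The caller leaves the glob unquoted so the shell expands
# it into separate arguments before the call.
pick_jar() {
    for candidate in "$@"; do
        if [ -e "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    # Nothing matched (an unmatched glob stays literal and fails the -e test).
    return 1
}
# Example call:
#   PIG_JAR=$(pick_jar "$PIG_HOME"/pig*core*.jar "$PIG_HOME"/pig*.jar) || echo "no pig jar found" >&2
```

Because the result is always a single path, a later `[ ! -e "$PIG_JAR" ]` check sees exactly one operand, however many jars the pattern matches.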