Hi ... I still have the same problem with pig-0.8.0-cdh3u0 ... Maybe I'm doing something wrong. Where does org/apache/cassandra/db/marshal/TypeParser exist, or where should it exist?
It's not in the $CASSANDRA_HOME/libs or /usr/local/src/pig-0.8.0-cdh3u0/lib or /usr/local/src/apache-cassandra-0.8.0-src/build/lib/jars

for jar in `ls *.jar`
do
    jar -tf $jar | grep TypeParser
    if [ $? -eq 0 ]; then
        echo $jar
    fi
done

Shows me nothing in all the lib dirs....

On Mon, Jun 20, 2011 at 8:44 PM, Jeremy Hanna <jeremy.hanna1...@gmail.com> wrote:

> Try running with cdh3u0 version of pig and see if it has the same problem.
> They backported the patch (to pig 0.9 which should be out in time for the
> hadoop summit next week) that adds the updated jackson dependency for avro.
> The download URL for that is -
> http://archive.cloudera.com/cdh/3/pig-0.8.0-cdh3u0.tar.gz
>
> Alternatively, I believe today brisk beta 2 will be out which has pig
> integrated. Not sure if that would work for your current environment though.
>
> See if that works.
> On Jun 20, 2011, at 1:09 PM, Sasha Dolgy wrote:
>
>> Been trying for the past little bit to try and get the PIG integration
>> working with Cassandra 0.8.0
>>
>> 1. Downloaded the src for 0.8.0 and ran ant build
>> 2. went into contrib/pig and ran ant ... gives me:
>> /usr/local/src/apache-cassandra-0.8.0-src/contrib/pig/build/cassandra_storage.jar
>> and is copied into the lib/ directory
>> 3. Downloaded pig-0.8.1, modified the ivy/libraries.properties so
>> that it uses Jackson 1.8.2 .. and ran ant. it compiles and gives me
>> two jars: pig-0.8.1-SNAPSHOT-core.jar and pig-0.8.1-SNAPSHOT.jar
>> ----- I did try to run it with Jackson 1.4 as the
>> contrib/pig/README.txt suggested, but that failed... The referenced
>> JIRA ticket (PIG-1863) suggests 1.6.0 (still produces the same
>> results)
>>
>> Environment variables are set:
>> java version "1.6.0_24"
>>
>> PIG_INITIAL_ADDRESS=localhost
>> PIG_HOME=/usr/local/src/pig-0.8.1
>> PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner
>> PIG_RPC_PORT=9160
>> CASSANDRA_HOME=/usr/local/src/apache-cassandra-0.8.0-src
>>
>> I then start up cassandra ... no issues.
>> I connect and create a new
>> keyspace called foo with a column family called bar and a CF called
>> foo...Inside the CF bar, I create a few rows, with random columns ....
>> 4 Rows.
>>
>> From contrib/pig I run: bin/pig_cassandra -x local ... immediately
>> get the error:
>>
>> [: 45: /usr/local/src/pig-0.8.1/pig-0.8.1-core.jar: unexpected operator
>>
>> -- this is a reference to this line: if [ ! -e $PIG_JAR ]; then
>>
>> *** Problem here is that $PIG_JAR is a reference to two files ...
>> pig-0.8.1-core.jar & pig.jar ...
>>
>> Changing line 44 to PIG_JAR=$PIG_HOME/pig*core*.jar fixes this ... (or
>> even referencing $PIG_HOME/build/pig*core*.jar or just pig.jar
>>
>> Try again to run: bin/pig_cassandra -x local and everything loads up nicely:
>>
>> 2011-06-21 02:07:23,671 [main] INFO org.apache.pig.Main - Logging
>> error messages to:
>> /usr/local/src/apache-cassandra-0.8.0-src/contrib/pig/pig_1308593243668.log
>> 2011-06-21 02:07:23,778 [main] INFO
>> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
>> Connecting to hadoop file system at: file:///
>> grunt> register /usr/local/src/pig-0.8.1/pig-0.8.1-core.jar; register
>> /usr/local/src/pig-0.8.1/pig.jar; register
>> /usr/local/src/apache-cassandra-0.8.0-src/lib/avro-1.4.0-fixes.jar;
>> register
>> /usr/local/src/apache-cassandra-0.8.0-src/lib/avro-1.4.0-sources-fixes.jar;
>> register /usr/local/src/apache-cassandra-0.8.0-src/lib/libthrift-0.6.jar;
>> grunt>
>> grunt> rows = LOAD 'cassandra://foo/bar' USING CassandraStorage();
>> grunt> STORE rows into 'cassandra://foo/foo' USING CassandraStorage();
>> 2011-06-21 02:04:53,271 [main] INFO
>> org.apache.pig.tools.pigstats.ScriptState - Pig features used in the
>> script: UNKNOWN
>> 2011-06-21 02:04:53,271 [main] INFO
>> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
>> pig.usenewlogicalplan is set to true. New logical plan will be used.
>> 2011-06-21 02:04:53,324 [main] INFO
>> org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics
>> with processName=JobTracker, sessionId=
>> 2011-06-21 02:04:53,447 [main] INFO
>> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
>> (Name: rows: Store(cassandra://foo/foo:CassandraStorage) - scope-1
>> Operator Key: scope-1)
>> 2011-06-21 02:04:53,458 [main] INFO
>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler
>> - File concatenation threshold: 100 optimistic? false
>> 2011-06-21 02:04:53,477 [main] INFO
>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>> - MR plan size before optimization: 1
>> 2011-06-21 02:04:53,477 [main] INFO
>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>> - MR plan size after optimization: 1
>> 2011-06-21 02:04:53,480 [main] INFO
>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>> Metrics with processName=JobTracker, sessionId= - already initialized
>> 2011-06-21 02:04:53,494 [main] INFO
>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>> Metrics with processName=JobTracker, sessionId= - already initialized
>> 2011-06-21 02:04:53,494 [main] INFO
>> org.apache.pig.tools.pigstats.ScriptState - Pig script settings are
>> added to the job
>> 2011-06-21 02:04:53,556 [main] INFO
>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
>> - mapred.job.reduce.markreset.buffer.percent is not set, set to
>> default 0.3
>> 2011-06-21 02:04:59,700 [main] INFO
>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
>> - Setting up single store job
>> 2011-06-21 02:04:59,718 [main] INFO
>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>> Metrics with processName=JobTracker, sessionId= - already initialized
>> 2011-06-21 02:04:59,719 [main] INFO
>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>> - 1 map-reduce job(s) waiting for submission.
>> 2011-06-21 02:04:59,948 [Thread-5] INFO
>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>> Metrics with processName=JobTracker, sessionId= - already initialized
>> 2011-06-21 02:04:59,960 [Thread-5] INFO
>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>> Metrics with processName=JobTracker, sessionId= - already initialized
>> 2011-06-21 02:04:59,980 [Thread-5] INFO
>> org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total
>> input paths (combined) to process : 1
>> 2011-06-21 02:05:00,220 [main] INFO
>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>> - 0% complete
>> 2011-06-21 02:05:00,322 [Thread-14] INFO
>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>> Metrics with processName=JobTracker, sessionId= - already initialized
>> 2011-06-21 02:05:00,340 [Thread-14] INFO
>> org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total
>> input paths (combined) to process : 1
>> 2011-06-21 02:05:00,372 [Thread-14] INFO
>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>> Metrics with processName=JobTracker, sessionId= - already initialized
>> 2011-06-21 02:05:00,374 [Thread-14] INFO
>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>> Metrics with processName=JobTracker, sessionId= - already initialized
>> 2011-06-21 02:05:00,378 [Thread-14] INFO
>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>> Metrics with processName=JobTracker, sessionId= - already initialized
>> 2011-06-21 02:05:00,381 [Thread-14] INFO
>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>> Metrics with processName=JobTracker, sessionId= - already initialized
>> 2011-06-21 02:05:00,491 [Thread-14] WARN
>> org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
>> java.lang.NoClassDefFoundError: org/apache/cassandra/db/marshal/TypeParser
>> at
>> org.apache.cassandra.hadoop.pig.CassandraStorage.getDefaultMarshallers(Unknown
>> Source)
>> at
>> org.apache.cassandra.hadoop.pig.CassandraStorage.columnToTuple(Unknown
>> Source)
>> at org.apache.cassandra.hadoop.pig.CassandraStorage.getNext(Unknown
>> Source)
>> at
>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:187)
>> at
>> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
>> at
>> org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>> at
>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>> Caused by: java.lang.ClassNotFoundException:
>> org.apache.cassandra.db.marshal.TypeParser
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>> ... 10 more
>> 2011-06-21 02:05:00,818 [main] INFO
>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>> - HadoopJobId: job_local_0001
>> 2011-06-21 02:05:05,408 [main] INFO
>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>> - job job_local_0001 has failed!
>> Stop running all dependent jobs
>> 2011-06-21 02:05:05,411 [main] INFO
>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>> - 100% complete
>> 2011-06-21 02:05:05,412 [main] ERROR
>> org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s)
>> failed!
>> 2011-06-21 02:05:05,412 [main] INFO
>> org.apache.pig.tools.pigstats.PigStats - Detected Local mode. Stats
>> reported below may be incomplete
>> 2011-06-21 02:05:05,413 [main] INFO
>> org.apache.pig.tools.pigstats.PigStats - Script Statistics:
>>
>> HadoopVersion PigVersion UserId StartedAt FinishedAt
>> Features
>> 0.20.2 0.8.1 root 2011-06-21 02:04:53 2011-06-21 02:05:05
>> UNKNOWN
>>
>> Failed!
>>
>> Failed Jobs:
>> JobId Alias Feature Message Outputs
>> job_local_0001 rows MAP_ONLY Message: Job failed!
>> cassandra://foo/foo,
>>
>> Input(s):
>> Failed to read data from "cassandra://foo/bar"
>>
>> Output(s):
>> Failed to produce result in "cassandra://foo/foo"
>>
>> Job DAG:
>> job_local_0001
>>
>>
>> 2011-06-21 02:05:05,413 [main] INFO
>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>> - Failed!
>> 2011-06-21 02:05:05,416 [main] INFO
>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>> Metrics with processName=JobTracker, sessionId= - already initialized
>> grunt>
>>
>>
>> Any help or insight is appreciated ....
>
>

--
Sasha Dolgy
sasha.do...@gmail.com
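
For reference, the "[: 45: ... unexpected operator" error quoted above appears when the PIG_JAR glob matches more than one jar, so the unquoted test `[ ! -e $PIG_JAR ]` receives several words. The following is only a minimal sketch of one way to guard against that, not the actual contents of bin/pig_cassandra. The helper name first_existing and the default PIG_HOME are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of pig_cassandra): print the first argument
# that names an existing file, or return non-zero if none does.
first_existing() {
    for candidate in "$@"; do
        if [ -e "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Assumed default, taken from the environment listed in the quoted message.
PIG_HOME=${PIG_HOME:-/usr/local/src/pig-0.8.1}

# The glob may expand to several jars; only one path is kept, and it is
# always quoted when tested or used afterwards.
if PIG_JAR=$(first_existing "$PIG_HOME"/pig*core*.jar "$PIG_HOME"/pig.jar); then
    echo "Using pig jar: $PIG_JAR"
else
    echo "No pig jar found under $PIG_HOME" >&2
fi
```

Because PIG_JAR then holds a single path, a later `[ ! -e "$PIG_JAR" ]` sees exactly one operand regardless of how many jars sit in $PIG_HOME.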