Re: pig integration & NoClassDefFoundError TypeParser

Jeremy Hanna Mon, 20 Jun 2011 12:12:00 -0700

Hmm, did you build the Cassandra source with ant in the root of your Cassandra 
directory?  That's required.  It sounds like it can't find that Cassandra class.
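
If it helps, here is a rough sketch for checking where the class ended up 
after the build.  It looks for loose .class files as well as jar entries, 
since I'm assuming the ant output may sit somewhere under the source tree 
rather than in the lib dirs.  find_class is just a made-up helper name, not 
something that ships with Cassandra or Pig.

```shell
#!/bin/sh
# Sketch: look for a class both as a loose .class file and inside jars.
# find_class is a hypothetical helper name, not part of Cassandra or Pig.
find_class() {
    dir=$1    # directory tree to search
    name=$2   # simple class name, e.g. TypeParser

    # Loose class files, e.g. under a build output directory.
    find "$dir" -type f -name "$name.class" 2>/dev/null

    # Class files packed inside jars (needs the JDK's jar tool on PATH).
    if command -v jar >/dev/null 2>&1; then
        find "$dir" -type f -name '*.jar' 2>/dev/null |
        while IFS= read -r j; do
            if jar -tf "$j" 2>/dev/null | grep -q "/$name\.class\$"; then
                echo "$j"
            fi
        done
    fi
}

# Example: search the whole source tree, not only the lib directories.
# find_class "$CASSANDRA_HOME" TypeParser
```

If that prints nothing for the whole tree, the build most likely did not 
produce the class at all; if it prints a path, that path still has to be on 
the classpath pig_cassandra uses.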


On Jun 20, 2011, at 2:05 PM, Sasha Dolgy wrote:

> Hi ... I still have the same problem with pig-0.8.0-cdh3u0...
> 
> Maybe I'm doing something wrong.  Where does
> org/apache/cassandra/db/marshal/TypeParser exist, or should exist?
> 
> It's not in the $CASSANDRA_HOME/libs or
> /usr/local/src/pig-0.8.0-cdh3u0/lib or
> /usr/local/src/apache-cassandra-0.8.0-src/build/lib/jars
> 
> 
> for jar in *.jar
>  do
>  # print the jar name if it contains a TypeParser entry
>  if jar -tf "$jar" | grep -q TypeParser; then
>     echo "$jar"
>  fi
>  done
> 
> Shows me nothing in all the lib dirs....
> 
> 
> 
> On Mon, Jun 20, 2011 at 8:44 PM, Jeremy Hanna
> <jeremy.hanna1...@gmail.com> wrote:
>> Try running with the cdh3u0 version of pig and see if it has the same 
>> problem.  They backported the patch (committed to pig 0.9, which should be 
>> out in time for the hadoop summit next week) that adds the updated jackson 
>> dependency for avro.  The download URL for that is:
>> http://archive.cloudera.com/cdh/3/pig-0.8.0-cdh3u0.tar.gz
>> 
>> Alternatively, I believe brisk beta 2, which has pig integrated, will be 
>> out today.  Not sure if that would work for your current environment though.
>> 
>> See if that works.
>> On Jun 20, 2011, at 1:09 PM, Sasha Dolgy wrote:
>> 
>>> I've been trying for the past little while to get the Pig integration
>>> working with Cassandra 0.8.0
>>> 
>>> 1.  Downloaded the src for 0.8.0 and ran ant build
>>> 2.  went into contrib/pig and ran ant ... gives me:
>>> /usr/local/src/apache-cassandra-0.8.0-src/contrib/pig/build/cassandra_storage.jar
>>> and is copied into the lib/ directory
>>> 3.  Downloaded pig-0.8.1, modified the ivy/libraries.properties so
>>> that it uses Jackson 1.8.2, and ran ant.  It compiles and gives me
>>> two jars:  pig-0.8.1-SNAPSHOT-core.jar and pig-0.8.1-SNAPSHOT.jar
>>> ----- I did try to run it with Jackson 1.4 as the
>>> contrib/pig/README.txt suggested, but that failed.  The referenced
>>> JIRA ticket (PIG-1863) suggests 1.6.0 (still produces the same
>>> results)
>>> 
>>> Environment variables are set:
>>> java version "1.6.0_24"
>>> 
>>> PIG_INITIAL_ADDRESS=localhost
>>> PIG_HOME=/usr/local/src/pig-0.8.1
>>> PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner
>>> PIG_RPC_PORT=9160
>>> CASSANDRA_HOME=/usr/local/src/apache-cassandra-0.8.0-src
>>> 
>>> I then start up cassandra ... no issues.  I connect and create a new
>>> keyspace called foo with a column family called bar and a CF called
>>> foo...Inside the CF bar, I create a few rows, with random columns ....
>>> 4 Rows.
>>> 
>>> From contrib/pig I run:  bin/pig_cassandra -x local ... immediately
>>> get the error:
>>> 
>>> [: 45: /usr/local/src/pig-0.8.1/pig-0.8.1-core.jar: unexpected operator
>>> 
>>> -- this is a reference to this line:  if [ ! -e $PIG_JAR ]; then
>>> 
>>> *** Problem here is that $PIG_JAR is a reference to two files ...
>>> pig-0.8.1-core.jar & pig.jar ...
>>> 
>>> Changing line 44 to PIG_JAR=$PIG_HOME/pig*core*.jar fixes this ... (or
>>> even referencing $PIG_HOME/build/pig*core*.jar or just pig.jar)
>>> 
>>> Try again to run:  bin/pig_cassandra -x local and everything loads up 
>>> nicely:
>>> 
>>> 2011-06-21 02:07:23,671 [main] INFO  org.apache.pig.Main - Logging
>>> error messages to:
>>> /usr/local/src/apache-cassandra-0.8.0-src/contrib/pig/pig_1308593243668.log
>>> 2011-06-21 02:07:23,778 [main] INFO
>>> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
>>> Connecting to hadoop file system at: file:///
>>> grunt> register /usr/local/src/pig-0.8.1/pig-0.8.1-core.jar; register
>>> /usr/local/src/pig-0.8.1/pig.jar; register
>>> /usr/local/src/apache-cassandra-0.8.0-src/lib/avro-1.4.0-fixes.jar;
>>> register 
>>> /usr/local/src/apache-cassandra-0.8.0-src/lib/avro-1.4.0-sources-fixes.jar;
>>> register /usr/local/src/apache-cassandra-0.8.0-src/lib/libthrift-0.6.jar;
>>> grunt>
>>> grunt> rows = LOAD 'cassandra://foo/bar' USING CassandraStorage();
>>> grunt> STORE rows into 'cassandra://foo/foo' USING CassandraStorage();
>>> 2011-06-21 02:04:53,271 [main] INFO
>>> org.apache.pig.tools.pigstats.ScriptState - Pig features used in the
>>> script: UNKNOWN
>>> 2011-06-21 02:04:53,271 [main] INFO
>>> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
>>> pig.usenewlogicalplan is set to true. New logical plan will be used.
>>> 2011-06-21 02:04:53,324 [main] INFO
>>> org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics
>>> with processName=JobTracker, sessionId=
>>> 2011-06-21 02:04:53,447 [main] INFO
>>> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
>>> (Name: rows: Store(cassandra://foo/foo:CassandraStorage) - scope-1
>>> Operator Key: scope-1)
>>> 2011-06-21 02:04:53,458 [main] INFO
>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler
>>> - File concatenation threshold: 100 optimistic? false
>>> 2011-06-21 02:04:53,477 [main] INFO
>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>>> - MR plan size before optimization: 1
>>> 2011-06-21 02:04:53,477 [main] INFO
>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>>> - MR plan size after optimization: 1
>>> 2011-06-21 02:04:53,480 [main] INFO
>>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>>> Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:04:53,494 [main] INFO
>>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>>> Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:04:53,494 [main] INFO
>>> org.apache.pig.tools.pigstats.ScriptState - Pig script settings are
>>> added to the job
>>> 2011-06-21 02:04:53,556 [main] INFO
>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
>>> - mapred.job.reduce.markreset.buffer.percent is not set, set to
>>> default 0.3
>>> 2011-06-21 02:04:59,700 [main] INFO
>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
>>> - Setting up single store job
>>> 2011-06-21 02:04:59,718 [main] INFO
>>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>>> Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:04:59,719 [main] INFO
>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>> - 1 map-reduce job(s) waiting for submission.
>>> 2011-06-21 02:04:59,948 [Thread-5] INFO
>>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>>> Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:04:59,960 [Thread-5] INFO
>>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>>> Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:04:59,980 [Thread-5] INFO
>>> org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total
>>> input paths (combined) to process : 1
>>> 2011-06-21 02:05:00,220 [main] INFO
>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>> - 0% complete
>>> 2011-06-21 02:05:00,322 [Thread-14] INFO
>>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>>> Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:05:00,340 [Thread-14] INFO
>>> org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total
>>> input paths (combined) to process : 1
>>> 2011-06-21 02:05:00,372 [Thread-14] INFO
>>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>>> Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:05:00,374 [Thread-14] INFO
>>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>>> Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:05:00,378 [Thread-14] INFO
>>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>>> Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:05:00,381 [Thread-14] INFO
>>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>>> Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2011-06-21 02:05:00,491 [Thread-14] WARN
>>> org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
>>> java.lang.NoClassDefFoundError: org/apache/cassandra/db/marshal/TypeParser
>>>        at 
>>> org.apache.cassandra.hadoop.pig.CassandraStorage.getDefaultMarshallers(Unknown
>>> Source)
>>>        at 
>>> org.apache.cassandra.hadoop.pig.CassandraStorage.columnToTuple(Unknown
>>> Source)
>>>        at org.apache.cassandra.hadoop.pig.CassandraStorage.getNext(Unknown
>>> Source)
>>>        at 
>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:187)
>>>        at 
>>> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
>>>        at 
>>> org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>>>        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>>>        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>>>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>>        at 
>>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>>> Caused by: java.lang.ClassNotFoundException:
>>> org.apache.cassandra.db.marshal.TypeParser
>>>        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>>        at java.security.AccessController.doPrivileged(Native Method)
>>>        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>>        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>>>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>>>        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>>>        ... 10 more
>>> 2011-06-21 02:05:00,818 [main] INFO
>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>> - HadoopJobId: job_local_0001
>>> 2011-06-21 02:05:05,408 [main] INFO
>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>> - job job_local_0001 has failed! Stop running all dependent jobs
>>> 2011-06-21 02:05:05,411 [main] INFO
>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>> - 100% complete
>>> 2011-06-21 02:05:05,412 [main] ERROR
>>> org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s)
>>> failed!
>>> 2011-06-21 02:05:05,412 [main] INFO
>>> org.apache.pig.tools.pigstats.PigStats - Detected Local mode. Stats
>>> reported below may be incomplete
>>> 2011-06-21 02:05:05,413 [main] INFO
>>> org.apache.pig.tools.pigstats.PigStats - Script Statistics:
>>> 
>>> HadoopVersion   PigVersion      UserId  StartedAt       FinishedAt      
>>> Features
>>> 0.20.2  0.8.1   root    2011-06-21 02:04:53     2011-06-21 02:05:05     
>>> UNKNOWN
>>> 
>>> Failed!
>>> 
>>> Failed Jobs:
>>> JobId   Alias   Feature Message Outputs
>>> job_local_0001  rows    MAP_ONLY        Message: Job failed!
>>> cassandra://foo/foo,
>>> 
>>> Input(s):
>>> Failed to read data from "cassandra://foo/bar"
>>> 
>>> Output(s):
>>> Failed to produce result in "cassandra://foo/foo"
>>> 
>>> Job DAG:
>>> job_local_0001
>>> 
>>> 
>>> 2011-06-21 02:05:05,413 [main] INFO
>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>> - Failed!
>>> 2011-06-21 02:05:05,416 [main] INFO
>>> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
>>> Metrics with processName=JobTracker, sessionId= - already initialized
>>> grunt>
>>> 
>>> 
>>> Any help or insight is appreciated ....
>> 
>> 
> 
> 
> 
> -- 
> Sasha Dolgy
> sasha.do...@gmail.com
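
On the "unexpected operator" error quoted above: when an unquoted variable 
holds a glob pattern that matches two files, [ ! -e $PIG_JAR ] receives two 
paths and the test fails to parse.  Narrowing the pattern works, as Sasha 
found.  Another option, as a rough sketch only, is to resolve the pattern to 
a single existing file first.  first_match is a made-up helper name, and it 
assumes the paths contain no spaces.

```shell
#!/bin/sh
# Sketch: pick exactly one file from a glob instead of testing the raw pattern.
# first_match is a hypothetical helper, not part of the pig_cassandra script.
first_match() {
    # $1 is left unquoted on purpose so the shell expands the pattern here.
    for f in $1; do
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Possible use in a launcher script (PIG_HOME as set in the thread):
# PIG_JAR=$(first_match "$PIG_HOME/pig*core*.jar") || echo "no pig jar found" >&2
# if [ ! -e "$PIG_JAR" ]; then echo "missing: $PIG_JAR" >&2; fi
```

Quoting "$PIG_JAR" in the test keeps it a single word whatever the pattern 
matched.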
