I am using Pig with Cassandra (Cassandra 2.1.2, Pig 0.14, Hadoop 2.6.0 combo).
When I use CqlStorage() I get

org.apache.pig.backend.executionengine.ExecException: ERROR 2118: org.apache.cassandra.exceptions.ConfigurationException: Unable to find inputformat class 'org.apache.cassandra.hadoop.cql3.CqlPagingInputFormat'

When I use CqlNativeStorage() I get

java.lang.NoSuchMethodError: com.google.common.collect.Sets.newConcurrentHashSet()Ljava/util/Set;

The Pig classpath looks like this (note that it includes guava-16.0.jar):

» echo $PIG_CLASSPATH
/home/naishe/apps/apache-cassandra-2.1.2/lib/airline-0.6.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/antlr-runtime-3.5.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/apache-cassandra-2.1.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/apache-cassandra-clientutil-2.1.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/apache-cassandra-thrift-2.1.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/commons-cli-1.1.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/commons-codec-1.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/commons-lang3-3.1.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/commons-math3-3.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/compress-lzf-0.8.4.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/concurrentlinkedhashmap-lru-1.4.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/disruptor-3.0.1.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/guava-16.0.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/high-scale-lib-1.0.6.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/jackson-core-asl-1.9.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/jackson-mapper-asl-1.9.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/jamm-0.2.8.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/javax.inject.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/jbcrypt-0.3m.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/jline-1.0.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/jna-4.0.0.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/json-simple-1.1.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/libthrift-0.9.1.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/logback-classic-1.1.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/logback-core-1.1.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/lz4-1.2.0.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/metrics-core-2.2.0.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/netty-all-4.0.23.Final.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/reporter-config-2.1.0.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/slf4j-api-1.7.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/snakeyaml-1.11.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/snappy-java-1.0.5.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/stream-2.5.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/stringtemplate-4.0.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/super-csv-2.1.0.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/thrift-server-0.3.7.jar::/home/naishe/.m2/repository/com/datastax/cassandra/cassandra-driver-core/2.1.2/cassandra-driver-core-2.1.2.jar:/home/naishe/.m2/repository/org/apache/cassandra/cassandra-all/2.1.2/cassandra-all-2.1.2.jar

I have read that this error is caused by a version conflict with the Guava library, so I tried using Guava 11.0.2 instead, but that did not help.
(http://stackoverflow.com/questions/27089126/nosuchmethoderror-sets-newconcurrenthashset-while-running-jar-using-hadoop#comment42687234_27089126)

Here is the Pig Latin that I was trying to execute:

grunt> alice = LOAD 'cql://hadoop_test/lines' USING CqlNativeStorage();
2015-01-22 09:28:54,133 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
grunt> B = foreach alice generate flatten(TOKENIZE((chararray)$0)) as word;
grunt> C = group B by word;
grunt> D = foreach C generate COUNT(B) as word_count, group as word;
grunt> dump D;
2015-01-22 09:29:06,808 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY
[ -- snip -- ]
2015-01-22 09:29:11,254 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2015-01-22 09:29:11,588 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - Starting flush of map output
2015-01-22 09:29:11,600 [Thread-22] INFO org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2015-01-22 09:29:11,620 [Thread-22] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local1857630817_0001
java.lang.Exception: java.lang.NoSuchMethodError: com.google.common.collect.Sets.newConcurrentHashSet()Ljava/util/Set;
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.NoSuchMethodError: com.google.common.collect.Sets.newConcurrentHashSet()Ljava/util/Set;
    at com.datastax.driver.core.Cluster$ConnectionReaper.<init>(Cluster.java:2065)
    at com.datastax.driver.core.Cluster$Manager.<init>(Cluster.java:1163)

However, when I replace the first statement of the above script with

alice = LOAD 'cql://hadoop_test/lines' USING CqlStorage();

I get

org.apache.pig.backend.executionengine.ExecException: ERROR 2118: org.apache.cassandra.exceptions.ConfigurationException: Unable to find inputformat class 'org.apache.cassandra.hadoop.cql3.CqlPagingInputFormat'

Thanks for reading this mail.
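
P.S. The NoSuchMethodError means that the com.google.common.collect.Sets class that was actually loaded has no newConcurrentHashSet() method, so it may help to see every jar that bundles that class. Below is a minimal sketch of such a check, not something taken from my setup: the function name jars_providing is made up for illustration, and it only inspects the classpath string it is given, so jars that Hadoop or Pig add on their own are not covered unless they are passed in too.

```python
import glob
import os
import zipfile

# Path of the Guava class named in the NoSuchMethodError, as stored inside a jar.
GUAVA_SETS = "com/google/common/collect/Sets.class"


def jars_providing(classpath, class_entry=GUAVA_SETS):
    """Return the jars on `classpath` that contain `class_entry`, in classpath order.

    Hypothetical helper for illustration; it does not inspect a running JVM.
    """
    hits = []
    for entry in classpath.split(os.pathsep):
        if os.path.basename(entry) == "*":
            # Java-style wildcard entry: every .jar file in that directory.
            candidates = sorted(glob.glob(os.path.join(os.path.dirname(entry), "*.jar")))
        else:
            candidates = [entry]
        for path in candidates:
            # Skip empty entries (for example from "::"), directories and missing files.
            if not path.endswith(".jar") or not os.path.isfile(path):
                continue
            try:
                with zipfile.ZipFile(path) as jar:
                    if class_entry in jar.namelist():
                        hits.append(path)
            except zipfile.BadZipFile:
                # Not a readable jar; ignore it.
                continue
    return hits


if __name__ == "__main__":
    for jar in jars_providing(os.environ.get("PIG_CLASSPATH", "")):
        print(jar)
```

Run as a script, it prints each jar on $PIG_CLASSPATH that contains the Sets class. The same function could be given the output of `hadoop classpath` to see whether Hadoop's own libraries bring in another copy of Guava.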