Hi. We are seeing the same problem here. Even the wordcount map/reduce example shipped in the source tarball works fine on a one-node cluster, but fails with the same exception on a two-node cluster. CASSANDRA-3044 mentions that a temporary workaround is to disable node auto discovery. Can anyone tell me how to do that in the wordcount example? Thanks.
On Fri, Sep 2, 2011 at 12:10 AM, Jian Fang <jian.fang.subscr...@gmail.com> wrote:
> Thanks. How soon will 0.8.5 be out? Is there any 0.8.5 snapshot version
> available?
>
> On Thu, Sep 1, 2011 at 11:57 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>> Sounds like https://issues.apache.org/jira/browse/CASSANDRA-3044,
>> fixed for 0.8.5
>>
>> On Thu, Sep 1, 2011 at 10:54 AM, Jian Fang <jian.fang.subscr...@gmail.com> wrote:
>>> Hi,
>>>
>>> I upgraded Cassandra from 0.8.2 to 0.8.4 and ran a Hadoop job to read
>>> data from Cassandra, but got the following errors:
>>>
>>> 11/09/01 11:42:46 INFO hadoop.SalesRankLoader: Start Cassandra reader...
>>> Exception in thread "main" java.io.IOException: Could not get input splits
>>>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:157)
>>>     at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
>>>     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
>>>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
>>>     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
>>>     at com.barnesandnoble.hadoop.SalesRankLoader.run(SalesRankLoader.java:359)
>>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>     at com.barnesandnoble.hadoop.SalesRankLoader.main(SalesRankLoader.java:408)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>> Caused by: java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: protocol = socket host = null
>>>     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>>>     at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>>>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:153)
>>>     ... 12 more
>>> Caused by: java.lang.IllegalArgumentException: protocol = socket host = null
>>>     at sun.net.spi.DefaultProxySelector.select(DefaultProxySelector.java:151)
>>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:358)
>>>     at java.net.Socket.connect(Socket.java:529)
>>>     at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
>>>     at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
>>>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.createConnection(ColumnFamilyInputFormat.java:243)
>>>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:217)
>>>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:70)
>>>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:190)
>>>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:175)
>>>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>     at java.lang.Thread.run(Thread.java:662)
>>>
>>> The code used to work with 0.8.2 and it is really strange to see the host = null.
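For anyone puzzled by where the "protocol = socket host = null" text comes from: it is not a Cassandra message. When the JDK's socket code connects, it consults the default ProxySelector, and the stock DefaultProxySelector rejects any URI whose host component is null with exactly that message. The trace therefore shows the input format trying to open a Thrift connection to a null host (a discovered endpoint with no address), which is what CASSANDRA-3044 fixes. The following standalone sketch (my own illustration, not Cassandra code; the URI and class name are made up for the demo) reproduces the message:

```java
import java.net.ProxySelector;
import java.net.URI;

public class NullHostDemo {
    public static void main(String[] args) throws Exception {
        // An opaque URI with a "socket" scheme and no host component,
        // standing in for what the JDK socket code hands to the proxy
        // selector when the target host is null.
        URI noHost = new URI("socket", "example", null);

        try {
            ProxySelector.getDefault().select(noHost);
            System.out.println("no exception");
        } catch (IllegalArgumentException e) {
            // DefaultProxySelector.select() throws when uri.getHost() is null,
            // producing the same message seen in the stack trace above.
            System.out.println(e.getMessage()); // protocol = socket host = null
        }
    }
}
```

So the fix has to come from the Cassandra side (supplying a real endpoint address), not from proxy or network configuration.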
>>> My code is very similar to the word count example:
>>>
>>>     logger.info("Start Cassandra reader...");
>>>     Job job2 = new Job(getConf(), "SalesRankCassandraReader");
>>>     job2.setJarByClass(SalesRankLoader.class);
>>>     job2.setMapperClass(CassandraReaderMapper.class);
>>>     job2.setReducerClass(CassandraToFilesystem.class);
>>>     job2.setOutputKeyClass(Text.class);
>>>     job2.setOutputValueClass(IntWritable.class);
>>>     job2.setMapOutputKeyClass(Text.class);
>>>     job2.setMapOutputValueClass(IntWritable.class);
>>>     FileOutputFormat.setOutputPath(job2, new Path(outPath));
>>>
>>>     job2.setInputFormatClass(ColumnFamilyInputFormat.class);
>>>
>>>     ConfigHelper.setRpcPort(job2.getConfiguration(), "9260");
>>>     ConfigHelper.setInitialAddress(job2.getConfiguration(), "dnjsrcha02");
>>>     ConfigHelper.setPartitioner(job2.getConfiguration(), "org.apache.cassandra.dht.RandomPartitioner");
>>>     ConfigHelper.setInputColumnFamily(job2.getConfiguration(), KEYSPACE, columnFamily);
>>>     // ConfigHelper.setInputSplitSize(job2.getConfiguration(), 5000);
>>>     ConfigHelper.setRangeBatchSize(job2.getConfiguration(), batchSize);
>>>     SlicePredicate predicate = new SlicePredicate().setColumn_names(Arrays.asList(ByteBufferUtil.bytes(columnName)));
>>>     ConfigHelper.setInputSlicePredicate(job2.getConfiguration(), predicate);
>>>
>>>     job2.waitForCompletion(true);
>>>
>>> The Cassandra cluster includes 6 nodes and I am pretty sure they work fine.
>>>
>>> Please help.
>>>
>>> Thanks,
>>>
>>> John
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com