Hi everyone, I've been working on a rewrite of the Cassandra InputFormat for Hadoop 2 using the DataStax Java driver instead of the Thrift API.
I have a prototype working now, but there is one bit of code that I have not been able to replace with code for the Java driver. In the InputFormat#getSplits method, the old code has a call like the following:

    map = client.describe_ring(ConfigHelper.getInputKeyspace(conf));

This gets a list of the distinct token ranges for the Cassandra cluster. The rest of "getSplits" then takes these key ranges, breaks them up into subranges (to match the user-specified input split size), and then gets the replica nodes for the various token ranges (as the locations for the splits).

Does anyone know how I can do the following with the native protocol?

- Get the distinct token ranges for the C* cluster
- Get the set of replica nodes for a given range of tokens?

I tried looking around in Cluster and Metadata, among other places, in the API docs, but I didn't see anything that looked like it would do what I want.

Thanks!

Best regards,
Clint
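
P.S. To make the "breaks them up into subranges" step concrete, here is a minimal sketch of what I mean. It is not the actual code from the Thrift-based InputFormat: the class and method names (TokenRangeSplitter, split) are hypothetical, and it assumes tokens are signed 64-bit longs (as with Murmur3Partitioner) and that the range does not wrap around the ring. In practice the number of parts would be derived from the user-specified input split size.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
public class TokenRangeSplitter {

    /**
     * Splits the token range (start, end] into at most `parts` contiguous
     * subranges of roughly equal width. Each element is {subStart, subEnd}.
     * Assumes long tokens and a non-wrapping range (start < end).
     */
    public static List<long[]> split(long start, long end, int parts) {
        if (parts < 1 || start >= end) {
            throw new IllegalArgumentException("need parts >= 1 and start < end");
        }
        // BigInteger avoids overflow when the range spans most of the ring.
        BigInteger s = BigInteger.valueOf(start);
        BigInteger width = BigInteger.valueOf(end).subtract(s);
        BigInteger n = BigInteger.valueOf(parts);

        List<long[]> out = new ArrayList<>();
        long prev = start;
        for (int i = 1; i <= parts; i++) {
            long bound = (i == parts)
                    ? end
                    : s.add(width.multiply(BigInteger.valueOf(i)).divide(n)).longValueExact();
            // Skip empty subranges when the range is narrower than `parts`.
            if (bound > prev) {
                out.add(new long[] {prev, bound});
                prev = bound;
            }
        }
        return out;
    }
}
```

Each resulting subrange would then need its replica nodes looked up to serve as the split locations, which is the part I don't know how to do with the Java driver.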