I would start by looking at sstable2json. It may be simplest for you to run sstable2json and then process the resulting JSON. If that's not adequate, modifying the sstable2json code is probably your best bet.
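As a rough sketch of the "run sstable2json, then process the JSON" route, something like the following Python might be a starting point. It assumes sstable2json is on your PATH and that the output is a JSON array of rows shaped like {"key": ..., "cells": [[name, value, timestamp, ...], ...]}. Check that against what your Cassandra version actually emits, since the layout is version-dependent. The helper names dump_sstable and iter_cells, and the cells_field parameter, are made up for illustration.

```python
import json
import subprocess


def dump_sstable(data_file, tool="sstable2json"):
    """Run sstable2json on one Data.db file and return its stdout as text.

    Assumes the tool is on PATH; pass a full path via `tool` otherwise.
    """
    result = subprocess.run(
        [tool, data_file],
        check=True,           # raise CalledProcessError if the tool fails
        capture_output=True,
        text=True,
    )
    return result.stdout


def iter_cells(json_text, cells_field="cells"):
    """Yield (row_key, cell_name, cell_value, timestamp, extras) tuples.

    Assumed layout: a JSON array of objects with a "key" field and a list
    of cells, each cell being [name, value, timestamp, ...]. Anything after
    the first three fields (e.g. TTL or deletion markers) is passed through
    untouched in `extras`.
    """
    for row in json.loads(json_text):
        for cell in row.get(cells_field, []):
            name, value, timestamp = cell[:3]
            yield row["key"], name, value, timestamp, cell[3:]


if __name__ == "__main__":
    import sys

    # Usage: python dump_cells.py /path/to/some-Data.db
    for record in iter_cells(dump_sstable(sys.argv[1])):
        print(record)
```

This loads the whole dump into memory, so for large SSTables you would probably want a streaming JSON parser instead. For CQL tables the cell names and values will need further decoding on your side; the sketch only passes them through.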

On Mon, May 25, 2015 at 11:12 AM, Malcolm Matalka <malc...@spotify.com> wrote:

> Hello,
>
> For efficiency reasons I am trying to parse the raw SSTable files in
> order to transform them into another format. I understand this is
> like poking a sleeping beast and there aren't many guarantees around
> this but I'm asking if anyone has any pointers to make this possible?
> In a search I have stumbled upon FullContact's SSTable parser, but it
> does not parse the complicated data structures that CQL supports. In
> attempting to reverse engineer how Cassandra handles the actual data
> there are a few cases that are unclear and I'm concerned that my
> attempts to interpret them will result in a fragile result.
>
> Are there any suggestions? Existing libraries? Tips on how Cassandra
> parses the data itself? Pointers into the code to read? SSTable
> design doc?
>
> Thanks,
> /Malcolm

--
Tyler Hobbs
DataStax <http://datastax.com/>