Hi Carlos,

Please check whether this JIRA fixes your problem: https://issues.apache.org/jira/browse/CASSANDRA-11467

We had been facing a row count issue with Thrift column families / compact storage, and this fixed it. The fix is included in the latest release, 2.1.14. It's a two-line fix, so you can also build a custom jar and check whether that works.

Thanks
Anuj

On Thu, 21 Apr, 2016 at 9:29 PM, Carlos Alonso <i...@mrcalonso.com> wrote:

Hi guys.

I've been struggling for the last few days to find a reliable and stable way to count keys in a Thrift column family.

My idea is basically to iterate over the whole ring using the token function, as documented here: https://docs.datastax.com/en/cql/3.1/cql/cql_using/paging_c.html, in batches of 10000 records.

The only corner case is a single partition holding more than 10000 records (not the case here, but the program should still handle it). In that situation the script explores the partition in depth by fetching all records for that particular token (see below).

In the end, all keys are saved into a hash to guarantee uniqueness.

The count of unique keys is different on every run (and random: sometimes more keys are retrieved, sometimes fewer) and, of course, I'm sure no activity is going on in that CF.

I'm running Cassandra 2.1.11 with the Murmur3 partitioner, RF=3 and CL=QUORUM.

The column family structure is

    CREATE TABLE tbl (
        key blob,
        column1 ascii,
        value blob,
        PRIMARY KEY (key, column1)
    )

and I'm running the following script

    connection = open_cql_connection
    results = connection.execute("SELECT token(key), key FROM tbl LIMIT 10000")

    keys_hash = {} # Hash to save the keys to guarantee uniqueness
    last_token = nil
    token = nil

    while results != nil
      results.each do |row|
        keys_hash[row['key']] = true
        token = row['token(key)']
      end

      if token == last_token
        # Same token as the previous batch: fetch every record for that token
        results = connection.execute("SELECT token(key), key FROM tbl WHERE token(key) = #{token}")
      else
        results = connection.execute("SELECT token(key), key FROM tbl WHERE token(key) >= #{token} LIMIT 10000")
      end

      last_token = token
    end

    puts keys_hash.keys.count

What am I missing?

Thanks!

Carlos Alonso | Software Engineer | @calonso
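
For reference, the token-paging idea described above can be sketched roughly as follows. This is only an illustration, not a fix for CASSANDRA-11467: `page_query`, `collect_unique_keys` and `fetch_page` are hypothetical names, and `fetch_page` is assumed to be any callable that takes `(after_token, limit)` and returns an array of row hashes ordered by token. The sketch advances with a strict `>` on the last token seen and stops on the first empty page, which assumes that no two distinct keys share the same token. An in-memory stand-in replaces the cluster so the loop can be run without a connection.

```ruby
require 'set'

# Builds the CQL for one page. Table and column names follow the schema
# quoted in this thread; the helper itself is hypothetical.
def page_query(after_token, limit)
  base = "SELECT token(key), key FROM tbl"
  base += " WHERE token(key) > #{after_token}" unless after_token.nil?
  "#{base} LIMIT #{limit}"
end

# Walks the ring page by page and returns the set of distinct keys.
# `fetch_page` is an assumed callable: (after_token, limit) -> array of rows.
def collect_unique_keys(fetch_page, page_size = 10_000)
  keys = Set.new
  after_token = nil
  loop do
    rows = fetch_page.call(after_token, page_size)
    break if rows.empty?                      # nothing left past this token
    rows.each { |row| keys << row['key'] }
    after_token = rows.last['token(key)']     # resume strictly after it
  end
  keys
end

# In-memory stand-in for the cluster: three partitions, one with three rows.
ROWS = [
  { 'token(key)' => -50, 'key' => 'a' },
  { 'token(key)' => -50, 'key' => 'a' },
  { 'token(key)' => -50, 'key' => 'a' },
  { 'token(key)' => 7,   'key' => 'b' },
  { 'token(key)' => 90,  'key' => 'c' }
].freeze

FAKE_FETCH = lambda do |after_token, limit|
  ROWS.select { |r| after_token.nil? || r['token(key)'] > after_token }.first(limit)
end

puts collect_unique_keys(FAKE_FETCH, 2).size
```

Against a real cluster, `fetch_page` could wrap the `connection.execute` call from the script above, for example `->(after, limit) { connection.execute(page_query(after, limit)).to_a }`, assuming the driver's result object is enumerable. Because only keys are collected, a partition that is cut off by the page limit is not a problem here: its key has already been recorded before the walk moves on to the next token.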