Hi Carlos,
Please check if the JIRA : 
https://issues.apache.org/jira/browse/CASSANDRA-11467 fixes your problem.
We had been facing row count issue with thrift cf / compact storage and this 
fixed it.
Above is fixed in latest 2.1.14. Its a two line fix. So, you can also prepare a 
custom jar and check if that works.
ThanksAnuj
Sent from Yahoo Mail on Android 
 
  On Thu, 21 Apr, 2016 at 9:29 PM, Carlos Alonso<i...@mrcalonso.com> wrote:   
Hi guys.
I've been struggling for the last days to find a reliable and stable way to 
count keys in a thrift column family.
My idea is to basically iterate the whole ring using the token function, as 
documented here: 
https://docs.datastax.com/en/cql/3.1/cql/cql_using/paging_c.html in batches of 
10000 records
The only corner case is that if there were more than 10000 records in a single 
partition (not the case, but the program should still handle it) it explores 
the partition in depth by getting all records for that particular token (see 
below). In the end, all keys are saved into a hash to guarantee uniqueness. The 
count of unique keys is always different (and random, sometimes more keys, 
sometimes less are retrieved) and, of course, I'm sure no activity is going on 
in that cf.
I'm running Cassandra 2.1.11 with MurMur3 partitioner. RF=3 and CL=QUORUM
the column family structure is
CREATE TABLE tbl (    key blob,    column1 ascii,    value blob,    PRIMARY 
KEY(key, column1))
and I'm running the following script
connection = open_cql_connectionresults = connection.execute("SELECT 
token(key), key FROM tbl LIMIT 10000")
keys_hash = {} // Hash to save the keys to guarantee uniquenesslast_token = 
niltoken = nil
while results != nil  results.each do |row|    keys_hash[row['key']] = true    
token = row['token(key)']  end  if token == last_token    results = 
connection.execute("SELECT token(key), key FROM tbl WHERE token(key) = 
#{token}")  else    results = connection.execute("SELECT token(key), key FROM 
tbl WHERE token(key) >= #{token} LIMIT 10000")  end  last_token = tokenend

puts keys.keys.count
What am I missing?
Thanks!
Carlos Alonso | Software Engineer | @calonso
  

Reply via email to