It gives you an estimate of the number of partition keys. In newer versions it merges sketches of the keys and, using HyperLogLog++ <http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/40671.pdf> (p=13, sp=25), comes up with an estimate of the cardinality. I would say it's safe to assume that it's within 2-ish% of the actual value. That does not include the memtable data, however, so the memtable's count is added on top. That means keys present in both the memtable and the sstables are double counted. It should still be a fair estimate.
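As a rough illustration of the two points above, here is a small Python sketch. It is not Cassandra's code: the function names are hypothetical, the merged sketch is modeled with an exact set union rather than a real HyperLogLog++ structure, and the error formula is the usual HyperLogLog approximation of 1.04/sqrt(2^p), which I am assuming carries over to p=13 here.

```python
import math


def hll_relative_std_error(p: int) -> float:
    """Approximate relative standard error of HyperLogLog with 2**p registers."""
    return 1.04 / math.sqrt(2 ** p)


def reported_key_estimate(sstable_key_sets, memtable_keys) -> int:
    """Toy model of the reported figure.

    Keys are de-duplicated across sstables (an exact union stands in for
    the merged sketch), then the memtable count is added on top without
    any de-duplication against the sstables.
    """
    merged = set().union(*sstable_key_sets)
    return len(merged) + len(memtable_keys)


# p=13 gives 8192 registers, so roughly 1.15% standard error.
print(f"{hll_relative_std_error(13):.2%}")

# Hypothetical data: key "d" lives in both an sstable and the memtable.
sstables = [{"a", "b", "c"}, {"b", "c", "d"}]
memtable = {"d", "e"}

actual = len(set().union(*sstables) | memtable)      # 5 distinct partition keys
reported = reported_key_estimate(sstables, memtable)  # 4 + 2 = 6
print(actual, reported)
```

With one standard error at about 1.15%, "2-ish%" corresponds to roughly two standard errors, and the toy data shows how a key that has not been flushed out of the memtable inflates the count by one.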
Before 2.1.6 it used the index and could be off by a lot in use cases with wide rows, heavy updates, or many sstables.

---
Chris Lohfink

On Sun, Jan 24, 2016 at 6:32 PM, Jack Krupansky <jack.krupan...@gmail.com> wrote:

> Does the nodetool tablestats output line for "Number of keys (estimate)"
> indicate partition keys or CQL row primary keys (PK)?
>
> We currently don't have doc on this and I couldn't get a solid answer from
> a quick examination of the code.
>
> Since it is an estimate, roughly what is the nature of the estimation?
>
> In particular, for a very wide partition with many CQL rows (even
> millions) is it estimating that as roughly one key or will the number of
> sstables that the partition spans make it a large number?
>
> Thanks.
>
> -- Jack Krupansky
>