I recently changed the default_validation_class on a bunch of CFs from
BytesType to UTF8Type and observed two things. First, during the
migration I saw a number of compactions whose log entries showed ~200%
to ~400% of original. Second, compaction speed now seems to have halved.
I'm using v1.0.1 with leveled compaction and compression. Before I create
tests I thought I'd quickly ask: is there any difference in storage
efficiency between BytesType, UTF8Type, and AsciiType when storing plain
us-ascii strings? And is there any expected compaction speed difference?
(It would be nice to have some docs about the expected storage space
used for the various data types.)
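
For reference, here is the kind of quick client-side check I have in
mind, as a rough Python sketch. It does not touch Cassandra at all; the
sample strings and the encoded_sizes helper are made up for
illustration. It only confirms that plain us-ascii strings produce the
same bytes under ASCII and UTF-8 encoding, so any size or speed
difference would presumably have to come from how the server handles
the types, which is what I'm asking about.

```python
# Client-side sanity check only; nothing here talks to Cassandra.
# The sample values below are hypothetical.
samples = ["hello", "user_42", "plain us-ascii text"]


def encoded_sizes(s):
    """Return the byte length of s under the ascii and utf-8 encodings."""
    return len(s.encode("ascii")), len(s.encode("utf-8"))


for s in samples:
    ascii_len, utf8_len = encoded_sizes(s)
    # For pure us-ascii input the two encodings are byte-for-byte identical.
    assert s.encode("ascii") == s.encode("utf-8")
    print(s, ascii_len, utf8_len)
```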
Thanks much!
Thorsten