I'm guessing something else is responsible for the compaction
difference you're seeing: BytesType, UTF8Type, and AsciiType all use
the same lexical byte comparison code.  The only place you should
expect to lose a small amount of performance with the latter two is
on insert, when the input is sanity-checked.
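
On the storage question, here is a rough way to convince yourself,
sketched in Python rather than taken from the Cassandra source (the
function names are made up for illustration).  It only demonstrates
that plain us-ascii strings encode to the same bytes under ASCII and
UTF-8, so their size and byte ordering come out identical, and that
the extra work for the validated types is a check over the input.

```python
# Illustrative sketch only; this is NOT Cassandra's code.
# All function names below are hypothetical.

def encoded_sizes(text):
    """Return (ascii_len, utf8_len) in bytes for a us-ascii string."""
    return len(text.encode("ascii")), len(text.encode("utf-8"))

def validate_utf8(raw):
    """Stand-in for an insert-time check: is raw well-formed UTF-8?"""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def validate_ascii(raw):
    """Stand-in for an insert-time check: is every byte below 0x80?"""
    return all(b < 0x80 for b in raw)

samples = ["row-key-001", "Thorsten", "apple", "Zebra"]

# Same bytes either way, so the stored size cannot differ.
same_bytes = all(s.encode("ascii") == s.encode("utf-8") for s in samples)

# Python compares bytes objects lexically by unsigned byte value,
# so the ordering depends only on the bytes, not on a declared type.
sorted_by_bytes = sorted(s.encode("utf-8") for s in samples)
```

A bytes-only column would skip the two validate functions entirely,
which is the small insert-time cost mentioned above.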

On Sat, Nov 19, 2011 at 12:43 PM, Thorsten von Eicken
<t...@rightscale.com> wrote:
> I recently changed the default_validation_class on a bunch of CFs from
> BytesType to UTF8Type and I observed two things: first I saw a number of
> compactions during the migration that showed ~200% to ~400% of original
> in the log entry. Second, it seems that compaction speed has now halved.
> I'm using v1.0.1, level compaction and compression. Before I create
> tests I thought I'd quickly ask: is there any difference in storage
> efficiency between BytesType, UTF8Type, and AsciiType when storing plain
> us-ascii strings? And is there any expected compaction speed difference?
> (It would be nice to have some docs about the expected storage space
> used for the various data types.)
> Thanks much!
> Thorsten
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com