On 02/11/2011 05:06 AM, Patrik Modesto wrote:
> Hi all!
>
> I'm thinking if size of a column name could matter for a large dataset
> in Cassandra (I mean lots of rows). For example what if I have a row
> with 10 columns each has 10 bytes value and 10 bytes name. Do I have
> half the row size just of the column names and the other half of the
> data (not counting storage overhead)? What if I have 10M of these
> rows? Is there a difference? Should I use some 3bytes codes for a
> column name to save memory/bandwidth?
>
> Thanks,
> Patrik

You are correct that for small row/column values the key itself can represent a large proportion of the total size. I think you will find the consensus on this list is that trying to be clever with names is usually not worth the additional complexity. The right solution to this is https://issues.apache.org/jira/browse/CASSANDRA-47.
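
For a rough sense of scale, here is a back-of-the-envelope sketch in Python using the numbers from your example. It counts only raw name and value bytes, so it ignores per-column storage overhead, replication and anything else the storage layer adds; the function name_overhead is made up for illustration and is not part of any Cassandra API.

```python
def name_overhead(rows, columns, name_bytes, value_bytes):
    """Return (bytes spent on column names, total raw bytes, name fraction).

    Counts raw name and value bytes only; real on-disk size will be larger
    because of per-column and per-row storage overhead.
    """
    names = rows * columns * name_bytes
    values = rows * columns * value_bytes
    total = names + values
    return names, total, names / total


# 10M rows, 10 columns each, 10-byte values, 10-byte names (the case asked about)
long_names = name_overhead(10_000_000, 10, 10, 10)

# Same data with hypothetical 3-byte column name codes
short_names = name_overhead(10_000_000, 10, 3, 10)

for label, (names, total, fraction) in (("10-byte names", long_names),
                                        ("3-byte names", short_names)):
    print(f"{label}: {names / 1e6:.0f} MB of names in "
          f"{total / 1e6:.0f} MB raw ({fraction:.0%})")
```

Under these assumptions the names account for half of the raw bytes with 10-byte names and a bit under a quarter with 3-byte codes. Whether that saving is worth less readable column names is the trade-off discussed above.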