I inserted 500,000,000 rows, each of which has a 20-byte key and one column of 110 bytes.
The replication factor is set to 3, so I expected the total load of the cluster to be 0.5 billion * 130 bytes * 3 = 195 GB. In fact, the load reported by "nodetool -h localhost ring" is about 443 GB, more than twice my estimate. I think some additional data is being stored, such as the index, checksums, and the column names. Am I right? Is that all of it? Why is the difference so big? I hope I have explained my problem clearly.

2010-04-30 Bingbing Liu
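
The estimate in the question can be reproduced with a short calculation, which also shows how much extra space per row the observed load implies. This is only a sketch: the constant and function names are made up for illustration, and it assumes the "443G" reported by nodetool means 443 * 10^9 bytes (if the tool reports binary gigabytes, the gap is somewhat larger).

```python
# Figures taken from the question; the names are illustrative only.
ROWS = 500_000_000
KEY_BYTES = 20
COLUMN_BYTES = 110
REPLICATION_FACTOR = 3

# Assumption: "443G" is read as decimal gigabytes (10**9 bytes).
OBSERVED_LOAD_BYTES = 443 * 10**9


def expected_raw_bytes(rows, key_bytes, column_bytes, replication_factor):
    """Payload-only estimate: key plus column bytes, times the replica count."""
    return rows * (key_bytes + column_bytes) * replication_factor


def overhead_per_stored_row(observed_bytes, rows, key_bytes, column_bytes,
                            replication_factor):
    """Average extra bytes per stored row copy beyond the key and column."""
    stored_rows = rows * replication_factor
    return observed_bytes / stored_rows - (key_bytes + column_bytes)


expected = expected_raw_bytes(ROWS, KEY_BYTES, COLUMN_BYTES, REPLICATION_FACTOR)
ratio = OBSERVED_LOAD_BYTES / expected
overhead = overhead_per_stored_row(
    OBSERVED_LOAD_BYTES, ROWS, KEY_BYTES, COLUMN_BYTES, REPLICATION_FACTOR
)

print(f"expected payload: {expected / 10**9:.0f} GB")
print(f"observed / expected: {ratio:.2f}")
print(f"implied overhead per stored row: {overhead:.0f} bytes")
```

Under these assumptions, the observed load is roughly 2.3 times the payload estimate, which works out to about 165 extra bytes for every stored copy of a row. That per-row figure is what the index, checksums, column names, and any other metadata would have to account for.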