Don't forget to count the timestamp stored with each column.

2010/4/30 Bingbing Liu <rucb...@gmail.com>
> Hi,
>
> Thanks for your help.
>
> I ran "nodetool -h **** compact", but the load stays the same. Can anyone
> tell me why?
>
> 2010-04-30
> ------------------------------
> Bingbing Liu
> ------------------------------
> *From:* casablinca126.com
> *Sent:* 2010-04-30 15:52:09
> *To:* user@cassandra.apache.org
> *Cc:*
> *Subject:* Re: why the sum of all the nodes' loads is much bigger than the size
> of the inserted data?
>
> Hi,
>
> Have you ever run anti-compaction (maybe more than once) but never run
> cleanup on the anti-compaction node?
>
> Cheers,
> Cao Jiguang
>
> 2010-04-30
> ------------------------------
> casablinca126.com
> ------------------------------
> *From:* Bingbing Liu
> *Sent:* 2010-04-30 15:24:45
> *To:* user
> *Cc:*
> *Subject:* why the sum of all the nodes' loads is much bigger than the size of
> the inserted data?
>
> I inserted 500,000,000 rows, each with a 20-byte key and a 110-byte column.
>
> The replication factor is set to 3, so I expected the load of the cluster
> to be 0.5 billion * 130 * 3 = 195 GB.
>
> But in fact the load I get from "nodetool -h localhost ring" is about
> 443 GB.
>
> I think some additional data is stored as well, such as the index, the
> checksum, and the column name.
>
> Am I right? Is that all? Why is the difference so big?
>
> I hope I have explained my problem clearly.
>
> 2010-04-30
> ------------------------------
> Bingbing Liu
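
To make the arithmetic in the quoted estimate concrete, here is a rough sketch in Python. The function name and the per_column_overhead parameter are made up for illustration; the only overhead it counts is an 8-byte timestamp per column, which is an assumption about the on-disk format, and it treats "G" as 10^9 bytes. It does not model indexes, bloom filters, column names, or data left behind by anti-compaction.

```python
def estimated_load_bytes(rows, key_bytes, column_bytes,
                         replication_factor, per_column_overhead=0):
    """Rough cluster load: bytes per row times rows times replicas.

    per_column_overhead is a hypothetical knob for extra bytes stored
    with each column (for example a timestamp); one column per row.
    """
    per_row = key_bytes + column_bytes + per_column_overhead
    return rows * per_row * replication_factor


ROWS = 500_000_000

# The estimate from the quoted message: key + column value only.
naive = estimated_load_bytes(ROWS, 20, 110, 3)

# Same estimate, assuming an 8-byte timestamp per column.
with_timestamp = estimated_load_bytes(ROWS, 20, 110, 3, per_column_overhead=8)

# Bytes per row per replica implied by the reported load of about 443 GB.
observed_per_row = 443e9 / (ROWS * 3)

print(naive / 1e9)            # 195.0
print(with_timestamp / 1e9)   # 207.0
print(round(observed_per_row))
```

The timestamp alone moves the estimate only from 195 GB to 207 GB, so under these assumptions most of the gap to 443 GB would have to come from other sources, such as the ones raised earlier in the thread.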