I believe you need to move the nodes on the ring. What was the load on the nodes before you added the 5 new nodes? It's just that you are getting more data in certain token ranges than in others.
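To make "move the nodes" concrete: with RandomPartitioner, a balanced ring assigns node i the token i * 2**127 / N. A minimal sketch of computing those targets (this is the standard formula, not taken from the thread; the printed `nodetool move` lines are illustrative, you would run them one node at a time followed by cleanup):

```python
# Sketch: compute balanced RandomPartitioner tokens for an N-node ring.
# RandomPartitioner tokens live in [0, 2**127); a balanced ring gives
# node i the token i * 2**127 // N, so every node owns an equal range.

N = 8                    # number of nodes in the ring
RING_SIZE = 2 ** 127     # RandomPartitioner token space

tokens = [i * RING_SIZE // N for i in range(N)]

for i, token in enumerate(tokens, start=1):
    # Hypothetical rebalance commands; run against one node at a time.
    print(f"node {i}: nodetool move {token}")
```

Adjacent targets differ by exactly 2**127 // 8, which is the same even spacing the 12.50% "Owns" column in the ring output below reflects.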
-Naren

On Thu, Jan 19, 2012 at 3:22 AM, Marcel Steinbach <marcel.steinb...@chors.de> wrote:

> On 18.01.2012, at 02:19, Maki Watanabe wrote:
>> Is there any significant difference in the number of sstables on each node?
>
> No, no significant difference there. Actually, node 8 is among those with
> more sstables, but with the least load (20GB).
>
> On 17.01.2012, at 20:14, Jeremiah Jordan wrote:
>> Are you deleting data or using TTLs? Expired/deleted data won't go away
>> until the sstable holding it is compacted. So if compaction has happened
>> on some nodes, but not on others, you will see this. The disparity is
>> pretty big, 400GB to 20GB, so this probably isn't the issue, but with our
>> data using TTLs, if I run major compactions a couple of times on that
>> column family, it can shrink ~30-40%.
>
> Yes, we do delete data. But I agree, the disparity is too big to blame
> only the deletions.
>
> Also, we initially started out with 3 nodes and upgraded to 8 a few weeks
> ago. After adding the nodes, we did compactions and cleanups and still
> didn't have a balanced cluster. So that should have removed outdated data,
> right?
>
> 2012/1/18 Marcel Steinbach <marcel.steinb...@chors.de>:
>> We are running regular repairs, so I don't think that's the problem.
>> And the data dir sizes match approx. the load from nodetool.
>> Thanks for the advice, though.
>>
>> Our keys are digits only, and all contain a few zeros at the same
>> offsets. I'm not that familiar with the md5 algorithm, but I doubt that
>> it would generate 'hotspots' for that kind of key, right?
>>
>> On 17.01.2012, at 17:34, Mohit Anchlia wrote:
>>> Have you tried running repair first on each node? Also, verify using
>>> df -h on the data dirs.
>>>
>>> On Tue, Jan 17, 2012 at 7:34 AM, Marcel Steinbach
>>> <marcel.steinb...@chors.de> wrote:
>>>> Hi,
>>>>
>>>> we're using RP and have each node assigned the same amount of the
>>>> token space. The cluster looks like this:
>>>>
>>>> Address  Status  State   Load       Owns    Token
>>>>                                             205648943402372032879374446248852460236
>>>> 1        Up      Normal  310.83 GB  12.50%  56775407874461455114148055497453867724
>>>> 2        Up      Normal  470.24 GB  12.50%  78043055807020109080608968461939380940
>>>> 3        Up      Normal  271.57 GB  12.50%  99310703739578763047069881426424894156
>>>> 4        Up      Normal  282.61 GB  12.50%  120578351672137417013530794390910407372
>>>> 5        Up      Normal  248.76 GB  12.50%  141845999604696070979991707355395920588
>>>> 6        Up      Normal  164.12 GB  12.50%  163113647537254724946452620319881433804
>>>> 7        Up      Normal  76.23 GB   12.50%  184381295469813378912913533284366947020
>>>> 8        Up      Normal  19.79 GB   12.50%  205648943402372032879374446248852460236
>>>>
>>>> I was under the impression that RP would distribute the load more
>>>> evenly. Our row sizes are 0.5-1 KB, hence, we don't store huge rows on
>>>> a single node. Should we just move the nodes so that the load is more
>>>> evenly distributed, or is there something off that needs to be fixed
>>>> first?
>>>>
>>>> Thanks
>>>> Marcel
>>>>
>>>> chors GmbH | www.chors.com

--
Narendra Sharma
Software Engineer
http://www.aeris.com
http://narendrasharma.blogspot.com/
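On Marcel's question about md5 'hotspots': MD5 output is effectively uniform even for highly structured input, so digit-only keys with zeros at fixed offsets should still spread evenly across the token ranges. A quick sketch that checks this empirically (the key format `"00" + 6 digits + "00"` is a hypothetical stand-in for the real application keys, which the thread does not show):

```python
import hashlib
from collections import Counter

# Sketch: do digit-only keys with zeros at fixed offsets hash evenly
# across 8 equal token ranges? The key format below is a made-up
# stand-in for the real keys described in the thread.

def token_bucket(key: str, buckets: int = 8) -> int:
    # RandomPartitioner derives tokens from MD5; use the digest's top
    # byte to assign the key to one of `buckets` equal token ranges.
    digest = hashlib.md5(key.encode()).digest()
    return digest[0] * buckets // 256

counts = Counter(token_bucket(f"00{i:06d}00") for i in range(100_000))

mean = 100_000 / 8
for bucket in sorted(counts):
    print(f"range {bucket}: {counts[bucket]} keys ({counts[bucket] / mean:.3f}x mean)")
```

Every range should land very close to 1.000x the mean, which supports Marcel's intuition that the key scheme is not the cause of the 20GB-vs-470GB skew, and points back at per-node state (compaction, leftover data from the 3-to-8-node expansion) or at moving the nodes, as Naren suggests.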