Thank you, Ben, for the links.
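
The wiki page and CASSANDRA-7032 explain the ~51.6% / 48.4% split we saw.
For anyone else hitting this, below is a minimal sketch (plain Java, not
Cassandra internals; the unit ring and host names are just for illustration)
that places 256 random tokens per node and computes effective ownership,
the same quantity nodetool status reports:

import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

// Minimal sketch, NOT Cassandra code: place 256 random tokens per node on
// a unit ring and compute each node's effective ownership.
public class VnodeBalanceSketch {
    public static void main(String[] args) {
        final int nodes = 2, tokensPerNode = 256;
        Random rnd = new Random();
        TreeMap<Double, Integer> ring = new TreeMap<>(); // token -> owning node
        for (int n = 0; n < nodes; n++)
            for (int t = 0; t < tokensPerNode; t++)
                ring.put(rnd.nextDouble(), n);

        // A token owns the range from the previous token (exclusive) up to
        // itself (inclusive); the first range wraps around from the last token.
        double[] owned = new double[nodes];
        double prev = ring.lastKey() - 1.0; // unwrap the wrap-around segment
        for (Map.Entry<Double, Integer> e : ring.entrySet()) {
            owned[e.getValue()] += e.getKey() - prev;
            prev = e.getKey();
        }
        for (int n = 0; n < nodes; n++)
            System.out.printf("host%d owns %.1f%%%n", n + 1, 100 * owned[n]);
    }
}

Running it a few times prints splits like 51.6% / 48.4% or 49.3% / 50.7%,
so the nodetool figures look like normal vnode randomness rather than a
network problem during the initial shuffle.
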
On Tue, Apr 29, 2014 at 3:40 AM, Ben Bromhead <b...@instaclustr.com> wrote:

> Some imbalance is expected and considered normal:
>
> See http://wiki.apache.org/cassandra/VirtualNodes/Balance
>
> As well as
>
> https://issues.apache.org/jira/browse/CASSANDRA-7032
>
> Ben Bromhead
> Instaclustr | www.instaclustr.com | @instaclustr <http://twitter.com/instaclustr> | +61 415 936 359
>
> On 29 Apr 2014, at 7:30 am, DuyHai Doan <doanduy...@gmail.com> wrote:
>
> Hello all
>
> A quick update on the issue.
>
> After completely wiping the sstable/commitlog/saved_caches folders and
> restarting the cluster from scratch, we still see odd figures. After the
> restart, nodetool status does not show an exact 50% balance of data for
> each node:
>
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address  Load      Tokens  Owns (effective)  Host ID                               Rack
> UN  host1    48.57 KB  256     *51.6%*           d00de0d1-836f-4658-af64-3a12c00f47d6  rack1
> UN  host2    48.57 KB  256     *48.4%*           e9d2505b-7ba7-414c-8b17-af3bbe79ed9c  rack1
>
> As you can see, the percentages are very close to 50% but not exactly 50%.
>
> What can explain that? Could it be a network connection issue during the
> initial token shuffle phase?
>
> P.S: both host1 and host2 are supposed to have exactly the same hardware.
>
> Regards
>
> Duy Hai DOAN
>
> On Thu, Apr 24, 2014 at 11:20 PM, Batranut Bogdan <batra...@yahoo.com> wrote:
>
>> I don't know about Hector, but the DataStax Java driver needs just one IP
>> from the cluster and it will discover the rest of the nodes. Then by
>> default it does a round robin when sending requests. So if Hector does
>> the same, the pattern will appear again.
>>
>> Did you look at the size of the dirs?
>>
>> That documentation is for C* 0.8; it's old. But depending on your boxes
>> you might hit a CPU bottleneck. You might want to google for the write
>> path in Cassandra. According to that, there is not much to do when
>> writes come in...
>>
>> On Friday, April 25, 2014 12:00 AM, DuyHai Doan <doanduy...@gmail.com> wrote:
>>
>> I did some experiments.
>>
>> Let's say we have node1 and node2.
>>
>> First, I configured Hector with node1 & node2 as hosts and saw that only
>> node1 had a high CPU load.
>>
>> To eliminate the "client connection" issue, I re-tested with only node2
>> provided as the host for Hector. Same pattern: CPU load above 50% on
>> node1 and below 10% on node2.
>>
>> It means that node2 is acting as coordinator and forwarding many
>> write/read requests to node1.
>>
>> Why did I look at CPU load and not iostat et al.? Because I have a very
>> write-intensive workload with a read-only-once pattern. I've read here
>> (http://www.datastax.com/docs/0.8/cluster_architecture/cluster_planning)
>> that heavy writes in C* are more CPU-bound, but that info may be
>> outdated and no longer true.
>>
>> Regards
>>
>> Duy Hai DOAN
>>
>> On Thu, Apr 24, 2014 at 10:00 PM, Michael Shuler <mich...@pbandjelly.org> wrote:
>>
>> On 04/24/2014 10:29 AM, DuyHai Doan wrote:
>>
>>> Client used = Hector 1.1-4
>>> Default Load Balancing connection policy
>>> Both node addresses are provided to Hector, so according to its
>>> connection policy the client should alternate between the two nodes.
>>
>> OK, so is only one connection being established to one node for one bulk
>> write operation? Or are multiple connections being made to both nodes
>> and writes performed on both?
>>
>> --
>> Michael
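
P.S. On the client-side question discussed above: with the DataStax Java
driver (2.x-era API; the exact default load balancing policy varies by
driver version), a single contact point is enough. The driver discovers
the rest of the ring and the policy spreads requests across it. A minimal
sketch, with "host1" as a placeholder:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.policies.RoundRobinPolicy;

public class DriverDiscoveryDemo {
    public static void main(String[] args) {
        // One contact point is enough: the driver reads the rest of the
        // ring from the node's system tables and then balances across it.
        Cluster cluster = Cluster.builder()
                .addContactPoint("host1")                        // single seed
                .withLoadBalancingPolicy(new RoundRobinPolicy()) // explicit
                .build();
        Session session = cluster.connect(); // no keyspace needed here
        System.out.println("Discovered hosts: "
                + cluster.getMetadata().getAllHosts());
        cluster.close();
    }
}

So whether the client is given one node or both changes little about where
requests eventually land; the load balancing policy decides that.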
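
The Hector side of the experiment would look roughly like this (from memory
of the 1.1-x API, so double-check the class names against your Hector
version; "host2" / "TestCluster" are placeholders and the Thrift port 9160
is assumed):

import me.prettyprint.cassandra.connection.RoundRobinBalancingPolicy;
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.factory.HFactory;

public class HectorSingleHostDemo {
    public static void main(String[] args) {
        // Second run of the experiment: only node2 is given to Hector, so
        // any sustained CPU on node1 must come from coordinator forwarding,
        // not from the client's choice of connection.
        CassandraHostConfigurator conf =
                new CassandraHostConfigurator("host2:9160");
        conf.setLoadBalancingPolicy(new RoundRobinBalancingPolicy());
        Cluster cluster = HFactory.getOrCreateCluster("TestCluster", conf);
        System.out.println("Connected to: " + cluster.getName());
        HFactory.shutdownCluster(cluster);
    }
}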