Some imbalance is expected and considered normal: See http://wiki.apache.org/cassandra/VirtualNodes/Balance
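A quick simulation makes the point concrete: with `num_tokens: 256`, each node places 256 randomly chosen tokens on the ring, so effective ownership lands near, but almost never exactly at, an even split. This is an illustrative sketch (not from the thread); the token range mirrors the Murmur3Partitioner's 64-bit token space, and the ownership bookkeeping is simplified to a single replica per range.

```python
# Sketch: simulate random vnode token assignment to show why effective
# ownership is close to, but rarely exactly, 50% per node.
import random

random.seed(42)
TOKEN_MIN, TOKEN_MAX = -2**63, 2**63 - 1   # Murmur3Partitioner token space
SPAN = 2**64

def ownership(num_nodes=2, vnodes=256):
    # Each node picks `vnodes` random tokens, as with num_tokens: 256.
    tokens = []
    for node in range(num_nodes):
        tokens += [(random.randint(TOKEN_MIN, TOKEN_MAX), node)
                   for _ in range(vnodes)]
    tokens.sort()
    owned = [0] * num_nodes
    # Each token owns the range from the previous token up to itself;
    # the modulo handles the wrap-around range for the first token.
    for i, (tok, node) in enumerate(tokens):
        prev = tokens[i - 1][0]
        owned[node] += (tok - prev) % SPAN
    return [o / SPAN for o in owned]

shares = ownership()
print([f"{s:.1%}" for s in shares])  # close to 50%/50%, but not exact
```

With 256 vnodes per node the spread is typically within a couple of percent, which is consistent with the 51.6% / 48.4% figures reported in the thread.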
As well as https://issues.apache.org/jira/browse/CASSANDRA-7032

Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359

On 29 Apr 2014, at 7:30 am, DuyHai Doan <doanduy...@gmail.com> wrote:

> Hello all
>
> Some update about the issue.
>
> After completely wiping the sstable/commitlog/saved_caches folders and
> restarting the cluster from scratch, we still see strange figures. After the
> restart, nodetool status does not show an exact 50% data split for each node:
>
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address  Load      Tokens  Owns (effective)  Host ID                               Rack
> UN  host1    48.57 KB  256     51.6%             d00de0d1-836f-4658-af64-3a12c00f47d6  rack1
> UN  host2    48.57 KB  256     48.4%             e9d2505b-7ba7-414c-8b17-af3bbe79ed9c  rack1
>
> As you can see, the percentage is very close to 50% but not exactly 50%.
>
> What can explain that? Could it be a network connection issue during the
> initial token shuffle phase?
>
> P.S: both host1 and host2 are supposed to have exactly the same hardware
>
> Regards
>
> Duy Hai DOAN
>
> On Thu, Apr 24, 2014 at 11:20 PM, Batranut Bogdan <batra...@yahoo.com> wrote:
> I don't know about Hector, but the DataStax Java driver needs just one IP
> from the cluster and it will discover the rest of the nodes. Then, by
> default, it round-robins requests across the nodes. So if Hector does the
> same, the pattern will appear again.
> Did you look at the size of the dirs?
> That documentation is for C* 0.8. It's old. But depending on your boxes you
> might hit a CPU bottleneck. You might want to google for the write path in
> Cassandra. According to that, there is not much to do when writes come in...
>
> On Friday, April 25, 2014 12:00 AM, DuyHai Doan <doanduy...@gmail.com> wrote:
> I did some experiments.
> Let's say we have node1 and node2.
>
> First, I configured Hector with node1 & node2 as hosts and saw that only
> node1 had a high CPU load.
>
> To eliminate the "client connection" issue, I re-tested with only node2
> provided as the host for Hector. Same pattern: CPU load is above 50% on
> node1 and below 10% on node2.
>
> It means that node2 is acting as coordinator and forwarding many write/read
> requests to node1.
>
> Why did I look at CPU load and not iostat et al.?
>
> Because I have a very write-intensive workload with a read-only-once
> pattern. I've read here
> (http://www.datastax.com/docs/0.8/cluster_architecture/cluster_planning)
> that heavy writes in C* are more CPU-bound, but that info may be outdated
> and no longer true.
>
> Regards
>
> Duy Hai DOAN
>
> On Thu, Apr 24, 2014 at 10:00 PM, Michael Shuler <mich...@pbandjelly.org> wrote:
> On 04/24/2014 10:29 AM, DuyHai Doan wrote:
> Client used = Hector 1.1-4
> Default Load Balancing connection policy
> Both nodes' addresses are provided to Hector, so according to its
> connection policy the client should alternate between the two nodes.
>
> OK, so is only one connection being established to one node for one bulk
> write operation? Or are multiple connections being made to both nodes and
> writes performed on both?
>
> --
> Michael
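For reference, the round-robin behaviour discussed above can be sketched in a few lines. This is a hypothetical policy class, not the actual Hector or DataStax driver API: each request simply goes to the next host in rotation, so even a single contact point ends up spreading client requests once the other peers are discovered.

```python
# Sketch (hypothetical class, not a real driver API): client-side
# round-robin load balancing over discovered hosts.
from itertools import cycle

class RoundRobinPolicy:
    def __init__(self, hosts):
        self._hosts = cycle(hosts)   # endless rotation over the host list

    def next_host(self):
        return next(self._hosts)

policy = RoundRobinPolicy(["node1", "node2"])
picks = [policy.next_host() for _ in range(4)]
print(picks)  # ['node1', 'node2', 'node1', 'node2']
```

Note that even perfect client-side round-robin does not equalise CPU load: the coordinator still forwards each mutation to the replica(s) owning the token, which is consistent with node2 acting as coordinator and forwarding work to node1 in the experiment above.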