I meant to say I’m *not* overloading my cluster. On Jun 12, 2015, at 6:52 PM, Robert Wille <rwi...@fold3.com> wrote:
> I am preparing to migrate a large amount of data to Cassandra. In order to test my migration code, I’ve been doing some dry runs to a test cluster. My test cluster is 2.0.15, 3 nodes, RF=1 and CL=QUORUM. I know RF=1 and CL=QUORUM is a weird combination, but my production cluster that will eventually receive this data is RF=3. I am running with RF=1 so it’s faster while I work out the kinks in the migration.
>
> There are a few things that have puzzled me after writing several tens of millions of records to my test cluster.
>
> My main concern is that I have a few tens of thousands of dropped mutation messages. I’m overloading my cluster. I never have more than about 10% CPU utilization (even my I/O wait is negligible). A curious thing about that is that the driver hasn’t thrown any exceptions, even though mutations have been dropped. I’ve seen dropped mutation messages on my production cluster, but, like this, I’ve never gotten errors back from the client. I had always assumed that one node dropped mutation messages but the other two did not, and so quorum was satisfied. With RF=1, I don’t understand how mutation messages are being dropped while the client doesn’t tell me about it. Does this mean my cluster is missing data, and I have no idea?
>
> Each node has a couple dozen all-time blocked FlushWriters. Is that bad?
>
> I have around 100 dropped counter mutations, which is very weird because I don’t write any counters. I have counters in my schema for tracking view counts, but the migration code doesn’t write them. How could I get dropped counter mutation messages when I don’t modify them?
>
> Any insights would be appreciated. Thanks in advance.
>
> Robert
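
For anyone following along, the "one node dropped but quorum was still satisfied" theory comes down to quorum arithmetic. A minimal sketch of my understanding (not code from this thread): Cassandra computes a quorum as floor(RF / 2) + 1, so with RF=1 the single replica must ack every write, while with RF=3 one replica can drop a mutation and a QUORUM write still succeeds.

```python
# Sketch of Cassandra's quorum arithmetic (illustrative only, not code
# from Cassandra itself): quorum = floor(RF / 2) + 1.

def quorum(replication_factor: int) -> int:
    """Number of replica acks a QUORUM read/write must collect."""
    return replication_factor // 2 + 1

# With RF=1, quorum is 1: the lone replica must ack every write, so a
# mutation dropped there is data the cluster never durably accepted.
print(quorum(1))  # 1

# With RF=3, quorum is 2: one replica can drop a mutation and the write
# still succeeds, matching the "one dropped, two didn't" assumption above.
print(quorum(3))  # 2
```

This is why RF=1 + CL=QUORUM behaves exactly like CL=ONE: both need a single ack.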