On Aug 23, 2011, at 3:43 AM, aaron morton wrote:

> Dropped messages in ReadRepair is odd. Are you also dropping mutations?
>
> There are two tasks performed on the ReadRepair stage: the digests are
> compared on this stage, and the repair itself also happens on this stage.
> Comparing digests is quick. Doing the repair could take a bit longer: all
> the cf's returned are collated and filtered, and deletes are removed.
>
> We don't do background Read Repair on range scans; they do have foreground
> digest checking though.
>
> What CL are you using?
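To make those two tasks concrete, here is a minimal illustrative sketch in
Java. All names are invented; this is not the actual Cassandra code (the real
logic lives in o.a.c.service.RowRepairResolver and friends):

    import java.util.*;

    class ReadRepairSketch {
        // A column version: write timestamp, value, and delete marker.
        record Column(long timestamp, String value, boolean tombstone) {}

        // Task 1: digest comparison -- quick; just check the replica
        // response hashes all agree.
        static boolean digestsMatch(List<byte[]> digests) {
            for (byte[] d : digests)
                if (!Arrays.equals(d, digests.get(0))) return false;
            return true;
        }

        // Task 2: the repair resolve -- collate every returned version of
        // each column, filter to the newest, then remove deletes
        // (tombstones). On a very large out-of-sync row this is the part
        // that can take a while.
        static Map<String, Column> resolve(List<Map<String, Column>> replicaRows) {
            Map<String, Column> merged = new HashMap<>();
            for (Map<String, Column> row : replicaRows)          // collate
                for (Map.Entry<String, Column> e : row.entrySet())
                    merged.merge(e.getKey(), e.getValue(),
                        (a, b) -> a.timestamp() >= b.timestamp() ? a : b); // newest wins
            merged.values().removeIf(Column::tombstone);         // remove deletes
            return merged;
        }
    }

The point of the sketch is the asymmetry: comparing digests only looks at one
hash per replica, while the resolve pass touches every column of every
returned version, which is why a very large row makes it slow.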
CL.ONE for hadoop writes, CL.QUORUM for hadoop reads

> begin crazy theory:
>
> Could there be a very big row that is out of sync? The increased RR would
> result in mutations being sent back to the replicas, which would give you
> a hot spot in mutations.
>
> Check max compacted row size on the hot nodes.
>
> Turn the logging up to DEBUG on the hot machines for
> o.a.c.service.RowRepairResolver and look for the "resolve:…" message; it
> has the time taken.

The max compacted size didn't seem unreasonable - about a MB. I turned up
logging to DEBUG for that class and I get plenty of dropped READ_REPAIR
messages, but nothing coming out at DEBUG in the logs to indicate the time
taken that I can see.

> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 23/08/2011, at 7:52 PM, Jeremy Hanna wrote:
>
>>
>> On Aug 23, 2011, at 2:25 AM, Peter Schuller wrote:
>>
>>>> We've been having issues where as soon as we start doing heavy writes
>>>> (via hadoop) recently, it really hammers 4 nodes out of 20. We're using
>>>> random partitioner and we've set the initial tokens for our 20 nodes
>>>> according to the general spacing formula, except for a few token
>>>> offsets as we've replaced dead nodes.
>>>
>>> Is the hadoop job iterating over keys in the cluster in token order
>>> perhaps, and you're generating writes to those keys? That would explain
>>> a "moving hotspot" along the cluster.
>>
>> Yes - we're iterating over all the keys of particular column families,
>> doing joins using pig as we enrich and perform measure calculations. When
>> we write, we're usually writing out for a certain small subset of keys,
>> which shouldn't have hotspots with RandomPartitioner afaict.
>>
>>>
>>> --
>>> / Peter Schuller (@scode on twitter)
>>
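A note for anyone following Aaron's DEBUG suggestion: assuming the stock
log4j setup that Cassandra of this vintage ships with, the per-class override
goes in conf/log4j-server.properties:

    # Raise only RowRepairResolver to DEBUG (assumes the stock log4j config)
    log4j.logger.org.apache.cassandra.service.RowRepairResolver=DEBUG

The max compacted row size Aaron asks about is reported per column family by
nodetool cfstats as "Compacted row maximum size".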
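For reference, the "general spacing formula" mentioned in the quoted question
is, for RandomPartitioner, evenly spaced tokens in the 0..2^127 range:
token_i = i * 2^127 / N. A quick sketch (the node count of 20 is taken from
the thread; everything else is illustrative):

    import java.math.BigInteger;

    class InitialTokens {
        public static void main(String[] args) {
            final int n = 20;                                    // cluster size
            final BigInteger range = BigInteger.valueOf(2).pow(127); // token space
            for (int i = 0; i < n; i++) {
                BigInteger token = range.multiply(BigInteger.valueOf(i))
                                        .divide(BigInteger.valueOf(n));
                System.out.println("node " + i + ": initial_token=" + token);
            }
        }
    }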