I could be totally wrong here, but if you are doing a QUORUM read and a bad value is encountered among the replicas answering the QUORUM, won't a repair happen anyway? I thought read_repair_chance = 0 just means it won't query extra nodes beyond the consistency level to check for bad values.

-Jeremiah
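For reference, a minimal cassandra-cli sketch of how that setting is typically turned off per column family on a 0.8-era cluster; the keyspace and column family names below are placeholders, not ones taken from this thread:

    use MyKeyspace;
    update column family Users with read_repair_chance = 0.0;

If the understanding above is right, this setting only governs the extra background checks against additional replicas; a mismatch among the replicas contacted for the QUORUM itself is still resolved and written back as part of serving the read, regardless of the value here.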
On Oct 17, 2011, at 4:22 PM, Jeremy Hanna wrote:

> Even after disabling hinted handoff and setting read_repair_chance to 0 on all our column families, we were still experiencing massive writes. Apparently the read_repair_chance is completely ignored at any CL higher than CL.ONE. So we were doing CL.QUORUM on reads and writes and seeing massive writes still. It was because of the background read repairs being done. We did extensive logging and checking, and that's all it could be, as no mutations were coming in via thrift to those column families.
>
> In any case, just wanted to give some follow-up here as it's been an inexplicable rock in our backpack, and hopefully this clears up where that setting is actually used. I'll update the storage configuration wiki to include that caveat as well.
>
> On Sep 10, 2011, at 5:14 PM, Jeremy Hanna wrote:
>
>> Thanks for the insights. I may first try disabling hinted handoff for one run of our data pipeline and see if it exhibits the same behavior. Will post back if I see anything enlightening there.
>>
>> On Sep 10, 2011, at 5:04 PM, Chris Goffinet wrote:
>>
>>> You could tail the commit log with `strings` to see what keys are being inserted.
>>>
>>> On Sat, Sep 10, 2011 at 2:24 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>>> Two possibilities:
>>>
>>> 1) Hinted handoff (this will show up in the logs on the sending machine; on the receiving one it will just look like any other write)
>>>
>>> 2) You have something doing writes that you're not aware of. I guess you could track that down using wireshark to see where the write messages are coming from.
>>>
>>> On Sat, Sep 10, 2011 at 3:56 PM, Jeremy Hanna <jeremy.hanna1...@gmail.com> wrote:
>>>> Oh, and we're running 0.8.4 and the RF is 3.
>>>>
>>>> On Sep 10, 2011, at 3:49 PM, Jeremy Hanna wrote:
>>>>
>>>>> In addition, the mutation stage and the read stage are backed up like:
>>>>>
>>>>> Pool Name                 Active   Pending   Blocked
>>>>> ReadStage                     32       773         0
>>>>> RequestResponseStage           0         0         0
>>>>> ReadRepairStage                0         0         0
>>>>> MutationStage                158    525918         0
>>>>> ReplicateOnWriteStage          0         0         0
>>>>> GossipStage                    0         0         0
>>>>> AntiEntropyStage               0         0         0
>>>>> MigrationStage                 0         0         0
>>>>> StreamStage                    0         0         0
>>>>> MemtablePostFlusher            1         5         0
>>>>> FILEUTILS-DELETE-POOL          0         0         0
>>>>> FlushWriter                    2         5         0
>>>>> MiscStage                      0         0         0
>>>>> FlushSorter                    0         0         0
>>>>> InternalResponseStage          0         0         0
>>>>> HintedHandoff                  0         0         0
>>>>> CompactionManager            n/a        29
>>>>> MessagingService             n/a      0,34
>>>>>
>>>>> On Sep 10, 2011, at 3:38 PM, Jeremy Hanna wrote:
>>>>>
>>>>>> We are experiencing massive writes to column families when only doing reads from Cassandra. A set of 5 hadoop jobs are reading from Cassandra and then writing out to hdfs. That is the only thing operating on the cluster. We are reading at CL.QUORUM with hadoop and have written with CL.QUORUM. Read repair chance is set to 0.0 on all column families. However, in the logs, I'm seeing flush after flush of memtables and compactions taking place. Is there something else that would be writing based on the above description?
>>>>>>
>>>>>> Jeremy
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of DataStax, the source for professional Cassandra support
>>> http://www.datastax.com
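For anyone retracing the diagnostics suggested in the thread (tailing the commit log with `strings`, and a packet capture to see where the write messages originate), a rough shell sketch. The commit log path, interface name, and ports are stock defaults and may differ on a given install, and tcpdump stands in here for the Wireshark capture Jonathan mentions:

    # Dump printable strings from the commit log segments to spot which keys
    # are actually being written (default commit log location assumed)
    strings /var/lib/cassandra/commitlog/* | less

    # Watch inter-node (storage port 7000) and Thrift client (rpc port 9160)
    # traffic to see which hosts the writes are coming from
    sudo tcpdump -i eth0 'tcp port 7000 or tcp port 9160'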