Peter, thank you for the extremely detailed reply. To answer my own question, the critical points that differ from what I said earlier are: that CL.ONE does prefer *one* node (which one depends on the snitch), and that RR uses digests (which are not mentioned on the wiki page [1]) instead of comparing raw responses. Totally tangential, but for CL.ONE with narrow rows, sending the request to all replicas and taking the fastest response would probably be better; having things work both ways depending on row size sounds painfully complicated, though. (As Aaron points out, this is not how things work now.)
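To make sure I have the digest idea straight, here is a rough sketch of my understanding. It is not Cassandra's actual code: the dict-based "replicas", the function names, and the use of MD5 are all assumptions for illustration only. The point is that a digest read still has to load the row on the replica; only the bytes sent back shrink.

```python
import hashlib

# Hypothetical stand-ins for replicas: each maps row key -> serialized row bytes.

def read_data(replica, key):
    """Full read: the replica returns the whole row."""
    return replica[key]

def read_digest(replica, key):
    """Digest read: the replica still loads the row, but returns only a hash."""
    return hashlib.md5(replica[key]).digest()

def replicas_needing_repair(key, preferred, others):
    """Read the row from the preferred node and compare digests from the rest.

    Returns the row plus the indices (into `others`) of replicas whose
    digest does not match the preferred node's data.
    """
    row = read_data(preferred, key)
    expected = hashlib.md5(row).digest()
    stale = [i for i, r in enumerate(others) if read_digest(r, key) != expected]
    return row, stale

# Three toy replicas; the third holds an older version of the row.
a = {"k": b"v2" * 1000}
b = {"k": b"v2" * 1000}
c = {"k": b"v1" * 1000}

row, stale = replicas_needing_repair("k", a, [b, c])
print(stale, len(row), len(read_digest(c, "k")))  # [1] 2000 16
```

In this toy case the out-of-date replica is detected from a 16-byte digest instead of a 2000-byte row, but every replica still had to look the row up to compute it.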
I am assuming that RR digests save on bandwidth, but that generating the digest on a row cache miss requires the same number of disk seeks (my nemesis is disk I/O). So to increase pinny-ness I'll further reduce the RR chance and set a badness threshold.

Thanks all.

[1] http://wiki.apache.org/cassandra/ReadRepair