Hi again.
It would be great if someone could comment on whether the following is true
or not.
I tried to understand the consequences of using
-Dcassandra.dynamic_snitch=true for the read path, and this is what I
came up with:
1) If using CL > 1, then using the dynamic snitch will result in a data
read from the node with the lowest latency (slightly simplified), even if
the proxy node contains the data but has a higher latency than other
possible nodes. This means that it is not necessary to do load-based
balancing on the client side.
2) If using CL = 1, then the proxy node will always return the data itself,
even when there is another node with less load.
3) Digest requests will be sent to all other live peer nodes for that
key and will result in a data read on each of those nodes to calculate the
digest. The only difference is that the data is not sent back, but IO-wise
it is just as expensive.
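
To make points 1) to 3) concrete, here is how I picture the node selection,
written as a small Python toy model. This is only my understanding put into
code, not anything taken from Cassandra; pick_read_targets, latency_ms and
the node names are made up for illustration.

```python
# Toy model of points 1) to 3) as described above; NOT Cassandra code.
# All names (pick_read_targets, latency_ms, node names) are hypothetical.

def pick_read_targets(proxy, live_replicas, latency_ms, consistency_level):
    """Return (data_node, digest_nodes) for a single read."""
    if consistency_level == 1 and proxy in live_replicas:
        # Point 2: with CL = 1 the proxy answers from its own data.
        data_node = proxy
    else:
        # Point 1: with CL > 1 the replica with the lowest latency
        # serves the data read, even if the proxy also holds the data.
        data_node = min(live_replicas, key=lambda node: latency_ms[node])
    # Point 3: every other live replica receives a digest request.
    digest_nodes = [node for node in live_replicas if node != data_node]
    return data_node, digest_nodes


latency_ms = {"a": 5.0, "b": 1.0, "c": 3.0}
print(pick_read_targets("a", ["a", "b", "c"], latency_ms, 2))
print(pick_read_targets("a", ["a", "b", "c"], latency_ms, 1))
```

If this model is wrong anywhere, that is exactly the part I would like to
have corrected.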
The next one goes a little further:
We read and write at QUORUM with RF = 3.
It seems to me that it wouldn't be hard to patch the StorageProxy to
send only one read request and one digest request. Only if one of the
requests fails would we have to query the remaining node. We don't need
read repair because we have to repair once a week anyway, and quorum
guarantees consistency. This way we could reduce read load significantly,
which should compensate for the latency increase caused by failing reads.
Am I missing something?
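
A rough sketch of the control flow I have in mind, in Python. This is not
StorageProxy code: quorum_read, read_data, read_digest and digest_of are
hypothetical names, failures are modelled as IOError, MD5 is just a
stand-in digest, and what should happen on a digest mismatch is left open
here.

```python
# Sketch of the proposed read path for RF = 3 at QUORUM; NOT StorageProxy code.
# read_data / read_digest are hypothetical callables supplied by the caller.
import hashlib

RF = 3
QUORUM = RF // 2 + 1  # 2 of 3 replicas


def digest_of(value: bytes) -> str:
    # Stand-in digest function, chosen only for this illustration.
    return hashlib.md5(value).hexdigest()


def quorum_read(replicas, read_data, read_digest):
    """Send one data read and one digest read; use the third node only on failure.

    replicas: the RF node names, in order of preference.
    read_data(node) -> bytes and read_digest(node) -> str raise IOError on failure.
    Returns (value, contacted_nodes).
    """
    first, second, spare = replicas
    contacted = []

    def attempt(request, node):
        contacted.append(node)
        try:
            return request(node)
        except IOError:
            return None

    data = attempt(read_data, first)
    digest = attempt(read_digest, second)

    if data is None and digest is None:
        # Only one replica is left, so QUORUM cannot be reached any more.
        raise IOError("fewer than QUORUM replicas answered")
    # Only if one of the two requests failed is the remaining node queried.
    if data is None:
        data = attempt(read_data, spare)
    elif digest is None:
        digest = attempt(read_digest, spare)
    if data is None or digest is None:
        raise IOError("fewer than QUORUM replicas answered")

    if digest_of(data) != digest:
        # Not covered by the proposal; the sketch simply gives up here.
        raise ValueError("digest mismatch between replicas")
    return data, contacted
```

In the normal case only two of the three replicas are touched; the third
one is read only after a failure, which is where the extra latency would
come from.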
Best,
Daniel