Thanks Mark for the very detailed explanation.

However, what about timestamp checking? You say that the coordinator
compares a digest of the data (the cell value) from both nodes, but if the
same cell has a different timestamp on each node, would the coordinator
still request a full data read from the node holding the most recent
timestamp?
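
To make the question concrete, here is a rough Python sketch of the read
path as I understand it from your description. Everything in it is
hypothetical (the Cell class, the digest() and quorum_read() helpers, the
use of MD5, and the include_timestamp switch); it is not Cassandra's actual
code. The switch is exactly the part I am unsure about: whether the digest
covers the timestamp or only the value.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    value: bytes
    timestamp: int  # write timestamp, e.g. microseconds since epoch


def digest(cell: Cell, include_timestamp: bool) -> str:
    """Hash of the would-be response (hypothetical; MD5 chosen only for illustration)."""
    h = hashlib.md5()
    h.update(cell.value)
    if include_timestamp:
        # Assumption under test: the digest also covers the write timestamp.
        h.update(cell.timestamp.to_bytes(8, "big", signed=True))
    return h.hexdigest()


def quorum_read(closest: Cell, other: Cell, include_timestamp: bool):
    """Return (cell handed to the client, whether full data reads were needed)."""
    # The closest replica returns data; the other replica returns only a digest.
    if digest(closest, include_timestamp) == digest(other, include_timestamp):
        return closest, False
    # Mismatch: full data read from both contacted replicas, newest timestamp wins.
    return max(closest, other, key=lambda c: c.timestamp), True
```

If the digest covered only the value, two replicas holding the same value
with different timestamps would look identical and no full read would
happen, e.g. quorum_read(Cell(b"v", 1), Cell(b"v", 2), False) returns the
closest replica's copy. With the timestamp included, the same call with True
falls through to the full read and returns the newer cell.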


On Fri, Jul 25, 2014 at 11:25 PM, Mark Reddy <[email protected]> wrote:

> Hi Brian,
>
> A read request will be handled in the following manner:
>
> Once the coordinator receives a read request, it first determines the
> replicas responsible for the data. Those replicas are then sorted by
> "proximity" to the coordinator. The closest node, as determined by
> proximity sorting, is sent a command to perform an actual data read, i.e.
> to return the data to the coordinator.
>
> If you have a Replication Factor (RF) of 3 and are reading at CL.QUORUM,
> one additional node will be sent a digest query. A digest query is like a
> read query except that instead of the receiving node actually returning the
> data, it only returns a digest (hash) of the would-be data. The reason for
> this is to discover whether the two nodes contacted agree on what the
> current data is, without sending the data over the network. Obviously for
> large data sets this is an effective bandwidth saver.
>
> Back on the coordinator node, if the data and the digest match, the data
> is returned to the client. If they do not match, a full data read is
> performed against the contacted replicas in order to guarantee that the
> most recent data is returned.
>
> Asynchronously in the background, the third replica is checked for
> consistency with the first two, and if needed, a read repair is initiated
> for that node.
>
>
> Mark
>
>
>
> On Fri, Jul 25, 2014 at 9:12 PM, Brian Tarbox <[email protected]>
> wrote:
>
>> We're considering a C* setup with very large columns and I have a
>> question about the details of a read.
>>
>> I understand that a read request gets handled by the coordinator which
>> sends read requests to <quorum> of the nodes holding replicas of the data,
>> and once <quorum> nodes have replied with consistent data it is returned to
>> the client.
>>
>> My understanding is that each of the nodes actually sends the full data
>> being requested to the coordinator (which in the case of very large columns
>> would involve lots of network traffic).  Is that right?
>>
>> The alternative (which I don't think is the case but I've been asked to
>> verify) is that the replicas first send meta-data to the coordinator which
>> then asks one replica to send the actual data.  Again, I don't think this
>> is the case but was asked to confirm.
>>
>> Thanks.
>>
>> --
>> http://about.me/BrianTarbox
>>
>
>
