On Thu, Oct 18, 2012 at 2:31 PM, Jeremy Hanna <jeremy.hanna1...@gmail.com> wrote:
>
> On Oct 18, 2012, at 3:52 PM, Andrey Ilinykh <ailin...@gmail.com> wrote:
>
>> On Thu, Oct 18, 2012 at 1:34 PM, Michael Kjellman
>> <mkjell...@barracuda.com> wrote:
>>> Not sure I understand your question (if there is one..)
>>>
>>> You are more than welcome to do CL ONE and assuming you have hadoop nodes
>>> in the right places on your ring things could work out very nicely. If you
>>> need to guarantee that you have all the data in your job then you'll need
>>> to use QUORUM.
>>>
>>> If you don't specify a CL in your job config it will default to ONE (at
>>> least that's what my read of the ConfigHelper source for 1.1.6 shows)
>>>
>> I have two questions.
>> 1. I can benefit from data locality (and Hadoop) only with CL ONE. Is
>> it correct?
>
> Yes and at QUORUM it's quasi local. The job tracker finds out where a range
> is and sends a task to a replica with the data (local). In the case of
> CL.QUORUM (see the Read Path section of
> http://wiki.apache.org/cassandra/ArchitectureInternals), it will do an actual
> read of the data on the node closest (local). Then it will get a digest from
> other nodes to verify that they have the same data. So in the case of RF=3
> and QUORUM, it will read the data on the local node where the task is running
> and will check the next closest replica for a digest to verify that it is
> consistent. Information is sent across the wire and there is the latency of
> that, but it's not the data that's sent.
>
>> 2. With CL QUORUM cassandra reads data from all replicas. In this case
>> Hadoop doesn't give me any benefits. Application running outside the
>> cluster has the same performance. Is it correct?
>
> CL QUORUM does not read data from all replicas. Applications running outside
> the cluster have to copy the data from the cluster, a much more copy/network
> intensive operation than using CL.QUORUM with the built-in Hadoop support.
>
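To put numbers on the RF=3 case described above, here is a rough Python
sketch. It is not Cassandra code, and the function names are made up for
illustration. It assumes QUORUM means floor(RF/2) + 1 replicas, and that one
of them (the local one) serves the full data read while the others return
only a digest. It ignores read repair and what happens on a digest mismatch.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must answer a QUORUM read: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def read_plan(replication_factor: int, consistency_level: str) -> dict:
    """Simplified model: one full data read, the rest are digest-only reads.

    Hypothetical helper for illustration; only ONE and QUORUM are modelled.
    """
    required = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
    }[consistency_level]
    # The replica closest to the task returns the data itself;
    # the remaining required replicas return only a digest.
    return {"data_reads": 1, "digest_reads": required - 1}


if __name__ == "__main__":
    for cl in ("ONE", "QUORUM"):
        print(cl, read_plan(3, cl))
```

So with RF=3, QUORUM costs one local data read plus one digest from another
replica, rather than a full data read from every replica.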
Thank you very much, guys! I have a much clearer picture now.

Andrey