Re: hadoop consistency level

Michael Kjellman Thu, 18 Oct 2012 13:34:57 -0700

Not sure I understand your question (if there is one...)

You are more than welcome to use CL ONE, and assuming you have Hadoop nodes
in the right places on your ring, things could work out very nicely. If you
need to guarantee that your job sees all the data, then you'll need
to use QUORUM.
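
To spell out why QUORUM gives that guarantee and ONE does not, here is a
rough sketch of the replica-overlap arithmetic. It is plain Java with no
Cassandra classes. The replication factor of 3 and the assumption that the
data was written at QUORUM are mine, purely for illustration.

```java
// Sketch only: plain arithmetic, no Cassandra classes involved.
final class QuorumSketch {
    // Number of replicas that must answer at QUORUM for a given replication factor.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True when any set of read replicas must share at least one replica
    // with any set of write replicas.
    static boolean readOverlapsWrite(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;          // assumed replication factor
        int q = quorum(rf);  // replicas needed for QUORUM at that RF
        System.out.println("QUORUM at RF=" + rf + " needs " + q + " replicas");
        System.out.println("read ONE, write QUORUM overlaps: " + readOverlapsWrite(1, q, rf));
        System.out.println("read QUORUM, write QUORUM overlaps: " + readOverlapsWrite(q, q, rf));
    }
}
```

With RF of 2 or more, quorum() is at least 2, so a QUORUM read always has to
hear from at least one replica besides the local one.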


If you don't specify a CL in your job config, it will default to ONE (at
least that's what my read of the ConfigHelper source for 1.1.6 shows).
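
Roughly, the lookup-with-default behaviour looks like the sketch below. It
uses java.util.Properties as a stand-in for the Hadoop job configuration, so
it is not the real ConfigHelper code. The property key and the helper name
are assumptions on my part; check them against the ConfigHelper source for
your version before relying on them.

```java
import java.util.Properties;

// Stand-in for the job configuration; this is NOT Hadoop's Configuration class.
final class ReadClDefaultSketch {
    // Assumed property key; verify against ConfigHelper in your Cassandra version.
    static final String READ_CL_KEY = "cassandra.consistencylevel.read";

    // Hypothetical helper: return the configured read CL, or ONE when none is set.
    static String readConsistencyLevel(Properties jobConf) {
        return jobConf.getProperty(READ_CL_KEY, "ONE");
    }

    public static void main(String[] args) {
        Properties jobConf = new Properties();
        // Nothing set yet, so the lookup falls back to ONE.
        System.out.println(readConsistencyLevel(jobConf));
        // An explicit setting overrides the default.
        jobConf.setProperty(READ_CL_KEY, "QUORUM");
        System.out.println(readConsistencyLevel(jobConf));
    }
}
```

So if the job needs QUORUM reads, set it explicitly in the job config rather
than counting on the default.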

On 10/18/12 1:29 PM, "Andrey Ilinykh" <ailin...@gmail.com> wrote:

>On Thu, Oct 18, 2012 at 1:24 PM, Michael Kjellman
><mkjell...@barracuda.com> wrote:
>> Well there is *some* data locality, it's just not guaranteed. My
>> understanding (and someone correct me if I'm wrong) is that
>> ColumnFamilyInputFormat hands out ColumnFamilySplit objects, which extend
>> InputSplit and implement the getLocations() method.
>>
>> 
>> http://hadoop.apache.org/docs/mapreduce/current/api/org/apache/hadoop/mapreduce/InputSplit.html
>>
>> ColumnFamilySplit.java contains logic that does its best to determine
>> which node holds the data for that mapper.
>>
>But there is no guarantee local data is in sync with other nodes, which
>means you effectively have CL ONE. If you want CL QUORUM you have to make a
>remote call, whether the data is local or not.

