Coordination in a distributed system is difficult.  I don't think we
can fix hinted handoff's (HH's) existing edge cases without introducing
other, more complicated edge cases.

So weekly-or-so repair will remain a common maintenance task for the
foreseeable future.
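
To make the window concrete, here is a toy Python model of the behavior
discussed in this thread.  It is only a sketch, not Cassandra code:
Replica, Coordinator, and every method name are made up for
illustration, and the real write path (timeouts, consistency levels,
gossip) is more involved.  The one assumption it encodes is the one
confirmed below: a hint is stored only for a replica the coordinator
already believes is down.

```python
# Toy model, NOT Cassandra code. All names here are hypothetical.

class Replica:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}

    def write(self, key, value):
        if not self.up:
            raise ConnectionError(self.name + " is down")
        self.data[key] = value


class Coordinator:
    def __init__(self, replicas):
        self.replicas = replicas
        self.believed_down = set()  # what failure detection has noticed so far
        self.hints = []             # (replica name, key, value)

    def write(self, key, value):
        for r in self.replicas:
            if r.name in self.believed_down:
                # Known-down replica: record a hint for later handoff.
                self.hints.append((r.name, key, value))
                continue
            try:
                r.write(key, value)
            except ConnectionError:
                # Replica is down but not yet detected: no hint is
                # recorded, so this replica simply misses the write.
                pass

    def mark_down(self, name):
        self.believed_down.add(name)

    def replay_hints(self, replica):
        # Replica is back: hand off its hints and keep the rest.
        self.believed_down.discard(replica.name)
        remaining = []
        for name, key, value in self.hints:
            if name == replica.name:
                replica.write(key, value)
            else:
                remaining.append((name, key, value))
        self.hints = remaining


def demo():
    a, b = Replica("a"), Replica("b")
    coord = Coordinator([a, b])
    b.up = False               # b dies; nobody has noticed yet
    coord.write("k1", "v1")    # falls in the window: no hint for b
    coord.mark_down("b")       # failure detection catches up
    coord.write("k2", "v2")    # hinted for b
    b.up = True
    coord.replay_hints(b)
    return a.data, b.data
```

In this model, demo() leaves replica b with k2 (delivered by hint
replay) but without k1, which was written inside the detection window.
Only read repair or "nodetool repair" would bring k1 back to b.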

On Wed, Jul 14, 2010 at 4:17 PM, B. Todd Burruss <bburr...@real.com> wrote:
> thx, but disappointing :)
>
> is this just something we have to live with and periodically "repair"
> the nodes?  or is there future work to tighten up the window?
>
> thx
>
>
> On Wed, 2010-07-14 at 12:13 -0700, Jonathan Ellis wrote:
>> On Wed, Jul 14, 2010 at 1:43 PM, B. Todd Burruss <bburr...@real.com> wrote:
>> > there is a window of time from when a node goes down and when the rest
>> > of the cluster actually realizes that it is down.
>> >
>> > what happens to writes during this time frame?  does hinted handoff
>> > record these writes and then "handoff" when the down node returns?  or
>> > does hinted handoff not kick in until the cluster realizes the node is
>> > down?
>>
>> the latter.
>>
>> > ... is the only way these missed writes are repaired is through read
>> > repair and/or manually kicking off "nodetool repair"?
>>
>> yes.
>>
>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
