Sorry for not checking the source to see whether things have changed, but I
just remembered an issue I forgot to file a JIRA ticket for.

In the old days, nodes would periodically try to deliver their hint queues.

At some stage this was changed so that hints are only delivered when a node
is marked up.

However, you can definitely have a scenario where A fails to deliver to B, so
it sends the hint to C instead.

In that case B is not really down; it just could not accept that packet at
that time. C always (correctly, in this case) thinks B is up, so it never
tries to deliver the hints to B.

Will this change fix this, or do we need to bring back the thread that
periodically tried to deliver hints regardless of node status changes?
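
To make the second option concrete, here is a rough sketch of what such a
periodic sweep could look like. This is not the old Cassandra code: the class
name PeriodicHintRetry and the three callbacks are made up for illustration,
and endpoints are plain strings to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical sketch, not Cassandra's API.
class PeriodicHintRetry {
    private final Supplier<Set<String>> endpointsWithHints; // assumed: endpoints we hold hints for
    private final Predicate<String> isAlive;                // assumed: failure-detector lookup
    private final Consumer<String> deliverHints;            // assumed: delivery entry point

    PeriodicHintRetry(Supplier<Set<String>> endpointsWithHints,
                      Predicate<String> isAlive,
                      Consumer<String> deliverHints) {
        this.endpointsWithHints = endpointsWithHints;
        this.isAlive = isAlive;
        this.deliverHints = deliverHints;
    }

    // One sweep: attempt delivery to every endpoint that has hints and looks
    // alive, whether or not its up/down state changed recently.
    List<String> sweep() {
        List<String> attempted = new ArrayList<>();
        for (String endpoint : endpointsWithHints.get()) {
            if (isAlive.test(endpoint)) {
                deliverHints.accept(endpoint);
                attempted.add(endpoint);
            }
        }
        return attempted;
    }

    // Runs sweep() repeatedly on a single background thread.
    ScheduledExecutorService start(long period, TimeUnit unit) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleWithFixedDelay(() -> {
            try {
                sweep();
            } catch (RuntimeException e) {
                // Swallow so one failed sweep does not cancel the later ones.
            }
        }, period, period, unit);
        return executor;
    }
}
```

With something like this, C would retry B on the next sweep even though B
never went through a down/up transition from C's point of view.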

Regards,
Terje

On 1 Dec 2011, at 19:10, Sylvain Lebresne <sylv...@datastax.com> wrote:

> You're right, good catch.
> Do you mind opening a ticket on jira
> (https://issues.apache.org/jira/browse/CASSANDRA)?
> 
> --
> Sylvain
> 
> On Thu, Dec 1, 2011 at 10:03 AM, Fredrik L Stigbäck
> <fredrik.l.stigb...@sitevision.se> wrote:
>> Hi,
>> We're running Cassandra 1.0.3.
>> I've done some testing with 2 nodes (node A, node B), replication factor 2.
>> I take node A down, write some data to node B, and then bring node A back up.
>> Sometimes hints aren't delivered when node A comes up.
>> 
>> I've done some debugging in org.apache.cassandra.db.HintedHandOffManager and
>> sometimes node B ends up in a strange state in method
>> org.apache.cassandra.db.HintedHandOffManager.deliverHints(final InetAddress
>> to), where org.apache.cassandra.db.HintedHandOffManager.queuedDeliveries
>> already has node A in its Set and therefore no hints will ever be delivered
>> to node A.
>> The only reason for this that I can see is that in
>> org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(InetAddress
>> endpoint) the hintStore.isEmpty() check returns true and the endpoint (node
>> A) isn't removed from
>> org.apache.cassandra.db.HintedHandOffManager.queuedDeliveries. Then no hints
>> will ever be delivered again until node B is restarted.
>> Under what conditions will hintStore.isEmpty() return true?
>> Shouldn't the hintStore.isEmpty() check be inside the try {} finally{}
>> clause, removing the endpoint from queuedDeliveries in the finally block?
>> 
>> public void deliverHints(final InetAddress to)
>> {
>>         logger_.debug("deliverHints to {}", to);
>>         if (!queuedDeliveries.add(to))
>>             return;
>>         .......
>> }
>> 
>> private void deliverHintsToEndpoint(InetAddress endpoint) throws
>> IOException, DigestMismatchException, InvalidRequestException,
>> TimeoutException,
>> {
>>     ColumnFamilyStore hintStore =
>>         Table.open(Table.SYSTEM_TABLE).getColumnFamilyStore(HINTS_CF);
>>     if (hintStore.isEmpty())
>>         return; // nothing to do, don't confuse users by logging a no-op handoff
>>     try
>>     {
>>         ......
>>     }
>>     finally
>>     {
>>         queuedDeliveries.remove(endpoint);
>>     }
>> }
>> 
>> Regards
>> /Fredrik
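
For reference, the change Fredrik suggests (moving the isEmpty() check inside
the try block so that the finally clause always removes the endpoint) could
look roughly like the sketch below. It is a simplified stand-alone model, not
the real HintedHandOffManager: HintStore, HintDeliverySketch and the
deliveriesRun counter are invented for illustration, and delivery runs inline
rather than being scheduled on an executor.

```java
import java.net.InetAddress;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the hints column family store.
interface HintStore {
    boolean isEmpty();
}

// Simplified model of the queuedDeliveries bookkeeping, not Cassandra's class.
class HintDeliverySketch {
    // Endpoints with a delivery queued or in progress.
    final Set<InetAddress> queuedDeliveries = ConcurrentHashMap.newKeySet();
    private final HintStore hintStore;
    int deliveriesRun = 0; // counts runs of the (elided) delivery logic

    HintDeliverySketch(HintStore hintStore) {
        this.hintStore = hintStore;
    }

    // Returns false if a delivery to this endpoint is already queued.
    boolean deliverHints(InetAddress to) {
        if (!queuedDeliveries.add(to))
            return false;
        deliverHintsToEndpoint(to); // assumption: run inline to keep the sketch short
        return true;
    }

    private void deliverHintsToEndpoint(InetAddress endpoint) {
        try {
            if (hintStore.isEmpty())
                return; // nothing to do; the finally block still runs
            deliveriesRun++; // placeholder for the actual delivery
        } finally {
            // Always clear the marker so a later deliverHints() is not blocked.
            queuedDeliveries.remove(endpoint);
        }
    }
}
```

In this model an early return on an empty hint store no longer leaves the
endpoint stuck in queuedDeliveries, so a later call for the same endpoint is
accepted.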
