Rahul,

I've made some progress in my investigations in the meantime. It seems that the network bandwidth to my remote data center is relatively small, and at the same time my application generates far more write operations than I was expecting, resulting in more replication data to the remote DC.
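To put rough numbers on this, here is a minimal back-of-the-envelope sketch. It assumes, as a simplification, that every megabyte of replication traffic the link cannot carry ends up queued for later delivery; the function names and all figures are hypothetical and not measurements from my cluster.

```python
def hint_backlog_mb(write_rate_mb_s, link_capacity_mb_s, seconds):
    """Backlog (MB) built up while replication traffic exceeds link capacity.

    Simplifying assumption: all excess traffic is queued for later delivery.
    """
    excess = max(0.0, write_rate_mb_s - link_capacity_mb_s)
    return excess * seconds


def seconds_to_drain(backlog_mb, write_rate_mb_s, link_capacity_mb_s):
    """Time needed to drain a backlog once traffic drops below link capacity.

    Returns None when there is no spare capacity, i.e. the backlog never drains.
    """
    spare = link_capacity_mb_s - write_rate_mb_s
    if spare <= 0:
        return None
    return backlog_mb / spare


# Hypothetical figures: a 10-minute spike of 3 MB/s over a 2 MB/s link.
backlog = hint_backlog_mb(write_rate_mb_s=3.0, link_capacity_mb_s=2.0, seconds=600)
print(backlog)

# Draining that backlog afterwards with 1.5 MB/s of regular traffic on the same link.
print(seconds_to_drain(backlog, write_rate_mb_s=1.5, link_capacity_mb_s=2.0))
```

The point of the second function is that a backlog only drains at the rate of the spare capacity, so a short spike can take much longer to recover from than it took to build up.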
In the case of a network hiccup, or a sudden peak in data generated by my application (or both), it seems that the network capacity to the remote DC is simply not sufficient to keep up with the data. This results in the hints piling up.

On top of that, my Cassandra nodes are equipped with a moderate amount of memory (4 GB). This might simply not be enough to maintain the hints and other column families in memtables. When the problem occurs, I can see that the node is very busy flushing the hints memtable to disk, which obviously results in high CPU/IO load.

I've managed to significantly reduce the number of write/delete operations from my application, which should greatly decrease the rate at which the hints CF grows in case of timeouts to the remote DC. I'm also planning to add more memory to the servers.

Can you think of anything else I might have missed?

Thanks for your feedback -- it's highly appreciated!
Tom

On Fri, Dec 6, 2013 at 4:41 PM, Rahul Menon <ra...@apigee.com> wrote:

> Tom,
>
> You should look at phi_convict_threshold and try increasing the value if
> you have too much chatter on your network.
>
> Also, rebuilding the entire node because of an OOM does not make sense.
> Could you please post the C* version that you are using and the heap size
> you have configured?
>
> Thanks
> Rahul
>
> On Tue, Dec 3, 2013 at 7:41 PM, Tom van den Berge <t...@drillster.com> wrote:
>
>> Rahul,
>>
>> This problem occurs every now and then, and currently everything is ok,
>> so there are no hints. But whenever it happens, the hints quickly pile
>> up. This results in heap problems on the node ("Heap is 0.813462
>> full..." appears many times). This in turn results in the flushing of
>> the 'hints' column family, to relieve memory pressure. According to the
>> log message, its size varies between 50 and 60 MB.
>> But since the HintedHandoffManager is reading from the hints CF, it will
>> probably pull it back into a memtable again -- that's at least my
>> understanding of how it works.
>>
>> So I guess that flushing the hints CF while the HintedHandoffManager is
>> working on it only makes things worse, and it could be the reason that
>> the process never ends.
>>
>> What I typically see when this happens is that the hints keep piling up,
>> and eventually the node comes to a grinding halt (OOM). Then I have to
>> rebuild the node entirely (only removing the hints doesn't work).
>>
>> The reason for hints to start accumulating in the first place might be a
>> spike in CF writes that must be replicated to a node in another data
>> center. The available bandwidth to that data center might not be able to
>> handle the data quickly enough, resulting in stored hints. The
>> HintedHandoff task that is started is targeting that remote node.
>>
>> Thanks,
>> Tom
>>
>> On Tue, Dec 3, 2013 at 2:22 PM, Rahul Menon <ra...@apigee.com> wrote:
>>
>>> Tom,
>>>
>>> Do you know why these hints are piling up? What is the size of the
>>> hints CF?
>>>
>>> Thanks
>>> Rahul
>>>
>>> On Tue, Dec 3, 2013 at 6:41 PM, Tom van den Berge <t...@drillster.com> wrote:
>>>
>>>> Hi Rahul,
>>>>
>>>> Thanks for your reply.
>>>>
>>>> I have never seen a message like "Timed out replaying hints to...",
>>>> which is a good thing then, I suppose ;)
>>>>
>>>> Normally, I do see the "Finished hinted handoff..." log message.
>>>> However, every now and then this message is not logged, not even after
>>>> several hours. This is the problem I'm trying to solve.
>>>>
>>>> The log messages you describe are quite coarse-grained; they only tell
>>>> you that a task has started or finished, but not how this task is
>>>> progressing. And that's exactly what I would like to know if I see
>>>> that a task has started, but has not finished after a reasonable
>>>> amount of time.
>>>>
>>>> So I guess the only way to learn the progress is to look inside the
>>>> 'hints' column family then. I'll give that a try.
>>>>
>>>> Thanks,
>>>> Tom
>>>>
>>>> On Tue, Dec 3, 2013 at 1:43 PM, Rahul Menon <ra...@apigee.com> wrote:
>>>>
>>>>> Tom,
>>>>>
>>>>> You should check the size of the hints column family to determine how
>>>>> many are present. The hints are a super column family and its keys
>>>>> are destination tokens. You could look at it if you would like.
>>>>>
>>>>> Hint deliveries and timeouts are logged; you should be seeing
>>>>> something like
>>>>>
>>>>> Timed out replaying hints to {}; aborting ({} delivered
>>>>>
>>>>> OR
>>>>>
>>>>> Finished hinted handoff of {} rows to endpoint {}
>>>>>
>>>>> Thanks
>>>>> Rahul
>>>>>
>>>>> On Tue, Dec 3, 2013 at 2:36 PM, Tom van den Berge <t...@drillster.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Is there a way to monitor the progress of a hinted handoff task?
>>>>>>
>>>>>> I found the following two mbeans providing some info:
>>>>>>
>>>>>> org.apache.cassandra.internal:type=HintedHandoff, which tells me
>>>>>> that there is 1 active task, and
>>>>>> org.apache.cassandra.db:type=HintedHandoffManager#countPendingHints(),
>>>>>> which quite often gives a timeout when executed.
>>>>>>
>>>>>> Ideally, I would like to see how many hints have been sent (e.g.
>>>>>> over the last minute or so), and how many hints are still to be sent
>>>>>> (although I assume that's what countPendingHints normally does?)
>>>>>>
>>>>>> I'm experiencing hinted handoff tasks that are started, but never
>>>>>> finish, so I would like to know what the task is doing.
>>>>>>
>>>>>> My log shows this:
>>>>>>
>>>>>> INFO [HintedHandoff:1] 2013-12-02 13:49:05,325
>>>>>> HintedHandOffManager.java (line 297) Started hinted handoff for
>>>>>> host: 6f80b942-5b6d-4233-9827-3727591abf55 with IP: /10.55.156.66
>>>>>> (nothing more for [HintedHandoff:1])
>>>>>>
>>>>>> The node is up and running, the network connection is ok, no gossip
>>>>>> messages appear in the logs.
>>>>>>
>>>>>> Any idea is welcome.
>>>>>> (Cassandra 1.2.3)
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Drillster BV
>>>>>> Middenburcht 136
>>>>>> 3452MT Vleuten
>>>>>> Netherlands
>>>>>>
>>>>>> +31 30 755 5330
>>>>>>
>>>>>> Open your free account at www.drillster.com