Are you using OSS C*?

On Fri, Mar 29, 2019 at 1:49 AM Jens Fischer <j.fisc...@sonnen.de> wrote:
> Hi,
>
> I have a Cassandra setup with multiple data centres. The vast majority of writes are LOCAL_ONE writes to data center DC-A. One node (let's call this node A1) in DC-A has accumulated large amounts of hint files (~100 GB). In the logs of this node I see lots of messages like the following:
>
> INFO [HintsDispatcher:26] 2019-03-28 01:49:25,217 HintsDispatchExecutor.java:289 - Finished hinted handoff of file db485ac6-8acd-4241-9e21-7a2b540459de-1553419324363-1.hints to endpoint /10.10.2.55: db485ac6-8acd-4241-9e21-7a2b540459de
>
> The node 10.10.2.55 is in DC-B, let's call this node B1. There is no indication whatsoever that B1 was down: nothing in our monitoring, nothing in the logs of B1, nothing in the logs of A1. Are there any other situations where hints to B1 are stored at A1, other than A1's failure detection marking B1 as down? For example, could the reason for the hints be that B1 is overloaded and cannot handle the intake from A1? Or that the network connection between DC-A and DC-B is too slow?
>
> While researching this I also found the following information on Stack Overflow from Ben Slater regarding hints and multi-DC replication:
>
>     Another factor here is the consistency level you are using - a LOCAL_* consistency level will only require writes to be written to the local DC for the operation to be considered a success (and hints will be stored for replication to the other DC).
>     (…)
>     The hints are the records of writes that have been made in one DC that are not yet replicated to the other DC (or even nodes within a DC). I think your options to avoid them are:
>     (1) write with ALL or QUORUM (not LOCAL_*) consistency - this will slow down your writes but will ensure writes go into both DCs before the op completes
>     (2) Don't replicate the data to the second DC (by setting the replication factor to 0 for the second DC in the keyspace definition)
>     (3) Increase the capacity of the second DC so it can keep up with the writes
>     (4) Slow down your writes so the second DC can keep up.
>
> Source: https://stackoverflow.com/a/37382726
>
> This reads like hints are used for “normal” (async) replication between data centres, i.e. hints could show up without any nodes being down whatsoever. This could explain what I am seeing. Does anyone know more about this? Does that mean I will see hints even if I disable hinted handoff?
>
> Any pointers or help are greatly appreciated!
>
> Thanks in advance
> Jens
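
To see which endpoints the hints on A1 are actually being dispatched to, it may help to tally the "Finished hinted handoff" messages per target IP. Below is a rough Python sketch, not an official tool: the regular expression is modelled only on the single log line quoted above, it handles IPv4 addresses only, and the names `HINT_LINE` and `count_hint_dispatches` are made up for illustration. Other Cassandra versions may word the message differently.

```python
import re
from collections import Counter

# Assumption: the pattern mirrors the one log line quoted in this thread,
# e.g. "... Finished hinted handoff of file <name>.hints to endpoint /10.10.2.55: <host-id>".
# IPv4 only; an optional hostname before the slash is tolerated.
HINT_LINE = re.compile(
    r"Finished hinted handoff of file (?P<file>\S+\.hints) "
    r"to endpoint [^/\s]*/(?P<ip>\d+(?:\.\d+){3}):"
)


def count_hint_dispatches(lines):
    """Return a Counter mapping target IP -> number of dispatched hint files."""
    counts = Counter()
    for line in lines:
        match = HINT_LINE.search(line)
        if match:
            counts[match.group("ip")] += 1
    return counts


SAMPLE = (
    "INFO [HintsDispatcher:26] 2019-03-28 01:49:25,217 "
    "HintsDispatchExecutor.java:289 - Finished hinted handoff of file "
    "db485ac6-8acd-4241-9e21-7a2b540459de-1553419324363-1.hints to endpoint "
    "/10.10.2.55: db485ac6-8acd-4241-9e21-7a2b540459de"
)

if __name__ == "__main__":
    # Replace the sample list with open("system.log") to scan a real log file.
    print(count_hint_dispatches([SAMPLE, "some unrelated log line"]))
```

If nearly all dispatches go to nodes in DC-B, that would point at the cross-DC path (B1's load or the inter-DC link) rather than at a problem local to DC-A.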