Hello Nanda,

> what is the impact of increasing the duration of the max_hint_window_in_ms

You might want to be aware of the relations between 'max_hint_window_in_ms', 'gc_grace_seconds' and TTLs, to avoid side effects and get only the desired impact. My colleague Radovan wrote about this here: http://thelastpickle.com/blog/2018/03/21/hinted-handoff-gc-grace-demystified.html

Other than that, I think the 3 hours were picked back when hints were saved in a system table. That previous design often led to hints getting stuck, especially when they grew too big. On C*3+, hints are stored as files instead. I do not hear about many issues nowadays, but I have not heard about people trying to increase this value either.

Thus I guess that if you handle the hinted handoff smoothly so that it does not harm the cluster when the node comes back up, you consider the side effects of changing this value as mentioned by Radovan, and you are using Cassandra 3+, you could give it a try.

Also keep in mind that hints are an optimization (they can be disabled). There is no delivery guarantee for hints (or at least that was the case before C*3). This (alone) will not 'allow you' to disable repairs safely.

Now, I don't have experience with storing hints for longer since they have been stored in files. If you go ahead, you should probably try it in a test cluster first. I'd be happy to hear about your experience with it.

---------------------------

Also, I think it might be more important to investigate why nodes are going down and fix this instead/first. More hints might mean more pressure on the nodes, and increasing the hint storage time might end up being counter-productive.

Here are a couple of commands I often use to investigate why nodes are going down; maybe they will be of some help to you:

- grep -e "WARN" -e "ERROR" /var/log/cassandra/system.log
  # Anything in the output there is probably worth your attention. If nodes go down, something should appear here.
- watch -d nodetool tpstats
  # Run this on the worst node at the worst time to see if any threads are stacking up in the 'pending' state. Also check for 'blocked' and 'dropped'.

If you'd like some help with your 'main issue' first, we would need more details and context.

Hope that any of this is of some help :).

C*heers,
-----------------------
Alain Rodriguez - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

On Fri, May 24, 2019 at 19:59, Nandakishore Tokala <nandakishore.tok...@gmail.com> wrote:

> HI All,
>
> what is the impact of increasing the duration of
> the max_hint_window_in_ms, as we are seeing nodes are going down and some
> times we are not bringing them up in 3 hour's and during the repair, we are
> seeing a lot of streaming data, due to the node is down.
>
> so we are planning to increase the max_hint_window_in_ms time so that we
> will less streaming during repair, so is there any drawback in increasing
> the max_hint_window_in_ms?, and what is the ideal time for it(6 hrs, 12
> hrs, 24 hrs)
>
> Thanks
> Nanda
>
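
PS: the 6, 12 and 24 hour candidates from the question can be turned into max_hint_window_in_ms values (the setting lives in cassandra.yaml) with a little arithmetic. Below is a minimal shell sketch of that conversion. It assumes, as I understand the linked post, that the hint window should stay below the table's gc_grace_seconds. The helper names hours_to_ms and window_fits_gc_grace are made up for illustration, and 864000 is only Cassandra's default gc_grace_seconds, so substitute the values from your own tables:

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only; not part of Cassandra or nodetool.

# Convert a hint window given in hours to milliseconds,
# the unit expected by max_hint_window_in_ms.
hours_to_ms() {
    echo $(( $1 * 60 * 60 * 1000 ))
}

# Succeed when a window of $1 hours is strictly shorter than
# a gc_grace_seconds value of $2 seconds.
window_fits_gc_grace() {
    window_s=$(( $1 * 60 * 60 ))
    [ "$window_s" -lt "$2" ]
}

# Assumption: default gc_grace_seconds (10 days). It is a per-table
# setting, so use the lowest value among the tables you care about.
GC_GRACE_SECONDS=864000

for hours in 3 6 12 24; do
    if window_fits_gc_grace "$hours" "$GC_GRACE_SECONDS"; then
        verdict="below gc_grace_seconds"
    else
        verdict="NOT below gc_grace_seconds"
    fi
    echo "${hours}h -> max_hint_window_in_ms: $(hours_to_ms "$hours") ($verdict)"
done
```

This only checks the arithmetic. It says nothing about the extra disk usage or replay pressure that a longer window may bring, which is why a test cluster is still the place to start.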