Nitzan,

When this was happening on my system, it was related to JVM garbage collection (GC) kicking in on the Elasticsearch cluster. While a GC pause is in progress, the node stops processing all messages until it is done. I got around the issue by adding another node to my Elasticsearch cluster.
You can check your GC by running a query against your ES node (note that one of the nodes is the Graylog ES client):

curl -XGET 'http://localhost:9200/_nodes/jvm?pretty=true'

Before I added the node, I was seeing the same thing: the journal queue would max out and Out messages stayed at 0.

Hope that helps,

-Bill

On Thursday, February 16, 2017 at 3:37:02 AM UTC-10, Nitzan Haimovich wrote:
>
> Hi,
>
> We have a cluster of 3 Graylog nodes. Each node has 8 cores and 32 GB of
> memory.
> The cluster works pretty well; we get very nice throughput (around
> 40,000 msgs in and out).
> We encounter a very strange problem, though: sometimes, for no clear
> reason, one or two nodes suddenly stop processing messages and outputting
> to ES. Then we have two options:
> 1. Wait for it to come back to work. This usually happens after the
> journal gets filled.
> 2. Restart the Graylog service.
>
> Any idea why such a thing could happen?
> Let me know if you need me to attach more info.
>
> Thanks!
>
> Nitzan
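P.S. If you want to automate that check, the runtime GC numbers are under the `_nodes/stats/jvm` endpoint rather than `_nodes/jvm` (which only shows static JVM info). Below is a minimal sketch of parsing that response to flag a struggling node; the field names follow the Elasticsearch node-stats API, but the node name, the sample values, and the 85% heap / 500 ms pause thresholds are illustrative assumptions, not recommendations from this thread.

```python
import json

# Abridged sample of a GET /_nodes/stats/jvm response.
# Field names match the Elasticsearch node-stats API; values are made up.
sample = json.loads("""
{
  "nodes": {
    "abc123": {
      "name": "graylog-es-1",
      "jvm": {
        "mem": {"heap_used_percent": 92},
        "gc": {
          "collectors": {
            "old": {"collection_count": 310, "collection_time_in_millis": 185000}
          }
        }
      }
    }
  }
}
""")

# Flag nodes whose heap is nearly full or that average long old-gen GC pauses;
# either condition can stall indexing and back up the Graylog journal.
for node in sample["nodes"].values():
    jvm = node["jvm"]
    heap = jvm["mem"]["heap_used_percent"]
    old = jvm["gc"]["collectors"]["old"]
    avg_pause_ms = old["collection_time_in_millis"] / max(old["collection_count"], 1)
    if heap > 85 or avg_pause_ms > 500:
        print(f"{node['name']}: heap {heap}%, avg old-gen GC pause {avg_pause_ms:.0f} ms")
```

In a live setup you would feed this the output of `curl -XGET 'http://localhost:9200/_nodes/stats/jvm'` instead of the embedded sample.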
