Hello,

After deleting the notification for *"Elasticsearch cluster unhealthy (RED) 
(triggered 6 days ago)"* and rebooting the server, I haven't been notified of 
this problem again.

I still see:

*Elasticsearch cluster*
*The possible Elasticsearch cluster states and more related information is 
available in the Graylog documentation.*
*Elasticsearch cluster is yellow. Shards: 4 active, 0 initializing, 0 
relocating, 4 unassigned. What does this mean?*
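
For reference, this is how the cluster health can be checked directly, using 
the Elasticsearch HTTP address from the startup log below (_cluster/health is 
a standard Elasticsearch API):

curl -s 'http://192.168.1.22:9200/_cluster/health?pretty'

If I understand the yellow state correctly, on a single-node setup the 4 
unassigned shards are replica copies that can never be placed on a second 
node, so they could be switched off (number_of_replicas is a standard index 
setting; please correct me if Graylog expects this to be changed elsewhere):

curl -XPUT 'http://192.168.1.22:9200/_settings' -d '{"index": {"number_of_replicas": 0}}'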

Can I delete the disk journal now, and if so, how?
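
Would something along these lines be the right way on the OVA? (The journal 
path /var/opt/graylog/data/journal is my assumption based on the data 
directory layout shown in the log below; please correct me if it lives 
elsewhere.)

# stop all Graylog services so nothing writes to the journal while it is removed
sudo graylog-ctl stop
# journal path assumed from the OVA data layout; verify before deleting
sudo rm -rf /var/opt/graylog/data/journal
sudo graylog-ctl start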

On Tuesday, January 3, 2017 at 8:57:27 AM UTC+1, cyph...@gmail.com wrote:

> Jochen,
>
> thank you, I looked at the following logs:
>
> root@graylog:/var/log/graylog/elasticsearch# nano current
>
> 2017-01-02_09:16:55.57535 [2017-01-02 10:16:55,574][INFO ][node                     ] [Molecule Man] version[2.3.1], pid[924], build[bd98092/2016-04-04T12:25:05Z]
> 2017-01-02_09:16:55.57604 [2017-01-02 10:16:55,576][INFO ][node                     ] [Molecule Man] initializing ...
> 2017-01-02_09:16:56.80747 [2017-01-02 10:16:56,807][INFO ][plugins                  ] [Molecule Man] modules [reindex, lang-expression, lang-groovy], plugins [kopf], sites [kopf]
> 2017-01-02_09:16:56.84193 [2017-01-02 10:16:56,841][INFO ][env                      ] [Molecule Man] using [1] data paths, mounts [[/var/opt/graylog/data (/dev/sdb1)]], net usable_space [85.1gb], net total_space [98.3gb], spins? [possib$
> 2017-01-02_09:16:56.84211 [2017-01-02 10:16:56,842][INFO ][env                      ] [Molecule Man] heap size [1.7gb], compressed ordinary object pointers [true]
> 2017-01-02_09:16:56.84234 [2017-01-02 10:16:56,842][WARN ][env                      ] [Molecule Man] max file descriptors [64000] for elasticsearch process likely too low, consider increasing to at least [65536]
> 2017-01-02_09:17:02.18937 [2017-01-02 10:17:02,189][INFO ][node                     ] [Molecule Man] initialized
> 2017-01-02_09:17:02.19168 [2017-01-02 10:17:02,191][INFO ][node                     ] [Molecule Man] starting ...
> 2017-01-02_09:17:02.56976 [2017-01-02 10:17:02,569][INFO ][transport                ] [Molecule Man] publish_address {192.168.1.22:9300}, bound_addresses {192.168.1.22:9300}
> 2017-01-02_09:17:02.57613 [2017-01-02 10:17:02,576][INFO ][discovery                ] [Molecule Man] graylog/62ruQcNHSOahWbBEe71egw
> 2017-01-02_09:17:12.66122 [2017-01-02 10:17:12,661][INFO ][cluster.service          ] [Molecule Man] new_master {Molecule Man}{62ruQcNHSOahWbBEe71egw}{192.168.1.22}{192.168.1.22:9300}, reason: zen-disco-join(elected_as_master, [0] joins rec$
> 2017-01-02_09:17:12.73775 [2017-01-02 10:17:12,737][INFO ][http                     ] [Molecule Man] publish_address {192.168.1.22:9200}, bound_addresses {192.168.1.22:9200}
> 2017-01-02_09:17:12.73913 [2017-01-02 10:17:12,739][INFO ][node                     ] [Molecule Man] started
> 2017-01-02_09:17:12.98417 [2017-01-02 10:17:12,984][INFO ][gateway                  ] [Molecule Man] recovered [1] indices into cluster_state
> 2017-01-02_09:17:15.92973 [2017-01-02 10:17:15,929][INFO ][cluster.service          ] [Molecule Man] added {{graylog-52498cb4-349d-494a-8c6b-692fd78e3c6c}{56bjekcxQl6kwDCKKmeGuw}{192.168.1.22}{192.168.1.22:9350}{client=true, data=false, mas$
> 2017-01-02_09:17:17.20882 [2017-01-02 10:17:17,208][INFO ][cluster.routing.allocation] [Molecule Man] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[graylog_0][0], [graylog_0][2], [graylog_0][2], [graylo$
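>
> For reference, the still-unassigned shards can be listed with 
> Elasticsearch's standard _cat API, using the HTTP address from the log 
> above:
>
> curl -s 'http://192.168.1.22:9200/_cat/shards?v'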
>
>
> root@graylog:/var/log/graylog/elasticsearch# nano graylog.log
> [2016-12-30 07:41:38,399][WARN ][index.translog           ] [Slick] [graylog_0][0] failed to delete unreferenced translog files
> java.nio.file.NoSuchFileException: /var/opt/graylog/data/elasticsearch/graylog/nodes/0/indices/graylog_0/0/translog
>         at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
>         at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>         at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>         at sun.nio.fs.UnixFileSystemProvider.newDirectoryStream(UnixFileSystemProvider.java:427)
>         at java.nio.file.Files.newDirectoryStream(Files.java:457)
>         at org.elasticsearch.index.translog.Translog$OnCloseRunnable.handle(Translog.java:726)
>         at org.elasticsearch.index.translog.Translog$OnCloseRunnable.handle(Translog.java:714)
>         at org.elasticsearch.index.translog.ChannelReference.closeInternal(ChannelReference.java:67)
>         at org.elasticsearch.common.util.concurrent.AbstractRefCounted.decRef(AbstractRefCounted.java:64)
>         at org.elasticsearch.index.translog.TranslogReader.close(TranslogReader.java:143)
>         at org.apache.lucene.util.IOUtils.closeWhileHandlingException(IOUtils.java:129)
>         at org.elasticsearch.index.translog.Translog.recoverFromFiles(Translog.java:354)
>         at org.elasticsearch.index.translog.Translog.<init>(Translog.java:179)
>         at org.elasticsearch.index.engine.InternalEngine.openTranslog(InternalEngine.java:208)
>         at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:151)
>         at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:25)
>         at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:1515)
>         at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:1499)
>         at org.elasticsearch.index.shard.IndexShard.internalPerformTranslogRecovery(IndexShard.java:972)
>         at org.elasticsearch.index.shard.IndexShard.performTranslogRecovery(IndexShard.java:944)
>         at org.elasticsearch.index.shard.StoreRecoveryService.recoverFromStore(StoreRecoveryService.java:241)
>         at org.elasticsearch.index.shard.StoreRecoveryService.access$100(StoreRecoveryService.java:56)
>         at org.elasticsearch.index.shard.StoreRecoveryService$1.run(StoreRecoveryService.java:129)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
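>
> (If I read this correctly, Elasticsearch only failed to clean up translog 
> files that were already gone, since the directory named in the exception 
> no longer exists, so the warning itself may be harmless after recovery. 
> Whether the shard got a fresh translog could be checked with:)
>
> ls /var/opt/graylog/data/elasticsearch/graylog/nodes/0/indices/graylog_0/0/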
>
>
> Could it be that this notification:
>
> *Elasticsearch cluster unhealthy (RED) (triggered 6 days ago)*
> *The Elasticsearch cluster state is RED which means shards are unassigned. 
> This usually indicates a crashed and corrupt cluster and needs to be 
> investigated. Graylog will write into the local disk journal. Read how to 
> fix this in the Elasticsearch setup documentation.*
>
> is an old one that has since been resolved?
>
>
> Although I still get:
>
> *Elasticsearch cluster*
> *The possible Elasticsearch cluster states and more related information is 
> available in the Graylog documentation.*
> *Elasticsearch cluster is yellow. Shards: 4 active, 0 initializing, 0 
> relocating, 4 unassigned. What does this mean?*
>
> As mentioned before, we don't mind losing all the data as long as the 
> configurations, dashboards, and streams are preserved, if that somehow 
> helps in resolving these issues.
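>
> (For what it's worth: as far as I understand, dashboards, streams, and 
> other configuration live in MongoDB, not in Elasticsearch, so deleting 
> the message index alone should leave them intact. Assuming graylog_0 is 
> our only index, as the startup log above suggests, would something like 
> this be safe?)
>
> curl -XDELETE 'http://192.168.1.22:9200/graylog_0'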
>
>
>
>
> On Friday, December 30, 2016 at 11:29:18 AM UTC+1, Jochen Schalanda wrote:
>
>> Hi,
>>
>> you first have to fix the health state of your Elasticsearch cluster 
>> before you even think about deleting the Graylog disk journal.
>>
>> Check the Elasticsearch logs for corresponding hints: 
>> http://docs.graylog.org/en/2.1/pages/configuration/file_location.html#omnibus-package
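>>
>> On the OVA you can also follow a service's log live with graylog-ctl 
>> (the tail command comes from the underlying omnibus tooling, if I 
>> remember correctly; use the service names shown by graylog-ctl status):
>>
>> sudo graylog-ctl tail elasticsearch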
>>
>> Cheers,
>> Jochen
>>
>> On Friday, 30 December 2016 08:01:20 UTC+1, cyph...@gmail.com wrote:
>>>
>>> Thank you again, we're almost there:
>>>
>>> df -m
>>> Filesystem     1M-blocks  Used Available Use% Mounted on
>>> udev                1495     1      1495   1% /dev
>>> tmpfs                300     1       300   1% /run
>>> /dev/dm-0          15282  4902      9582  34% /
>>> none                   1     0         1   0% /sys/fs/cgroup
>>> none                   5     0         5   0% /run/lock
>>> none                1500     0      1500   0% /run/shm
>>> none                 100     0       100   0% /run/user
>>> /dev/sda1            236   121       103  55% /boot
>>> /dev/sdb1         100664  8181     87347   9% /var/opt/graylog/data
>>>
>>>
>>> As you predicted, we're still getting errors:
>>>
>>> Elasticsearch cluster unhealthy (RED)
>>> The Elasticsearch cluster state is RED which means shards are 
>>> unassigned. This usually indicates a crashed and corrupt cluster and needs 
>>> to be investigated. Graylog will write into the local disk journal. Read 
>>> how to fix this in the Elasticsearch setup documentation. 
>>> <http://docs.graylog.org/en/2.1/pages/configuration/elasticsearch.html#cluster-status-explained>
>>>
>>> I looked at the link provided above, but I don't know how to delete the 
>>> journal. Any help with this last step would be appreciated.
>>>
>>>
>>> On Wednesday, December 28, 2016 at 4:59:35 PM UTC+1, Edmundo Alvarez 
>>> wrote:
>>>
>>>> This documentation page covers how to extend the disk space in the OVA: 
>>>> http://docs.graylog.org/en/2.1/pages/configuration/graylog_ctl.html#extend-disk-space
>>>>  
>>>>
>>>> Please note that Graylog's journal sometimes gets corrupted when the disk 
>>>> runs out of space. In that case you may need to delete the journal folder. 
>>>>
>>>> Regards, 
>>>> Edmundo 
>>>>
>>>> > On 28 Dec 2016, at 16:04, cyph...@gmail.com wrote: 
>>>> > 
>>>> > Thank you Edmundo. 
>>>> > 
>>>> > It appears we ran out of space. 
>>>> > 
>>>> > df -h 
>>>> > Filesystem      Size  Used Avail Use% Mounted on 
>>>> > udev            1.5G  4.0K  1.5G   1% /dev 
>>>> > tmpfs           300M  388K  300M   1% /run 
>>>> > /dev/dm-0        15G   15G     0 100% / 
>>>> > none            4.0K     0  4.0K   0% /sys/fs/cgroup 
>>>> > none            5.0M     0  5.0M   0% /run/lock 
>>>> > none            1.5G     0  1.5G   0% /run/shm 
>>>> > none            100M     0  100M   0% /run/user 
>>>> > /dev/sda1       236M  121M  103M  55% /boot 
>>>> > 
>>>> > We don't mind losing all the history, we just want the server up and 
>>>> running. If the available space can be extended, even better (keep in mind 
>>>> this is the OVA). Any suggestions? 
>>>> > 
>>>> > On Wednesday, December 28, 2016 at 9:18:24 AM UTC+1, Edmundo Alvarez 
>>>> wrote: 
>>>> > Hello, 
>>>> > 
>>>> > I would start by looking into your logs in /var/log/graylog, 
>>>> especially those in the "server" folder, which may give you some errors to 
>>>> start debugging the issue. 
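>>>> > 
>>>> > For example, the active log is the runit-style "current" file, so 
>>>> something like this should work (exact path assumed from the OVA layout): 
>>>> > 
>>>> > tail -f /var/log/graylog/server/current 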
>>>> > 
>>>> > Hope that helps. 
>>>> > 
>>>> > Regards, 
>>>> > Edmundo 
>>>> > 
>>>> > > On 27 Dec 2016, at 20:55, cyph...@gmail.com wrote: 
>>>> > > 
>>>> > > We've been using Graylog OVA 2.1 for a while now, but it stopped 
>>>> working all of a sudden. 
>>>> > > 
>>>> > > We're getting: 
>>>> > > 
>>>> > >  Server currently unavailable 
>>>> > > We are experiencing problems connecting to the Graylog server 
>>>> running on https://graylog:443/api. Please verify that the server is 
>>>> healthy and working correctly. 
>>>> > > You will be automatically redirected to the previous page once we 
>>>> can connect to the server. 
>>>> > > This is the last response we received from the server: 
>>>> > > Error message 
>>>> > > cannot GET https://graylog:443/api/system/cluster/node (500) 
>>>> > > 
>>>> > > 
>>>> > > ubuntu@graylog:~$ sudo graylog-ctl status 
>>>> > > run: elasticsearch: (pid 32780) 74s; run: log: (pid 951) 10764s 
>>>> > > down: etcd: 0s, normally up, want up; run: log: (pid 934) 10764s 
>>>> > > run: graylog-server: (pid 33146) 35s; run: log: (pid 916) 10764s 
>>>> > > down: mongodb: 0s, normally up, want up; run: log: (pid 924) 10764s 
>>>> > > run: nginx: (pid 32974) 57s; run: log: (pid 914) 10764s 
>>>> > > 
>>>> > > 
>>>> > > How can we begin to troubleshoot the issue, and which logs should we view? 
>>>> > > 
