[ 
https://issues.apache.org/jira/browse/SOLR-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17742055#comment-17742055
 ] 

Houston Putman commented on SOLR-8803:
--------------------------------------

{quote}I can say that in general it is NOT a good idea for Solr to 
automatically restart after an OOME... because chances are that the OOME is 
just going to happen again repeatedly until the root cause is addressed.
{quote}
Yeah, I agree with [~dizzu333] that this is definitely not a general rule, at 
least where I have run Solr. OOMs can happen for a variety of reasons, and in 
many cases users would want Solr restarted.
{quote}But the creation of the log files and core dumps just delays the 
container restart which leads to less nodes handling the high load, which in 
turn leads to a cluster wide crash.
{quote}
This is certainly a tradeoff that needs to be made. Some would want to wait for 
the core dumps; others would want a faster restart. So having a toggle makes a 
lot of sense.

Just to make sure I understand: the {{-XX:+CrashOnOutOfMemoryError}} option is 
not stopping the container from restarting; it is just delaying the restart 
because the JVM has to store the crash state first? If this is the case, then 
I am fine with having a variable {{SOLR_OOM_ACTION=(exit|crash|script)}} (I 
don't like the idea of {{none}}).
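
To make the proposed toggle concrete, here is a minimal shell sketch of how a start script might translate {{SOLR_OOM_ACTION}} into a JVM option. It is only an illustration under stated assumptions: the function name {{oom_jvm_flag}} and the script path are hypothetical, and this is not the code from any of the attached patches. The three JVM flags themselves are standard HotSpot options.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map the proposed SOLR_OOM_ACTION value to one JVM flag.
# The function name and default script path are illustrative assumptions.
oom_jvm_flag() {
  local action="${1:-exit}"                       # default to a fast exit
  local script="${2:-/opt/solr/bin/oom_solr.sh}"  # hypothetical script path
  case "$action" in
    # Exit immediately on the first OOME, without writing crash files.
    exit)   printf '%s\n' '-XX:+ExitOnOutOfMemoryError' ;;
    # Crash the JVM and write crash files first, which takes longer.
    crash)  printf '%s\n' '-XX:+CrashOnOutOfMemoryError' ;;
    # Run a user-supplied command when the first OOME is thrown.
    script) printf '%s\n' "-XX:OnOutOfMemoryError=$script" ;;
    # Anything else (including "none") is rejected.
    *)      printf 'Unsupported SOLR_OOM_ACTION: %s\n' "$action" >&2
            return 1 ;;
  esac
}
```

A start script could then append the result to its JVM options, for example {{SOLR_OPTS="$SOLR_OPTS $(oom_jvm_flag "${SOLR_OOM_ACTION:-exit}")"}}. Rejecting unknown values rather than silently ignoring them matches the preference above for not offering a {{none}} option.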

 

> Generalize OOME handling to work for any OS
> -------------------------------------------
>
>                 Key: SOLR-8803
>                 URL: https://issues.apache.org/jira/browse/SOLR-8803
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 9.0
>            Reporter: Binoy Dalal
>            Assignee: Shawn Heisey
>            Priority: Minor
>              Labels: OOM, oom
>             Fix For: main (10.0), 9.2
>
>         Attachments: SOLR-8803-1.patch, SOLR-8803-10.patch, 
> SOLR-8803-2.patch, SOLR-8803-3.patch, SOLR-8803-4.patch, SOLR-8803-5.patch, 
> SOLR-8803-6.patch, SOLR-8803-7.patch, SOLR-8803-8.patch, SOLR-8803-9.patch, 
> SOLR-8803.patch, oom_win.cmd, solr-8803-build-transcript.txt
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Solr on Windows does not currently have a script to kill the process on OOM 
> errors.
> The idea is to write a batch script that works like the OOM kill script for 
> Linux and kills the Solr process on OOM errors while creating an OOM log file 
> like the one on Linux systems.



