A monthly restart should not be necessary. We did a quarterly restart for one 
Solr 6.x cluster, but haven’t done that for our 8.7 clusters running on Java 
11. 

I was going to check how long ours have been up, but every Solr server was 
restarted 3 days ago to fix the log4j vulnerability.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Dec 13, 2021, at 9:36 AM, Scott <qm...@top-consulting.net> wrote:
> 
> I get this, but is my experience similar to everyone else's? Does swap usage 
> just increase until you run out of it? I've always taken this to be true, 
> and that's why I restart Solr once per month ...
> 
> Is there a way to avoid having to do this? That's how this whole thread got 
> started.
> 
> Thanks!
> 
> -----Original Message-----
> From: Walter Underwood <wun...@wunderwood.org> 
> Sent: Monday, December 13, 2021 12:24 PM
> To: users@solr.apache.org
> Subject: Re: Solr Cloud Node re-join issue
> 
> The heap size setting limits only the Java heap. Java also uses memory that 
> is not in the heap: bytecode, compiled code, thread stacks, thread-local 
> data, mapped memory, etc.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On Dec 13, 2021, at 7:53 AM, Scott <qm...@top-consulting.net> wrote:
>> 
>> So you're saying that Solr can use additional memory on top of what Xmx 
>> limits? It appears the resident size keeps increasing; swap was used at 
>> some point, but it's not actively paging now.
>> 
>> In FreeBSD you can use procstat or vmstat for more info:
>> 
>> root@solrcloud4:/usr/home/scott # procstat -r 57116
>> PID COMM             RESOURCE                          VALUE
>> 57116 java             user time                    07:59:00.567533
>> 57116 java             system time                  00:31:36.921875
>> 57116 java             maximum RSS                         31175620 KB
>> 57116 java             integral shared memory             139301820 KB
>> 57116 java             integral unshared data              46433940 KB
>> 57116 java             integral unshared stack            495295360 KB
>> 57116 java             page reclaims                       11361433
>> 57116 java             page faults                          2919735
>> 57116 java             swaps                                      0
>> 57116 java             block reads                          2918179
>> 57116 java             block writes                         1724654
>> 57116 java             messages sent                       23464275
>> 57116 java             messages received                   13276352
>> 57116 java             signals received                       28619
>> 57116 java             voluntary context switches          66989710
>> 57116 java             involuntary context switches        49835294
>> 
>> So with a SOLR_HEAP value of 16g, I'm now at a total size of 30g, and the 
>> process has already used some swap that it hasn't released yet. This will 
>> keep going until swap usage is 100% and the box crashes.
>> 
>> I guess my questions are:
>> - Why does Solr use more than 16g?
>> - Why isn't swapped memory released?
>> 
>> Thanks!
>> Scott
>> 
>> -----Original Message-----
>> From: Shawn Heisey <apa...@elyograg.org>
>> Sent: Monday, December 13, 2021 12:41 AM
>> To: users@solr.apache.org
>> Subject: Re: Solr Cloud Node re-join issue
>> 
>> On 12/12/2021 4:40 PM, Scott wrote:
>>> However, top still shows Solr using more than 16g. It started at 17g 
>>> and has been steadily growing; now it's at 23g, and soon it will go 
>>> into swap.
>>> 
>>> PID   USERNAME THR PRI NICE SIZE RES SWAP STATE C  TIME  WCPU COMMAND
>>> 57116 solr     165  52    0 235G 23G    0 uwait 1 36:20 9.60% java
>> 
>> I have been doing some experiments with a FreeBSD VM running in vmware 
>> player.  I have very little experience with that OS.
>> 
>> I have openjdk11 and apache-solr-8.10.0 installed using pkg.
>> 
>> It looks like top in FreeBSD has no equivalent to the SHR column seen on 
>> Linux.  Being able to see how much shared memory is being used is critical 
>> to seeing a complete picture of memory usage.  I suspect that if we could 
>> see it, the shared memory would be approximately 7GB when the RES column 
>> says 23GB.  This is something I have seen on Linux, and I have deduced that 
>> the actual memory used by the process will be the RES size minus the SHR 
>> size.  Sometimes the shared memory will get quite large.  I have no idea why 
>> this happens, but it does.
>> 
>> There is a Java tool called "jstat" that can give a very accurate picture of 
>> Java program memory usage.  But when the -XX:+PerfDisableSharedMem option is 
>> given to Java, that tool doesn't work.  That option is added with the 
>> default GC tuning options, because it eliminates a severe performance issue 
>> that is sometimes seen with Java software.
>> 
>> If you add the following to solr.in.sh then jstat will work:
>> 
>> GC_TUNE="-XX:+UseG1GC \
>>  -XX:+ParallelRefProcEnabled \
>>  -XX:MaxGCPauseMillis=250 \
>>  -XX:+UseLargePages \
>>  -XX:+AlwaysPreTouch \
>>  -XX:+ExplicitGCInvokesConcurrent"
>> 
>> After adding it, restart Solr and use the following command with the PID of 
>> the Solr process in place of PID, at a time when the RES column for Solr 
>> goes well beyond 16GB:
>> 
>> sudo jstat -gc -t PID 5000 20 > /tmp/jstat.out
>> 
>> That command will take a little less than two minutes to complete.  Then you 
>> can share the /tmp/jstat.out file using a file sharing website. 
>> Don't try to paste it into email ... the lines are VERY long.
>> 
>> If you add up the columns named S0C, S1C, EC, OC, MC, and CCSC for a given 
>> line, that will be pretty close to the process's total memory usage, in KB.
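
The column sum described above can be sketched in Python as follows. This is an illustration only, not something from the original thread: the function name `sum_capacities_kb` is hypothetical, and the sample row contains made-up numbers rather than real output from this cluster. It assumes the whitespace-separated header that `jstat -gc -t` prints and looks columns up by name, so extra columns in newer Java versions should not matter. The result covers only the memory spaces that jstat reports.

```python
CAPACITY_COLUMNS = ("S0C", "S1C", "EC", "OC", "MC", "CCSC")


def sum_capacities_kb(jstat_text):
    """Sum the capacity columns (in KB) for every data row of `jstat -gc` output."""
    lines = [ln.split() for ln in jstat_text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    # Find each capacity column by name instead of by fixed position.
    indexes = [header.index(name) for name in CAPACITY_COLUMNS]
    return [sum(float(row[i]) for i in indexes) for row in rows]


# Illustrative sample only: these values are invented for the example.
SAMPLE = """\
Timestamp S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT
100.0 0.0 8192.0 0.0 8192.0 1024000.0 512000.0 15745024.0 9000000.0 90000.0 85000.0 11000.0 9500.0 50 1.5 0 0.0 1.5
"""

for total_kb in sum_capacities_kb(SAMPLE):
    print(f"{total_kb:.0f} KB = {total_kb / (1024 * 1024):.2f} GiB")
```

To use it on real data, read the contents of /tmp/jstat.out and pass the text to the function in place of the sample.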
>> 
>> If you want to know what all those columns mean, here's Oracle's 
>> documentation for jstat:
>> 
>> https://docs.oracle.com/javase/8/docs/technotes/tools/unix/jstat.html
>> 
>> I've just gotten a look at the output from vmstat ... looks like that tool 
>> is useless for what I was trying to get from it.  It doesn't have any 
>> columns for swap.  You may have noticed that the si and so columns I 
>> mentioned before are not present.
>> 
>> It is worth noting that in the top output you pasted, the Solr process is 
>> using zero swap.  On the top screen, are there processes with SWAP values 
>> significantly larger than zero?  When you see a problem, what is the output 
>> of "swapinfo"?
>> 
>> Thanks,
>> Shawn
>> 
>> 
>> 
>> This is a private message
