bmahler commented on PR #465:
URL: https://github.com/apache/mesos/pull/465#issuecomment-1907166415
> I'm not very familiar with jemalloc, but presumably the intent is to have, like other allocators, at least one arena per-thread to limit contention - by hard-coding a fixed total number of arenas are we sure it won't affect some workloads?
This is only being hardcoded for the mesos agent specifically, so I guess by workloads you mean the mesos agent in different environments? This patch does fix the known problem of excessive memory consumption on large servers. I'm not sure we're benefiting much from jemalloc's per-thread arenas, since libprocess doesn't run Processes on the same threads consistently, so the affinity isn't there; although I suppose malloc contention could benefit from more per-thread malloc arenas.
Here are some histograms of memory consumption before vs. after going to 4 threads (node counts partially snipped out):
```
30 MB | | |
40 MB | | * |
50 MB | | ***** |
60 MB | | *** |
70 MB | | ****** |
80 MB | | ********* |
90 MB | | ****************** |
100 MB | | ***************************** |
110 MB | | ********************************** |
120 MB | | ******************************************** |
130 MB | | ****************************************************** |
140 MB | | ******************************************************* |
150 MB | | ************************************************************ |
160 MB | | ***************************************************** |
170 MB | | ********************************************* |
180 MB | | ***************************** |
190 MB | | ****************** |
200 MB | | ************ |
210 MB | | ******* |
220 MB | | ***** |
230 MB | | **** |
240 MB | | **** |
250 MB | | **** |
260 MB | | *** |
270 MB | | *** |
280 MB | | ** |
290 MB | | ** |
300 MB | | * |
310 MB | | * |
320 MB | | * |
330 MB | | * |
340 MB | | * |
350 MB | | * |
360 MB | | * |
370 MB | | |
380 MB | | |
390 MB | | |
400 MB | | |
410 MB | | |
420 MB | | |
430 MB | | |
440 MB | | |
450 MB | | |
460 MB | | |
470 MB | | |
480 MB | | |
490 MB | | |
500 MB | | |
```
vs
```
30 MB | | |
40 MB | | |
50 MB | | ********************** |
60 MB | | ************************************************************ |
70 MB | | **************** |
80 MB | | * |
90 MB | | |
100 MB | | |
110 MB | | |
120 MB | | |
130 MB | | |
140 MB | | |
150 MB | | |
160 MB | | |
170 MB | | |
180 MB | | |
190 MB | | |
200 MB | | |
```
Of course, the agents in the second histogram have not run for as long as those in the first, but I can run another analysis and see how it looks currently (years later :)).
> Out of interest, have you considered limiting the number of libprocess threads instead, since you mention that most of them are dormant (I can see the memory savings wouldn't be as significant)? Typically, how many cores do the agent hosts have?
Yes, that's a good suggestion, and we probably should be doing this in addition to limiting jemalloc's arena count. Note that libprocess currently enforces a minimum of 8 worker threads to overcome deadlocking in mesos tests:
https://issues.apache.org/jira/browse/MESOS-818
Looking at a random agent in production with 133 threads: it has about 123 MB of RSS and 1.5 GB of VSS. All these threads are definitely bloating the VSS, but I'm not sure how much they are increasing the RSS. I can report back on how much reducing the thread count helps cut this down, but I probably won't be able to look into this soon.
The mesos agent should probably default the number of libprocess worker threads to some reasonable maximum to avoid creating too many on large servers, but internally we could limit this via the environment variable instead.
We have some servers with 128 reported cores (64 physical cores / 128 hyperthreads), and there are upcoming servers that will have 256 reported cores, so jemalloc is quite wasteful on these servers.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]