On Tue, Mar 15, 2016 at 2:34 PM, Nicholas Nethercote <[email protected]> wrote:
>
> -----------------------------------------------------------------------------
> Conclusion
> -----------------------------------------------------------------------------
>
> The overhead per content process is significant. I can see scope for moderate
> improvements, but I'm having trouble seeing how big improvements can be made.
> Without big improvements, scaling the number of content processes beyond
> 4 (*maybe* 8) won't be possible.
>
> - JS overhead is the biggest factor. We execute a lot of JS code just starting
>   up for each content process -- can that be reduced? We should also consider a
>   smaller nursery size limit for content processes.
>
> - Heap overhead is significant. Reducing the page-cache size could save a
>   couple of MiBs. Improvements beyond that are hard. Turning on jemalloc4
>   *might* help a bit, but I wouldn't bank on it, and there are other
>   complications with that.
>
> - Static data is a big chunk. It's hard to make much of a dent there because
>   it has a *very* long tail.
>
> - The remaining buckets are a lot smaller.

Just to expand upon that, here are the top-level numbers for all three
platforms, both small and large processes. For this computation I assumed that
"explicit" memory is entirely a subset of "resident-unique", which is probably
true or very close to it. (Note: this data looks best with a fixed-width font.)

Linux64, small processes
- resident-unique      38.1 MiB (100%)
  - explicit         - 22.8 MiB (60%)
    - js-non-windows - 11.2 MiB (29%)
    - other          -  7.8 MiB (20%)
    - heap-overhead  -  3.8 MiB (10%)
  - static?          - 15.3 MiB (40%)

Linux64, large processes
- resident-unique      52.6 MiB (100%)
  - explicit         - 38.7 MiB (74%)
    - js-non-windows - 22.3 MiB (42%)
    - other          -  9.8 MiB (19%)
    - heap-overhead  -  6.6 MiB (13%)
  - static?          - 13.9 MiB (26%)

Mac64, small processes
- resident-unique      49.3 MiB (100%)
  - static?          - 27.9 MiB (57%)
  - explicit         - 21.4 MiB (43%)
    - js-non-windows - 11.1 MiB (23%)
    - other          -  6.9 MiB (14%)
    - heap-overhead  -  3.4 MiB ( 7%)

Mac64, large processes
- resident-unique      59.4 MiB (100%)
  - explicit         - 30.1 MiB (51%)
    - js-non-windows - 15.7 MiB (26%)
    - heap-overhead  -  7.7 MiB (13%)
    - other          -  6.7 MiB (11%)
  - static?          - 29.3 MiB (49%)

Win32, small processes
- resident-unique      39.3 MiB (100%)
  - static?          - 23.4 MiB (60%)
  - explicit         - 15.9 MiB (40%)
    - js-non-windows -  8.4 MiB (21%)
    - heap-overhead  -  3.8 MiB (10%)
    - other          -  3.7 MiB ( 9%)

Win32, large processes
- resident-unique      51.6 MiB (100%)
  - explicit         - 28.5 MiB (55%)
    - js-non-windows - 16.1 MiB (31%)
    - heap-overhead  -  6.8 MiB (13%)
    - other          -  5.6 MiB (11%)
  - static?          - 23.1 MiB (45%)
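
For anyone who wants to check the arithmetic, here is a minimal sketch of how
the derived rows follow from the measured ones. It assumes, as stated earlier,
that "explicit" is entirely a subset of "resident-unique", so "static?" is
simply the remainder, and that "other" is whatever part of "explicit" is
neither js-non-windows nor heap-overhead. The function name `breakdown` and
the dict layout are made up for illustration; they are not part of
about:memory or any other tool.

```python
def breakdown(resident_unique, explicit, js_non_windows, heap_overhead):
    """Return {bucket: (MiB, percent of resident-unique)} for one process type."""
    # Assumption: "explicit" is a subset of "resident-unique".
    static = resident_unique - explicit
    # "other" is the part of "explicit" not covered by the two named buckets.
    other = explicit - js_non_windows - heap_overhead
    sizes = {
        "resident-unique": resident_unique,
        "explicit": explicit,
        "js-non-windows": js_non_windows,
        "other": other,
        "heap-overhead": heap_overhead,
        "static?": static,
    }
    return {
        name: (round(mib, 1), round(100 * mib / resident_unique))
        for name, mib in sizes.items()
    }


# Inputs taken from the "Linux64, small processes" table.
linux64_small = breakdown(38.1, 22.8, 11.2, 3.8)
for name, (mib, pct) in linux64_small.items():
    print(f"{name:16} {mib:5.1f} MiB ({pct:3d}%)")
```

Feeding in the measured values from any of the other five tables should
reproduce that table's "static?" and "other" rows in the same way.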

The "resident-unique" increases by 38--59 MiB per content process. That's a bit
lower than erahm got in his measurements, possibly because his methodology
involved doing a lot more work in each content process.

Of that increase:
- "static?" accounts for 26--60%
- "explicit/js-non-windows" accounts for 21--42%
- "explicit/heap-overhead" accounts for 7--13%
- "explicit/other" (everything not accounted for by the above three lines)
  accounts for 9--20%
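
The ranges in this list are just the minimum and maximum of each bucket's
percentage across the six tables. A small sketch of that step, with the
percentages copied by hand from the tables (the variable names are
illustrative only):

```python
# Percent of resident-unique per bucket, copied from the six tables, in order:
# Linux64 small, Linux64 large, Mac64 small, Mac64 large, Win32 small, Win32 large.
percents = {
    "static?":        [40, 26, 57, 49, 60, 45],
    "js-non-windows": [29, 42, 23, 26, 21, 31],
    "heap-overhead":  [10, 13, 7, 13, 10, 13],
    "other":          [20, 19, 14, 11, 9, 11],
}

# Range for each bucket is (smallest share, largest share) over all six cases.
ranges = {name: (min(values), max(values)) for name, values in percents.items()}

for name, (lo, hi) in ranges.items():
    print(f"{name:16} {lo}--{hi}%")
```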

About the "static?" measure -- On Linux64, libxul contains about 5.3 MiB of
static data. Other libraries used by Firefox contain much less. So I don't know
what else is being measured in the "static?" number (i.e. what accounts for the
change in difference between "resident-unique" and "explicit").

Nick
_______________________________________________
dev-platform mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-platform