On Tue, Mar 15, 2016 at 2:34 PM, Nicholas Nethercote <n.netherc...@gmail.com> wrote:
>
> -----------------------------------------------------------------------------
> Conclusion
> -----------------------------------------------------------------------------
>
> The overhead per content process is significant. I can see scope for moderate
> improvements, but I'm having trouble seeing how big improvements can be made.
> Without big improvements, scaling the number of content processes beyond
> 4 (*maybe* 8) won't be possible.
>
> - JS overhead is the biggest factor. We execute a lot of JS code just starting
>   up for each content process -- can that be reduced? We should also consider a
>   smaller nursery size limit for content processes.
>
> - Heap overhead is significant. Reducing the page-cache size could save a
>   couple of MiBs. Improvements beyond that are hard. Turning on jemalloc4
>   *might* help a bit, but I wouldn't bank on it, and there are other
>   complications with that.
>
> - Static data is a big chunk. It's hard to make much of a dent there because
>   it has a *very* long tail.
>
> - The remaining buckets are a lot smaller.

Just to expand upon that, here are the top-level numbers for all three
platforms, both small and large processes. For this computation I assumed that
"explicit" memory is entirely a subset of "resident-unique", which is probably
true or very close to it.

(Note: this data looks best with a fixed-width font.)

Linux64, small processes
- resident-unique        38.1 MiB (100%)
  - explicit             22.8 MiB ( 60%)
    - js-non-window      11.2 MiB ( 29%)
    - other               7.8 MiB ( 20%)
    - heap-overhead       3.8 MiB ( 10%)
  - static?              15.3 MiB ( 40%)

Linux64, large processes
- resident-unique        52.6 MiB (100%)
  - explicit             38.7 MiB ( 74%)
    - js-non-window      22.3 MiB ( 42%)
    - other               9.8 MiB ( 19%)
    - heap-overhead       6.6 MiB ( 13%)
  - static?              13.9 MiB ( 26%)

Mac64, small processes
- resident-unique        49.3 MiB (100%)
  - static?              27.9 MiB ( 57%)
  - explicit             21.4 MiB ( 43%)
    - js-non-window      11.1 MiB ( 23%)
    - other               6.9 MiB ( 14%)
    - heap-overhead       3.4 MiB (  7%)

Mac64, large processes
- resident-unique        59.4 MiB (100%)
  - explicit             30.1 MiB ( 51%)
    - js-non-window      15.7 MiB ( 26%)
    - heap-overhead       7.7 MiB ( 13%)
    - other               6.7 MiB ( 11%)
  - static?              29.3 MiB ( 49%)

Win32, small processes
- resident-unique        39.3 MiB (100%)
  - static?              23.4 MiB ( 60%)
  - explicit             15.9 MiB ( 40%)
    - js-non-window       8.4 MiB ( 21%)
    - heap-overhead       3.8 MiB ( 10%)
    - other               3.7 MiB (  9%)

Win32, large processes
- resident-unique        51.6 MiB (100%)
  - explicit             28.5 MiB ( 55%)
    - js-non-window      16.1 MiB ( 31%)
    - heap-overhead       6.8 MiB ( 13%)
    - other               5.6 MiB ( 11%)
  - static?              23.1 MiB ( 45%)

The "resident-unique" increases by 38--59 MiB per content process. That's a bit
lower than erahm got in his measurements, possibly because his methodology
involved doing a lot more work in each content process.

Of that increase:
- "static?" accounts for 26--60%
- "explicit/js-non-window" accounts for 21--42%
- "explicit/heap-overhead" accounts for 7--13%
- "explicit/other" (everything not accounted for by the above three lines)
  accounts for 9--20%

About the "static?"
measure -- on Linux64, libxul contains about 5.3 MiB of static data. Other
libraries used by Firefox contain much less. So I don't know what else is being
measured in the "static?" number (i.e. what accounts for the rest of the
difference between "resident-unique" and "explicit").

Nick
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform