> Hello,
>    taking latest trunk gcc, I built Firefox and Chromium. Both
> projects compiled without debugging symbols and -O2 on an 8-core
> machine.
>
> Firefox:
> -flto=9, peak memory usage (in LTRANS): 11GB
>
> Chromium:
> -flto=6, peak memory usage (in parallel WPA phase): 16.5GB
I see, the ltrans memory use is however about the same later in the game.

> For details please see attached with graphs. The attachment contains
> also -fmem-report and -fmem-report-wpa.
> I think reduced memory footprint to ~3.5GB is a bit optimistic:
> http://gcc.gnu.org/gcc-4.9/changes.html

I will need to re-measure my setup - it is what I got last time with
basically the same configuration. It depends on parallelism; you should
get a sub-4GB peak with -flto=1, right? We should clarify this in
changes.html.

> Is there any way we can reduce the memory footprint?

Looking at the memreport, we get for ggc memory:

Chromium:
source location                                    Garbage             Freed              Leak          Overhead       Times
cgraph.c:869 (cgraph_create_edge_1)                      0: 0.0%           0: 0.0%   274319552: 4.8%           0: 0.0%   2637688
cgraph.c:510 (cgraph_allocate_node)                      0: 0.0%           0: 0.0%   426228128: 7.5%           0: 0.0%   1299476
toplev.c:960 (realloc_for_line_map)                      0: 0.0%   357908640: 3.8%  1073743896:18.8%         184: 0.0%        10
tree-streamer-in.c:621 (streamer_alloc_tree)     216054000:86.6%  7623611824:80.2%  2536849136:44.5%    57818592:36.0%  69421368
Total                                            249562346        9504578411        5700671942         160593619        97146243

Firefox:
source location                                    Garbage             Freed              Leak          Overhead       Times
cgraph.c:869 (cgraph_create_edge_1)                      0: 0.0%           0: 0.0%   130358176: 6.9%           0: 0.0%   1253444
cgraph.c:510 (cgraph_allocate_node)                      0: 0.0%           0: 0.0%   182236800: 9.7%           0: 0.0%    555600
toplev.c:960 (realloc_for_line_map)                      0: 0.0%    89503888: 5.5%   268468240:14.3%         160: 0.0%        13
tree-streamer-in.c:621 (streamer_alloc_tree)      93089976:77.5%   972848816:59.6%   639230248:33.9%    21332480:32.3%  13496198
Total                                            120076578        1632997043        1883064062          65981723        24732501

So Chromium uses quite a lot more trees and also seems to have about
twice as many functions.

Next time, it is useful to include -Q while collecting the data - it
shows individual GGC runs and also memory usage accounted per pass.
That way we would know if there are a lot more functions to start with,
or just more inlining going on.
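
As a rough sanity check of the comparison above, the Chromium/Firefox ratios can be computed from the Leak and Times columns of the two reports. This is only a minimal Python sketch: the dictionary layout and the `ratios` helper are made up for illustration, and the numbers are copied by hand from the reports above rather than parsed from -fmem-report output.

```python
# Figures copied by hand from the ggc reports above.
# Each entry maps an allocation site to (leak_bytes, times).
# The structure and names here are illustrative, not part of GCC.
chromium = {
    "cgraph_create_edge_1": (274319552, 2637688),
    "cgraph_allocate_node": (426228128, 1299476),
    "streamer_alloc_tree": (2536849136, 69421368),
}
firefox = {
    "cgraph_create_edge_1": (130358176, 1253444),
    "cgraph_allocate_node": (182236800, 555600),
    "streamer_alloc_tree": (639230248, 13496198),
}


def ratios(a, b):
    """Return {site: (leak_ratio, times_ratio)} of report a relative to b."""
    return {
        site: (a[site][0] / b[site][0], a[site][1] / b[site][1])
        for site in a
    }


if __name__ == "__main__":
    for site, (leak, times) in ratios(chromium, firefox).items():
        print(f"{site:24s} leak x{leak:.2f}  allocations x{times:.2f}")
```

With these figures the node and edge allocation counts come out at a bit over 2x, while the tree allocation count is roughly 5x, which fits the reading that Chromium has about twice as many functions but disproportionately more trees.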

I have an older patch that introduces a cache to line map streaming,
reducing its size quite a bit; that should save some of the memory used
by realloc_for_line_map. I will dig it out and update it to the current
tree.

I also wonder where the rest of the memory goes, since the graphs show
about 10GB for Firefox. Some is probably accounting of mmap files, also
gold's memory usage. We collect only some of the memory usage that is
not in ggc. Vectors:

Chromium:
ipa-cp.c:2421 (grow_edge_clone_vectors)            17225752: 6.9%    17225752          1: 0.0%
vec.h:1393 (copy)                                  17291228: 6.9%   100465316    1499009: 3.7%
lto-cgraph.c:141 (lto_symtab_encoder_encode)       30436272:12.2%    53192752       1460: 0.0%
passes.c:2254 (execute_one_pass)                   53853360:21.6%    83885960    1426939: 3.5%
ipa-inline-analysis.c:974 (inline_summary_alloc)   84406056:33.8%   137806000     484472: 1.2%
Total                                             249721648                     40747241

Firefox:
ipa-cp.c:2421 (grow_edge_clone_vectors)             7753312: 6.1%     7753312          1: 0.0%
ipa-inline-analysis.c:4053 (read_inline_edge_sum    8758216: 6.9%    26420804     909584: 4.9%
ipa-ref.c:54 (ipa_record_reference)                10747880: 8.4%    20943288     371083: 2.0%
lto-cgraph.c:141 (lto_symtab_encoder_encode)       19756008:15.5%    23548272       1335: 0.0%
passes.c:2254 (execute_one_pass)                   26769688:21.0%    41942904     716378: 3.9%
ipa-inline-analysis.c:974 (inline_summary_alloc)   40110248:31.5%    62026480     284283: 1.5%
Total                                             127480444                     18430703

That seems as usual. 249MB seems acceptable.

Bitmaps seem to be dominated by ipa-reference. On Chromium this pass
seems to go crazy, having about 800000MB of bitmaps. Perhaps you could
try to get data with -fno-ipa-reference?

We ought to get stats on hashtables, since these probably consume quite
some memory during LTO streaming. Could you perhaps also get
-flto-report?

Honza

> Attachment (due to size restriction):
> https://drive.google.com/file/d/0B0pisUJ80pO1bnV5V0RtWXJkaVU/edit?usp=sharing
>
> Thank you,
> Martin