Hello,

This patch tries to use non-clearing memory allocation where possible. This is especially important for very large functions, where arrays of size on the order of n_basic_blocks or num_ssa_names are allocated to hold sparse data sets. In such cases the overhead of memset becomes measurable (and in some cases even dominates the time spent in a pass, such as the one I recently fixed in ifcvt.c).
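To illustrate the idea outside of GCC, here is a minimal sketch in plain C. It is not taken from the patch: the function name sum_sparse_costs, its parameters, and the placeholder per-block value are all hypothetical. It only shows the invariant that makes a non-clearing allocation safe, namely that every entry is written before it is read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the patch: sums a per-block value over
   a sparse subset of blocks.  TOUCHED lists the block indices in use.  */
long
sum_sparse_costs (size_t n_blocks, const size_t *touched, size_t n_touched)
{
  /* malloc instead of calloc: the array is deliberately left uncleared,
     so no time is spent zeroing entries that are never used.  */
  long *cost = malloc (n_blocks * sizeof *cost);
  long total = 0;
  size_t i;

  if (cost == NULL)
    return -1;

  /* Write only the entries that belong to the sparse set.  */
  for (i = 0; i < n_touched; i++)
    {
      assert (touched[i] < n_blocks);
      cost[touched[i]] = (long) touched[i] * 2;  /* placeholder value */
    }

  /* Read back only the entries written above.  All other entries are
     still uninitialized and must never be read.  */
  for (i = 0; i < n_touched; i++)
    total += cost[touched[i]];

  free (cost);
  return total;
}
```

With calloc the cost of the allocation grows with n_blocks even when only a handful of entries are touched; with malloc the work is proportional to the number of entries actually used.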
This cuts ~20% off the compile time for the test case of PR54146 at -O1. Not bad for a patch that basically only removes a bunch of memsets.

I got another 5% from the changes in tree-ssa-loop-manip.c. A loop over an array of num_ssa_names elements there is expensive and unnecessary, and it helps to put all bitmaps together on a single obstack if you intend to blow them all away at the end (this could be done in a number of other places in the compiler). Clearing livein at the end of add_exit_phis_var also reduces peak memory by ~250MB at that point in the pass pipeline (only to blow up from ~1.5GB peak memory in the GIMPLE optimizers to ~3.6GB in expand, and to ~8.6GB in IRA, but hey, who's counting? :-)

Actually, the worst cases are not fixed by this patch. Those would be IRA (which consumes ~5GB on the test case, out of 8GB total) and tree-PRE. The IRA case looks like it may be hard to fix: it allocates multiple arrays of size O(max_regno) for every loop in init_loop_tree_node. The tree-PRE case is one where the avail arrays are allocated and cleared for every PRE candidate. This looks like a place where a pointer_map should be used instead. I'll tackle that later, once I've addressed more pressing problems in the compilation of the PR54146 test case.

This patch was bootstrapped and tested on powerpc64-unknown-linux-gnu. OK for trunk?

Kudos to the compile farm people, without them I couldn't even hope to get any of this work done!

Ciao!
Steven
memman.diff