http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54146
--- Comment #17 from Steven Bosscher <steven at gcc dot gnu.org> 2012-08-05 18:48:55 UTC ---
(In reply to comment #14)
> if-conversion : 177.26 (but due to loop_optimizer_init)

Hmm, this is not loop_optimizer_init. All time is spent in the two memset
calls in cond_move_process_if_block:

  /* Build a mapping for each block to the value used for each
     register.  */
  max_reg = max_reg_num ();
  size = (max_reg + 1) * sizeof (rtx);
  then_vals = (rtx *) alloca (size);
  else_vals = (rtx *) alloca (size);
  memset (then_vals, 0, size);
  memset (else_vals, 0, size);

There are O(1e6) registers in the test case, and O(1e5) basic blocks. So we
end up memset'ing O(10)MB x O(1e5) times = O(1e6)MB (assuming 64-bit
pointers, so that each rtx is O(10) bytes).

I have a few ideas to fix this problem.
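
To make the estimate above concrete, here is a minimal sketch of the arithmetic in C. It is not GCC code: the helper names are hypothetical. The figures plugged in below (1e6 registers, 1e5 calls, 8-byte pointers) are the order-of-magnitude values quoted in the comment, not measurements, and the sketch assumes roughly one call to cond_move_process_if_block per basic block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes zeroed by one call that clears two arrays
   (then_vals and else_vals) of (max_reg + 1) pointer-sized slots each.  */
static uint64_t
bytes_cleared_per_call (uint64_t max_reg, size_t ptr_size)
{
  assert (ptr_size > 0);
  uint64_t size = (max_reg + 1) * (uint64_t) ptr_size;
  return 2 * size;  /* Two memset calls of `size' bytes each.  */
}

/* Hypothetical helper: total bytes zeroed over n_calls such calls.
   Assumes max_reg stays the same for every call.  */
static uint64_t
total_bytes_cleared (uint64_t max_reg, size_t ptr_size, uint64_t n_calls)
{
  return bytes_cleared_per_call (max_reg, ptr_size) * n_calls;
}
```

With max_reg = 1000000 and 8-byte pointers, bytes_cleared_per_call gives 16000016 bytes (about 16 MB) per call. Over 100000 calls, total_bytes_cleared gives 1600001600000 bytes, about 1.6e6 MB, which matches the O(1e6)MB figure.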