Hi, In split_function() (ipa-split.c), for the newly created call stmt, its block is set to the outermost scope, i.e. DECL_INITIAL(current_function_decl). When we inline this partially outlined function, we create the new block based on the block for the call stmt (in expand_call_inline()). So the new block for the inlined body is in parallel to the function top level scope. This bad block structure will late result in a bad stack layout.
This patch fixes the issue by setting the correct block number. It starts with the split_point entry bb to find the block stmt in the outlined region. The entry_bb maybe an empty BB so we need to follow the CFG until we find a non-empty bb. My early patch tried to use the block info from the first non-empty bb in the outlined regision. But that failed bootstrap as some of the stmts (assignment stmt) do not have the block info (bug?). So in this patch I traverse all the stmts in the bb until getting the block info. Tested with gcc bootstap. Thanks, 2012-04-24 Rong Xu <x...@google.com> * ipa-split.c (split_function): set the correct block for the call statement. Index: ipa-split.c =================================================================== --- ipa-split.c (revision 186634) +++ ipa-split.c (working copy) @@ -948,7 +948,7 @@ int num = 0; struct cgraph_node *node, *cur_node = cgraph_node (current_function_decl); basic_block return_bb = find_return_bb (); - basic_block call_bb; + basic_block call_bb, bb; gimple_stmt_iterator gsi; gimple call; edge e; @@ -958,6 +958,7 @@ gimple last_stmt = NULL; unsigned int i; tree arg; + tree split_loc_block = NULL; if (dump_file) { @@ -1072,6 +1073,33 @@ } } + /* Find the block location of the split region. */ + bb = split_point->entry_bb; + do + { + bool found = false; + + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + if (is_gimple_debug(gsi_stmt(gsi))) + continue; + if ((split_loc_block = gimple_block (gsi_stmt (gsi)))) + { + found = true; + break; + } + } + if (found) + break; + + /* If we reach here, this bb should be an empty unconditional + or fall-throuth branch. We continue with the succ bb. */ + bb = single_succ (bb); + } + while (bb && bitmap_bit_p (split_point->split_bbs, bb->index)); + + gcc_assert (split_loc_block); + /* Now create the actual clone. */ rebuild_cgraph_edges (); node = cgraph_function_versioning (cur_node, NULL, NULL, args_to_skip, @@ -1115,7 +1143,7 @@ VEC_replace (tree, args_to_pass, i, arg); } call = gimple_build_call_vec (node->decl, args_to_pass); - gimple_set_block (call, DECL_INITIAL (current_function_decl)); + gimple_set_block (call, split_loc_block); /* We avoid address being taken on any variable used by split part, so return slot optimization is always possible. Moreover this is -- This patch is available for review at http://codereview.appspot.com/6111050