On Wed, Apr 24, 2013 at 6:59 AM, Jakub Jelinek wrote:
> Also, don't some functions start in cold section and then switch into hot
> section?

Yes, this can happen, and there is nothing in the
find_rarely_executed_basic_blocks_and_crossing_edges algorithm to
prevent it. It's not supposed to happen, though, and it is only
possible to trigger the problem if your profile is based on multiple
runs with different test inputs.

Currently, the partition to which a basic block is assigned is decided
by the return value of probably_never_executed_bb_p. The relevant code
looks like this:

static vec<edge>
find_rarely_executed_basic_blocks_and_crossing_edges (void)
{
  ...
  FOR_EACH_BB (bb)
    {
      if (probably_never_executed_bb_p (cfun, bb))
        BB_SET_PARTITION (bb, BB_COLD_PARTITION);
      else
        BB_SET_PARTITION (bb, BB_HOT_PARTITION);
    }
  ...
}

bool
probably_never_executed_bb_p (struct function *fun, const_basic_block bb)
{
  if (profile_info && flag_branch_probabilities)
    return ((bb->count + profile_info->runs / 2) / profile_info->runs) == 0;
  ...
  return false;
}

Consider a test case which has, say, profile_info->runs==6, and a
function in the test case that is only called in one of the runs, so
that bb->count==1 for its entry block. In that case (1 + 6/2) / 6 == 0
in integer arithmetic, so the entry block will be cold even though the
loop it leads to is hot, and the supposed-to-be-imposed rule that a hot
region is never dominated by a cold region is broken. See attached test
case with resulting .dot file.
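
To make the rounding concrete, here is a minimal stand-alone sketch of
just the arithmetic in the profile-based test. The function name
rounds_to_never_executed is hypothetical and not part of GCC; it only
mirrors the expression quoted above, taking the block count and the
number of runs as plain integers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the expression in
   probably_never_executed_bb_p: a block counts as "probably never
   executed" when its execution count, averaged over all training
   runs and rounded to nearest, is zero.  */
static bool
rounds_to_never_executed (int64_t count, int64_t runs)
{
  return ((count + runs / 2) / runs) == 0;
}
```

With runs == 6, any count below 3 rounds down to zero, so an entry
block executed once (count == 1) is classified cold, while a loop body
in the same function that ran many millions of times is classified
hot.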

IMHO this is a bug in the bbpart implementation, and the checking code
I proposed will expose these bugs.

What find_rarely_executed_basic_blocks_and_crossing_edges should do
is identify hot regions and connect them to the entry block and
(usually) to the exit block. From what I understand from Teresa's
patches, this is what she has implemented.
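
As a rough illustration of the "connect hot regions to the entry
block" idea, here is a toy sketch on a tiny adjacency-matrix CFG. It
is not GCC code and not Teresa's patch: struct toy_cfg and
toy_connect_hot_regions are made-up names, and the approach shown
(promote every block from which a hot block is reachable) is a
deliberate over-approximation that ignores the exit-block side
entirely.

```c
#include <stdbool.h>

#define MAX_BLOCKS 16

/* Toy CFG, not GCC's data structures: edge[p][s] is true when there
   is an edge from block p to block s; hot[b] is the partition.  */
struct toy_cfg
{
  int n_blocks;
  bool edge[MAX_BLOCKS][MAX_BLOCKS];
  bool hot[MAX_BLOCKS];
};

/* Promote to hot every block that has a hot successor, and repeat
   until nothing changes.  Afterwards every block from which a hot
   block is reachable is hot; that includes every dominator of a hot
   block, so no hot block is dominated by a cold one.  */
static void
toy_connect_hot_regions (struct toy_cfg *cfg)
{
  bool changed = true;

  while (changed)
    {
      changed = false;
      for (int p = 0; p < cfg->n_blocks; p++)
        for (int s = 0; s < cfg->n_blocks; s++)
          if (cfg->edge[p][s] && cfg->hot[s] && !cfg->hot[p])
            {
              cfg->hot[p] = true;
              changed = true;
            }
    }
}
```

For a function shaped like foo in the test case (entry, loop header,
loop body, exit), marking only the loop body hot and then calling
toy_connect_hot_regions would pull the header and the entry block into
the hot partition and leave the exit block cold.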

I don't know if we can trigger this situation with the current test
infrastructure.

Ciao!
Steven


$ cat t.c
#define N 1024*1024*1024

unsigned int a[N];

void __attribute__((__noinline__,__noclone__))
foo (void)
{
  unsigned int i;
  for (i = 0; i < N; i = i + 2)
    a[i] = i % 19;
}

int
main (int argc,
      char *argv[] __attribute__((__unused__)))
{
  if (argc > 1)
    foo ();
  return 0;
}

$ ./xgcc -B. -isystem ./include -O2 -fprofile-generate t.c
$ for i in 1 2 3 4 5 ; do ./a.out ; done
$ ./a.out 1
$ ./xgcc -B. -isystem ./include -O2 -fprofile-use t.c \
    -fdump-rtl-bbpart{,-graph} -freorder-blocks-and-partition

Attachment: t.c.199r.bbpart.dot
Description: Binary data