Hi, all

  I am studying what kind of information a compiler can pass to a
binary translator (QEMU, for example) so that the binary translator
can do more aggressive optimization. A previous discussion [1] gave an
example of what I want to do. At the end of that discussion, it turned
out that GCC is unable to maintain the CFG until it emits the assembly.

  Here I want to know whether we can get loop boundaries from GCC, for
example, that addresses 0x0010 and 0x0020 are the start and end of a
loop, respectively. Currently, our binary translator associates a
backward branch with a loop. I don't know whether this simple heuristic
can identify all loops, at least for binaries generated by GCC. Some
studies argue that the heuristic might not work as expected, because
the compiler might perform hot-cold optimization and code repositioning,
which result in backward branches that are NOT loop-back branches.
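
  To make the heuristic concrete, here is a minimal sketch in C. The
struct and function names are made up for illustration; they are not
taken from QEMU or from our translator. It assumes each branch has
already been decoded into its own address and its target address.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one decoded branch (not a QEMU structure). */
struct branch {
    uint64_t pc;      /* address of the branch instruction */
    uint64_t target;  /* address the branch jumps to */
};

/* The heuristic: a branch whose target is at or below its own
   address is treated as a loop-back branch. */
static bool is_backward_branch(const struct branch *b)
{
    return b->target <= b->pc;
}

/* Count the branches the heuristic would flag as loop candidates. */
static size_t count_loop_candidates(const struct branch *v, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (is_backward_branch(&v[i]))
            count++;
    return count;
}
```

A branch from a cold block placed after the hot path back into the hot
path would also be counted here, which is exactly the false positive
described above.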

  I also asked a similar question on the LLVM mailing list [2].

"What I want to do is to locate the range of a for-loop statement in
a binary. For example, given the for-loop statement below,

for (stat1; stat2; stat3) {
  /* do something */
}

  Is it possible to get information about the range (binary addresses)
of the above for-loop, say, 0x0100 - 0x0120?"

That thread concluded that various optimizations may reorder the code,
and that one possible way to disable those optimizations is to insert
inline assembly symbols before and after the loop.
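
A rough sketch of that idea with GCC extended asm is shown below. The
label names are arbitrary, and the noinline/noclone attributes are my
own addition so that each label is emitted only once; this is not a
recipe taken from either thread.

```c
/* Sketch only: marker names are made up. noinline and noclone are GCC
   function attributes, used here so the labels are not duplicated by
   inlining or function cloning. */
__attribute__((noinline, noclone))
int sum_below(int n)
{
    int sum = 0;

    /* Label emitted just before the loop. */
    __asm__ volatile ("loop_begin_marker:" ::: "memory");
    for (int i = 0; i < n; i++)
        sum += i;
    /* Label emitted just after the loop. */
    __asm__ volatile ("loop_end_marker:" ::: "memory");

    return sum;
}
```

After compiling, the two labels should be visible with nm or objdump.
Note that the markers only bracket whatever code the compiler leaves
between them. For a loop as simple as this one, the optimizer may
replace the loop with a closed-form computation, so the result still
has to be checked in the disassembly.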

  Any comments are appreciated.

Regards,
chenwj

[1] http://www.mail-archive.com/gcc@gcc.gnu.org/msg58152.html
[2] http://markmail.org/thread/a2ze4v7o4ez64xmd

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
