Hello Feng Xue OS
Your project is interesting, but ambitious.
I think the major points are:
*whole program analysis*. Static analysis tools like
https://frama-c.com/ or https://github.com/bstarynk/bismon/ could be
relevant, as could projects like https://www.decoder-project.eu/.
Cross-compilation makes things harder.
*abstract interpretation* might be relevant (but it is difficult and
costly to implement). See Wikipedia.
*size of the whole program which is analyzed*. If the entire program
(including system libraries like libc) has e.g. less than ten thousand
routines and less than a million GIMPLE instructions in total, it makes
sense. But if the entire program is as large as the Linux kernel, or the
GCC compiler, or the Firefox browser (all have many millions of lines of
source code) you probably won't be able to do whole program
devirtualization in a few years of human work.
*computed gotos* or *labels as values* (see
https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html for more) make
this difficult. They do exist, and could well be hidden in GNU glibc or
libstdc++ internal code.
*asm statements are difficult*. They usually appear inside your libc.
How would you deal with them?
*Can you afford a month of computer time to compile a large piece of
software* with your whole program devirtualizer? In most cases, no, but
Pitrat's book /Artificial Beings - the conscience of a conscious
machine/ (ISBN 9781848211018) suggests cases where it might make sense
(he describes a "compiler like system" which runs for a month of CPU
time).
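Coming back to the labels-as-values point above, here is a minimal
sketch of the kind of code that makes control flow hard to analyze. It
is my own illustration, not taken from glibc or libstdc++; the opcode
names and the `run` function are hypothetical. It needs GCC or Clang,
since `&&label` and `goto *` are GNU extensions. Every `goto *` jumps to
an address read from a table at run time, so an analyzer has to assume
any label whose address was taken could be the target.

```cpp
// Labels as values: a GNU extension (GCC, Clang), not standard C++.
#include <cstddef>

// Hypothetical opcodes, for illustration only.
enum Op : unsigned char { OP_INC = 0, OP_DBL = 1, OP_HALT = 2 };

// Interprets a tiny bytecode program and returns the accumulator.
// The program must end with OP_HALT.
int run(const unsigned char *code) {
    // &&label yields the address of a label as a void* (GNU extension).
    static void *const dispatch[] = { &&do_inc, &&do_dbl, &&do_halt };
    int acc = 0;
    std::size_t pc = 0;

    goto *dispatch[code[pc++]];  // indirect jump: target known only at run time
do_inc:
    acc += 1;
    goto *dispatch[code[pc++]];
do_dbl:
    acc *= 2;
    goto *dispatch[code[pc++]];
do_halt:
    return acc;
}
```

Calling `run` on the sequence OP_INC, OP_INC, OP_DBL, OP_INC, OP_HALT
computes ((0 + 1 + 1) * 2) + 1.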
My recommendation would be to *first code a simple GCC plugin as a proof
of concept*, which rejects programs that could not be realistically
devirtualized, and otherwise stores somewhere (in some database perhaps)
a representation of them. I worked 3 years full time on
https://github.com/bstarynk/bismon/ to achieve a similar goal (and I
don't claim to have succeeded, and I don't have any more funding). My
guess is that some of its code could be useful to you (in that case,
contact me by email both at work basile.starynkevi...@cea.fr and at home
bas...@starynkevitch.net ....)
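The accept-or-reject decision of such a proof of concept could be
sketched as below. This is plain C++ and deliberately does not touch the
GCC plugin API: `ProgramStats`, its fields and `rejection_reason` are
hypothetical names, and a real plugin would have to fill in the counts
from a pass walking the GIMPLE of every function. The thresholds are
only the rough figures given above, and rejecting any program with
computed gotos or asm statements is an assumption made to keep the first
version small.

```cpp
#include <string>

// Hypothetical whole-program summary; a real plugin would collect these
// counts while walking the GIMPLE of each function.
struct ProgramStats {
    long routines = 0;
    long gimple_instructions = 0;
    long computed_gotos = 0;
    long asm_statements = 0;
};

// Returns an empty string when the program looks small and simple enough,
// otherwise a short reason for rejecting it.
std::string rejection_reason(const ProgramStats &s) {
    // Rough limits quoted from the discussion above; assumptions, not
    // measured bounds.
    constexpr long max_routines = 10000;
    constexpr long max_gimple = 1000000;

    if (s.routines >= max_routines)
        return "too many routines";
    if (s.gimple_instructions >= max_gimple)
        return "too many GIMPLE instructions";
    if (s.computed_gotos > 0)
        return "uses computed gotos";
    if (s.asm_statements > 0)
        return "contains asm statements";
    return "";
}
```

Returning a reason rather than a bare boolean makes it easy to log why a
given program was refused, which helps when writing down what the tool
will not do.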
The most important thing: limit your ambition at first. Write a document
(at least an internal one) stating what you won't do.
Cheers
--
Basile Starynkevitch <bas...@starynkevitch.net>
(only mine opinions / les opinions sont miennes uniquement)
92340 Bourg-la-Reine, France
web page: starynkevitch.net/Basile/