This is the project proposal that I am planning to submit to Google
Summer of Code 2007. It is based on previous work by Jeffrey Law,
Diego Novillo and others. I hope someone will find it interesting and
perhaps would like to act as mentor. Feedback is very welcome (even
negative feedback!). I have other ideas as well, so I would like to
know as soon as possible if this is a bad project or if I am stepping
on someone's toes.

Thanks,

Manuel.


Better Uninitialized Warnings.
Manuel López-Ibáñez

Synopsis

The GNU Compiler Collection warns about the use of uninitialized
variables with the option -Wuninitialized. However, the current
implementation has some perceived shortcomings. On the one hand, some
users would like more verbose and consistent warnings. On the other
hand, some users would like to get as few warnings as possible. The
goal of this project is to implement both possibilities while at the
same time improving the current capabilities.

Rationale

GCC has the ability to warn the user about using the value of an
uninitialized variable. Such a value is undefined and it is never
useful. It is not even useful as a random value, since nothing
guarantees that it is random. However, detecting in general whether a
variable is used before being initialized is equivalent to solving the
halting problem, and thus infeasible. GCC uses the information
gathered by the optimisers to detect some instances and warns about
them when the option -Wuninitialized is given on the command line.
The current implementation has drawn some criticism. First, it only
works when optimisation is enabled through -O1, -O2 or -O3. Second,
the set of false positives varies according to the optimisations
enabled. As a result, the false positives reported also vary widely
when optimisations are added or modified between releases.

What counts as a false positive differs from user to user. Some users
are interested in cases that are hidden by the actions of the
optimizers combined with the current environment. However, many users
aren't, since such a case is hidden precisely because it cannot arise
in the compiled code. The canonical example is [MM05]:

int x;
if (f ())
  x = 3;
return x;

where 'f' always returns non-zero in the current environment, and
thus the test may be optimised away. Here, one group of users would
like to get an uninitialized warning, since 'f' may return zero when
the code is compiled elsewhere. Yet another group of users would
consider a warning about a situation that cannot arise in the
executable being compiled to be spurious.
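
To make the two viewpoints concrete, here is a minimal sketch of the
fragment above as a complete translation unit. The macro HAVE_FEATURE,
the body of 'f' and the name get_value are assumptions made purely for
illustration; [MM05] does not say what 'f' does.

```c
/* HAVE_FEATURE is a hypothetical configuration macro standing in for
   "the current environment".  Here it is fixed to 1.  */
#define HAVE_FEATURE 1

/* Always returns non-zero in this build, so an optimising compiler
   may fold the call to a constant and drop the test in get_value.  */
static int
f (void)
{
  return HAVE_FEATURE;
}

int
get_value (void)
{
  int x;        /* deliberately not initialized */
  if (f ())
    x = 3;
  return x;     /* x is unset only on the path where f () returns 0 */
}
```

With HAVE_FEATURE defined as 1, get_value () returns 3 and the
uninitialized path never executes. If another configuration defined it
as 0, the same source would read 'x' before any assignment, which is
the case the first group of users wants to hear about.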

Another conflict is the desire by some users to get the same warnings
at -O0 as at higher optimisation levels [JB04], while other users
prefer to get as much precision as possible by discarding false
positives at higher levels [RD04]. In addition, a perceived limitation
of the current -Wuninitialized is the fact that it doesn't work
without optimisation. There is no consensus on how to solve this. One
approach may be to perform some dataflow analysis even without
optimisation [DJ01]. However, that would hurt the performance of the
compiler when invoked with optimisation disabled. Another approach
could warn about any potential case, even when dataflow analysis or
other optimisations would easily show that it is a false positive.
This latter approach coincides with the request to warn about any
potential use of an uninitialized variable, even if that case cannot
arise under the current compilation environment.

Proposal

From the analysis above, we can divide users into two groups with
opposite requests. One group of users would like to obtain consistent,
verbose warnings. The other group is interested only in cases that can
actually arise in the executable being compiled, and thus, would
prefer as few false positives as possible.

The proposal of this project is to divide -Wuninitialized into two
different flags:

-Wuninitialized=verbose
"Is there a code path through this function, when considered in
isolation, and without being too clever, under which an uninitialized
value is used?" [MM05]
Produce consistent warnings across architectures and optimization
levels (and ideally releases). Warn about any potential case, even
in unreachable code.

-Wuninitialized=precise
"Is there a code path through this function, when compiled on this
architecture with these flags, etc., for which we might actually use an
uninitialized value?" [MM05]
Produce the most precise warnings possible. That is, the more
optimisations are used, the more false positives are detected and
suppressed. This option can be used with -O0 but it will produce many
false positives. However, it will try to avoid any false positive that
can be detected at that level (some cheap optimisations may be
enabled at -O0 or some limited form of dataflow analysis may be
performed). Therefore, -Wuninitialized=precise at -O0 is different
from -Wuninitialized=verbose, since the latter aims to be consistent
while the output at -O0 may vary across releases or architectures.

For example, -Wuninitialized=verbose will warn for:

int i;
int j=5;
if (0)
  j = i; /* 'i' may be used uninitialized */
return j;

Our ability to prove that some cases are harmless depends on the level
of optimisation, so if we want to be consistent,
-Wuninitialized=verbose must always warn about the following:

   int x, f, y;
   f = foo ();
   if (f)
     x = 1;
   y = g ();
   if (f)
     y = x; /* 'x' may be used uninitialized */
   return y;
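
The sketch below wraps this fragment in a function and, next to it,
shows one way a user could restructure the code so that nothing has to
reason about the two correlated tests of 'f'. The bodies of 'foo' and
'g', the variable foo_result and the function names correlated and
restructured are hypothetical stand-ins, not part of the original
example.

```c
/* foo_result is a hypothetical knob so that both paths can be tried;
   the bodies of foo and g are stand-ins, not part of the example.  */
int foo_result = 1;

static int foo (void) { return foo_result; }
static int g (void) { return 7; }

/* The fragment above, unchanged apart from the function wrapper.  */
int
correlated (void)
{
  int x, f, y;
  f = foo ();
  if (f)
    x = 1;
  y = g ();
  if (f)
    y = x;      /* reached only when x was assigned above */
  return y;
}

/* Same result with a single test of f, so there is no variable whose
   initialization depends on an earlier branch.  */
int
restructured (void)
{
  int f = foo ();
  int y = g ();
  if (f)
    y = 1;
  return y;
}
```

In correlated (), 'x' is never read uninitialized at run time, yet
showing that requires noticing that both tests use the same unchanged
'f'. A consistent mode that does not depend on such reasoning would
have to warn here, and the user would silence it with something like
restructured () or by giving 'x' an initial value.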


In addition to this, and as a side effect, the whole implementation of
-Wuninitialized would be reviewed with the goal of closing as many
bugs as possible [PR24639] and implementing some enhancements, such as
detecting accesses to uninitialized arrays [PR10138][PR27120].
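
As an illustration of the general kind of array case meant here, the
following sketch reads elements of a local array that were never
written. It is not taken from the cited bug reports, and the function
name sum_first is hypothetical.

```c
/* Sums the first n elements of a local array in which only the first
   two elements are ever written.  */
int
sum_first (int n)
{
  int a[4];
  int i, sum = 0;

  a[0] = 10;
  a[1] = 20;            /* a[2] and a[3] are never assigned */

  for (i = 0; i < n && i < 4; i++)
    sum += a[i];        /* reads an unset element when n > 2 */
  return sum;
}
```

Calls with n <= 2 are well defined, while larger values of n read
a[2] or a[3] before anything has been stored there.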

Roadmap

The first part of the project would review the current implementation
and past attempts. The proposal of Jeffrey Law about using two
different passes, one of them before any optimisation, seems very
promising [JL05]. Also, there is some code available that may or may
not be outdated but that will certainly be useful [PR24639attached].
The main tangible result of this first part will be a large set of
testcases [GCCTestcases].
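
A testcase in that set might look roughly like the sketch below, which
uses the dg-do, dg-options and dg-warning directives of the GCC
testsuite. The names flag and maybe_unset are hypothetical, and the
expected diagnostic is an assumption: the dg-warning directive has to
sit on whichever line the compiler reports the warning for, and which
line that is has not been verified here.

```c
/* { dg-do compile } */
/* { dg-options "-O1 -Wuninitialized" } */

/* A global the compiler cannot fold away, so the uninitialized path
   stays visible after optimisation.  */
int flag;

int
maybe_unset (void)
{
  int x;
  if (flag)
    x = flag + 2;
  return x;  /* { dg-warning "uninitialized" } */
}
```

The assigned value is deliberately not a constant, so that constant
propagation cannot simply replace 'x' with a known value and hide the
path on which it is never set.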

The second part will implement the proposal described here (modified
according to the feedback received from the GCC developers), while
trying to incorporate the enhancements if they are not too complex.
Otherwise, these may be implemented as additional patches.


[MM05] http://gcc.gnu.org/ml/gcc/2005-11/msg00002.html
[JL05] http://gcc.gnu.org/ml/gcc/2005-11/msg00032.html
[JB04] http://gcc.gnu.org/ml/gcc/2004-12/msg00591.html
[RD04] http://gcc.gnu.org/ml/gcc/2004-12/msg00603.html
[DJ01] http://gcc.gnu.org/ml/gcc/2001-07/msg01213.html
[PR10138] http://gcc.gnu.org/PR10138
[PR24639] http://gcc.gnu.org/PR24639
[PR24639attached] http://gcc.gnu.org/bugzilla/attachment.cgi?id=10181&action=view
[PR27120] http://gcc.gnu.org/PR27120
[GCCTestcases] http://gcc.gnu.org/wiki/HowToPrepareATestcase
