This is the project proposal that I am planning to submit to Google Summer of Code 2007. It is based on previous work by Jeffrey Law, Diego Novillo and others. I hope someone will find it interesting and perhaps would like to act as mentor. Feedback is very welcome (even negative feedback!). I have other ideas as well, so I would like to know as soon as possible if this is a bad project or if I am stepping on someone's toes.

Thanks,
Manuel.

Better Uninitialized Warnings
Manuel López-Ibáñez

Synopsis

The GNU Compiler Collection warns about the use of uninitialized variables with the option -Wuninitialized. However, the current implementation has some perceived shortcomings. On one hand, some users would like more verbose and consistent warnings. On the other hand, some users would like to get as few warnings as possible. The goal of this project is to implement both possibilities while at the same time improving the current capabilities.

Rationale

GCC has the ability to warn the user about using the value of an uninitialized variable. Such a value is undefined and it is never useful. It is not even useful as a random value, since nothing guarantees that it is random. However, detecting in general when a variable is used before being initialized is equivalent to solving the halting problem, and thus infeasible. GCC uses the information gathered by the optimisers to detect some instances and warns about them when the option -Wuninitialized is given on the command line.

Some criticisms have been made of the current implementation. First, it only works when optimisation is enabled through -O1, -O2 or -O3. Second, the set of false positives varies according to the optimisations enabled. This also causes high variability in the false positives reported when optimisations are added or modified between releases.

What counts as a false positive differs from user to user. Some users are interested in cases that are hidden by the actions of the optimisers combined with the current environment. Many others are not, since such a case is hidden precisely because it cannot arise in the compiled code. The canonical example is [MM05]:

    int x;
    if (f ())
      x = 3;
    return x;

where 'f' always returns non-zero in the current environment, and thus the test may be optimised away. Here, one group of users would like to get an uninitialized warning, since 'f' may return zero when compiled elsewhere.
Yet another group of users would consider a warning spurious if it concerns a situation that cannot arise in the executable being compiled.

Another conflict is the desire of some users to get the same warnings at -O0 as at higher optimisation levels [JB04], while other users prefer to get as much precision as possible by discarding false positives at higher levels [RD04].

In addition, a perceived limitation of the current -Wuninitialized is the fact that it doesn't work without optimisation. There is no consensus on how to solve this. One approach may be to perform some dataflow analysis even without optimisation [DJ01]. However, that would hurt the performance of the compiler when invoked with optimisation disabled. Another approach could warn about any potential case, even when dataflow analysis or other optimisations would easily show that it is a false positive. This latter approach coincides with the request to warn about any potential use of an uninitialized variable, even if that case cannot arise under the current compilation environment.

Proposal

From the analysis above, we can divide users into two groups with opposite requests. One group of users would like to obtain consistent, verbose warnings. The other group is interested only in cases that can actually arise in the executable being compiled, and thus would prefer as few false positives as possible. The proposal of this project is to divide -Wuninitialized into two different flags:

-Wuninitialized=verbose

    "Is there a code path through this function, when considered in isolation, and without being too clever, under which an uninitialized value is used?" [MM05]

    Produce consistent warnings across architectures and optimization levels (and ideally releases). Warn about any potential case, even for unreachable code.

-Wuninitialized=precise

    "Is there a code path through this function, when compiled on this architecture with these flags, etc., for which we might actually use an uninitialized value?"
    [MM05]

    Produce the most precise warnings possible. That is, the more optimisations are enabled, the more false positives are detected and suppressed. This option can be used with -O0, but it will produce many false positives. However, it will try to avoid any false positive that can be detected at that level (some cheap optimisations may be enabled at -O0, or some limited form of dataflow analysis may be performed). Therefore, -Wuninitialized=precise at -O0 is different from -Wuninitialized=verbose, since the latter aims to be consistent, while the results of the former at -O0 may vary across releases or architectures.

For example, -Wuninitialized=verbose will warn for:

    int i;
    int j = 5;
    if (0)
      j = i;  /* 'i' may be used uninitialized */
    return j;

Our ability to detect some cases depends on the level of optimisation, so if we want to be consistent, -Wuninitialized=verbose must always warn about the following:

    int x, f, y;
    f = foo ();
    if (f)
      x = 1;
    y = g ();
    if (f)
      y = x;  /* 'x' may be used uninitialized */
    return y;

In addition to this, and as a side effect, the whole implementation of -Wuninitialized would be reviewed with the goal of closing as many bugs as possible [PR24639] and implementing some enhancements, such as detecting access to uninitialized arrays [PR10138][PR27120].

Roadmap

The first part of the project would review the current implementation and past attempts. The proposal of Jeffrey Law to use two different passes, one of them before any optimisation, seems very promising [JL05]. Also, there is some code available that may or may not be outdated but that will certainly be useful [PR24639attached]. The main tangible result of this first part will be a large set of testcases [GCCTestcases].

The second part will implement the proposal described here (modified according to the feedback received from the GCC developers), while trying to incorporate the enhancements if they are not too complex. Otherwise, these may be implemented as additional patches.
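
As a rough illustration of the kind of testcase the first part would collect, the three fragments quoted in this proposal can be wrapped in functions so that they compile on their own. This is only a sketch under assumptions: the function and parameter names (canonical, dead_branch, correlated, cond, flag, g_result) are made up for the example, the calls to 'f', 'foo' and 'g' are replaced by parameters, and which of these uses a given GCC release reports depends on the release and the optimisation level.

```c
/* Sketch only: all function and parameter names are hypothetical.  */

/* Canonical example [MM05]: 'x' is written only when 'cond' is non-zero,
   so the read in the return statement may see an uninitialized value.  */
int
canonical (int cond)
{
  int x;
  if (cond)
    x = 3;
  return x;
}

/* The read of 'i' sits in a branch that can never execute.  A "verbose"
   mode would still report it; a "precise" mode could stay silent.  */
int
dead_branch (void)
{
  int i;
  int j = 5;
  if (0)
    j = i;
  return j;
}

/* 'x' is written and read under the same condition, so no execution
   actually reads it uninitialized.  Proving that requires relating the
   two tests on 'flag', which depends on the analysis performed.  */
int
correlated (int flag, int g_result)
{
  int x, y;
  if (flag)
    x = 1;
  y = g_result;
  if (flag)
    y = x;
  return y;
}
```

Compiling the file with the existing option, for instance with -O2 -Wuninitialized, is one way to see which of the three cases the installed compiler reports. Note that only canonical (0) actually reads an uninitialized value at run time; the other two functions are well defined for every argument.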

[MM05] http://gcc.gnu.org/ml/gcc/2005-11/msg00002.html
[JL05] http://gcc.gnu.org/ml/gcc/2005-11/msg00032.html
[JB04] http://gcc.gnu.org/ml/gcc/2004-12/msg00591.html
[RD04] http://gcc.gnu.org/ml/gcc/2004-12/msg00603.html
[DJ01] http://gcc.gnu.org/ml/gcc/2001-07/msg01213.html
[PR10138] http://gcc.gnu.org/PR10138
[PR24639] http://gcc.gnu.org/PR24639
[PR24639attached] http://gcc.gnu.org/bugzilla/attachment.cgi?id=10181&action=view
[PR27120] http://gcc.gnu.org/PR27120
[GCCTestcases] http://gcc.gnu.org/wiki/HowToPrepareATestcase