Hi, Richard,

Thanks a lot for your suggestion.
Actually, I like this idea.

My understanding of your suggestion is:

1. During gimplification phase:

For each auto-variable that does not have an explicit initializer, insert the
following initializer for it:

X = DEFERRED_INIT (X, INIT)

In which, DEFERRED_INIT is an internal const function, which can be defined as:

DEF_INTERNAL_FN (DEFERRED_INIT, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)

Its two arguments are:

1st argument: this uninitialized auto-variable;
2nd argument: initialization pattern (zero | pattern);

2. During tree to SSA phase:

No change; the current tree to SSA phase should automatically change the
newly inserted statement above to

X_2 = DEFERRED_INIT (X_1(D), INIT);

with all other uses of X_1(D) being replaced by X_2.

3. During expanding phase:

Expand each call to “DEFERRED_INIT (X, INIT)” to zero or pattern, depending on
“INIT”.

Is the above understanding correct? Did I miss anything?

More comments and questions are embedded below:

> On Dec 3, 2020, at 11:32 AM, Richard Sandiford <richard.sandif...@arm.com> wrote:
>
> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> On Tue, Nov 24, 2020 at 4:47 PM Qing Zhao <qing.z...@oracle.com> wrote:
>>> Another issue is, in order to check whether an auto-variable has
>>> initializer, I plan to add a new bit in “decl_common” as:
>>> /* In a VAR_DECL, this is DECL_IS_INITIALIZED. */
>>> unsigned decl_is_initialized :1;
>>>
>>> /* IN VAR_DECL, set when the decl is initialized at the declaration. */
>>> #define DECL_IS_INITIALIZED(NODE) \
>>> (DECL_COMMON_CHECK (NODE)->decl_common.decl_is_initialized)
>>>
>>> set this bit when setting DECL_INITIAL for the variables in FE. then keep it
>>> even though DECL_INITIAL might be NULLed.
>>
>> For locals it would be more reliable to set this flag during gimplification.
>>
>>> Do you have any comment and suggestions?
>>
>> As said above - do you want to cover registers as well as locals?
>> I'd do the actual zeroing during RTL expansion instead since otherwise you
>> have to figure yourself whether a local is actually used (see
>> expand_stack_vars)
>>
>> Note that optimization will already have made use of "uninitialized" state
>> of locals so depending on what the actual goal is here "late" may be too
>> late.
>
> Haven't thought about this much, so it might be a daft idea, but would a
> compromise be to use a const internal function:
>
> X1 = .DEFERRED_INIT (X0, INIT)
>
> where the X0 argument is an uninitialised value and the INIT argument
> describes the initialisation pattern? So for a decl we'd have:
>
> X = .DEFERRED_INIT (X, INIT)
>
> and for an SSA name we'd have:
>
> X_2 = .DEFERRED_INIT (X_1(D), INIT)
>
> with all other uses of X_1(D) being replaced by X_2. The idea is that:
>
> * Having the X0 argument would keep the uninitialised use of the
> variable around for the later warning passes.
>
> * Using a const function should still allow the UB to be deleted as dead
> if X1 isn't needed.

So, current GCC will delete the UB as dead code when X1 is not needed. With
the new option, should we keep this behavior?

>
> * Having a function in the way should stop passes from taking advantage
> of direct uninitialised uses for optimisation.

This will resolve the issue we raised before with directly adding an
“artificial” zero-initializer during gimplification.

However, I am wondering whether the newly added const internal function calls
will affect optimization and thereby change the behavior of the uninitialized
analysis?

>
> This means we won't be able to optimise based on the actual init
> value at the gimple level, but that seems like a fair trade-off.

Yes, with this approach:

At gimple level, we will not be able to optimize based on the newly added init
values;
At RTL level, we will be able to optimize based on the newly added init values.

RTL optimizations will be able to eliminate any redundancy introduced by these
new initializations, which reduces the cost of this option.
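To make the intended run-time effect of the expanded call concrete, here is a
rough user-level sketch in C. It only models the end result of expanding
DEFERRED_INIT (X, INIT) to a zero or pattern fill; it is not GCC code, and it
says nothing about how the expander would actually emit the stores. The names
deferred_init, init_kind, INIT_ZERO, INIT_PATTERN and the 0xFE pattern byte are
placeholders made up for illustration; the real pattern value is not decided
in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the INIT argument; not GCC identifiers.  */
enum init_kind { INIT_ZERO, INIT_PATTERN };

/* Assumed byte for pattern initialization; chosen only for this sketch.  */
#define PATTERN_BYTE 0xFE

/* User-level model of what expanding X = DEFERRED_INIT (X, INIT) would
   amount to: overwrite the whole object with zeros or with the repeated
   pattern byte.  */
static void
deferred_init (void *obj, size_t size, enum init_kind kind)
{
  assert (obj != NULL);
  memset (obj, kind == INIT_ZERO ? 0 : PATTERN_BYTE, size);
}

/* An auto variable without an explicit initializer, followed by the
   initialization that the compiler would insert on the user's behalf.  */
static int
read_uninit_local (enum init_kind kind)
{
  int x;                               /* no explicit initializer */
  deferred_init (&x, sizeof x, kind);  /* models the inserted call */
  return x;                            /* the value is now well defined */
}
```

With INIT_ZERO the function returns 0; with INIT_PATTERN it returns an int
whose bytes are all PATTERN_BYTE. In both cases the read of x is predictable,
which is the point of the option.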
> AIUI this is really a security feature or anti-UB hardening feature
> (in the sense that users are more likely to see predictable behaviour
> “in the field” even if the program has UB).

Yes, this option is for security purposes, and it is currently used in
production by Microsoft, Apple, Google, etc.

Qing

>
> Thanks,
> Richard