On Wed, 12 May 2021, Qing Zhao wrote: > Hi, > > This is the 3rd version of the patch for the new security feature for GCC. > > Please take look and let me know your comments and suggestions. > > thanks. > > Qing > > ******Compare with the 2nd version, the following are the major changes: > > 1. use "lookup_attribute ("uninitialized",) directly instead of adding > one new field "uninitialized" into tree_decl_with_vis. > 2. update documentation to mention that the new option will not confuse > -Wuninitialized, GCC still consider an auto without explicit initializer > as uninitialized. > 3. change the name of "build_pattern_cst" to more specific name as > "build_pattern_cst_for_auto_init". > 4. handling of nested VLA; > Adding new testing cases (auto-init-15/16.c) for this new handling. > 5. Add new verifications of calls to .DEFERRED_INIT in tree-cfg.c; > 6. in tree-sra.c, update the handling of "grp_to_be_debug_replaced", > bind the lhs variable to a call to .DEFERRED_INIT. > 7. In tree-ssa-structalias.c, delete "find_func_aliases_for_deferred_init", > return directly for a call to .DEFERRED_INIT in > "find_func_aliases_for_call". > 8. Add more detailed comments in tree-ssa-uninit.c and tree-ssa.c to explain > the special handling on REALPART_EXPR/IMAGPRT_EXPR. > 9. in build_pattern_cst_for_auto_init: > BOOLEAN_TYPE will be set to zero always; > INTEGER_TYPE (?and ENUMERAL_TYPE) use wi::from_buffer in order to > correctly handle 128-bit integers. > POINTER_TYPE will not assert on SIZE < 32. > REAL_TYPE add fallback; > 10. changed gcc_assert to gcc_unreachable in several places; > 11. add more comments; > 12. some style issue changes. > > ******Please see the version 2 at: > https://gcc.gnu.org/pipermail/gcc-patches/2021-March/567262.html > > > ******The following 2 items are the ones I didn’t addressed in this version > due to further study and might need more discussion: > > 1. Using __builtin_clear_padding to replace type_has_padding. > > My study shows: the call to __builtin_clear_padding is expanded during > gimplification phase. > And there is no __bultin_clear_padding expanding during rtx expanding phase. > If so, for -ftrivial-auto-var-init, padding initialization should be done > both in gimplification phase and rtx expanding phase. > And since the __builtin_clear_padding might not be good for rtx expanding, > reusing __builtin_clear_padding might not work. > > 2. Pattern init to NULLPTR_TYPE and ENUMERAL_TYPE: need more comments from > Richard Biener on this. > > ******The change of the 3rd version compared to the 2nd version are:
+@item -ftrivial-auto-var-init=@var{choice} +@opindex ftrivial-auto-var-init +Initialize automatic variables with either a pattern or with zeroes to increase +the security and predictability of a program by preventing uninitialized memory +disclosure and use. the docs do not state what "trivial" actually means? Does it affect C++ classes with ctors, thus is "trivial" equal to what C++ considers a POD type? diff --git a/gcc/expr.c b/gcc/expr.c index 798285eb52ca..9092349d7017 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -6539,14 +6539,19 @@ store_constructor (tree exp, rtx target, int cleared, poly_int64 size, cleared = 1; } - /* If the constructor has fewer fields than the structure or - if we are initializing the structure to mostly zeros, clear - the whole structure first. Don't do this if TARGET is a - register whose mode size isn't equal to SIZE since - clear_storage can't handle this case. */ + /* If the constructor has fewer fields than the structure, + or if we are initializing the structure to mostly zeros, + or if the user requests to initialize automatic variables with + paddings inside the type, + we should clear the whole structure first. + Don't do this if TARGET is a register whose mode size isn't equal + SIZE since clear_storage can't handle this case. */ else if (known_size_p (size) && (((int) CONSTRUCTOR_NELTS (exp) != fields_length (type)) - || mostly_zeros_p (exp)) + || mostly_zeros_p (exp) + || (flag_trivial_auto_var_init > AUTO_INIT_UNINITIALIZED + && !TREE_STATIC (exp) + && type_has_padding (type))) testing flag_trivial_auto_var_init tests the global options, if TUs are compiled with different setting of flag_trivial_auto_var_init and you use LTO or flag_trivial_auto_var_init is specified per function via optimize attributes it's more appropriate to test opt_for_fn (cfun->decl, flag_trivial_auto_var_init) You do not actually test whether TARGET is an auto-var in this place, so I question this change - the documentation of ftrivial-auto-var-init also doesn't mention initialization of padding and the above doesn't seem to honor -ftrivial-auto-var-init=pattern for this. +enum auto_init_approach { + AUTO_INIT_A = 0, + AUTO_INIT_D = 1 +}; I'm assuming we're going for one implementation in the end - have you made up your mind yet? +/* If FOR_UNINIT is true, GIMPLE_CALL S is a call to builtin_memset that + is known to be emitted for unintialized VLA objects. */ + +static inline void +gimple_call_set_memset_for_uninit (gcall *s, bool for_uninit) seeing this, can you explain why using .DEFERRED_INIT does not work for VLAs? You can build .DEFERRED_INIT (WITH_SIZE_EXPR <decl, size-expression>, type) to carry the initialization size. There's maybe_with_size_expr that you can conveniently use to perform the WITH_SIZE_EXPR wrapping. I'd strongly prefer not going different routes for VLAs vs. on-VLAs. @@ -5001,6 +5185,17 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p, /* If a single access to the target must be ensured and all elements are zero, then it's optimal to clear whatever their number. */ cleared = true; + else if (flag_trivial_auto_var_init > AUTO_INIT_UNINITIALIZED + && !TREE_STATIC (object) + && type_has_padding (type)) + /* If the user requests to initialize automatic variables with + paddings inside the type, we should initialize the paddings too. + C guarantees that brace-init with fewer initializers than members + aggregate will initialize the rest of the aggregate as-if it were + static initialization. In turn static initialization guarantees + that pad is initialized to zero bits. + So, it's better to clear the whole record under such situation. */ + cleared = true; so here we have padding as well - I think this warrants to be controlled by an extra option? And we can maybe split this out to a separate patch? (the whole padding stuff) +/* Expand the IFN_DEFERRED_INIT function according to its second argument. */ +static void +expand_DEFERRED_INIT (internal_fn, gcall *stmt) +{ + tree var = gimple_call_lhs (stmt); + tree init = NULL_TREE; + enum auto_init_type init_type + = (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg (stmt, 1)); + + switch (init_type) + { + default: + gcc_unreachable (); + case AUTO_INIT_PATTERN: + init = build_pattern_cst_for_auto_init (TREE_TYPE (var)); + expand_assignment (var, init, false); + break; + case AUTO_INIT_ZERO: + init = build_zero_cst (TREE_TYPE (var)); + expand_assignment (var, init, false); + break; + } I think actually building build_pattern_cst_for_auto_init can generate massive garbage and for big auto vars code size is also a concern and ideally on x86 you'd produce rep movq. So I don't think going via expand_assignment is good. Instead you possibly want to lower .DEFERRED_INIT to MEMs following expand_builtin_memset and eventually enhance that to allow storing pieces larger than a byte. diff --git a/gcc/tree-core.h b/gcc/tree-core.h index e457b917b98e..2e0e76ea8838 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -1829,7 +1829,7 @@ struct GTY(()) tree_decl_with_vis { unsigned final : 1; /* Belong to FUNCTION_DECL exclusively. */ unsigned regdecl_flag : 1; - /* 14 unused bits. */ + /* 14 unused bits. */ /* 32 more unused on 64 bit HW. */ }; spurious change, please drop. + /* Ignore REALPART_EXPR or IMAGPART_EXPR if its operand is a call to + .DEFERRED_INIT. This is for handling the following case correctly: + + 1 typedef _Complex float C; + 2 C foo(int cond) + 3 { + 4 C f; + 5 __imag__ f = 0; + 6 if (cond) + 7 { + 8 __real__ f = 1; + 9 return f; + 10 } + 11 return f; + 12 } + + with -ftrivial-auto-var-init, compiler will insert the following + artificial initialization at line 4: + f = .DEFERRED_INIT (f, 2); + _1 = REALPART_EXPR <f>; + + without the following special handling, _1 = REALPART_EXPR <f> will + be treated as the uninitialized use point, which is incorrect. (the + real uninitialized use point is at line 11). */ + if (is_gimple_assign (context) + && (gimple_assign_rhs_code (context) == REALPART_EXPR + || gimple_assign_rhs_code (context) == IMAGPART_EXPR)) + { + tree v = gimple_assign_rhs1 (context); + if (TREE_CODE (TREE_OPERAND (v, 0)) == SSA_NAME + && gimple_call_internal_p (SSA_NAME_DEF_STMT (TREE_OPERAND (v, 0)), + IFN_DEFERRED_INIT)) + return; + } will this not mishandle (not report) C foo () { C f; return __real f; } ? I think ssa_undefined_value_p is supposed to catch this, you seem to duplicate something there as well. +/* Returns true when the given TYPE has padding inside it. + return false otherwise. */ +bool +type_has_padding (tree type) +{ + switch (TREE_CODE (type)) + { + case RECORD_TYPE: + { btw, there's __builtin_clear_padding and a whole machinery around it in gimple-fold.c, I'm sure that parts could be re-used if they are neccessary in the end. Otherwise the changes look OK. Do DCE/DSE remove unused .DEFERRED_INIT? Thanks, Richard.