Hi, Richard, Thanks a lot for your review.
> On Apr 23, 2021, at 2:05 PM, Richard Sandiford <richard.sandif...@arm.com> > wrote: > > Finally getting to this now that the GCC 11 rush is over. Sorry for > the slow response. > > I've tried to review most of the code below, but skipped the testsuite > parts in the interests of time. I'll probably have more comments in > future rounds, just wanted to get the ball rolling. > > This is realy Richi's area more than mine though, so please take this > with a grain of salt. > > Qing Zhao <qing.z...@oracle.com> writes: >> 2. initialize all paddings to zero when -ftrivial-auto-var-init is present. >> In expr.c (store_constructor): >> >> Clear the whole structure when >> -ftrivial-auto-var-init and the structure has paddings. >> >> In gimplify.c (gimplify_init_constructor): >> >> Clear the whole structure when >> -ftrivial-auto-var-init and the structure has paddings. > > Just to check: are we sure we want to use zero as the padding fill > value even for -ftrivial-auto-var-init=pattern? Or should it be > 0xAA instead, to match the integer fill pattern? > > I can see the arguments both ways, just thought it was worth asking. For this question, I think Kees had provided the background information on it. Yes, this is basically following Clang’s current implementation in order to match C spec. > >> […] >> @@ -1589,6 +1592,24 @@ handle_retain_attribute (tree *pnode, tree name, tree >> ARG_UNUSED (args), >> return NULL_TREE; >> } >> >> +/* Handle a "uninitialized" attribute; arguments as in > > This occurs in existing code too, but s/a/an/. Okay, will fix it. > >> + struct attribute_spec.handler. */ >> + >> +static tree >> +handle_uninitialized_attribute (tree *node, tree name, tree ARG_UNUSED >> (args), >> + int ARG_UNUSED (flags), bool *no_add_attrs) >> +{ >> + if (VAR_P (*node)) >> + DECL_UNINITIALIZED (*node) = 1; >> + else >> + { >> + warning (OPT_Wattributes, "%qE attribute ignored", name); >> + *no_add_attrs = true; >> + } >> + >> + return NULL_TREE; >> +} >> + >> /* Handle a "externally_visible" attribute; arguments as in >> struct attribute_spec.handler. */ > >> […] >> @@ -11689,6 +11689,34 @@ Perform basic block vectorization on trees. This >> flag is enabled by default at >> @option{-O3} and by @option{-ftree-vectorize}, @option{-fprofile-use}, >> and @option{-fauto-profile}. >> >> +@item -ftrivial-auto-var-init=@var{choice} >> +@opindex ftrivial-auto-var-init >> +Initialize automatic variables with either a pattern or with zeroes to >> increase >> +program security by preventing uninitialized memory disclosure and use. >> + >> +The three values of @var{choice} are: >> + >> +@itemize @bullet >> +@item >> +@samp{uninitialized} doesn't initialize any automatic variables. >> +This is C and C++'s default. >> + >> +@item >> +@samp{pattern} Initialize automatic variables with values which will likely >> +transform logic bugs into crashes down the line, are easily recognized in a >> +crash dump and without being values that programmers can rely on for useful >> +program semantics. >> +The values used for pattern initialization might be changed in the future. >> + >> +@item >> +@samp{zero} Initialize automatic variables with zeroes. >> +@end itemize >> + >> +The default is @samp{uninitialized}. >> + >> +You can control this behavior for a specific variable by using the variable >> +attribute @code{uninitialized} (@pxref{Variable Attributes}). >> + > > I think it's important to say here that GCC still considers the > variables to be uninitialised and still considers reading them to > be undefined behaviour. The option is simply trying to improve the > security and predictability of the program in the presence of these > uninitialised variables. > > I think it would also be worth saying that options like -Wuninitialized > still try to warn about uninitialised variables, although using > -ftrivial-auto-var-init may change which warnings are generated. > > (The above comments are just a summary, not suitable for direct > inclusion. :-)) Agreed. Will update the documentation per your suggestions. > >> @item -fvect-cost-model=@var{model} >> @opindex fvect-cost-model >> Alter the cost model used for vectorization. The @var{model} argument >> […] >> diff --git a/gcc/gimplify.c b/gcc/gimplify.c >> index 6da6698..fafd2e9 100644 >> --- a/gcc/gimplify.c >> +++ b/gcc/gimplify.c >> @@ -1716,6 +1716,116 @@ gimplify_vla_decl (tree decl, gimple_seq *seq_p) >> >> gimplify_and_add (t, seq_p); >> >> + /* Add a call to memset or calls to memcpy to initialize this vla >> + when the user requested. */ >> + if (!DECL_ARTIFICIAL (decl) >> + && VAR_P (decl) >> + && !DECL_EXTERNAL (decl) >> + && !TREE_STATIC (decl) >> + && !DECL_UNINITIALIZED (decl)) >> + switch (flag_trivial_auto_var_init) >> + { >> + case AUTO_INIT_UNINITIALIZED: >> + break; >> + case AUTO_INIT_ZERO: >> + { >> + /* Generate a call to memset to initialize this vla. */ >> + gcall *gs; >> + t = builtin_decl_implicit (BUILT_IN_MEMSET); >> + gs = gimple_build_call (t, 3, addr, integer_zero_node, >> + DECL_SIZE_UNIT (decl)); >> + gimple_call_set_memset_for_uninit (gs, true); >> + gimplify_seq_add_stmt (seq_p, gs); >> + } >> + break; >> + case AUTO_INIT_PATTERN: >> + { >> + /* Generate the following sequence to initialize this vla: >> + if (DECL_SIZE_UNIT (decl) > 0) goto label_true; >> + else goto label_cont; >> + label_true: >> + { >> + element_type = TREE_TYPE (TREE_TYPE (decl)); >> + size_of_element = DECL_SIZE_UNIT (element_type); >> + init_node = build_pattern_cst (element_type); >> + cur = addr; >> + offset = DECL_SIZE_UNIT (decl) - size_of_element; >> + end = addr + offset; >> + >> + label_loop: >> + { >> + memcpy (cur, &init_node, size_of_element); >> + cur += size_of_element; >> + if (cur <= end) goto label_loop; >> + else goto label_cont; >> + } >> + } >> + label_cont: >> + */ >> + >> + tree size_of_element, element_type; >> + tree cur, end, offset; >> + tree init_node, addrof_init_node; >> + tree t; >> + >> + tree label_true = create_artificial_label (UNKNOWN_LOCATION); >> + tree label_cont = create_artificial_label (UNKNOWN_LOCATION); >> + tree label_loop = create_artificial_label (UNKNOWN_LOCATION); >> + >> + element_type = TREE_TYPE (TREE_TYPE (decl)); >> + >> + gcond *cond_stmt = gimple_build_cond (GT_EXPR, DECL_SIZE_UNIT (decl), >> + build_zero_cst (sizetype), >> + label_true, >> + label_cont); >> + gimplify_seq_add_stmt (seq_p, cond_stmt); >> + gimplify_seq_add_stmt (seq_p, gimple_build_label (label_true)); >> + >> + size_of_element = create_tmp_var (sizetype, >> + ".size_of_element"); >> + >> + init_node = create_tmp_var (element_type, ".init_node"); >> + mark_addressable (init_node); >> + addrof_init_node = build_fold_addr_expr_loc (UNKNOWN_LOCATION, >> + init_node); >> + >> + gimplify_assign (size_of_element, TYPE_SIZE_UNIT (element_type), >> + seq_p); >> + gimplify_assign (init_node, build_pattern_cst (element_type), seq_p); > > Rather than building a separate init_node here, would it work to assign > to element 0 instead? Then the loop could start at element 1. I think that should be fine, can save a little time. > > What about nested VLAs, like: > > void g(void *); > void f(int a) > { > int x[a][a]; > g(x); > } > Will check on this. > ? > >> + >> + cur = create_tmp_var (ptr_type, ".cur_addr"); >> + end = create_tmp_var (ptr_type, ".end_addr"); >> + offset = create_tmp_var (sizetype, ".offset"); >> + >> + gimplify_assign (cur, addr, seq_p); >> + gimplify_seq_add_stmt (seq_p, >> + gimple_build_assign (offset, MINUS_EXPR, >> + DECL_SIZE_UNIT (decl), >> + size_of_element)); >> + gimplify_seq_add_stmt (seq_p, >> + gimple_build_assign (end, POINTER_PLUS_EXPR, >> + addr, offset)); >> + >> + gimplify_seq_add_stmt (seq_p, gimple_build_label (label_loop)); >> + >> + t = builtin_decl_implicit (BUILT_IN_MEMCPY); >> + gimplify_seq_add_stmt (seq_p, >> + gimple_build_call (t, 3, cur, >> + addrof_init_node, >> + size_of_element)); >> + gimplify_seq_add_stmt (seq_p, >> + gimple_build_assign (cur, POINTER_PLUS_EXPR, >> + cur, size_of_element)); >> + cond_stmt = gimple_build_cond (LE_EXPR, cur, end, label_loop, >> + label_cont); >> + gimplify_seq_add_stmt (seq_p, cond_stmt); >> + gimplify_seq_add_stmt (seq_p, gimple_build_label (label_cont)); >> + } >> + break; >> + default: >> + gcc_assert (0); >> + } >> + >> /* Record the dynamic allocation associated with DECL if requested. */ >> if (flag_callgraph_info & CALLGRAPH_INFO_DYNAMIC_ALLOC) >> record_dynamic_alloc (decl); >> @@ -1738,6 +1848,65 @@ force_labels_r (tree *tp, int *walk_subtrees, void >> *data ATTRIBUTE_UNUSED) >> return NULL_TREE; >> } >> >> + >> +/* Build a call to internal const function DEFERRED_INIT: >> + 1st argument: DECL; >> + 2nd argument: INIT_TYPE; >> + >> + as DEFERRED_INIT (DECL, INIT_TYPE) >> + >> + DEFERRED_INIT is defined as: >> + DEF_INTERNAL_FN(DEFERRED_INIT, ECF_CONST | ECF_LEAF | ECF_NOTHROW, >> NULL). */ > > I don't think it's necessary to repeat the definition here. Okay, will delete it. > >> + >> +static gimple * >> +build_deferred_init (tree decl, >> + enum auto_init_type init_type) >> +{ >> + tree init_type_node >> + = build_int_cst (integer_type_node, (int) init_type); >> + return gimple_build_call_internal (IFN_DEFERRED_INIT, 2, >> + decl, init_type_node); > > Nit: indentation is off in the return statement. Okay. > >> +} >> + >> […] >> @@ -1831,6 +2000,17 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_p) >> as they may contain a label address. */ >> walk_tree (&init, force_labels_r, NULL, NULL); >> } >> + /* When there is no explicit initializer, if the user requested, >> + We should insert an artifical initializer for this automatic >> + variable for non vla variables. */ > > I think we should explain why we can skip VLAs here. VLA is handled in another place already, it should be initialized with calls to memset/memcpy. > >> + else if (flag_trivial_auto_var_init > AUTO_INIT_UNINITIALIZED >> + && !DECL_UNINITIALIZED (decl) >> + && !TREE_STATIC (decl) >> + && !is_vla) >> + gimple_add_init_for_auto_var (decl, >> + flag_trivial_auto_var_init, >> + flag_auto_init_approach, >> + seq_p); >> } >> >> return GS_ALL_DONE; >> […] >> diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c >> index 7e3aae5..da02be2 100644 >> --- a/gcc/tree-cfg.c >> +++ b/gcc/tree-cfg.c >> @@ -3471,6 +3471,9 @@ verify_gimple_call (gcall *stmt) >> } >> } >> >> + if (gimple_call_internal_p (stmt, IFN_DEFERRED_INIT)) >> + return false; >> + > > Why is this needed? I think a comment would be helpful. Will add the comments. > >> /* ??? The C frontend passes unpromoted arguments in case it >> didn't see a function declaration before the call. So for now >> leave the call arguments mostly unverified. Once we gimplify >> diff --git a/gcc/tree-core.h b/gcc/tree-core.h >> index d2e6c89..fb42c41 100644 >> --- a/gcc/tree-core.h >> +++ b/gcc/tree-core.h >> @@ -1815,6 +1815,7 @@ struct GTY(()) tree_decl_with_vis { >> unsigned in_text_section : 1; >> unsigned in_constant_pool : 1; >> unsigned dllimport_flag : 1; >> + unsigned uninitialized : 1; >> /* Don't belong to VAR_DECL exclusively. */ >> unsigned weak_flag : 1; >> >> @@ -1836,7 +1837,7 @@ struct GTY(()) tree_decl_with_vis { >> unsigned final : 1; >> /* Belong to FUNCTION_DECL exclusively. */ >> unsigned regdecl_flag : 1; >> - /* 14 unused bits. */ >> + /* 13 unused bits. */ > > It doesn't look like DECL_UNINITIALIZED is used in compile-time-sensitive > code, so it might be better to keep "uninitialized" as a normal attribute > and use lookup_attribute where necessary. Okay. > >> /* 32 more unused on 64 bit HW. */ >> }; >> >> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c >> index d177f1b..fe439f7 100644 >> --- a/gcc/tree-sra.c >> +++ b/gcc/tree-sra.c >> @@ -384,6 +384,13 @@ static struct >> >> /* Numbber of components created when splitting aggregate parameters. */ >> int param_reductions_created; >> + >> + /* Number of deferred_init calls that are modified. */ >> + int deferred_init; >> + >> + /* Number of deferred_init calls that are created by >> + generate_subtree_deferred_init. */ >> + int subtree_deferred_init; >> } sra_stats; >> >> static void >> @@ -4070,6 +4077,105 @@ get_repl_default_def_ssa_name (struct access *racc, >> tree reg_type) >> return get_or_create_ssa_default_def (cfun, racc->replacement_decl); >> } >> >> + >> +/* Generate statements to call .DEFERRED_INIT to initialize scalar >> replacements >> + of accesses within a subtree ACCESS, all its children, siblings and their > > IMO reads more easily with “;” rather than “,” before “all” Okay. > >> + children are to be processed. TOP_OFFSET is the offset of the processed >> + subtree which has to be subtracted from offsets of individual accesses to >> + get corresponding offsets for AGG. GSI is a statement iterator used to >> place >> + the new statements. */ >> +static void >> +generate_subtree_deferred_init (struct access *access, tree agg, >> + enum auto_init_type init_type, >> + HOST_WIDE_INT top_offset, >> + gimple_stmt_iterator *gsi, >> + location_t loc) >> +{ >> + do >> + { >> + if (access->grp_to_be_replaced) >> + { >> + tree repl = get_access_replacement (access); >> + tree init_type_node >> + = build_int_cst (integer_type_node, (int) init_type); >> + gimple *call = gimple_build_call_internal (IFN_DEFERRED_INIT, 2, >> + repl, init_type_node); >> + gimple_call_set_lhs (call, repl); > > AFAICT “access” is specifically for the lhs of the original call. > So there seems to be an implicit assumption here that the lhs of the > original call is the same as the first argument of the original call. > Is that guaranteed/required? For call to DEFFERED_INIT, yes, this is guaranteed. > If so, I think it's something that > tree-cfg.c should check. It might also be worth having an assertion > in sra_modify_deferred_init. I can definitely add an assertion to make sure this. > > If the argument and lhs can be different then I think we need to > handle the access patterns for them both. > > Or is the idea that any instance of: > > LHS = .DEFERRED_INIT (RHS, INIT_TYPE) > > can be replaced with: > > LHS = .DEFERRED_INIT (LHS, INIT_TYPE) > > without affecting semantics? If so, I think that would be worth > a comment. > >> + gsi_insert_before (gsi, call, GSI_SAME_STMT); >> + update_stmt (call); > > It seems odd that we need to call update_stmt for a new stmt, but I see > other code in the file also does this. Will double check on this. > >> + gimple_set_location (call, loc); >> + >> + sra_stats.subtree_deferred_init++; >> + } >> + else if (access->grp_to_be_debug_replaced) >> + { >> + /* FIXME, this part might have some issue. */ >> + tree drhs = build_debug_ref_for_model (loc, agg, >> + access->offset - top_offset, >> + access); >> + gdebug *ds = gimple_build_debug_bind (get_access_replacement (access), >> + drhs, gsi_stmt (*gsi)); >> + gsi_insert_before (gsi, ds, GSI_SAME_STMT); > > Would be good to fix the FIXME :-) This is the part I am not very sure, so I added the FIXME in order to get more review and suggestion to make sure it. -:) > > I guess the thing we need to decide here is whether -ftrivial-auto-var-init > should affect debug-only constructs too. Where can I get more details on Debug-only constructs ? > If it doesn't, exmaining removed > components in a debugger might show uninitialised values in cases where > the user was expecting initialised ones. There would be no security > concern, but it might be surprising. > > I think in principle the DRHS can contain a call to DEFERRED_INIT. > Doing that would probably require further handling elsewhere though. What’s DRHS? Where can I get more info on it? > >> + } >> + if (access->first_child) >> + generate_subtree_deferred_init (access->first_child, agg, init_type, >> + top_offset, gsi, loc); >> + >> + access = access ->next_sibling; >> + } >> + while (access); >> +} >> + >> +/* For a call to .DEFERRED_INIT: >> + var = .DEFERRED_INIT (var, init_type); >> + examine the LHS variable VAR and replace it with a scalar replacement if >> + there is one, also replace the RHS call to a call to .DEFERRED_INIT of >> + the corresponding scalar relacement variable. Examine the subtree and >> + do the scalar replacements in the subtree too. STMT is the call, GSI is >> + the statment iterator to place newly created statements. */ > > typo: statement Okay. > >> + >> +static enum assignment_mod_result >> +sra_modify_deferred_init (gimple *stmt, gimple_stmt_iterator *gsi) >> +{ >> + tree lhs = gimple_call_lhs (stmt); >> + enum auto_init_type init_type >> + = (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg (stmt, 1)); >> + struct access *access = get_access_for_expr (lhs); >> + if (!access) >> + return SRA_AM_NONE; >> + location_t loc = gimple_location (stmt); >> + >> + if (access->grp_to_be_replaced) >> + { >> + tree repl = get_access_replacement (access); >> + gimple_call_set_lhs (stmt, repl); >> + gimple_call_set_arg (stmt, 0, repl); >> + sra_stats.deferred_init++; >> + } >> + else if (access->grp_to_be_debug_replaced) >> + { >> + /* FIXME, this part might have some issues. */ >> + tree drepl = get_access_replacement (access); >> + gdebug *ds = gimple_build_debug_bind (drepl, NULL_TREE, >> + gsi_stmt (*gsi)); >> + gsi_insert_before (gsi, ds, GSI_SAME_STMT); >> + } > > Same comments as above. OKay, will try to understand Debug-only construct details. > >> + >> + if (access->first_child) >> + generate_subtree_deferred_init (access->first_child, lhs, >> + init_type, access->offset, >> + gsi, loc); >> + if (access->grp_covered) >> + { >> + unlink_stmt_vdef (stmt); >> + gsi_remove (gsi, true); >> + release_defs (stmt); >> + return SRA_AM_REMOVED; >> + } >> + >> + return SRA_AM_MODIFIED; >> +} >> + >> /* Examine both sides of the assignment statement pointed to by STMT, replace >> them with a scalare replacement if there is one and generate copying of >> replacements if scalarized aggregates have been used in the assignment. >> GSI >> […] >> @@ -4863,6 +4863,29 @@ find_func_aliases_for_builtin_call (struct function >> *fn, gcall *t) >> return false; >> } >> >> +static void >> +find_func_aliases_for_deferred_init (gcall *t) >> +{ >> + tree lhsop = gimple_call_lhs (t); >> + enum auto_init_type init_type >> + = (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg (t, 1)); >> + auto_vec<ce_s, 2> lhsc; >> + auto_vec<ce_s, 4> rhsc; >> + struct constraint_expr temp; >> + >> + get_constraint_for (lhsop, &lhsc); >> + if (init_type == AUTO_INIT_ZERO && flag_delete_null_pointer_checks) >> + temp.var = nothing_id; >> + else >> + temp.var = nonlocal_id; >> + temp.type = ADDRESSOF; >> + temp.offset = 0; >> + rhsc.safe_push (temp); >> + >> + process_all_all_constraints (lhsc, rhsc); >> + return; >> +} >> + > > What's the reasoning behind doing it like this? AFAICT the result > of the call doesn't validly alias anything, regardless of the init type. > Even if there happens to be a valid decl at the address given by the > chosen fill pattern, there's no expectation that accesses to that > decl will be coherent wrt accesses via the result of the call. So, you mean the above change will not have any impact at all? > >> /* Create constraints for the call T. */ >> >> static void >> […] >> diff --git a/gcc/tree-ssa-uninit.c b/gcc/tree-ssa-uninit.c >> index 0800f59..0c61f60 100644 >> --- a/gcc/tree-ssa-uninit.c >> +++ b/gcc/tree-ssa-uninit.c >> @@ -135,6 +135,20 @@ warn_uninit (enum opt_code wc, tree t, tree expr, tree >> var, >> if (is_gimple_assign (context) >> && gimple_assign_rhs_code (context) == COMPLEX_EXPR) >> return; >> + >> + /* Ignore REALPART_EXPR or IMAGPART_EXPR if its operand is >> + a call to .DEFERRED_INIT. */ >> + if (is_gimple_assign (context) >> + && (gimple_assign_rhs_code (context) == REALPART_EXPR >> + || gimple_assign_rhs_code (context) == IMAGPART_EXPR)) >> + { >> + tree v = gimple_assign_rhs1 (context); >> + if (TREE_CODE (TREE_OPERAND (v, 0)) == SSA_NAME >> + && gimple_call_internal_p (SSA_NAME_DEF_STMT (TREE_OPERAND (v, 0)), >> + IFN_DEFERRED_INIT)) >> + return; >> + } >> + > > Which case is this handling? In this context I would have expected > the REALPART_EXPRs and IMAGPART_EXPRs to act like normal rvalue > accessors, i.e. they are returning one half of their argument and > discarding the other half. Why is this different from, say, accessing > one field of a structure? I should add comments when I added this part of the code. I remembered that this was to fix a missed warning for complex expression in the regression test. But I forgot the details, I will check on this, and add more detailed comments in the code. > >> if (!has_undefined_value_p (t)) >> return; >> >> @@ -209,6 +223,19 @@ check_defs (ao_ref *ref, tree vdef, void *data_) >> { >> check_defs_data *data = (check_defs_data *)data_; >> gimple *def_stmt = SSA_NAME_DEF_STMT (vdef); >> + >> + /* Ignore the vdef iff the definition statement is a call >> + to .DEFERRED_INIT function. */ >> + if (gimple_call_internal_p (def_stmt, IFN_DEFERRED_INIT)) >> + return false; >> + >> + /* Ignore the vdef iff the definition statement is a call >> + to builtin_memset function that is added for uninitialized >> + auto variable initialization. */ >> + if (gimple_call_builtin_p (def_stmt, BUILT_IN_MEMSET) >> + && gimple_call_memset_for_uninit_p (def_stmt)) >> + return false; >> + > > s/iff/if/, since the “only if” part doesn't apply. Okay. > >> /* If this is a clobber then if it is not a kill walk past it. */ >> if (gimple_clobber_p (def_stmt)) >> { >> @@ -611,6 +638,9 @@ warn_uninitialized_vars (bool wmaybe_uninit) >> ssa_op_iter op_iter; >> tree use; >> >> + if (gimple_call_internal_p (stmt, IFN_DEFERRED_INIT)) >> + continue; >> + > > I guess the reasoning here is that the call is an artificial use and so > won't give a meaningful error message. If the result of the call is > used somewhere else then we warn there instead. Yes. > > I think that deserves a comment though. Will add comments here. > >> if (is_gimple_debug (stmt)) >> continue; >> >> diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c >> index cf54c89..93d6124 100644 >> --- a/gcc/tree-ssa.c >> +++ b/gcc/tree-ssa.c >> @@ -1325,6 +1325,23 @@ ssa_undefined_value_p (tree t, bool partial) >> if (gimple_nop_p (def_stmt)) >> return true; >> >> + /* The value is undefined iff the definition statement is a call >> + to .DEFERRED_INIT function. */ >> + if (gimple_call_internal_p (def_stmt, IFN_DEFERRED_INIT)) >> + return true; >> + >> + if (partial && is_gimple_assign (def_stmt) >> + && (gimple_assign_rhs_code (def_stmt) == REALPART_EXPR >> + || gimple_assign_rhs_code (def_stmt) == IMAGPART_EXPR)) >> + { >> + tree real_imag_part = TREE_OPERAND (gimple_assign_rhs1 (def_stmt), 0); >> + if (TREE_CODE (real_imag_part) == SSA_NAME >> + && gimple_call_internal_p (SSA_NAME_DEF_STMT (real_imag_part), >> + IFN_DEFERRED_INIT)) >> + return true; >> + } >> + >> + > > Same s/iff/if/ comment here. Also same question as above about the > REALPART_EXPR/IMAGPART_EXPR thing. Okay, will check on the details and add more comments. > > Nit: too many blank links at the end. Will fix this. > >> /* Check if the complex was not only partially defined. */ >> if (partial && is_gimple_assign (def_stmt) >> && gimple_assign_rhs_code (def_stmt) == COMPLEX_EXPR) >> diff --git a/gcc/tree.c b/gcc/tree.c >> index 7c44c22..2c9acad 100644 >> --- a/gcc/tree.c >> +++ b/gcc/tree.c >> @@ -2531,6 +2531,144 @@ build_zero_cst (tree type) >> } >> } >> >> +/* Build pattern constant of type TYPE. This is used for initializing >> + auto variables. */ >> + >> +tree >> +build_pattern_cst (tree type) > > I think this should have a more descriptive name, since the pattern > is very specific to the use case. Agreed, will change the name more specifically. > >> +{ >> + /* The following value is a guaranteed unmappable pointer value and has a >> + repeated byte-pattern which makes it easier to synthesize. We use it >> for >> + pointers as well as integers so that aggregates are likely to be >> + initialized with this repeated value. */ >> + uint64_t largevalue = 0xAAAAAAAAAAAAAAAAull; >> + /* For 32-bit platforms it's a bit trickier because, across systems, only >> the >> + zero page can reasonably be expected to be unmapped, and even then we >> need >> + a very low address. We use a smaller value, and that value sadly >> doesn't >> + have a repeated byte-pattern. We don't use it for integers. */ >> + uint32_t smallvalue = 0x000000AA; >> + >> + switch (TREE_CODE (type)) >> + { >> + case INTEGER_TYPE: >> + case ENUMERAL_TYPE: >> + case BOOLEAN_TYPE: >> + /* This will initialize a boolean type variable to 0 instead of 1. >> + We think that initializint a boolean variable to 0 other than 1 > > initializing Okay. > >> + is better even for pattern initialization. */ >> + return build_int_cstu (type, largevalue); > > I've no objection to that choice for booleans, but: booleans in some > languages (like Ada) can have multibit precision. If we want booleans > to be zero then it would probably be better to treat them as a separate > case and just use build_zero_cst (type) for them. Good point, yes, it might be better to handle boolean separately. > > Also, the above won't work correctly for 128-bit integers: it will > zero-initialize the upper half. It would probably be better to use > wi::from_buffer to construct the integer instead. Okay. > > I'm also not sure what effect this will have on enumerations, or when > the fill value is outside the TYPE_MIN_VALUE…TYPE_MAX_VALUE range. > Hopefully Richi will chime in :-) The pattern initialization part is much more tricky, I am hoping to get more help and Discussion here to make sure each case. > >> + case POINTER_TYPE: >> + case OFFSET_TYPE: >> + case REFERENCE_TYPE: >> + case NULLPTR_TYPE: >> + { >> + poly_uint64 intvalue; >> + >> + if (POINTER_SIZE == 64) >> + intvalue = largevalue; >> + else if (POINTER_SIZE == 32) >> + intvalue = smallvalue; >> + else >> + gcc_assert (0); > > GCC supports 16-bit targets, so we can't assert here. We might as > well just go for 0xAA there too. Okay. > > In fact it might be simpler to have something like: > > if (POINTER_TYPE_P (type) && TYPE_PRECISION (type) < 64) > return build_int_cstu (type, 0xAA); > > if (INTEGRAL_TYPE_P (type) || POINTER_TYPE_P (type)) > ...the wi::from_buffer thing above...; Okay. > > I'm not sure if this makes sense for NULLPTR_TYPE. Is there a good testing case in gcc test suite about NULLPTR_TYPE that I can take a look? > >> + return build_int_cstu (type, intvalue); >> + } >> + case REAL_TYPE: >> + { >> + REAL_VALUE_TYPE rnan; >> + >> + /* create an quiet NAN for REAL TYPE. */ >> + if (real_nan (&rnan, "", 1, TYPE_MODE (type))) >> + return build_real (type, rnan); >> + return NULL_TREE; > > It doesn't look like the callers can cope with a null return value. > Maybe get_max_float would be a good fallback instead. Okay. > >> + } >> + >> + case FIXED_POINT_TYPE: >> + { >> + /* FIXME. What should we put into a fixed point? */ >> + FIXED_VALUE_TYPE fixed; >> + fixed_from_string (&fixed, "0xFFFFFFFFFFFFFFFF", >> + SCALAR_TYPE_MODE (type)); >> + return build_fixed (type, fixed); >> + } >> + case VECTOR_TYPE: >> + { >> + tree scalar = build_pattern_cst (TREE_TYPE (type)); >> + return build_vector_from_val (type, scalar); >> + } >> + case COMPLEX_TYPE: >> + { >> + tree element = build_pattern_cst (TREE_TYPE (type)); >> + return build_complex (type, element, element); >> + } >> + case RECORD_TYPE: >> + { >> + tree field; >> + tree field_value; >> + vec<constructor_elt, va_gc> *v = NULL; >> + for (field= TYPE_FIELDS (type); field; field = DECL_CHAIN (field)) >> + { >> + if (TREE_CODE (field) != FIELD_DECL) >> + continue; >> + /* if the field is a variable length array, it should be the last > > s/if/If/ Okay. > >> + field of the record, and no need to initialize. */ > > Why doesn't it need to be initialized though? My understanding is, the compiler will not allocate memory for the latest field of the record if its a VLA, it’s the user’s responsibility to allocate Memory for it. Therefore, compiler doesn’t need to initialize it. > >> + if (TREE_CODE (TREE_TYPE (field)) == ARRAY_TYPE >> + && TYPE_SIZE (TREE_TYPE (field)) == NULL_TREE >> + && ((TYPE_DOMAIN (TREE_TYPE (field)) != NULL_TREE >> + && TYPE_MAX_VALUE (TYPE_DOMAIN (TREE_TYPE (field))) >> + == NULL_TREE) >> + || TYPE_DOMAIN (TREE_TYPE (field)) == NULL_TREE)) >> + continue; >> + field_value = build_pattern_cst (TREE_TYPE (field)); >> + CONSTRUCTOR_APPEND_ELT (v, field, field_value); >> + } >> + return build_constructor (type, v); >> + } >> + case UNION_TYPE: >> + case QUAL_UNION_TYPE: >> + { >> + tree field, max_field = NULL; >> + unsigned max_size = 0; >> + tree field_value; >> + vec<constructor_elt, va_gc> *v = NULL; >> + /* find the field with the largest size. */ > > s/find/Find/ Okay. > >> + for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field)) >> + { >> + if (TREE_CODE (field) != FIELD_DECL) >> + continue; >> + if (tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (field))) >= max_size) >> + { >> + max_size = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (field))); >> + max_field = field; >> + } > > This isn't safe. We shouldn't use tree_to_uhwi without checking whether > tree_fits_uhwi_p. We then need a fallback for !tree_fits_uhwi_p. Okay, will make sure this. > >> + } >> + field_value = build_pattern_cst (TREE_TYPE (max_field)); >> + CONSTRUCTOR_APPEND_ELT (v, max_field, field_value); >> + return build_constructor (type, v); >> + } >> + case ARRAY_TYPE: >> + { >> + vec<constructor_elt, va_gc> *elts = NULL; >> + tree element = build_pattern_cst (TREE_TYPE (type)); >> + tree nelts = array_type_nelts (type); >> + if (nelts && tree_fits_uhwi_p (nelts)) >> + { >> + unsigned HOST_WIDE_INT n = tree_to_uhwi (nelts) + 1; >> + for (unsigned int i = 0; i < n; i++) >> + CONSTRUCTOR_APPEND_ELT (elts, NULL_TREE, element); >> + return build_constructor (type, elts); >> + } >> + /* variable length array should not be here. */ > > s/variable/Variable/ Okay. > >> + gcc_assert (0); > > Should be gcc_unreachable () instead. Okay. > >> + } >> + default: >> + if (!AGGREGATE_TYPE_P (type)) >> + return fold_convert (type, build_pattern_cst (unsigned_type_node)); >> + else >> + gcc_assert (0); > > Same here. Okay. > >> @@ -11950,6 +12088,72 @@ lower_bound_in_type (tree outer, tree inner) >> } >> } >> >> +/* Returns true when the given TYPE has padding inside it. >> + return false otherwise. */ >> +bool >> +type_has_padding (tree type) > > Would it be possible to reuse __builtin_clear_padding here? Not sure, where can I get more details on __builtin_clear_padding? I can study a little bit more on this to make sure this. > >> +{ >> + switch (TREE_CODE (type)) >> + { >> + case RECORD_TYPE: >> + { >> + unsigned HOST_WIDE_INT record_size >> + = tree_to_uhwi (TYPE_SIZE_UNIT (type)); > > As above, it's not safe to just to use tree_to_uhwi here without checking. > I've skipped the rest of the function because of the builtin question above. Okay. Thanks a lot again for your review. Qing > > Thanks, > Richard