(Resend, the previous version messed up your questions and my answers, 
hopefully this time it’s better)

Hi, Richard,

Thanks a lot for your comments.

On May 26, 2021, at 6:18 AM, Richard Biener 
<rguent...@suse.de<mailto:rguent...@suse.de>> wrote:

On Wed, 12 May 2021, Qing Zhao wrote:

Hi,

This is the 3rd version of the patch for the new security feature for GCC.

Please take look and let me know your comments and suggestions.

thanks.

Qing

******Compare with the 2nd version, the following are the major changes:

1. use "lookup_attribute ("uninitialized",) directly instead of adding
  one new field "uninitialized" into tree_decl_with_vis.
2. update documentation to mention that the new option will not confuse
  -Wuninitialized, GCC still consider an auto without explicit initializer
  as uninitialized.
3. change the name of "build_pattern_cst" to more specific name as
  "build_pattern_cst_for_auto_init".
4. handling of nested VLA;
  Adding new testing cases (auto-init-15/16.c) for this new handling.
5. Add  new verifications of calls to .DEFERRED_INIT in tree-cfg.c;
6. in tree-sra.c, update the handling of "grp_to_be_debug_replaced",
  bind the lhs variable to a call to .DEFERRED_INIT.
7. In tree-ssa-structalias.c, delete "find_func_aliases_for_deferred_init",
  return directly for a call to .DEFERRED_INIT in "find_func_aliases_for_call".
8. Add more detailed comments in tree-ssa-uninit.c and tree-ssa.c to explain
  the special handling on REALPART_EXPR/IMAGPRT_EXPR.
9. in build_pattern_cst_for_auto_init:
  BOOLEAN_TYPE will be set to zero always;
  INTEGER_TYPE (?and ENUMERAL_TYPE) use wi::from_buffer in order to
               correctly handle 128-bit integers.
  POINTER_TYPE will not assert on SIZE < 32.
  REAL_TYPE add fallback;
10. changed gcc_assert to gcc_unreachable in several places;
11. add more comments;
12. some style issue changes.

******Please see the version 2 at:
https://gcc.gnu.org/pipermail/gcc-patches/2021-March/567262.html


******The following 2 items are the ones I didn’t addressed in this version due 
to further study and might need more discussion:

1. Using __builtin_clear_padding  to replace type_has_padding.

My study shows: the call to __builtin_clear_padding is expanded during 
gimplification phase.
And there is no __bultin_clear_padding expanding during rtx expanding phase.
If so,  for -ftrivial-auto-var-init, padding initialization should be done both 
in gimplification phase and rtx expanding phase.
And since the __builtin_clear_padding might not be good for rtx expanding, 
reusing __builtin_clear_padding might not work.

2. Pattern init to NULLPTR_TYPE and ENUMERAL_TYPE: need more comments from 
Richard Biener on this.

******The change of the 3rd version compared to the 2nd version are:


+@item -ftrivial-auto-var-init=@var{choice}
+@opindex ftrivial-auto-var-init
+Initialize automatic variables with either a pattern or with zeroes to
increase
+the security and predictability of a program by preventing uninitialized
memory
+disclosure and use.

the docs do not state what "trivial" actually means?  Does it affect
C++ classes with ctors, thus is "trivial" equal to what C++ considers
a POD type?


Thank you for this question.

The name -ftrivial-auto-var-init is just for compatible with Clang. I really 
don’t know why
they added trivial.

As I checked a small example with C++ class with ctors, I see both Clang and my 
patch add
Initialization to this class:

=====
#include <iostream>

using namespace std;

class Line {
  public:
     void setLength( double len );
     double getLength( void );
     Line();  // This is the constructor
  private:
     double length;
};

// Member functions definitions including constructor
Line::Line(void) {
  cout << "Object is being created" << endl;
}
void Line::setLength( double len ) {
 length = len;
}
double Line::getLength( void ) {
 return length;
}

// Main function for the program
int main() {
 Line line;

 // set line length
 line.setLength(6.0);
 cout << "Length of line : " << line.getLength() <<endl;

 return 0;
}

=====

Both clang and my patch add initialization to the above auto variable “line”.

So, I have the following questions need help:

1. Do we need to exclude C++ class with ctor from auto initialization?

2. I see Clang use call to internal memset to initialize such class, but for my 
patch, I only initialize the data fields inside this class.
   Which one is better?


diff --git a/gcc/expr.c b/gcc/expr.c
index 798285eb52ca..9092349d7017 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -6539,14 +6539,19 @@ store_constructor (tree exp, rtx target, int
cleared, poly_int64 size,
           cleared = 1;
         }

-        /* If the constructor has fewer fields than the structure or
-          if we are initializing the structure to mostly zeros, clear
-          the whole structure first.  Don't do this if TARGET is a
-          register whose mode size isn't equal to SIZE since
-          clear_storage can't handle this case.  */
+       /* If the constructor has fewer fields than the structure,
+          or if we are initializing the structure to mostly zeros,
+          or if the user requests to initialize automatic variables with
+          paddings inside the type,
+          we should clear the whole structure first.
+          Don't do this if TARGET is a register whose mode size isn't
equal
+          SIZE since clear_storage can't handle this case.  */
       else if (known_size_p (size)
                && (((int) CONSTRUCTOR_NELTS (exp) != fields_length
(type))
-                    || mostly_zeros_p (exp))
+                    || mostly_zeros_p (exp)
+                    || (flag_trivial_auto_var_init >
AUTO_INIT_UNINITIALIZED
+                        && !TREE_STATIC (exp)
+                        && type_has_padding (type)))

testing flag_trivial_auto_var_init tests the global options, if TUs
are compiled with different setting of flag_trivial_auto_var_init
and you use LTO or flag_trivial_auto_var_init is specified per
function via optimize attributes it's more appropriate to test
opt_for_fn (cfun->decl, flag_trivial_auto_var_init)


Okay.  Thanks for the info. I will update this.


You do not actually test whether TARGET is an auto-var in this place,
so I question this change

Will add checking for Auto on this.

- the documentation of ftrivial-auto-var-init
also doesn't mention initialization of padding

Will add initialization of padding on this.
Clang add the padding initialization with this option, I guess that we might 
need to
be compatible with it?

and the above doesn't
seem to honor -ftrivial-auto-var-init=pattern for this.


Richard Sandiford raised the same question in the previous review.
And this behavior also follows Clang’s behavior.

The following is the answer from Kees Cook on this question:

=====

Originally, Clang
implemented using pattern, but there was discussion around it and the
decision there was to go with zero init, as it seemed to more closely
match the C spec:
https://github.com/llvm/llvm-project/commit/d39fbc7e20d84364e409ce59724ce20625637062

=====


+enum auto_init_approach {
+  AUTO_INIT_A = 0,
+  AUTO_INIT_D = 1
+};

I'm assuming we're going for one implementation in the end - have
you made up your mind yet?

Yes, I already decided to take the approach D. (Adding call to internal 
function .DEFFERED_INIT).

The reason to temporarily keep both implementations is just for the convenience 
to run some
Performance comparison during the implementation period. I am afraid that I 
might need some
Changes for approach D during the review process, in case there might be major 
change, we
need to run the performance testing again.

So, before the last version, I will just keep both implementations.

If all the change is good, I will complete delete approach A at that time.
Is this okay?


+/* If FOR_UNINIT is true, GIMPLE_CALL S is a call to builtin_memset that
+   is known to be emitted for unintialized VLA objects.  */
+
+static inline void
+gimple_call_set_memset_for_uninit (gcall *s, bool for_uninit)

seeing this, can you explain why using .DEFERRED_INIT does not
work for VLAs?

The major reason for going different routes for VLAs vs. no-VLAs is:

In the original gimplification phase, VLAs and no-VLAs go different routes.
I just followed the different routes for them:

In “gimplify_decl_expr”, VLA goes to “gimplify_vla_decl”, and is expanded to
call to alloca.  Naturally, I add calls to “memset/memcpy” in 
“gimplify_vla_decl” to
Initialize it.

On the other hand, no-VLAs are handled differently in “gimplify_decl_expr”, so
I added calls to “.DEFFERED_INIT” to initialize them.

What’s the major issue if I add calls to “memset/memcpy” in “gimplify_vla_decl” 
to
Initialize VLAs?


 You can build

 .DEFERRED_INIT (WITH_SIZE_EXPR <decl, size-expression>, type)

to carry the initialization size.  There's maybe_with_size_expr that
you can conveniently use to perform the WITH_SIZE_EXPR wrapping.
I'd strongly prefer not going different routes for VLAs vs. on-VLAs.

What’s the major benefit to this?


@@ -5001,6 +5185,17 @@ gimplify_init_constructor (tree *expr_p, gimple_seq
*pre_p, gimple_seq *post_p,
         /* If a single access to the target must be ensured and all
elements
            are zero, then it's optimal to clear whatever their number.
*/
         cleared = true;
+       else if (flag_trivial_auto_var_init > AUTO_INIT_UNINITIALIZED
+                && !TREE_STATIC (object)
+                && type_has_padding (type))
+         /* If the user requests to initialize automatic variables with
+            paddings inside the type, we should initialize the paddings
too.
+            C guarantees that brace-init with fewer initializers than
members
+            aggregate will initialize the rest of the aggregate as-if it
were
+            static initialization.  In turn static initialization
guarantees
+            that pad is initialized to zero bits.
+            So, it's better to clear the whole record under such
situation.  */
+         cleared = true;

so here we have padding as well - I think this warrants to be controlled
by an extra option?  And we can maybe split this out to a separate
patch? (the whole padding stuff)

Clang does the padding initialization with this option, shall we be consistent 
with Clang?


+/* Expand the IFN_DEFERRED_INIT function according to its second
argument.  */
+static void
+expand_DEFERRED_INIT (internal_fn, gcall *stmt)
+{
+  tree var = gimple_call_lhs (stmt);
+  tree init = NULL_TREE;
+  enum auto_init_type init_type
+    = (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg (stmt, 1));
+
+  switch (init_type)
+    {
+    default:
+      gcc_unreachable ();
+    case AUTO_INIT_PATTERN:
+      init = build_pattern_cst_for_auto_init (TREE_TYPE (var));
+      expand_assignment (var, init, false);
+      break;
+    case AUTO_INIT_ZERO:
+      init = build_zero_cst (TREE_TYPE (var));
+      expand_assignment (var, init, false);
+      break;
+    }

I think actually building build_pattern_cst_for_auto_init can generate
massive garbage and for big auto vars code size is also a concern and
ideally on x86 you'd produce rep movq.  So I don't think going
via expand_assignment is good.  Instead you possibly want to lower
.DEFERRED_INIT to MEMs following expand_builtin_memset and
eventually enhance that to allow storing pieces larger than a byte.

Will study this approach a little bit more, might need more help from you on 
this part.


diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index e457b917b98e..2e0e76ea8838 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1829,7 +1829,7 @@ struct GTY(()) tree_decl_with_vis {
 unsigned final : 1;
 /* Belong to FUNCTION_DECL exclusively.  */
 unsigned regdecl_flag : 1;
- /* 14 unused bits. */
+ /* 14 unused bits.  */
 /* 32 more unused on 64 bit HW. */
};


spurious change, please drop.

Okay.


+  /* Ignore REALPART_EXPR or IMAGPART_EXPR if its operand is a call to
+     .DEFERRED_INIT.  This is for handling the following case correctly:
+
+  1 typedef _Complex float C;
+  2 C foo(int cond)
+  3 {
+  4   C f;
+  5   __imag__ f = 0;
+  6   if (cond)
+  7     {
+  8       __real__ f = 1;
+  9       return f;
+ 10     }
+ 11   return f;
+ 12 }
+
+    with -ftrivial-auto-var-init, compiler will insert the following
+    artificial initialization at line 4:
+  f = .DEFERRED_INIT (f, 2);
+  _1 = REALPART_EXPR <f>;
+
+    without the following special handling, _1 = REALPART_EXPR <f> will
+    be treated as the uninitialized use point, which is incorrect. (the
+    real uninitialized use point is at line 11).  */
+  if (is_gimple_assign (context)
+      && (gimple_assign_rhs_code (context) == REALPART_EXPR
+         || gimple_assign_rhs_code (context) == IMAGPART_EXPR))
+    {
+      tree v = gimple_assign_rhs1 (context);
+      if (TREE_CODE (TREE_OPERAND (v, 0)) == SSA_NAME
+         && gimple_call_internal_p (SSA_NAME_DEF_STMT (TREE_OPERAND (v,
0)),
+                                    IFN_DEFERRED_INIT))
+       return;
+    }

will this not mishandle (not report)

C foo ()
{
 C f;
 return __real f;
}

?  I think ssa_undefined_value_p is supposed to catch this, you
seem to duplicate something there as well.

Will check on this.


+/* Returns true when the given TYPE has padding inside it.
+   return false otherwise.  */
+bool
+type_has_padding (tree type)
+{
+  switch (TREE_CODE (type))
+    {
+    case RECORD_TYPE:
+      {

btw, there's __builtin_clear_padding and a whole machinery around
it in gimple-fold.c, I'm sure that parts could be re-used if they
are neccessary in the end.

Richard Sandiford provided this suggestion previously, my previous study shows 
that it might not
Be convenient to use it. But I will study this a little bit more and get back 
to you.


Otherwise the changes look OK.

Do DCE/DSE remove unused .DEFERRED_INIT?

It should, but I will double check on this.

Thanks again.

Qing



Thanks,
Richard.



Reply via email to