jfb created this revision.
jfb added reviewers: kcc, rjmccall, rsmith.
Herald added subscribers: dexonsmith, jkorous, JDevlieghere.

Add an option to initialize automatic variables with either a pattern or with
zeroes. The default is still that automatic variables are uninitialized. Also
add attributes to request pattern / zero / uninitialized on a per-variable
basis, mainly to disable initialization of large stack arrays when deemed too
expensive.

This isn't meant to change the semantics of C and C++. Rather, it's meant to be
a last-resort when programmers inadvertently have some undefined behavior in
their code. This patch aims to make undefined behavior hurt less, which
security-minded people will be very happy about. Notably, this means that
there's no inadvertent information leak when:

- The compiler re-uses stack slots, and a value is used uninitialized.
- The compiler re-uses a register, and a value is used uninitialized.
- Stack structs / arrays / unions with padding are copied.

This patch only addresses stack and register information leaks. There's many
more infoleaks that we could address, and much more undefined behavior that
could be tamed. Let's keep this patch focused, and I'm happy to address related
issues elsewhere.

To keep the patch simple, only some `undef` is removed for now, see
`replaceUndef`. The padding-related infoleaks are therefore not all gone yet.
This will be addressed in a follow-up, mainly because addressing padding-related
leaks should be a stand-alone option which is implied by variable
initialization.

There are three options when it comes to automatic variable initialization:

  0. Uninitialized
  
    This is C and C++'s default. It's not changing. Depending on code
    generation, a programmer who runs into undefined behavior by using an
    uninialized automatic variable may observe any previous value (including
    program secrets), or any value which the compiler saw fit to materialize on
    the stack or in a register (this could be to synthesize an immediate, to
    refer to code or data locations, to generate cookies, etc).
  
  1. Pattern initialization
  
    This is the recommended initialization approach. Pattern initialization's
    goal is to initialize automatic variables with values which will likely
    transform logic bugs into crashes down the line, are easily recognizable in
    a crash dump, without being values which programmers can rely on for useful
    program semantics. At the same time, pattern initialization tries to
    generate code which will optimize well. You'll find the following details in
    `patternFor`:
  
    - Integers are initialized with repeated 0xAA bytes (infinite scream).
    - Vectors of integers are also initialized with infinite scream.
    - Pointers are initialized with infinite scream on 64-bit platforms because
      it's an unmappable pointer value on architectures I'm aware of. Pointers
      are initialize to 0x000000AA (small scream) on 32-bit platforms because
      32-bit platforms don't consistently offer unmappable pages. When they do
      it's usually the zero page. As people try this out, I expect that we'll
      want to allow different platforms to customize this, let's do so later.
    - Vectors of pointers are initialized the same way pointers are.
    - Floating point values and vectors are initialized with a vanilla quiet NaN
      (e.g. 0x7ff00000 and 0x7ffe000000000000). We could use other NaNs, say
      0xfffaaaaa (negative NaN, with infinite scream payload). NaNs are nice
      (here, anways) because they propagate on arithmetic, making it more likely
      that entire computations become NaN when a single uninitialized value
      sneaks in.
    - Arrays are initialized to their homogeneous elements' initialization
      value, repeated. Stack-based Variable-Length Arrays (VLAs) are
      runtime-initialized to the allocated size (no effort is made for negative
      size, but zero-sized VLAs are untouched even if technically undefined).
    - Structs are initialized to their heterogeneous element's initialization
      values. Zero-size structs are initialized as 0xAA since they're allocated
      a single byte.
    - Unions are initialized using the initialization for the largest member of
      the union.
  
    Expect the values used for pattern initialization to change over time, as we
    refine heuristics (both for performance and security). The goal is truly to
    avoid injecting semantics into undefined behavior, and we should be
    comfortable changing these values when there's a worthwhile point in doing
    so.
  
    Why so much infinite scream? Repeated byte patterns tend to be easy to
    synthesize on most architectures, and otherwise memset is usually very
    efficient. For values which aren't entirely repeated byte patterns, LLVM
    will often generate code which does memset + a few stores.
  
  2. Zero initialization
  
    Zero initialize all values. This has the unfortunate side-effect of
    providing semantics to otherwise undefined behavior, programs therefore
    might start to rely on this behavior, and that's sad. However, some
    programmers believe that pattern initialization is too expensive for them,
    and data might show that they're right. The only way to make these
    programmers wrong is to offer zero-initialization as an option, figure out
    where they are right, and optimize the compiler into submission. Until the
    compiler provides acceptable performance for all security-minded code, zero
    initialization is a useful (if blunt) tool.

I've been asked for a fourth initialization option: user-provided byte value.
This might be useful, and can easily be added later.

Why is an out-of band initialization mecanism desired? We could instead use
-Wuninitialized! Indeed we could, but then we're forcing the programmer to
provide semantics for something which doesn't actually have any (it's
uninitialized!). It's then unclear whether `int derp = 0;` lends meaning to `0`,
or whether it's just there to shut that warning up. It's also way easier to use
a compiler flag than it is to manually and intelligently initialize all values
in a program.

Why not just rely on static analysis? Because it cannot reason about all dynamic
code paths effectively, and it has false positives. It's a great tool, could get
even better, but it's simply incapable of catching all uses of uninitialized
values.

Why not just rely on memory sanitizer? Because it's not universally available,
has a 3x performance cost, and shouldn't be deployed in production. Again, it's
a great tool, it'll find the dynamic uses of uninitialized variables that your
test coverage hits, but it won't find the ones that you encounter in production.

What's the performance like? Not too bad! Previous publications [0] have cited
2.7 to 4.5% averages. We've commmitted a few patches over the last few months to
address specific regressions, both in code size and performance. In all cases,
the optimizations are generally useful, but variable initialization benefits
from them a lot more than regular code does. We've got a handful of other
optimizations in mind, but the code is in good enough shape and has found enough
latent issues that it's a good time to get the change reviewed, checked in, and
have others kick the tires. We'll continue reducing overheads as we try this out
on diverse codebases.

Is it a good idea? Security-minded folks think so, and apparently so does the
Microsoft Visual Studio team [1] who say "Between 2017 and mid 2018, this
feature would have killed 49 MSRC cases that involved uninitialized struct data
leaking across a trust boundary. It would have also mitigated a number of bugs
involving uninitialized struct data being used directly.". They seem to use pure
zero initialization, and claim to have taken the overheads down to within noise.
Don't just trust Microsoft though, here's another relevant person asking for
this [2]. It's been proposed for GCC [3] and LLVM [4] before.

What are the caveats? A few!

- Variables declared in unreachable code, and used later, aren't initialized. 
This goto, Duff's device, other objectionable uses of switch. This should 
instead be a hard-error in any serious codebase.
- Volatile stack variables are still weird. That's pre-existing, it's really 
the language's fault and this patch keeps it weird. We should deprecate 
volatile [5].
- As noted above, padding isn't fully handled yet.

I don't think these caveats make the patch untenable because they can be
addressed separately.

Should this be on by default? Maybe, in some circumstances. It's a conversation
we can have when we've tried it out sufficiently, and we're confident that we've
eliminated enough of the overheads that most codebases would want to opt-in.
Let's keep our precious undefined behavior until that point in time.

How do I use it:

1. On the command-line:

  -ftrivial-auto-var-init=uninitialized (the default) 
-ftrivial-auto-var-init=pattern -ftrivial-auto-var-init=zero
2. Using an attribute:

  int dont_initialize_me __attribute((trivial_auto_init("uninitialized"))); int 
zero_me __attribute((trivial_auto_init("zero"))); int pattern_me 
__attribute((trivial_auto_init("pattern")));

  [0]: 
https://users.elis.ugent.be/~jsartor/researchDocs/OOPSLA2011Zero-submit.pdf 
[1]: https://twitter.com/JosephBialek/status/1062774315098112001 [2]: 
https://outflux.net/slides/2018/lss/danger.pdf [3]: 
https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00615.html [4]: 
https://github.com/AndroidHardeningArchive/platform_external_clang/commit/776a0955ef6686d23a82d2e6a3cbd4a6a882c31c
 [5]: http://wg21.link/p1152

rdar://problem/39131435


Repository:
  rC Clang

https://reviews.llvm.org/D54604

Files:
  include/clang/Basic/Attr.td
  include/clang/Basic/AttrDocs.td
  include/clang/Basic/LangOptions.def
  include/clang/Basic/LangOptions.h
  include/clang/Driver/Options.td
  include/clang/Driver/ToolChain.h
  lib/CodeGen/CGDecl.cpp
  lib/Driver/ToolChains/Clang.cpp
  lib/Frontend/CompilerInvocation.cpp
  lib/Sema/SemaDeclAttr.cpp
  test/CodeGenCXX/auto-var-init.cpp
  test/CodeGenCXX/trivial-auto-var-init-attribute.cpp
  test/CodeGenCXX/trivial-auto-var-init.cpp
  test/Sema/attr-trivial_auto_init.c

Index: test/Sema/attr-trivial_auto_init.c
===================================================================
--- /dev/null
+++ test/Sema/attr-trivial_auto_init.c
@@ -0,0 +1,26 @@
+// RUN: %clang_cc1 %s -verify -fsyntax-only
+
+void good() {
+    int dont_initialize_me __attribute((trivial_auto_init("uninitialized")));
+    int zero_me __attribute((trivial_auto_init("zero")));
+    int pattern_me __attribute((trivial_auto_init("pattern")));
+}
+
+void bad() {
+    int im_bad __attribute((trivial_auto_init)); // expected-error {{'trivial_auto_init' attribute takes one argument}}
+    int im_baaad __attribute((trivial_auto_init("uninitialized", "zero"))); // expected-error {{'trivial_auto_init' attribute takes one argument}}
+    static int come_on __attribute((trivial_auto_init("uninitialized"))); // expected-warning {{'trivial_auto_init' attribute only applies to local variables}}
+    int you_know __attribute((trivial_auto_init("pony"))); // expected-warning {{'trivial_auto_init' attribute argument not supported: pony}}
+    int and_the_whole_world_has_to __attribute((trivial_auto_init(uninitialized))); // expected-error {{'trivial_auto_init' attribute requires a string}}
+}
+
+extern int answer_right_now __attribute((trivial_auto_init("uninitialized"))); // expected-warning {{'trivial_auto_init' attribute only applies to local variables}}
+int just_to_tell_you_once_again __attribute((trivial_auto_init("uninitialized"))); // expected-warning {{'trivial_auto_init' attribute only applies to local variables}}
+static int whos_bad __attribute((trivial_auto_init("uninitialized"))); // expected-warning {{'trivial_auto_init' attribute only applies to local variables}}
+
+void the_word_is_out() __attribute((trivial_auto_init("uninitialized"))) {} // expected-warning {{'trivial_auto_init' attribute only applies to local variables}}
+void youre_doin_wrong(__attribute((trivial_auto_init("uninitialized"))) int a) {} // expected-warning {{'trivial_auto_init' attribute only applies to local variables}}
+
+struct GonnaLockYouUp {
+  __attribute((trivial_auto_init("uninitialized"))) int before_too_long; // expected-warning {{'trivial_auto_init' attribute only applies to local variables}}
+} __attribute((trivial_auto_init("uninitialized"))); // expected-warning {{'trivial_auto_init' attribute only applies to local variables}}
Index: test/CodeGenCXX/trivial-auto-var-init.cpp
===================================================================
--- /dev/null
+++ test/CodeGenCXX/trivial-auto-var-init.cpp
@@ -0,0 +1,128 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -fblocks %s -emit-llvm -o - | FileCheck %s -check-prefix=UNINIT
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -fblocks -ftrivial-auto-var-init=pattern %s -emit-llvm -o - | FileCheck %s -check-prefix=PATTERN
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -fblocks -ftrivial-auto-var-init=zero %s -emit-llvm -o - | FileCheck %s -check-prefix=ZERO
+
+// PATTERN-NOT: undef
+// ZERO-NOT: undef
+
+template<typename T> void used(T &) noexcept;
+
+extern "C" {
+
+// UNINIT-LABEL:  test_selfinit(
+// ZERO-LABEL:    test_selfinit(
+// PATTERN-LABEL: test_selfinit(
+void test_selfinit() {
+  int self = self + 1;
+  used(self);
+}
+
+// UNINIT-LABEL:  test_block(
+// ZERO-LABEL:    test_block(
+// PATTERN-LABEL: test_block(
+void test_block() {
+  __block int block;
+  used(block);
+}
+
+// UNINIT-LABEL:  test_goto_unreachable_value(
+// ZERO-LABEL:    test_goto_unreachable_value(
+// PATTERN-LABEL: test_goto_unreachable_value(
+void test_goto_unreachable_value() {
+  goto jump;
+  int oops;
+ jump:
+  used(oops);
+}
+
+// UNINIT-LABEL:  test_goto(
+// ZERO-LABEL:    test_goto(
+// PATTERN-LABEL: test_goto(
+void test_goto(int i) {
+  if (i)
+    goto jump;
+  int oops;
+ jump:
+  used(oops);
+}
+
+// UNINIT-LABEL:  test_switch(
+// ZERO-LABEL:    test_switch(
+// PATTERN-LABEL: test_switch(
+void test_switch(int i) {
+  switch (i) {
+  case 0:
+    int oops;
+    break;
+  case 1:
+    used(oops);
+  }
+}
+
+// UNINIT-LABEL:  test_vla(
+// ZERO-LABEL:    test_vla(
+// PATTERN-LABEL: test_vla(
+void test_vla(int size) {
+  // Variable-length arrays can't have a zero size according to C11 6.7.6.2/5.
+  // Neither can they be negative-sized.
+  //
+  // We don't use the former fact because some code creates zero-sized VLAs and
+  // doesn't use them. clang makes these share locations with other stack
+  // values, which leads to initialization of the wrong values.
+  //
+  // We rely on the later fact because it generates better code.
+  //
+  // Both cases are caught by UBSan.
+  int vla[size];
+  int *ptr = vla;
+  used(ptr);
+}
+
+void test_struct_vla(int size) {
+  // Same as above, but with a struct that doesn't just memcpy.
+  struct {
+    float f;
+    char c;
+    void *ptr;
+  } vla[size];
+  void *ptr = static_cast<void*>(vla);
+  used(ptr);
+}
+
+// UNINIT-LABEL:  test_zsa(
+// ZERO-LABEL:    test_zsa(
+// PATTERN-LABEL: test_zsa(
+void test_zsa(int size) {
+  // Technically not valid, but as long as clang accepts them we should do
+  // something sensible.
+  int zsa[0];
+  used(zsa);
+}
+  
+// UNINIT-LABEL:  test_huge_uninit(
+// ZERO-LABEL:    test_huge_uninit(
+// PATTERN-LABEL: test_huge_uninit(
+void test_huge_uninit() {
+  // We can't emit this as an inline constant to a store instruction because
+  // SDNode hits an internal size limit. We have to copy from a global.
+  char big[65536];
+  used(big);
+}
+
+// UNINIT-LABEL:  test_huge_small_init(
+// ZERO-LABEL:    test_huge_small_init(
+// PATTERN-LABEL: test_huge_small_init(
+void test_huge_small_init() {
+  char big[65536] = { 'a', 'b', 'c', 'd' };
+  used(big);
+}
+
+// UNINIT-LABEL:  test_huge_larger_init(
+// ZERO-LABEL:    test_huge_larger_init(
+// PATTERN-LABEL: test_huge_larger_init(
+void test_huge_larger_init() {
+  char big[65536] = { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
+  used(big);
+}
+
+} // extern "C"
Index: test/CodeGenCXX/trivial-auto-var-init-attribute.cpp
===================================================================
--- /dev/null
+++ test/CodeGenCXX/trivial-auto-var-init-attribute.cpp
@@ -0,0 +1,33 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -fblocks %s -emit-llvm -o - | FileCheck %s -check-prefix=UNINIT
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -fblocks -ftrivial-auto-var-init=pattern %s -emit-llvm -o - | FileCheck %s -check-prefix=PATTERN
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -fblocks -ftrivial-auto-var-init=zero %s -emit-llvm -o - | FileCheck %s -check-prefix=ZERO
+
+template<typename T> void used(T &) noexcept;
+
+extern "C" {
+
+// UNINIT-LABEL:  test_attribute_uninitialized
+// ZERO-LABEL:    test_attribute_uninitialized
+// PATTERN-LABEL: test_attribute_uninitialized
+void test_attribute_uninitialized() {
+  [[clang::trivial_auto_init("uninitialized")]] int i;
+  used(i);
+}
+
+// UNINIT-LABEL:  test_attribute_zero
+// ZERO-LABEL:    test_attribute_zero
+// PATTERN-LABEL: test_attribute_zero
+void test_attribute_zero() {
+  [[clang::trivial_auto_init("zero")]] int i;
+  used(i);
+}
+
+// UNINIT-LABEL:  test_attribute_pattern
+// ZERO-LABEL:    test_attribute_pattern
+// PATTERN-LABEL: test_attribute_pattern
+void test_attribute_pattern() {
+  [[clang::trivial_auto_init("pattern")]] int i;
+  used(i);
+}
+
+} // extern "C"
Index: test/CodeGenCXX/auto-var-init.cpp
===================================================================
--- test/CodeGenCXX/auto-var-init.cpp
+++ test/CodeGenCXX/auto-var-init.cpp
@@ -1,4 +1,9 @@
 // RUN: %clang_cc1 -triple x86_64-unknown-unknown -fblocks %s -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -fblocks -ftrivial-auto-var-init=pattern %s -emit-llvm -o - | FileCheck %s -check-prefix=PATTERN
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -fblocks -ftrivial-auto-var-init=zero %s -emit-llvm -o - | FileCheck %s -check-prefix=ZERO
+
+// PATTERN-NOT: undef
+// ZERO-NOT: undef
 
 template<typename T> void used(T &) noexcept;
 
@@ -56,28 +61,48 @@
 // CHECK-LABEL: @test_char_uninit()
 // CHECK:       %uninit = alloca i8, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_char_uninit()
+// PATTERN: store i8 -86, i8* %uninit, align 1
+// ZERO-LABEL: @test_char_uninit()
+// ZERO: store i8 0, i8* %uninit, align 1
 
 TEST_BRACES(char, char);
 // CHECK-LABEL: @test_char_braces()
 // CHECK:       %braces = alloca i8, align [[ALIGN:[0-9]*]]
 // CHECK-NEXT:  store i8 0, i8* %braces, align [[ALIGN]]
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%braces)
+// PATTERN-LABEL: @test_char_braces()
+// PATTERN: store i8 -86, i8* %braces, align 1
+// ZERO-LABEL: @test_char_braces()
+// ZERO: store i8 0, i8* %braces, align 1
 
 TEST_UNINIT(uchar, unsigned char);
 // CHECK-LABEL: @test_uchar_uninit()
 // CHECK:       %uninit = alloca i8, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_uchar_uninit()
+// PATTERN: store i8 -86, i8* %uninit, align 1
+// ZERO-LABEL: @test_uchar_uninit()
+// ZERO: store i8 0, i8* %uninit, align 1
 
 TEST_BRACES(uchar, unsigned char);
 // CHECK-LABEL: @test_uchar_braces()
 // CHECK:       %braces = alloca i8, align [[ALIGN:[0-9]*]]
 // CHECK-NEXT:  store i8 0, i8* %braces, align [[ALIGN]]
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%braces)
+// PATTERN-LABEL: @test_uchar_braces()
+// PATTERN: store i8 -86, i8* %braces, align 1
+// ZERO-LABEL: @test_uchar_braces()
+// ZERO: store i8 0, i8* %braces, align 1
 
 TEST_UNINIT(schar, signed char);
 // CHECK-LABEL: @test_schar_uninit()
 // CHECK:       %uninit = alloca i8, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_schar_uninit()
+// PATTERN: store i8 -86, i8* %uninit, align 1
+// ZERO-LABEL: @test_schar_uninit()
+// ZERO: store i8 0, i8* %uninit, align 1
 
 TEST_BRACES(schar, signed char);
 // CHECK-LABEL: @test_schar_braces()
@@ -89,6 +114,10 @@
 // CHECK-LABEL: @test_wchar_t_uninit()
 // CHECK:       %uninit = alloca i32, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_wchar_t_uninit()
+// PATTERN: store i32 -1431655766, i32* %uninit, align
+// ZERO-LABEL: @test_wchar_t_uninit()
+// ZERO: store i32 0, i32* %uninit, align
 
 TEST_BRACES(wchar_t, wchar_t);
 // CHECK-LABEL: @test_wchar_t_braces()
@@ -100,6 +129,10 @@
 // CHECK-LABEL: @test_short_uninit()
 // CHECK:       %uninit = alloca i16, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_short_uninit()
+// PATTERN: store i16 -21846, i16* %uninit, align
+// ZERO-LABEL: @test_short_uninit()
+// ZERO: store i16 0, i16* %uninit, align
 
 TEST_BRACES(short, short);
 // CHECK-LABEL: @test_short_braces()
@@ -111,6 +144,10 @@
 // CHECK-LABEL: @test_ushort_uninit()
 // CHECK:       %uninit = alloca i16, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_ushort_uninit()
+// PATTERN: store i16 -21846, i16* %uninit, align
+// ZERO-LABEL: @test_ushort_uninit()
+// ZERO: store i16 0, i16* %uninit, align
 
 TEST_BRACES(ushort, unsigned short);
 // CHECK-LABEL: @test_ushort_braces()
@@ -122,6 +159,10 @@
 // CHECK-LABEL: @test_int_uninit()
 // CHECK:       %uninit = alloca i32, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_int_uninit()
+// PATTERN: store i32 -1431655766, i32* %uninit, align
+// ZERO-LABEL: @test_int_uninit()
+// ZERO: store i32 0, i32* %uninit, align
 
 TEST_BRACES(int, int);
 // CHECK-LABEL: @test_int_braces()
@@ -133,6 +174,10 @@
 // CHECK-LABEL: @test_unsigned_uninit()
 // CHECK:       %uninit = alloca i32, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_unsigned_uninit()
+// PATTERN: store i32 -1431655766, i32* %uninit, align
+// ZERO-LABEL: @test_unsigned_uninit()
+// ZERO: store i32 0, i32* %uninit, align
 
 TEST_BRACES(unsigned, unsigned);
 // CHECK-LABEL: @test_unsigned_braces()
@@ -144,6 +189,10 @@
 // CHECK-LABEL: @test_long_uninit()
 // CHECK:       %uninit = alloca i64, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_long_uninit()
+// PATTERN: store i64 -6148914691236517206, i64* %uninit, align
+// ZERO-LABEL: @test_long_uninit()
+// ZERO: store i64 0, i64* %uninit, align
 
 TEST_BRACES(long, long);
 // CHECK-LABEL: @test_long_braces()
@@ -155,6 +204,10 @@
 // CHECK-LABEL: @test_ulong_uninit()
 // CHECK:       %uninit = alloca i64, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_ulong_uninit()
+// PATTERN: store i64 -6148914691236517206, i64* %uninit, align
+// ZERO-LABEL: @test_ulong_uninit()
+// ZERO: store i64 0, i64* %uninit, align
 
 TEST_BRACES(ulong, unsigned long);
 // CHECK-LABEL: @test_ulong_braces()
@@ -166,6 +219,10 @@
 // CHECK-LABEL: @test_longlong_uninit()
 // CHECK:       %uninit = alloca i64, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_longlong_uninit()
+// PATTERN: store i64 -6148914691236517206, i64* %uninit, align
+// ZERO-LABEL: @test_longlong_uninit()
+// ZERO: store i64 0, i64* %uninit, align
 
 TEST_BRACES(longlong, long long);
 // CHECK-LABEL: @test_longlong_braces()
@@ -177,6 +234,10 @@
 // CHECK-LABEL: @test_ulonglong_uninit()
 // CHECK:       %uninit = alloca i64, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_ulonglong_uninit()
+// PATTERN: store i64 -6148914691236517206, i64* %uninit, align
+// ZERO-LABEL: @test_ulonglong_uninit()
+// ZERO: store i64 0, i64* %uninit, align
 
 TEST_BRACES(ulonglong, unsigned long long);
 // CHECK-LABEL: @test_ulonglong_braces()
@@ -188,6 +249,10 @@
 // CHECK-LABEL: @test_int128_uninit()
 // CHECK:       %uninit = alloca i128, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_int128_uninit()
+// PATTERN: store i128 -113427455640312821154458202477256070486, i128* %uninit, align
+// ZERO-LABEL: @test_int128_uninit()
+// ZERO: store i128 0, i128* %uninit, align
 
 TEST_BRACES(int128, __int128);
 // CHECK-LABEL: @test_int128_braces()
@@ -199,6 +264,10 @@
 // CHECK-LABEL: @test_uint128_uninit()
 // CHECK:       %uninit = alloca i128, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_uint128_uninit()
+// PATTERN: store i128 -113427455640312821154458202477256070486, i128* %uninit, align
+// ZERO-LABEL: @test_uint128_uninit()
+// ZERO: store i128 0, i128* %uninit, align
 
 TEST_BRACES(uint128, unsigned __int128);
 // CHECK-LABEL: @test_uint128_braces()
@@ -211,6 +280,10 @@
 // CHECK-LABEL: @test_fp16_uninit()
 // CHECK:       %uninit = alloca half, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_fp16_uninit()
+// PATTERN: store half 0xH7E00, half* %uninit, align
+// ZERO-LABEL: @test_fp16_uninit()
+// ZERO: store half 0xH0000, half* %uninit, align
 
 TEST_BRACES(fp16, __fp16);
 // CHECK-LABEL: @test_fp16_braces()
@@ -222,6 +295,10 @@
 // CHECK-LABEL: @test_float_uninit()
 // CHECK:       %uninit = alloca float, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_float_uninit()
+// PATTERN: store float 0x7FF8000000000000, float* %uninit, align
+// ZERO-LABEL: @test_float_uninit()
+// ZERO: store float 0.000000e+00, float* %uninit, align
 
 TEST_BRACES(float, float);
 // CHECK-LABEL: @test_float_braces()
@@ -233,6 +310,10 @@
 // CHECK-LABEL: @test_double_uninit()
 // CHECK:       %uninit = alloca double, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_double_uninit()
+// PATTERN: store double 0x7FF8000000000000, double* %uninit, align
+// ZERO-LABEL: @test_double_uninit()
+// ZERO: store double 0.000000e+00, double* %uninit, align
 
 TEST_BRACES(double, double);
 // CHECK-LABEL: @test_double_braces()
@@ -244,6 +325,10 @@
 // CHECK-LABEL: @test_longdouble_uninit()
 // CHECK:       %uninit = alloca x86_fp80, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_longdouble_uninit()
+// PATTERN: store x86_fp80 0xK7FFFC000000000000000, x86_fp80* %uninit, align
+// ZERO-LABEL: @test_longdouble_uninit()
+// ZERO: store x86_fp80 0xK00000000000000000000, x86_fp80* %uninit, align
 
 TEST_BRACES(longdouble, long double);
 // CHECK-LABEL: @test_longdouble_braces()
@@ -256,6 +341,10 @@
 // CHECK-LABEL: @test_intptr_uninit()
 // CHECK:       %uninit = alloca i32*, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_intptr_uninit()
+// PATTERN: store i32* inttoptr (i64 -6148914691236517206 to i32*), i32** %uninit, align
+// ZERO-LABEL: @test_intptr_uninit()
+// ZERO: store i32* null, i32** %uninit, align
 
 TEST_BRACES(intptr, int*);
 // CHECK-LABEL: @test_intptr_braces()
@@ -267,6 +356,10 @@
 // CHECK-LABEL: @test_intptrptr_uninit()
 // CHECK:       %uninit = alloca i32**, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_intptrptr_uninit()
+// PATTERN: store i32** inttoptr (i64 -6148914691236517206 to i32**), i32*** %uninit, align
+// ZERO-LABEL: @test_intptrptr_uninit()
+// ZERO: store i32** null, i32*** %uninit, align
 
 TEST_BRACES(intptrptr, int**);
 // CHECK-LABEL: @test_intptrptr_braces()
@@ -278,6 +371,10 @@
 // CHECK-LABEL: @test_function_uninit()
 // CHECK:       %uninit = alloca void ()*, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_function_uninit()
+// PATTERN: store void ()* inttoptr (i64 -6148914691236517206 to void ()*), void ()** %uninit, align
+// ZERO-LABEL: @test_function_uninit()
+// ZERO: store void ()* null, void ()** %uninit, align
 
 TEST_BRACES(function, void(*)());
 // CHECK-LABEL: @test_function_braces()
@@ -289,6 +386,10 @@
 // CHECK-LABEL: @test_bool_uninit()
 // CHECK:       %uninit = alloca i8, align
 // CHECK-NEXT:  call void @{{.*}}used{{.*}}%uninit)
+// PATTERN-LABEL: @test_bool_uninit()
+// PATTERN: store i8 -86, i8* %uninit, align 1
+// ZERO-LABEL: @test_bool_uninit()
+// ZERO: store i8 0, i8* %uninit, align 1
 
 TEST_BRACES(bool, bool);
 // CHECK-LABEL: @test_bool_braces()
Index: lib/Sema/SemaDeclAttr.cpp
===================================================================
--- lib/Sema/SemaDeclAttr.cpp
+++ lib/Sema/SemaDeclAttr.cpp
@@ -6009,6 +6009,28 @@
     handleSimpleAttributeWithExclusions<NoDestroyAttr, AlwaysDestroyAttr>(S, D, A);
 }
 
+static void handleTrivialAutoInitAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
+  if (AL.getNumArgs() != 1) {
+    S.Diag(AL.getLoc(), diag::err_attribute_wrong_number_arguments) << AL << 1;
+    return;
+  }
+  StringRef Str;
+  SourceLocation ArgLoc;
+  if (!S.checkStringLiteralArgumentAttr(AL, 0, Str, &ArgLoc))
+    return;
+  TrivialAutoInitAttr::TrivialAutoVarInitKind Kind;
+  if (!TrivialAutoInitAttr::ConvertStrToTrivialAutoVarInitKind(Str, Kind)) {
+    S.Diag(AL.getLoc(), diag::warn_attribute_type_not_supported)
+        << AL << Str << ArgLoc;
+    return;
+  }
+  assert(cast<VarDecl>(D)->getStorageDuration() == SD_Automatic &&
+         "trivial_auto_init is only valid on automatic duration variables");
+  unsigned Index = AL.getAttributeSpellingListIndex();
+  D->addAttr(::new (S.Context)
+                 TrivialAutoInitAttr(AL.getLoc(), S.Context, Kind, Index));
+}
+
 //===----------------------------------------------------------------------===//
 // Top Level Sema Entry Points
 //===----------------------------------------------------------------------===//
@@ -6687,6 +6709,10 @@
   case ParsedAttr::AT_NoDestroy:
     handleDestroyAttr(S, D, AL);
     break;
+
+  case ParsedAttr::AT_TrivialAutoInit:
+    handleTrivialAutoInitAttr(S, D, AL);
+    break;
   }
 }
 
Index: lib/Frontend/CompilerInvocation.cpp
===================================================================
--- lib/Frontend/CompilerInvocation.cpp
+++ lib/Frontend/CompilerInvocation.cpp
@@ -2811,6 +2811,19 @@
   case 3: Opts.setStackProtector(LangOptions::SSPReq); break;
   }
 
+  if (Arg *A = Args.getLastArg(OPT_ftrivial_auto_var_init)) {
+    StringRef Val = A->getValue();
+    if (Val == "uninitialized")
+      Opts.setTrivialAutoVarInit(
+          LangOptions::TrivialAutoVarInitKind::Uninitialized);
+    else if (Val == "zero")
+      Opts.setTrivialAutoVarInit(LangOptions::TrivialAutoVarInitKind::Zero);
+    else if (Val == "pattern")
+      Opts.setTrivialAutoVarInit(LangOptions::TrivialAutoVarInitKind::Pattern);
+    else
+      Diags.Report(diag::err_drv_invalid_value) << A->getAsString(Args) << Val;
+  }
+
   // Parse -fsanitize= arguments.
   parseSanitizerKinds("-fsanitize=", Args.getAllArgValues(OPT_fsanitize_EQ),
                       Diags, Opts.Sanitize);
Index: lib/Driver/ToolChains/Clang.cpp
===================================================================
--- lib/Driver/ToolChains/Clang.cpp
+++ lib/Driver/ToolChains/Clang.cpp
@@ -2453,6 +2453,47 @@
   }
 }
 
+static void RenderTrivialAutoVarInitOptions(const Driver &D,
+                                            const ToolChain &TC,
+                                            const ArgList &Args,
+                                            ArgStringList &CmdArgs) {
+  auto DefaultTrivialAutoVarInit = TC.GetDefaultTrivialAutoVarInit();
+  StringRef TrivialAutoVarInit = "";
+
+  for (const Arg *A : Args) {
+    switch (A->getOption().getID()) {
+    default:
+      continue;
+    case options::OPT_ftrivial_auto_var_init: {
+      A->claim();
+      StringRef Val = A->getValue();
+      if (Val == "uninitialized" || Val == "zero" || Val == "pattern")
+        TrivialAutoVarInit = Val;
+      else
+        D.Diag(diag::err_drv_unsupported_option_argument)
+            << A->getOption().getName() << Val;
+      break;
+    }
+    }
+  }
+
+  if (TrivialAutoVarInit.empty())
+    switch (DefaultTrivialAutoVarInit) {
+    case LangOptions::TrivialAutoVarInitKind::Uninitialized:
+      break;
+    case LangOptions::TrivialAutoVarInitKind::Pattern:
+      TrivialAutoVarInit = "pattern";
+      break;
+    case LangOptions::TrivialAutoVarInitKind::Zero:
+      TrivialAutoVarInit = "zero";
+      break;
+    }
+
+  if (!TrivialAutoVarInit.empty())
+    CmdArgs.push_back(
+        Args.MakeArgString("-ftrivial-auto-var-init=" + TrivialAutoVarInit));
+}
+
 static void RenderOpenCLOptions(const ArgList &Args, ArgStringList &CmdArgs) {
   const unsigned ForwardedArguments[] = {
       options::OPT_cl_opt_disable,
@@ -4426,6 +4467,7 @@
                   options::OPT_mno_speculative_load_hardening);
 
   RenderSSPOptions(TC, Args, CmdArgs, KernelOrKext);
+  RenderTrivialAutoVarInitOptions(D, TC, Args, CmdArgs);
 
   // Translate -mstackrealign
   if (Args.hasFlag(options::OPT_mstackrealign, options::OPT_mno_stackrealign,
Index: lib/CodeGen/CGDecl.cpp
===================================================================
--- lib/CodeGen/CGDecl.cpp
+++ lib/CodeGen/CGDecl.cpp
@@ -963,6 +963,69 @@
   return llvm::isBytewiseValue(Init);
 }
 
+static llvm::Constant *patternFor(CodeGenModule &CGM, llvm::Type *Ty) {
+  // The following value is a guaranteed unmappable pointer value and has a
+  // reperated byte-pattern which makes it easier to synthesize. We use it for
+  // pointers as well as integers so that aggregates are likely to be
+  // initialized with this repeated value.
+  constexpr uint64_t LargeValue = 0xAAAAAAAAAAAAAAAAull;
+  // For 32-bit platforms it's a bit trickier because, across systems, only the
+  // zero page can reasonably be expected to be unmapped, and even then we need
+  // a very low address. We use a smaller value, and that value sadly doesn't
+  // have a repeated byte-pattern. We don't use it for integers.
+  constexpr uint32_t SmallValue = 0x000000AA;
+  if (Ty->isIntOrIntVectorTy()) {
+    unsigned BitWidth = llvm::cast<llvm::IntegerType>(
+                            Ty->isVectorTy() ? Ty->getVectorElementType() : Ty)
+                            ->getBitWidth();
+    if (BitWidth <= 64)
+      return llvm::ConstantInt::get(Ty, LargeValue);
+    return llvm::ConstantInt::get(
+        Ty, llvm::APInt::getSplat(BitWidth, llvm::APInt(64, LargeValue)));
+  } else if (Ty->isPtrOrPtrVectorTy()) {
+    auto *PtrTy = llvm::cast<llvm::PointerType>(
+        Ty->isVectorTy() ? Ty->getVectorElementType() : Ty);
+    unsigned PtrWidth = CGM.getContext().getTargetInfo().getPointerWidth(
+        PtrTy->getAddressSpace());
+    llvm::Type *IntTy = llvm::IntegerType::get(CGM.getLLVMContext(), PtrWidth);
+    uint64_t IntValue;
+    switch (PtrWidth) {
+    default:
+      llvm_unreachable("pattern initialization of unsupported pointer width");
+    case 64:
+      IntValue = LargeValue;
+      break;
+    case 32:
+      IntValue = SmallValue;
+      break;
+    }
+    auto *Int = llvm::ConstantInt::get(IntTy, IntValue);
+    return llvm::ConstantExpr::getIntToPtr(Int, PtrTy);
+  } else if (Ty->isFPOrFPVectorTy()) {
+    return llvm::ConstantFP::getNaN(Ty);
+  } else if (Ty->isArrayTy()) {
+    // Note: this doesn't touch tail padding (at the end of an object, before
+    // the next array object). It is instead handled by replaceUndef.
+    auto *ArrTy = llvm::cast<llvm::ArrayType>(Ty);
+    llvm::SmallVector<llvm::Constant *, 8> Element(
+        ArrTy->getNumElements(), patternFor(CGM, ArrTy->getElementType()));
+    return llvm::ConstantArray::get(ArrTy, Element);
+  }
+
+  // Note: this doesn't touch struct padding. It will initialize as much union
+  // padding as is required for the largest type in the union. Padding is
+  // instead handled by replaceUndef. Stores to structs with volatile members
+  // don't have a volatile qualifier when initialized according to C++. This is
+  // fine because stack-based volatiles don't really have volatile semantics
+  // anyways, and the initialization shouldn't be observable.
+  auto *StructTy = llvm::cast<llvm::StructType>(Ty);
+  llvm::SmallVector<llvm::Constant *, 8> Struct(StructTy->getNumElements());
+  for (unsigned El = 0; El != Struct.size(); ++El) {
+    Struct[El] = patternFor(CGM, StructTy->getElementType(El));
+  }
+  return llvm::ConstantStruct::get(StructTy, Struct);
+}
+
 static Address createUnnamedGlobalFrom(CodeGenModule &CGM, const VarDecl &D,
                                        CGBuilderTy &Builder,
                                        llvm::Constant *Constant,
@@ -1010,23 +1073,29 @@
                                   Address Loc, bool isVolatile,
                                   CGBuilderTy &Builder,
                                   llvm::Constant *constant) {
+  auto *Ty = constant->getType();
+  bool isScalar = Ty->isIntOrIntVectorTy() || Ty->isPtrOrPtrVectorTy() ||
+                  Ty->isFPOrFPVectorTy();
+  if (isScalar) {
+    Builder.CreateStore(constant, Loc, isVolatile);
+    return;
+  }
+
   auto *Int8Ty = llvm::IntegerType::getInt8Ty(CGM.getLLVMContext());
   auto *IntPtrTy = CGM.getDataLayout().getIntPtrType(CGM.getLLVMContext());
 
   // If the initializer is all or mostly the same, codegen with bzero / memset
   // then do a few stores afterward.
-  uint64_t ConstantSize =
-      CGM.getDataLayout().getTypeAllocSize(constant->getType());
+  uint64_t ConstantSize = CGM.getDataLayout().getTypeAllocSize(Ty);
   auto *SizeVal = llvm::ConstantInt::get(IntPtrTy, ConstantSize);
   if (shouldUseBZeroPlusStoresToInitialize(constant, ConstantSize)) {
     Builder.CreateMemSet(Loc, llvm::ConstantInt::get(Int8Ty, 0), SizeVal,
                          isVolatile);
 
     bool valueAlreadyCorrect =
         constant->isNullValue() || isa<llvm::UndefValue>(constant);
     if (!valueAlreadyCorrect) {
-      Loc = Builder.CreateBitCast(
-          Loc, constant->getType()->getPointerTo(Loc.getAddressSpace()));
+      Loc = Builder.CreateBitCast(Loc, Ty->getPointerTo(Loc.getAddressSpace()));
       emitStoresForInitAfterBZero(CGM, constant, Loc, isVolatile, Builder);
     }
     return;
@@ -1051,6 +1120,63 @@
       SizeVal, isVolatile);
 }
 
+static void emitStoresForZeroInit(CodeGenModule &CGM, const VarDecl &D,
+                                  Address Loc, bool isVolatile,
+                                  CGBuilderTy &Builder) {
+  llvm::Type *ElTy = Loc.getElementType();
+  llvm::Constant *constant = llvm::Constant::getNullValue(ElTy);
+  emitStoresForConstant(CGM, D, Loc, isVolatile, Builder, constant);
+}
+
+static void emitStoresForPatternInit(CodeGenModule &CGM, const VarDecl &D,
+                                     Address Loc, bool isVolatile,
+                                     CGBuilderTy &Builder) {
+  llvm::Type *ElTy = Loc.getElementType();
+  llvm::Constant *constant = patternFor(CGM, ElTy);
+  assert(!isa<llvm::UndefValue>(constant));
+  emitStoresForConstant(CGM, D, Loc, isVolatile, Builder, constant);
+}
+
+static bool containsUndef(llvm::Constant *constant) {
+  auto *Ty = constant->getType();
+  if (isa<llvm::UndefValue>(constant))
+    return true;
+  if (Ty->isStructTy() || Ty->isArrayTy() || Ty->isVectorTy()) {
+    for (unsigned Op = 0, NumOp = constant->getNumOperands(); Op != NumOp;
+         ++Op) {
+      auto *OpValue = cast<llvm::Constant>(constant->getOperand(Op));
+      if (containsUndef(OpValue))
+        return true;
+    }
+  }
+  return false;
+}
+
+static llvm::Constant *replaceUndef(llvm::Constant *constant) {
+  // FIXME: when doing pattern initialization, replace undef with 0xAA instead.
+  // FIXME: also replace padding between values by creating a new struct type
+  //        which has no padding.
+  auto *Ty = constant->getType();
+  if (isa<llvm::UndefValue>(constant))
+    return llvm::Constant::getNullValue(Ty);
+  if (!(Ty->isStructTy() || Ty->isArrayTy() || Ty->isVectorTy()))
+    return constant;
+  if (!containsUndef(constant))
+    return constant;
+  llvm::SmallVector<llvm::Constant *, 8> Values(constant->getNumOperands());
+  for (unsigned Op = 0, NumOp = constant->getNumOperands(); Op != NumOp; ++Op) {
+    auto *OpValue = cast<llvm::Constant>(constant->getOperand(Op));
+    Values[Op] = replaceUndef(OpValue);
+  }
+  if (Ty->isStructTy()) {
+    return llvm::ConstantStruct::get(llvm::cast<llvm::StructType>(Ty), Values);
+  } else if (Ty->isArrayTy()) {
+    return llvm::ConstantArray::get(llvm::cast<llvm::ArrayType>(Ty), Values);
+  }
+  assert(Ty->isVectorTy());
+  return llvm::ConstantVector::get(Values);
+}
+
 /// EmitAutoVarDecl - Emit code and set up an entry in LocalDeclMap for a
 /// variable declaration with auto, register, or no storage class specifier.
 /// These turn into simple stack objects, or GlobalValues depending on target.
@@ -1442,6 +1568,8 @@
   auto DL = ApplyDebugLocation::CreateDefaultArtificial(*this, D.getLocation());
   QualType type = D.getType();
 
+  bool isVolatile = type.isVolatileQualified();
+
   // If this local has an initializer, emit it now.
   const Expr *Init = D.getInit();
 
@@ -1469,24 +1597,133 @@
     return;
   }
 
-  if (isTrivialInitializer(Init))
-    return;
-
   // Check whether this is a byref variable that's potentially
   // captured and moved by its own initializer.  If so, we'll need to
   // emit the initializer first, then copy into the variable.
-  bool capturedByInit = emission.IsEscapingByRef && isCapturedBy(D, Init);
+  bool capturedByInit =
+      Init && emission.IsEscapingByRef && isCapturedBy(D, Init);
 
   Address Loc =
-    capturedByInit ? emission.Addr : emission.getObjectAddress(*this);
+      capturedByInit ? emission.Addr : emission.getObjectAddress(*this);
+
+  // Note: constexpr already initializes everything correctly.
+  LangOptions::TrivialAutoVarInitKind trivialAutoVarInit;
+  if (D.isConstexpr()) {
+    trivialAutoVarInit = LangOptions::TrivialAutoVarInitKind::Uninitialized;
+  } else if (const TrivialAutoInitAttr *Attr =
+                 D.getAttr<TrivialAutoInitAttr>()) {
+    switch (Attr->getTrivialAutoVarInit()) {
+    case TrivialAutoInitAttr::Uninitialized:
+      trivialAutoVarInit = LangOptions::TrivialAutoVarInitKind::Uninitialized;
+      break;
+    case TrivialAutoInitAttr::Zero:
+      trivialAutoVarInit = LangOptions::TrivialAutoVarInitKind::Zero;
+      break;
+    case TrivialAutoInitAttr::Pattern:
+      trivialAutoVarInit = LangOptions::TrivialAutoVarInitKind::Pattern;
+      break;
+    }
+  } else {
+    trivialAutoVarInit = getContext().getLangOpts().getTrivialAutoVarInit();
+  }
+
+  auto initializeWhatIsTechnicallyUninitialized = [&]() {
+    if (trivialAutoVarInit !=
+        LangOptions::TrivialAutoVarInitKind::Uninitialized) {
+      CharUnits Size = getContext().getTypeSizeInChars(type);
+      if (Size.isZero()) {
+        if (const VariableArrayType *VlaType =
+                dyn_cast_or_null<VariableArrayType>(
+                    getContext().getAsArrayType(type))) {
+          // VLAs look zero-sized to getTypeInfo. We can't emit constant stores
+          // to them, so emit a memset with the VLA size to initialize them.
+          // Technically zero-sized or negative-sized VLAs are undefined, and
+          // UBSan will catch that code, but there exists code which generates
+          // zero-sized VLAs. Be nice and initialize whatever they requested.
+          auto VlaSize = getVLASize(VlaType);
+          auto SizeVal = VlaSize.NumElts;
+          CharUnits EltSize = getContext().getTypeSizeInChars(VlaSize.Type);
+          switch (trivialAutoVarInit) {
+          case LangOptions::TrivialAutoVarInitKind::Uninitialized:
+            llvm_unreachable("Handled above");
+
+          case LangOptions::TrivialAutoVarInitKind::Zero:
+            if (!EltSize.isOne())
+              SizeVal = Builder.CreateNUWMul(SizeVal, CGM.getSize(EltSize));
+            Builder.CreateMemSet(Loc, llvm::ConstantInt::get(Int8Ty, 0),
+                                 SizeVal, isVolatile);
+            break;
+
+          case LangOptions::TrivialAutoVarInitKind::Pattern: {
+            llvm::Type *ElTy = Loc.getElementType();
+            llvm::Constant *Constant = patternFor(CGM, ElTy);
+            CharUnits ConstantAlign =
+                getContext().getTypeAlignInChars(VlaSize.Type);
+            if (!EltSize.isOne())
+              SizeVal = Builder.CreateNUWMul(SizeVal, CGM.getSize(EltSize));
+            llvm::Value *BaseSizeInChars =
+                llvm::ConstantInt::get(IntPtrTy, EltSize.getQuantity());
+            Address Begin =
+                Builder.CreateElementBitCast(Loc, Int8Ty, "vla.begin");
+            llvm::Value *End = Builder.CreateInBoundsGEP(Begin.getPointer(),
+                                                         SizeVal, "vla.end");
+            llvm::BasicBlock *OriginBB = Builder.GetInsertBlock();
+            llvm::BasicBlock *LoopBB = createBasicBlock("vla-init.loop");
+            llvm::BasicBlock *ContBB = createBasicBlock("vla-init.cont");
+            EmitBlock(LoopBB);
+            llvm::PHINode *Cur =
+                Builder.CreatePHI(Begin.getType(), 2, "vla.cur");
+            Cur->addIncoming(Begin.getPointer(), OriginBB);
+            CharUnits CurAlign =
+                Loc.getAlignment().alignmentOfArrayElement(EltSize);
+            Builder.CreateMemCpy(Address(Cur, CurAlign),
+                                 createUnnamedGlobalFrom(
+                                     CGM, D, Builder, Constant, ConstantAlign),
+                                 BaseSizeInChars, isVolatile);
+            llvm::Value *Next = Builder.CreateInBoundsGEP(
+                Int8Ty, Cur, BaseSizeInChars, "vla.next");
+            llvm::Value *Done =
+                Builder.CreateICmpEQ(Next, End, "vla-init.isdone");
+            Builder.CreateCondBr(Done, ContBB, LoopBB);
+            Cur->addIncoming(Next, LoopBB);
+            EmitBlock(ContBB);
+          } break;
+          }
+        }
+        // We're done: this was either zero-sized, or a VLA.
+        return;
+      }
+    }
+
+    switch (trivialAutoVarInit) {
+    case LangOptions::TrivialAutoVarInitKind::Uninitialized:
+      // Leave trivial automatic variables uninitialized.
+      break;
+    case LangOptions::TrivialAutoVarInitKind::Zero:
+      emitStoresForZeroInit(CGM, D, Loc, isVolatile, Builder);
+      break;
+    case LangOptions::TrivialAutoVarInitKind::Pattern:
+      emitStoresForPatternInit(CGM, D, Loc, isVolatile, Builder);
+      break;
+    }
+  };
+
+  if (isTrivialInitializer(Init)) {
+    initializeWhatIsTechnicallyUninitialized();
+    return;
+  }
 
   llvm::Constant *constant = nullptr;
   if (emission.IsConstantAggregate || D.isConstexpr()) {
     assert(!capturedByInit && "constant init contains a capturing block?");
     constant = ConstantEmitter(*this).tryEmitAbstractForInitializer(D);
+    if (constant && trivialAutoVarInit !=
+                        LangOptions::TrivialAutoVarInitKind::Uninitialized)
+      constant = replaceUndef(constant);
   }
 
   if (!constant) {
+    initializeWhatIsTechnicallyUninitialized();
     LValue lv = MakeAddrLValue(Loc, type);
     lv.setNonGC(true);
     return EmitExprAsInit(Init, &D, lv, capturedByInit);
@@ -1499,10 +1736,6 @@
     return EmitStoreThroughLValue(RValue::get(constant), lv, true);
   }
 
-  // If this is a simple aggregate initialization, we can optimize it
-  // in various ways.
-  bool isVolatile = type.isVolatileQualified();
-
   llvm::Type *BP = CGM.Int8Ty->getPointerTo(Loc.getAddressSpace());
   if (Loc.getType() != BP)
     Loc = Builder.CreateBitCast(Loc, BP);
Index: include/clang/Driver/ToolChain.h
===================================================================
--- include/clang/Driver/ToolChain.h
+++ include/clang/Driver/ToolChain.h
@@ -10,9 +10,10 @@
 #ifndef LLVM_CLANG_DRIVER_TOOLCHAIN_H
 #define LLVM_CLANG_DRIVER_TOOLCHAIN_H
 
+#include "clang/Basic/DebugInfoOptions.h"
 #include "clang/Basic/LLVM.h"
+#include "clang/Basic/LangOptions.h"
 #include "clang/Basic/Sanitizers.h"
-#include "clang/Basic/DebugInfoOptions.h"
 #include "clang/Driver/Action.h"
 #include "clang/Driver/Multilib.h"
 #include "clang/Driver/Types.h"
@@ -349,6 +350,12 @@
     return 0;
   }
 
+  /// Get the default trivial automatic variable initialization.
+  virtual LangOptions::TrivialAutoVarInitKind
+  GetDefaultTrivialAutoVarInit() const {
+    return LangOptions::TrivialAutoVarInitKind::Uninitialized;
+  }
+
   /// GetDefaultLinker - Get the default linker to use.
   virtual const char *getDefaultLinker() const { return "ld"; }
 
Index: include/clang/Driver/Options.td
===================================================================
--- include/clang/Driver/Options.td
+++ include/clang/Driver/Options.td
@@ -1624,6 +1624,9 @@
   HelpText<"Use a strong heuristic to apply stack protectors to functions">;
 def fstack_protector : Flag<["-"], "fstack-protector">, Group<f_Group>,
   HelpText<"Enable stack protectors for functions potentially vulnerable to stack smashing">;
+def ftrivial_auto_var_init : Joined<["-"], "ftrivial-auto-var-init=">, Group<f_Group>,
+  Flags<[CC1Option]>, HelpText<"Initialize trivial automatic stack variables: uninitialized (default)"
+  " | zero | pattern">, Values<"uninitialized,zero,pattern">;
 def fstandalone_debug : Flag<["-"], "fstandalone-debug">, Group<f_Group>, Flags<[CoreOption]>,
   HelpText<"Emit full debug info for all types used by the program">;
 def fno_standalone_debug : Flag<["-"], "fno-standalone-debug">, Group<f_Group>, Flags<[CoreOption]>,
Index: include/clang/Basic/LangOptions.h
===================================================================
--- include/clang/Basic/LangOptions.h
+++ include/clang/Basic/LangOptions.h
@@ -54,6 +54,11 @@
   enum GCMode { NonGC, GCOnly, HybridGC };
   enum StackProtectorMode { SSPOff, SSPOn, SSPStrong, SSPReq };
 
+  // Automatic variables live on the stack, and when trivial they're usually
+  // uninitialized because it's undefined behavior to use them without
+  // initializing them.
+  enum class TrivialAutoVarInitKind { Uninitialized, Zero, Pattern };
+
   enum SignedOverflowBehaviorTy {
     // Default C standard behavior.
     SOB_Undefined,
Index: include/clang/Basic/LangOptions.def
===================================================================
--- include/clang/Basic/LangOptions.def
+++ include/clang/Basic/LangOptions.def
@@ -262,6 +262,8 @@
              "type symbol visibility")
 ENUM_LANGOPT(StackProtector, StackProtectorMode, 2, SSPOff,
              "stack protector mode")
+ENUM_LANGOPT(TrivialAutoVarInit, TrivialAutoVarInitKind, 2, TrivialAutoVarInitKind::Uninitialized,
+             "trivial automatic variable initialization")
 ENUM_LANGOPT(SignedOverflowBehavior, SignedOverflowBehaviorTy, 2, SOB_Undefined,
              "signed integer overflow handling")
 
Index: include/clang/Basic/AttrDocs.td
===================================================================
--- include/clang/Basic/AttrDocs.td
+++ include/clang/Basic/AttrDocs.td
@@ -3570,6 +3570,17 @@
   }];
 }
 
+def TrivialAutoInitDocs : Documentation {
+  let Category = DocCatVariable;
+  let Content = [{
+The command-line parameter ``-ftrivial-auto-var-init=*`` can be used to
+automatically initialize trivial automatic stack variables. By default, trivial
+automatic stack variables are uninitialized. This attribute is used to override
+the command-line parameter, and accepts a single parameter that must be one of
+the following: ``uninitialized``, ``zero``, or ``pattern``.
+  }];
+}
+
 def GnuInlineDocs : Documentation {
   let Category = DocCatFunction;
   let Content = [{
Index: include/clang/Basic/Attr.td
===================================================================
--- include/clang/Basic/Attr.td
+++ include/clang/Basic/Attr.td
@@ -3086,3 +3086,12 @@
   let Subjects = SubjectList<[Var]>;
   let Documentation = [AlwaysDestroyDocs];
 }
+
+def TrivialAutoInit : InheritableAttr {
+  let Spellings = [Clang<"trivial_auto_init", 1>];
+  let Args = [EnumArgument<"TrivialAutoVarInit", "TrivialAutoVarInitKind",
+                           ["uninitialized", "zero", "pattern"],
+                           ["Uninitialized", "Zero", "Pattern"]>];
+  let Subjects = SubjectList<[LocalVar]>;
+  let Documentation = [TrivialAutoInitDocs];
+}
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to