[PATCH] D47802: Allow std::vector to move construct its allocator

2018-06-05 Thread Scott Constable via Phabricator via cfe-commits
fidget324 created this revision.
fidget324 added reviewers: hiraditya, EricWF.
Herald added a subscriber: cfe-commits.

Fix an issue that was preventing std::vector from invoking the move
constructor on its allocator when appropriate.

Added a constructor to __vector_base which accepts an rvalue reference
 to the allocator, thus allowing the move constructor to be invoked.
 Previously, only a const lvalue reference was being accepted.

This fixes bug 37694: https://bugs.llvm.org/show_bug.cgi?id=37694.


Repository:
  rCXX libc++

https://reviews.llvm.org/D47802

Files:
  include/vector


Index: include/vector
===
--- include/vector
+++ include/vector
@@ -355,6 +355,7 @@
 __vector_base()
 _NOEXCEPT_(is_nothrow_default_constructible::value);
 _LIBCPP_INLINE_VISIBILITY __vector_base(const allocator_type& __a);
+_LIBCPP_INLINE_VISIBILITY __vector_base(allocator_type&& __a);
 ~__vector_base();
 
 _LIBCPP_INLINE_VISIBILITY
@@ -438,6 +439,15 @@
 {
 }
 
+template 
+inline _LIBCPP_INLINE_VISIBILITY
+__vector_base<_Tp, _Allocator>::__vector_base(allocator_type&& __a)
+: __begin_(nullptr),
+  __end_(nullptr),
+  __end_cap_(nullptr, std::move(__a))
+{
+}
+
 template 
 __vector_base<_Tp, _Allocator>::~__vector_base()
 {


Index: include/vector
===
--- include/vector
+++ include/vector
@@ -355,6 +355,7 @@
 __vector_base()
 _NOEXCEPT_(is_nothrow_default_constructible::value);
 _LIBCPP_INLINE_VISIBILITY __vector_base(const allocator_type& __a);
+_LIBCPP_INLINE_VISIBILITY __vector_base(allocator_type&& __a);
 ~__vector_base();
 
 _LIBCPP_INLINE_VISIBILITY
@@ -438,6 +439,15 @@
 {
 }
 
+template 
+inline _LIBCPP_INLINE_VISIBILITY
+__vector_base<_Tp, _Allocator>::__vector_base(allocator_type&& __a)
+: __begin_(nullptr),
+  __end_(nullptr),
+  __end_cap_(nullptr, std::move(__a))
+{
+}
+
 template 
 __vector_base<_Tp, _Allocator>::~__vector_base()
 {
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D47802: Allow std::vector to move construct its allocator

2018-06-05 Thread Scott Constable via Phabricator via cfe-commits
fidget324 abandoned this revision.
fidget324 added a comment.

That's fine. Your patch was much more thorough, anyways.


Repository:
  rCXX libc++

https://reviews.llvm.org/D47802



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75938: [DO NOT MERGE] X86 Mitigate for Load Value Injection (LVI)--All Code

2020-05-26 Thread Scott Constable via Phabricator via cfe-commits
sconstab abandoned this revision.
sconstab added a comment.

Changes have been merged.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75938/new/

https://reviews.llvm.org/D75938



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79910: [x86][seses] Add clang flag; Use lvi-cfi with seses

2020-05-26 Thread Scott Constable via Phabricator via cfe-commits
sconstab added inline comments.



Comment at: clang/lib/Driver/ToolChains/Arch/X86.cpp:200
+if (!Args.hasArg(options::OPT_mno_lvi_cfi)) {
+  Features.push_back("+lvi-cfi");
+  LVIOpt = options::OPT_mlvi_cfi;

Would it be better to add `FeatureLVIControlFlowIntegrity` as a dependency for 
`FeatureSpeculativeExecutionSideEffectSuppression` in 
`llvm/lib/Target/X86/X86.td`?



Comment at: 
llvm/lib/Target/X86/X86SpeculativeExecutionSideEffectSuppression.cpp:90
+  const X86Subtarget &Subtarget = MF.getSubtarget();
+  if (!Subtarget.useSpeculativeExecutionSideEffectSuppression() &&
+  !EnableSpeculativeExecutionSideEffectSuppression)

Is it really necessary to have the target feature and the CLI flag? If SESES is 
required for, say, a *.ll file, then `+seses` can always be added as a target 
feature.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79910/new/

https://reviews.llvm.org/D79910



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77427: [X86] Add tests to clang Driver to ensure that SLH/Retpoline features are not enabled with LVI-CFI

2020-04-03 Thread Scott Constable via Phabricator via cfe-commits
sconstab created this revision.
sconstab added a reviewer: craig.topper.

https://reviews.llvm.org/D77427

Files:
  clang/test/Driver/x86-target-features.c


Index: clang/test/Driver/x86-target-features.c
===
--- clang/test/Driver/x86-target-features.c
+++ clang/test/Driver/x86-target-features.c
@@ -159,6 +159,13 @@
 // LVICFI: "-target-feature" "+lvi-cfi"
 // NO-LVICFI-NOT: lvi-cfi
 
+// RUN: %clang -target i386-linux-gnu -mlvi-cfi -mspeculative-load-hardening 
%s -### -o %t.o 2>&1 | FileCheck -check-prefix=LVICFI-SLH %s
+// LVICFI-SLH: error: invalid argument 'mspeculative-load-hardening' not 
allowed with 'mlvi-cfi'
+// RUN: %clang -target i386-linux-gnu -mlvi-cfi -mretpoline %s -### -o %t.o 
2>&1 | FileCheck -check-prefix=LVICFI-RETPOLINE %s
+// LVICFI-RETPOLINE: error: invalid argument 'mretpoline' not allowed with 
'mlvi-cfi'
+// RUN: %clang -target i386-linux-gnu -mlvi-cfi -mretpoline-external-thunk %s 
-### -o %t.o 2>&1 | FileCheck -check-prefix=LVICFI-RETPOLINE-EXTERNAL-THUNK %s
+// LVICFI-RETPOLINE-EXTERNAL-THUNK: error: invalid argument 
'mretpoline-external-thunk' not allowed with 'mlvi-cfi'
+
 // RUN: %clang -target i386-linux-gnu -mwaitpkg %s -### -o %t.o 2>&1 | 
FileCheck -check-prefix=WAITPKG %s
 // RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | 
FileCheck -check-prefix=NO-WAITPKG %s
 // WAITPKG: "-target-feature" "+waitpkg"


Index: clang/test/Driver/x86-target-features.c
===
--- clang/test/Driver/x86-target-features.c
+++ clang/test/Driver/x86-target-features.c
@@ -159,6 +159,13 @@
 // LVICFI: "-target-feature" "+lvi-cfi"
 // NO-LVICFI-NOT: lvi-cfi
 
+// RUN: %clang -target i386-linux-gnu -mlvi-cfi -mspeculative-load-hardening %s -### -o %t.o 2>&1 | FileCheck -check-prefix=LVICFI-SLH %s
+// LVICFI-SLH: error: invalid argument 'mspeculative-load-hardening' not allowed with 'mlvi-cfi'
+// RUN: %clang -target i386-linux-gnu -mlvi-cfi -mretpoline %s -### -o %t.o 2>&1 | FileCheck -check-prefix=LVICFI-RETPOLINE %s
+// LVICFI-RETPOLINE: error: invalid argument 'mretpoline' not allowed with 'mlvi-cfi'
+// RUN: %clang -target i386-linux-gnu -mlvi-cfi -mretpoline-external-thunk %s -### -o %t.o 2>&1 | FileCheck -check-prefix=LVICFI-RETPOLINE-EXTERNAL-THUNK %s
+// LVICFI-RETPOLINE-EXTERNAL-THUNK: error: invalid argument 'mretpoline-external-thunk' not allowed with 'mlvi-cfi'
+
 // RUN: %clang -target i386-linux-gnu -mwaitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=WAITPKG %s
 // RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WAITPKG %s
 // WAITPKG: "-target-feature" "+waitpkg"
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77431: [X86] Add tests to clang Driver to ensure that SLH/Retpoline features are not enabled with LVI-hardening

2020-04-03 Thread Scott Constable via Phabricator via cfe-commits
sconstab created this revision.
sconstab added a reviewer: craig.topper.

https://reviews.llvm.org/D77431

Files:
  clang/test/Driver/x86-target-features.c


Index: clang/test/Driver/x86-target-features.c
===
--- clang/test/Driver/x86-target-features.c
+++ clang/test/Driver/x86-target-features.c
@@ -164,6 +164,13 @@
 // LVIHARDENING: "-target-feature" "+lvi-load-hardening" "-target-feature" 
"+lvi-cfi"
 // NO-LVIHARDENING-NOT: lvi
 
+// RUN: %clang -target i386-linux-gnu -mlvi-hardening 
-mspeculative-load-hardening %s -### -o %t.o 2>&1 | FileCheck 
-check-prefix=LVIHARDENING-SLH %s
+// LVIHARDENING-SLH: error: invalid argument 'mspeculative-load-hardening' not 
allowed with 'mlvi-hardening'
+// RUN: %clang -target i386-linux-gnu -mlvi-hardening -mretpoline %s -### -o 
%t.o 2>&1 | FileCheck -check-prefix=LVIHARDENING-RETPOLINE %s
+// LVIHARDENING-RETPOLINE: error: invalid argument 'mretpoline' not allowed 
with 'mlvi-hardening'
+// RUN: %clang -target i386-linux-gnu -mlvi-hardening 
-mretpoline-external-thunk %s -### -o %t.o 2>&1 | FileCheck 
-check-prefix=LVIHARDENING-RETPOLINE-EXTERNAL-THUNK %s
+// LVIHARDENING-RETPOLINE-EXTERNAL-THUNK: error: invalid argument 
'mretpoline-external-thunk' not allowed with 'mlvi-hardening'
+
 // RUN: %clang -target i386-linux-gnu -mwaitpkg %s -### -o %t.o 2>&1 | 
FileCheck -check-prefix=WAITPKG %s
 // RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | 
FileCheck -check-prefix=NO-WAITPKG %s
 // WAITPKG: "-target-feature" "+waitpkg"


Index: clang/test/Driver/x86-target-features.c
===
--- clang/test/Driver/x86-target-features.c
+++ clang/test/Driver/x86-target-features.c
@@ -164,6 +164,13 @@
 // LVIHARDENING: "-target-feature" "+lvi-load-hardening" "-target-feature" "+lvi-cfi"
 // NO-LVIHARDENING-NOT: lvi
 
+// RUN: %clang -target i386-linux-gnu -mlvi-hardening -mspeculative-load-hardening %s -### -o %t.o 2>&1 | FileCheck -check-prefix=LVIHARDENING-SLH %s
+// LVIHARDENING-SLH: error: invalid argument 'mspeculative-load-hardening' not allowed with 'mlvi-hardening'
+// RUN: %clang -target i386-linux-gnu -mlvi-hardening -mretpoline %s -### -o %t.o 2>&1 | FileCheck -check-prefix=LVIHARDENING-RETPOLINE %s
+// LVIHARDENING-RETPOLINE: error: invalid argument 'mretpoline' not allowed with 'mlvi-hardening'
+// RUN: %clang -target i386-linux-gnu -mlvi-hardening -mretpoline-external-thunk %s -### -o %t.o 2>&1 | FileCheck -check-prefix=LVIHARDENING-RETPOLINE-EXTERNAL-THUNK %s
+// LVIHARDENING-RETPOLINE-EXTERNAL-THUNK: error: invalid argument 'mretpoline-external-thunk' not allowed with 'mlvi-hardening'
+
 // RUN: %clang -target i386-linux-gnu -mwaitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=WAITPKG %s
 // RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WAITPKG %s
 // WAITPKG: "-target-feature" "+waitpkg"
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75936: Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]

2020-04-03 Thread Scott Constable via Phabricator via cfe-commits
sconstab added inline comments.



Comment at: llvm/lib/Target/X86/ImmutableGraph.h:318
+}
+auto *VertexArray = new Node[VertexSize + 1 /* terminator node */];
+auto *EdgeArray = new Edge[EdgeSize];

mattdr wrote:
> As a general rule `new` is a code-smell in modern C++. This should be a 
> `vector`.
@mattdr I do agree with the general rule. I also think that in this case where 
the structure is immutable, std::vector is wasteful because it needs to keep 
separate values for the current number of elements and the current capacity. At 
local scope within a function the unneeded value would likely be optimized 
away, but then there would be an awkward handoff to transfer the data from the 
vector to the array members.

I would not want to see the array members changed to vectors, unless LLVM 
provides an encapsulated array structure that does not need to grow and shrink.



Comment at: llvm/lib/Target/X86/ImmutableGraph.h:335
+VertexArray[VI].__edges = EdgeArray + EdgeSize; // terminator node
+return new GraphT{VertexArray, VertexSize, EdgeArray, EdgeSize,
+  std::forward(Args)...};

mattdr wrote:
> this should return a `unique_ptr` to signal ownership transfer
Yes, agree.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75936/new/

https://reviews.llvm.org/D75936



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75936: Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]

2020-04-04 Thread Scott Constable via Phabricator via cfe-commits
sconstab added inline comments.



Comment at: llvm/lib/Target/X86/ImmutableGraph.h:318
+}
+auto *VertexArray = new Node[VertexSize + 1 /* terminator node */];
+auto *EdgeArray = new Edge[EdgeSize];

mattdr wrote:
> sconstab wrote:
> > mattdr wrote:
> > > As a general rule `new` is a code-smell in modern C++. This should be a 
> > > `vector`.
> > @mattdr I do agree with the general rule. I also think that in this case 
> > where the structure is immutable, std::vector is wasteful because it needs 
> > to keep separate values for the current number of elements and the current 
> > capacity. At local scope within a function the unneeded value would likely 
> > be optimized away, but then there would be an awkward handoff to transfer 
> > the data from the vector to the array members.
> > 
> > I would not want to see the array members changed to vectors, unless LLVM 
> > provides an encapsulated array structure that does not need to grow and 
> > shrink.
> So, first: I'm glad you removed the unnecessary use of `new[]` here and the 
> corresponding (and error-prone!) use of `delete[]` later. That removes a 
> memory leak LLVM won't have to debug.
> 
> You suggest here that something other than `std::vector` would be more 
> efficient. If so, would `std::array` suffice? If not, can you explain why 
> static allocation is impossible but dynamic allocation would be too expensive?
A statically sized array (e.g., std::array) is insufficient because the size in 
this case is not compiler determinable; a dynamically sized and dynamically 
resizable array (e.g., std::vector) is sufficient but overly costly; a 
dynamically sized and dynamically //unresizable// array is sufficient and has 
minimal cost.



Comment at: llvm/lib/Target/X86/ImmutableGraph.h:13
+/// The advantages to this implementation are two-fold:
+/// 1. Iteration and traversal operations should experience terrific caching
+///performance.

mattdr wrote:
> erm, "terrific"? If there's a substantive argument w.r.t. cache locality 
> etc., please make it explicit.
This is valid. I will reword.



Comment at: llvm/lib/Target/X86/ImmutableGraph.h:16
+/// 2. Set representations and operations on nodes and edges become
+///extraordinarily efficient. For instance, a set of edges is implemented 
as
+///a bit vector, wherein each bit corresponds to one edge in the edge

mattdr wrote:
> "extraordinarily" is, again, not a useful engineering categorization. Please 
> restrict comments to describing quantifiable claims of complexity.
AFAIK there is not a precise engineering term for "tiny O(1)." Nonetheless I 
will reword.



Comment at: llvm/lib/Target/X86/ImmutableGraph.h:40
+
+template 
+class ImmutableGraph {

mattdr wrote:
> Every template argument for a class represents combinatorial addition of 
> complexity for the resulting code. Why do each of these template arguments 
> need to exist? in particular, why does SizeT need to exist?
I suspect that there may be more uses for this data structure and that 
eventually it may migrate to ADT. I have SizeT as a template argument because I 
found it plenty sufficient to have `int` as the size parameter for the array 
bounds, but I suspect other uses may require `size_t`.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75936/new/

https://reviews.llvm.org/D75936



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75936: Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]

2020-04-04 Thread Scott Constable via Phabricator via cfe-commits
sconstab added inline comments.



Comment at: llvm/lib/Target/X86/ImmutableGraph.h:285
+  std::unique_ptr Edges;
+  size_type EdgesSize;
+};

@craig.topper It now occurs to me that these fields should probably be 
reordered to:
```
  std::unique_ptr Nodes;
  std::unique_ptr Edges;
  size_type NodesSize;
  size_type EdgesSize;
```
The current ordering will cause internal fragmentation.

Old ordering:
```
static_assert(sizeof(ImmutableGraph) == 32);
```
New ordering:
```
static_assert(sizeof(ImmutableGraph) == 24);
```
With vectors instead of arrays:
```
static_assert(sizeof(ImmutableGraph) == 48);
```


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75936/new/

https://reviews.llvm.org/D75936



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75936: Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]

2020-04-04 Thread Scott Constable via Phabricator via cfe-commits
sconstab added a comment.

Overall, the restyling by @craig.topper looks much better than what I had 
committed before. I agree that `std::unique_ptr` is the right "container" 
in this circumstance. And the addition of `ArrayRef<>` accessors is also a nice 
touch. A few extra inline comments.




Comment at: llvm/lib/Target/X86/ImmutableGraph.h:13
+/// The advantages to this implementation are two-fold:
+/// 1. Iteration and traversal operations should experience terrific caching
+///performance.

sconstab wrote:
> mattdr wrote:
> > erm, "terrific"? If there's a substantive argument w.r.t. cache locality 
> > etc., please make it explicit.
> This is valid. I will reword.
"Iteration and traversal operations benefit from cache locality."



Comment at: llvm/lib/Target/X86/ImmutableGraph.h:16
+/// 2. Set representations and operations on nodes and edges become
+///extraordinarily efficient. For instance, a set of edges is implemented 
as
+///a bit vector, wherein each bit corresponds to one edge in the edge

sconstab wrote:
> mattdr wrote:
> > "extraordinarily" is, again, not a useful engineering categorization. 
> > Please restrict comments to describing quantifiable claims of complexity.
> AFAIK there is not a precise engineering term for "tiny O(1)." Nonetheless I 
> will reword.
"Operations on sets of nodes/edges are efficient, and representations of those 
sets in memory are compact. For instance..."



Comment at: llvm/lib/Target/X86/ImmutableGraph.h:84
+  : Nodes(std::move(Nodes)), NodesSize(NodesSize), Edges(std::move(Edges)),
+EdgesSize(EdgesSize) {}
+  ImmutableGraph(const ImmutableGraph &) = delete;

After the members are reordered, this list must also be reordered.



Comment at: llvm/lib/Target/X86/ImmutableGraph.h:103
+
+  class NodeSet {
+friend class iterator;

This had not occurred to me until now, but a lot of code is shared between 
`NodeSet` and `EdgeSet`. Maybe a template could reduce the redundancy?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75936/new/

https://reviews.llvm.org/D75936



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75936: Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]

2020-04-04 Thread Scott Constable via Phabricator via cfe-commits
sconstab added inline comments.



Comment at: llvm/lib/Target/X86/ImmutableGraph.h:307
+public:
+  using NodeRef = size_type;
+

Just noticed that `ImmutableGraphBuilder` and `ImmutableGraph` have 
non-identical types called `NodeRef`. Suggest renaming this one to 
`BuilderNodeRef`.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75936/new/

https://reviews.llvm.org/D75936



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75936: Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]

2020-04-07 Thread Scott Constable via Phabricator via cfe-commits
sconstab added a comment.

Summary points for @craig.topper who has commandeered this diff:

- fix the typo that Matt pointed out
- `SizeT` should not be a template parameter, and `size_type` should be fixed 
to `int`.
- Maybe have a member reference in `MachineGadgetGraph` to the associated 
`MachineFunction`.
- Determine how this pass (and other X86 machine passes) should fail on 
unsupported X86 subtargets.




Comment at: llvm/lib/Target/X86/ImmutableGraph.h:41
+class ImmutableGraph {
+  using Traits = GraphTraits *>;
+  template  friend class ImmutableGraphBuilder;

mattdr wrote:
> I think this self-reference to `ImmutableGraph` dropped the `SizeT` parameter.
Yup. Good catch.



Comment at: llvm/lib/Target/X86/ImmutableGraph.h:73
+// The end of this Node's edges is the beginning of the next node's edges.
+const Edge *edges_end() const { return (this + 1)->Edges; }
+ArrayRef edges() const {

mattdr wrote:
> Seems like you also want to add a comment here that we know we will never be 
> asked for `edges_end` for the last stored node -- that is, we know that `this 
> + 1` always refers to a valid Node (which is presumably a dummy/sentinel)
Not sure I agree. I cannot think of a conventional use of this interface that 
would perform an operation on the sentinel.
```
G->nodes_end().edges_end(); // invalid use of any end iterator
SomeNode.edges_end(); // invalid if SomeNode == G->nodes_end()
```
That is, the way that we "know" that we will never be asked for `edges_end()` 
for the last stored node is that the ask itself would already violate C++ 
conventions.



Comment at: llvm/lib/Target/X86/ImmutableGraph.h:79
+
+protected:
+  ImmutableGraph(std::unique_ptr Nodes, std::unique_ptr Edges,

mattdr wrote:
> Why "protected" rather than "private"? Usually seeing "protected" makes me 
> think subclassing is expected, but that doesn't seem to be the case here.
The `MachineGadgetGraph` class actually does subclass `ImmutableGraph` to add 
some contextual information. I did not want the constructors for 
`ImmutableGraph` to be public, because the intent is to use the builder. So 
`protected` seemed like the best option.



Comment at: llvm/lib/Target/X86/ImmutableGraph.h:117
+NodeSet(const ImmutableGraph &G, bool ContainsAll = false)
+: G{G}, V{static_cast(G.nodes_size()), ContainsAll} {}
+bool insert(const Node &N) {

mattdr wrote:
> How do we know that a value of `size_type` (aka `SizeT`) can be cast to 
> `unsigned` without truncation?
Ah. We do not know that. We could have a static assert here, but maybe the best 
thing to do would be to follow Matt's earlier advice and fix `size_type` to 
`int`, rather than have it as a template parameter. Anything larger would break 
the `BitVectors` and/or waste space.



Comment at: llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp:113
+  NumFences(NumFences), NumGadgets(NumGadgets) {}
+MachineFunction &getMF() { // FIXME: This function should be cleaner
+  for (const Node &N : nodes())

mattdr wrote:
> Cleaner how?
Maybe by keeping a member reference to the associated `MachineFunction`?



Comment at: llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp:233
+  if (!STI->useLVILoadHardening() || !STI->is64Bit())
+return false; // FIXME: support 32-bit
+

mattdr wrote:
> If the user requests hardening and we can't do it, it seems better to fail 
> loudly so they don't accidentally deploy an unmitigated binary.
@craig.topper I think this is related to the discussion we were having about 
what would happen for SLH on unsupported subtargets. I'm not sure what the most 
appropriate solution would be.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75936/new/

https://reviews.llvm.org/D75936



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75936: Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]

2020-04-07 Thread Scott Constable via Phabricator via cfe-commits
sconstab added inline comments.



Comment at: llvm/lib/Target/X86/ImmutableGraph.h:73
+// The end of this Node's edges is the beginning of the next node's edges.
+const Edge *edges_end() const { return (this + 1)->Edges; }
+ArrayRef edges() const {

mattdr wrote:
> sconstab wrote:
> > mattdr wrote:
> > > Seems like you also want to add a comment here that we know we will never 
> > > be asked for `edges_end` for the last stored node -- that is, we know 
> > > that `this + 1` always refers to a valid Node (which is presumably a 
> > > dummy/sentinel)
> > Not sure I agree. I cannot think of a conventional use of this interface 
> > that would perform an operation on the sentinel.
> > ```
> > G->nodes_end().edges_end(); // invalid use of any end iterator
> > SomeNode.edges_end(); // invalid if SomeNode == G->nodes_end()
> > ```
> > That is, the way that we "know" that we will never be asked for 
> > `edges_end()` for the last stored node is that the ask itself would already 
> > violate C++ conventions.
> I believe any operation on the last `Node` in the array will end up accessing 
> the sentinel:
> 
> ```
> Node* LastNode = G->nodes_begin() + (G->nodes_size() - 1);  // valid 
> reference to the last node
> LastNode->edges_end();  // uses `this+1`, accessing the sentinel value in the 
> Nodes array
> ```
> 
`G->nodes_size()` will return the size without the sentinel node, so your 
example should actually operate on the last data node. Right?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75936/new/

https://reviews.llvm.org/D75936



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75936: Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]

2020-04-09 Thread Scott Constable via Phabricator via cfe-commits
sconstab added inline comments.



Comment at: llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp:233
+  if (!STI->useLVILoadHardening() || !STI->is64Bit())
+return false; // FIXME: support 32-bit
+

craig.topper wrote:
> sconstab wrote:
> > mattdr wrote:
> > > If the user requests hardening and we can't do it, it seems better to 
> > > fail loudly so they don't accidentally deploy an unmitigated binary.
> > @craig.topper I think this is related to the discussion we were having 
> > about what would happen for SLH on unsupported subtargets. I'm not sure 
> > what the most appropriate solution would be.
> Added a fatal error. Which isn't great as it will generate a crash report in 
> clang. But it will tell the user to file a compiler bug so I guess that's 
> something.
Would it be better to have
```
report_fatal_error("LVI load hardening is only supported on 64-bit "
   "targets.", false);
```
So that the crash diagnostic is not generated?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75936/new/

https://reviews.llvm.org/D75936



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79910: [x86][seses] Add clang flag; Use lvi-cfi with seses

2020-06-15 Thread Scott Constable via Phabricator via cfe-commits
sconstab added a comment.

Any progress on this patch? D75939  has been 
merged, but the SESES feature will not be secure until it has CFI protections.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79910/new/

https://reviews.llvm.org/D79910



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79910: [x86][seses] Add clang flag; Use lvi-cfi with seses

2020-06-29 Thread Scott Constable via Phabricator via cfe-commits
sconstab accepted this revision.
sconstab added a comment.
This revision is now accepted and ready to land.

LGTM.




Comment at: clang/lib/Driver/ToolChains/Arch/X86.cpp:200
+if (!Args.hasArg(options::OPT_mno_lvi_cfi)) {
+  Features.push_back("+lvi-cfi");
+  LVIOpt = options::OPT_mlvi_cfi;

zbrid wrote:
> sconstab wrote:
> > Would it be better to add `FeatureLVIControlFlowIntegrity` as a dependency 
> > for `FeatureSpeculativeExecutionSideEffectSuppression` in 
> > `llvm/lib/Target/X86/X86.td`?
> Thanks for the tip! Yeah, I will update to do that, but it looks like that 
> only ensures an error will be thrown if they aren't used together rather than 
> ensuring one is enabled when the other is enabled. Am I misunderstanding?
I'm not certain about this either. @craig.topper opinion?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79910/new/

https://reviews.llvm.org/D79910



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76458: Add Indirect Thunk Support to X86 to mitigate Load Value Injection (LVI) [by modifying X86RetpolineThunks.cpp]

2020-04-23 Thread Scott Constable via Phabricator via cfe-commits
sconstab abandoned this revision.
sconstab added a comment.

Superseded by D76810 , D76811 
, and D76812 .


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76458/new/

https://reviews.llvm.org/D76458



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75936: Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]

2020-04-27 Thread Scott Constable via Phabricator via cfe-commits
sconstab updated this revision to Diff 260439.
sconstab added a comment.

Removed the `-x86-lvi-no-fixed` CLI flag. This change simplifies the code flow 
quite a bit.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75936/new/

https://reviews.llvm.org/D75936

Files:
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Arch/X86.cpp
  clang/test/Driver/x86-target-features.c
  llvm/lib/Target/X86/CMakeLists.txt
  llvm/lib/Target/X86/ImmutableGraph.h
  llvm/lib/Target/X86/X86.h
  llvm/lib/Target/X86/X86.td
  llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
  llvm/lib/Target/X86/X86Subtarget.h
  llvm/lib/Target/X86/X86TargetMachine.cpp
  llvm/test/CodeGen/X86/O0-pipeline.ll
  llvm/test/CodeGen/X86/O3-pipeline.ll
  llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll

Index: llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll
===
--- /dev/null
+++ llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll
@@ -0,0 +1,129 @@
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown -x86-lvi-load-dot-verify -o %t < %s | FileCheck %s
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local i32 @test(i32* %untrusted_user_ptr, i32* %secret, i32 %secret_size) #0 {
+entry:
+  %untrusted_user_ptr.addr = alloca i32*, align 8
+  %secret.addr = alloca i32*, align 8
+  %secret_size.addr = alloca i32, align 4
+  %ret_val = alloca i32, align 4
+  %i = alloca i32, align 4
+  store i32* %untrusted_user_ptr, i32** %untrusted_user_ptr.addr, align 8
+  store i32* %secret, i32** %secret.addr, align 8
+  store i32 %secret_size, i32* %secret_size.addr, align 4
+  store i32 0, i32* %ret_val, align 4
+  call void @llvm.x86.sse2.lfence()
+  store i32 0, i32* %i, align 4
+  br label %for.cond
+
+for.cond: ; preds = %for.inc, %entry
+  %0 = load i32, i32* %i, align 4
+  %1 = load i32, i32* %secret_size.addr, align 4
+  %cmp = icmp slt i32 %0, %1
+  br i1 %cmp, label %for.body, label %for.end
+
+for.body: ; preds = %for.cond
+  %2 = load i32, i32* %i, align 4
+  %rem = srem i32 %2, 2
+  %cmp1 = icmp eq i32 %rem, 0
+  br i1 %cmp1, label %if.then, label %if.else
+
+if.then:  ; preds = %for.body
+  %3 = load i32*, i32** %secret.addr, align 8
+  %4 = load i32, i32* %ret_val, align 4
+  %idxprom = sext i32 %4 to i64
+  %arrayidx = getelementptr inbounds i32, i32* %3, i64 %idxprom
+  %5 = load i32, i32* %arrayidx, align 4
+  %6 = load i32*, i32** %untrusted_user_ptr.addr, align 8
+  store i32 %5, i32* %6, align 4
+  br label %if.end
+
+if.else:  ; preds = %for.body
+  %7 = load i32*, i32** %secret.addr, align 8
+  %8 = load i32, i32* %ret_val, align 4
+  %idxprom2 = sext i32 %8 to i64
+  %arrayidx3 = getelementptr inbounds i32, i32* %7, i64 %idxprom2
+  store i32 42, i32* %arrayidx3, align 4
+  br label %if.end
+
+if.end:   ; preds = %if.else, %if.then
+  %9 = load i32*, i32** %untrusted_user_ptr.addr, align 8
+  %10 = load i32, i32* %9, align 4
+  store i32 %10, i32* %ret_val, align 4
+  br label %for.inc
+
+for.inc:  ; preds = %if.end
+  %11 = load i32, i32* %i, align 4
+  %inc = add nsw i32 %11, 1
+  store i32 %inc, i32* %i, align 4
+  br label %for.cond
+
+for.end:  ; preds = %for.cond
+  %12 = load i32, i32* %ret_val, align 4
+  ret i32 %12
+}
+
+; CHECK:  digraph "Speculative gadgets for \"test\" function" {
+; CHECK-NEXT: label="Speculative gadgets for \"test\" function";
+; CHECK:  Node0x{{[0-9a-f]+}} [shape=record,color = green,label="{LFENCE\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 0];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $eax = MOV32rm %stack.4.i, 1, $noreg, 0, $noreg :: (dereferenceable load 4 from %ir.i)\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{JCC_1 %bb.6, 13, implicit killed $eflags\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{CMP32rm killed renamable $eax, %stack.2.secret_size.addr, 1, $noreg, 0, $noreg, implicit-def $eflags :: (dereferenceable load 4 from %ir.secret_size.addr)\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $eax = MOV32rm %stack.4.i, 1, $noreg, 0, $noreg :: (dereferenceable load 4 from %ir.i)\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f

[PATCH] D75936: Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]

2020-05-04 Thread Scott Constable via Phabricator via cfe-commits
sconstab added inline comments.



Comment at: llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp:322
+  DenseSet> GadgetEdgeSet;
+  auto AnalyzeUse = [&](NodeAddr Use, MachineInstr *MI) {
+assert(!(Use.Addr->getFlags() & NodeAttrs::PhiRef));

craig.topper wrote:
> mattdr wrote:
> > Please add a comment explaining the semantics of the boolean return here. I 
> > //think// it's: `true` if we need to consider defs of this instruction 
> > tainted by this use (and therefore add them for analysis); `false` if this 
> > instruction consumes its use
> Was this comment addressed?
It had not been addressed, so thank you for pointing this out. That lambda was 
doing too many things at once, which made it more confusing than it needed to 
be. So I just inlined it in the
```
for (auto N : Uses) {
…
}
```
loop, and I added some additional clarifying comments.



Comment at: llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp:341
+GadgetBegin.first, GadgetEnd.first);
+  if (UseMI.mayLoad()) // FIXME: This should be more precise
+return false;  // stop traversing further uses of `Reg`

craig.topper wrote:
> mattdr wrote:
> > Some more detail would be useful here: precise about what? What are the 
> > likely errors?
> > 
> Was this answered somewhere?
This was referring to the use of `mayLoad()`. At the time I wrote that comment, 
I wasn't sure that `mayLoad()` was exactly what was needed there, but I now 
think that it does suffice (SLH also uses `MachineInstr::mayLoad()`).



Comment at: llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp:409
+NodeList Defs = SA.Addr->members_if(DataFlowGraph::IsDef, DFG);
+llvm::for_each(Defs, AnalyzeDef);
+  }

craig.topper wrote:
> mattdr wrote:
> > We analyze every def from every instruction in the function, but then 
> > //also// in `AnalyzeDefUseChain` analyze every def of every instruction 
> > with an interesting use. Are we doing a lot of extra work?
> Was this answered somewhere?
Wow, big oversight on my part. @mattdr was correct that this was doing a LOT of 
extra work. I added a memoization scheme that remembers the instructions that 
may transmit for each def. The getGadgetGraph() routine now runs about 75% 
faster.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75936/new/

https://reviews.llvm.org/D75936



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75936: Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]

2020-05-04 Thread Scott Constable via Phabricator via cfe-commits
sconstab updated this revision to Diff 261968.
sconstab marked 9 inline comments as done.
sconstab added a comment.
Herald added a subscriber: mgrang.

Addressed the previously unaddressed comments, as pointed out by @craig.topper.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75936/new/

https://reviews.llvm.org/D75936

Files:
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Arch/X86.cpp
  clang/test/Driver/x86-target-features.c
  llvm/lib/Target/X86/CMakeLists.txt
  llvm/lib/Target/X86/ImmutableGraph.h
  llvm/lib/Target/X86/X86.h
  llvm/lib/Target/X86/X86.td
  llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
  llvm/lib/Target/X86/X86Subtarget.h
  llvm/lib/Target/X86/X86TargetMachine.cpp
  llvm/test/CodeGen/X86/O0-pipeline.ll
  llvm/test/CodeGen/X86/O3-pipeline.ll
  llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll

Index: llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll
===
--- /dev/null
+++ llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll
@@ -0,0 +1,129 @@
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown -x86-lvi-load-dot-verify -o %t < %s | FileCheck %s
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local i32 @test(i32* %untrusted_user_ptr, i32* %secret, i32 %secret_size) #0 {
+entry:
+  %untrusted_user_ptr.addr = alloca i32*, align 8
+  %secret.addr = alloca i32*, align 8
+  %secret_size.addr = alloca i32, align 4
+  %ret_val = alloca i32, align 4
+  %i = alloca i32, align 4
+  store i32* %untrusted_user_ptr, i32** %untrusted_user_ptr.addr, align 8
+  store i32* %secret, i32** %secret.addr, align 8
+  store i32 %secret_size, i32* %secret_size.addr, align 4
+  store i32 0, i32* %ret_val, align 4
+  call void @llvm.x86.sse2.lfence()
+  store i32 0, i32* %i, align 4
+  br label %for.cond
+
+for.cond: ; preds = %for.inc, %entry
+  %0 = load i32, i32* %i, align 4
+  %1 = load i32, i32* %secret_size.addr, align 4
+  %cmp = icmp slt i32 %0, %1
+  br i1 %cmp, label %for.body, label %for.end
+
+for.body: ; preds = %for.cond
+  %2 = load i32, i32* %i, align 4
+  %rem = srem i32 %2, 2
+  %cmp1 = icmp eq i32 %rem, 0
+  br i1 %cmp1, label %if.then, label %if.else
+
+if.then:  ; preds = %for.body
+  %3 = load i32*, i32** %secret.addr, align 8
+  %4 = load i32, i32* %ret_val, align 4
+  %idxprom = sext i32 %4 to i64
+  %arrayidx = getelementptr inbounds i32, i32* %3, i64 %idxprom
+  %5 = load i32, i32* %arrayidx, align 4
+  %6 = load i32*, i32** %untrusted_user_ptr.addr, align 8
+  store i32 %5, i32* %6, align 4
+  br label %if.end
+
+if.else:  ; preds = %for.body
+  %7 = load i32*, i32** %secret.addr, align 8
+  %8 = load i32, i32* %ret_val, align 4
+  %idxprom2 = sext i32 %8 to i64
+  %arrayidx3 = getelementptr inbounds i32, i32* %7, i64 %idxprom2
+  store i32 42, i32* %arrayidx3, align 4
+  br label %if.end
+
+if.end:   ; preds = %if.else, %if.then
+  %9 = load i32*, i32** %untrusted_user_ptr.addr, align 8
+  %10 = load i32, i32* %9, align 4
+  store i32 %10, i32* %ret_val, align 4
+  br label %for.inc
+
+for.inc:  ; preds = %if.end
+  %11 = load i32, i32* %i, align 4
+  %inc = add nsw i32 %11, 1
+  store i32 %inc, i32* %i, align 4
+  br label %for.cond
+
+for.end:  ; preds = %for.cond
+  %12 = load i32, i32* %ret_val, align 4
+  ret i32 %12
+}
+
+; CHECK:  digraph "Speculative gadgets for \"test\" function" {
+; CHECK-NEXT: label="Speculative gadgets for \"test\" function";
+; CHECK:  Node0x{{[0-9a-f]+}} [shape=record,color = green,label="{LFENCE\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 0];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $eax = MOV32rm %stack.4.i, 1, $noreg, 0, $noreg :: (dereferenceable load 4 from %ir.i)\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{JCC_1 %bb.6, 13, implicit killed $eflags\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{CMP32rm killed renamable $eax, %stack.2.secret_size.addr, 1, $noreg, 0, $noreg, implicit-def $eflags :: (dereferenceable load 4 from %ir.secret_size.addr)\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $eax = MOV32rm %stack.4.i, 1, $noreg, 0, $noreg :: (dereferenceable load 4 from %

[PATCH] D75936: Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]

2020-05-07 Thread Scott Constable via Phabricator via cfe-commits
sconstab added inline comments.



Comment at: llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp:295
+std::function)> AnalyzeDefUseChain =
+[&](NodeAddr Def) {
+  if (Transmitters.find(Def.Id) != Transmitters.end())

mattdr wrote:
> fwiw, this code would be easier to understand if we didn't shadow `Def` with 
> another variable named `Def`.
Changed the outer def to `SourceDef`, which also seems to make the code after 
the lambda a lot clearer.



Comment at: llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp:321-322
+  for (auto UseID : Uses) {
+if (!UsesVisited.insert(UseID).second)
+  continue; // Already visited this use of the current def
+

mattdr wrote:
> "current def" is a bit ambiguous here. I _believe_ it means `AnalyzeDef`'s 
> `Def` argument? At least, that's the interpretation that makes the comment 
> make sense since `UsesVisited` is in `AnalyzeDef`'s scope.
I am now trying to be clearer by using capital-d "Def" to refer specifically to 
the def that is being analyzed, and lower-case-d "def" to refer to any other 
defs. Do you think this is better? Good enough?



Comment at: llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp:331
+// We naively assume that an instruction propagates any loaded
+// Uses to all Defs, unless the instruction is a call.
+if (UseMI.isCall())

mattdr wrote:
> Copying a comment from a previous iteration:
> 
> > Why is it okay to assume that a call doesn't propagate its uses to defs? Is 
> > it because we can assume the CFI transform is already inserting an LFENCE? 
> > Whatever the reason, let's state it explicitly here
> 
Added clarification to the comment.



Comment at: llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp:341
+  if (UseMI.mayLoad())
+continue; // Found a transmitting load, stop traversing defs
+}

mattdr wrote:
> The comment doesn't match the loop, which is traversing over `Uses`. More 
> importantly, though: why are we allowed to stop traversing through `Uses` 
> here? This `Def` won't be analyzed again, so this is our only chance to 
> enumerate all transmitters to make sure we have all the necessary source -> 
> sink edges in the gadget graph.
@mattdr I think that the code is correct, and I added more to the comment in an 
attempt to clarify. Let me know if you still think that this is an issue.



Comment at: llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp:366-369
+  llvm::sort(DefTransmitters);
+  DefTransmitters.erase(
+  std::unique(DefTransmitters.begin(), DefTransmitters.end()),
+  DefTransmitters.end());

mattdr wrote:
> Should `Transmitters` map to an `llvm::SmallSet`?
In my testing, `std::vector` seems a bit faster than `llvm::SmallSet`. I also 
suspect that `llvm::SmallSet` may waste more space because many defs will have 
no transmitters.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75936/new/

https://reviews.llvm.org/D75936



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75936: Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]

2020-05-07 Thread Scott Constable via Phabricator via cfe-commits
sconstab updated this revision to Diff 262816.
sconstab marked 9 inline comments as done.
sconstab added a comment.

Addressed comments by @mattdr.

Several comments in the code have been updated, but the code has not changed.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75936/new/

https://reviews.llvm.org/D75936

Files:
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Arch/X86.cpp
  clang/test/Driver/x86-target-features.c
  llvm/lib/Target/X86/CMakeLists.txt
  llvm/lib/Target/X86/ImmutableGraph.h
  llvm/lib/Target/X86/X86.h
  llvm/lib/Target/X86/X86.td
  llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
  llvm/lib/Target/X86/X86Subtarget.h
  llvm/lib/Target/X86/X86TargetMachine.cpp
  llvm/test/CodeGen/X86/O0-pipeline.ll
  llvm/test/CodeGen/X86/O3-pipeline.ll
  llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll

Index: llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll
===
--- /dev/null
+++ llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll
@@ -0,0 +1,129 @@
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown -x86-lvi-load-dot-verify -o %t < %s | FileCheck %s
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local i32 @test(i32* %untrusted_user_ptr, i32* %secret, i32 %secret_size) #0 {
+entry:
+  %untrusted_user_ptr.addr = alloca i32*, align 8
+  %secret.addr = alloca i32*, align 8
+  %secret_size.addr = alloca i32, align 4
+  %ret_val = alloca i32, align 4
+  %i = alloca i32, align 4
+  store i32* %untrusted_user_ptr, i32** %untrusted_user_ptr.addr, align 8
+  store i32* %secret, i32** %secret.addr, align 8
+  store i32 %secret_size, i32* %secret_size.addr, align 4
+  store i32 0, i32* %ret_val, align 4
+  call void @llvm.x86.sse2.lfence()
+  store i32 0, i32* %i, align 4
+  br label %for.cond
+
+for.cond: ; preds = %for.inc, %entry
+  %0 = load i32, i32* %i, align 4
+  %1 = load i32, i32* %secret_size.addr, align 4
+  %cmp = icmp slt i32 %0, %1
+  br i1 %cmp, label %for.body, label %for.end
+
+for.body: ; preds = %for.cond
+  %2 = load i32, i32* %i, align 4
+  %rem = srem i32 %2, 2
+  %cmp1 = icmp eq i32 %rem, 0
+  br i1 %cmp1, label %if.then, label %if.else
+
+if.then:  ; preds = %for.body
+  %3 = load i32*, i32** %secret.addr, align 8
+  %4 = load i32, i32* %ret_val, align 4
+  %idxprom = sext i32 %4 to i64
+  %arrayidx = getelementptr inbounds i32, i32* %3, i64 %idxprom
+  %5 = load i32, i32* %arrayidx, align 4
+  %6 = load i32*, i32** %untrusted_user_ptr.addr, align 8
+  store i32 %5, i32* %6, align 4
+  br label %if.end
+
+if.else:  ; preds = %for.body
+  %7 = load i32*, i32** %secret.addr, align 8
+  %8 = load i32, i32* %ret_val, align 4
+  %idxprom2 = sext i32 %8 to i64
+  %arrayidx3 = getelementptr inbounds i32, i32* %7, i64 %idxprom2
+  store i32 42, i32* %arrayidx3, align 4
+  br label %if.end
+
+if.end:   ; preds = %if.else, %if.then
+  %9 = load i32*, i32** %untrusted_user_ptr.addr, align 8
+  %10 = load i32, i32* %9, align 4
+  store i32 %10, i32* %ret_val, align 4
+  br label %for.inc
+
+for.inc:  ; preds = %if.end
+  %11 = load i32, i32* %i, align 4
+  %inc = add nsw i32 %11, 1
+  store i32 %inc, i32* %i, align 4
+  br label %for.cond
+
+for.end:  ; preds = %for.cond
+  %12 = load i32, i32* %ret_val, align 4
+  ret i32 %12
+}
+
+; CHECK:  digraph "Speculative gadgets for \"test\" function" {
+; CHECK-NEXT: label="Speculative gadgets for \"test\" function";
+; CHECK:  Node0x{{[0-9a-f]+}} [shape=record,color = green,label="{LFENCE\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 0];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $eax = MOV32rm %stack.4.i, 1, $noreg, 0, $noreg :: (dereferenceable load 4 from %ir.i)\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{JCC_1 %bb.6, 13, implicit killed $eflags\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{CMP32rm killed renamable $eax, %stack.2.secret_size.addr, 1, $noreg, 0, $noreg, implicit-def $eflags :: (dereferenceable load 4 from %ir.secret_size.addr)\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $eax = MOV32rm %stack.4.i, 1, $noreg, 0, $noreg :: (dereferenceable load 4 from %ir.i)

[PATCH] D75936: Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]

2020-05-12 Thread Scott Constable via Phabricator via cfe-commits
sconstab added a comment.

In D75936#2032027 , @nikic wrote:

> This change causes a 0.8% compile-time regression for unoptimized builds 
> .
>  Based on the pipeline test diffs, I expect this is because the new pass 
> requests a bunch of analyses, which it most likely (LVI load hardening 
> disabled) will not need. Would it be possible to compute the analyses only if 
> LVI load hardening is actually enabled?


@craig.topper Do you have any ideas on how this could be done?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75936/new/

https://reviews.llvm.org/D75936



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76812: [X86] Add Indirect Thunk Support to X86 to mitigate Load Value Injection (LVI) [3/3]

2020-03-25 Thread Scott Constable via Phabricator via cfe-commits
sconstab created this revision.
sconstab added reviewers: craig.topper, andrew.w.kaylor, zbrid, chandlerc.
Herald added subscribers: jfb, hiraditya.
sconstab added a parent revision: D76811: [X86] Refactor X86IndirectThunks.cpp 
to Accomodate Mitigations other than Retpoline [2/3].
sconstab retitled this revision from "Add Indirect Thunk Support to X86 to 
mitigate Load Value Injection (LVI) [3/3]" to "[X86] Add Indirect Thunk Support 
to X86 to mitigate Load Value Injection (LVI) [3/3]".

This pass replaces each indirect call/jump with a direct call to a thunk that 
looks like:

  lfence
  jmpq *%r11

This ensures that if the value in register `%r11` was loaded from memory, then
the value in `%r11` is (architecturally) correct prior to the jump.
Also adds a new target feature to X86: +lvi-cfi
("cfi" meaning control-flow integrity)
The feature can be added via clang CLI using `-mlvi-cfi`.

This is an alternate implementation to https://reviews.llvm.org/D75934 That 
merges the thunk insertion functionality with the existing X86 retpoline code.


https://reviews.llvm.org/D76812

Files:
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Arch/X86.cpp
  llvm/lib/Target/X86/X86.td
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/lib/Target/X86/X86IndirectThunks.cpp
  llvm/lib/Target/X86/X86Subtarget.h
  llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll

Index: llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll
===
--- /dev/null
+++ llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll
@@ -0,0 +1,281 @@
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown -mattr=+lvi-cfi < %s | FileCheck %s --check-prefix=X64
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown -mattr=+lvi-cfi -O0 < %s | FileCheck %s --check-prefix=X64FAST
+;
+; Note that a lot of this code was lifted from retpoline.ll.
+
+declare void @bar(i32)
+
+; Test a simple indirect call and tail call.
+define void @icall_reg(void (i32)* %fp, i32 %x) {
+entry:
+  tail call void @bar(i32 %x)
+  tail call void %fp(i32 %x)
+  tail call void @bar(i32 %x)
+  tail call void %fp(i32 %x)
+  ret void
+}
+
+; X64-LABEL: icall_reg:
+; X64-DAG:   movq %rdi, %[[fp:[^ ]*]]
+; X64-DAG:   movl %esi, %[[x:[^ ]*]]
+; X64:   movl %esi, %edi
+; X64:   callq bar
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   callq __llvm_lvi_thunk_r11
+; X64:   movl %[[x]], %edi
+; X64:   callq bar
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+; X64FAST-LABEL: icall_reg:
+; X64FAST:   callq bar
+; X64FAST:   callq __llvm_lvi_thunk_r11
+; X64FAST:   callq bar
+; X64FAST:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+
+@global_fp = external global void (i32)*
+
+; Test an indirect call through a global variable.
+define void @icall_global_fp(i32 %x, void (i32)** %fpp) #0 {
+  %fp1 = load void (i32)*, void (i32)** @global_fp
+  call void %fp1(i32 %x)
+  %fp2 = load void (i32)*, void (i32)** @global_fp
+  tail call void %fp2(i32 %x)
+  ret void
+}
+
+; X64-LABEL: icall_global_fp:
+; X64-DAG:   movl %edi, %[[x:[^ ]*]]
+; X64-DAG:   movq global_fp(%rip), %r11
+; X64:   callq __llvm_lvi_thunk_r11
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq global_fp(%rip), %r11
+; X64:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+; X64FAST-LABEL: icall_global_fp:
+; X64FAST:   movq global_fp(%rip), %r11
+; X64FAST:   callq __llvm_lvi_thunk_r11
+; X64FAST:   movq global_fp(%rip), %r11
+; X64FAST:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+
+%struct.Foo = type { void (%struct.Foo*)** }
+
+; Test an indirect call through a vtable.
+define void @vcall(%struct.Foo* %obj) #0 {
+  %vptr_field = getelementptr %struct.Foo, %struct.Foo* %obj, i32 0, i32 0
+  %vptr = load void (%struct.Foo*)**, void (%struct.Foo*)*** %vptr_field
+  %vslot = getelementptr void(%struct.Foo*)*, void(%struct.Foo*)** %vptr, i32 1
+  %fp = load void(%struct.Foo*)*, void(%struct.Foo*)** %vslot
+  tail call void %fp(%struct.Foo* %obj)
+  tail call void %fp(%struct.Foo* %obj)
+  ret void
+}
+
+; X64-LABEL: vcall:
+; X64:   movq %rdi, %[[obj:[^ ]*]]
+; X64:   movq (%rdi), %[[vptr:[^ ]*]]
+; X64:   movq 8(%[[vptr]]), %[[fp:[^ ]*]]
+; X64:   movq %[[fp]], %r11
+; X64:   callq __llvm_lvi_thunk_r11
+; X64-DAG:   movq %[[obj]], %rdi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+; X64FAST-LABEL: vcall:
+; X64FAST:   callq __llvm_lvi_thunk_r11
+; X64FAST:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+
+declare void @direct_callee()
+
+define void @direct_tail() #0 {
+  tail call void @direct_callee()
+  ret void
+}
+
+; X64-LABEL: direct_tail:
+; X64:   jmp direct_callee # TAILCALL
+; X64FAST-LABEL: direct_tail:
+; X64FAST:   jmp direct_callee # TAILCALL
+
+
+declare void @nonlazybind_callee() #1
+
+define void @nonlazybind_caller() #0 {
+  call void @nonl

[PATCH] D76812: [X86] Add Indirect Thunk Support to X86 to mitigate Load Value Injection (LVI) [3/3]

2020-03-31 Thread Scott Constable via Phabricator via cfe-commits
sconstab updated this revision to Diff 253888.
sconstab added a comment.

Added a comment to the header of X86IndirectThunks.cpp to indicate support for 
LVI thunks.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76812/new/

https://reviews.llvm.org/D76812

Files:
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Arch/X86.cpp
  llvm/lib/Target/X86/X86.td
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/lib/Target/X86/X86IndirectThunks.cpp
  llvm/lib/Target/X86/X86Subtarget.h
  llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll

Index: llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll
===
--- /dev/null
+++ llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll
@@ -0,0 +1,281 @@
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown -mattr=+lvi-cfi < %s | FileCheck %s --check-prefix=X64
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown -mattr=+lvi-cfi -O0 < %s | FileCheck %s --check-prefix=X64FAST
+;
+; Note that a lot of this code was lifted from retpoline.ll.
+
+declare void @bar(i32)
+
+; Test a simple indirect call and tail call.
+define void @icall_reg(void (i32)* %fp, i32 %x) {
+entry:
+  tail call void @bar(i32 %x)
+  tail call void %fp(i32 %x)
+  tail call void @bar(i32 %x)
+  tail call void %fp(i32 %x)
+  ret void
+}
+
+; X64-LABEL: icall_reg:
+; X64-DAG:   movq %rdi, %[[fp:[^ ]*]]
+; X64-DAG:   movl %esi, %[[x:[^ ]*]]
+; X64:   movl %esi, %edi
+; X64:   callq bar
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   callq __llvm_lvi_thunk_r11
+; X64:   movl %[[x]], %edi
+; X64:   callq bar
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+; X64FAST-LABEL: icall_reg:
+; X64FAST:   callq bar
+; X64FAST:   callq __llvm_lvi_thunk_r11
+; X64FAST:   callq bar
+; X64FAST:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+
+@global_fp = external global void (i32)*
+
+; Test an indirect call through a global variable.
+define void @icall_global_fp(i32 %x, void (i32)** %fpp) #0 {
+  %fp1 = load void (i32)*, void (i32)** @global_fp
+  call void %fp1(i32 %x)
+  %fp2 = load void (i32)*, void (i32)** @global_fp
+  tail call void %fp2(i32 %x)
+  ret void
+}
+
+; X64-LABEL: icall_global_fp:
+; X64-DAG:   movl %edi, %[[x:[^ ]*]]
+; X64-DAG:   movq global_fp(%rip), %r11
+; X64:   callq __llvm_lvi_thunk_r11
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq global_fp(%rip), %r11
+; X64:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+; X64FAST-LABEL: icall_global_fp:
+; X64FAST:   movq global_fp(%rip), %r11
+; X64FAST:   callq __llvm_lvi_thunk_r11
+; X64FAST:   movq global_fp(%rip), %r11
+; X64FAST:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+
+%struct.Foo = type { void (%struct.Foo*)** }
+
+; Test an indirect call through a vtable.
+define void @vcall(%struct.Foo* %obj) #0 {
+  %vptr_field = getelementptr %struct.Foo, %struct.Foo* %obj, i32 0, i32 0
+  %vptr = load void (%struct.Foo*)**, void (%struct.Foo*)*** %vptr_field
+  %vslot = getelementptr void(%struct.Foo*)*, void(%struct.Foo*)** %vptr, i32 1
+  %fp = load void(%struct.Foo*)*, void(%struct.Foo*)** %vslot
+  tail call void %fp(%struct.Foo* %obj)
+  tail call void %fp(%struct.Foo* %obj)
+  ret void
+}
+
+; X64-LABEL: vcall:
+; X64:   movq %rdi, %[[obj:[^ ]*]]
+; X64:   movq (%rdi), %[[vptr:[^ ]*]]
+; X64:   movq 8(%[[vptr]]), %[[fp:[^ ]*]]
+; X64:   movq %[[fp]], %r11
+; X64:   callq __llvm_lvi_thunk_r11
+; X64-DAG:   movq %[[obj]], %rdi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+; X64FAST-LABEL: vcall:
+; X64FAST:   callq __llvm_lvi_thunk_r11
+; X64FAST:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+
+declare void @direct_callee()
+
+define void @direct_tail() #0 {
+  tail call void @direct_callee()
+  ret void
+}
+
+; X64-LABEL: direct_tail:
+; X64:   jmp direct_callee # TAILCALL
+; X64FAST-LABEL: direct_tail:
+; X64FAST:   jmp direct_callee # TAILCALL
+
+
+declare void @nonlazybind_callee() #1
+
+define void @nonlazybind_caller() #0 {
+  call void @nonlazybind_callee()
+  tail call void @nonlazybind_callee()
+  ret void
+}
+
+; X64-LABEL: nonlazybind_caller:
+; X64:   movq nonlazybind_callee@GOTPCREL(%rip), %[[REG:.*]]
+; X64:   movq %[[REG]], %r11
+; X64:   callq __llvm_lvi_thunk_r11
+; X64:   movq %[[REG]], %r11
+; X64:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+; X64FAST-LABEL: nonlazybind_caller:
+; X64FAST:   movq nonlazybind_callee@GOTPCREL(%rip), %r11
+; X64FAST:   callq __llvm_lvi_thunk_r11
+; X64FAST:   movq nonlazybind_callee@GOTPCREL(%rip), %r11
+; X64FAST:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+
+; Check that a switch gets lowered using a jump table
+define void @switch_jumptable(i32* %ptr, i64* %sink) #0 {
+; X64-LABEL: switch_jumptable:
+; X64_NOT:  jmpq *
+entry:
+  br label %header
+
+header:
+  %i = load volatile i32, i32* %p

[PATCH] D76812: [X86] Add Indirect Thunk Support to X86 to mitigate Load Value Injection (LVI) [3/3]

2020-04-01 Thread Scott Constable via Phabricator via cfe-commits
sconstab updated this revision to Diff 254276.
sconstab added a comment.

@craig.topper I think that removing spurious MBBs is not really necessary 
because the emitted machine code doesn't contain the spurious MBBs, from what I 
have observed. I added the check anyways, if only because others may look at 
this discrepancy and have the same question.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76812/new/

https://reviews.llvm.org/D76812

Files:
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Arch/X86.cpp
  llvm/lib/Target/X86/X86.td
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/lib/Target/X86/X86IndirectThunks.cpp
  llvm/lib/Target/X86/X86Subtarget.h
  llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll

Index: llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll
===
--- /dev/null
+++ llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll
@@ -0,0 +1,281 @@
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown -mattr=+lvi-cfi < %s | FileCheck %s --check-prefix=X64
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown -mattr=+lvi-cfi -O0 < %s | FileCheck %s --check-prefix=X64FAST
+;
+; Note that a lot of this code was lifted from retpoline.ll.
+
+declare void @bar(i32)
+
+; Test a simple indirect call and tail call.
+define void @icall_reg(void (i32)* %fp, i32 %x) {
+entry:
+  tail call void @bar(i32 %x)
+  tail call void %fp(i32 %x)
+  tail call void @bar(i32 %x)
+  tail call void %fp(i32 %x)
+  ret void
+}
+
+; X64-LABEL: icall_reg:
+; X64-DAG:   movq %rdi, %[[fp:[^ ]*]]
+; X64-DAG:   movl %esi, %[[x:[^ ]*]]
+; X64:   movl %esi, %edi
+; X64:   callq bar
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   callq __llvm_lvi_thunk_r11
+; X64:   movl %[[x]], %edi
+; X64:   callq bar
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+; X64FAST-LABEL: icall_reg:
+; X64FAST:   callq bar
+; X64FAST:   callq __llvm_lvi_thunk_r11
+; X64FAST:   callq bar
+; X64FAST:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+
+@global_fp = external global void (i32)*
+
+; Test an indirect call through a global variable.
+define void @icall_global_fp(i32 %x, void (i32)** %fpp) #0 {
+  %fp1 = load void (i32)*, void (i32)** @global_fp
+  call void %fp1(i32 %x)
+  %fp2 = load void (i32)*, void (i32)** @global_fp
+  tail call void %fp2(i32 %x)
+  ret void
+}
+
+; X64-LABEL: icall_global_fp:
+; X64-DAG:   movl %edi, %[[x:[^ ]*]]
+; X64-DAG:   movq global_fp(%rip), %r11
+; X64:   callq __llvm_lvi_thunk_r11
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq global_fp(%rip), %r11
+; X64:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+; X64FAST-LABEL: icall_global_fp:
+; X64FAST:   movq global_fp(%rip), %r11
+; X64FAST:   callq __llvm_lvi_thunk_r11
+; X64FAST:   movq global_fp(%rip), %r11
+; X64FAST:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+
+%struct.Foo = type { void (%struct.Foo*)** }
+
+; Test an indirect call through a vtable.
+define void @vcall(%struct.Foo* %obj) #0 {
+  %vptr_field = getelementptr %struct.Foo, %struct.Foo* %obj, i32 0, i32 0
+  %vptr = load void (%struct.Foo*)**, void (%struct.Foo*)*** %vptr_field
+  %vslot = getelementptr void(%struct.Foo*)*, void(%struct.Foo*)** %vptr, i32 1
+  %fp = load void(%struct.Foo*)*, void(%struct.Foo*)** %vslot
+  tail call void %fp(%struct.Foo* %obj)
+  tail call void %fp(%struct.Foo* %obj)
+  ret void
+}
+
+; X64-LABEL: vcall:
+; X64:   movq %rdi, %[[obj:[^ ]*]]
+; X64:   movq (%rdi), %[[vptr:[^ ]*]]
+; X64:   movq 8(%[[vptr]]), %[[fp:[^ ]*]]
+; X64:   movq %[[fp]], %r11
+; X64:   callq __llvm_lvi_thunk_r11
+; X64-DAG:   movq %[[obj]], %rdi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+; X64FAST-LABEL: vcall:
+; X64FAST:   callq __llvm_lvi_thunk_r11
+; X64FAST:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+
+declare void @direct_callee()
+
+define void @direct_tail() #0 {
+  tail call void @direct_callee()
+  ret void
+}
+
+; X64-LABEL: direct_tail:
+; X64:   jmp direct_callee # TAILCALL
+; X64FAST-LABEL: direct_tail:
+; X64FAST:   jmp direct_callee # TAILCALL
+
+
+declare void @nonlazybind_callee() #1
+
+define void @nonlazybind_caller() #0 {
+  call void @nonlazybind_callee()
+  tail call void @nonlazybind_callee()
+  ret void
+}
+
+; X64-LABEL: nonlazybind_caller:
+; X64:   movq nonlazybind_callee@GOTPCREL(%rip), %[[REG:.*]]
+; X64:   movq %[[REG]], %r11
+; X64:   callq __llvm_lvi_thunk_r11
+; X64:   movq %[[REG]], %r11
+; X64:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+; X64FAST-LABEL: nonlazybind_caller:
+; X64FAST:   movq nonlazybind_callee@GOTPCREL(%rip), %r11
+; X64FAST:   callq __llvm_lvi_thunk_r11
+; X64FAST:   movq nonlazybind_callee@GOTPCREL(%rip), %r11
+; X64FAST:   jmp __llvm_lvi_thunk_r11 # TAILCALL
+
+
+; Check that a switch gets lowered using a jump table
+

[PATCH] D75934: Add Indirect Thunk Support to X86 to mitigate Load Value Injection (LVI)

2020-03-10 Thread Scott Constable via Phabricator via cfe-commits
sconstab created this revision.
sconstab added reviewers: craig.topper, andrew.w.kaylor, chandlerc, zbrid.
Herald added subscribers: jfb, hiraditya, mgorny.
Herald added a project: LLVM.

This pass replaces each indirect call/jump with a direct call to a thunk that 
looks like:

  lfence
  jmpq *%r11

This ensures that if the value in register %r11 was loaded from memory, then
the value in %r11 is (architecturally) correct prior to the jump.

Also adds a new target feature to X86: +lvi-cfi
("cfi" meaning control-flow integrity)

The feature can be added via clang CLI using `-mlvi-cfi`.


https://reviews.llvm.org/D75934

Files:
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Arch/X86.cpp
  llvm/lib/Target/X86/CMakeLists.txt
  llvm/lib/Target/X86/X86.h
  llvm/lib/Target/X86/X86.td
  llvm/lib/Target/X86/X86FastISel.cpp
  llvm/lib/Target/X86/X86FrameLowering.cpp
  llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/lib/Target/X86/X86InstrCompiler.td
  llvm/lib/Target/X86/X86InstrControl.td
  llvm/lib/Target/X86/X86InstrInfo.td
  llvm/lib/Target/X86/X86LoadValueInjectionIndirectThunks.cpp
  llvm/lib/Target/X86/X86MCInstLower.cpp
  llvm/lib/Target/X86/X86Subtarget.h
  llvm/lib/Target/X86/X86TargetMachine.cpp
  llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
  llvm/test/CodeGen/X86/O0-pipeline.ll
  llvm/test/CodeGen/X86/O3-pipeline.ll
  llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll

Index: llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll
===
--- /dev/null
+++ llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll
@@ -0,0 +1,282 @@
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown < %s | FileCheck %s --check-prefix=X64
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown -O0 < %s | FileCheck %s --check-prefix=X64FAST
+;
+; Note that a lot of this code was lifted from retpoline.ll.
+
+declare void @bar(i32)
+
+; Test a simple indirect call and tail call.
+define void @icall_reg(void (i32)* %fp, i32 %x) #0 {
+entry:
+  tail call void @bar(i32 %x)
+  tail call void %fp(i32 %x)
+  tail call void @bar(i32 %x)
+  tail call void %fp(i32 %x)
+  ret void
+}
+
+; X64-LABEL: icall_reg:
+; X64-DAG:   movq %rdi, %[[fp:[^ ]*]]
+; X64-DAG:   movl %esi, %[[x:[^ ]*]]
+; X64:   movl %esi, %edi
+; X64:   callq bar
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   callq __x86_indirect_thunk_r11
+; X64:   movl %[[x]], %edi
+; X64:   callq bar
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   jmp __x86_indirect_thunk_r11 # TAILCALL
+
+; X64FAST-LABEL: icall_reg:
+; X64FAST:   callq bar
+; X64FAST:   callq __x86_indirect_thunk_r11
+; X64FAST:   callq bar
+; X64FAST:   jmp __x86_indirect_thunk_r11 # TAILCALL
+
+
+@global_fp = external global void (i32)*
+
+; Test an indirect call through a global variable.
+define void @icall_global_fp(i32 %x, void (i32)** %fpp) #0 {
+  %fp1 = load void (i32)*, void (i32)** @global_fp
+  call void %fp1(i32 %x)
+  %fp2 = load void (i32)*, void (i32)** @global_fp
+  tail call void %fp2(i32 %x)
+  ret void
+}
+
+; X64-LABEL: icall_global_fp:
+; X64-DAG:   movl %edi, %[[x:[^ ]*]]
+; X64-DAG:   movq global_fp(%rip), %r11
+; X64:   callq __x86_indirect_thunk_r11
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq global_fp(%rip), %r11
+; X64:   jmp __x86_indirect_thunk_r11 # TAILCALL
+
+; X64FAST-LABEL: icall_global_fp:
+; X64FAST:   movq global_fp(%rip), %r11
+; X64FAST:   callq __x86_indirect_thunk_r11
+; X64FAST:   movq global_fp(%rip), %r11
+; X64FAST:   jmp __x86_indirect_thunk_r11 # TAILCALL
+
+
+%struct.Foo = type { void (%struct.Foo*)** }
+
+; Test an indirect call through a vtable.
+define void @vcall(%struct.Foo* %obj) #0 {
+  %vptr_field = getelementptr %struct.Foo, %struct.Foo* %obj, i32 0, i32 0
+  %vptr = load void (%struct.Foo*)**, void (%struct.Foo*)*** %vptr_field
+  %vslot = getelementptr void(%struct.Foo*)*, void(%struct.Foo*)** %vptr, i32 1
+  %fp = load void(%struct.Foo*)*, void(%struct.Foo*)** %vslot
+  tail call void %fp(%struct.Foo* %obj)
+  tail call void %fp(%struct.Foo* %obj)
+  ret void
+}
+
+; X64-LABEL: vcall:
+; X64:   movq %rdi, %[[obj:[^ ]*]]
+; X64:   movq (%rdi), %[[vptr:[^ ]*]]
+; X64:   movq 8(%[[vptr]]), %[[fp:[^ ]*]]
+; X64:   movq %[[fp]], %r11
+; X64:   callq __x86_indirect_thunk_r11
+; X64-DAG:   movq %[[obj]], %rdi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   jmp __x86_indirect_thunk_r11 # TAILCALL
+
+; X64FAST-LABEL: vcall:
+; X64FAST:   callq __x86_indirect_thunk_r11
+; X64FAST:   jmp __x86_indirect_thunk_r11 # TAILCALL
+
+
+declare void @direct_callee()
+
+define void @direct_tail() #0 {
+  tail call void @direct_callee()
+  ret void
+}
+
+; X64-LABEL: direct_tail:
+; X64:   jmp direct_callee # TAILCALL
+; X64FAST-LABEL: direct_tail:
+; X64FAST:   jmp direct_callee # TAILCALL
+
+
+declare

[PATCH] D75936: Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets

2020-03-10 Thread Scott Constable via Phabricator via cfe-commits
sconstab created this revision.
sconstab added reviewers: craig.topper, andrew.w.kaylor, chandlerc, zbrid.
Herald added subscribers: jfb, hiraditya, mgorny.
Herald added a project: LLVM.

Adds a new data structure, ImmutableGraph, and uses RDF to find LVI gadgets and 
add them to a MachineGadgetGraph.

More specifically, a new X86 machine pass finds Load Value Injection (LVI) 
gadgets consisting of a load from memory (i.e., SOURCE), and any operation that 
may transmit the value loaded from memory over a covert channel, or use the 
value loaded from memory to determine a branch/call target (i.e., SINK).

Also adds a new target feature to X86: +lvi-load-hardening

The feature can be added via the clang CLI using `-mlvi-hardening`.


https://reviews.llvm.org/D75936

Files:
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Arch/X86.cpp
  llvm/lib/Target/X86/CMakeLists.txt
  llvm/lib/Target/X86/ImmutableGraph.h
  llvm/lib/Target/X86/X86.h
  llvm/lib/Target/X86/X86.td
  llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
  llvm/lib/Target/X86/X86Subtarget.h
  llvm/lib/Target/X86/X86TargetMachine.cpp
  llvm/test/CodeGen/X86/O0-pipeline.ll
  llvm/test/CodeGen/X86/O3-pipeline.ll
  llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll

Index: llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll
===
--- /dev/null
+++ llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll
@@ -0,0 +1,129 @@
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown -x86-lvi-load-dot-verify -o %t < %s | FileCheck %s
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local i32 @test(i32* %untrusted_user_ptr, i32* %secret, i32 %secret_size) #0 {
+entry:
+  %untrusted_user_ptr.addr = alloca i32*, align 8
+  %secret.addr = alloca i32*, align 8
+  %secret_size.addr = alloca i32, align 4
+  %ret_val = alloca i32, align 4
+  %i = alloca i32, align 4
+  store i32* %untrusted_user_ptr, i32** %untrusted_user_ptr.addr, align 8
+  store i32* %secret, i32** %secret.addr, align 8
+  store i32 %secret_size, i32* %secret_size.addr, align 4
+  store i32 0, i32* %ret_val, align 4
+  call void @llvm.x86.sse2.lfence()
+  store i32 0, i32* %i, align 4
+  br label %for.cond
+
+for.cond: ; preds = %for.inc, %entry
+  %0 = load i32, i32* %i, align 4
+  %1 = load i32, i32* %secret_size.addr, align 4
+  %cmp = icmp slt i32 %0, %1
+  br i1 %cmp, label %for.body, label %for.end
+
+for.body: ; preds = %for.cond
+  %2 = load i32, i32* %i, align 4
+  %rem = srem i32 %2, 2
+  %cmp1 = icmp eq i32 %rem, 0
+  br i1 %cmp1, label %if.then, label %if.else
+
+if.then:  ; preds = %for.body
+  %3 = load i32*, i32** %secret.addr, align 8
+  %4 = load i32, i32* %ret_val, align 4
+  %idxprom = sext i32 %4 to i64
+  %arrayidx = getelementptr inbounds i32, i32* %3, i64 %idxprom
+  %5 = load i32, i32* %arrayidx, align 4
+  %6 = load i32*, i32** %untrusted_user_ptr.addr, align 8
+  store i32 %5, i32* %6, align 4
+  br label %if.end
+
+if.else:  ; preds = %for.body
+  %7 = load i32*, i32** %secret.addr, align 8
+  %8 = load i32, i32* %ret_val, align 4
+  %idxprom2 = sext i32 %8 to i64
+  %arrayidx3 = getelementptr inbounds i32, i32* %7, i64 %idxprom2
+  store i32 42, i32* %arrayidx3, align 4
+  br label %if.end
+
+if.end:   ; preds = %if.else, %if.then
+  %9 = load i32*, i32** %untrusted_user_ptr.addr, align 8
+  %10 = load i32, i32* %9, align 4
+  store i32 %10, i32* %ret_val, align 4
+  br label %for.inc
+
+for.inc:  ; preds = %if.end
+  %11 = load i32, i32* %i, align 4
+  %inc = add nsw i32 %11, 1
+  store i32 %inc, i32* %i, align 4
+  br label %for.cond
+
+for.end:  ; preds = %for.cond
+  %12 = load i32, i32* %ret_val, align 4
+  ret i32 %12
+}
+
+; CHECK:  digraph "Speculative gadgets for \"test\" function" {
+; CHECK-NEXT: label="Speculative gadgets for \"test\" function";
+; CHECK:  Node0x{{[0-9a-f]+}} [shape=record,color = green,label="{LFENCE\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 0];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $eax = MOV32rm %stack.4.i, 1, $noreg, 0, $noreg :: (dereferenceable load 4 from %ir.i)\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{JCC_1 %bb.6, 13, implicit killed $eflags\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{CMP32rm killed renamable $eax, %stack.2.secret_size.addr, 1, $noreg, 

[PATCH] D75934: Add Indirect Thunk Support to X86 to mitigate Load Value Injection (LVI) [2/5]

2020-03-11 Thread Scott Constable via Phabricator via cfe-commits
sconstab updated this revision to Diff 249763.
sconstab added a comment.

Added help text for driver CLI options.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75934/new/

https://reviews.llvm.org/D75934

Files:
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Arch/X86.cpp
  llvm/lib/Target/X86/CMakeLists.txt
  llvm/lib/Target/X86/X86.h
  llvm/lib/Target/X86/X86.td
  llvm/lib/Target/X86/X86FastISel.cpp
  llvm/lib/Target/X86/X86FrameLowering.cpp
  llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/lib/Target/X86/X86InstrCompiler.td
  llvm/lib/Target/X86/X86InstrControl.td
  llvm/lib/Target/X86/X86InstrInfo.td
  llvm/lib/Target/X86/X86LoadValueInjectionIndirectThunks.cpp
  llvm/lib/Target/X86/X86MCInstLower.cpp
  llvm/lib/Target/X86/X86Subtarget.h
  llvm/lib/Target/X86/X86TargetMachine.cpp
  llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
  llvm/test/CodeGen/X86/O0-pipeline.ll
  llvm/test/CodeGen/X86/O3-pipeline.ll
  llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll

Index: llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll
===
--- /dev/null
+++ llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll
@@ -0,0 +1,282 @@
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown < %s | FileCheck %s --check-prefix=X64
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown -O0 < %s | FileCheck %s --check-prefix=X64FAST
+;
+; Note that a lot of this code was lifted from retpoline.ll.
+
+declare void @bar(i32)
+
+; Test a simple indirect call and tail call.
+define void @icall_reg(void (i32)* %fp, i32 %x) #0 {
+entry:
+  tail call void @bar(i32 %x)
+  tail call void %fp(i32 %x)
+  tail call void @bar(i32 %x)
+  tail call void %fp(i32 %x)
+  ret void
+}
+
+; X64-LABEL: icall_reg:
+; X64-DAG:   movq %rdi, %[[fp:[^ ]*]]
+; X64-DAG:   movl %esi, %[[x:[^ ]*]]
+; X64:   movl %esi, %edi
+; X64:   callq bar
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   callq __x86_indirect_thunk_r11
+; X64:   movl %[[x]], %edi
+; X64:   callq bar
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   jmp __x86_indirect_thunk_r11 # TAILCALL
+
+; X64FAST-LABEL: icall_reg:
+; X64FAST:   callq bar
+; X64FAST:   callq __x86_indirect_thunk_r11
+; X64FAST:   callq bar
+; X64FAST:   jmp __x86_indirect_thunk_r11 # TAILCALL
+
+
+@global_fp = external global void (i32)*
+
+; Test an indirect call through a global variable.
+define void @icall_global_fp(i32 %x, void (i32)** %fpp) #0 {
+  %fp1 = load void (i32)*, void (i32)** @global_fp
+  call void %fp1(i32 %x)
+  %fp2 = load void (i32)*, void (i32)** @global_fp
+  tail call void %fp2(i32 %x)
+  ret void
+}
+
+; X64-LABEL: icall_global_fp:
+; X64-DAG:   movl %edi, %[[x:[^ ]*]]
+; X64-DAG:   movq global_fp(%rip), %r11
+; X64:   callq __x86_indirect_thunk_r11
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq global_fp(%rip), %r11
+; X64:   jmp __x86_indirect_thunk_r11 # TAILCALL
+
+; X64FAST-LABEL: icall_global_fp:
+; X64FAST:   movq global_fp(%rip), %r11
+; X64FAST:   callq __x86_indirect_thunk_r11
+; X64FAST:   movq global_fp(%rip), %r11
+; X64FAST:   jmp __x86_indirect_thunk_r11 # TAILCALL
+
+
+%struct.Foo = type { void (%struct.Foo*)** }
+
+; Test an indirect call through a vtable.
+define void @vcall(%struct.Foo* %obj) #0 {
+  %vptr_field = getelementptr %struct.Foo, %struct.Foo* %obj, i32 0, i32 0
+  %vptr = load void (%struct.Foo*)**, void (%struct.Foo*)*** %vptr_field
+  %vslot = getelementptr void(%struct.Foo*)*, void(%struct.Foo*)** %vptr, i32 1
+  %fp = load void(%struct.Foo*)*, void(%struct.Foo*)** %vslot
+  tail call void %fp(%struct.Foo* %obj)
+  tail call void %fp(%struct.Foo* %obj)
+  ret void
+}
+
+; X64-LABEL: vcall:
+; X64:   movq %rdi, %[[obj:[^ ]*]]
+; X64:   movq (%rdi), %[[vptr:[^ ]*]]
+; X64:   movq 8(%[[vptr]]), %[[fp:[^ ]*]]
+; X64:   movq %[[fp]], %r11
+; X64:   callq __x86_indirect_thunk_r11
+; X64-DAG:   movq %[[obj]], %rdi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   jmp __x86_indirect_thunk_r11 # TAILCALL
+
+; X64FAST-LABEL: vcall:
+; X64FAST:   callq __x86_indirect_thunk_r11
+; X64FAST:   jmp __x86_indirect_thunk_r11 # TAILCALL
+
+
+declare void @direct_callee()
+
+define void @direct_tail() #0 {
+  tail call void @direct_callee()
+  ret void
+}
+
+; X64-LABEL: direct_tail:
+; X64:   jmp direct_callee # TAILCALL
+; X64FAST-LABEL: direct_tail:
+; X64FAST:   jmp direct_callee # TAILCALL
+
+
+declare void @nonlazybind_callee() #1
+
+define void @nonlazybind_caller() #0 {
+  call void @nonlazybind_callee()
+  tail call void @nonlazybind_callee()
+  ret void
+}
+
+; X64-LABEL: nonlazybind_caller:
+; X64:   movq nonlazybind_callee@GOTPCREL(%rip), %[[REG:.*]]
+; X64:   movq %[[REG]], %r11
+; X64:   callq __x86_indirect_thunk_r11
+; X64:   movq %[[REG]], %r11
+; X64:   jmp __x86_indirect_th

[PATCH] D75936: Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/5]

2020-03-11 Thread Scott Constable via Phabricator via cfe-commits
sconstab updated this revision to Diff 249765.
sconstab added a comment.

Added help text for the CLI options


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75936/new/

https://reviews.llvm.org/D75936

Files:
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Arch/X86.cpp
  llvm/lib/Target/X86/CMakeLists.txt
  llvm/lib/Target/X86/ImmutableGraph.h
  llvm/lib/Target/X86/X86.h
  llvm/lib/Target/X86/X86.td
  llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
  llvm/lib/Target/X86/X86Subtarget.h
  llvm/lib/Target/X86/X86TargetMachine.cpp
  llvm/test/CodeGen/X86/O0-pipeline.ll
  llvm/test/CodeGen/X86/O3-pipeline.ll
  llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll

Index: llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll
===
--- /dev/null
+++ llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll
@@ -0,0 +1,129 @@
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown -x86-lvi-load-dot-verify -o %t < %s | FileCheck %s
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local i32 @test(i32* %untrusted_user_ptr, i32* %secret, i32 %secret_size) #0 {
+entry:
+  %untrusted_user_ptr.addr = alloca i32*, align 8
+  %secret.addr = alloca i32*, align 8
+  %secret_size.addr = alloca i32, align 4
+  %ret_val = alloca i32, align 4
+  %i = alloca i32, align 4
+  store i32* %untrusted_user_ptr, i32** %untrusted_user_ptr.addr, align 8
+  store i32* %secret, i32** %secret.addr, align 8
+  store i32 %secret_size, i32* %secret_size.addr, align 4
+  store i32 0, i32* %ret_val, align 4
+  call void @llvm.x86.sse2.lfence()
+  store i32 0, i32* %i, align 4
+  br label %for.cond
+
+for.cond: ; preds = %for.inc, %entry
+  %0 = load i32, i32* %i, align 4
+  %1 = load i32, i32* %secret_size.addr, align 4
+  %cmp = icmp slt i32 %0, %1
+  br i1 %cmp, label %for.body, label %for.end
+
+for.body: ; preds = %for.cond
+  %2 = load i32, i32* %i, align 4
+  %rem = srem i32 %2, 2
+  %cmp1 = icmp eq i32 %rem, 0
+  br i1 %cmp1, label %if.then, label %if.else
+
+if.then:  ; preds = %for.body
+  %3 = load i32*, i32** %secret.addr, align 8
+  %4 = load i32, i32* %ret_val, align 4
+  %idxprom = sext i32 %4 to i64
+  %arrayidx = getelementptr inbounds i32, i32* %3, i64 %idxprom
+  %5 = load i32, i32* %arrayidx, align 4
+  %6 = load i32*, i32** %untrusted_user_ptr.addr, align 8
+  store i32 %5, i32* %6, align 4
+  br label %if.end
+
+if.else:  ; preds = %for.body
+  %7 = load i32*, i32** %secret.addr, align 8
+  %8 = load i32, i32* %ret_val, align 4
+  %idxprom2 = sext i32 %8 to i64
+  %arrayidx3 = getelementptr inbounds i32, i32* %7, i64 %idxprom2
+  store i32 42, i32* %arrayidx3, align 4
+  br label %if.end
+
+if.end:   ; preds = %if.else, %if.then
+  %9 = load i32*, i32** %untrusted_user_ptr.addr, align 8
+  %10 = load i32, i32* %9, align 4
+  store i32 %10, i32* %ret_val, align 4
+  br label %for.inc
+
+for.inc:  ; preds = %if.end
+  %11 = load i32, i32* %i, align 4
+  %inc = add nsw i32 %11, 1
+  store i32 %inc, i32* %i, align 4
+  br label %for.cond
+
+for.end:  ; preds = %for.cond
+  %12 = load i32, i32* %ret_val, align 4
+  ret i32 %12
+}
+
+; CHECK:  digraph "Speculative gadgets for \"test\" function" {
+; CHECK-NEXT: label="Speculative gadgets for \"test\" function";
+; CHECK:  Node0x{{[0-9a-f]+}} [shape=record,color = green,label="{LFENCE\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 0];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $eax = MOV32rm %stack.4.i, 1, $noreg, 0, $noreg :: (dereferenceable load 4 from %ir.i)\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{JCC_1 %bb.6, 13, implicit killed $eflags\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{CMP32rm killed renamable $eax, %stack.2.secret_size.addr, 1, $noreg, 0, $noreg, implicit-def $eflags :: (dereferenceable load 4 from %ir.secret_size.addr)\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $eax = MOV32rm %stack.4.i, 1, $noreg, 0, $noreg :: (dereferenceable load 4 from %ir.i)\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[labe

[PATCH] D75934: Add Indirect Thunk Support to X86 to mitigate Load Value Injection (LVI) [2/6]

2020-03-18 Thread Scott Constable via Phabricator via cfe-commits
sconstab updated this revision to Diff 251193.
sconstab added a comment.

Addressed some of Zola's comments, and removed some unnecessary assertions.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75934/new/

https://reviews.llvm.org/D75934

Files:
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Arch/X86.cpp
  llvm/lib/Target/X86/CMakeLists.txt
  llvm/lib/Target/X86/X86.h
  llvm/lib/Target/X86/X86.td
  llvm/lib/Target/X86/X86FastISel.cpp
  llvm/lib/Target/X86/X86FrameLowering.cpp
  llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/lib/Target/X86/X86ISelLowering.h
  llvm/lib/Target/X86/X86InstrCompiler.td
  llvm/lib/Target/X86/X86InstrControl.td
  llvm/lib/Target/X86/X86InstrInfo.td
  llvm/lib/Target/X86/X86LoadValueInjectionIndirectThunks.cpp
  llvm/lib/Target/X86/X86MCInstLower.cpp
  llvm/lib/Target/X86/X86Subtarget.h
  llvm/lib/Target/X86/X86TargetMachine.cpp
  llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
  llvm/test/CodeGen/X86/O0-pipeline.ll
  llvm/test/CodeGen/X86/O3-pipeline.ll
  llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll

Index: llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll
===
--- /dev/null
+++ llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll
@@ -0,0 +1,282 @@
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown < %s | FileCheck %s --check-prefix=X64
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown -O0 < %s | FileCheck %s --check-prefix=X64FAST
+;
+; Note that a lot of this code was lifted from retpoline.ll.
+
+declare void @bar(i32)
+
+; Test a simple indirect call and tail call.
+define void @icall_reg(void (i32)* %fp, i32 %x) #0 {
+entry:
+  tail call void @bar(i32 %x)
+  tail call void %fp(i32 %x)
+  tail call void @bar(i32 %x)
+  tail call void %fp(i32 %x)
+  ret void
+}
+
+; X64-LABEL: icall_reg:
+; X64-DAG:   movq %rdi, %[[fp:[^ ]*]]
+; X64-DAG:   movl %esi, %[[x:[^ ]*]]
+; X64:   movl %esi, %edi
+; X64:   callq bar
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   callq __x86_indirect_thunk_r11
+; X64:   movl %[[x]], %edi
+; X64:   callq bar
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   jmp __x86_indirect_thunk_r11 # TAILCALL
+
+; X64FAST-LABEL: icall_reg:
+; X64FAST:   callq bar
+; X64FAST:   callq __x86_indirect_thunk_r11
+; X64FAST:   callq bar
+; X64FAST:   jmp __x86_indirect_thunk_r11 # TAILCALL
+
+
+@global_fp = external global void (i32)*
+
+; Test an indirect call through a global variable.
+define void @icall_global_fp(i32 %x, void (i32)** %fpp) #0 {
+  %fp1 = load void (i32)*, void (i32)** @global_fp
+  call void %fp1(i32 %x)
+  %fp2 = load void (i32)*, void (i32)** @global_fp
+  tail call void %fp2(i32 %x)
+  ret void
+}
+
+; X64-LABEL: icall_global_fp:
+; X64-DAG:   movl %edi, %[[x:[^ ]*]]
+; X64-DAG:   movq global_fp(%rip), %r11
+; X64:   callq __x86_indirect_thunk_r11
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq global_fp(%rip), %r11
+; X64:   jmp __x86_indirect_thunk_r11 # TAILCALL
+
+; X64FAST-LABEL: icall_global_fp:
+; X64FAST:   movq global_fp(%rip), %r11
+; X64FAST:   callq __x86_indirect_thunk_r11
+; X64FAST:   movq global_fp(%rip), %r11
+; X64FAST:   jmp __x86_indirect_thunk_r11 # TAILCALL
+
+
+%struct.Foo = type { void (%struct.Foo*)** }
+
+; Test an indirect call through a vtable.
+define void @vcall(%struct.Foo* %obj) #0 {
+  %vptr_field = getelementptr %struct.Foo, %struct.Foo* %obj, i32 0, i32 0
+  %vptr = load void (%struct.Foo*)**, void (%struct.Foo*)*** %vptr_field
+  %vslot = getelementptr void(%struct.Foo*)*, void(%struct.Foo*)** %vptr, i32 1
+  %fp = load void(%struct.Foo*)*, void(%struct.Foo*)** %vslot
+  tail call void %fp(%struct.Foo* %obj)
+  tail call void %fp(%struct.Foo* %obj)
+  ret void
+}
+
+; X64-LABEL: vcall:
+; X64:   movq %rdi, %[[obj:[^ ]*]]
+; X64:   movq (%rdi), %[[vptr:[^ ]*]]
+; X64:   movq 8(%[[vptr]]), %[[fp:[^ ]*]]
+; X64:   movq %[[fp]], %r11
+; X64:   callq __x86_indirect_thunk_r11
+; X64-DAG:   movq %[[obj]], %rdi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   jmp __x86_indirect_thunk_r11 # TAILCALL
+
+; X64FAST-LABEL: vcall:
+; X64FAST:   callq __x86_indirect_thunk_r11
+; X64FAST:   jmp __x86_indirect_thunk_r11 # TAILCALL
+
+
+declare void @direct_callee()
+
+define void @direct_tail() #0 {
+  tail call void @direct_callee()
+  ret void
+}
+
+; X64-LABEL: direct_tail:
+; X64:   jmp direct_callee # TAILCALL
+; X64FAST-LABEL: direct_tail:
+; X64FAST:   jmp direct_callee # TAILCALL
+
+
+declare void @nonlazybind_callee() #1
+
+define void @nonlazybind_caller() #0 {
+  call void @nonlazybind_callee()
+  tail call void @nonlazybind_callee()
+  ret void
+}
+
+; X64-LABEL: nonlazybind_caller:
+; X64:   movq nonlazybind_callee@GOTPCREL(%rip), %[[REG:.*]]
+; X64:   movq %[[REG]], %r11
+; X64:   callq __x86_indirect_thu

[PATCH] D75934: Add Indirect Thunk Support to X86 to mitigate Load Value Injection (LVI) [2/6]

2020-03-18 Thread Scott Constable via Phabricator via cfe-commits
sconstab marked 12 inline comments as done.
sconstab added inline comments.



Comment at: llvm/lib/Target/X86/X86LoadValueInjectionIndirectThunks.cpp:79
+
+bool X86LoadValueInjectionIndirectThunksPass::doInitialization(Module &M) {
+  InsertedThunks = false;

zbrid wrote:
> I want to make sure I understand this correctly: You use this function to 
> initialize InsertedThunks so that, for each module, InsertedThunks is shared 
> across all the functions. This enables the Module to ensure the thunk is only 
> inserted once. Is that right?
As mentioned in a code comment at the beginning of this file, a lot of this 
code was lifted from X86RetpolineThunks.cpp, including the logic to insert one 
thunk per module. So I didn't actually compose this logic, but my understanding 
of it seems to match yours.



Comment at: llvm/lib/Target/X86/X86LoadValueInjectionIndirectThunks.cpp:87
+  STI = &MF.getSubtarget();
+  if (!(STI->hasSSE2() || STI->is64Bit())) {
+// FIXME: support 32-bit

zbrid wrote:
> Why is 32-bit okay if it has SSE2 features? (Asking to understand since my 
> processor knowledge is a bit weak. Thanks!)
Actually it isn't okay (at this time), so you were correct to point this out. 
The thunks right now clearly only support 64-bit. It may be conceivable to 
support 32-bit that also has SSE2, but I think this would require some extra 
work.



Comment at: llvm/lib/Target/X86/X86LoadValueInjectionIndirectThunks.cpp:129
+
+  assert(MF.getName() == "__x86_indirect_thunk_r11" &&
+ "Should only have an r11 thunk on 64-bit targets");

zbrid wrote:
> Should this use R11ThunkName instead of this string literal?
I just removed the assertion because it wasn't really necessary.



Comment at: llvm/lib/Target/X86/X86LoadValueInjectionIndirectThunks.cpp:149
+  // inline.
+  AttrBuilder B;
+  B.addAttribute(llvm::Attribute::NoUnwind);

zbrid wrote:
> I see this list is from the retpoline pass. I don't know what each of these 
> things do, but just wondering if you double checked these are the same 
> attributes we want for this thunk?
I do think that these are correct. From the LLVM reference manual, NoUnwind 
means that "This function attribute indicates that the function never raises an 
exception," which should hold for the thunk. The Naked attribute "disables 
prologue / epilogue emission for the function," which is something we would not 
want.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75934/new/

https://reviews.llvm.org/D75934



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75936: Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]

2020-03-18 Thread Scott Constable via Phabricator via cfe-commits
sconstab marked 5 inline comments as done.
sconstab added inline comments.



Comment at: llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp:271
+// Apply the mitigation to `MF`, return the number of fences inserted.
+// If `FixedLoads` is `true`, then the mitigation will be applied to both fixed
+// and non-fixed loads; otherwise, only non-fixed loads.

zbrid wrote:
> Am I misunderstanding this comment? It sounds like if FixedLoads is true then 
> BOTH fixed loads and non-fixed loads will be mitigated. Since 
> runOnMachineFunction would call hardenLoads twice for non-fixed loads, would 
> that result in double mitigation for non-fixed loads in the case where we 
> also harden fixed loads? Unfortunately I'm having trouble reasoning through 
> this myself, so I'd appreciate some clarification.
The comment was incorrect.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75936/new/

https://reviews.llvm.org/D75936



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75936: Add a Pass to X86 that builds a Condensed CFG for Load Value Injection (LVI) Gadgets [4/6]

2020-03-18 Thread Scott Constable via Phabricator via cfe-commits
sconstab updated this revision to Diff 251242.
sconstab added a comment.

Addressed Zola's comments.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75936/new/

https://reviews.llvm.org/D75936

Files:
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Arch/X86.cpp
  llvm/lib/Target/X86/CMakeLists.txt
  llvm/lib/Target/X86/ImmutableGraph.h
  llvm/lib/Target/X86/X86.h
  llvm/lib/Target/X86/X86.td
  llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp
  llvm/lib/Target/X86/X86Subtarget.h
  llvm/lib/Target/X86/X86TargetMachine.cpp
  llvm/test/CodeGen/X86/O0-pipeline.ll
  llvm/test/CodeGen/X86/O3-pipeline.ll
  llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll

Index: llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll
===
--- /dev/null
+++ llvm/test/CodeGen/X86/lvi-hardening-gadget-graph.ll
@@ -0,0 +1,129 @@
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown -x86-lvi-load-dot-verify -o %t < %s | FileCheck %s
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local i32 @test(i32* %untrusted_user_ptr, i32* %secret, i32 %secret_size) #0 {
+entry:
+  %untrusted_user_ptr.addr = alloca i32*, align 8
+  %secret.addr = alloca i32*, align 8
+  %secret_size.addr = alloca i32, align 4
+  %ret_val = alloca i32, align 4
+  %i = alloca i32, align 4
+  store i32* %untrusted_user_ptr, i32** %untrusted_user_ptr.addr, align 8
+  store i32* %secret, i32** %secret.addr, align 8
+  store i32 %secret_size, i32* %secret_size.addr, align 4
+  store i32 0, i32* %ret_val, align 4
+  call void @llvm.x86.sse2.lfence()
+  store i32 0, i32* %i, align 4
+  br label %for.cond
+
+for.cond: ; preds = %for.inc, %entry
+  %0 = load i32, i32* %i, align 4
+  %1 = load i32, i32* %secret_size.addr, align 4
+  %cmp = icmp slt i32 %0, %1
+  br i1 %cmp, label %for.body, label %for.end
+
+for.body: ; preds = %for.cond
+  %2 = load i32, i32* %i, align 4
+  %rem = srem i32 %2, 2
+  %cmp1 = icmp eq i32 %rem, 0
+  br i1 %cmp1, label %if.then, label %if.else
+
+if.then:  ; preds = %for.body
+  %3 = load i32*, i32** %secret.addr, align 8
+  %4 = load i32, i32* %ret_val, align 4
+  %idxprom = sext i32 %4 to i64
+  %arrayidx = getelementptr inbounds i32, i32* %3, i64 %idxprom
+  %5 = load i32, i32* %arrayidx, align 4
+  %6 = load i32*, i32** %untrusted_user_ptr.addr, align 8
+  store i32 %5, i32* %6, align 4
+  br label %if.end
+
+if.else:  ; preds = %for.body
+  %7 = load i32*, i32** %secret.addr, align 8
+  %8 = load i32, i32* %ret_val, align 4
+  %idxprom2 = sext i32 %8 to i64
+  %arrayidx3 = getelementptr inbounds i32, i32* %7, i64 %idxprom2
+  store i32 42, i32* %arrayidx3, align 4
+  br label %if.end
+
+if.end:   ; preds = %if.else, %if.then
+  %9 = load i32*, i32** %untrusted_user_ptr.addr, align 8
+  %10 = load i32, i32* %9, align 4
+  store i32 %10, i32* %ret_val, align 4
+  br label %for.inc
+
+for.inc:  ; preds = %if.end
+  %11 = load i32, i32* %i, align 4
+  %inc = add nsw i32 %11, 1
+  store i32 %inc, i32* %i, align 4
+  br label %for.cond
+
+for.end:  ; preds = %for.cond
+  %12 = load i32, i32* %ret_val, align 4
+  ret i32 %12
+}
+
+; CHECK:  digraph "Speculative gadgets for \"test\" function" {
+; CHECK-NEXT: label="Speculative gadgets for \"test\" function";
+; CHECK:  Node0x{{[0-9a-f]+}} [shape=record,color = green,label="{LFENCE\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 0];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $eax = MOV32rm %stack.4.i, 1, $noreg, 0, $noreg :: (dereferenceable load 4 from %ir.i)\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{JCC_1 %bb.6, 13, implicit killed $eflags\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{CMP32rm killed renamable $eax, %stack.2.secret_size.addr, 1, $noreg, 0, $noreg, implicit-def $eflags :: (dereferenceable load 4 from %ir.secret_size.addr)\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} [shape=record,label="{renamable $eax = MOV32rm %stack.4.i, 1, $noreg, 0, $noreg :: (dereferenceable load 4 from %ir.i)\n}"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[color = red, style = "dashed"];
+; CHECK-NEXT: Node0x{{[0-9a-f]+}} -> Node0x{{[0-9a-f]+}}[label = 1];
+

[PATCH] D76458: Add Indirect Thunk Support to X86 to mitigate Load Value Injection (LVI) [by modifying X86RetpolineThunks.cpp]

2020-03-19 Thread Scott Constable via Phabricator via cfe-commits
sconstab created this revision.
sconstab added reviewers: zbrid, craig.topper, andrew.w.kaylor, chandlerc.
Herald added subscribers: jfb, hiraditya.

This patch is an alternate implementation of D75934 
 that mitigates LVI indirect calls/jumps by 
making changes to the existing X86RetpolineThunks pass, instead of introducing 
a new pass.


https://reviews.llvm.org/D76458

Files:
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Arch/X86.cpp
  llvm/lib/Target/X86/X86.td
  llvm/lib/Target/X86/X86RetpolineThunks.cpp
  llvm/lib/Target/X86/X86Subtarget.h
  llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll

Index: llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll
===
--- /dev/null
+++ llvm/test/CodeGen/X86/lvi-hardening-indirectbr.ll
@@ -0,0 +1,281 @@
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown -mattr=+lvi-cfi < %s | FileCheck %s --check-prefix=X64
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown -mattr=+lvi-cfi -O0 < %s | FileCheck %s --check-prefix=X64FAST
+;
+; Note that a lot of this code was lifted from retpoline.ll.
+
+declare void @bar(i32)
+
+; Test a simple indirect call and tail call.
+define void @icall_reg(void (i32)* %fp, i32 %x) {
+entry:
+  tail call void @bar(i32 %x)
+  tail call void %fp(i32 %x)
+  tail call void @bar(i32 %x)
+  tail call void %fp(i32 %x)
+  ret void
+}
+
+; X64-LABEL: icall_reg:
+; X64-DAG:   movq %rdi, %[[fp:[^ ]*]]
+; X64-DAG:   movl %esi, %[[x:[^ ]*]]
+; X64:   movl %esi, %edi
+; X64:   callq bar
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   callq __llvm_retpoline_r11
+; X64:   movl %[[x]], %edi
+; X64:   callq bar
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   jmp __llvm_retpoline_r11 # TAILCALL
+
+; X64FAST-LABEL: icall_reg:
+; X64FAST:   callq bar
+; X64FAST:   callq __llvm_retpoline_r11
+; X64FAST:   callq bar
+; X64FAST:   jmp __llvm_retpoline_r11 # TAILCALL
+
+
+@global_fp = external global void (i32)*
+
+; Test an indirect call through a global variable.
+define void @icall_global_fp(i32 %x, void (i32)** %fpp) #0 {
+  %fp1 = load void (i32)*, void (i32)** @global_fp
+  call void %fp1(i32 %x)
+  %fp2 = load void (i32)*, void (i32)** @global_fp
+  tail call void %fp2(i32 %x)
+  ret void
+}
+
+; X64-LABEL: icall_global_fp:
+; X64-DAG:   movl %edi, %[[x:[^ ]*]]
+; X64-DAG:   movq global_fp(%rip), %r11
+; X64:   callq __llvm_retpoline_r11
+; X64-DAG:   movl %[[x]], %edi
+; X64-DAG:   movq global_fp(%rip), %r11
+; X64:   jmp __llvm_retpoline_r11 # TAILCALL
+
+; X64FAST-LABEL: icall_global_fp:
+; X64FAST:   movq global_fp(%rip), %r11
+; X64FAST:   callq __llvm_retpoline_r11
+; X64FAST:   movq global_fp(%rip), %r11
+; X64FAST:   jmp __llvm_retpoline_r11 # TAILCALL
+
+
+%struct.Foo = type { void (%struct.Foo*)** }
+
+; Test an indirect call through a vtable.
+define void @vcall(%struct.Foo* %obj) #0 {
+  %vptr_field = getelementptr %struct.Foo, %struct.Foo* %obj, i32 0, i32 0
+  %vptr = load void (%struct.Foo*)**, void (%struct.Foo*)*** %vptr_field
+  %vslot = getelementptr void(%struct.Foo*)*, void(%struct.Foo*)** %vptr, i32 1
+  %fp = load void(%struct.Foo*)*, void(%struct.Foo*)** %vslot
+  tail call void %fp(%struct.Foo* %obj)
+  tail call void %fp(%struct.Foo* %obj)
+  ret void
+}
+
+; X64-LABEL: vcall:
+; X64:   movq %rdi, %[[obj:[^ ]*]]
+; X64:   movq (%rdi), %[[vptr:[^ ]*]]
+; X64:   movq 8(%[[vptr]]), %[[fp:[^ ]*]]
+; X64:   movq %[[fp]], %r11
+; X64:   callq __llvm_retpoline_r11
+; X64-DAG:   movq %[[obj]], %rdi
+; X64-DAG:   movq %[[fp]], %r11
+; X64:   jmp __llvm_retpoline_r11 # TAILCALL
+
+; X64FAST-LABEL: vcall:
+; X64FAST:   callq __llvm_retpoline_r11
+; X64FAST:   jmp __llvm_retpoline_r11 # TAILCALL
+
+
+declare void @direct_callee()
+
+define void @direct_tail() #0 {
+  tail call void @direct_callee()
+  ret void
+}
+
+; X64-LABEL: direct_tail:
+; X64:   jmp direct_callee # TAILCALL
+; X64FAST-LABEL: direct_tail:
+; X64FAST:   jmp direct_callee # TAILCALL
+
+
+declare void @nonlazybind_callee() #1
+
+define void @nonlazybind_caller() #0 {
+  call void @nonlazybind_callee()
+  tail call void @nonlazybind_callee()
+  ret void
+}
+
+; X64-LABEL: nonlazybind_caller:
+; X64:   movq nonlazybind_callee@GOTPCREL(%rip), %[[REG:.*]]
+; X64:   movq %[[REG]], %r11
+; X64:   callq __llvm_retpoline_r11
+; X64:   movq %[[REG]], %r11
+; X64:   jmp __llvm_retpoline_r11 # TAILCALL
+; X64FAST-LABEL: nonlazybind_caller:
+; X64FAST:   movq nonlazybind_callee@GOTPCREL(%rip), %r11
+; X64FAST:   callq __llvm_retpoline_r11
+; X64FAST:   movq nonlazybind_callee@GOTPCREL(%rip), %r11
+; X64FAST:   jmp __llvm_retpoline_r11 # TAILCALL
+
+
+; Check that a switch gets lowered using a jump table
+define void @switch_jumptable(i32* %ptr, i64* %sink) #0 {
+; X64-LABEL: switch_jumptabl