rnk added a subscriber: hans.
rnk added a comment.

Here is some feedback, I apologize for dragging my feet.

Also, I would really like to get feedback from @pcc. He's OOO currently, though.



================
Comment at: clang/include/clang/Driver/Options.td:500
 def b : JoinedOrSeparate<["-"], "b">, Flags<[Unsupported]>;
+def cfguard_no_checks : Flag<["-"], "cfguard-no-checks">, Flags<[CC1Option]>,
+  HelpText<"Emit Windows Control Flow Guard tables only (no checks).">;
----------------
@hans, WDYT about the -cc1 interface? I think this is fine.


================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:5972
+    if (Instrument && !NoChecks)
+      //Emit CFG instrumentation and the table of address-taken functions.
       CmdArgs.push_back("-cfguard");
----------------
Comment formatting needs to be fixed, you can use clang-format for this.


================
Comment at: llvm/include/llvm/CodeGen/MachineBasicBlock.h:179
+    assert((!CachedMCSymbol || LabelIsPublic == true) && "Cannot set label to 
public after it has been accessed.");
+    LabelIsPublic = true; 
+  }
----------------
I hesitate to make MBB labels public, I think it might be better to build your 
own MCSymbol label with a public name.


================
Comment at: llvm/lib/CodeGen/CFGuardLongjmp.cpp:101
+  // targets.
+  for (MachineInstr *Setjmp : SetjmpCalls) {
+    MachineBasicBlock *OldMBB = Setjmp->getParent();
----------------
Rather than changing the machine CFG, I would suggest using 
MachineInstr::setPostInstrSymbol, which I've been planning to use for exactly 
this purpose. :) You can make labels with MCContext::createSymbol, and you'll 
want to come up with symbols that won't clash with C or C++ symbols. I see MSVC 
makes labels like `$LN4`, so maybe `$cfgsjN` or something like that. I think 
MCContext will add numbers for you.


================
Comment at: llvm/lib/Target/X86/X86FixupCFGuard.cpp:11
+/// This file contains the machine instruction transform to fixup the Control
+/// Flow Guard checks inserted by /X86CFGuard.cpp. This pass searches
+/// for such cases and replaces the pair of instructions with a single
----------------
I guess it's not X86CFGuard.cpp now that you've generalized it as a 
cross-platform IR pass.


================
Comment at: llvm/lib/Target/X86/X86FixupCFGuard.cpp:57-58
+
+/// MachineFunction pass to fix x86 Control Flow Guard checks that have been
+/// generated as separate load and call/invoke instructions.
+class X86FixupCFGuard : public MachineFunctionPass {
----------------
Can you elaborate on what the correctness requirement is? The correctness 
requirement is just that the loaded value is not saved in a CSR which is then 
spilled and reloaded from memory across a call site. If possible, I think it 
would be worth teaching both FastISel and SDISel to build the correct 
MachineInstrs during instruction selection, and turn this into a validation 
pass that fails if it sees indirect calls through registers in functions where 
CFG is enabled.


================
Comment at: llvm/lib/Target/X86/X86FixupCFGuard.cpp:114
+    OldCallOpcode = X86::CALL64r;
+    NewCallOpcode = X86::CALL64m;
+  } else {
----------------
I think you probably also want to look for tail calls. I think the best way to 
do this would be to use `MI.isCall()` in the loop, and then switch on the 
opcode to handle all call variants.


================
Comment at: llvm/lib/Target/X86/X86FixupCFGuard.cpp:134
+  // load instruction
+  for (auto I = LoadMI.getIterator(), E = MBB->instr_end(); I != E; ++I) {
+    MachineInstr &MI = cast<MachineInstr>(*I);
----------------
Is it possible to use MachineRegisterInfo to iterate over all the uses of the 
load, instead of scanning the BB? It seems like this could be O(n^2) if a 
single BB contains a long sequence of indirect calls. I think it would look 
like this:
  MachineRegisterInfo *MRI = MF.getRegInfo();
  for (MachineInstr &MI : MRI->use_instructions(Reg))
    // do something with MI
There's also use_operands.


================
Comment at: llvm/lib/Target/X86/X86FixupCFGuard.cpp:168-170
+  // Erase all marked call/invoke instructions
+  for (MachineInstr *MI : MIsToErase) {
+    MI->eraseFromParent();
----------------
Erasing at the end is always the safest way to do it. :)


================
Comment at: llvm/lib/Target/X86/X86FixupCFGuard.cpp:209
+  for (MachineInstr *LoadMI : GuardLoads) {
+    fixupGuardLoad(*LoadMI, TII);
+
----------------
This doesn't do anything with the return value, should it?


================
Comment at: llvm/lib/Transforms/CFGuard/CFGuard.cpp:38
+/// architectures (e.g. X86_32 uses cf_check, X86_64 uses cf_dispatch).
+class CFGuard : public ModulePass {
+public:
----------------
Does this need to be a ModulePass? It's preferable to avoid module passes if 
possible. When the pass manager has a sequence of function passes to run, it 
runs all passes on function f, then g, then h, and so on, rather than running 
the function pass over the entire module and then the next pass over the entire 
module, and so on. This is done to get better cache locality: each function 
pass touches f's data structures repeatedly, and then moves on to g without 
ever going back to f.

You should be able to do the global checks in an override of 
`doInitialization`, which receives a Module.


================
Comment at: llvm/lib/Transforms/CFGuard/CFGuard.cpp:42
+
+  enum Mechanism { cf_check, cf_dispatch };
+
----------------
I guess more LLVM-style names for these would be CF_Check, CF_Dispatch. Of 
course, Rui is changing the style soon anyway.


================
Comment at: llvm/lib/Transforms/CFGuard/CFGuard.cpp:220
+    GuardDispatch =
+        B.CreateCall(GuardDispatchType, GuardDispatchLoad, GuardDispatchArgs);
+  } else if (InvokeInst::classof(CB)) {
----------------
You'd want to set the tail call kind here to forward `tail` or `musttail`. This 
matters for things like member pointer function thunks. This C++ makes a 
musttail thunk, for example:
  struct A { virtual void f(); };
  auto mp = &A::f;
Which gives this IR:
  define linkonce_odr void @"??_9A@@$BA@AA"(%struct.A* %this, ...) #0 comdat 
align 2 {
  entry:
    %0 = bitcast %struct.A* %this to void (%struct.A*, ...)***
    %vtable = load void (%struct.A*, ...)**, void (%struct.A*, ...)*** %0, 
align 8, !tbaa !3
    %1 = load void (%struct.A*, ...)*, void (%struct.A*, ...)** %vtable, align 8
    musttail call void (%struct.A*, ...) %1(%struct.A* %this, ...)
    ret void
  }
This does an indirect virtual call through the 0th slot, and if we lose 
musttail, it won't perfectly forward arguments.



================
Comment at: llvm/lib/Transforms/CFGuard/CFGuard.cpp:254
+  // register (i.e. RAX on 64-bit X86 targets)
+  GuardDispatch->setCallingConv(CallingConv::CFGuard_Dispatch);
+
----------------
So, this isn't going to work if the original calling convention was something 
other than the default. For x64, the only commonly used non-standard convention 
would be vectorcall. Changing the convention here in the IR is going to make it 
hard to pass that along.

I think as you can see from how much code you have here already, replacing call 
instructions in general is really hard. These days there are also bundles, 
which I guess is something else missing from the above.

Here's a sketch of an alternative approach:
- load __guard_dispatch_icall_fptr as normal
- get the pre-existing function type of the callee
- cast the loaded pointer to the original function pointer type
- use `CB->setCalledOperand` to replace the call target
- add the original call target as an "argument bundle" to the existing 
instruction
- change SelectionDAGBuilder.cpp to recognize the new bundle kind and create a 
new ArgListEntry in the argument list
- make and set a new bit in ArgListEntry, something like isCFGTarget
- change CC_X86_Win64_C to put CFGTarget arguments in RAX, similar to how 
CCIfNest puts things in R10

This way, the original call instruction remains with exactly the same 
arrangement of args, attributes, callbr successors, tail call kind, etc. All 
you're doing is changing the callee, and passing the original callee as an 
extra argument on the side. Basically, this approach leverages bundles to avoid 
messing with the function type, which more or less requires replacing the call. 
:)

There are probably issues with this design that I haven't foreseen, but I think 
it's worth a try.


================
Comment at: llvm/test/CodeGen/AArch64/cfguard-checks.ll:3-4
+
+; This test is disabled on Linux
+; UNSUPPORTED: linux
+
----------------
These annotations shouldn't be necessary, the tests should pass as written on 
Linux. The mtriple flag above tells llc to generate code for Windows.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65761/new/

https://reviews.llvm.org/D65761



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to