rjmccall added inline comments.

================
Comment at: lib/CodeGen/CGDecl.cpp:1291
+// where the pointer to the local variable is the key in the map.
+void CodeGenFunction::EmitAutoVarNoAlias(const AutoVarEmission &emission) {
+  assert(emission.Variable && "emission was not valid!");
----------------
This should really be a subroutine of EmitAutoVarAlloca; that will implicitly 
pick this logic up for a bunch of less standard places in IRGen that create 
local variables.  Please make it a static function (if possible) called 
EmitNoAliasScope and call it immediately after EmitVarAnnotations.


================
Comment at: lib/CodeGen/CGStmt.cpp:525
+void CodeGenFunction::LexicalNoAliasInfo::addNoAliasMD() {
+  if (MemoryInsts.empty() || NoAliasScopes.empty())
+    return;
----------------
hfinkel wrote:
> rjmccall wrote:
> > It's much more likely that NoAliasScopes will be empty than that 
> > MemoryInsts will be empty.  You should probably fast-path using that, or 
> > better yet, with the RecordMemoryInsts bit.
> I'm not sure that's true; we only record memory-accessing instructions in the 
> first place if there are relevant restrict-qualified pointers around.
Ok, that makes sense.


================
Comment at: lib/CodeGen/CGStmt.cpp:537
+                                      llvm::LLVMContext::MD_noalias),
+                                      NewScopeList));
+
----------------
rjmccall wrote:
> hfinkel wrote:
> > rjmccall wrote:
> > > This is a very strange representation.  Every memory operation in the 
> > > lexical block is annotated with a list of all of the scopes that were 
> > > entered within the block, even if they were entered after the operation.  
> > > But for some reason, not with nested scopes?
> > > 
> > > What's the right patch for me to read about this representation?
> > Perhaps unfortunately, this is an artifact of the way that restrict is 
> > defined in C. It applies to all accesses in the block in which the variable 
> > is declared, even those before the declaration of the restrict-qualified 
> > local itself.
> > 
> > It should work with nested scopes, in the sense that we add these things as 
> > we complete each scope. So we add things to the inner scope, and then when 
> > we complete the outer scope, we go back over the instructions (including 
> > those in the inner scope because the scope recording recurses up the scope 
> > hierarchy), and adds the outer scopes - it concatenates them to any added 
> > by the inner (nested) scopes.
> > 
> > The noalias intrinsic's LangRef updates are in D9375.
> Ah, I see the recursion now.  Please add a comment explaining that 
> expectation here.
I would still like this comment.  It should be enough to explain how 
MemoryInsts actually gets filled: there's a callback from inserting an 
instruction which adds all memory instructions to every active lexical scope 
that's recording them.


================
Comment at: lib/CodeGen/CodeGenFunction.cpp:1986
+  if (I->mayReadOrWriteMemory())
+    recordMemoryInstruction(I);
 }
----------------
This condition will include calls, which I assume is unnecessary (except maybe 
memory intrinsics?).  More seriously, it will also include loads and stores 
into allocas, which will sweep in all sorts of bookkeeping temporaries that 
IRGen uses in more complex code-generation situations.  You might want to 
consider ways to avoid recording this kind of instruction that are very likely 
to just get immediately optimized away by e.g. mem2reg.

Please also modify LexicalScope so that it records whether there's any active 
recording scope in the CodeGenFunction.  Lexical scopes are nested, so that 
should be as easy as saving the current state when you enter a scope and 
restoring it when you leave.  That will allow you to optimize this code by 
completely bypassing the recursion and even the check for whether the 
instruction is a memory instruction in the extremely common case of a function 
with no restrict variables.

In general, a lot of things about this approach seem to have worryingly poor 
asymptotic performance in the face of large functions or (especially) many 
restrict variables.  (You could, for example, have chosen to have each memory 
instruction link to its enclosing noalias scope, which would contain a link to 
its enclosing scope and a list of the variables it directly declares.  This 
would be a more complex representation to consume, but it would not require 
inherently quadratic frontend work building all these lists.)  The only saving 
grace is that we expect very little code to using restrict, so it becomes vital 
to ensure that your code is immediately short-circuited when nothing is using 
restrict.


================
Comment at: lib/CodeGen/CodeGenFunction.h:597
+    // those blocks) so that we can later add the appropriate metadata. Record
+    // this instruction and so the same in any parent scopes.
+    void recordMemoryInstruction(llvm::Instruction *I) {
----------------
I would suggest this wording:

  /// Record the given memory access instruction in this and all of its 
enclosing scopes.
  /// When we close this scope, we'll add the list of all the 
restrict-qualified local variables
  /// it declared to all the memory accesses which occurred within it.  
Recording is only
  /// enabled under optimization, and it's disabled in a scope unless it 
actually declares
  /// any local restrict variables.


https://reviews.llvm.org/D9403



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to