rnk added inline comments.

================
Comment at: lib/CodeGen/CGCUDABuiltin.cpp:105-108
@@ -104,2 +104,6 @@
   } else {
-    BufferPtr = Builder.Insert(new llvm::AllocaInst(
+    // Insert our alloca not into the current BB, but into the function's entry
+    // block.  This is important because nvvm doesn't support alloca -- if we
+    // put the alloca anywhere else, llvm may eventually output
+    // stacksave/stackrestore intrinsics, which cause our nvvm backend to choke.
+    auto *Alloca = new llvm::AllocaInst(
----------------
The fact that allocas for local variables should always go in the entry block 
is pretty widespread cultural knowledge in LLVM and clang. Most readers aren't 
going to need this comment, unless you expect that people working on CUDA won't 
have that background. Plus, if you use CreateTempAlloca, there won't be any 
question about which insert point should be used.

================
Comment at: lib/CodeGen/CGCUDABuiltin.cpp:109
@@ -106,1 +108,3 @@
+    // stacksave/stackrestore intrinsics, which cause our nvvm backend to choke.
+    auto *Alloca = new llvm::AllocaInst(
         llvm::Type::getInt8Ty(Ctx), llvm::ConstantInt::get(Int32Ty, BufSize),
----------------
You can still use CreateTempAlloca by making an `[N x i8]` LLVM type. You'll 
have to use CreateStructGEP below for forming GEPs. Overall I think that'd be 
nicer, since you don't need to worry about insertion at all.


http://reviews.llvm.org/D16664


