llvmbot wrote:

<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-clang-codegen

Author: Chuanqi Xu (ChuanqiXu9)

<details>
<summary>Changes</summary>

This comes from an internal crash. I know generally it is better to reproduce 
it first but I do feel the pattern is pretty risky. So I am wondering if we can 
discuss it first. So maybe this is more of a discussion instead of a pure PR.

Then story is, when we try to get or create a LLVM global for a C/C++'s global, 
we will try to look up the name first for the existing globals. And if we find 
one, we will perform some checks. If the checks pass, we will return the found 
one. If not, we will create a new one and replace the previous one. (Why do we 
want to do this? My instinct reaction is that we should abort here):
https://github.com/llvm/llvm-project/blob/bf43a138f0a6220cd043a376200bd739cacb25e3/clang/lib/CodeGen/CodeGenModule.cpp#L4966-L4982

https://github.com/llvm/llvm-project/blob/bf43a138f0a6220cd043a376200bd739cacb25e3/clang/lib/CodeGen/CodeGenModule.cpp#L5017-L5032

The problem is, if we store the address of a global variable and the global 
variable got replaced later, the address we stored became a wild pointer! e.g.

https://github.com/llvm/llvm-project/blob/283273fa1e3be4a03f06a5efd08a8c818be981fd/clang/lib/CodeGen/CodeGenModule.cpp#L2092-L2097

I feel this is pretty dangerous. And to my knowledge, I think we'd better to 
not remove things emitted during CodeGen.

Then, one of the trigger for the problem is `CodeGenModule::GetAddrOfGlobalVar`:

https://github.com/llvm/llvm-project/blob/283273fa1e3be4a03f06a5efd08a8c818be981fd/clang/lib/CodeGen/CodeGenModule.cpp#L5232C17-L5243

The arguments except `D` can be omitted. And if we don't specify `Ty`, the 
function will try to deduce the type from `D`. And use the type to get or 
create a LLVM global in the above process. And the `Ty` arguments may not 
always be omitted, e.g., in 
https://github.com/llvm/llvm-project/blob/283273fa1e3be4a03f06a5efd08a8c818be981fd/clang/lib/CodeGen/CodeGenModule.cpp#L5484-L5564,
 we will try to deduce the LLVM type directly.

Then problem happens, sometimes we try to get or create the global variable by 
the AST type, but sometimes we try to get or create the **same** global 
variable by deduced type, and if the two types differs, we may be in the 
trouble of wild pointer.

(the two types are compatible: e.g., one is `struct { %another.struct}` with 
`%another.struct = { ptr }` and another  type is `{ { ptr } }`).

The solution or one workaround I got is, in 
`CodeGenModule::GetAddrOfGlobalVar`, if we didn't specify the `Ty` and we have 
the same variable, return the variable directly. I think it makes sense since 
if the `Ty` is not specified, it implies the caller doesn't care about it too 
much.

WDYT?



---
Full diff: https://github.com/llvm/llvm-project/pull/114948.diff


2 Files Affected:

- (modified) clang/lib/CodeGen/CodeGenModule.cpp (+8-1) 
- (modified) clang/test/CodeGen/attr-weakref2.c (+1-1) 


``````````diff
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp 
b/clang/lib/CodeGen/CodeGenModule.cpp
index ba376f9ecfacde..9566cfb8d6e794 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -5233,11 +5233,18 @@ llvm::Constant *CodeGenModule::GetAddrOfGlobalVar(const 
VarDecl *D,
                                                   llvm::Type *Ty,
                                            ForDefinition_t IsForDefinition) {
   assert(D->hasGlobalStorage() && "Not a global variable");
+
+  StringRef MangledName = getMangledName(D);
+  llvm::GlobalValue *Entry = GetGlobalValue(MangledName);
   QualType ASTTy = D->getType();
+  LangAS AddrSpace = ASTTy.getAddressSpace();
+
+  if (Entry && !Ty && Entry->getAddressSpace() == 
getContext().getTargetAddressSpace(AddrSpace))
+    return Entry;
+
   if (!Ty)
     Ty = getTypes().ConvertTypeForMem(ASTTy);
 
-  StringRef MangledName = getMangledName(D);
   return GetOrCreateLLVMGlobal(MangledName, Ty, ASTTy.getAddressSpace(), D,
                                IsForDefinition);
 }
diff --git a/clang/test/CodeGen/attr-weakref2.c 
b/clang/test/CodeGen/attr-weakref2.c
index 114f048a851832..a67f906810faf3 100644
--- a/clang/test/CodeGen/attr-weakref2.c
+++ b/clang/test/CodeGen/attr-weakref2.c
@@ -33,7 +33,7 @@ int test4_h(void) {
 }
 int test4_f;
 
-// CHECK: @test5_f = external global i32
+// CHECK: @test5_f = extern_weak global i32
 extern int test5_f;
 static int test5_g __attribute__((weakref("test5_f")));
 int test5_h(void) {

``````````

</details>


https://github.com/llvm/llvm-project/pull/114948
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to