bevin-hansson wrote:

Hi @xedin! We've observed a difference downstream due to this patch and are 
curious whether this was intentional. It seems that the changes to how 
AttributedType is keyed (including the attribute) cause some type duplication 
when attributes are involved. For example, building this (reduced) program with 
`clang -target x86_64 -fsanitize=undefined`:
```
void a() {
  for (unsigned int b;; *(const unsigned int __attribute__((noderef)) *)*(const 
unsigned int __attribute__((noderef)) *)b)
    ;
}
```
(Ignore that pointers marked `noderef` are being dereferenced here; the original 
repro used `address_space`, but upstream does not emit sanitizer checks for such 
pointers.)

Before this patch, there would only be a single type-info struct/string in the 
resulting assembly, but with the patch, there are now two identical ones:
```
        .type   .L__unnamed_3,@object           # @0
        .section        .rodata,"a",@progbits
        .p2align        4, 0x0
 .L__unnamed_3:
        .short  0                               # 0x0
        .short  10                              # 0xa
        .asciz  "'unsigned int const __attribute__((noderef))'"
        .size   .L__unnamed_3, 50
 
        .type   .L__unnamed_1,@object           # @1
        .data
        .p2align        4, 0x0
 .L__unnamed_1:
        .quad   .L.src
        .long   4                               # 0x4
        .long   73                              # 0x49
        .quad   .L__unnamed_3
        .byte   2                               # 0x2
        .byte   0                               # 0x0
        .zero   6
        .size   .L__unnamed_1, 32
 
-       .type   .L__unnamed_2,@object           # @2
+       .type   .L__unnamed_4,@object           # @2
+       .section        .rodata,"a",@progbits
+       .p2align        4, 0x0
+.L__unnamed_4:
+       .short  0                               # 0x0
+       .short  10                              # 0xa
+       .asciz  "'unsigned int const __attribute__((noderef))'"
+       .size   .L__unnamed_4, 50
+
+       .type   .L__unnamed_2,@object           # @3
+       .data
        .p2align        4, 0x0
 .L__unnamed_2:
        .quad   .L.src
        .long   4                               # 0x4
        .long   25                              # 0x19
-       .quad   .L__unnamed_3
+       .quad   .L__unnamed_4
        .byte   2                               # 0x2
        .byte   0                               # 0x0
        .zero   6
        .size   .L__unnamed_2, 32
```
This is possibly happening for the sanitizer emission due to the code in 
CodeGenFunction::EmitCheckTypeDescriptor:
```
  // Only emit each type's descriptor once.
  if (llvm::Constant *C = CGM.getTypeDescriptorFromMap(T))
    return C;
```
The two types are different for map purposes and type creation (since they have 
different syntactical Attrs) but the actual types are really the same.
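
To illustrate the suspected mechanism, here is a minimal, self-contained sketch. It is a toy model, not clang code: `ToyType`, `DescriptorCache`, and `countDescriptors` are hypothetical names made up for this example, and the real map lookup is more involved. It only assumes that the cache is keyed on the identity of the type node, so two nodes that print identically still produce two descriptors.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for a type node (hypothetical; not clang's QualType).
struct ToyType {
  std::string Spelling; // text that would end up in the descriptor string
};

// Toy descriptor cache keyed on node identity, mimicking an
// "only emit each type's descriptor once" lookup.
struct DescriptorCache {
  std::map<const ToyType *, int> Map;
  int Emitted = 0;

  int getOrEmit(const ToyType *Key) {
    assert(Key && "null type node");
    auto It = Map.find(Key);
    if (It != Map.end())
      return It->second; // already emitted for this exact node
    int Id = Emitted++;  // "emit" a new descriptor
    Map.emplace(Key, Id);
    return Id;
  }
};

// Returns how many descriptors get emitted for two uses of the same
// spelled type. SharedNode models both uses resolving to one node;
// otherwise each use gets its own node with identical spelling.
int countDescriptors(bool SharedNode) {
  ToyType A{"'unsigned int const __attribute__((noderef))'"};
  ToyType B{"'unsigned int const __attribute__((noderef))'"};
  const ToyType *Uses[] = {&A, SharedNode ? &A : &B};

  DescriptorCache Cache;
  for (const ToyType *T : Uses)
    Cache.getOrEmit(T);
  return Cache.Emitted;
}
```

With a shared node the cache hits on the second use and one descriptor is emitted; with two distinct nodes the lookup misses and an identical second descriptor is emitted, which matches the shape of the assembly diff above.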

I guess this is pretty rare, but it could cause some hefty duplication 
depending on what types are used and how. There might be other effects I don't 
know of either, but this was the noticeable one for us.

https://github.com/llvm/llvm-project/pull/108631