[llvm-bugs] [Bug 122691] Crash at -O2: Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122691
Summary: Crash at -O2: Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
Labels: crash-on-valid, llvm:transforms
Assignees: dtcxzyw
Reporter: dtcxzyw

Reproducer: https://godbolt.org/z/68MKbnE4W
```
; bin/opt -O2 reduced.ll -S
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

@g_3 = internal unnamed_addr global i64 0
@g_127 = external local_unnamed_addr global [3 x i32]
@g_211 = external local_unnamed_addr global i32

define i8 @func_125() {
entry:
  %call42 = call ptr @func_178()
  ret i8 0
}

define ptr @func_178() {
entry:
  %0 = load i32, ptr @g_211, align 4
  %tobool158.not = icmp eq i32 %0, 0
  br i1 %tobool158.not, label %for.inc434, label %for.cond166.preheader

for.cond166.preheader:; preds = %entry
  br label %for.cond166

for.cond166: ; preds = %for.cond166.preheader, %for.cond166
  %1 = phi i64 [ %2, %for.cond166 ], [ poison, %for.cond166.preheader ]
  %2 = phi i64 [ %inc255, %for.cond166 ], [ 0, %for.cond166.preheader ]
  %cmp167 = icmp samesign ult i64 %2, 61
  %inc255 = add nuw nsw i64 %2, 1
  br i1 %cmp167, label %for.cond166, label %for.inc434.loopexit

for.inc434.loopexit: ; preds = %for.cond166
  %.lcssa = phi i64 [ %1, %for.cond166 ]
  store i64 %.lcssa, ptr @g_3, align 8
  br label %for.inc434

for.inc434:   ; preds = %for.inc434.loopexit, %entry
  store i32 1, ptr @g_211, align 4
  %3 = load i64, ptr @g_3, align 8
  %conv.i = trunc nuw nsw i64 %3 to i32
  %4 = load i32, ptr @g_127, align 4
  %and.i = and i32 %4, %conv.i
  %cmp.i = icmp eq i32 %and.i, 0
  %conv8.i = sext i32 %and.i to i64
  %sext = shl i64 %3, 32
  %conv9.i = ashr exact i64 %sext, 32
  %g_211.promoted5 = load i32, ptr @g_211, align 4
  br label %for.body447

for.body447: ; preds = %for.inc434, %cleanup552
  %p_181.addr.08 = phi i64 [ -13, %for.inc434 ], [ %conv564, %cleanup552 ]
  %and14.i1.lcssa67 = phi i32 [ %g_211.promoted5, %for.inc434 ], [ %and14.i.lcssa, %cleanup552 ]
  br label %if.then489

if.then489:   ; preds = %for.body447, %safe_div_func_int64_t_s_s.exit
  %l_317.03 = phi i32 [ 2, %for.body447 ], [ %sub, %safe_div_func_int64_t_s_s.exit ]
  %and14.i12 = phi i32 [ %and14.i1.lcssa67, %for.body447 ], [ %and14.i, %safe_div_func_int64_t_s_s.exit ]
  br i1 %cmp.i, label %safe_div_func_int64_t_s_s.exit, label %lor.lhs.false.i

lor.lhs.false.i: ; preds = %if.then489
  %div.i = sdiv i64 %conv8.i, %conv9.i
  %5 = trunc i64 %div.i to i16
  br label %safe_div_func_int64_t_s_s.exit

safe_div_func_int64_t_s_s.exit: ; preds = %lor.lhs.false.i, %if.then489
  %cond.i = phi i16 [ %5, %lor.lhs.false.i ], [ 0, %if.then489 ]
  %call12.i = tail call i16 @llvm.bswap.i16(i16 %cond.i)
  %conv13.i = zext i16 %call12.i to i32
  %and14.i = and i32 %and14.i12, %conv13.i
  %call26.i = load volatile ptr, ptr null, align 8
  %sub = add nsw i32 %l_317.03, -1
  %cmp482 = icmp sgt i32 %l_317.03, 0
  br i1 %cmp482, label %if.then489, label %cleanup552

cleanup552:   ; preds = %safe_div_func_int64_t_s_s.exit
  %and14.i.lcssa = phi i32 [ %and14.i, %safe_div_func_int64_t_s_s.exit ]
  %add.i = add nsw i64 %p_181.addr.08, 7
  %conv564 = and i64 %add.i, 255
  %6 = and i64 %p_181.addr.08, 255
  %cmp445.not = icmp eq i64 %6, 22
  br i1 %cmp445.not, label %for.end565, label %for.body447

for.end565:   ; preds = %cleanup552
  %and14.i.lcssa.lcssa = phi i32 [ %and14.i.lcssa, %cleanup552 ]
  store i32 %conv.i, ptr getelementptr inbounds nuw (i8, ptr @g_127, i64 8), align 4
  store i32 %and14.i.lcssa.lcssa, ptr @g_211, align 4
  ret ptr null
}

define ptr @func_183(i64 %p_185) {
entry:
  %call = call i64 @builtin_uaddl_overflow(i64 %p_185, i64 0)
  unreachable
}

define i64 @builtin_uaddl_overflow(i64 %x, i64 %y) {
entry:
  %0 = call { i64, i1 } @llvm.uadd.with.overflow.i64(i64 %x, i64 %y)
  %1 = extractvalue { i64, i1 } %0, 1
  %conv = zext i1 %1 to i64
  ret i64 %conv
}
```
```
opt: /root/llvm-project/llvm/include/llvm/Support/Casting.h:578: decltype(auto) llvm::cast(From*) [with To = llvm::Instruction; From = llvm::Value]: Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /opt/compiler-explorer/clang-assertions-trunk/bin/opt -o /app/output.s -S -O2 
1.	Running pass "function(float2int,lower-constant-intrinsics,loop(loop-rotate,loop-deletion),loop-distribute,inject-tli-mappings,loop-vectorize,infer-alignment,loop-load-elim,instcombine,simplifycfg,slp-vectorizer,vector-
```

[llvm-bugs] [Bug 122690] Clang-format 19 regression from 18

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122690
Summary: Clang-format 19 regression from 18
Labels: clang-format
Assignees: (none)
Reporter: sgh

Clang-format 19 generates this:

struct DLL_PUBLIC SomeClas{SomeClass(){}};

However, if I remove `DLL_PUBLIC`, I get this:

struct SomeClas {
	SomeClass() {}
};


This is my `.clang-format`:

---
Language:Cpp
# BasedOnStyle:  Google
AccessModifierOffset: -4
AlignAfterOpenBracket: Align
AlignArrayOfStructures: Left
AlignConsecutiveAssignments:
  Enabled: false
  AcrossEmptyLines: false
  AcrossComments:  false
  AlignCompound:   false
  PadOperators: true
AlignConsecutiveBitFields:
  Enabled: false
  AcrossEmptyLines: false
  AcrossComments: false
  AlignCompound:   false
  PadOperators: false
AlignConsecutiveDeclarations:
  Enabled: false
  AcrossEmptyLines: false
  AcrossComments:  false
  AlignCompound:   false
  PadOperators: false
AlignConsecutiveMacros:
  Enabled: false
  AcrossEmptyLines: false
  AcrossComments:  false
  AlignCompound:   false
  PadOperators: false
AlignEscapedNewlines: Left
AlignOperands: Align
AlignTrailingComments:
  Kind: Always
  OverEmptyLines: 0
AllowAllArgumentsOnNextLine: true
AllowAllParametersOfDeclarationOnNextLine: true
AllowShortBlocksOnASingleLine: Empty
AllowShortCaseLabelsOnASingleLine: false
AllowShortEnumsOnASingleLine: true
AllowShortFunctionsOnASingleLine: Inline
AllowShortIfStatementsOnASingleLine: Never
AllowShortLambdasOnASingleLine: All
AllowShortLoopsOnASingleLine: false
AlwaysBreakAfterDefinitionReturnType: None
AlwaysBreakAfterReturnType: None
AlwaysBreakBeforeMultilineStrings: true
AlwaysBreakTemplateDeclarations: Yes
AttributeMacros:
  - __capability
BinPackArguments: true
BinPackParameters: true
BitFieldColonSpacing: Both
BraceWrapping:
  AfterCaseLabel:  false
  AfterClass:  false
  AfterControlStatement: Never
  AfterEnum: false
  AfterExternBlock: false
  AfterFunction:   false
  AfterNamespace: false
  AfterObjCDeclaration: false
  AfterStruct: false
  AfterUnion:  false
  BeforeCatch: false
  BeforeElse:  false
  BeforeLambdaBody: false
  BeforeWhile: false
  IndentBraces: false
  SplitEmptyFunction: true
  SplitEmptyRecord: true
  SplitEmptyNamespace: true
BreakAfterAttributes: Never
BreakAfterJavaFieldAnnotations: false
BreakArrays: true
BreakBeforeBinaryOperators: None
BreakBeforeConceptDeclarations: Always
BreakBeforeBraces: Attach
BreakBeforeInlineASMColon: OnlyMultiline
BreakBeforeTernaryOperators: true
BreakConstructorInitializers: BeforeColon
BreakInheritanceList: BeforeColon
BreakStringLiterals: true
ColumnLimit: 120
CommentPragmas: '^ IWYU pragma:'
CompactNamespaces: false
ConstructorInitializerIndentWidth: 4
ContinuationIndentWidth: 4
Cpp11BracedListStyle: true
DerivePointerAlignment: false
DisableFormat: false
EmptyLineAfterAccessModifier: Never
EmptyLineBeforeAccessModifier: LogicalBlock
ExperimentalAutoDetectBinPacking: false
FixNamespaceComments: true
ForEachMacros:
  - foreach
  - Q_FOREACH
  - BOOST_FOREACH
IfMacros:
  - KJ_IF_MAYBE
IncludeBlocks: Regroup
IncludeCategories:
  - Regex:   '^'
    Priority: 2
    SortPriority: 0
    CaseSensitive:   false
  - Regex:   '^<.*\.h>'
    Priority: 1
    SortPriority: 0
    CaseSensitive:   false
  - Regex:   '^<.*'
    Priority: 2
    SortPriority: 0
    CaseSensitive:   false
  - Regex: '.*'
    Priority: 3
    SortPriority: 0
    CaseSensitive: false
IncludeIsMainRegex: '([-_](test|unittest))?$'
IncludeIsMainSourceRegex: ''
IndentAccessModifiers: false
IndentCaseBlocks: false
IndentCaseLabels: true
IndentExternBlock: AfterExternBlock
IndentGotoLabels: true
IndentPPDirectives: None
IndentRequiresClause: true
IndentWidth: 4
IndentWrappedFunctionNames: false
InsertBraces: false
InsertNewlineAtEOF: false
InsertTrailingCommas: None
IntegerLiteralSeparator:
  Binary:  0
  BinaryMinDigits: 0
  Decimal: 0
  DecimalMinDigits: 0
  Hex: 0
  HexMinDigits: 0
JavaScriptQuotes: Leave
JavaScriptWrapImports: true
KeepEmptyLinesAtTheStartOfBlocks: false
LambdaBodyIndentation: Signature
LineEnding:  DeriveLF
MacroBlockBegin: ''
MacroBlockEnd: ''
MaxEmptyLinesToKeep: 1
NamespaceIndentation: None
ObjCBinPackProtocolList: Never
ObjCBlockIndentWidth: 2
ObjCBreakBeforeNestedBlockParam: true
ObjCSpaceAfterProperty: false
ObjCSpaceBeforeProtocolList: true
PackConstructorInitializers: NextLine
PenaltyBreakAssignment: 2
PenaltyBreakBeforeFirstCallParameter: 1
PenaltyBreakComment: 300
PenaltyBreakFirstLessLess: 120
PenaltyBreakOpenParenthesis: 0
PenaltyBreakString: 1000
PenaltyBreakTemplateDeclaration: 10
PenaltyExcessCharacter: 100
PenaltyIndentedWhitespace: 0
PenaltyReturnTypeOnItsOwnLine: 200
PointerAlignment: Left
PPIndentWidth:   -1
Quali

[llvm-bugs] [Bug 122702] Passing Constructor Expressions to conditional operator causes data loss while using clang compiler.

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122702
Summary: Passing Constructor Expressions to conditional operator causes data loss while using clang compiler.
Labels: clang
Assignees: (none)
Reporter: bhavayc

I use the Clang compiler provided at my company's common hub to create an AST dump of generated C/C++ code. It works fine for most of my use cases. Recently, however, I found an edge case in which Clang does not dump all the information for constructor expressions used with a conditional operator, even though the code compiles. In the expression below, the constructor expression `MyClass(2, "World")` is passed as the false branch of the C++ conditional operator. The information about the constructor passed as the true branch, `MyClass(1)`, is stored properly, **but on Linux** the constructor data for the false branch is lost. I would like to know whether we are missing something on Linux or need to add extra flags there. Everything works fine on Windows.
The expression in question:
**`MyClass cObj = ((foo > threshold) ? MyClass(1) : MyClass(2, "World"));`**

This is my source cpp file:
[doit.cpp.txt](https://github.com/user-attachments/files/18396637/doit.cpp.txt)

The command used to invoke the clang compiler:
 -fsyntax-only -fno-color-diagnostics -Xclang -ast-dump=json -x c++ -fopenmp ".cpp" -v > ",.json"

The lines highlighted in the image below show the data that is lost on Linux for the constructor expressions:
![Image](https://github.com/user-attachments/assets/05f40f1d-40b0-4380-ad08-6a7d5aa14ce6)


___
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs


[llvm-bugs] [Bug 122685] Adding getNumPredecessors() for mlir::Block

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122685
Summary: Adding getNumPredecessors() for mlir::Block
Labels: mlir
Assignees: (none)
Reporter: badumbatish

I'm browsing the current docs for `mlir::Block` and I don't see a `getNumPredecessors()` method to complement `getPredecessors()`, whereas `getNumSuccessors()` already exists to complement `getSuccessors()`.

I'd love to open a PR for this myself. `makslevental` on Discord suggested implementing it as `std::distance(preds_begin, preds_end)`, but I would like further confirmation of that implementation.
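
To make the suggestion concrete, here is a minimal sketch of counting a predecessor range with `std::distance`. It does not use MLIR: `ToyBlock`, its `preds` member, and the iterator accessors are hypothetical stand-ins chosen for illustration, so the real `mlir::Block` member names and iterator types may differ.

```cpp
#include <cstddef>
#include <forward_list>
#include <iterator>

// Hypothetical stand-in for a block whose predecessors are exposed only as a
// forward iterator range with no size() member. Not an MLIR type.
struct ToyBlock {
  std::forward_list<int> preds;  // predecessor ids, illustrative only

  auto pred_begin() const { return preds.begin(); }
  auto pred_end() const { return preds.end(); }

  // Counts predecessors by walking the range. For forward iterators this is
  // linear in the number of predecessors, not constant time.
  std::size_t getNumPredecessors() const {
    return static_cast<std::size_t>(std::distance(pred_begin(), pred_end()));
  }
};
```

The point worth checking in a real patch is the cost: if the predecessor iterator is not random access, every call walks the whole range.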



[llvm-bugs] [Bug 122695] [mlir] remove-dead-values pass removes value that should be live

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122695
Summary: [mlir] remove-dead-values pass removes value that should be live
Labels: mlir
Assignees: (none)
Reporter: azecevicTT

Here is an example IR:

```mlir
module {
  func.func public @main(%arg0: f32) -> f32 {
    %0 = call @f(%arg0) : (f32) -> f32
    return %0 : f32
  }

  func.func private @f(%arg0: f32) -> f32 {
    %0 = arith.addf %arg0, %arg0 : f32
    return %0 : f32
  }
}
```

Running `mlir-opt -remove-dead-values` on it gives an error:

```
bug.mlir:9:5: error: null operand found
    return %0 : f32
    ^
bug.mlir:9:5: note: see current operation: "func.return"(<>) : (<>) -> ()
```

Swapping the order of `@main` and `@f` produces a valid result.

Git version: https://github.com/llvm/llvm-project/commit/2914ba1c01fdc496082197abf7cd35e2af526634
System: `Ubuntu 22.04`



[llvm-bugs] [Bug 122704] Global ISel 64 bit left-shift+add results in 32 bit adds

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122704
Summary: Global ISel 64 bit left-shift+add results in 32 bit adds
Labels: llvm:globalisel, mlir:amdgpu
Assignees: (none)
Reporter: tpopp

I'm seeing instructions like

```
%i12 = sext i32 %i11 to i64
%i62 = shl i64 %i12, 1
%i64 = add i64 0, %i62
%i65 = add i64 %i64, 1
```

With Global ISel, this results in a sequence of 32-bit adds, whereas without it the result is a 64-bit shl+add. I assume the Global ISel output performs worse. Is there a reason to prefer it, or to expect the same performance?

I'm using `llc -O3 -march=amdgcn -mcpu=gfx942  -mtriple amdgcn-amd-hmcsa ./reduced.ll -global-isel={true,false} -o -` to look at this.

[reduced.gisel.txt](https://github.com/user-attachments/files/18396831/reduced.gisel.txt)
[reduced.sdisel.txt](https://github.com/user-attachments/files/18396830/reduced.sdisel.txt)
[reduced.txt](https://github.com/user-attachments/files/18396829/reduced.txt)



[llvm-bugs] [Bug 122703] Global ISel does not preload kernel arguments

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122703
Summary: Global ISel does not preload kernel arguments
Labels: llvm:globalisel, mlir:amdgpu
Assignees: (none)
Reporter: tpopp

As I understand it, while Global ISel does use AMDGPULowerKernelArguments to annotate arguments, it has no implementation like the one here: https://github.com/llvm/llvm-project/blob/3efe83291f07dcf2423065e63b826407d1ec2609/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp#L256

I'm not seeing much performance degradation in the case I'm looking at, but this seems like missing functionality.

cc @kerbowa who seems to be working on pieces of this



[llvm-bugs] [Bug 122846] `DILocation::getMergedLocation` produces invalid result while merging locations from `DILexicalBlockFile`

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122846
Summary: `DILocation::getMergedLocation` produces invalid result while merging locations from `DILexicalBlockFile`
Labels: wrong-debug, debuginfo
Assignees: (none)
Reporter: asl

Consider the following input:
```c
# 1 "1.c" 1
# 1 "1.c" 2
int foo(int a) {
  int i = 0;
  if ((a & 1) == 1) {
a -= 1;
# 1 "m.c" 1
# 40 "m.c"
i += a;
i -= 10*a;
i *= a*a;
# 6 "1.c" 2
 } else {
a += 3;
# 1 "m.c" 1
# 40 "m.c"
i += a;
i -= 10*a;
i *= a*a;
# 9 "1.c" 2
 }
  return i;
}
```

The LLVM IR looks as follows:
```llvm
...
!10 = !DIFile(filename: "1.c", directory: "/Users/asl/Projects/llvm/build/bin")
...
!27 = !DIFile(filename: "m.c", directory: "/Users/asl/Projects/llvm/build/bin")
...
!39 = distinct !DILexicalBlock(scope: !20, file: !10, line: 6, column: 9)
!40 = !DILocation(line: 40, column: 6, scope: !41)
!41 = !DILexicalBlockFile(scope: !39, file: !27, discriminator: 0)
```

Note that we have two nested scopes from different files here.

The resulting code looks like this:
```
    .loc 1 42 7 prologue_end is_stmt 1   ; 1.c:42:7
    mul w9, w8, w8
    .loc 1 41 3                          ; 1.c:41:3
    mul w8, w9, w8
Ltmp2:
    .loc 1 42 3                          ; 1.c:42:3
    add w8, w8, w8, lsl #3
    neg w0, w8
Ltmp3:
    ;DEBUG_VALUE: foo:i <- $w0
    .loc 1 10 3                          ; 1.c:10:3
    ret
```
and:
```llvm
!10 = !DIFile(filename: "1.c", directory: "/Users/asl/Projects/llvm/build/bin")
...
!19 = distinct !DILexicalBlock(scope: !9, file: !10, line: 3, column: 7)
...
!22 = !DILocation(line: 42, column: 7, scope: !19)
```
So, after merging, we end up with invalid location info: the line numbers come from one file, but the file itself is a different one, taken from the enclosing scope.

Indeed, it seems that `DILocation::getMergedLocation` completely ignores the possibility that the file may have changed, and recreates the scope with invalid location info. It already has code that handles `inlinedAt`. I think `DILexicalBlockFile` should be treated in the same way, as it is essentially "textual inclusion".



[llvm-bugs] [Bug 122852] [clang-include-cleaner] suggest to insert header where base class is defined

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122852
Summary: [clang-include-cleaner] suggest to insert header where base class is defined
Labels: false-positive, clang-include-cleaner
Assignees: (none)
Reporter: EugeneZelenko

`clang-include-cleaner` suggests including the header where the base class is defined, as well as the forward declaration, when the base/derived class headers should be enough.

`clang-tidy misc-include-cleaner` output:

```
clang-tidy -checks="-*,misc-include-cleaner" Forward.cpp 
5 warnings generated.
Forward.cpp:3:39: warning: no header providing "detail::Forward" is directly included [misc-include-cleaner]
    2 | 
    3 | void Test::do_something(const detail::Forward& forward)
      |                                       ^
Forward.cpp:5:5: warning: no header providing "Base" is directly included [misc-include-cleaner]
    5 |     Base::do_something(forward);
      |     ^
```

I tried `clang-include-cleaner` from `main` (7d8b4eb0ead277f41ff69525ed807f9f6e227f37).

Base class header file (`Base.h`):

```
#pragma once

namespace detail
{
class Forward;
}

class Base
{
public:
    virtual void do_something(const detail::Forward& forward);
};
```

Header file (`Forward.h`):

```
#pragma once

#include "Base.h"

class Test : public Base
{
public:
    void do_something(const detail::Forward& forward) override;
};
```

Source file (`Forward.cpp`):

```
#include "Forward.h"

void Test::do_something(const detail::Forward& forward)
{
    Base::do_something(forward);
}
```



[llvm-bugs] [Bug 122793] identifier `z` crashes LLVM

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122793
Summary: identifier `z` crashes LLVM
Labels: new issue
Assignees: (none)
Reporter: bottle2

Check out this error report: https://github.com/emscripten-core/emscripten/issues/23383
Maybe you have a regression? emcc reports using clang version 20.0.0git, but I'm using clang 19.1.6 and the problem does not occur there.

```
$ clang -v
clang version 19.1.6
Target: x86_64-w64-windows-gnu
Thread model: posix
InstalledDir: C:/msys64/clang64/bin
```



[llvm-bugs] [Bug 122797] clang crash initializing invalid C union with an empty struct

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122797
Summary: clang crash initializing invalid C union with an empty struct
Labels: clang
Assignees: (none)
Reporter: justincady

Tested with clang 19. Here's the reproducer:

```
typedef union {
    struct {} first;
    struct { int x; } second;
} foo;

void bar() {
    foo f = {};
}
```

[Compiler Explorer](https://godbolt.org/#g:!((g:!((g:!((h:codeEditor,i:(filename:'1',fontScale:14,fontUsePx:'0',j:1,lang:___c,selection:(endColumn:2,endLineNumber:8,positionColumn:1,positionLineNumber:1,selectionStartColumn:2,selectionStartLineNumber:8,startColumn:1,startLineNumber:1),source:'typedef+union+%7B%0Astruct+%7B%7D+first%3B%0Astruct+%7B+int+x%3B+%7D+second%3B%0A%7D+foo%3B%0A%0Avoid+bar()+%7B%0Afoo+f+%3D+%7B%7D%3B%0A%7D'),l:'5',n:'0',o:'C+source+%231',t:'0')),k:33.336,l:'4',n:'0',o:'',s:0,t:'0'),(g:!((h:compiler,i:(compiler:cclang1910,filters:(b:'0',binary:'1',binaryObject:'1',commentOnly:'0',debugCalls:'1',demangle:'0',directives:'0',execute:'1',intel:'0',libraryCode:'0',trim:'1',verboseDemangling:'0'),flagsViewOpen:'1',fontScale:14,fontUsePx:'0',j:1,lang:___c,libs:!(),options:'',overrides:!(),selection:(endColumn:1,endLineNumber:1,positionColumn:1,positionLineNumber:1,selectionStartColumn:1,selectionStartLineNumber:1,startColumn:1,startLineNumber:1),source:1),l:'5',n:'0',o:'+x86-64+clang+19.1.0+(Editor+%231)',t:'0')),k:33.336,l:'4',n:'0',o:'',s:0,t:'0'),(g:!((h:output,i:(compilerName:'x86-64+clang+19.1.0',editorid:1,fontScale:14,fontUsePx:'0',j:1,wrap:'1'),l:'5',n:'0',o:'Output+of+x86-64+clang+19.1.0+(Compiler+%231)',t:'0')),k:33.33,l:'4',n:'0',o:'',s:0,t:'0')),l:'2',n:'0',o:'',t:'0')),version:4)

Note that `first` is an empty struct, which I understand to be UB:

> The presence of a struct-declaration-list in a struct-or-union-specifier declares a new type,
within a translation unit. The struct-declaration-list is a sequence of declarations for the
members of the structure or union. If the struct-declaration-list does not contain any
named members, either directly or via an anonymous structure or anonymous union, the
behavior is undefined. The type is incomplete until immediately after the } that
terminates the list, and complete thereafter.
>
> https://www.open-std.org/JTC1/SC22/WG14/www/docs/n1570.pdf



[llvm-bugs] [Bug 122792] [DirectX] Add dx.CBuffer handling to DXILResourceAccess pass

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122792
Summary: [DirectX] Add dx.CBuffer handling to DXILResourceAccess pass
Labels: backend:DirectX
Assignees: (none)
Reporter: hekota

Codegen for constant buffers generates `llvm.dx.resource.getpointer` intrinsic calls and `getelementptr` instructions to access the constant values. The DXILResourceAccess pass needs to translate these to `llvm.dx.resource.load.cbuffer` intrinsic calls.



[llvm-bugs] [Bug 122789] clang+llvm-19.1.0-x86_64-pc-windows-msvc.tar.xz not signed, no .sig file

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122789
Summary: clang+llvm-19.1.0-x86_64-pc-windows-msvc.tar.xz not signed, no .sig file
Labels: clang
Assignees: (none)
Reporter: RobertBaruch

The Windows binary for release 19.1.0 does not have a .sig file, which makes it a potential attack vector. Please add a .sig file!



[llvm-bugs] [Bug 122791] CodeGen for `ConstantBuffer` resource class

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122791
Summary: CodeGen for `ConstantBuffer` resource class
Labels: HLSL
Assignees: (none)
Reporter: hekota





[llvm-bugs] [Bug 122805] llvm-readelf: Invalid JSON on PE (COFF) object

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122805
Summary: llvm-readelf: Invalid JSON on PE (COFF) object
Labels: new issue
Assignees: (none)
Reporter: sliedes

Similar to, but distinct from, #0

This is on LLVM 19.1.5 on NixOS.

Running `llvm-readelf --elf-output-style=JSON` on a PE object (in this case a UEFI executable from a BIOS dump) produces invalid JSON. A small PE object is attached to demonstrate.

```
$ file 137_PspDxe.efi
137_PspDxe.efi: PE32+ executable for EFI (boot service driver), x86-64, 4 sections

$ llvm-readelf --elf-output-style=JSON 137_PspDxe.efi
[
"File":"137_PspDxe.efi","Format":"COFF-x86-64","Arch":"x86_64","AddressSize":"64bit"]
$ llvm-readelf --version
LLVM (http://llvm.org/):
  LLVM version 19.1.5
 Optimized build.
```

The output is not valid JSON because key/value pairs can only appear inside an object (`{}`), and here they appear directly in a list.
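
For contrast, here is a minimal sketch of one well-formed shape: the same key/value pairs wrapped in an object inside the top-level array. This is only an illustration of the JSON grammar point, not a claim about what `llvm-readelf` is meant to print for COFF input. The function name and the `Field` alias are hypothetical, and the sketch assumes keys and values need no JSON escaping.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Field = std::pair<std::string, std::string>;

// Serializes the fields as one JSON object inside a top-level array.
// Assumption: keys and values contain no characters that need escaping.
std::string toJsonArrayOfObject(const std::vector<Field>& fields) {
  std::string out = "[{";
  for (std::size_t i = 0; i < fields.size(); ++i) {
    if (i != 0) out += ",";
    out += "\"" + fields[i].first + "\":\"" + fields[i].second + "\"";
  }
  out += "}]";
  return out;
}
```

The only difference from the reported output is the pair of braces around the members, which is what makes the array element a JSON object.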

[137_PspDxe.efi.zip](https://github.com/user-attachments/files/18402591/137_PspDxe.efi.zip)



[llvm-bugs] [Bug 122840] Clang incorrectly adds tailcalls after `setjmp(...)` with `-fno-builtin`

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122840
Summary: Clang incorrectly adds tailcalls after `setjmp(...)` with `-fno-builtin`
Labels: clang:codegen
Assignees: (none)
Reporter: alanzhao1

Originally reported by Chromium at https://g-issues.chromium.org/issues/380508212

Example:

```c
#include <setjmp.h>

struct JpegCommon {
  jmp_buf jmpbuf;
  int cinfo;
};

int jpeg_start_decompress(int *);

int jpeg_common_start_decompress(struct JpegCommon* jpeg_common) {
  if (setjmp(jpeg_common->jmpbuf) == -1) {
    return 0;
  }
  return jpeg_start_decompress(&jpeg_common->cinfo);
}
```

If this file is compiled with `-fno-builtin`, then Clang will optimize the call to `jpeg_start_decompress(...)` as a tailcall:

```asm
jpeg_common_start_decompress:
        push    rbx
        mov     rbx, rdi
        call    _setjmp@PLT
        cmp     eax, -1
        je      .LBB0_1
        add     rbx, 200
        mov     rdi, rbx
        pop     rbx
        jmp     jpeg_start_decompress@PLT
.LBB0_1:
        xor     eax, eax
        pop     rbx
        ret
```

This is incorrect. Control flow may resume at `setjmp(...)` from another function, so the contents of the execution stack must be preserved. The tail call optimization shown above is incorrect because `jpeg_start_decompress(...)` may clobber the stack that is needed when we later re-enter `jpeg_common_start_decompress(...)` at `setjmp(...)` via a `longjmp(...)` call.

This incorrect codegen only occurs with `-fno-builtin` - if we don't pass `-fno-builtin`, Clang correctly emits a call instruction (thereby preserving the stack).
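
The control flow described above can be sketched in a few lines. This is a hedged illustration rather than the Chromium code: `g_env`, `callee_that_jumps`, and `caller` are hypothetical names, and the example is written in C++ with `<csetjmp>` instead of the C used in the report.

```cpp
#include <csetjmp>

namespace {
std::jmp_buf g_env;  // illustrative global jump buffer

// Hypothetical callee that unwinds back to the setjmp site in caller().
void callee_that_jumps() { std::longjmp(g_env, 1); }
}  // namespace

// Returns 1 when control comes back through longjmp, 0 otherwise.
int caller() {
  if (setjmp(g_env) != 0) {
    // Second return from setjmp: caller()'s frame must still be intact here.
    return 1;
  }
  // This has to be a genuine call. caller()'s frame must stay alive while the
  // callee runs, because the callee may longjmp back into it.
  callee_that_jumps();
  return 0;
}
```

If the call in the last step were emitted as a jump that reuses `caller()`'s frame, the `longjmp` would land in a frame the callee has already taken over, which is the hazard the report describes.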

godbolt: https://godbolt.org/z/1anGj6e3s



[llvm-bugs] [Bug 122850] [clang-include-cleaner] suggest to insert header in source file when headers with base/derived class would be enough

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122850
Summary: [clang-include-cleaner] suggest to insert header in source file when headers with base/derived class would be enough
Labels: false-positive, clang-include-cleaner
Assignees: (none)
Reporter: EugeneZelenko

`clang-include-cleaner` suggests including the header with the forward declaration when the headers with the base/derived classes should be enough.

`clang-tidy misc-include-cleaner` output:

```
clang-tidy -checks="-*,misc-include-cleaner" Forward.cpp 
4 warnings generated.
Forward.cpp:3:39: warning: no header providing "detail::Forward" is directly included [misc-include-cleaner]
    2 | 
    3 | void Test::do_something(const detail::Forward& /*forward*/)
      |                                       ^
```

I tried `clang-include-cleaner` from `main` (7d8b4eb0ead277f41ff69525ed807f9f6e227f37).

Base class header file (`Base.h`):

```
#pragma once

namespace detail
{
class Forward;
}

class Base
{
public:
    virtual void do_something(const detail::Forward& forward);
};
```

Header file (`Forward.h`):

```
#pragma once

#include "Base.h"

class Test : public Base
{
public:
    void do_something(const detail::Forward& forward) override;
};
```

Source file (`Forward.cpp`):

```
#include "Forward.h"

void Test::do_something(const detail::Forward& /*forward*/)
{
}
```



[llvm-bugs] [Bug 122847] [clang-include-cleaner] suggest to insert header in source file when header would be enough

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122847
Summary: [clang-include-cleaner] suggest to insert header in source file when header would be enough
Labels: false-positive, clang-include-cleaner
Assignees: (none)
Reporter: EugeneZelenko

`clang-include-cleaner` suggests including `cstddef` because of the constructor (or possibly method) implementation, when its inclusion in the header should be enough.

`clang-tidy misc-include-cleaner` output:

```
clang-tidy -checks="-*,misc-include-cleaner" Method.cpp 
1 warning generated.
Method.cpp:3:18: warning: no header providing "size_t" is directly included [misc-include-cleaner]
    2 | 
    3 | Test::Test(const size_t size)
      |                  ^
```

I tried `clang-include-cleaner` from `main` (7d8b4eb0ead277f41ff69525ed807f9f6e227f37).

Header file (`Method.h`):

```
#pragma once

#include <cstddef>

class Test
{
public:
    Test(const size_t size);

protected:
    size_t size_;
};
```

Source file (`Method.cpp`):

```
#include "Method.h"

Test::Test(const size_t size)
:   size_(size)
{
}
```



[llvm-bugs] [Bug 122851] Convert double to ConstantFP

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122851
Summary: Convert double to ConstantFP
Labels: new issue
Assignees: (none)
Reporter: tianboh

I want to generate a ConstantFP from a C++ double. For various reasons, the double is obtained at runtime, and it may not fit the "simple constant value" description:
> /// This returns a ConstantFP, or a vector containing a splat of a ConstantFP,
  /// for the specified value in the specified type. This should only be used
  /// for simple constant values like 2.0/1.0 etc, that are known-valid both as
  /// host double and as the target format.
  static Constant *get(Type *Ty, double V);

For example, I have a double 233.33; at runtime its IEEE 754 representation is 0x406D2A8F6000, and if I print it, I get 233.33000183105469. Check the [IEEE 754 calculator](https://weitz.de/ieee/) for fun. I know floating-point values are not continuous, so some values have no exact representation in the IEEE 754 standard and are encoded as the nearest valid double instead.

I want to keep some of the fractional part, so rounding is not enough. But by sacrificing some precision, can I convert my double to a `ConstantFP`?
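
As a side note on the numbers quoted above, the following sketch shows one way such a value can arise: rounding 233.33 to single precision and widening it back to double. It assumes the host uses IEEE 754 binary32 and binary64 with round-to-nearest; `bitsOf` and `roundTripThroughFloat` are hypothetical helper names, and nothing here calls into LLVM.

```cpp
#include <cstdint>
#include <cstring>

// Reinterprets the bits of a double as a 64-bit integer (no value conversion).
std::uint64_t bitsOf(double value) {
  std::uint64_t bits = 0;
  static_assert(sizeof(bits) == sizeof(value), "expects a 64-bit double");
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}

// Rounds to single precision and widens back. The widening step is exact, so
// the result is the float value stored in a double.
double roundTripThroughFloat(double value) {
  return static_cast<double>(static_cast<float>(value));
}
```

Under those assumptions, `bitsOf(roundTripThroughFloat(233.33))` is `0x406D2A8F60000000`, which prints as 233.33000183105469 and differs from the bit pattern of the double literal `233.33` itself. That suggests the value in the report passed through a `float` at some point; whether that matters for `ConstantFP` depends on the target type being requested.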



[llvm-bugs] [Bug 122699] [llvm] CMake installscripts for OCAML-bindings don't use CMAKE_INSTALL_PREFIX

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue: 122699
Summary: [llvm] CMake installscripts for OCAML-bindings don't use CMAKE_INSTALL_PREFIX
Labels: new issue
Assignees: (none)
Reporter: jdumke

Hi there.
The cmake_install.cmake scripts for the OCaml bindings only set CMAKE_INSTALL_PREFIX but never use it, and always try to install to "/usr/", which isn't suitable for user-only setups.

For example:
[cmake_install.cmake.txt](https://github.com/user-attachments/files/18396399/cmake_install.cmake.txt)

The problem occurs on line 54, where "/usr/" is hard-coded instead of using "${CMAKE_INSTALL_PREFIX}".

Greets. 



[llvm-bugs] [Bug 122687] [Clang] Add support for -fhardened

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122687




Summary

[Clang] Add support for -fhardened




  Labels
  
clang:driver
  



  Assignees
  
  



  Reporter
  
  nikic
  




GCC supports an `-fhardened` flag that enables a number of hardening options with one flag. Quoting from https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html.

> Enable a set of flags for C and C++ that improve the security of the generated code without affecting its ABI. The precise flags enabled may change between major releases of GCC, but are currently:
> 
> -D_FORTIFY_SOURCE=3
> -D_GLIBCXX_ASSERTIONS
> -ftrivial-auto-var-init=zero
> -fPIE  -pie  -Wl,-z,relro,-z,now
> -fstack-protector-strong
> -fstack-clash-protection
> -fcf-protection=full (x86 GNU/Linux only)
> 
> The list of options enabled by -fhardened can be generated using the --help=hardened option.
> 
> When the system glibc is older than 2.35, -D_FORTIFY_SOURCE=2 is used instead.
> 
> This option is intended to be used in production builds, not merely in debug builds.
> 
> Currently, -fhardened is only supported on GNU/Linux targets.
> 
> -fhardened only enables a particular option if it wasn’t already specified anywhere on the command line. For instance, -fhardened -fstack-protector will only enable -fstack-protector, but not -fstack-protector-strong.

It would be nice if Clang also accepted the flag.




[llvm-bugs] [Bug 122701] clang-17 can't find cassert on Ubuntu 22.04 with libstdc++-11-dev installed

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122701




Summary

clang-17 can't find cassert on Ubuntu 22.04 with libstdc++-11-dev installed




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  ggladilov
  




OS: Ubuntu 22.04

If I build my C++17 project with clang-tidy-17 and libstdc++-11-dev installed, I get the following error:

```
error: 'cassert' file not found [clang-diagnostic-error]
```

`find /usr/include -type f -name cassert` produces:

```
/usr/include/boost/compatibility/cpp_c_headers/cassert
/usr/include/c++/11/cassert
```

I found a similar [GitHub issue](https://github.com/llvm/llvm-project/issues/59738) filed earlier that describes the same problem when using clang directly, so I conclude that the problem lies in clang rather than clang-tidy.

Note: the build works perfectly fine with GCC.

The same solution as in the issue above works for me: installing libstdc++-12-dev. Even though libstdc++ is a GNU/GCC project and outside LLVM's control, I don't understand why clang can find a system header file in one version of libstdc++ and not in another, even though the file is present in both.

Is there a requirement to use specific version(s) of libstdc++ with a particular release of clang?






[llvm-bugs] [Bug 122739] Request Commit Access For basioli

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122739




Summary

Request Commit Access For basioli




  Labels
  
infra:commit-access-request
  



  Assignees
  
  



  Reporter
  
  basioli-k
  




### Why Are you requesting commit access ?
I am on the MLIR integration team at Google.




[llvm-bugs] [Bug 122745] [libc][build] gcc errors related to FP16

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122745




Summary

[libc][build] gcc errors related to FP16




  Labels
  
libc
  



  Assignees
  
  



  Reporter
  
  Sh0g0-1758
  




After patching #122500, we ran into this new issue: 

```
$ gcc --version | head -n1 
gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
$ cmake ../runtimes -G Ninja -DLLVM_ENABLE_RUNTIMES="libc" -DCMAKE_BUILD_TYPE=Release -DLLVM_LIBC_FULL_BUILD=ON
$ ninja libc-unit-tests
[482/6748] Building CXX object libc/src/math/generic/CMakeFiles/libc.src.math.generic.sinf16.__NO_FMA_OPT.__NO_ROUND_OPT.__internal__.dir/sinf16.cpp.o
...
libc/src/math/generic/sinf16.cpp:58:41:   required from here
libc/src/__support/FPUtil/except_value_utils.h:84:20: error: conversion from ‘int’ to ‘__llvm_libc_20_0_0_git::fputil::ExceptValues<_Float16, 4>::StorageType’ {aka ‘short unsigned int’} may change value [-Werror=conversion]
   84 |   out_bits += sign ? values[i].rnd_downward_offset
  |   ~^~~
   85 |: values[i].rnd_upward_offset;
  |~
libc/src/__support/FPUtil/except_value_utils.h:88:20: error: conversion from ‘int’ to ‘__llvm_libc_20_0_0_git::fputil::ExceptValues<_Float16, 4>::StorageType’ {aka ‘short unsigned int’} may change value [-Werror=conversion]
   88 |   out_bits += sign ? values[i].rnd_upward_offset
  |   ~^
   89 |: values[i].rnd_downward_offset;
  |~~~
cc1plus: all warnings being treated as errors
...
```
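
One plausible reading of the diagnostic, sketched in standalone C++: arithmetic on 16-bit operands is carried out in `int` after integer promotion, so storing the sum back into 16-bit storage is a narrowing conversion that `-Wconversion` may flag. The `Offsets` struct and `apply_offset` function are hypothetical stand-ins, not the libc code, and whether a given compiler version warns on the original form is not something this sketch establishes. It only shows the narrowing being made explicit.

```cpp
#include <cstdint>

// Hypothetical stand-in for a table entry holding 16-bit rounding offsets.
struct Offsets {
  std::uint16_t rnd_upward_offset;
  std::uint16_t rnd_downward_offset;
};

// Adds one of the two offsets to 16-bit storage. The operands are promoted to
// int for the addition, so the sum is cast back explicitly instead of relying
// on an implicit int -> uint16_t conversion.
inline std::uint16_t apply_offset(std::uint16_t out_bits, bool sign,
                                  const Offsets &v) {
  const std::uint16_t offset =
      sign ? v.rnd_downward_offset : v.rnd_upward_offset;
  return static_cast<std::uint16_t>(out_bits + offset);
}
```

The explicit cast keeps the wrap-around behavior of 16-bit unsigned arithmetic while stating the truncation in the source.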

cc: @nickdesaulniers @lntue 




[llvm-bugs] [Bug 122760] [NVPTX] atomicrmw on <4 x float> relies on __atomic_compare_exchange_16

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122760




Summary

[NVPTX] atomicrmw on <4 x float> relies on __atomic_compare_exchange_16




  Labels
  
backend:NVPTX
  



  Assignees
  
Artem-B
  



  Reporter
  
  Artem-B
  




NVPTX currently lowers `atomicrmw` on `<4 x float>` as a call to `__atomic_compare_exchange_16`, which does not exist on the GPU:
https://godbolt.org/z/ovf4cqKK5

Newer GPUs do have support for vectorized atomic ops on some data types, but on older GPUs these operations must be lowered without relying on a runtime library.






[llvm-bugs] [Bug 122813] Builds in release branch failing.

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122813




Summary

Builds in release branch failing.




  Labels
  
new issue
  



  Assignees
  
boomanaiden154
  



  Reporter
  
  boomanaiden154
  




The release branch needs the same fixes as #11.

We should figure out a better strategy to version the CI container in the future so that this doesn't happen. Just versioning based on the LLVM version and pinning the release branches to the release version container probably makes a decent amount of sense, but we will have to see.




[llvm-bugs] [Bug 122815] [clang] Crash with -std=gnu++20 and lambda.

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122815




Summary

[clang] Crash with -std=gnu++20 and lambda.




  Labels
  
clang
  



  Assignees
  
  



  Reporter
  
  ayermolo
  




I saw a crash in our internal codebase with a clang that was built with assertions enabled.
I was able to reduce it to a small example:

```
enum class Signal2 {
 ENUM1,
 ENUM2,
};
void func() {
  const auto funcL =
 [](const Signal2(&srcSignals)[]) {};
   funcL({Signal2::ENUM1, Signal2::ENUM2});
}
```

Full stack trace:

```
clang++: /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/CodeGen/CGCall.cpp:4536: void clang::CodeGen::CodeGenFunction::EmitCallArgs(CallArgList &, PrototypeWrapper, llvm::iterator_range, AbstractCallee, unsigned int, EvaluationOrder): Assertion `(isGenericMethod || Ty->isVariablyModifiedType() || Ty.getNonReferenceType()->isObjCRetainableType() || getContext() .getCanonicalType(Ty.getNonReferenceType()) .getTypePtr() == getContext().getCanonicalType((*Arg)->getType()).getTypePtr()) && "type mismatch in call argument!"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /home/ayermolo/local/llvm-build-upstream-release/bin/clang++ main.cpp -S -emit-llvm -o main.o -std=gnu++20
1.	 parser at end of file
2.	main.cpp:5:6: LLVM IR generation of declaration 'func'
3.	main.cpp:5:6: Generating code for declaration 'func'
 #0 0x5557c7baa498 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x5557c7ba7fde llvm::sys::RunSignalHandlers() /home/ayermolo/local/upstream-llvm/llvm-project/llvm/lib/Support/Signals.cpp:106:18
 #2 0x5557c7b193f6 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:73:5
 #3 0x5557c7b193f6 CrashRecoverySignalHandler(int) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:390:51
 #4 0x7f2098a3e730 __restore_rt (/usr/lib64/libc.so.6+0x3e730)
 #5 0x7f2098a8bacc __pthread_kill_implementation (/usr/lib64/libc.so.6+0x8bacc)
 #6 0x7f2098a3e686 gsignal (/usr/lib64/libc.so.6+0x3e686)
 #7 0x7f2098a28833 abort (/usr/lib64/libc.so.6+0x28833)
 #8 0x7f2098a2875b _nl_load_domain.cold (/usr/lib64/libc.so.6+0x2875b)
 #9 0x7f2098a373c6 (/usr/lib64/libc.so.6+0x373c6)
#10 0x5557c7e64f96 clang::QualType const* std::__find_if)::$_0>>(clang::QualType const*, clang::QualType const*, __gnu_cxx::__ops::_Iter_pred)::$_0>, std::random_access_iterator_tag) /usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/bits/stl_algobase.h:2089:22
#11 0x5557c7e64f96 clang::QualType const* std::__find_if)::$_0>>(clang::QualType const*, clang::QualType const*, __gnu_cxx::__ops::_Iter_pred)::$_0>) /usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/bits/stl_algobase.h:2117:14
#12 0x5557c7e64f96 clang::QualType const* std::find_if)::$_0>(clang::QualType const*, clang::QualType const*, hasInAllocaArgs(clang::CodeGen::CodeGenModule&, clang::CallingConv, llvm::ArrayRef)::$_0) /usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/bits/stl_algo.h:3910:14
#13 0x5557c7e64f96 bool std::none_of)::$_0>(clang::QualType const*, clang::QualType const*, hasInAllocaArgs(clang::CodeGen::CodeGenModule&, clang::CallingConv, llvm::ArrayRef)::$_0) /usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/bits/stl_algo.h:471:24
#14 0x5557c7e64f96 bool std::any_of)::$_0>(clang::QualType const*, clang::QualType const*, hasInAllocaArgs(clang::CodeGen::CodeGenModule&, clang::CallingConv, llvm::ArrayRef)::$_0) /usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/bits/stl_algo.h:490:15
#15 0x5557c7e64f96 bool llvm::any_of&, hasInAllocaArgs(clang::CodeGen::CodeGenModule&, clang::CallingConv, llvm::ArrayRef)::$_0>(llvm::ArrayRef&, hasInAllocaArgs(clang::CodeGen::CodeGenModule&, clang::CallingConv, llvm::ArrayRef)::$_0) /home/ayermolo/local/upstream-llvm/llvm-project/llvm/include/llvm/ADT/STLExtras.h:1747:10
#16 0x5557c7e64f96 hasInAllocaArgs(clang::CodeGen::CodeGenModule&, clang::CallingConv, llvm::ArrayRef) /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/CodeGen/CGCall.cpp:4466:10
#17 0x5557c7e64f96 clang::CodeGen::CodeGenFunction::EmitCallArgs(clang::CodeGen::CallArgList&, clang::CodeGen::CodeGenFunction::PrototypeWrapper, llvm::iterator_range>, clang::CodeGen::CodeGenFunction::AbstractCallee, unsigned int, clang::CodeGen::CodeGenFunction::EvaluationOrder) /home/ayermolo/local/upstream-llvm/llvm-project/clang/lib/CodeGen/CGCall.cpp:4585:7
#18 0x5557c8149ce8 commonEmitCXXMemberOrOperatorCall(clang::CodeGen::
```

[llvm-bugs] [Bug 122819] [Clang] Consider adding `__builtin_rotate{left, right}g`

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122819




Summary

[Clang] Consider adding `__builtin_rotate{left,right}g`




  Labels
  
enhancement,
clang
  



  Assignees
  
  



  Reporter
  
  philnik777
  




I don't think there is much of a reason not to add them, and doing so would avoid a long list of `if constexpr` branches that only exist so that clang goes through the same codegen path in the end. A generic version would also add support for other integral types like `_BitInt` and `__int128`.

P.S. I don't really understand why this wasn't done in the first place. Now we have a bunch of `g` versions of the builtins as well as the numbered ones, without any benefit I can see.
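
For illustration, a sketch of the kind of `if constexpr` dispatch a library currently needs. It assumes Clang's existing fixed-width builtins `__builtin_rotateleft8`, `__builtin_rotateleft16`, `__builtin_rotateleft32`, and `__builtin_rotateleft64`, and falls back to plain shifts when they are unavailable. The names `rotl` and `rotl_fallback` are hypothetical, and this is not how any particular standard library implements rotation.

```cpp
#include <cstdint>
#include <type_traits>

#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Portable fallback for any unsigned integer type.
template <class T>
T rotl_fallback(T x, unsigned s) {
  static_assert(std::is_unsigned<T>::value, "unsigned types only");
  constexpr unsigned bits = sizeof(T) * 8;
  s %= bits;
  if (s == 0)
    return x;
  return static_cast<T>((x << s) | (x >> (bits - s)));
}

// Dispatches on the operand width, one branch per numbered builtin.
template <class T>
T rotl(T x, unsigned s) {
  static_assert(std::is_unsigned<T>::value, "unsigned types only");
#if __has_builtin(__builtin_rotateleft8) && __has_builtin(__builtin_rotateleft16) && \
    __has_builtin(__builtin_rotateleft32) && __has_builtin(__builtin_rotateleft64)
  if constexpr (sizeof(T) == 1)
    return __builtin_rotateleft8(x, static_cast<unsigned char>(s));
  else if constexpr (sizeof(T) == 2)
    return __builtin_rotateleft16(x, static_cast<unsigned short>(s));
  else if constexpr (sizeof(T) == 4)
    return __builtin_rotateleft32(x, s);
  else if constexpr (sizeof(T) == 8)
    return __builtin_rotateleft64(x, s);
  else
    return rotl_fallback(x, s); // wider types have no numbered builtin
#else
  return rotl_fallback(x, s);
#endif
}
```

A single generic builtin would collapse the width dispatch into one call and would also cover the wider types that land in the fallback branch here.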





[llvm-bugs] [Bug 122823] [Flang] Compilation abnormally terminates when using concat-op as an argument to adjustr intrinsic function in where construct

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122823




Summary

[Flang] Compilation abnormally terminates when using concat-op as an argument to adjustr intrinsic function in where construct




  Labels
  
flang:ir
  



  Assignees
  
  



  Reporter
  
  ohno-fj
  




```
Version of flang : 20.0.0(f4230b4332262dffb0bd3b7a2f8d6deb2e96488e)/AArch64
```

When `concat-op (//)` is used as an argument to the `adjustr` intrinsic function inside a `where` construct, compilation terminates abnormally (Lowering to LLVM IR failed).  
Compilation succeeds in the following cases:
- The `adjustr` intrinsic function is used outside the `where` construct.  
  This case is `sngg4151_21.F`.
- `concat-op` is not used in the argument of the `adjustr` intrinsic function.  
  This case is `sngg4151_22.F`.

The test programs and the Flang, gfortran, and ifx compilation/execution results follow.

sngg4151_20.F:
```fortran
  program main
  logical ,dimension(1):: mask=.true.
  character(len=2),dimension(1):: d1="a "
  character(len=4),dimension(1):: d4
where (mask)
 d4=adjustr(d1//d1)
end where
write(6,*) "d4 =", d4
 end
```

```
$ flang sngg4151_20.F
error: loc("/work/home/ohno/CT/test/fort/tp/reproducerJ/MCS/wsf/sngg4151_20.F":6:11): 'fir.convert' op invalid type conversion'!fir.ref>' / '!fir.boxchar<1>'
error: Lowering to LLVM IR failed
error: loc("/work/home/ohno/CT/test/fort/tp/reproducerJ/MCS/wsf/sngg4151_20.F":1:7): cannot be converted to LLVM IR: missing `LLVMTranslationDialectInterface` registration for dialect for op: func.func
error: failed to create the LLVM module
$
```

```
$ gfortran sngg4151_20.F; ./a.out
 d4 = a a
$
```

```
$ ifx sngg4151_20.F; ./a.out
 d4 = a a
$
```

sngg4151_21.F:
```fortran
  program main
  logical ,dimension(1):: mask=.true.
  character(len=2),dimension(1):: d1="a "
  character(len=4),dimension(1):: d4
!where (mask)
 d4=adjustr(d1//d1)
!end where
write(6,*) "d4 =", d4
  end
```

```
$ flang sngg4151_21.F; ./a.out
 d4 = a a
$
```

sngg4151_22.F:
```fortran
  program main
  logical ,dimension(1):: mask=.true.
  character(len=2),dimension(1):: d1="a "
  character(len=4),dimension(1):: d4
where (mask)
 d4=adjustr("a a ")
end where
write(6,*) "d4 =", d4
  end
```

```
$ flang sngg4151_22.F; ./a.out
 d4 = a a
$
```





[llvm-bugs] [Bug 122822] [Flang] Compilation abnormally terminates when dummy procedure name is the same as common-block-name

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122822




Summary

[Flang] Compilation abnormally terminates when dummy procedure name is the same as common-block-name




  Labels
  
flang:ir
  



  Assignees
  
  



  Reporter
  
  ohno-fj
  




```
Version of flang : 20.0.0(f4230b4332262dffb0bd3b7a2f8d6deb2e96488e)/AArch64
```

When a `dummy procedure name` is the same as a `common-block-name`, compilation terminates abnormally (Lowering to LLVM IR failed).  
It should be fine for a dummy procedure and a common block to have the same name.

The test program and the Flang, gfortran, and ifx compilation results follow.

terrcom3_2.f:
```fortran
  subroutine ss5()
  common /com_dummy1/ x
  interface
 subroutine com_dummy1()
 end subroutine
  end interface
  print *,fun_sub(com_dummy1)
  end
```

```
$ flang terrcom3_2.f -c
error: loc("/work/home/ohno/CT/test/fort/tp/reproducerJ/MCS/fe2ferr/terrcom3_2.f":4:21): redefinition of symbol named 'com_dummy1_'
error: Lowering to LLVM IR failed
error: loc("/work/home/ohno/CT/test/fort/tp/reproducerJ/MCS/fe2ferr/terrcom3_2.f":2:15): LLVM Translation failed for operation: fir.global
error: failed to create the LLVM module
$
```

```
$ gfortran terrcom3_2.f -c
terrcom3_2.f:2:25:

2 |   common /com_dummy1/ x
  | 1
Error: COMMON block ‘com_dummy1’ at (1) cannot have the EXTERNAL attribute
$
```

```
$ ifx terrcom3_2.f -c; ls -al *.o
-rw-r--r--. 1 32800043 32800043 1136 Jan  8 02:00 terrcom3_2.o
$
```





[llvm-bugs] [Bug 122826] libcxx presubmit should have bootstrapping-build variant for mac and windows, not just linux

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122826




Summary

libcxx presubmit should have bootstrapping-build variant for mac and windows, not just linux




  Labels
  
libc++
  



  Assignees
  
  



  Reporter
  
  zeroomega
  




Most of the current libcxx presubmit builders use the host clang, which is the current release version. There is a special bootstrapping-build presubmit builder that builds ToT Clang and uses it to build and test libcxx. However, this builder only covers Linux, and it has missed the following breaking changes in the past:

* https://github.com/llvm/llvm-project/commit/10c6d6349e51bb245b9deec4aafca9885971135b and https://github.com/llvm/llvm-project/commit/987087df90026605fc8d03ebda5a1cd31b71e609 (which broke `llvm-libc++-static-clangcl.cfg.in :: libcxx/fuzzing/random.pass.cpp` on Windows)
* PR #90394 which broke `chrono.compile.pass.cpp` on Windows
* PR #76246 which broke `libcxx/selftest/modules/std-and-std.compat-module.sh.cpp` on Windows

Supporting Mac and Windows in the bootstrapping-build builder would catch these failures and reduce downstream disruption.







[llvm-bugs] [Bug 122757] Request Commit Access For ZenithalHourlyRate

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122757




Summary

Request Commit Access For ZenithalHourlyRate




  Labels
  
  



  Assignees
  
  



  Reporter
  
  ZenithalHourlyRate
  




### Why Are you requesting commit access ?

As part of https://discourse.llvm.org/t/rfc-introduce-opasm-type-attr-interface-for-pretty-print-in-asmprinter/83792, I need commit access to make it easier to merge documentation changes such as #121698.




[llvm-bugs] [Bug 122758] Thread-safety analysis produces incorrect results for cleanup functions

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122758




Summary

Thread-safety analysis produces incorrect results for cleanup functions




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  bvanassche
  




Building the following code with `-Wthread-safety` produces compiler warnings while no warnings should be reported:
```
#include <stdio.h>

struct __attribute__((capability("mutex"))) mutex {
};

static void mutex_lock(struct mutex *m)
 __attribute__((exclusive_lock_function(*m)))
 __attribute__((no_thread_safety_analysis))
{
 puts(__func__);
}

static void mutex_unlock(struct mutex *m)
 __attribute__((unlock_function(*m)))
 __attribute__((no_thread_safety_analysis))
{
 puts(__func__);
}

static void mutex_cleanup(struct mutex **m)
 __attribute__((unlock_function(**m)))
{
puts(__func__);
 mutex_unlock(*m);
}

int main(void)
{
struct mutex m = {};
 mutex_lock(&m);
#if 1
struct mutex *m_ptr __attribute__((cleanup(mutex_cleanup))) = &m;
#else
 mutex_unlock(&m);
#endif
return 0;
}
```
The following compiler warnings are reported:
```
annotated-cleanup.c:32:19: warning: releasing mutex 'm_ptr' that was not held [-Wthread-safety-analysis]
   32 | struct mutex *m_ptr __attribute__((cleanup(mutex_cleanup))) = &m;
  | ^
annotated-cleanup.c:37:1: warning: mutex 'm' is still held at the end of function [-Wthread-safety-analysis]
   37 | }
  | ^
annotated-cleanup.c:30:5: note: mutex acquired here
   30 | mutex_lock(&m);
  | ^
2 warnings generated.
```
The program output shows that both the lock and unlock functions are called:
```
mutex_lock
mutex_cleanup
mutex_unlock
```




[llvm-bugs] [Bug 122764] [Flang] Incorrect diagnostic on renaming generic operator

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122764




Summary

[Flang] Incorrect diagnostic on renaming generic operator




  Labels
  
flang:frontend
  



  Assignees
  
  



  Reporter
  
  DanielCChen
  




Consider the following code:
```
module opmod

  type modreal
real :: x

 contains
  procedure :: plus
  procedure :: plus2
  generic, private :: operator(.add.) => plus

  end type

  interface operator(.adda.)
module procedure plus2
  end interface

  contains
 function plus(a,b)
  type(modreal) :: plus
  class(modreal), intent(in) :: a,b
  plus%x = a%x+b%x*2.0
end function plus

 function plus2(a,b)
  type(modreal) :: plus2
  class(modreal), intent(in) :: a,b
  plus2%x = a%x+b%x
end function plus2


end module


program main
use opmod , operator(.add.) => operator(.adda.)
end program
```

Flang currently issues the following error:
```
t.f:35:22: error: Generic 'OPERATOR(.add.)' may not have specific procedures 'plus2' and 'modreal%plus' as their interfaces are not distinguishable
  use opmod , operator(.add.) => operator(.adda.)
 ^
./a6.f:35:5: 'plus2' is USE-associated from module 'opmod'
  use opmod , operator(.add.) => operator(.adda.)
 ^
./a6.f:35:5: 'plus' is USE-associated from module 'opmod'
  use opmod , operator(.add.) => operator(.adda.)
  ^
```

Module procedure `plus` is not part of the accessible generic `operator(.add.)` in the scope of main, so the error message seems wrong.
ifort, gfortran, and XLF all compile the code successfully.





[llvm-bugs] [Bug 122827] free(): invalid pointer error

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122827




Summary

free(): invalid pointer error




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  zaneenders
  




Hi,

I have a [SwiftNIO](https://github.com/apple/swift-nio) project I have been working on and I tripped over this bug. I am not sure how to reproduce it, unfortunately. Let me know if there is anything I can do to help.

Thanks,
Zane






[llvm-bugs] [Bug 122828] [BOLT] Crash and FDE Mismatches when trying to BOLT glibc 2.40 on x86_64

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122828




Summary

[BOLT] Crash and FDE Mismatches when trying to BOLT glibc 2.40 on x86_64




  Labels
  
BOLT
  



  Assignees
  
  



  Reporter
  
  ms178
  




With 3def49cb64ec1298290724081bd37dbdeb2ea5f8, I am seeing the following crash and FDE mismatches when trying to BOLT Glibc 2.40 (7648e3c8e80b3f1b3b43506b2fbe370e4824ab97). For further details, see the trace at the end.

I am using this PKGBUILD on CachyOS:

[PKGBUILD.bolt.txt](https://github.com/user-attachments/files/18403403/PKGBUILD.bolt.txt)

Even if I reduce the BOLT options further down to just `dyna-stats`, the crash remains.

This is the output of `objdump -drwC /tmp/makepkg/glibc/src/tmp-root/usr/lib/libc.so.6` 

[objdump-glibc.txt](https://github.com/user-attachments/files/18403435/objdump-glibc.txt)

This is the output of `readelf -w /tmp/makepkg/glibc/src/tmp-root/usr/lib/libc.so.6`

[readelf-glibc.txt](https://github.com/user-attachments/files/18403446/readelf-glibc.txt)

`Compiler: gcc 14.2.1 20240910`

CPU: Intel 14700KF

```
CFLAGS="-O2 -march=native -mtune=native -falign-functions=32 -mtls-dialect=gnu2 -fcf-protection=none -mharden-sls=none -w -fno-reorder-blocks-and-partition -fPIC"
CXXFLAGS="$CFLAGS -Wp,-U_GLIBCXX_ASSERTIONS"
LDFLAGS="-Wl,-O3,--as-needed,--sort-common -Wl,-z,now -Wl,--emit-relocs"
CCLDFLAGS="$LDFLAGS"
CXXLDFLAGS="$LDFLAGS"
FFLAGS="$CFLAGS"
FCFLAGS="$CFLAGS"
GOAMD64="v3"
ASFLAGS="-D__AVX__=1 -D__AVX2__=1 -msse2avx -D__FMA__=1"
```

```
BOLT-INFO: shared object or position-independent executable detected
PERF2BOLT: Starting data aggregation job for /tmp/makepkg/glibc/src/glibc-build/perf.data
PERF2BOLT: spawning perf job to read branch events
PERF2BOLT: spawning perf job to read mem events
PERF2BOLT: spawning perf job to read process events
PERF2BOLT: spawning perf job to read task events
BOLT-INFO: Target architecture: x86_64
BOLT-INFO: BOLT version: 3def49cb64ec1298290724081bd37dbdeb2ea5f8
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x20, offset 0x20
BOLT-INFO: enabling relocation mode
BOLT-INFO: enabling strict relocation mode for aggregation purposes
BOLT-ERROR: function __restore_rt/1 is in conflict with FDE [3aa4f, 3aa59). Skipping.
BOLT-WARNING: sizes differ for function __setcontext/1. FDE : 332; symbol table : 352. Using max size.
BOLT-WARNING: sizes differ for function setcontext. FDE : 332; symbol table : 352. Using max size.
BOLT-WARNING: sizes differ for function __GI___clone/1. FDE : 52; symbol table : 95. Using max size.
BOLT-WARNING: sizes differ for function clone. FDE : 52; symbol table : 95. Using max size.
BOLT-WARNING: sizes differ for function __clone. FDE : 52; symbol table : 95. Using max size.
BOLT-WARNING: sizes differ for function __clone3/1. FDE : 27; symbol table : 71. Using max size.
BOLT-WARNING: sizes differ for function __GI___clone3/1. FDE : 27; symbol table : 71. Using max size.
BOLT-WARNING: FDE [0x3fbec, 0x3fc00) conflicts with function __setcontext/1(*2)
BOLT-WARNING: FDE [0xfe04e, 0xfe05e) conflicts with function __GI___clone/1(*3)
BOLT-WARNING: FDE [0xfe05e, 0xfe06f) conflicts with function __GI___clone/1(*3)
BOLT-WARNING: FDE [0xfe205, 0xfe216) conflicts with function __clone3/1(*2)
BOLT-WARNING: FDE [0xfe216, 0xfe227) conflicts with function __clone3/1(*2)
BOLT-ERROR: symbol seen in the middle of the function __BOLT_FDE_FUNCat3aa4f. Skipping.
BOLT-INFO: pre-processing profile using perf data aggregator
BOLT-INFO: binary build-id is: 0d57a2dbb20b38d7381aae0c7b25f4a0625509ad
PERF2BOLT: spawning perf job to read buildid list
PERF2BOLT: matched build-id and file name
PERF2BOLT: waiting for perf mmap events collection to finish...
PERF2BOLT: parsing perf-script mmap events output
PERF2BOLT: waiting for perf task events collection to finish...
PERF2BOLT: parsing perf-script task events output
PERF2BOLT: input binary is associated with 87 PID(s)
PERF2BOLT: waiting for perf events collection to finish...
PERF2BOLT: parse branch events...
PERF2BOLT: read 98826 samples and 3057341 LBR entries
PERF2BOLT: 0 samples (0.0%) were ignored
PERF2BOLT: traces mismatching disassembled function contents: 138 (0.0%)
PERF2BOLT: out of range traces involving unknown regions: 281115 (9.5%)
PERF2BOLT: waiting for perf mem events collection to finish...
BOLT-INFO: fixed PIC indirect branch detected in __GI___res_context_query/1(*2) at 0x119e65 referencing data at 0x187320 the destination value is 0x11a18b
 #0 0x57f646b7a9c5 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) Signals.cpp:0:0
 #1 0x57f646b7adbc SignalHandler(int) Signals.cpp:0:0
 #2 0x71235506aa50 __restore_rt libc_sigaction.c:0:0
 #3 0x57f64712057e llvm::bolt::BinaryFunction::disassemble() (/home/marcus/llvm20/bin/llvm-bolt+0x3d2057e)
 #4 0x57f646c1c6c5 llvm::bolt::Rewri
```

[llvm-bugs] [Bug 122829] Infinite loop in MLGO regalloc advisor

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122829




Summary

Infinite loop in MLGO regalloc advisor




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  pirama-arumuga-nainar
  




See reproducer at https://drive.google.com/file/d/1sdznuNjRXXl2whBczMhL0OpLlLVzXBRP/view?usp=sharing. (NB: the reproducer uses ThinLTO and was built against 99d0780f050c830c046c6f8790821880ab7c71f5)

The above reproducer hangs when building ToT LLVM with Android's [regalloc models](https://android.googlesource.com/platform/prebuilts/clang/host/linux-x86/+/refs/heads/main/mlgo-models/arm64/regalloc-evict-aosp/). (I haven't been able to build with the official MLGO-released models yet.)

cc: @mtrofin @kongy 




[llvm-bugs] [Bug 122710] [clang] incorrectly tries to capture constexpr variable in lambda with parenthesized noexcept specifier

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122710




Summary

[clang] incorrectly tries to capture constexpr variable in lambda with parenthesized noexcept specifier




  Labels
  
clang
  



  Assignees
  
  



  Reporter
  
  cgnitash
  




The following program https://godbolt.org/z/jTdY6d81n:
```
template <typename... Ts>
void f(Ts...) {}

int main (){
constexpr int n = 42;
 f(n, []() noexcept (true) {});
}
```
is rejected by Clang starting from llvm-17 with 
>  error: variable 'n' cannot be implicitly captured in a lambda with no capture-default specified

which is obviously wrong.

Note that the `noexcept (/* expr */)` is necessary to reproduce this (dropping the `noexcept`, or using just the `noexcept` without a `(/* expr */)` doesn't manifest the bug).

This is very similar to https://github.com/llvm/llvm-project/issues/120503 except that 
 1. The cause in that case seems unrelated to `noexcept`.
 2. That bug was introduced in llvm-16, whereas this bug only happens from llvm-17.




[llvm-bugs] [Bug 122733] Request Commit Access For willfroom

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122733




Summary

Request Commit Access For willfroom




  Labels
  
infra:commit-access-request
  



  Assignees
  
  



  Reporter
  
  WillFroom
  




### Why Are you requesting commit access ?

I am on the MLIR integration team at Google.




[llvm-bugs] [Bug 122724] [AMDGPU][GISel] Dead code generated by GISel

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122724




Summary

[AMDGPU][GISel] Dead code generated by GISel




  Labels
  
backend:AMDGPU,
llvm:globalisel
  



  Assignees
  
  



  Reporter
  
  tyb0807
  




Given [this IR](https://github.com/user-attachments/files/18398484/input.txt), the last basic block is translated with `llc -O3 -march=amdgcn -mcpu=gfx942 -print-after-all -mtriple amdgcn-amd-hmcsa -global-isel --asm-verbose --asm-show-inst input.txt -o gisel.s 2> gisel.mir` into
```
 bb.4 (%ir-block.1168):
 ; predecessors: %bb.1
   liveins: $agpr55, $sgpr15, $sgpr22, $vgpr53, $agpr32_agpr33_agpr34_agpr35, $agpr36_agpr37_agpr38_agpr39, $agpr40_agpr41_agpr42_agpr43, $agpr44_agpr45_agpr46_agpr47, $agpr48_agpr49_agpr50_agpr51, $agpr56_agpr57_agpr58_agpr59, $agpr60_agpr61_
 $vgpr16_vgpr17_vgpr18_vgpr19 = SCRATCH_LOAD_DWORDX4_ST 0, 0, implicit $exec, implicit $flat_scr :: (load (s128) from %stack.0, align 4, addrspace 5)
   renamable $vgpr20, dead renamable $sgpr0_sgpr1 = V_DIV_SCALE_F32_e64 0, $vgpr53, 0, $vgpr53, 0, 1065353216, 0, 0, implicit $mode, implicit $exec
 renamable $vgpr28 = nofpexcept V_RCP_F32_e32 $vgpr20, implicit $mode, implicit $exec
   renamable $vgpr21, renamable $vcc = V_DIV_SCALE_F32_e64 0, 1065353216, 0, $vgpr53, 0, 1065353216, 0, 0, implicit $mode, implicit $exec
 $vgpr32 = V_MOV_B32_e32 $vgpr53, implicit $exec, implicit $exec
 renamable $vgpr22 = nofpexcept V_FMA_F32_e64 1, $vgpr20, 0, $vgpr28, 0, 1065353216, 0, 0, implicit $mode, implicit $exec
   renamable $vgpr28 = nofpexcept V_FMAC_F32_e32 killed $vgpr22, killed $vgpr28, $vgpr28(tied-def 0), implicit $mode, implicit $exec
   renamable $vgpr29 = nofpexcept V_MUL_F32_e32 $vgpr21, $vgpr28, implicit $mode, implicit $exec
   renamable $vgpr22 = nofpexcept V_FMA_F32_e64 1, $vgpr20, 0, $vgpr29, 0, $vgpr21, 0, 0, implicit $mode, implicit $exec
   renamable $vgpr29 = nofpexcept V_FMAC_F32_e32 killed $vgpr22, $vgpr28, killed $vgpr29(tied-def 0), implicit $mode, implicit $exec
   renamable $vgpr30 = nofpexcept V_FMA_F32_e64 1, killed $vgpr20, 0, $vgpr29, 0, killed $vgpr21, 0, 0, implicit $mode, implicit $exec
   renamable $vgpr31 = nofpexcept V_DIV_FMAS_F32_e64 0, $vgpr30, 0, $vgpr28, 0, $vgpr29, 0, 0, implicit $mode, implicit $vcc, implicit $exec
   renamable $vgpr36 = nofpexcept V_DIV_FIXUP_F32_e64 0, killed $vgpr31, 0, $vgpr53, 0, 1065353216, 0, 0, implicit $mode, implicit $exec
   renamable $vgpr31 = nofpexcept V_DIV_FMAS_F32_e64 0, $vgpr30, 0, $vgpr28, 0, $vgpr29, 0, 0, implicit $mode, implicit $vcc, implicit $exec
   renamable $vgpr37 = nofpexcept V_DIV_FIXUP_F32_e64 0, killed $vgpr31, 0, $vgpr53, 0, 1065353216, 0, 0, implicit $mode, implicit $exec
   renamable $vgpr31 = nofpexcept V_DIV_FMAS_F32_e64 0, $vgpr30, 0, $vgpr28, 0, $vgpr29, 0, 0, implicit $mode, implicit $vcc, implicit $exec
   renamable $vgpr38 = nofpexcept V_DIV_FIXUP_F32_e64 0, killed $vgpr31, 0, $vgpr53, 0, 1065353216, 0, 0, implicit $mode, implicit $exec
   renamable $vgpr31 = nofpexcept V_DIV_FMAS_F32_e64 0, $vgpr30, 0, $vgpr28, 0, $vgpr29, 0, 0, implicit $mode, implicit $vcc, implicit $exec
   renamable $vgpr39 = nofpexcept V_DIV_FIXUP_F32_e64 0, killed $vgpr31, 0, $vgpr53, 0, 1065353216, 0, 0, implicit $mode, implicit $exec
   renamable $vgpr31 = nofpexcept V_DIV_FMAS_F32_e64 0, $vgpr30, 0, $vgpr28, 0, $vgpr29, 0, 0, implicit $mode, implicit $vcc, implicit $exec
   renamable $vgpr40 = nofpexcept V_DIV_FIXUP_F32_e64 0, killed $vgpr31, 0, $vgpr53, 0, 1065353216, 0, 0, implicit $mode, implicit $exec
   renamable $vgpr31 = nofpexcept V_DIV_FMAS_F32_e64 0, $vgpr30, 0, $vgpr28, 0, $vgpr29, 0, 0, implicit $mode, implicit $vcc, implicit $exec
   renamable $vgpr41 = nofpexcept V_DIV_FIXUP_F32_e64 0, killed $vgpr31, 0, $vgpr53, 0, 1065353216, 0, 0, implicit $mode, implicit $exec
   renamable $vgpr31 = nofpexcept V_DIV_FMAS_F32_e64 0, $vgpr30, 0, $vgpr28, 0, $vgpr29, 0, 0, implicit $mode, implicit $vcc, implicit $exec
   renamable $vgpr42 = nofpexcept V_DIV_FIXUP_F32_e64 0, killed $vgpr31, 0, $vgpr53, 0, 1065353216, 0, 0, implicit $mode, implicit $exec
   renamable $vgpr31 = nofpexcept V_DIV_FMAS_F32_e64 0, $vgpr30, 0, $vgpr28, 0, $vgpr29, 0, 0, implicit $mode, implicit $vcc, implicit $exec
   renamable $vgpr43 = nofpexcept V_DIV_FIXUP_F32_e64 0, killed $vgpr31, 0, $vgpr53, 0, 1065353216, 0, 0, implicit $mode, implicit $exec
   renamable $vgpr31 = nofpexcept V_DIV_FMAS_F32_e64 0, $vgpr30, 0, $vgpr28, 0, $vgpr29, 0, 0, implicit $mode, implicit $vcc, implicit $exec
   renamable $vgpr44 = nofpexcept V_DIV_FIXUP_F32_e64 0, killed $vgpr31, 0, $vgpr53, 0, 1065353216, 0, 0, implicit $mode, implicit $exec
   renamable $vgpr31 = nofpexcept V_DIV_FMAS_F32_e64 0, $vgpr30, 0, $vgpr28, 0, $vgpr29, 0, 0, implicit $mode, implicit $vcc, implicit $exec
   renamable $vgpr45 = nofpexcept V_DIV_FIXUP_F32_e64 0, killed $vgpr31, 0, $vgpr53, 0, 10
```

[llvm-bugs] [Bug 122728] [PowerPC] li of 0 into arg registers of unused arguments

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122728




Summary

[PowerPC] li of 0 into arg registers of unused arguments




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  diggerlin
  




cat > test.c 

```
typedef signed char sb;
void foo(sb);
void __attribute__((noinline)) bar(sb sb1, sb var2, sb var3, sb var4,
                                   sb sb5)
{
  foo(sb5);
}
void __attribute__((noinline)) test() {
  bar(1, 2, 3, 4, 125);
}
```

bash-5.2$  /home/zhijian/llvm/dev/build/bin/ibm-clang  seg.c -m32 -S  -o seg.s

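The generated code loads 0 into the argument registers of the unused arguments (`li 3, 0` through `li 6, 0` in `.test`):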

```

.bar:
# %bb.0: # %entry
        mflr 0
        stwu 1, -64(1)
        mr 3, 7
        stw 0, 72(1)
        bl .foo[PR]
        nop
        addi 1, 1, 64
        lwz 0, 8(1)
        mtlr 0
        blr

.test:
# %bb.0: # %entry
        mflr 0
        stwu 1, -64(1)
        li 3, 0
        li 4, 0
        li 5, 0
        li 6, 0
        stw 0, 72(1)
        li 7, 125
        bl .bar
        nop
```






[llvm-bugs] [Bug 122681] RISC-V EVL tail folding failure on SPEC CPU 2017 525.x264_r

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122681




Summary

RISC-V EVL tail folding failure on SPEC CPU 2017 525.x264_r




  Labels
  
backend:RISC-V,
vectorizers
  



  Assignees
  
  



  Reporter
  
  lukel97
  




Split out from the discussion here: https://github.com/llvm/llvm-project/pull/122458#issuecomment-2585713670 

On RISC-V with `-march=rva22u64_v -O3 -flto -mllvm -force-tail-folding-style=data-with-evl -mllvm -prefer-predicate-over-epilogue=predicate-else-scalar-epilogue`, the SPEC CPU 2017 525.x264_r benchmark fails on the train dataset, likely due to a miscompile.

It's been failing since at least 6ad0dcf67f5dccdf8506ce5f51d793062a1c6879, detected from this LNT run: https://lnt.lukelau.me/db_default/v4/nts/89
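
For readers unfamiliar with the `data-with-evl` tail-folding style, here is a rough scalar model of the loop shape it aims for: each iteration processes an explicit vector length (EVL) of at most one full vector, so the remainder is handled by the last iteration rather than by a scalar epilogue. This is only a sketch; `add_with_evl`, `vf`, and `evl_trace` are hypothetical names, and it does not represent the vectorizer's actual output or the miscompile itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar emulation of an EVL tail-folded loop computing out[i] = a[i] + b[i].
// vf stands in for the hardware vector length; evl_trace (optional) records
// the explicit vector length used by each iteration.
std::vector<int> add_with_evl(const std::vector<int>& a,
                              const std::vector<int>& b,
                              std::size_t vf,
                              std::vector<std::size_t>* evl_trace = nullptr) {
  assert(a.size() == b.size() && vf > 0);
  std::vector<int> out(a.size());
  for (std::size_t i = 0; i < a.size();) {
    // Active lanes this iteration: a full vector, or whatever remains.
    std::size_t evl = std::min(vf, a.size() - i);
    if (evl_trace) evl_trace->push_back(evl);
    // Lanes at or beyond evl are never touched.
    for (std::size_t lane = 0; lane < evl; ++lane)
      out[i + lane] = a[i + lane] + b[i + lane];
    i += evl;
  }
  return out;
}
```

With 10 elements and `vf = 4`, the model runs three iterations with lengths 4, 4, and 2.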




[llvm-bugs] [Bug 122682] RISC-V EVL tail folding failure on SPEC CPU 2017 502.gcc_r

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122682




Summary

RISC-V EVL tail folding failure on SPEC CPU 2017 502.gcc_r




  Labels
  
backend:RISC-V,
vectorizers
  



  Assignees
  
lukel97
  



  Reporter
  
  lukel97
  




Split out from the discussion here: https://github.com/llvm/llvm-project/pull/122458#issuecomment-2585713670 

On RISC-V with `-march=rva22u64_v -O3 -flto -mllvm -force-tail-folding-style=data-with-evl -mllvm -prefer-predicate-over-epilogue=predicate-else-scalar-epilogue`, the SPEC CPU 2017 502.gcc_r benchmark fails on the train dataset, likely due to a miscompile.

It's been failing since at least 6ad0dcf67f5dccdf8506ce5f51d793062a1c6879, detected from this LNT run: https://lnt.lukelau.me/db_default/v4/nts/89




[llvm-bugs] [Bug 122707] Windows aarch64 (w/ inline asm .align): Failed to evaluate function length in SEH unwind info

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122707




Summary

Windows aarch64 (w/ inline asm .align): Failed to evaluate function length in SEH unwind info




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  hmartinez82
  




When adding an `.align` pseudo op to inline assembly (built with `clang -c`):
```c
int f(int i) {
    int result;
    __asm__ (
        ".align 5 \n"
        "add %w0, %w1, #41"
        : "=r" (result)
        : "r" (i)
        :
    );
    return result;
}
```
Clang crashes with:
```
fatal error: error in backend: Failed to evaluate function length in SEH unwind info
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0. Program arguments: C:\\msys64\\clangarm64\\bin\\clang.exe -c align.c
1.   parser at end of file
2.  Code generation
3. Running pass 'Function Pass Manager' on module 'align.c'.
4.  Running pass 'AArch64 Assembly Printer' on function '@f'
Exception Code: 0xE046
#0 0x7ff94adb6248 (C:\Windows\System32\KERNELBASE.dll+0xb6248)
#1 0x730afff85102d7d8
clang: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 19.1.6
Target: aarch64-w64-windows-gnu
Thread model: posix
InstalledDir: C:/msys64/clangarm64/bin
clang: note: diagnostic msg:


PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang: note: diagnostic msg: C:/msys64/tmp/align-f60711.c
clang: note: diagnostic msg: C:/msys64/tmp/align-f60711.sh
clang: note: diagnostic msg:


```




[llvm-bugs] [Bug 122688] GlobalISel sdiv/sext shift right heavy

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122688




Summary

GlobalISel sdiv/sext shift right heavy




  Labels
  
backend:AMDGPU,
llvm:globalisel
  



  Assignees
  
  



  Reporter
  
  tpopp
  




[reduced.gisel.txt](https://github.com/user-attachments/files/18395905/reduced.gisel.txt)
[reduced.sdisel.txt](https://github.com/user-attachments/files/18395904/reduced.sdisel.txt)
[reduced.txt](https://github.com/user-attachments/files/18395903/reduced.txt)

Given input IR like:

```
  %i78 = sdiv i32 %i, 128
  %i80 = sext i32 %i78 to i64
  %i83 = sdiv i64 %i80, 4
```

I see many more `ashr` and `lshr` instructions in the result with GlobalISel. I believe SDISel uses `v_bfe` in place of some 2-4 SHR instructions.

I've been looking at this and reducing with the command `llc -O3 -march=amdgcn -mcpu=gfx942 -mtriple amdgcn-amd-hmcsa  ./reduced.ll  -global-isel={true,false}`
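
For context on where the shifts come from, the usual expansion of a signed division by a power of two can be sketched in C++ as below. This is an illustration of the general technique, not the code either selector emits; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Signed division by 2^k (1 <= k <= 31), rounding toward zero, using shifts only.
// Assumes >> on a negative signed value is an arithmetic shift (guaranteed
// since C++20; mainstream compilers behave this way in earlier standards too).
std::int32_t sdiv_pow2_i32(std::int32_t x, unsigned k) {
  assert(k >= 1 && k <= 31);
  std::int32_t sign = x >> 31;                                        // ashr: all ones if x < 0
  std::uint32_t bias = static_cast<std::uint32_t>(sign) >> (32 - k);  // lshr: 2^k - 1 if x < 0
  std::uint32_t biased = static_cast<std::uint32_t>(x) + bias;        // add without signed overflow
  return static_cast<std::int32_t>(biased) >> k;                      // ashr
}

// Same expansion at 64 bits (1 <= k <= 63).
std::int64_t sdiv_pow2_i64(std::int64_t x, unsigned k) {
  assert(k >= 1 && k <= 63);
  std::int64_t sign = x >> 63;
  std::uint64_t bias = static_cast<std::uint64_t>(sign) >> (64 - k);
  std::uint64_t biased = static_cast<std::uint64_t>(x) + bias;
  return static_cast<std::int64_t>(biased) >> k;
}

// The pattern from the report: sdiv i32 by 128, sext to i64, sdiv i64 by 4.
std::int64_t report_pattern(std::int32_t i) {
  return sdiv_pow2_i64(static_cast<std::int64_t>(sdiv_pow2_i32(i, 7)), 2);
}
```

Each division contributes one `lshr` and two `ashr` operations in this naive form, which is consistent with the shift-heavy output described above.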




[llvm-bugs] [Bug 122689] Unnecessary call to `memset` when initializing an array of structs with non zero member initialization.

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122689




Summary

Unnecessary call to `memset` when initializing an array of structs with non zero member initialization.




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  gchatelet
  




[godbolt link](https://godbolt.org/z/h39sa6fnv)

```
struct T {
  int a = 1;
  int b = 0;
};

template 
struct S { T a[n]; };

S<75> F() { return {}; }
```

Compiled with `-O3 -std=c++20 -DNDEBUG -fno-exceptions -march=skylake` generates the following assembly
```
.LCPI0_0:
        .long   1
        .long   0
F():
        pushq   %rbx
        movq    %rdi, %rbx
        movl    $600, %edx
        xorl    %esi, %esi
        callq   memset@PLT
        vbroadcastsd    .LCPI0_0(%rip), %ymm0
        vmovups %ymm0, 32(%rbx)
        vmovups %ymm0, (%rbx)
        vmovups %ymm0, 96(%rbx)
        vmovups %ymm0, 64(%rbx)
        vmovups %ymm0, 160(%rbx)
        vmovups %ymm0, 128(%rbx)
        vmovups %ymm0, 224(%rbx)
        vmovups %ymm0, 192(%rbx)
        vmovups %ymm0, 288(%rbx)
        vmovups %ymm0, 256(%rbx)
        vmovups %ymm0, 352(%rbx)
        vmovups %ymm0, 320(%rbx)
        vmovups %ymm0, 416(%rbx)
        vmovups %ymm0, 384(%rbx)
        vmovups %ymm0, 480(%rbx)
        vmovups %ymm0, 448(%rbx)
        vmovups %ymm0, 544(%rbx)
        vmovups %ymm0, 512(%rbx)
        vmovups %xmm0, 576(%rbx)
        movq    $1, 592(%rbx)
        movq    %rbx, %rax
        popq    %rbx
        vzeroupper
        retq
```

The compiler first clears `S<75>` content with a call to `memset` and then sets its content through an unrolled loop of YMM stores.
If using `S<74>` instead of `S<75>` the call to `memset` goes away.
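
To make the redundancy concrete, the following sketch checks that zero-filling the object before storing every element produces the same bytes as storing the elements alone. It is an illustration under assumptions: the template parameter type `int` is a guess (the report only shows the name `n`), `std::int32_t` replaces `int` to pin the member size, and `memset_is_redundant` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Mirrors the reporter's T; std::int32_t is an assumption that fixes the size.
struct T {
  std::int32_t a = 1;
  std::int32_t b = 0;
};

// The parameter type `int` is an assumption; the report only shows the name n.
template <int n>
struct S { T a[n]; };

// Returns true when clearing the buffer before storing the elements yields
// the same bytes as storing the elements alone.
template <int n>
bool memset_is_redundant() {
  static_assert(sizeof(T) == 8, "two 4-byte members, no padding");
  unsigned char with_memset[sizeof(S<n>)];
  unsigned char without_memset[sizeof(S<n>)];
  // Start both buffers from the same stale contents.
  std::memset(with_memset, 0xAA, sizeof with_memset);
  std::memset(without_memset, 0xAA, sizeof without_memset);

  std::memset(with_memset, 0, sizeof with_memset);  // the extra clear
  const T init{};                                   // a = 1, b = 0
  for (int i = 0; i < n; ++i) {
    std::memcpy(with_memset + i * sizeof(T), &init, sizeof(T));
    std::memcpy(without_memset + i * sizeof(T), &init, sizeof(T));
  }
  return std::memcmp(with_memset, without_memset, sizeof with_memset) == 0;
}
```

Under these assumptions `sizeof(S<75>)` is 600 bytes, matching the `movl $600, %edx` argument to `memset` in the assembly, and every one of those bytes is overwritten by the element stores.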

Apparently the clearing part is created in the frontend (clang). Here is the LLVM IR with `-O0`:
```
%struct.S = type { [75 x %struct.T] }
%struct.T = type { i32, i32 }

define dso_local void @F()(ptr dead_on_unwind noalias writable sret(%struct.S) align 4 %agg.result) {
entry:
  call void @llvm.memset.p0.i64(ptr align 4 %agg.result, i8 0, i64 592, i1 false)
  %a = getelementptr inbounds nuw %struct.S, ptr %agg.result, i32 0, i32 0
  %arrayinit.end = getelementptr inbounds %struct.T, ptr %a, i64 75
  br label %arrayinit.body

arrayinit.body:
  %arrayinit.cur = phi ptr [ %a, %entry ], [ %arrayinit.next, %arrayinit.body ]
  %a1 = getelementptr inbounds nuw %struct.T, ptr %arrayinit.cur, i32 0, i32 0
  store i32 1, ptr %a1, align 4
  %b = getelementptr inbounds nuw %struct.T, ptr %arrayinit.cur, i32 0, i32 1
  store i32 0, ptr %b, align 4
  %arrayinit.next = getelementptr inbounds %struct.T, ptr %arrayinit.cur, i64 1
  %arrayinit.done = icmp eq ptr %arrayinit.next, %arrayinit.end
  br i1 %arrayinit.done, label %arrayinit.end2, label %arrayinit.body

arrayinit.end2:
  ret void
}

declare void @llvm.memset.p0.i64(ptr nocapture writeonly, i8, i64, i1 immarg) #1
```




[llvm-bugs] [Bug 122706] Global ISel packing earlier with many masks

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122706




Summary

Global ISel packing earlier with many masks




  Labels
  
llvm:globalisel,
mlir:amdgpu
  



  Assignees
  
  



  Reporter
  
  tpopp
  




I've reduced the function a lot to hopefully make the information more useful, but in the full function, this has been more noticeably excessive. This is using commands like `llc -O3 -march=amdgcn -mcpu=gfx942  -mtriple amdgcn-amd-hmcsa -global-isel={true,false}`.

Both cases pack inputs and use `v_pk_fma_f16` instructions, but GlobalISel packs them early and then masks the values to get the high/low words for various other instructions, resulting in a lot of extra masking computations, while SDISel inserts the packing just before the FMA calls. I haven't yet checked whether there is some heuristic that could be tweaked to trade off the cost of the extra masking.
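
As a small illustration of the masking cost being described, here is what packing two 16-bit values into one 32-bit register and extracting them again looks like in C++. The helper names are hypothetical, and the values are treated as raw bit patterns rather than real `half` arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Pack two 16-bit lanes into one 32-bit value: lo in bits 0-15, hi in bits 16-31.
std::uint32_t pack_halves(std::uint16_t lo, std::uint16_t hi) {
  return static_cast<std::uint32_t>(lo) | (static_cast<std::uint32_t>(hi) << 16);
}

// Extracting the low lane needs a mask.
std::uint16_t low_half(std::uint32_t packed) {
  return static_cast<std::uint16_t>(packed & 0xFFFFu);
}

// Extracting the high lane needs a shift.
std::uint16_t high_half(std::uint32_t packed) {
  return static_cast<std::uint16_t>(packed >> 16);
}
```

Every scalar use of a lane after an early pack pays for one of these extractions, which is why packing immediately before the packed FMA tends to be cheaper.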

```
target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9"
target triple = "amdgcn-amd-amdhsa"

define amdgpu_kernel void @"main$async_dispatch_157_elementwise_2x1024x5120_f16xf16xf16xf32xi8"(<4 x half> %i37) {
bb:
  %i53 = fcmp olt <4 x half> %i37, zeroinitializer
  %i54 = select <4 x i1> %i53, <4 x half> zeroinitializer, <4 x half> splat (half 0xH9AC3)
  %i55 = select <4 x i1> %i53, <4 x half> splat (half 0xH3C00), <4 x half> zeroinitializer
  %i57 = select <4 x i1> %i53, <4 x half> zeroinitializer, <4 x half> splat (half 0xH95CA)
  %i59 = select <4 x i1> %i53, <4 x half> zeroinitializer, <4 x half> splat (half 0xH7E00)
  %i63 = select <4 x i1> %i53, <4 x half> zeroinitializer, <4 x half> splat (half 0xH3C00)
  %i66 = tail call <4 x half> @llvm.fma.v4f16(<4 x half> zeroinitializer, <4 x half> %i59, <4 x half> %i57)
  %i67 = tail call <4 x half> @llvm.fma.v4f16(<4 x half> zeroinitializer, <4 x half> %i66, <4 x half> %i55)
  %i68 = tail call <4 x half> @llvm.fma.v4f16(<4 x half> zeroinitializer, <4 x half> %i67, <4 x half> %i54)
  %i74 = fadd <4 x half> %i63, %i68
  %i87 = tail call <4 x half> @llvm.roundeven.v4f16(<4 x half> %i74)
  %.inv = fcmp oge <4 x half> %i87, splat (half 0xHD800)
  %i88 = select <4 x i1> %.inv, <4 x half> %i87, <4 x half> splat (half 0xHD800)
  %i90 = fptosi <4 x half> %i88 to <4 x i8>
  store <4 x i8> %i90, ptr addrspace(1) null, align 1
  ret void
}

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare <4 x half> @llvm.roundeven.v4f16(<4 x half>) #0

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare <4 x half> @llvm.fma.v4f16(<4 x half>, <4 x half>, <4 x half>) #0

; uselistorder directives
uselistorder ptr @llvm.fma.v4f16, { 2, 1, 0 }

attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
```

[reduced.gisel.txt](https://github.com/user-attachments/files/18397297/reduced.gisel.txt)
[reduced.sdisel.txt](https://github.com/user-attachments/files/18397295/reduced.sdisel.txt)
[reduced.txt](https://github.com/user-attachments/files/18397296/reduced.txt)




[llvm-bugs] [Bug 122709] [AMDGPU][GISel] GlobalISel doesn’t make use enough of _e64 instructions

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122709




Summary

[AMDGPU][GISel] GlobalISel doesn’t make use enough of _e64 instructions




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  tyb0807
  




Given the input IR in [input_ir.txt](https://github.com/user-attachments/files/18397617/input_ir.txt), GISel generates code using mostly `_e32` instructions, while SelectionDAG makes use of `_e64` instructions, which results in much more compact code:

```
 ; Function info: (SelectionDAG)
 ; codeLenInByte = 212
 ; NumSgprs: 38
 ; NumVgprs: 32
 ; NumAgprs: 0
 ; TotalNumVgprs: 32
 ; ScratchSize: 324
 ; MemoryBound: 0

 ; Function info: (GlobalISel)
 ; codeLenInByte = 292
 ; NumSgprs: 38
 ; NumVgprs: 32
 ; NumAgprs: 0
 ; TotalNumVgprs: 32
 ; ScratchSize: 324
 ; MemoryBound: 0
```

Using `llc -O3 -march=amdgcn -mcpu=gfx942  -print-after-all -mtriple amdgcn-amd-hmcsa --asm-verbose --asm-show-inst input_ir.txt -o seldag.s 2> seldag.mir` and `llc -O3 -march=amdgcn -mcpu=gfx942  -print-after-all -mtriple amdgcn-amd-hmcsa -global-isel --asm-verbose --asm-show-inst input_ir.txt -o gisel.s 2> gisel.mir`, we can see from the final MIR that, for instance, to compute the value to be stored at `$vgpr12`:

SelectionDAG makes extensive use of `_e64` instructions:
```
renamable $vgpr7 = V_MOV_B32_e32 0, implicit $exec
renamable $vgpr0 = V_AND_B32_e32 -2, killed $vgpr0, implicit $exec
renamable $vgpr2 = V_LSHRREV_B32_e32 2, $vgpr0, implicit $exec
$vgpr1 = V_MOV_B32_e32 $vgpr7, implicit $exec, implicit $exec
$vgpr5 = V_MOV_B32_e32 killed $sgpr0, implicit $exec, implicit $exec
renamable $vgpr48 = V_SUB_CO_U32_e32 killed $sgpr12, killed $vgpr2, implicit-def $vcc, implicit $exec

renamable $vgpr6 = V_AND_B32_e32 1023, $vgpr31, implicit $exec
renamable $vgpr0_vgpr1 = V_LSHLREV_B64_e64 5, killed $vgpr0_vgpr1, implicit $exec
renamable $vgpr49 = V_SUBBREV_U32_e32 0, killed $vgpr5, implicit-def dead $vcc, implicit killed $vcc, implicit $exec

renamable $vgpr48_vgpr49 = nsw V_LSHLREV_B64_e64 7, killed $vgpr48_vgpr49, implicit $exec
renamable $vgpr0_vgpr1 = V_LSHL_ADD_U64_e64 killed $vgpr0_vgpr1, 0, $vgpr6_vgpr7, implicit $exec

renamable $vgpr48_vgpr49 = V_LSHL_ADD_U64_e64 killed $vgpr0_vgpr1, 0, killed $vgpr48_vgpr49, implicit $exec

SCRATCH_STORE_DWORDX2 killed renamable $vgpr48_vgpr49, killed renamable $vgpr12, 0, 0, implicit $exec, implicit $flat_scr :: (store (s64) into %ir.i44.out, addrspace 5)
```

GlobalISel does not, and thus generates much more verbose code:
```
   renamable $vgpr8 = V_AND_B32_e32 1023, $vgpr31, implicit $exec
   renamable $vgpr48 = V_ASHRREV_I32_e32 31, $vgpr8, implicit $exec
   renamable $vgpr49 = V_XOR_B32_e32 $vgpr48, $vgpr8, implicit $exec
   renamable $vgpr50 = V_ASHRREV_I32_e32 31, $vgpr49, implicit $exec
   renamable $vgpr6 = V_LSHRREV_B32_e32 26, $vgpr50, implicit $exec
   renamable $vgpr6 = V_ADD_U32_e32 $vgpr49, killed $vgpr6, implicit $exec
   renamable $vgpr6 = V_ASHRREV_I32_e32 6, killed $vgpr6, implicit $exec
   renamable $vgpr31 = V_BFE_U32_e64 killed $vgpr31, 10, 10, implicit $exec
   renamable $vgpr6 = V_XOR_B32_e32 killed $vgpr6, $vgpr48, implicit $exec
   renamable $vgpr5 = V_ASHRREV_I32_e32 31, $vgpr31, implicit $exec
   renamable $vgpr7 = V_ASHRREV_I32_e32 31, $vgpr6, implicit $exec
   renamable $vgpr33 = V_ADD_U32_e32 killed $vgpr7, killed $vgpr5, implicit $exec
   renamable $vgpr5 = V_ASHRREV_I32_e32 31, $vgpr33, implicit $exec
   renamable $vgpr6 = V_XOR_B32_e32 killed $vgpr6, $vgpr48, implicit $exec
   renamable $vgpr32 = nuw V_ADD_U32_e32 killed $vgpr6, $vgpr31, implicit $exec
   renamable $vgpr6 = V_SUB_CO_U32_e32 killed $vgpr32, $vgpr5, implicit-def $vcc, implicit $exec

   renamable $vgpr6_vgpr7 = nsw V_LSHLREV_B64_e64 5, killed $vgpr6_vgpr7, implicit $exec
   renamable $vgpr6 = V_ADD_CO_U32_e32 killed $vgpr6, $vgpr8, implicit-def $vcc, implicit $exec

   renamable $vgpr32_vgpr33 = nsw V_LSHLREV_B64_e64 7, killed $vgpr32_vgpr33, implicit $exec

   renamable $vgpr34 = V_ADD_CO_U32_e32 killed $vgpr6, killed $vgpr32, implicit-def $vcc, implicit $exec
 
   renamable $vgpr5 = V_LSHRREV_B32_e32 27, $vgpr50, implicit $exec
   renamable $vgpr5 = V_ADD_U32_e32 $vgpr49, killed $vgpr5, implicit $exec
   renamable $vgpr5 = V_ASHRREV_I32_e32 5, killed $vgpr5, implicit $exec
   renamable $vgpr5 = V_XOR_B32_e32 killed $vgpr5, $vgpr48, implicit $exec
   renamable $vgpr36 = V_ASHRREV_I32_e32 31, $vgpr5, implicit $exec
   renamable $vgpr7 = V_SUBB_U32_e32 killed $vgpr33, $vgpr36, implicit-def dead $vcc, implicit killed $vcc, implicit $exec

   renamable $vgpr6_vgpr7 = nsw V_LSHLREV_B64_e64 5, killed $vgpr6_vgpr7, implicit $exec
   renamable $vgpr7 = V_ADDC_U32_e32 killed $vgpr7, $vgpr48, implicit-def dead $vcc, implicit killed $vcc, implicit $exec
 
   renamable $vgpr7 = V_ADDC_U32_e32 0, killed $vgpr7, implicit-def dead $vcc, im
```

[llvm-bugs] [Bug 122744] Request Commit Access For shivaramaarao

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122744




Summary

Request Commit Access For shivaramaarao




  Labels
  
  



  Assignees
  
  



  Reporter
  
  shivaramaarao
  




I would like to contribute to the Flang project. I have a pull request ready and will be working on OpenMP support in llvm-flang.




[llvm-bugs] [Bug 122766] [SPIRV] Add pre legalizer instCombine for GL\CL extensions

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122766




Summary

[SPIRV] Add pre legalizer instCombine for GL\CL extensions




  Labels
  
new issue
  



  Assignees
  
farzonl
  



  Reporter
  
  farzonl
  




We will want to be able to change length(A-B) to distance(A,B) so that we use the right SPIR-V extension function.
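
The rewrite relies on the identity that the distance between two points is the length of their difference. A minimal C++ sketch of that identity follows; `Vec3`, `sub`, `length`, and `distance` are hypothetical stand-ins for the shader built-ins, not SPIR-V backend code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 3-component vector standing in for a shader vector type.
struct Vec3 { float x, y, z; };

Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }

// Euclidean norm of a vector.
float length(Vec3 v) { return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z); }

// By definition, distance(a, b) is length(a - b), which is what makes the
// combine from length(A-B) to distance(A,B) value-preserving.
float distance(Vec3 a, Vec3 b) { return length(sub(a, b)); }
```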




[llvm-bugs] [Bug 122767] [HLSL] RWBuffer resource variable has external linkage

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122767




Summary

[HLSL] RWBuffer resource variable has external linkage




  Labels
  
new issue
  



  Assignees
  
s-perron
  



  Reporter
  
  s-perron
  




When trying to compile an HLSL file with a resource, the compiler generates a variable to hold the handle to the resource. That variable is not actually the resource, and I believe it should have internal linkage, so that the optimizer could remove all references to it. See https://godbolt.org/z/v559r6axr

```
RWBuffer buffer : register(u2);

[numthreads(1,1,1)]
void main()
{
    buffer[0] = 0;
}
```

This turns into:

```
@buffer = local_unnamed_addr global %"class.hlsl::RWBuffer" zeroinitializer, align 4, !dbg !0

; Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(write, inaccessiblemem: none)
define void @main() local_unnamed_addr #0 {
  %1 = tail call target("dx.TypedBuffer", <4 x i32>, 1, 0, 0) @llvm.dx.resource.handlefrombinding.tdx.TypedBuffer_v4i32_1_0_0t(i32 0, i32 2, i32 1, i32 0, i1 false), !dbg !45
  store target("dx.TypedBuffer", <4 x i32>, 1, 0, 0) %1, ptr @buffer, align 4, !dbg !45
    #dbg_value(ptr @buffer, !49, !DIExpression(), !54)
    #dbg_value(i32 0, !52, !DIExpression(), !54)
  %2 = tail call noundef nonnull align 16 dereferenceable(16) ptr @llvm.dx.resource.getpointer.p0.tdx.TypedBuffer_v4i32_1_0_0t(target("dx.TypedBuffer", <4 x i32>, 1, 0, 0) %1, i32 0), !dbg !54
  store <4 x i32> zeroinitializer, ptr %2, align 16, !dbg !59, !tbaa !60
  ret void
}
```

Note the store to `@buffer` was not removed. This causes problems for the SPIR-V backend:

1. We cannot remove the store to `@buffer` in the backend because it could be externally visible.
2. SPIR-V has a concept of externally visible variables, but they are not allowed in Vulkan.

Is there anything we can do to allow the optimizer to remove the store?
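
As a loose C++ analogy (not HLSL, and not what the frontend emits), the difference between the two linkages can be sketched as follows. All names are hypothetical. A store to an external-linkage global must be kept because another translation unit may read it, while a store to an internal-linkage global that is never read in this file is a candidate for dead-store elimination.

```cpp
#include <cassert>

int external_handle;         // external linkage: other translation units may read it
static int internal_handle;  // internal linkage: only this file can reference it

// The store must be preserved; the optimizer cannot see every reader.
int write_external(int h) {
  external_handle = h;
  return h;
}

// Nothing in this file reads internal_handle, so an optimizer is free to
// drop this store (and eventually the variable itself).
int write_internal(int h) {
  internal_handle = h;
  return h;
}
```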




[llvm-bugs] [Bug 122774] libcxx presubmit should be tested against minimal supported glibc (2.24)

2025-01-13 Thread LLVM Bugs via llvm-bugs


Issue

122774




Summary

libcxx presubmit should be tested against minimal supported glibc (2.24)




  Labels
  
libc++
  



  Assignees
  
  



  Reporter
  
  zeroomega
  




https://libcxx.llvm.org/#platform-and-compiler-support states that the minimum supported glibc on Linux is glibc 2.24, which is from Debian 9 Stretch. However, all libcxx presubmit builders use a Docker image based on Ubuntu 22.04, which has glibc 2.35; see:

```
➜  ~ docker run -it --entrypoint bash 'ghcr.io/llvm/libcxx-linux-builder:b9a2658a3e8bd13b0f9e7a8a440832a95b377216'
runner@0b34c8fcc3c5:~$ /usr/lib/x86_64-linux-gnu/libc.so.6
GNU C Library (Ubuntu GLIBC 2.35-0ubuntu3.8) stable release version 2.35.
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 11.4.0.
libc ABIs: UNIQUE IFUNC ABSOLUTE
For bug reporting instructions, please see:
.
```

This could cause issues, as none of the presubmit builders would catch a change that breaks glibc 2.24. We should explicitly use the minimum supported glibc on at least some, if not all, of the libcxx presubmit builders for Linux.

FYI, Ubuntu 18.04 LTS Bionic has glibc 2.27. Ubuntu 16.04 LTS Xenial has glibc 2.23.
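
A builder could guard against drifting away from the minimum version with a small check like the one below. This is only a sketch: `glibc_at_least` and `running_on_supported_glibc` are hypothetical helpers, and the 2.24 threshold is taken from the documentation cited above. `gnu_get_libc_version()` is the glibc function that reports the runtime version string.

```cpp
#include <cassert>
#include <cstdio>

// Parses "MAJOR.MINOR" and reports whether it is at least min_major.min_minor.
bool glibc_at_least(const char* version, int min_major, int min_minor) {
  int major = 0, minor = 0;
  if (std::sscanf(version, "%d.%d", &major, &minor) != 2) return false;
  return major > min_major || (major == min_major && minor >= min_minor);
}

#if defined(__GLIBC__)
#include <gnu/libc-version.h>
// On glibc systems, compare the runtime version against the documented minimum.
bool running_on_supported_glibc() {
  return glibc_at_least(gnu_get_libc_version(), 2, 24);
}
#endif
```

Applied to the versions mentioned in this report, 2.35, 2.27, and 2.24 pass the check and 2.23 does not.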



