[Bug c++/99893] New: C++20 unexpanded parameter packs falsely not detected (lambda is involved)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99893

            Bug ID: 99893
           Summary: C++20 unexpanded parameter packs falsely not detected
                    (lambda is involved)
           Product: gcc
           Version: 10.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bisqwit at iki dot fi
  Target Milestone: ---

GCC produces a false error message:

bug1.cc: In function ‘consteval void VerifyHash()’:
bug1.cc:20:70: error: operand of fold expression has no unexpanded parameter packs
   20 |       [](){static_assert(hash(s.data(), s.size()) == expected_hash);}()
      |                                                                   ~~~^~

On this code:

#include <algorithm> // copy_n and size_t
static constexpr unsigned hash(const char* s, std::size_t length)
{
    s=s;
    return length;
}
template<std::size_t N>
struct fixed_string
{
    constexpr fixed_string(const char (&s)[N]) { std::copy_n(s, N, str); }
    consteval const char* data() const { return str; }
    consteval std::size_t size() const { return N-1; }
    char str[N];
};
template<unsigned expected_hash, fixed_string... s>
static consteval void VerifyHash()
{
    (
      [](){static_assert(hash(s.data(), s.size()) == expected_hash);}()
    ,...);
    // ^ Falsely reports that there are no unexpanded parameter packs,
    //   while there definitely is ("s" is used).
}
void foo()
{
    VerifyHash<5, "khaki", "plums">();
}

Compiler version:
g++-10 (Debian 10.2.1-6) 10.2.1 20210110
[Bug c++/99895] New: Function parameters generated wrong in call to member of non-type template parameter in lambda
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99895

            Bug ID: 99895
           Summary: Function parameters generated wrong in call to member
                    of non-type template parameter in lambda
           Product: gcc
           Version: 10.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bisqwit at iki dot fi
  Target Milestone: ---

GCC produces a false error message:

bug1.cc: In instantiation of ‘consteval void VerifyHash() [with unsigned int expected_hash = 5; fixed_string<...auto...> ...s = {fixed_string<6>{"khaki"}, fixed_string<6>{"plums"}}]’:
bug1.cc:24:37:   required from here
bug1.cc:19:41: error: no matching function for call to ‘fixed_string<6>::data(const fixed_string<6>*)’
   19 |       [](auto){static_assert(hash(s.data(), s.size()) == expected_hash);}(s)
      |                                       ~~^~
bug1.cc:11:27: note: candidate: ‘consteval const char* fixed_string::data() const [with long unsigned int N = 6]’
   11 |     consteval const char* data() const { return str; }
      |                           ^~~~
bug1.cc:11:27: note:   candidate expects 0 arguments, 1 provided

On this code:

#include <algorithm> // copy_n and size_t
static constexpr unsigned hash(const char* s, std::size_t length)
{
    s=s;
    return length;
}
template<std::size_t N>
struct fixed_string
{
    constexpr fixed_string(const char (&s)[N]) { std::copy_n(s, N, str); }
    consteval const char* data() const { return str; }
    consteval std::size_t size() const { return N-1; }
    char str[N];
};
template<unsigned expected_hash, fixed_string... s>
static consteval void VerifyHash()
{
    (
      [](auto){static_assert(hash(s.data(), s.size()) == expected_hash);}(s)
    ,...);
    // The compiler mistakenly translates s.data() into s.data(&s)
    // and then complains that the call is not valid, because
    // the function expects 0 parameters and 1 "was provided".
}
void foo()
{
    VerifyHash<5, "khaki", "plums">();
}

Compiler version:
g++-10 (Debian 10.2.1-6) 10.2.1 20210110
[Bug tree-optimization/116013] New: Missed optimization opportunity with andn involving consts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116013

            Bug ID: 116013
           Summary: Missed optimization opportunity with andn involving
                    consts
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bisqwit at iki dot fi
  Target Milestone: ---

Below are two short functions which work identically. While GCC utilizes the
ANDN instruction (of Intel BMI1) for test2, it fails to see that it could do
the same with test1.

#include <stdint.h>
uint64_t test1(uint64_t value)
{
    return ~(value | 0x7F7F7F7F7F7F7F7F);
}
uint64_t test2(uint64_t value)
{
    return ~value & ~0x7F7F7F7F7F7F7F7F;
}

Assembler listings of both functions are below (-Ofast -mbmi):

test1:
        movabsq $9187201950435737471, %rdx
        movq    %rdi, %rax
        orq     %rdx, %rax
        notq    %rax
        ret
test2:
        movabsq $-9187201950435737472, %rax
        andn    %rax, %rdi, %rax
        ret

Tested compiler version:
GCC: (Debian 14-20240330-1) 14.0.1 20240330 (experimental) [master
r14-9728-g6fc84f680d0]

This optimization only makes sense if one of the operands is a compile-time
constant. If neither operand is a compile-time constant, then the opposite
optimization makes more sense, which GCC already does.

It is also worth noting that GCC already compiles ~(var1 | ~var2) into
~var1 & var2, utilizing ANDN. This is good.
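
The equivalence of the two forms in this report is De Morgan's law applied to a constant operand. A minimal sketch that checks it at compile time follows; the names kMask, or_then_not, andn_form and check_samples are hypothetical and only illustrate test1 and test2 from the report, they are not part of GCC or of the original test case.

```cpp
#include <cassert>
#include <cstdint>

// The constant from the report, named here for illustration only.
constexpr std::uint64_t kMask = 0x7F7F7F7F7F7F7F7FULL;

// Form written in test1: OR with the constant, then invert.
constexpr std::uint64_t or_then_not(std::uint64_t value)
{
    return ~(value | kMask);
}

// Form written in test2: inverted value ANDed with the inverted constant.
// ~kMask folds to a constant, which leaves a single and-not operation.
constexpr std::uint64_t andn_form(std::uint64_t value)
{
    return ~value & ~kMask;
}

// The inverted constant is the immediate seen in the test2 listing
// (there printed as a signed number).
static_assert(~kMask == 0x8080808080808080ULL, "inverted mask");
static_assert(or_then_not(0) == andn_form(0), "forms agree on 0");
static_assert(or_then_not(~0ULL) == andn_form(~0ULL), "forms agree on all-ones");

// Runtime spot check over a few arbitrary sample inputs.
inline void check_samples()
{
    const std::uint64_t samples[] = {0, 1, 0x80, 0xFF, 0x0123456789ABCDEFULL, ~0ULL};
    for (std::uint64_t v : samples)
        assert(or_then_not(v) == andn_form(v));
}
```

Calling check_samples() from any test driver exercises the runtime check; the static_assert lines are verified by simply compiling the file.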
[Bug tree-optimization/116014] New: Missed optimization opportunity: inverted shift count
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116014

            Bug ID: 116014
           Summary: Missed optimization opportunity: inverted shift count
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bisqwit at iki dot fi
  Target Milestone: ---

Below are six short functions which perform bit-shifts by a non-constant
inverted amount. GCC fails to generate the optimal code. Further explanation
is given below the assembler code.

#include <stdint.h>
uint64_t shl_m64(uint64_t value, uint8_t k)
{
    return value << (64-k);
}
uint64_t shl_m63(uint64_t value, uint8_t k)
{
    return value << (63-k);
}
uint64_t shr_m64(uint64_t value, uint8_t k)
{
    return value >> (64-k);
}
uint64_t shr_m63(uint64_t value, uint8_t k)
{
    return value >> (63-k);
}
int64_t asr_m64(int64_t value, uint8_t k)
{
    return value >> (64-k);
}
int64_t asr_m63(int64_t value, uint8_t k)
{
    return value >> (63-k);
}

Below is the code generated by GCC, using -Ofast -mbmi2 -masm=intel.
BMI2 is used just to make the assembler code more succinct; it is not
relevant for the report.

shl_m64:
        mov     eax, 64
        sub     eax, esi
        shlx    rax, rdi, rax
        ret
shl_m63:
        mov     eax, 63
        sub     eax, esi
        shlx    rax, rdi, rax
        ret
shr_m64:
        mov     eax, 64
        sub     eax, esi
        shrx    rax, rdi, rax
        ret
shr_m63:
        mov     eax, 63
        sub     eax, esi
        shrx    rax, rdi, rax
        ret
asr_m64:
        mov     eax, 64
        sub     eax, esi
        sarx    rax, rdi, rax
        ret
asr_m63:
        mov     eax, 63
        sub     eax, esi
        sarx    rax, rdi, rax
        ret

GCC fails to utilize the fact that on Intel, the shift instructions
automatically mask the shift count to the width of the target register.
That is, a shift of a 64-bit operand by 68 is the same as a shift by
68%64 = 4, and a shift of a 32-bit operand by 100 is the same as a shift by
100%32 = 4. Utilizing this knowledge permits the use of a single-insn neg/not
to replace the subtract, which requires two insns.

In comparison, Clang (version 16) produces this (optimal) code:

shl_m64:
        neg     sil
        shlx    rax, rdi, rsi
        ret
shl_m63:
        not     sil
        shlx    rax, rdi, rsi
        ret
shr_m64:
        neg     sil
        shrx    rax, rdi, rsi
        ret
shr_m63:
        not     sil
        shrx    rax, rdi, rsi
        ret
asr_m64:
        neg     sil
        sarx    rax, rdi, rsi
        ret
asr_m63:
        not     sil
        sarx    rax, rdi, rsi
        ret

Tested GCC version:
GCC: (Debian 14-20240330-1) 14.0.1 20240330 (experimental) [master
r14-9728-g6fc84f680d0]
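
The neg/not replacement rests on modular arithmetic: once the count is reduced to its low 6 bits, 64-k is congruent to -k and 63-k is congruent to ~k. The sketch below checks this for every 8-bit k. It models the masking explicitly with `& 63`, because shifting a 64-bit value by 64 or more is undefined in C and C++ itself; masked_count, via_sub64, via_neg, via_sub63, via_not and all_counts_agree are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Model of the hardware behaviour described in the report: a 64-bit
// shift only looks at the low 6 bits of the count.
constexpr unsigned masked_count(unsigned count) { return count & 63u; }

// Count as computed by the mov+sub sequence.
constexpr unsigned via_sub64(std::uint8_t k) { return masked_count(64u - k); }
constexpr unsigned via_sub63(std::uint8_t k) { return masked_count(63u - k); }

// Count as computed by a single neg or not on the 8-bit register.
constexpr unsigned via_neg(std::uint8_t k)
{
    return masked_count(static_cast<std::uint8_t>(-k));
}
constexpr unsigned via_not(std::uint8_t k)
{
    return masked_count(static_cast<std::uint8_t>(~k));
}

// Exhaustive comparison over all 256 possible values of k.
constexpr bool all_counts_agree()
{
    for (int i = 0; i < 256; ++i) {
        const auto k = static_cast<std::uint8_t>(i);
        if (via_sub64(k) != via_neg(k)) return false;
        if (via_sub63(k) != via_not(k)) return false;
    }
    return true;
}

static_assert(all_counts_agree(), "neg/not give the same masked count as 64-k / 63-k");
static_assert(masked_count(68u) == 4u, "68 % 64 == 4, as in the report");
```

The file needs C++14 or later for the constexpr loop. The check says nothing about targets whose shift instructions do not mask the count, so it only supports the transformation where that masking is guaranteed.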
[Bug middle-end/116013] Missed optimization opportunity with andn involving consts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116013

--- Comment #1 from Joel Yliluoma ---
It should be noted that this is not x86_64 specific; andn exists for other
platforms too, and even for platforms that don’t have it, changing
`~(expr|const)` into `~expr & ~const` is unlikely to be a pessimization.
[Bug target/116014] Missed optimization opportunity: inverted shift count
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116014

--- Comment #2 from Joel Yliluoma ---
(In reply to Andi Kleen from comment #1)
> is that from some real code? why would a programmer write shifts like that?

Yes, it is from actual code:

uint64_t readvlq()
{
    uint64_t x, f = ~(uint64_t)0, ones8 = f / 255, pat80 = ones8*0x80, pat7F=ones8*0x7F;
    memcpy(&x, ptr, sizeof(x));
    uint8_t n = __builtin_ctzll(~(x|pat7F)) + 1;
    ptr += n/8;
    return _pext_u64(x, pat7F >> (64-n));
}

This function reads a variable-length encoded integer (as in General MIDI)
from a bytestream without loops or branches. It essentially does the same as
this:

uint64_t readvlq()
{
    uint64_t result = 0;
    do {
        result = (result << 7) | (*ptr & 0x7F);
    } while(*ptr++ & 0x80);
    return result;
}

It isn’t too hard to think of other plausible cases where bitshifts with
numberofbits(tgt)-variable may occur. In fact, after just 2 minutes of
searching with `grep`, I found this line in LLVM
(llvm-17/llvm/Bitstream/BitstreamWriter.h), where CurValue is a 32-bit entity:

    CurValue = Val >> (32-CurBit);
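
For readers who want to run the loop form of the decoder, here is a self-contained sketch of it. The comment above reads from a `ptr` variable declared elsewhere; passing the cursor as a reference parameter, and the names read_vlq and decode, are assumptions made here so the example compiles on its own. Only the loop variant is shown, since the branchless variant depends on BMI2 intrinsics.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Loop form of the decoder: each byte contributes its low 7 bits,
// and a set top bit means that another byte follows.
// Hypothetical signature: the cursor is passed in and advanced.
std::uint64_t read_vlq(const std::uint8_t*& ptr)
{
    std::uint64_t result = 0;
    std::uint8_t byte;
    do {
        byte = *ptr++;
        result = (result << 7) | (byte & 0x7Fu);
    } while (byte & 0x80u);
    return result;
}

// Convenience wrapper for experiments: decode one value from a byte list.
// The list must end with a byte whose top bit is clear.
std::uint64_t decode(const std::vector<std::uint8_t>& bytes)
{
    const std::uint8_t* cursor = bytes.data();
    return read_vlq(cursor);
}

// Shows that the cursor advances past exactly the bytes consumed.
inline void check_cursor_advance()
{
    const std::uint8_t stream[] = {0x81, 0x00, 0x7F};
    const std::uint8_t* cursor = stream;
    assert(read_vlq(cursor) == 128u);   // two bytes: 0x81 0x00
    assert(cursor == stream + 2);
    assert(read_vlq(cursor) == 127u);   // one byte: 0x7F
    assert(cursor == stream + 3);
}
```

A two-byte input such as decode({0x81, 0x00}) yields 128, because the first byte supplies the upper 7 bits.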