[Bug target/117860] New: GCC emits an unnecessary mov for x86 _addcarry/_subborrow intrinsic calls where the second operand is a constant that is within the range of a 32-bit integer

2024-11-30 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117860 Bug ID: 117860 Summary: GCC emits an unnecessary mov for x86 _addcarry/_subborrow intrinsic calls where the second operand is a constant that is within the range of a

[Bug target/113484] Add support for _Float16 type on PowerPC

2024-07-29 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113484 --- Comment #2 from John Platts --- (In reply to Joseph S. Myers from comment #1) > It would of course be necessary to define the ABI used for _Float16 (and > _Complex _Float16) argument passing and return (in each PowerPC ABI for > which we sup

[Bug target/114944] Codegen of __builtin_shuffle for an 16-byte uint8_t vector is suboptimal on SSE2

2024-05-06 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114944 --- Comment #2 from John Platts --- Here is more optimal codegen for SSE2ShuffleI8 on x86_64: SSE2ShuffleI8(long long __vector(2), long long __vector(2)): pand xmm1, XMMWORD PTR .LC0[rip] movaps XMMWORD PTR [rsp-24], xmm0

[Bug target/114944] Codegen of __builtin_shuffle for an 16-byte uint8_t vector is suboptimal on SSE2

2024-05-04 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114944 John Platts changed: What: Target | Removed: (none) | Added: x86_64-*-*, i?86-*-* --- Comment #1 from

[Bug target/114944] New: Codegen of __builtin_shuffle for an 16-byte uint8_t vector is suboptimal on SSE2

2024-05-04 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114944 Bug ID: 114944 Summary: Codegen of __builtin_shuffle for an 16-byte uint8_t vector is suboptimal on SSE2 Product: gcc Version: 13.2.0 Status: UNCONFIRMED Sever

[Bug target/113484] New: Add support for _Float16 type on PowerPC

2024-01-18 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113484 Bug ID: 113484 Summary: Add support for _Float16 type on PowerPC Product: gcc Version: 12.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: targe

[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-09-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #11 from John Platts --- Created attachment 55869 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55869&action=edit Test program to reproduce GCC 12 compilation bug Here is the expected output of the ppc9_test_sat_add_090923.cp

[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-09-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #10 from John Platts --- Created attachment 55868 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55868&action=edit Test program to reproduce SatWidenMulPairwiseAdd compilation bug The ppc9_test_sat_widen_pairwise_add_090923_2b

[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-09-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #9 from John Platts --- Created attachment 55867 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55867&action=edit Test program to reproduce SatWidenMulPairwiseAdd compilation bug The attached ppc9_test_sat_widen_pairwise_add_0

[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-08-10 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #6 from John Platts --- Need to use revision ff1ad85a96c0bc8483b582d6dbceb8bc07edd226 of Google Highway to reproduce the PPC9 codegen bug with GCC 12 as the TestSatWidenMulPairwiseAdd will now pass on PPC9 due to a recent update to T

[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-08-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #5 from John Platts --- The version of Google Highway with the TestSatWidenMulPairwiseAdd changes to get TestSatWidenMulPairwiseAdd to pass successfully on POWER9 with the "-mcpu=power9 -DHWY_DISABLED_TARGETS=6918232715082858496 -DHW

[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-08-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #4 from John Platts --- I had made some changes to TestSatWidenMulPairwiseAdd in hwy/tests/mul_test.cc that would get TestSatWidenMulPairwiseAdd to pass successfully on POWER9 when compiled with GCC 12 with the "-mcpu=power9 -DHWY_DI

[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-08-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #3 from John Platts --- Here is the output of running the "./tests/mul_test" program in the Google Highway test suite when compiled with the "-mcpu=power8 -DHWY_DISABLED_TARGETS=6917951240106147840" options when compiled with GCC 12:

[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-08-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #2 from John Platts --- Created attachment 55711 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55711&action=edit Test program to reproduce SatWidenMulPairwiseAdd compilation bug (requires CMake and Google Highway)

[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway Test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-08-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #1 from John Platts --- Created attachment 55710 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55710&action=edit Test program to reproduce SatWidenMulPairwiseAdd compilation bug The attached ppc9_sat_widen_mul_pairwise_add_te

[Bug target/110960] New: TestSatWidenMulPairwiseAdd in the Google Highway Test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-08-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 Bug ID: 110960 Summary: TestSatWidenMulPairwiseAdd in the Google Highway Test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option Product: gcc

[Bug target/110741] New: vec_ternarylogic intrinsic generates incorrect code on POWER10 target when compiled with GCC

2023-07-19 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110741 Bug ID: 110741 Summary: vec_ternarylogic intrinsic generates incorrect code on POWER10 target when compiled with GCC Product: gcc Version: 12.1.1 Status: UNCONFIRMED

[Bug target/109069] Vector truncation test program produces incorrect result on big-endian powerpc64-linux-gnu with -mcpu=power10 -O2

2023-03-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109069 --- Comment #5 from John Platts --- Here is another test program that shows the same code generation bug when a splat followed by a vec_sld is incorrectly optimized by gcc 12.2.0 on powerpc64-linux-gnu and powerpc64le-linux-gnu with the -mcpu=po

[Bug target/109069] Vector truncation test program produces incorrect result on big-endian powerpc64-linux-gnu with -mcpu=power10 -O2

2023-03-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109069 --- Comment #4 from John Platts --- Here is another test program that exposes the optimization bug with applying the vec_sl operation to a constant vector (which generates incorrect results on both big-endian and little-endian POWER10 when compi

[Bug target/109069] Vector truncation test program produces incorrect result on big-endian powerpc64-linux-gnu with -mcpu=power10 -O2

2023-03-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109069 --- Comment #3 from John Platts --- Here is another test program that reproduces the vector truncation test issue: #pragma push_macro("vector") #pragma push_macro("pixel") #pragma push_macro("bool") #undef vector #undef pixel #undef bool #incl

[Bug target/109069] Vector truncation test program produces incorrect result on big-endian powerpc64-linux-gnu with -mcpu=power10 -O2

2023-03-08 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109069 --- Comment #1 from John Platts --- The C++ test program below does generate the correct results when compiled with the -mcpu=power10 -O0 options.

[Bug target/109069] New: Vector truncation test program produces incorrect result on big-endian powerpc64-linux-gnu with -mcpu=power10 -O2

2023-03-08 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109069 Bug ID: 109069 Summary: Vector truncation test program produces incorrect result on big-endian powerpc64-linux-gnu with -mcpu=power10 -O2 Product: gcc Version: 1

[Bug target/108614] New: _subborrow_u32 generates suboptimal code when second subtraction operand is constant on x86 targets

2023-01-31 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108614 Bug ID: 108614 Summary: _subborrow_u32 generates suboptimal code when second subtraction operand is constant on x86 targets Product: gcc Version: 12.2.0 Status: UNCONFIR

[Bug target/105354] New: __builtin_shuffle for alignr generates suboptimal code unless SSSE3 is enabled

2022-04-22 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105354 Bug ID: 105354 Summary: __builtin_shuffle for alignr generates suboptimal code unless SSSE3 is enabled Product: gcc Version: 11.2.0 Status: UNCONFIRMED Keyword

[Bug c++/105353] New: __builtin_shufflevector with template parameter fails to compile on GCC 12 but compiles on clang

2022-04-22 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105353 Bug ID: 105353 Summary: __builtin_shufflevector with template parameter fails to compile on GCC 12 but compiles on clang Product: gcc Version: 12.0 Status: UNCONFIRMED

[Bug target/103611] GCC generates suboptimal code for SSE2/SSE4.1 64-bit integer element extraction on 32-bit x86 targets

2021-12-07 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103611 --- Comment #4 from John Platts --- (In reply to Andrew Pinski from comment #3) > Hmm, GCC 4.8.1-5.5.0 produces: > long long SSE2ExtractInt64<0>(long long __vector): > .LFB499: > .cfi_startproc > pshufd xmm1, xmm0, 1 > m

[Bug target/103611] GCC generates suboptimal code for SSE2/SSE4.1 64-bit integer element extraction on 32-bit x86 targets

2021-12-07 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103611 --- Comment #2 from John Platts --- Here is some code for extracting 64-bit integers from a SSE2 vector using GCC vector extensions: #include #include using Int64M128Vect [[__gnu__::__vector_size__(16)]] = std::int64_t; template std::int64_t

[Bug target/103611] GCC generates suboptimal code for SSE2/SSE4.1 64-bit integer element extraction on 32-bit x86 targets

2021-12-07 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103611 --- Comment #1 from John Platts --- Here is some C++ code for extracting 64-bit integers from a __m128i vector using SSE4.1: #include #include template std::int64_t SSE41ExtractInt64(__m128i vect) noexcept { static_assert(ElemIdx == (Elem

[Bug target/103611] New: GCC generates suboptimal code for SSE2/SSE4.1 64-bit integer element extraction on 32-bit x86 targets

2021-12-07 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103611 Bug ID: 103611 Summary: GCC generates suboptimal code for SSE2/SSE4.1 64-bit integer element extraction on 32-bit x86 targets Product: gcc Version: 11.2.0 Status: UNCONF