[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2018-11-20 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44551
Richard Biener changed:
What       |Removed  |Added
Status     |REOPENED |RESOLVED
Resolution |---

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2018-11-19 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44551 --- Comment #21 from Marc Glisse --- (In reply to Matthias Kretz from comment #20)
> The original issue I meant to report is fixed. There are many more missed
> optimizations in the original example, though.
ok, your choice if you prefer to clos

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2018-11-19 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44551 --- Comment #20 from Matthias Kretz --- The original issue I meant to report is fixed. There are many more missed optimizations in the original example, though. I.e. https://godbolt.org/z/7P1o3O should compile to:
use_insert_extract():
        vmovdqu

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2018-11-19 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44551 --- Comment #19 from Marc Glisse --- (In reply to Matthias Kretz from comment #18)
> FWIW, the issue is resolved on trunk. GCC8.2 still has the missed
> optimization: https://godbolt.org/z/hbgIIi
If I use exactly the testcase from the original d

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2018-11-19 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44551 --- Comment #18 from Matthias Kretz --- FWIW, the issue is resolved on trunk. GCC8.2 still has the missed optimization: https://godbolt.org/z/hbgIIi

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2018-11-19 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44551
Marc Glisse changed:
What       |Removed  |Added
Status     |RESOLVED |REOPENED
Resolution |FIXED

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2018-11-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44551
Martin Liška changed:
What       |Removed  |Added
Status     |NEW      |RESOLVED
CC|

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2016-02-11 Thread helloqirun at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44551 Qirun Zhang changed: What|Removed |Added CC||helloqirun at gmail dot com --- Comment #1

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2014-07-26 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44551 --- Comment #14 from Marc Glisse ---
Author: glisse
Date: Sat Jul 26 09:00:31 2014
New Revision: 213076
URL: https://gcc.gnu.org/viewcvs?rev=213076&root=gcc&view=rev
Log:
2014-07-26 Marc Glisse
PR target/44551
gcc/
* simplify-rtx.c (

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2014-06-10 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44551 --- Comment #13 from Marc Glisse --- Created attachment 32915 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=32915&action=edit simplify vec_select(vec_concat) A simpler/safer version of the patch linked in comment #12 (untested). It optimi

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2012-12-01 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44551 Marc Glisse changed: What|Removed |Added CC||glisse at gcc dot gnu.org --- Com

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2010-06-28 Thread hjl dot tools at gmail dot com
--- Comment #11 from hjl dot tools at gmail dot com 2010-06-28 19:17 --- Testcase is
[...@gnu-6 44551]$ cat c.c
#include <immintrin.h>
__m128i
foo (__m256i x, __m128i y)
{
  __m256i r = _mm256_insertf128_si256(x, y, 1);
  __m128i a = _mm256_extractf128_si256(r, 1);
  return a;
}
[...@gnu-6 44551]$
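The testcase in comment #11 inserts y into the upper half of x and then immediately extracts that same half, so the result should simply be y and neither instruction is needed. The following is a minimal portable sketch of that identity, not the actual intrinsics: model128, model256, model_insert128, model_extract128, model_foo and foo_returns_y are hypothetical names that model the 128-bit halves as plain 64-bit lanes, so it builds without AVX support.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for __m128i and __m256i: plain 64-bit lanes. */
typedef struct { uint64_t q[2]; } model128;
typedef struct { uint64_t q[4]; } model256;

/* Models _mm256_insertf128_si256(x, y, lane):
   replace one 128-bit half of x (0 = lower, 1 = upper) with y. */
static model256 model_insert128(model256 x, model128 y, int lane)
{
    memcpy(&x.q[2 * (lane & 1)], y.q, sizeof y.q);
    return x;
}

/* Models _mm256_extractf128_si256(x, lane):
   return one 128-bit half of x (0 = lower, 1 = upper). */
static model128 model_extract128(model256 x, int lane)
{
    model128 r;
    memcpy(r.q, &x.q[2 * (lane & 1)], sizeof r.q);
    return r;
}

/* Same shape as foo() in the testcase: insert into half 1, extract half 1. */
static model128 model_foo(model256 x, model128 y)
{
    model256 r = model_insert128(x, y, 1);
    return model_extract128(r, 1);
}

/* Returns 1 when model_foo(x, y) hands back y unchanged, 0 otherwise. */
static int foo_returns_y(void)
{
    model256 x = {{1, 2, 3, 4}};
    model128 y = {{0xAAAA, 0xBBBB}};
    model128 a = model_foo(x, y);
    return a.q[0] == y.q[0] && a.q[1] == y.q[1];
}
```

Since the extracted half is exactly the inserted value, an optimizer that sees through both operations can reduce the function to returning y, which is the simplification this bug asks for.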

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2010-06-28 Thread hjl dot tools at gmail dot com
--- Comment #10 from hjl dot tools at gmail dot com 2010-06-28 19:17 --- Here is a small testcase:
[...@gnu-6 44551]$ cat c.s
        .file "c.c"
        .text
        .p2align 4,,15
        .globl foo
        .type foo, @function
foo:
.LFB798:
        .cfi_startproc
        pushq %rbp

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2010-06-17 Thread pinskia at gcc dot gnu dot org
--- Comment #9 from pinskia at gcc dot gnu dot org 2010-06-18 00:49 --- (In reply to comment #8)
> Can we use subreg instead of vec_select?
Kinda, you need to do triple subregs, first to an integer mode and then to a smaller integer mode and then to the other vector mode. subreg on vec

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2010-06-17 Thread hjl dot tools at gmail dot com
--- Comment #8 from hjl dot tools at gmail dot com 2010-06-18 00:46 --- Can we use subreg instead of vec_select? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44551

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2010-06-17 Thread hjl dot tools at gmail dot com
--- Comment #7 from hjl dot tools at gmail dot com 2010-06-17 22:01 --- Created an attachment (id=20934) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20934&action=view) A patch to split cast Here is a patch to split cast. But it doesn't remove redundant vinsertf128/vextractf128.

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2010-06-16 Thread kretz at kde dot org
--- Comment #6 from kretz at kde dot org 2010-06-16 21:21 --- (In reply to comment #4)
> You can also cast 128bit to 256bit with upper 128bit undefined.
If you cast from xmm to ymm after a 128-bit instruction coded with a VEX prefix, then the upper 128 bits are actually guaranteed to be zero.

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2010-06-16 Thread pinskia at gcc dot gnu dot org
--- Comment #5 from pinskia at gcc dot gnu dot org 2010-06-16 20:46 --- (In reply to comment #4)
> You can cast 256bit to 128bit to get the lower 128bit.
This can be represented using vec_select and then later on, using a split (after reload), turned into a move.
> You can also cas

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2010-06-16 Thread hjl dot tools at gmail dot com
--- Comment #4 from hjl dot tools at gmail dot com 2010-06-16 20:42 --- You can cast 256bit to 128bit to get the lower 128bit. You can also cast 128bit to 256bit with upper 128bit undefined. If I use union, it will always generate 2 moves via memory. -- http://gcc.gnu.org/bugzilla/s
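The two casts and the union alternative mentioned in comment #4 can be illustrated with a small portable sketch. It is only a model under stated assumptions: half128, wide256, cast_256_to_128 and cast_128_to_256 are hypothetical names, the 128-bit halves are plain 64-bit lanes, and the widening cast zeroes the upper half purely to keep the model deterministic, whereas the comment describes that half as undefined. The real AVX cast intrinsics (for example _mm256_castsi256_si128 and _mm256_castsi128_si256) are not used here.

```c
#include <stdint.h>

/* Hypothetical stand-in for __m128i: two 64-bit lanes. */
typedef struct { uint64_t q[2]; } half128;

/* Hypothetical stand-in for __m256i, written as the union comment #4
   alludes to: the same storage viewed as four lanes or as two halves. */
typedef union {
    uint64_t q[4];
    struct { half128 lo, hi; } h;
} wide256;

/* 256-bit to 128-bit "cast": keep only the lower 128 bits. */
static half128 cast_256_to_128(wide256 v)
{
    return v.h.lo;
}

/* 128-bit to 256-bit "cast": the lower 128 bits come from v.
   The upper half is zeroed here only so the model is deterministic;
   per the comment it is undefined in the real cast. */
static wide256 cast_128_to_256(half128 v)
{
    wide256 r = { .q = {0, 0, 0, 0} };
    r.h.lo = v;
    return r;
}
```

Going through a union like this is the variant that, according to the comment, always produced two moves via memory, which is why the casts were modelled separately in the compiler.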

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2010-06-16 Thread pinskia at gcc dot gnu dot org
--- Comment #3 from pinskia at gcc dot gnu dot org 2010-06-16 20:00 --- Well, for one, you could have a splitter for the case which_alternative == 0 so that a reg rename can do its magic. Also, what does UNSPEC_CAST really do? From the looks of it, it is just a move, which you could use a spl

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2010-06-16 Thread hjl dot tools at gmail dot com
--- Comment #2 from hjl dot tools at gmail dot com 2010-06-16 19:50 --- The problem is UNSPEC_CAST. There is no good way to model it. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44551

[Bug target/44551] [missed optimization] AVX vextractf128 after vinsertf128

2010-06-16 Thread rguenth at gcc dot gnu dot org
--- Comment #1 from rguenth at gcc dot gnu dot org 2010-06-16 09:02 --- This is probably missing combiner patterns in sse.md. -- rguenth at gcc dot gnu dot org changed: What|Removed |Added ---