[Bug target/87455] sse_packed_single_insn_optimal is suboptimal on Zen

2020-01-04 Thread fanael4 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87455 --- Comment #6 from Fanael --- Any hope of getting this fixed in GCC 10? It should just be a matter of removing Zen[12] from X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL.

2018-10-12 Thread fanael4 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87455 --- Comment #5 from Fanael --- Created attachment 44829 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44829&action=edit WIP patch > We already have TARGET_SSE_TYPELESS_STORES for stores, so perhaps we want > something like typeless reg-r

2018-09-28 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87455 --- Comment #4 from Jan Hubicka --- We already have TARGET_SSE_TYPELESS_STORES for stores, so perhaps we want something like typeless reg-reg moves and loads flag? Honza

2018-09-28 Thread fanael4 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87455 --- Comment #3 from Fanael --- > Maybe we should remove the xorps generation part. If it were up to me, I'd keep it for BDVER[1234] only, because xorps is still one byte shorter than either xorpd or pxor and is as fast there, and introduce a separa

2018-09-28 Thread vekumar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87455 --- Comment #2 from vekumar at gcc dot gnu.org --- This tuning was intended to generate movups instead of movupd, as movups is 1 byte shorter than movupd. Maybe we should remove the xorps generation part.

2018-09-27 Thread fanael4 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87455 --- Comment #1 from Fanael --- Assembly diff between the two:

--- /dev/fd/63 2018-09-27 17:59:06.120507763 +0200
+++ /dev/fd/62 2018-09-27 17:59:06.120507763 +0200
@@ -7,21 +7,21 @@
 main:
 .LFB5179:
 	.cfi_startproc
-	movaps	.LC0