On Tue, 19 May 2020, Uros Bizjak wrote:

> Hello!
>
> Attached patch adds missing vector zero/sign_extend expanders to allow
> vectorization of operations between different vector sizes.
>
> The patch regresses (progresses?):
>
> FAIL: gcc.target/i386/pr92645-4.c scan-tree-dump-times optimized
> "vec_unpack_lo" 3
>
> but eyeballing the asm code before/after the patch, we get much better:
>
> .L3:
> -       vmovdqu   (%rsi,%rax), %xmm6
> -       vpxor     %xmm5, %xmm5, %xmm5
> -       vmovdqa   %ymm5, -32(%rsp)
> -       vmovdqa   %xmm6, -32(%rsp)
> -       vpmovzxbw -32(%rsp), %ymm0
> +       vpmovzxbw (%rsi,%rax), %ymm0
>         vpmullw   %ymm4, %ymm0, %ymm0
>         vpaddw    %ymm2, %ymm0, %ymm0
>         vpsrlw    $8, %ymm0, %ymm0
>
> and even more differences to a much better code in the loop prologue.
>
> (Please note a strange double-save to a stack slot in the old code).
>
> Richi, I guess that the testcase you introduced needs some adjustment.
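
For illustration only: the source of pr92645-4.c is not quoted in this thread, so the following is not that testcase. It is a minimal sketch of the kind of loop where a byte-to-word zero extension feeds 16-bit arithmetic, which is the shape of the vpmovzxbw / vpmullw / vpaddw / vpsrlw sequence quoted above. The function name `scale_bytes` and its parameters are hypothetical, and whether a given GCC version vectorizes it exactly this way depends on the target flags.

```c
#include <stddef.h>

/* Hypothetical example, not taken from the GCC testsuite.
   Each input byte is zero-extended to 16 bits, multiplied,
   biased, and shifted right by 8, all in 16-bit arithmetic.  */
void scale_bytes(unsigned short *dst, const unsigned char *src,
                 unsigned short mul, unsigned short add, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Zero extension from 8 to 16 bits (QI -> HI).  */
        unsigned short wide = src[i];
        /* Truncate the product and sum to 16 bits before shifting.  */
        dst[i] = (unsigned short)(wide * mul + add) >> 8;
    }
}
```

Compiling something like this with -O2 -ftree-vectorize -mavx2 and comparing the assembly before and after the patch is one way to see whether the memory operand is folded into the extension.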

I will deal with the FAIL once you commit the patch.  The testcase is
for forwprop code, which indeed also knows how to exercise those missing
patterns.  IIRC I filed the PR when working on those (and may in turn
remove the VEC_UNPACK_* support from forwprop again!)

> As discussed in the PR, there are a couple of XFAILs, where the
> compiler is not able to vectorize the code. The named expanders are
> there, but for the reason, explained in PR comment #8, middle-end
> doesn't exercise them.

OK, so we should track this in a separate PR?  Can you point to the
specific expander and the XFAILed testcases there?

Thanks a lot!
Richard.

> gcc/ChangeLog:
>
> 2020-05-19  Uroš Bizjak  <ubiz...@gmail.com>
>
>     PR target/92658
>     * config/i386/sse.md (<code>v16qiv16hi2): New expander.
>     (<code>v32qiv32hi2): Ditto.
>     (<code>v8qiv8hi2): Ditto.
>     (<code>v16qiv16si2): Ditto.
>     (<code>v8qiv8si2): Ditto.
>     (<code>v4qiv4si2): Ditto.
>     (<code>v16hiv16si2): Ditto.
>     (<code>v8hiv8si2): Ditto.
>     (<code>v4hiv4si2): Ditto.
>     (<code>v8qiv8di2): Ditto.
>     (<code>v4qiv4di2): Ditto.
>     (<code>v2qiv2di2): Ditto.
>     (<code>v8hiv8di2): Ditto.
>     (<code>v4hiv4di2): Ditto.
>     (<code>v2hiv2di2): Ditto.
>     (<code>v8siv8di2): Ditto.
>     (<code>v4siv4di2): Ditto.
>     (<code>v2siv2di2): Ditto.
>
> testsuite/ChangeLog:
>
> 2020-05-19  Uroš Bizjak  <ubiz...@gmail.com>
>
>     PR target/92658
>     * gcc.target/i386/pr92658-sse4.c: New test.
>     * gcc.target/i386/pr92658-avx2.c: New test.
>     * gcc.target/i386/pr92658-avx512bw.c: New test.
>
> Patch is bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> Uros.
>

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)