On Fri, Dec 23, 2022 at 12:09 AM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> With many thanks to H.J. for doing all the hard work, this patch resolves
> two P1 regressions; PR target/106933 and PR target/106959.
>
> Although superficially similar, the i386 backend's two scalar-to-vector
> (STV) passes perform their transformations in importantly different ways.
> The original pass converting SImode and DImode operations to V4SImode
> or V2DImode operations is "soft", allowing values to be maintained in
> both integer and vector hard registers.  The newer pass converting TImode
> operations to V1TImode is "hard" (all or nothing) that converts all uses
> of a pseudo to vector form.  To implement this it invokes powerful ju-ju
> calling SET_MODE on a REG_rtx, which due to RTL sharing, often updates
> this pseudo's mode everywhere in the RTL chain.  Hence, TImode STV can only
> be performed when all uses of a pseudo are convertible to V1TImode form.
> To ensure this the STV passes currently use data-flow analysis to inspect
> all DEFs and USEs in a chain.  This works fine for chains that are in
> the usual single assignment form, but the occurrence of uninitialized
> variables, or multiple assignments that split a pseudo's usage into
> several independent chains (lifetimes) can lead to situations where
> some but not all of a pseudo's occurrences need to be updated.  This is
> safe for the SImode/DImode pass, but leads to the above bugs during
> the TImode pass.
>
> My one minor tweak to HJ's patch from comment #4 of bugzilla PR106959
> is to only perform the new single_def_chain_p check for TImode STV; it
> turns out that STV of SImode/DImode min/max operates safely on multiple-def
> chains, and prohibiting this leads to testsuite regressions.  We don't
> (yet) support V1TImode min/max, so this idiom isn't an issue during the
> TImode STV pass.
>
> For the record, the two alternate possible fixes are (i) make the TImode
> STV pass "soft", by eliminating use of SET_MODE, instead using replace_rtx
> with a new pseudo, or (ii) merging "chains" so that multiple DFA
> chains/lifetimes are considered a single STV chain.

I assume these two alternatives would result in much more invasive surgery,
so let's consider these "for the future".

> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Ok for mainline?
>
>
> 2022-12-22  H.J. Lu  <hjl.to...@gmail.com>
>             Roger Sayle  <ro...@nextmovesoftware.com>
>
> gcc/ChangeLog
>         PR target/106933
>         PR target/106959
>         * config/i386/i386-features.cc (single_def_chain_p): New predicate
>         function to check that a pseudo's use-def chain is in SSA form.
>         (timode_scalar_to_vector_candidate_p): Check that TImode regs that
>         are SET_DEST or SET_SRC of an insn match/are single_def_chain_p.
>
> gcc/testsuite/ChangeLog
>         PR target/106933
>         PR target/106959
>         * gcc.target/i386/pr106933-1.c: New test case.
>         * gcc.target/i386/pr106933-2.c: Likewise.
>         * gcc.target/i386/pr106959-1.c: Likewise.
>         * gcc.target/i386/pr106959-2.c: Likewise.
>         * gcc.target/i386/pr106959-3.c: Likewise.

OK.

Thanks,
Uros.
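
For anyone reading the thread later, the single-definition condition described in the quoted message can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyInsn, count_defs, toy_single_def_p and toy_hard_candidate_p are hypothetical names invented for illustration, pseudos are plain integers, and none of it uses GCC's RTL or data-flow APIs or reproduces the code in the patch. It shows why a pseudo with zero definitions (uninitialized) or several definitions (multiple lifetimes) has to block an all-or-nothing mode change.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an insn: at most one pseudo written, any number read.
struct ToyInsn {
    int dest;               // pseudo number written by this insn, or -1 if none
    std::vector<int> srcs;  // pseudo numbers read by this insn
};

// Count how many insns in the function write pseudo `regno`.
int count_defs(const std::vector<ToyInsn>& insns, int regno) {
    int n = 0;
    for (const ToyInsn& insn : insns)
        if (insn.dest == regno)
            ++n;
    return n;
}

// True only when `regno` has exactly one definition.  Zero definitions
// (an uninitialized pseudo) and two or more (several lifetimes) both fail.
bool toy_single_def_p(const std::vector<ToyInsn>& insns, int regno) {
    return count_defs(insns, regno) == 1;
}

// In this toy model, an insn may take part in an all-or-nothing ("hard")
// conversion only if every pseudo it writes or reads has a single definition,
// because changing the pseudo's mode would affect every occurrence at once.
bool toy_hard_candidate_p(const std::vector<ToyInsn>& insns, const ToyInsn& insn) {
    if (insn.dest >= 0 && !toy_single_def_p(insns, insn.dest))
        return false;
    for (int src : insn.srcs)
        if (!toy_single_def_p(insns, src))
            return false;
    return true;
}
```

In this model an insn that reads a pseudo assigned in two places is rejected even if the insn itself is convertible, which is the conservative behaviour the patch description asks for in the TImode pass only.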