On Mon, Jan 02, 2023 at 08:45:15AM -0700, Jeff Law via Gcc-patches wrote: > On 1/1/23 08:55, Roger Sayle wrote: > > In 2011, the rtl.texi documentation was updated to reflect that the > > modes of the RTX unary operations FFS, POPCOUNT and PARITY must > > match those of their operands. Unfortunately, some of the transformations > > in simplify-rtx.cc predate this tightening of RTL semantics, and have > > not (until now) been updated/fixed. i.e. The POPCOUNT and PARITY > > optimizations were "correct" when I originally added them back in 2007. > > > > Segher requested that I split this piece out from a fix for PR 106594 in > > https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601501.html > > > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > > and make -k check, both with and without --target_board=unix{-m32}, > > with no new failures. Ok for mainline? > > > > > > 2023-01-01 Roger Sayle <ro...@nextmovesoftware.com> > > > > gcc/ChangeLog > > * gcc/simplify-rtx.cc (simplify_unary_operation_1) <case FFS>: > > Avoid generating FFS with mismatched operand and result modes, by > > using an explicit SIGN_EXTEND/ZERO_EXTEND instead. > > <case POPCOUNT>: Likewise, for POPCOUNT of ZERO_EXTEND. > > <case PARITY>: Likewise, for PARITY of {ZERO,SIGN}_EXTEND. > ?!? The docs still seem to indicate to me that the modes of the input and > output operands can differ. Let's take PARITY as an example:
See the PR50161 thread in https://gcc.gnu.org/legacy-ml/gcc-patches/2011-08/threads.html#01847 The options are to disallow different modes, which is what my patch did (perhaps not all documentation has been tweaked), or ensure that the operand of those is never constant. The latter is much harder and needs to be done in many places. While for SUBREG/ZERO_EXTEND/SIGN_EXTEND and to some extend also FLOAT/UNSIGNED_FLOAT we already try hard not to fold those immediately (and still find every now and then spots where we don't do that), for the rarely used unary rtls we certainly don't. > In fact Raphael and I were about to submit a patch which takes advantage of > that capability to improve the code slightly for risc-v. Just use a pattern with zero_extend or sign_extend around it or subreg of it? Jakub