On Thu, Jul 30, 2020 at 01:55:03PM +0100, Roger Sayle wrote:
> Now that you mention it, I'm not sure whether PR rtl-optimization 94543
> is a bug at all, but with you and Richard Henderson weighing in, I suspect
> that I must be missing something subtle.
>
> The initial text of the bug report complains about an AND of 0xff not being
> optimized away, as a zero extension is still present.  I believe that the AND
> with 0xff has been completely optimized away, and the remaining movzwl
> (not movzbl) is the zero extension of the short result to the integer ABI
> return type.  If the result is written as a short to memory, there is no
> extension (or AND).
>
> But perhaps the problem isn't with min/and at all, but about whether function
> arguments and return values are guaranteed to be extended on input/return,
> so perhaps this is related to SUBREG_PROMOTED_P and friends?
>
> Thoughts?

Yeah, given that the RTL performs the computation in HImode rather than
SImode, an extension is needed at least from the RTL POV, because there is
no guarantee what the behavior will be for the upper bits of the GPR.

(insn 10 6 9 (set (reg:HI 88)
        (const_int 255 [0xff])) "x.c":1:56 -1
     (nil))
...
(insn 11 9 12 (set (reg:HI 87)
        (if_then_else:HI (leu (reg:CC 17 flags)
                (const_int 0 [0]))
            (reg/v:HI 84 [ x ])
            (reg:HI 88))) "x.c":1:56 -1
     (nil))

If these two insns (or just one of them) were emitted as
movw $255, %ax or cmova %ax, %di rather than
movl $255, %eax or cmova %eax, %edi, then the upper 16 bits would contain
the previous content rather than being cleared.

So, if we want to get rid of the useless zero extension, we'd need some
target-specific pass or hook or whatever that would query instruction
attributes/whatever to figure out what can or can't be trusted to be zero.
E.g. the conditions under which movhi_internal uses a mode attribute of SI
vs. HI are quite complicated, but I guess hopefully it is reliable.
I'm afraid for many insns it is not.
And then there is also the side knowledge that pretty much all insns for
TARGET_64BIT with GPRs zero extend if they are 32-bit insns, but the
question is whether one can trust the mode attribute for those.

Another possibility would be to move the zero extensions earlier in this
case, i.e. use set (reg:SI 88) (const_int 255) and a 32-bit conditional
move rather than a 16-bit one, and in those cases, perhaps with the help of
range information during expansion, we could create SUBREGs and use
SUBREG_PROMOTED_* on those.

	Jakub
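
To make the partial-register point above concrete, here is a minimal C
sketch.  It is not GCC code; it only models a GPR as a 64-bit integer.  The
names gpr_t, write16, write32, read32 and zext16 are hypothetical, made up
for this illustration.  The only behavior assumed is the one described
above: a 16-bit write leaves the upper bits of the register untouched,
while a 32-bit write on x86-64 clears the upper half.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one x86-64 general-purpose register. */
typedef uint64_t gpr_t;

/* Models a 16-bit write such as "movw $imm, %ax":
   only bits 0..15 change, everything above keeps its old value. */
static gpr_t write16(gpr_t reg, uint16_t value)
{
    return (reg & ~(gpr_t)0xffff) | value;
}

/* Models a 32-bit write such as "movl $imm, %eax":
   the old register content is discarded, bits 32..63 become zero. */
static gpr_t write32(gpr_t reg, uint32_t value)
{
    (void)reg;              /* previous content does not survive */
    return (gpr_t)value;
}

/* What a caller sees when it reads the register as a 32-bit value. */
static uint32_t read32(gpr_t reg)
{
    return (uint32_t)reg;
}

/* Models "movzwl": zero extend the low 16 bits to 32 bits. */
static uint32_t zext16(gpr_t reg)
{
    return (uint32_t)(reg & 0xffff);
}
```

With a register that previously held 0xdeadbeefcafef00d, write16 of 255
followed by read32 yields 0xcafe00ff, so the explicit zext16 (the movzwl)
is what makes the result 255.  After write32 of 255 the register already
reads as 255, which is the case where the extension would be redundant.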