Hi!

On Mon, Feb 17, 2020 at 09:54:58PM +0100, Marcus Geelnard wrote:
> On 2020-02-17 02:12, Segher Boessenkool wrote:
> >>Trying 10, 9 -> 11:
> >>Failed to match this instruction:
> >>(set (reg:SI 84)
> >>    (mult:SI (sign_extend:SI (subreg:HI (reg:SI 87) 0))
> >>        (sign_extend:SI (subreg:HI (reg:SI 86) 0))))
> >And neither do both together.  Do you have an instruction that can do
> >this?  How expensive is it?
>
> Unfortunately I don't have an instruction specifically for mult:SI
> (sign_extend:SI (HI)) (sign_extend:SI (HI)).
>
> >In the meantime, you can add a pattern for the result of 9+10+11:
> >
> >(set (match_operand:SI ...)
> >     (mult:SI (sign_extend:SI (match_operand:HI ...))
> >              (sign_extend:SI (match_operand:HI ...))))
> >
> >(which you then have to handle, of course, either with a machine insn
> >if that exists, or some other way, a libcall perhaps; you already have
> >some way to do mulsi3 I guess?)
>
> Yes, I have mulsi3, but the thing is that I have instructions that
> exactly match the definition of smulhsqi3/smulhshi3/smulhssi3. They're
> called MULQ.B, MULQ.H and MULQ (for Q-format fixed point), and they
> typically have a throughput of 1 operation / cycle. See [1]. I'd really
> like to find a way to tell gcc to emit those instructions.

Sure, but it won't go there in one step currently, since it would have
to combine four insns, and it doesn't try that.  But, if you add a pattern
for this mulsihi2, it can create that one first, and then combine that
single insn with the shift right.

If this mulsihi2 isn't combined in that way, you will have to split it
into separate extends and a mulsi3 later (or do it some other way, if
you can do it cheaper for example).

> If naive C code does not generate a matching pattern (it would be nice
> if it did, though), is there something else that can be used (e.g. a
> builtin)?

You can always create a builtin for it, sure.


Segher
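
For reference, the "naive C code" in question would presumably look
something like the sketch below: two sign extends, a 32-bit multiply,
and a shift right, which is the four-insn sequence combine would have
to merge.  The function name mulq15 is made up for illustration, the
shift count of 15 assumes the smulhshi3 definition as I read it in the
GCC internals documentation, and whether any given port turns this into
a single MULQ.H depends on its patterns.

```c
#include <stdint.h>

/* Hypothetical helper: Q15 fixed-point multiply written the naive way.
   Both operands are sign-extended to 32 bits, multiplied, and the
   product is shifted right by 15 before narrowing back to 16 bits.
   Assumes GCC behaviour: >> on a negative value is an arithmetic
   shift, and narrowing an out-of-range value wraps (no saturation). */
static int16_t mulq15(int16_t a, int16_t b)
{
    int32_t wide = (int32_t)a * (int32_t)b;  /* the mult:SI of two sign_extends */
    return (int16_t)(wide >> 15);            /* the shift right, then narrow */
}
```

With 16384 standing for 0.5 in Q15, mulq15(16384, 16384) gives 8192,
i.e. 0.25.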