https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117722

--- Comment #14 from Vineet Gupta <vineetg at gcc dot gnu.org> ---
(In reply to Li Pan from comment #7)
> Created attachment 59661 [details]
> with usad pattern

Can you please post the patch, lest we duplicate your effort.
It would be nice to test it on real hardware.

@Robin, it seems the current codegen generates two widening ops, which might
not be as efficient as it could be. We have done some profiling of widening-add
throughput, and Edwin's data suggests that the throughput might not be the same.
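
For reference, the kind of scalar loop a usad (unsigned sum of absolute
differences) pattern is meant to match looks roughly like the sketch below.
This is only an illustration under my own assumptions: the function name
sad_u8 and its signature are hypothetical and are not taken from the
attachment or from the testcase in this PR.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical example kernel: sum of absolute differences over two
   arrays of unsigned 8-bit values, accumulated into a 32-bit sum.
   The 8-bit inputs are widened before the subtraction so the
   difference cannot wrap.  */
uint32_t sad_u8(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (uint32_t)abs((int)a[i] - (int)b[i]);
    return sum;
}
```

The widening from 8-bit elements to the wider accumulator is where the
widening ops mentioned above would come from when such a loop is vectorized.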
