https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109156
--- Comment #2 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #1)
> (In reply to Tamar Christina from comment #0)
> > 2. It looks like all targets that implement SAD do so with an instruction
> > that does ABD and then performs a reduction. So it looks like no target has
> > the semantics for SAD.
>
> x86 for example does SAD on 16 QImode data and 4 SImode accumulators which
> means it sums 4 QImode absolute differences each (but SAD_EXPR leaves
> unspecified which, so SAD_EXPR is only usable when you in the end sum
> the accumulator lanes as well).
>

Oh I see, psadbw is actually SAD. Sorry, I missed the `s` in the instruction!

> So I don't think 2. is true.
>
> > So this brings up the question of why the detection wasn't done based on ABD
> > instead and leaving the reduction explicit in the vectorizer.
> >
> > So question is, should we create a completely new standalone pattern for ABD
> > or should we make ABD the thing being detected and change SAD_EXPR to
> > recognize ABD + reduction.
> >
> > Removing SAD completely in favor of ABD + reduction means that hand
> > optimized versions in targets need updating so I'm in favor of still
> > emitting SAD.
>
> I'd do a separate internal function for ABD, possibly sharing part of the
> detection as you proposed.

Great, will do so, thanks!
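
For readers following the ABD vs. SAD distinction, here is a minimal scalar C sketch of the two semantics. It is only an illustration, not GCC code: all function names are hypothetical, and the choice that accumulator lane i/4 receives element i is an assumption made for the example (SAD_EXPR leaves that grouping unspecified, and real instructions may group differently). It shows why a SAD result is only meaningful once the accumulator lanes are summed, at which point it matches ABD followed by an explicit reduction.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define N 16 /* 16 QImode (8-bit) elements, as in the x86 example */

/* ABD: lane-wise absolute difference, no reduction. */
static inline void abd_u8(const uint8_t *a, const uint8_t *b, uint8_t *out)
{
    for (int i = 0; i < N; i++)
        out[i] = (uint8_t)abs((int)a[i] - (int)b[i]);
}

/* SAD-style operation: add absolute differences into 4 SImode (32-bit)
   accumulators.  ASSUMPTION: element i goes to lane i / 4.  */
static inline void sad_u8(const uint8_t *a, const uint8_t *b, uint32_t acc[4])
{
    for (int i = 0; i < N; i++)
        acc[i / 4] += (uint32_t)abs((int)a[i] - (int)b[i]);
}

/* Final reduction over the accumulator lanes. */
static inline uint32_t reduce_u32x4(const uint32_t acc[4])
{
    return acc[0] + acc[1] + acc[2] + acc[3];
}

/* ABD followed by an explicit widening reduction. */
static inline uint32_t abd_then_reduce(const uint8_t *a, const uint8_t *b)
{
    uint8_t d[N];
    uint32_t sum = 0;

    abd_u8(a, b, d);
    for (int i = 0; i < N; i++)
        sum += d[i];
    return sum;
}

/* Check that both formulations agree once the lanes are summed. */
static inline void self_check(void)
{
    uint8_t a[N], b[N];
    uint32_t acc[4] = { 0, 0, 0, 0 };

    for (int i = 0; i < N; i++) {
        a[i] = (uint8_t)(i * 10);
        b[i] = (uint8_t)(255 - i);
    }
    sad_u8(a, b, acc);
    assert(reduce_u32x4(acc) == abd_then_reduce(a, b));
}
```

Calling `self_check()` exercises both paths on one input. The individual values in `acc` depend on the assumed grouping, but their sum does not, which is the property the vectorizer relies on.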