https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95906
Bug ID: 95906
Summary: Failure to recognize max pattern with mask
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---

#include <stdint.h>

typedef int8_t v16i8 __attribute__((__vector_size__(16)));

v16i8 f(v16i8 a, v16i8 b)
{
    v16i8 cmp = (a > b);
    return (cmp & a) | (~cmp & b);
}

int f2(int a, int b)
{
    int cmp = -(a > b);
    return (cmp & a) | (~cmp & b);
}

`f` can be optimized to `__builtin_ia32_pmaxsb128` (on x86 with `-msse4`; the `pmax` instructions can be used for the same pattern with similar types), and `f2` can be optimized to a `MAX_EXPR`. The two are essentially the same pattern; I've included the vectorized form because I originally found it in a function written for SSE, before SSE4 existed. LLVM performs these transformations, but GCC does not.