https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56829
Peter Cordes <peter at cordes dot ca> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |peter at cordes dot ca

--- Comment #3 from Peter Cordes <peter at cordes dot ca> ---
x86 has packed-compare and movemask instructions, but it also has a PTEST instruction that sets flags directly from the result of a vector op. In some cases it's more efficient than movemask + test/jcc (esp. if you can make use of the AND / ANDN ops it does, instead of just testing a vector against itself).

I recently wrote an answer on Stack Overflow comparing PTEST vs. PCMPEQB / PMOVMSKB for comparing two vectors for equality. PTEST had lower latency, but only equal or fewer uops, even though this case was ideal for PTEST.
http://stackoverflow.com/a/31198132/224132

Just something to keep in mind when designing gcc's arch-agnostic vector support: at least x86 can branch on a vector PTEST directly, without needing any compare / movemask. Requiring things to be written in terms of a movemask wouldn't be horrible for x86, though.
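For reference, the two approaches to a 128-bit equality test can be sketched with intrinsics roughly as follows. This is only an illustration, not code from the Stack Overflow answer: the function names `equal_movemask` and `equal_ptest` are made up here, the `target` attributes assume GCC or Clang, and the PTEST version assumes a CPU with SSE4.1. The non-x86 branch is just a `memcmp` placeholder so the sketch builds elsewhere.

```c
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Compare + movemask: PCMPEQB, PMOVMSKB, then a scalar compare to branch on. */
__attribute__((target("sse2")))
int equal_movemask(const void *p, const void *q)
{
    __m128i a  = _mm_loadu_si128((const __m128i *)p);
    __m128i b  = _mm_loadu_si128((const __m128i *)q);
    __m128i eq = _mm_cmpeq_epi8(a, b);        /* 0xFF in every byte that matches */
    return _mm_movemask_epi8(eq) == 0xFFFF;   /* all 16 bytes matched */
}

/* PTEST: XOR the inputs, then PTEST sets ZF when the result is all-zero. */
__attribute__((target("sse4.1")))
int equal_ptest(const void *p, const void *q)
{
    __m128i a    = _mm_loadu_si128((const __m128i *)p);
    __m128i b    = _mm_loadu_si128((const __m128i *)q);
    __m128i diff = _mm_xor_si128(a, b);       /* zero iff a == b */
    return _mm_testz_si128(diff, diff);       /* 1 iff (diff & diff) == 0 */
}

#else
/* Non-x86 placeholder so the sketch still builds; not part of the comparison. */
int equal_movemask(const void *p, const void *q) { return memcmp(p, q, 16) == 0; }
int equal_ptest(const void *p, const void *q)    { return memcmp(p, q, 16) == 0; }
#endif
```

Both functions take pointers to 16 bytes and return 1 when the two blocks are equal. The PTEST version here only tests a vector against itself; PTEST also computes a AND b (for ZF) and (NOT a) AND b (for CF), which `_mm_testz_si128(a, mask)` and `_mm_testc_si128(a, b)` expose when a separate mask operand is useful.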