https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64015
--- Comment #2 from Zhenqiang Chen <zhenqiang.chen at arm dot com> ---
You force it into a register? In fact, I tend not to force it into a register in gen_ccmp_next, since that introduces more overhead for ccmp, whose performance may then be worse.

My patch to fix the issue is at:
https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02966.html

For CCMP, we still miss two optimizations:

1) Change the order of the compares. In this case, if you change it to
     b > 252 && a > 10
   you don't need "mov w0, 252":

        uxtb    w1, w1
        uxtb    w0, w0
        cmp     w1, 252
        ccmp    w0, 10, 0, hi
        cset    w0, hi
        ret

2) How do we justify that it is valuable (that the overhead of ccmp is OK) when generating ccmp?
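
To illustrate point 1: the reordering is safe because neither operand of the && has side effects, so both orders give the same result. It helps because, as far as I understand the AArch64 encoding, the immediate form of ccmp only takes a 5-bit constant (0 to 31), while cmp takes a 12-bit one. So 252 fits in cmp but not in ccmp, and 10 fits in both. A minimal sketch follows; the function names and the unsigned char parameter types are assumptions inferred from the uxtb instructions above, and the original compare order is assumed to be the reverse of the one suggested.

```c
#include <assert.h>

/* Assumed original order: 252 lands in the ccmp, whose immediate field
   is too small for it, so it needs a "mov w0, 252" first. */
int original_order(unsigned char a, unsigned char b)
{
    return a > 10 && b > 252;
}

/* Suggested order: 252 goes to the plain cmp, 10 to the ccmp. */
int swapped_order(unsigned char a, unsigned char b)
{
    return b > 252 && a > 10;
}

/* Exhaustively check all 256 * 256 input pairs for agreement. */
void verify_orderings_agree(void)
{
    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++)
            assert(original_order((unsigned char)a, (unsigned char)b)
                   == swapped_order((unsigned char)a, (unsigned char)b));
}
```

Calling verify_orderings_agree() only confirms that the two source forms are equivalent. Whether the compiler should do the swap itself is the open question in point 1.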