https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118206
--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---

It is in fold_truth_andor_for_ifcombine.

I'll note that the first_bit/end_bit determination code seems to ignore the [lr]r_and_mask. We have 16-bit access with lr_and_mask 15, and 8-bit access with rr_and_mask -8. So if we figured out that for the 16-bit access with lr_and_mask 15 on little endian we really need just the 8-bit access too, we could emit more efficient code.

Anyway, the wrong-code issue is elsewhere in that function. ll_align is just 8, not 16, because it performs a possibly unaligned 16-bit load,

    MEM <unsigned short> [(char * {ref-all})x_5(D)]

    p debug_tree (ll_inner->typed.type)
     <integer_type 0x7fffea2e82a0 public unsigned HI
        size <integer_cst 0x7fffea14e120 type <integer_type 0x7fffea14c0a8 bitsizetype> constant 16>
        unit-size <integer_cst 0x7fffea14e138 type <integer_type 0x7fffea14c000 sizetype> constant 2>
        user align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7fffea2e81f8 precision:16 min <integer_cst 0x7fffea318300 0> max <integer_cst 0x7fffea3182e8 65535>>

And there doesn't seem to be some special case for when ll_inner as is already includes all the bits that need to be queried, in that case perhaps it isn't worth it to try get_best_mode etc. at all and we could just use ll_inner result (at least if it dominates both tests).

Anyway, because of this we decide to split the load, i.e.

    /* If we can't have a single load, but can with two, figure out whether
       the two compares can be separated, i.e., whether the entirety of the
       first original compare is encompassed by the entirety of the first
       combined compare.  If the first original compare is past the alignment
       boundary, arrange to compare that range first, by setting first1
       (meaning make cmp[1] first, instead of cmp[0]).  */
    l_split_load = true;
    parts = 2;

So far so good (although inefficient). But, we really should use

    (low_byte & 143) == 8 && (high_byte & 0) == 0

rather than

    && (high_byte & 15) == 0.
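To make the difference between the two split forms concrete, here is a small standalone C sketch (not GCC code). It assumes the two original tests are `(v & 15) == 8` on the 16-bit value plus a test that bit 7 of the low byte is clear; the second test is only inferred from 143 = 15 | 128 and is not spelled out in the comment above. All function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: split a mask that applies to a 16-bit value into
   the masks that apply to its low and high bytes (by value, which matches
   the in-memory byte order on little endian).  */
static void
split_mask16 (uint16_t mask, uint8_t *lo, uint8_t *hi)
{
  *lo = (uint8_t) (mask & 0xff);
  *hi = (uint8_t) (mask >> 8);
}

/* Assumed reference semantics on the whole 16-bit value.  */
static int
reference_test (uint16_t v)
{
  return (v & 15) == 8 && (v & 0x80) == 0;
}

/* Split form with the high-byte mask derived from 15 >> 8 == 0.  */
static int
split_expected (uint8_t low, uint8_t high)
{
  return (low & 143) == 8 && (high & 0) == 0;
}

/* Split form that reuses the unshifted mask 15 on the high byte.  */
static int
split_wrong (uint8_t low, uint8_t high)
{
  return (low & 143) == 8 && (high & 15) == 0;
}

/* Count the 16-bit inputs on which a split form disagrees with the
   reference test.  */
static int
count_mismatches (int (*split) (uint8_t, uint8_t))
{
  int n = 0;
  for (unsigned v = 0; v <= 0xffff; v++)
    {
      uint8_t low = (uint8_t) (v & 0xff);
      uint8_t high = (uint8_t) (v >> 8);
      if (split (low, high) != reference_test ((uint16_t) v))
        n++;
    }
  return n;
}
```

Under these assumptions, `split_mask16 (15, ...)` yields a high-byte mask of 0, `split_expected` agrees with the reference on every input, and `split_wrong` rejects inputs whose high byte has any of its low four bits set even though the original tests never look at those bits.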
BTW, the formatting is off:

    if (l_split_load)
      {
        gimple *point[2];
        point[0] = ll_load;
        point[1] = rl_load;
        build_split_load (ld_arg[0], bitpos[0], bitsiz[0],
                          toshift[0], shifted[0], rl_loc[3], ll_inner, ll_arg,
                          lnmode, lnmode2, lnbitpos, ll_reversep, point);
      }