https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118206

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
It is in fold_truth_andor_for_ifcombine.
I'll note that the first_bit/end_bit determination code seems to ignore the
[lr]r_and_mask.  We have 16-bit access with lr_and_mask 15, and 8-bit access
with rr_and_mask -8.  So if we figured out that for the 16-bit access with
lr_and_mask 15 on little endian we really need just the 8-bit access too, we
could emit more efficient code.
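
To make the narrowing argument concrete, here is a minimal sketch in C.  The
helper names are hypothetical (this is not GCC code), the comparison constant 8
is chosen for illustration, and the 16-bit value is assembled explicitly from
two bytes so that the example does not depend on host endianness.  It checks
that, for a little-endian layout, masking the 16-bit access with 15 gives the
same answer as masking only the low byte:

```c
#include <stdint.h>

/* Assemble the 16-bit value a little-endian target would load from two
   consecutive bytes (hypothetical helper, not part of GCC).  */
static uint16_t
le16 (uint8_t low_byte, uint8_t high_byte)
{
  return (uint16_t) (low_byte | (high_byte << 8));
}

/* The test as written on the 16-bit access.  */
static int
test_wide (uint8_t low_byte, uint8_t high_byte)
{
  return (le16 (low_byte, high_byte) & 15) == 8;
}

/* The same test using only an 8-bit access to the low byte.  */
static int
test_narrow (uint8_t low_byte)
{
  return (low_byte & 15) == 8;
}

/* Count the byte pairs on which the two forms disagree.  */
static unsigned
count_mismatches (void)
{
  unsigned bad = 0;
  for (unsigned lo = 0; lo < 256; lo++)
    for (unsigned hi = 0; hi < 256; hi++)
      if (test_wide ((uint8_t) lo, (uint8_t) hi)
          != test_narrow ((uint8_t) lo))
        bad++;
  return bad;
}
```

count_mismatches () returning 0 means the high byte never influences the
result, so the 8-bit access would suffice for this mask.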

Anyway, the wrong-code issue is elsewhere in that function.
ll_align is just 8, not 16, because it performs a possibly unaligned 16-bit load,
MEM <unsigned short> [(char * {ref-all})x_5(D)]
p debug_tree (ll_inner->typed.type)
 <integer_type 0x7fffea2e82a0 public unsigned HI
    size <integer_cst 0x7fffea14e120 type <integer_type 0x7fffea14c0a8 bitsizetype> constant 16>
    unit-size <integer_cst 0x7fffea14e138 type <integer_type 0x7fffea14c000 sizetype> constant 2>
    user align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7fffea2e81f8 precision:16 min <integer_cst 0x7fffea318300 0> max <integer_cst 0x7fffea3182e8 65535>>
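
For readers wondering where a 16-bit load with only byte alignment comes from:
one common way to write such a load in C is a memcpy from a byte pointer, which
makes no alignment promise about the source address.  Whether the testcase in
this PR does exactly this is an assumption here, and the function name is
hypothetical:

```c
#include <stdint.h>
#include <string.h>

/* Read a 16-bit value from an arbitrary, possibly odd, address.  Because
   the source is only known to be byte-aligned, a compiler may represent
   this as a 16-bit memory access whose type has 8-bit alignment.  */
static uint16_t
load_u16_unaligned (const void *p)
{
  uint16_t v;
  memcpy (&v, p, sizeof v);
  return v;
}
```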

There also doesn't seem to be any special case for when ll_inner, as is, already
includes all the bits that need to be queried.  In that case it perhaps isn't
worth trying get_best_mode etc. at all, and we could just use the ll_inner
result (at least if it dominates both tests).

Anyway, because of this we decide to split the load, i.e.
      /* If we can't have a single load, but can with two, figure out whether
         the two compares can be separated, i.e., whether the entirety of the
         first original compare is encompassed by the entirety of the first
         combined compare.  If the first original compare is past the alignment
         boundary, arrange to compare that range first, by setting first1
         (meaning make cmp[1] first, instead of cmp[0]).  */
      l_split_load = true;
      parts = 2;
So far so good (although inefficient).
But, we really should use (low_byte & 143) == 8 && (high_byte & 0) == 0
rather than && (high_byte & 15) == 0.
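
The difference between the two forms can be checked exhaustively.  The sketch
below is not GCC code: the struct and helper names are hypothetical, and it
assumes a little-endian layout in which the low byte of the 16-bit value is the
first byte in memory.  It splits a combined 16-bit test (x & mask) == value into
per-byte tests and counts the inputs on which a given pair of byte tests
disagrees with the combined test:

```c
#include <stdint.h>

/* One per-byte masked compare: (byte & mask) == value.  */
struct byte_test
{
  uint8_t mask;
  uint8_t value;
};

/* Split a 16-bit masked compare into tests on the low and high bytes.  */
static void
split_test (uint16_t mask, uint16_t value,
            struct byte_test *low, struct byte_test *high)
{
  low->mask = (uint8_t) (mask & 0xff);
  low->value = (uint8_t) (value & 0xff);
  high->mask = (uint8_t) (mask >> 8);
  high->value = (uint8_t) (value >> 8);
}

/* Count the byte pairs on which the pair of byte tests disagrees with
   the combined 16-bit test.  */
static unsigned
count_disagreements (uint16_t mask, uint16_t value,
                     struct byte_test low, struct byte_test high)
{
  unsigned bad = 0;
  for (unsigned lo = 0; lo < 256; lo++)
    for (unsigned hi = 0; hi < 256; hi++)
      {
        uint16_t x = (uint16_t) (lo | (hi << 8));
        int combined = (x & mask) == value;
        int split = ((lo & low.mask) == low.value
                     && (hi & high.mask) == high.value);
        if (combined != split)
          bad++;
      }
  return bad;
}
```

With mask 143 and value 8, split_test yields a low-byte mask of 143 and a
high-byte mask of 0, and count_disagreements reports no mismatches.  Replacing
the high-byte mask with 15 makes the split form reject inputs that the combined
test accepts.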

BTW, the formatting is off:
      if (l_split_load)
            {
              gimple *point[2];
              point[0] = ll_load;
              point[1] = rl_load;
              build_split_load (ld_arg[0], bitpos[0], bitsiz[0], toshift[0],
                                shifted[0], rl_loc[3], ll_inner, ll_arg,
                                lnmode, lnmode2, lnbitpos, ll_reversep, point);
            }
