On 5/29/24 5:12 AM, Mariam Arutunian wrote:


    IIRC we looked at the problem of canonicalizing the loop into a form
    where we didn't necessarily have conditional blocks, instead we had
    branchless sequences for the conditional xor and dealing with the high
    bit in the crc.  My recollection was that the coremark CRC loop would
    always canonicalize, but that in general we still saw multiple CRC
    implementations that did not canonicalize and thus we still needed the
    more complex matching.  Correct?


The loop in CoreMark is not fully canonicalized in that form,
as there are still branches present for the conditional XOR operation.
I checked that using the -O2 and -O3 flags.
A bit of a surprise. Though it may be the case that some of the canonicalization steps are happening later in the pipeline. No worries as I think we'd already concluded that we'd see at least some CRC implementations that wouldn't canonicalize down to branchless sequences for the conditional xor.




     > +
     > +gimple *
     > +crc_optimization::find_shift_after_xor (tree xored_crc)
     > +{
     > +  imm_use_iterator imm_iter;
     > +  use_operand_p use_p;
     > +
     > +  if (TREE_CODE (xored_crc) != SSA_NAME)
     > +    return nullptr;
    If we always expect XORED_CRC to be an SSA_NAME, we might be able to use
    gcc_assert TREE_CODE (XORED_CRC) == SSA_NAME);

I'm not sure that it always has to be an SSA_NAME.
For a logical operation like XOR it should always have the form

SSA_NAME = SSA_NAME ^ (SSA_NAME | CONSTANT)

The constant might be a vector constant, but the basic form won't change. It's one of the nicer properties of gimple. In contrast RTL would allow a variety of lvalues and rvalues, including MEMs, REGs, SUBREGs, extensions, other binary ops, etc etc.


     > +
     > +/* Set M_PHI_FOR_CRC and M_PHI_FOR_DATA fields.
     > +   Returns false if there are more than two (as in CRC
    calculation only CRC's
     > +   and data's phi may exist) or no phi statements in STMTS (at
    least there must
     > +   be CRC's phi).
     > +   Otherwise, returns true.  */
     > +
     > +bool
     > +crc_optimization::set_crc_and_data_phi (auto_vec<gimple *> &stmts)
     > +{
     > +  for (auto stmt_it = stmts.begin (); stmt_it != stmts.end ();
    stmt_it++)
     > +    {
     > +      if (is_a<gphi *> (*stmt_it) && bb_loop_header_p (gimple_bb
    (*stmt_it)))
     > +     {
     > +       if (!m_phi_for_crc)
     > +         m_phi_for_crc = as_a<gphi *> (*stmt_it);
     > +       else if (!m_phi_for_data)
     > +         m_phi_for_data = as_a<gphi *> (*stmt_it);
     > +       else
     > +         {
     > +           if (dump_file && (dump_flags & TDF_DETAILS))
     > +             fprintf (dump_file, "Xor-ed variable depends on
    more than 2 "
     > +                                 "phis.\n");
     > +           return false;
     > +         }
     > +     }
     > +    }
     > +  return m_phi_for_crc;
    Hmm.  For a given PHI, how do we know if it's for the data item or the
    crc item, or something else (like a loop counter) entirely?



I trace the def-use chain upwards from the XOR statement to determine which PHI node corresponds to CRC and data. Since we assume the loop calculates CRC, I expect only variables representing data and CRC to participate in these operations. In the implementations I support, the loop counter is used only for the iteration. Any misidentification of CRC and data would occur only if the loop doesn't calculate CRC, in which case next checks would fail, leading the algorithm to identify it as not CRC.

Here, the PHI nodes for CRC and data might be mixed in places.
I just assume that the first found PHI is CRC, second data.
I correctly determine them later with the | *swap_crc_and_data_if_needed*| function.
Ah, OK.  That probably deserves a comment in this code.


jeff

Reply via email to