12] Add a new pass for naive CRC loops detection

Jeff Law Sat, 08 Jun 2024 15:00:58 -0700



On 5/29/24 5:12 AM, Mariam Arutunian wrote:


    IIRC we looked at the problem of canonicalizing the loop into a form
    where we didn't necessarily have conditional blocks, instead we had
    branchless sequences for the conditional xor and dealing with the high
    bit in the crc.  My recollection was that the coremark CRC loop would
    always canonicalize, but that in general we still saw multiple CRC
    implementations that did not canonicalize and thus we still needed the
    more complex matching.  Correct?


The loop in CoreMark is not fully canonicalized in that form,
as there are still branches present for the conditional XOR operation.
I checked that using the -O2 and -O3 flags.

A bit of a surprise. Though it may be the case that some of thecanonicalization steps are happening later in the pipeline. No worriesas I think we'd already concluded that we'd see at least some CRCimplementations that wouldn't canonicalize down to branchless sequencesfor the conditional xor.



     > +
     > +gimple *
     > +crc_optimization::find_shift_after_xor (tree xored_crc)
     > +{
     > +  imm_use_iterator imm_iter;
     > +  use_operand_p use_p;
     > +
     > +  if (TREE_CODE (xored_crc) != SSA_NAME)
     > +    return nullptr;
    If we always expect XORED_CRC to be an SSA_NAME, we might be able to use
    gcc_assert TREE_CODE (XORED_CRC) == SSA_NAME);

I'm not sure that it always has to be an SSA_NAME.

For a logical operation like XOR it should always have the form

SSA_NAME = SSA_NAME ^ (SSA_NAME | CONSTANT)

The constant might be a vector constant, but the basic form won'tchange. It's one of the nicer properties of gimple. In contrast RTLwould allow a variety of lvalues and rvalues, including MEMs, REGs,SUBREGs, extensions, other binary ops, etc etc.

     > +
     > +/* Set M_PHI_FOR_CRC and M_PHI_FOR_DATA fields.
     > +   Returns false if there are more than two (as in CRC
    calculation only CRC's
     > +   and data's phi may exist) or no phi statements in STMTS (at
    least there must
     > +   be CRC's phi).
     > +   Otherwise, returns true.  */
     > +
     > +bool
     > +crc_optimization::set_crc_and_data_phi (auto_vec<gimple *> &stmts)
     > +{
     > +  for (auto stmt_it = stmts.begin (); stmt_it != stmts.end ();
    stmt_it++)
     > +    {
     > +      if (is_a<gphi *> (*stmt_it) && bb_loop_header_p (gimple_bb
    (*stmt_it)))
     > +     {
     > +       if (!m_phi_for_crc)
     > +         m_phi_for_crc = as_a<gphi *> (*stmt_it);
     > +       else if (!m_phi_for_data)
     > +         m_phi_for_data = as_a<gphi *> (*stmt_it);
     > +       else
     > +         {
     > +           if (dump_file && (dump_flags & TDF_DETAILS))
     > +             fprintf (dump_file, "Xor-ed variable depends on
    more than 2 "
     > +                                 "phis.\n");
     > +           return false;
     > +         }
     > +     }
     > +    }
     > +  return m_phi_for_crc;
    Hmm.  For a given PHI, how do we know if it's for the data item or the
    crc item, or something else (like a loop counter) entirely?

I trace the def-use chain upwards from the XOR statement to determinewhich PHI node corresponds to CRC and data.Since we assume the loop calculates CRC, I expect only variablesrepresenting data and CRC to participate in these operations.In the implementations I support, the loop counter is used only for theiteration.Any misidentification of CRC and data would occur only if the loopdoesn't calculate CRC, in which case next checks would fail, leading thealgorithm to identify it as not CRC.


Here, the PHI nodes for CRC and data might be mixed in places.
I just assume that the first found PHI is CRC, second data.

I correctly determine them later with the |*swap_crc_and_data_if_needed*| function.

Ah, OK.  That probably deserves a comment in this code.


jeff

Re: [RFC/RFA] [PATCH 08/12] Add a new pass for naive CRC loops detection

Reply via email to