On 5/29/24 5:12 AM, Mariam Arutunian wrote:
IIRC we looked at the problem of canonicalizing the loop into a form
where we didn't necessarily have conditional blocks, instead we had
branchless sequences for the conditional xor and dealing with the high
bit in the crc. My recollection was that the coremark CRC loop would
always canonicalize, but that in general we still saw multiple CRC
implementations that did not canonicalize and thus we still needed the
more complex matching. Correct?
The loop in CoreMark is not fully canonicalized in that form,
as there are still branches present for the conditional XOR operation.
I checked that using the -O2 and -O3 flags.
A bit of a surprise. Though it may be the case that some of the
canonicalization steps are happening later in the pipeline. No worries
as I think we'd already concluded that we'd see at least some CRC
implementations that wouldn't canonicalize down to branchless sequences
for the conditional xor.
> +
> +gimple *
> +crc_optimization::find_shift_after_xor (tree xored_crc)
> +{
> + imm_use_iterator imm_iter;
> + use_operand_p use_p;
> +
> + if (TREE_CODE (xored_crc) != SSA_NAME)
> + return nullptr;
If we always expect XORED_CRC to be an SSA_NAME, we might be able to use
gcc_assert TREE_CODE (XORED_CRC) == SSA_NAME);
I'm not sure that it always has to be an SSA_NAME.
For a logical operation like XOR it should always have the form
SSA_NAME = SSA_NAME ^ (SSA_NAME | CONSTANT)
The constant might be a vector constant, but the basic form won't
change. It's one of the nicer properties of gimple. In contrast RTL
would allow a variety of lvalues and rvalues, including MEMs, REGs,
SUBREGs, extensions, other binary ops, etc etc.
> +
> +/* Set M_PHI_FOR_CRC and M_PHI_FOR_DATA fields.
> + Returns false if there are more than two (as in CRC
calculation only CRC's
> + and data's phi may exist) or no phi statements in STMTS (at
least there must
> + be CRC's phi).
> + Otherwise, returns true. */
> +
> +bool
> +crc_optimization::set_crc_and_data_phi (auto_vec<gimple *> &stmts)
> +{
> + for (auto stmt_it = stmts.begin (); stmt_it != stmts.end ();
stmt_it++)
> + {
> + if (is_a<gphi *> (*stmt_it) && bb_loop_header_p (gimple_bb
(*stmt_it)))
> + {
> + if (!m_phi_for_crc)
> + m_phi_for_crc = as_a<gphi *> (*stmt_it);
> + else if (!m_phi_for_data)
> + m_phi_for_data = as_a<gphi *> (*stmt_it);
> + else
> + {
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + fprintf (dump_file, "Xor-ed variable depends on
more than 2 "
> + "phis.\n");
> + return false;
> + }
> + }
> + }
> + return m_phi_for_crc;
Hmm. For a given PHI, how do we know if it's for the data item or the
crc item, or something else (like a loop counter) entirely?
I trace the def-use chain upwards from the XOR statement to determine
which PHI node corresponds to CRC and data.
Since we assume the loop calculates CRC, I expect only variables
representing data and CRC to participate in these operations.
In the implementations I support, the loop counter is used only for the
iteration.
Any misidentification of CRC and data would occur only if the loop
doesn't calculate CRC, in which case next checks would fail, leading the
algorithm to identify it as not CRC.
Here, the PHI nodes for CRC and data might be mixed in places.
I just assume that the first found PHI is CRC, second data.
I correctly determine them later with the |
*swap_crc_and_data_if_needed*| function.
Ah, OK. That probably deserves a comment in this code.
jeff