On Mon, 8 Jul 2024, Filip Kastl wrote:
> Hi,
>
> I'm replying to Richard and keeping Andrew in cc since your suggestions
> overlap.
>
>
> On Tue 2024-06-11 14:48:06, Richard Biener wrote:
> > On Thu, 30 May 2024, Filip Kastl wrote:
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O2 -fdump-tree-switchconv -march=znver3" } */
> >
> > I think it's better to enable -mpopcnt and -mbmi (or what remains
> > as minimal requirement).
>
> Will do. Currently the testcases are in the i386 directory. After I exchange
> the -march for -mpopcnt -mbmi can I put these testcases into gcc.dg/tree-ssa?
> Will the -mpopcnt -mbmi options work with all target architectures?
No, those are i386 specific flags. At least for popcount there's
dejagnu effective targets popcount, popcountl and popcountll so you
could do
/* { dg-additional-options "-mpopcnt" { target { x86_64-*-* i?86-*-* } } }
*/
and guard the tree dump scan with { target popcount } to cover other
archs that have popcount (without adding extra flags).
> > > +/* Check that the "exponential index transform" can be applied to this
> > > switch.
> > > +
> > > + See comment of the exp_index_transform function for details about this
> > > + transformation.
> > > +
> > > + We want:
> > > + - This form of the switch is more efficient
> > > + - Cases are powers of 2
> > > +
> > > + Expects that SWTCH has at least one case. */
> > > +
> > > +bool
> > > +switch_conversion::is_exp_index_transform_viable (gswitch *swtch)
> > > +{
> > > + tree index = gimple_switch_index (swtch);
> > > + tree index_type = TREE_TYPE (index);
> > > + basic_block swtch_bb = gimple_bb (swtch);
> > > + unsigned num_labels = gimple_switch_num_labels (swtch);
> > > +
> > > + /* Check that we can efficiently compute logarithm of 2^k (using FFS)
> > > and
> > > + test that a given number is 2^k for some k (using POPCOUNT). */
> > > + optimization_type opt_type = bb_optimization_type (swtch_bb);
> > > + if (!direct_internal_fn_supported_p (IFN_FFS, index_type, opt_type)
> > > + || !direct_internal_fn_supported_p (IFN_POPCOUNT, index_type,
> > > opt_type))
> > > + return false;
> > > +
> >
> > See above, I think this can be improved. Would be nice to split out
> > a can_pow2p (type) and can_log2 (type) and a corresponding
> > gen_pow2p (op) and gen_log2 (op) function so this could be re-used
> > and alternate variants added when required.
> >
>
> Just to check that I understand this correctly: You'd like me to create
> functions can_pow2p, can_log2. Those functions will check that there are
> optabs for the target machine which allow us to efficiently test
> power-of-2-ness of a number and which allow us to efficiently compute the
> base-2 log of a power-of-2 number. You'd also like me to create functions
> gen_pow2p and gen_log2 which generate this code. For now these functions will
> just use POPCOUNT and FFS but they can be later extended to also consider
> different instructions. Is that right?
Right.
> Into which file should I put these functions?
Just in this file for now.
> Is can_pow2p and gen_pow2p necessary? As you noted one can always use
> (x & -x) == x so testing pow2p can always be done efficiently.
If you add this fallback then can_pow2p / gen_pow2p wouldn't be
necessary indeed.
> > > + /* Insert a statement that takes the logarithm of the index variable.
> > > */
> > > + tree tmp2 = make_ssa_name (index_type);
> > > + gsi = gsi_start_bb (swtch_bb);
> >
> > Please use gsi_after_labels (swtch_bb) even though you know there's no
> > labels there.
> >
> > > + gcall *stmt_ffs = gimple_build_call_internal (IFN_FFS, 1, index);
> > > + gimple_call_set_lhs (stmt_ffs, tmp2);
> > > + gsi_insert_before (&gsi, stmt_ffs, GSI_SAME_STMT);
> > > +
> > > + tree tmp3 = make_ssa_name (index_type);
> > > + gassign *stmt_minus_one = gimple_build_assign (tmp3, MINUS_EXPR, tmp2,
> > > one);
> > > + gsi_insert_before (&gsi, stmt_minus_one, GSI_SAME_STMT);
> >
> > You could also use
> >
> > tree tmp2 = gimple_build (gsi, true, GSI_SAME_STMT,
> > UNKNOWN_LOCATION, IFN_FFS, index);
> > tree tmp3 = gimple_build (gsi, true, GSI_SAME_STMT,
> > UNKNOWN_LOCATION, MINUS_EXPR, tmp2, one);
> >
> > which does the stmt building, temporary SSA name creation and insertion
> > plus eventually folding things with a definition. There isn't a
> > gimple_build_cond with this API, but it would still work for the
> > popcount build above.
>
> I've tried using the gimple_build API and it is indeed nicer. However, I
> wasn't able to get it working with the FFS internal function. IFN_FFS is of a
> different type than what gimple_build accepts. I've tried this
>
> tree tmp2 = gimple_build (&gsi, true, GSI_SAME_STMT, UNKNOWN_LOCATION,
> as_combined_fn (IFN_FFS), index);
>
> but that only caused an ICE. I tried looking for an example of using
> gimple_build to build an internal function call in the sources but I haven't
> found any.
>
> If you'd be willing to help me with this, I will happily use the gimple_build
> API. If I don't figure this out I will just stick with the less clean
> original
> version.
I think you need to use
tree tmp2 = gimple_build (&gsi, true, GSI_SAME_STMT, UNKNOWN_LOCATION,
IFN_FFS, integer_type_node, index);
that is, you always need to specify the type of the value produced
(here the return type of ffs).
> > > + /* Use the result of the logarithm as the new index variable. */
> > > + gimple_switch_set_index (swtch, tmp3);
> > > + update_stmt (swtch);
> > > +
> > > + /* Replace each case number with its logarithm. */
> > > + unsigned i;
> > > + for (i = 1; i < num_labels; i++)
> > > + {
> > > + tree label = gimple_switch_label (swtch, i);
> > > + unsigned HOST_WIDE_INT label_hwi = tree_to_uhwi (CASE_LOW (label));
> > > + CASE_LOW (label) = build_int_cst (index_type, floor_log2
> > (label_hwi));
> >
> > You should be able to directly use
> >
> > CASE_LOW (label) = tree_log2 (CASE_LOW (label));
> >
> > Do you have test coverage for a signed index and a 0x80000000 case?
> > It would satisfy popcount and ffs should work but the constant
> > doesnt fit an uhwi and likely tree_log2 would ICE on it.
>
> A situation with signed index variable and a 0x80000000 case shouldn't be a
> problem since I check all cases for tree_fits_uhwi_p. To make sure I made
> this
> testcase
>
> int foo()
> {
> int a, b;
> switch (a)
> {
> case 0x10000000:
> b = 0;
> break;
> case 0x20000000:
> b = 1;
> break;
> case 0x40000000:
> b = 2;
> break;
> case 0x80000000:
> b = 3;
> break;
> }
> return b;
> }
>
> and indeed, is_exp_transform_viable returned false due to the last case. No
> ICE happened. I added this testcase to my personal set of testcases that I'll
> run on subsequent versions of this patch.
>
> >
> > As Andrew said working with wide_int might be best though signedness
> > might interfere here as well.
> >
>
> Ok, I'm looking into using wide_ints instead of HOST_WIDE_INTs. However, I
> need some help: How should I check that case numbers are nonnegative? If I
> understand it correctly wide_ints don't carry signedness information so I'll
> have to somehow check nonnegativity of trees, right? Should I continue using
> tree_fits_uhwi_p for this?
You can use wi::ge_p (wide-int, 0, TYPE_SIGN (type-of-constant)) that
will then dispatch to ge{s,u}_p based on whether the number is signed
or unsigned.
> > > + }
> > > +
> > > + /* Update information about the switch statement. */
> > > + tree first_label = gimple_switch_label (swtch, 1);
> > > + tree last_label = gimple_switch_label (swtch, num_labels - 1);
> > > +
> > > + m_range_min = CASE_LOW (first_label);
> > > + m_range_max = CASE_LOW (last_label);
> > > + m_index_expr = gimple_switch_index (swtch);
> > > + m_switch_bb = swtch_bb;
> > > +
> > > + unsigned HOST_WIDE_INT range_size_hwi = tree_to_uhwi (m_range_max)
> > > + - tree_to_uhwi (m_range_min);
> > > + m_range_size = build_int_cst (index_type, range_size_hwi);
> >
> > Use int_const_binop (MINUS_EXPR, m_range_max, m_range_min) as is done
> > elsewhere (m_range_size should be a wide_int, but that's out of scope
> > here).
>
> Just to check that I understand the bit in parentheses correctly: You think
> that the switch conversion pass should represent m_range_size as a wide_int
> instead of a tree. However, that isn't something that should be changed in my
> patch (it could be changed in some future patch instead), right?
Right.
Thanks,
Richard.
>
> Thanks for the suggestions.
>
> Cheers,
> Filip Kastl
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)