Hi,

I'm replying to Richard and keeping Andrew in cc since your suggestions
overlap.


On Tue 2024-06-11 14:48:06, Richard Biener wrote:
> On Thu, 30 May 2024, Filip Kastl wrote:
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -fdump-tree-switchconv -march=znver3" } */
> 
> I think it's better to enable -mpopcnt and -mbmi (or what remains
> as minimal requirement).

Will do.  Currently the testcases are in the i386 directory.  After I exchange
the -march for -mpopcnt -mbmi can I put these testcases into gcc.dg/tree-ssa?
Will the -mpopcnt -mbmi options work with all target architectures?

> > +/* Check that the "exponential index transform" can be applied to this 
> > switch.
> > +
> > +   See comment of the exp_index_transform function for details about this
> > +   transformation.
> > +
> > +   We want:
> > +   - This form of the switch is more efficient
> > +   - Cases are powers of 2
> > +
> > +   Expects that SWTCH has at least one case.  */
> > +
> > +bool
> > +switch_conversion::is_exp_index_transform_viable (gswitch *swtch)
> > +{
> > +  tree index = gimple_switch_index (swtch);
> > +  tree index_type = TREE_TYPE (index);
> > +  basic_block swtch_bb = gimple_bb (swtch);
> > +  unsigned num_labels = gimple_switch_num_labels (swtch);
> > +
> > +  /* Check that we can efficiently compute logarithm of 2^k (using FFS) and
> > +     test that a given number is 2^k for some k (using POPCOUNT).  */
> > +  optimization_type opt_type = bb_optimization_type (swtch_bb);
> > +  if (!direct_internal_fn_supported_p (IFN_FFS, index_type, opt_type)
> > +    || !direct_internal_fn_supported_p (IFN_POPCOUNT, index_type, 
> > opt_type))
> > +    return false;
> > +
> 
> See above, I think this can be improved.  Would be nice to split out
> a can_pow2p (type) and can_log2 (type) and a corresponding
> gen_pow2p (op) and gen_log2 (op) function so this could be re-used
> and alternate variants added when required.
> 

Just to check that I understand this correctly:  You'd like me to create
functions can_pow2p, can_log2.  Those functions will check that there are
optabs for the target machine which allow us to efficiently test
power-of-2-ness of a number and which allow us to efficiently compute the
base-2 log of a power-of-2 number.  You'd also like me to create functions
gen_pow2p and gen_log2 which generate this code.  For now these functions will
just use POPCOUNT and FFS but they can be later extended to also consider
different instructions.  Is that right?

Into which file should I put these functions?

Is can_pow2p and gen_pow2p necessary?  As you noted one can always use
(x & -x) == x so testing pow2p can always be done efficiently.

> > +  /* Insert a statement that takes the logarithm of the index variable.  */
> > +  tree tmp2 = make_ssa_name (index_type);
> > +  gsi = gsi_start_bb (swtch_bb);
> 
> Please use gsi_after_labels (swtch_bb) even though you know there's no
> labels there.
> 
> > +  gcall *stmt_ffs = gimple_build_call_internal (IFN_FFS, 1, index);
> > +  gimple_call_set_lhs (stmt_ffs, tmp2);
> > +  gsi_insert_before (&gsi, stmt_ffs, GSI_SAME_STMT);
> > +
> > +  tree tmp3 = make_ssa_name (index_type);
> > +  gassign *stmt_minus_one = gimple_build_assign (tmp3, MINUS_EXPR, tmp2, 
> > one);
> > +  gsi_insert_before (&gsi, stmt_minus_one, GSI_SAME_STMT);
> 
> You could also use
> 
>      tree tmp2 = gimple_build (gsi, true, GSI_SAME_STMT,
>                              UNKNOWN_LOCATION, IFN_FFS, index);
>      tree tmp3 = gimple_build (gsi, true, GSI_SAME_STMT,
>                                UNKNOWN_LOCATION, MINUS_EXPR, tmp2, one);
> 
> which does the stmt building, temporary SSA name creation and insertion
> plus eventually folding things with a definition.  There isn't a
> gimple_build_cond with this API, but it would still work for the
> popcount build above.

I've tried using the gimple_build API and it is indeed nicer.  However, I
wasn't able to get it working with the FFS internal function.  IFN_FFS is of a
different type than what gimple_build accepts.  I've tried this

  tree tmp2 = gimple_build (&gsi, true, GSI_SAME_STMT, UNKNOWN_LOCATION,
                            as_combined_fn (IFN_FFS), index);

but that only caused an ICE.  I tried looking for an example of using
gimple_build to build an internal function call in the sources but I haven't
found any.

If you'd be willing to help me with this, I will happily use the gimple_build
API.  If I don't figure this out I will just stick with the less clean original
version.

> > +  /* Use the result of the logarithm as the new index variable.  */
> > +  gimple_switch_set_index (swtch, tmp3);
> > +  update_stmt (swtch);
> > +
> > +  /* Replace each case number with its logarithm.  */
> > +  unsigned i;
> > +  for (i = 1; i < num_labels; i++)
> > +    {
> > +      tree label = gimple_switch_label (swtch, i);
> > +      unsigned HOST_WIDE_INT label_hwi = tree_to_uhwi (CASE_LOW (label));
> > +      CASE_LOW (label) = build_int_cst (index_type, floor_log2 
> (label_hwi));
> 
> You should be able to directly use
> 
>          CASE_LOW (label) = tree_log2 (CASE_LOW (label));
> 
> Do you have test coverage for a signed index and a 0x80000000 case?
> It would satisfy popcount and ffs should work but the constant
> doesnt fit an uhwi and likely tree_log2 would ICE on it.

A situation with signed index variable and a 0x80000000 case shouldn't be a
problem since I check all cases for tree_fits_uhwi_p.  To make sure I made this
testcase

int foo()    
{    
    int a, b;    
    switch (a)    
    {    
        case 0x10000000:    
            b = 0;    
            break;    
        case 0x20000000:    
            b = 1;    
            break;    
        case 0x40000000:    
            b = 2;    
            break;    
        case 0x80000000:    
            b = 3;    
            break;    
    }    
    return b;    
}

and indeed, is_exp_transform_viable returned false due to the last case.  No
ICE happened.  I added this testcase to my personal set of testcases that I'll
run on subsequent versions of this patch.

> 
> As Andrew said working with wide_int might be best though signedness
> might interfere here as well.
> 

Ok, I'm looking into using wide_ints instead of HOST_WIDE_INTs.  However, I
need some help:  How should I check that case numbers are nonnegative?  If I
understand it correctly wide_ints don't carry signedness information so I'll
have to somehow check nonnegativity of trees, right?  Should I continue using
tree_fits_uhwi_p for this?

> > +    }
> > +
> > +  /* Update information about the switch statement.  */
> > +  tree first_label = gimple_switch_label (swtch, 1);
> > +  tree last_label = gimple_switch_label (swtch, num_labels - 1);
> > +
> > +  m_range_min = CASE_LOW (first_label);
> > +  m_range_max = CASE_LOW (last_label);
> > +  m_index_expr = gimple_switch_index (swtch);
> > +  m_switch_bb = swtch_bb;
> > +
> > +  unsigned HOST_WIDE_INT range_size_hwi = tree_to_uhwi (m_range_max)
> > +    - tree_to_uhwi (m_range_min);
> > +  m_range_size = build_int_cst (index_type, range_size_hwi);
> 
> Use int_const_binop (MINUS_EXPR, m_range_max, m_range_min) as is done
> elsewhere (m_range_size should be a wide_int, but that's out of scope 
> here).

Just to check that I understand the bit in parentheses correctly:  You think
that the switch conversion pass should represent m_range_size as a wide_int
instead of a tree.  However, that isn't something that should be changed in my
patch (it could be changed in some future patch instead), right?


Thanks for the suggestions.

Cheers,
Filip Kastl

Reply via email to