On Mon, Nov 25, 2013 at 2:08 PM, Richard Sandiford <rsand...@linux.vnet.ibm.com> wrote: > Richard Biener <richard.guent...@gmail.com> writes: >> @@ -958,6 +961,12 @@ streamer_write_tree_header (struct output_block >> *ob, tree expr) >> streamer_write_uhwi (ob, BINFO_N_BASE_BINFOS (expr)); >> else if (TREE_CODE (expr) == CALL_EXPR) >> streamer_write_uhwi (ob, call_expr_nargs (expr)); >> + else if (CODE_CONTAINS_STRUCT (code, TS_INT_CST)) >> + { >> + gcc_checking_assert (TREE_INT_CST_NUNITS (expr)); >> + streamer_write_uhwi (ob, TREE_INT_CST_NUNITS (expr)); >> + streamer_write_uhwi (ob, TREE_INT_CST_EXT_NUNITS (expr)); >> + } >> >> isn't ext_nunits always either nunits or nunits + 1? So it should be >> sufficient to write a single bit in the tree_base bitpack and only >> write the nunits that are required for the actual allocation in >> write_tree_header, not both. > > ext_nunits can be > nunits + 1. E.g. imagine a 256-bit all-1s unsigned > integer. As a 256-bit number it's simply { -1 }, with all other bits > being sign-extensions of the -1 HWI. As a >256-bit number it's > { -1, -1, -1, -1, 0 }, assuming 64-bit HWIs.
Ah, ok. >> index 4a24c66..f3e0ffe 100644 >> --- a/gcc/tree-vrp.c >> +++ b/gcc/tree-vrp.c >> @@ -1141,15 +1142,7 @@ operand_less_p (tree val, tree val2) >> { >> /* LT is folded faster than GE and others. Inline the common case. */ >> if (TREE_CODE (val) == INTEGER_CST && TREE_CODE (val2) == INTEGER_CST) >> - { >> - if (TYPE_UNSIGNED (TREE_TYPE (val))) >> - return INT_CST_LT_UNSIGNED (val, val2); >> - else >> - { >> - if (INT_CST_LT (val, val2)) >> - return 1; >> - } >> - } >> + return INT_CST_LT (val, val2); >> else >> { >> tree tcmp; >> >> Note that INT_CST_LT and INT_CST_LT_UNSIGNED were supposed to >> be very fast. Now you made them >> >> #define INT_CST_LT(A, B) (wi::lts_p (wi::to_widest (A), wi::to_widest (B))) >> >> which a) doesn't look the same to me and b) hides complexity. > > Not sure what complexity you mean here: to_widest just reads the > constant in its "infinite-precision" form. I'm not sure: > >> So why not write >> >> if (TREE_CODE (val) == INTEGER_CST && TREE_CODE (val2) == INTEGER_CST) >> if (TYPE_UNSIGNED (...) >> return wi::ltu_p (val, va2); >> else >> return wi::lts_p (val, val2); >> >> ? > > ...this would be faster in practice, since the fast path in lts_p is > length-based and can handle small integers quickly whatever their precision. > > The second form means that VAL1 and VAL2 must have the same precision. > I don't know whether that's a good thing or not in this instance. Ok, I take your word for granted. Still with INT_CST_LT now being exactly the same as tree_int_cst_lt I'd prefer to eliminate the former in favor of the latter. Btw, trying to look at code generation via -fdump-tree-all blows up memory - you have a leak somewhere I guess :/ I'm trying to compile tree.ii with -O2 -fdump-tree-all. >> @@ -6966,12 +6975,7 @@ tree_int_cst_lt (const_tree t1, const_tree t2) >> int >> tree_int_cst_compare (const_tree t1, const_tree t2) >> { >> - if (tree_int_cst_lt (t1, t2)) >> - return -1; >> - else if (tree_int_cst_lt (t2, t1)) >> - return 1; >> - else >> - return 0; >> + return wi::cmps (wi::to_widest (t1), wi::to_widest (t2)); >> } >> >> /* Return true if T is an INTEGER_CST whose numerical value (extended >> >> How does this work for comparing UHWI_MAX to 0 for example? Looking >> at >> >> template <typename T1, typename T2> >> inline int >> wi::cmps (const T1 &x, const T2 &y) >> { >> unsigned int precision = get_binary_precision (x, y); >> WIDE_INT_REF_FOR (T1) xi (x, precision); >> WIDE_INT_REF_FOR (T2) yi (y, precision); >> if (precision <= HOST_BITS_PER_WIDE_INT) >> { >> HOST_WIDE_INT xl = xi.to_shwi (); >> HOST_WIDE_INT yl = yi.to_shwi (); >> if (xl < yl) >> return -1; >> >> this seems to happily return -1 instead of 1? Similar issues elsewhere >> where you change compares to unconditonally perform signed compares. >> (it's at least non-obvious to me). >> >> Ah, the to_widest () will make it XImode * 4 extended and thus >> get_precision returns 2048 (or so) so we don't take the shortcut >> (which means it's slow). >> >> Shouldn't the shortcuts be taken on 'len' == 1 rather than >> precision <= HWI? > > I suppose we could use the same approach as for lts_p, if that seems > OK to you. Yeah, that looks good to me. >> Side-note: we have too many ways to compare trees / integers > > Do you mean in wide-int, or in the tree routines? At the wide-int > level there are only really two ways: compare trees as sequences of bits > (ignoring their sign) and compare trees as "infinite-precision" numbers. In the tree routines. Richard. > Thanks, > Richard >