On Wed, 11 Nov 2015, Prathamesh Kulkarni wrote: > On 10 November 2015 at 20:11, Richard Biener <rguent...@suse.de> wrote: > > On Mon, 9 Nov 2015, Prathamesh Kulkarni wrote: > > > >> On 4 November 2015 at 20:35, Richard Biener <rguent...@suse.de> wrote: > >> > > >> > Btw, did you investigate code gen differences on x86_64/i586? That > >> > target expands all divisions/modulo ops via divmod, relying on CSE > >> > solely as the HW always computes both div and mod (IIRC). > >> x86_64 has optab_handler for divmod defined, so the transform won't > >> take place on x86. > > > > Ok. > > > >> > + > >> > + gassign *assign_stmt = gimple_build_assign (gimple_assign_lhs > >> > (use_stmt), rhs); > >> > + gimple_stmt_iterator gsi = gsi_for_stmt (use_stmt); > >> > > >> > Ick. Please use > >> > > >> > gimple_set_rhs_from_tree (use_stmt, res); > >> Um there doesn't seem to be gimple_set_rhs_from_tree. > >> I used gimple_assign_set_rhs_from_tree which requires gsi for use_stmt. > >> Is that OK ? > > > > Yes. > > > >> > update_stmt (use_stmt); > >> > if (maybe_clean_or_replace_eh_stmt (use_stmt, use_stmt)) > >> > cfg_changed = true; > >> > > >> > + free_dominance_info (CDI_DOMINATORS); > >> > > >> > do not free dominators. > >> > >> I have done the suggested changes in the attached patch. > >> I have a few questions: > >> > >> a) Does the change to insert DIVMOD call before topmost div or mod > >> stmt with matching operands > >> look correct ? > > > > + /* Insert call-stmt just before the topmost div/mod stmt. > > + top_bb dominates all other basic blocks containing div/mod stms > > + so, the topmost stmt would be the first div/mod stmt with matching > > operands > > + in top_bb. */ > > + > > + gcc_assert (top_bb != 0); > > + gimple_stmt_iterator gsi; > > + for (gsi = gsi_after_labels (top_bb); !gsi_end_p (gsi); gsi_next > > (&gsi)) > > + { > > + gimple *g = gsi_stmt (gsi); > > + if (is_gimple_assign (g) > > + && (gimple_assign_rhs_code (g) == TRUNC_DIV_EXPR > > + || gimple_assign_rhs_code (g) == TRUNC_MOD_EXPR) > > + && operand_equal_p (op1, gimple_assign_rhs1 (g), 0) > > + && operand_equal_p (op2, gimple_assign_rhs2 (g), 0)) > > + break; > > > > Looks overly complicated to me. Just remember "topmost" use_stmt > > alongside top_bb (looks like you'll no longer need top_bb if you > > retail top_stmt). And then do > > > > gsi = gsi_for_stmt (top_stmt); > > > > and insert before that. > Thanks, done in this patch. Does it look OK ? > IIUC gimple_uid (stmt1) < gimple_uid (stmt2) can be used to check if > stmt1 occurs before stmt2 > only if stmt1 and stmt2 are in the same basic block ? > > > >> b) Handling constants - I dropped handling constants in the attached > >> patch. IIUC we don't want > >> to enable this transform if there's a specialized expansion for some > >> constants for div or mod ? > > > > See expand_divmod which has lots of special cases for constant operands > > not requiring target support for div or mod. > Thanks, would it be OK if I do this in follow up patch ?
Well, just not handle them like in your patch is fine. > > > >> I suppose this would also be target dependent and require a target hook ? > >> For instance arm defines modsi3 pattern to expand mod when 2nd operand > >> is constant and <= 0 or power of 2, > >> while for other cases goes the expand_divmod() route to generate call > >> to __aeabi_idivmod libcall. > > > > Ok, so it lacks a signed mod instruction. > > > >> c) Gating the divmod transform - > >> I tried gating it on checks for optab_handlers on div and mod, however > >> this doesn't enable transform for arm cortex-a9 > >> anymore (cortex-a9 doesn't have hardware instructions for integer div and > >> mod). > >> IIUC for cortex-a9, > >> optab_handler (sdivmod_optab, SImode) returns CODE_FOR_nothing because > >> HAVE_divsi3 is 0. > >> However optab_handler (smod_optab, SImode) matches since optab_handler > >> only checks for existence of pattern > >> (and not whether the pattern gets matched). > >> I suppose we should enable the transform only if the divmod, div, and > >> mod pattern do not match rather than checking > >> if the patterns exist via optab_handler ? For a general x % y, modsi3 > >> would fail to match but optab_handler(smod_optab, SImode ) still > >> says it's matched. > > > > Ah, of course. Querying for an optab handler is just a cheap > > guesstimate... Not sure how to circumvent this best (sub-target > > enablement of patterns). RTL expansion just goes ahead (of course) > > and sees if expansion eventually fails. Richard? > > > >> Should we define a new target hook combine_divmod, which returns true > >> if transforming to divmod is desirable for that > >> target ? > >> The default definition could be: > >> bool default_combine_divmod (enum machine_mode mode, tree op1, tree op2) > >> { > >> // check for optab_handlers for div/mod/divmod and libfunc for divmod > >> } > >> > >> And for arm, it could be over-ridden to return false if op2 is > >> constant and <= 0 or power of 2. > >> I am not really sure if this is a good idea since I am replicating > >> information from modsi3 pattern. > >> Any change to the pattern may require corresponding change to the hook :/ > > > > Yeah, I don't think that is desirable. Ideally we'd have a way > > to query HAVE_* for CODE_FOR_* which would mean target-insns.def > > support for all div/mod/divmod patterns(?) and queries... > > > > Not sure if what would be enough though. > > > > Note that the divmod check is equally flawed. > > > > I think with the above I'd enable the transform when > > > > + if (optab_handler (divmod_optab, mode) != CODE_FOR_nothing > > + || (optab_libfunc (divmod_optab, mode) != NULL_RTX > > && optab_handler ([su]div_optab, mode) == CODE_FOR_nothing)) > > + return false; > Um this fails for the arm backend (for cortex-a9) because > optab_handler (divmod_optab, mode) != CODE_FOR_nothing is false > optab_libfunc (divmod_optab, mode) != NULL_RTX is true. > optab_handler (div_optab, mode) == CODE_FOR_nothing is true. > which comes down to false || (true && true) which is true and we hit > return false. Oh, sorry to mess up the test - it was supposed to be inverted. > AFAIU, we want the transform to be disabled if: > a) optab_handler exists for divmod. > b) optab_handler exists for div. > c) optab_libfunc does not exist for divmod. */ > > + if (optab_handler (divmod_optab, mode) != CODE_FOR_nothing > + || optab_handler (div_optab, mode) != CODE_FOR_nothing > + || optab_libfunc (divmod_optab, mode) == NULL_RTX) > + return false; > Does that look correct ? No. That will disable if we have a divmod optab. Instead try if (! (optab_handler (divmod_optab, mode) != CODE_FOR_nothing || (optab_libfunc (divmod_optab, mode) != NULL_RTX && optab_handler ([su]div_optab, mode) == CODE_FOR_nothing))) return false; which is what I intended. If we have a divmod optab go ahead. If we have a libfunc and not a div optab then as well. > > > > so we either will have a divmod instruction (hopefully not sub-target > > disabled for us) or a libfunc for divmod and for sure no HW divide > > instruction (HW mod can be emulated by HW divide but not the other > > way around). > > > >> d) Adding effective-target-check for divmod: I just enabled it for > >> arm*-*-* for now. I could additionally append more targets, > >> not sure if this is the right approach. > > > > Looks good to me. > Is this version OK if bootstrap/testing passes ? Ok with adjusting the optab check like above. Thanks, Richard.