On Fri, Oct 30, 2015 at 8:39 AM, Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> wrote: > Hi, > I have attached revamped version of Kugan's patch > (https://gcc.gnu.org/ml/gcc/2013-06/msg00100.html) for PR43721. > Test-case: http://pastebin.com/QMfpXLD9 > divmod pass dump: http://pastebin.com/yMY1ikCp > Assembly: http://pastebin.com/kk2HZpvA > > The approach I took is similar to sincos pass, which involves two parts: > a) divmod pass that transforms: > t1 = a / b; > t2 = a % b; > to the following sequence: > complex_tmp = DIVMOD (a, b); > t1 = REALPART_EXPR(a); > t2 = IMAGPART_EXPR(b); > > b) DIVMOD(a, b) is represented as an internal-fn and is expanded by > expand_DIVMOD(). > I am not sure if I have done this correctly. I was somehow looking to > reuse expand_divmod() but am not able to figure out how to do that > (AFAIU expand_divmod() strictly returns either the quotient or > remainder but never both which is what I want), so ended up with > manually expanding to rtx. > > While going through the sincos pass in execute_cse_sincos_1, I didn't > understand the following: > if (gimple_purge_dead_eh_edges (gimple_bb (stmt))) > cfg_changed = true; > Um why is the call to gimple_purge_dead_eh_edges necessary here?
The call replacement might no longer throw. > A silly question, what options to pass to gcc to print statistics ? I > added statistics_counter_event to keep track of number of calls > inserted/used but not sure how to print them :/ -fdump-tree-XXX-stats or -fdump-statistics-stats > I would be grateful for suggestions for improving the patch. First of all new functions need comments (expand_DIVMOD). Second - what's the reasoning for the pass placement seen? I think the transform is a lowering one, improving RTL expansion. The DIVMOD representation is harmful for GIMPLE optimizers. Third - I'd have integrated this with an existing pass - we have another lowering / RTL expansion kind pass which is pass_optimize_widening_mul. Please piggy-back on it. You seem to do the transform unconditionally even if the target has instructions for division and modulus but no instruction for divmod in which case you'll end up emitting a library call in the end instead of a div and mod instruction. I think the transform should be gated on missing div/mod instructions or the availability of a divmod one. + if (is_gimple_assign (stmt) + && TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_binary) + { + if (gimple_assign_rhs_code (stmt) == TRUNC_DIV_EXPR) + cfg_changed |= execute_divmod_1 (stmt); you can directly check gimple_assign_rhs_code. I'd check for TRUNC_MOD_EXPR which seems to be less common and thus should cut down compile-time. + /* Case 2: When both are in top_bb */ + else if (op1_def_in_bb && op2_def_in_bb) + { + /* FIXME: better way to compare iterators?. */ + gimple_stmt_iterator gsi = get_later_stmt (top_bb, def_stmt_op1, def_stmt_op2); You can't use get_later_stmt, it's a vectorizer speciality. To do this test efficiently you need stmt UIDs computed. I miss a testcase (or two). Richard. > Thank you, > Prathamesh