On Mon, Sep 9, 2013 at 6:26 PM, Xinliang David Li <davi...@google.com> wrote: > On Fri, Sep 6, 2013 at 3:24 PM, Cong Hou <co...@google.com> wrote: >> First, thank you for your detailed comments again! Then I deeply >> apologize for not explaining my patch properly and responding to your >> previous comment. I didn't understand thoroughly the problem before >> submitting the patch. >> >> Previously I only considered the following three conversions for sqrt(): >> >> >> 1: (float) sqrt ((double) float_val) -> sqrtf (float_val) >> 2: (float) sqrtl ((long double) float_val) -> sqrtf (float_val) >> 3: (double) sqrtl ((long double) double_val) -> sqrt (double_val) >> >> >> We have four types here: >> >> TYPE is the type to which the result of the function call is converted. >> ITYPE is the type of the math call expression. >> TREE_TYPE(arg0) is the type of the function argument (before type >> conversion). >> NEWTYPE is chosen from TYPE and TREE_TYPE(arg0) with higher precision. >> It will be the type of the new math call expression after conversion. >> >> For all three cases above, TYPE is always the same as NEWTYPE. That is >> why I only considered TYPE during the precision comparison. ITYPE can >> only be double_type_node or long_double_type_node depending on the >> type of the math function. That is why I explicitly used those two >> types instead of ITYPE (no correctness issue). But you are right, >> ITYPE is more elegant and better here. >> >> After further analysis, I found I missed two more cases. Note that we >> have the following conditions according to the code in convert.c: >> >> TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TYPE) >> TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TREE_TYPE(arg0)) >> TYPE_PRECISION (NEWTYPE) < TYPE_PRECISION (ITYPE) >> >> the last condition comes from the fact that we only consider >> converting a math function call into another one with narrower type. >> Therefore we have >> >> TYPE_PRECISION(TYPE) < TYPE_PRECISION (ITYPE) >> TYPE_PRECISION(TREE_TYPE(arg0)) < TYPE_PRECISION (ITYPE) >> >> So for sqrt(), TYPE and TREE_TYPE(arg0) can only be float, and for >> sqrtl(), TYPE and TREE_TYPE(arg0) can be either float or double with >> four possible combinations. Therefore we have two more conversions to >> consider besides the three ones I mentioned above: >> >> >> 4: (float) sqrtl ((long double) double_val) -> (float) sqrt (double_val) >> 5: (double) sqrtl ((long double) float_val) -> sqrt ((double) float_val) >> >> >> For the first conversion here, TYPE (float) is different from NEWTYPE >> (double), and my previous patch doesn't handle this case.The correct >> way is to compare precisions of ITYPE and NEWTYPE now. >> >> To sum up, we are converting the expression >> >> (TYPE) sqrtITYPE ((ITYPE) expr) >> >> to >> >> (TYPE) sqrtNEWTYPE ((NEWTYPE) expr) >> >> and we require >> >> PRECISION (ITYPE) >= PRECISION (NEWTYPE) * 2 + 2 >> >> to make it a safe conversion. >> >> >> The new patch is pasted below. >> >> I appreciate your detailed comments and analysis, and next time when I >> submit a patch I will be more carefully about the reviewer's comment. >> >> >> Thank you! >> >> Cong >> >> >> >> Index: gcc/convert.c >> =================================================================== >> --- gcc/convert.c (revision 201891) >> +++ gcc/convert.c (working copy) >> @@ -135,16 +135,19 @@ convert_to_real (tree type, tree expr) >> CASE_MATHFN (COS) >> CASE_MATHFN (ERF) >> CASE_MATHFN (ERFC) >> - CASE_MATHFN (FABS) >> CASE_MATHFN (LOG) >> CASE_MATHFN (LOG10) >> CASE_MATHFN (LOG2) >> CASE_MATHFN (LOG1P) >> - CASE_MATHFN (LOGB) >> CASE_MATHFN (SIN) >> - CASE_MATHFN (SQRT) >> CASE_MATHFN (TAN) >> CASE_MATHFN (TANH) >> + /* The above functions are not safe to do this conversion. */ >> + if (!flag_unsafe_math_optimizations) >> + break; >> + CASE_MATHFN (SQRT) >> + CASE_MATHFN (FABS) >> + CASE_MATHFN (LOGB) >> #undef CASE_MATHFN >> { >> tree arg0 = strip_float_extensions (CALL_EXPR_ARG (expr, 0)); >> @@ -155,6 +158,27 @@ convert_to_real (tree type, tree expr) >> if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (type)) >> newtype = TREE_TYPE (arg0); >> >> + /* We consider to convert >> + >> + (T1) sqrtT2 ((T2) exprT3) >> + to >> + (T1) sqrtT4 ((T4) exprT3) > > Should this be > > (T4) sqrtT4 ((T4) exprT3)
T4 is not necessarily the same as T1. For the conversion: (float) sqrtl ((long double) double_val) -> (float) sqrt (double_val) T4 is double and T1 is float. >> + >> + , where T1 is TYPE, T2 is ITYPE, T3 is TREE_TYPE (ARG0), >> + and T4 is NEWTYPE. > > NEWTYPE is also the wider one between T1 and T3. Right. Actually this is easy to catch from the context just before this comment. tree newtype = type; if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (type)) newtype = TREE_TYPE (arg0); thanks, Cong > > David > >> All those types are of floating point types. >> + T4 (NEWTYPE) should be narrower than T2 (ITYPE). This conversion >> + is safe only if P1 >= P2*2+2, where P1 and P2 are precisions of >> + T2 and T4. See the following URL for a reference: >> + >> http://stackoverflow.com/questions/9235456/determining-floating-point-square-root >> + */ >> + if (fcode == BUILT_IN_SQRT || fcode == BUILT_IN_SQRTL) >> + { >> + int p1 = REAL_MODE_FORMAT (TYPE_MODE (itype))->p; >> + int p2 = REAL_MODE_FORMAT (TYPE_MODE (newtype))->p; >> + if (p1 < p2 * 2 + 2 && !flag_unsafe_math_optimizations) >> + break; >> + } >> + >> /* Be careful about integer to fp conversions. >> These may overflow still. */ >> if (FLOAT_TYPE_P (TREE_TYPE (arg0)) >> Index: gcc/testsuite/gcc.c-torture/execute/20030125-1.c >> =================================================================== >> --- gcc/testsuite/gcc.c-torture/execute/20030125-1.c (revision 201891) >> +++ gcc/testsuite/gcc.c-torture/execute/20030125-1.c (working copy) >> @@ -44,11 +44,11 @@ __attribute__ ((noinline)) >> double >> sin(double a) >> { >> - abort (); >> + return a; >> } >> __attribute__ ((noinline)) >> float >> sinf(float a) >> { >> - return a; >> + abort (); >> } >> >> On Wed, Sep 4, 2013 at 3:26 PM, Joseph S. Myers <jos...@codesourcery.com> >> wrote: >>> On Wed, 4 Sep 2013, Xinliang David Li wrote: >>> >>>> > This patch submission still fails to pay attention to various of my >>>> > comments. >>>> > >>>> >>>> If you can provide inlined comments in the patch, that will be more >>>> useful and productive. >>> >>> I have explained things several times in this thread already. I see no >>> point in repeating things when what I would say has already been said and >>> ignored. As far as I can tell, this latest patch submission has taken one >>> line from the message it is in response to, and largely ignored the >>> following two paragraphs (including where I explicitly say that said line >>> should not appear literally in the source code at all). But, repeating >>> what I said before yet again: >>> >>> (but you should be referring to the relevant types >>> >>> The patch does not do properly that. It refers explicitly to >>> long_double_type_node and double_type_node. >>> >>> - "type", the type >>> being converted to, "itype", the type of the function being called in the >>> source code, "TREE_TYPE (arg0)", the type of the argument after extensions >>> have been removed, and "newtype", computed from those >>> >>> The patch only engages with "type". I suspect "newtype" is what it really >>> means there when using "type". When it uses long_double_type_node and >>> double_type_node, those should be "itype". >>> >>> - so you should have >>> expressions like the above with two or more of those four types, but not >>> with long_double_type_node directly). >>> >>> See above. The patch uses long_double_type_node directly, contrary to >>> what I explicitly said. You are free to disagree with something I say in >>> a review - but in that case you need to reply specifically to the review >>> comment explaining your rationale for disagreeing with it. Just ignoring >>> the comment and not mentioning the disagreement wastes the time of >>> reviewers. >>> >>> The patch submission will need to include a proper analysis to justify to >>> the reader why the particular inequality with particular types from those >>> four is correct in all cases where the relevant code may be executed. >>> >>> The comments only refer to "T1" and "T2" without explaining how they >>> relate to the original expression (three types) or the variables within >>> GCC dealing with various types (four types, because newtype gets >>> involved). I said back in >>> <http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00161.html> and >>> <http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01384.html> that three types >>> are involved - when I say "the patch submission needs to include its own >>> analysis for the full generality of three types", again, it's >>> inappropriate for a patch to omit such an analysis without justification. >>> The patch submission needs to include an analysis that properly explains >>> the transformation involved and the conditions under which it is safe. >>> Maybe starting along the lines of: >>> >>> We have an expression of the form (T1) sqrtT2 ((T2) exprT3), where exprT3 >>> has type T3 (TREE_TYPE (ARG0)), T2 is the type of the floating-point >>> square root function being used (ITYPE), T1 is TYPE and all these types >>> are binary floating-point types. We wish to optimize if possible to an >>> expression of the form (T1) sqrtT4 ((T4) exprT3), where T4 (NEWTYPE) is >>> narrower than T2. (Then explain the choice of T4 and the conditions under >>> which the transformation is safe, with appropriate references.) >>> >>> I suggest that for the next patch submission you (the patch submitter) >>> make sure you spend at least an hour properly understanding the issues and >>> all the previous messages in the thread and writing up the detailed, >>> coherent explanation of the transformation and analysis of the issues that >>> I asked for some time ago, as if for a reader who hasn't read any of this >>> thread or looked at this transformation in GCC before. I've certainly >>> spent longer than that on review in this thread. It's normal for a patch >>> involving anything at all tricky to take hours to write up (and also >>> normal for one to discover, in the course of writing the detailed coherent >>> analysis for people who aren't familiar with the code and the issues >>> involved, that there's in fact something wrong with the patch and it needs >>> revisiting before submission). >>> >>> -- >>> Joseph S. Myers >>> jos...@codesourcery.com