Any comment or more suggestions on this patch?

thanks,
Cong

On Mon, Sep 9, 2013 at 7:28 PM, Cong Hou <co...@google.com> wrote:
> On Mon, Sep 9, 2013 at 6:26 PM, Xinliang David Li <davi...@google.com> wrote:
>> On Fri, Sep 6, 2013 at 3:24 PM, Cong Hou <co...@google.com> wrote:
>>> First, thank you for your detailed comments again! Then I deeply
>>> apologize for not explaining my patch properly and responding to your
>>> previous comment. I didn't understand thoroughly the problem before
>>> submitting the patch.
>>>
>>> Previously I only considered the following three conversions for sqrt():
>>>
>>>
>>> 1: (float) sqrt ((double) float_val)  ->  sqrtf (float_val)
>>> 2: (float) sqrtl ((long double) float_val)  ->  sqrtf (float_val)
>>> 3: (double) sqrtl ((long double) double_val)  ->  sqrt (double_val)
>>>
>>>
>>> We have four types here:
>>>
>>> TYPE is the type to which the result of the function call is converted.
>>> ITYPE is the type of the math call expression.
>>> TREE_TYPE(arg0) is the type of the function argument (before type 
>>> conversion).
>>> NEWTYPE is chosen from TYPE and TREE_TYPE(arg0) with higher precision.
>>> It will be the type of the new math call expression after conversion.
>>>
>>> For all three cases above, TYPE is always the same as NEWTYPE. That is
>>> why I only considered TYPE during the precision comparison. ITYPE can
>>> only be double_type_node or long_double_type_node depending on the
>>> type of the math function. That is why I explicitly used those two
>>> types instead of ITYPE (no correctness issue). But you are right,
>>> ITYPE is more elegant and better here.
>>>
>>> After further analysis, I found I missed two more cases. Note that we
>>> have the following conditions according to the code in convert.c:
>>>
>>> TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TYPE)
>>> TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TREE_TYPE(arg0))
>>> TYPE_PRECISION (NEWTYPE) < TYPE_PRECISION (ITYPE)
>>>
>>> the last condition comes from the fact that we only consider
>>> converting a math function call into another one with narrower type.
>>> Therefore we have
>>>
>>> TYPE_PRECISION(TYPE) < TYPE_PRECISION (ITYPE)
>>> TYPE_PRECISION(TREE_TYPE(arg0)) < TYPE_PRECISION (ITYPE)
>>>
>>> So for sqrt(), TYPE and TREE_TYPE(arg0) can only be float, and for
>>> sqrtl(), TYPE and TREE_TYPE(arg0) can be either float or double with
>>> four possible combinations. Therefore we have two more conversions to
>>> consider besides the three ones I mentioned above:
>>>
>>>
>>> 4: (float) sqrtl ((long double) double_val)  ->  (float) sqrt (double_val)
>>> 5: (double) sqrtl ((long double) float_val)  ->  sqrt ((double) float_val)
>>>
>>>
>>> For the first conversion here, TYPE (float) is different from NEWTYPE
>>> (double), and my previous patch doesn't handle this case.The correct
>>> way is to compare precisions of ITYPE and NEWTYPE now.
>>>
>>> To sum up, we are converting the expression
>>>
>>> (TYPE) sqrtITYPE ((ITYPE) expr)
>>>
>>> to
>>>
>>> (TYPE) sqrtNEWTYPE ((NEWTYPE) expr)
>>>
>>> and we require
>>>
>>> PRECISION (ITYPE) >= PRECISION (NEWTYPE) * 2 + 2
>>>
>>> to make it a safe conversion.
>>>
>>>
>>> The new patch is pasted below.
>>>
>>> I appreciate your detailed comments and analysis, and next time when I
>>> submit a patch I will be more carefully about the reviewer's comment.
>>>
>>>
>>> Thank you!
>>>
>>> Cong
>>>
>>>
>>>
>>> Index: gcc/convert.c
>>> ===================================================================
>>> --- gcc/convert.c (revision 201891)
>>> +++ gcc/convert.c (working copy)
>>> @@ -135,16 +135,19 @@ convert_to_real (tree type, tree expr)
>>>    CASE_MATHFN (COS)
>>>    CASE_MATHFN (ERF)
>>>    CASE_MATHFN (ERFC)
>>> -  CASE_MATHFN (FABS)
>>>    CASE_MATHFN (LOG)
>>>    CASE_MATHFN (LOG10)
>>>    CASE_MATHFN (LOG2)
>>>    CASE_MATHFN (LOG1P)
>>> -  CASE_MATHFN (LOGB)
>>>    CASE_MATHFN (SIN)
>>> -  CASE_MATHFN (SQRT)
>>>    CASE_MATHFN (TAN)
>>>    CASE_MATHFN (TANH)
>>> +    /* The above functions are not safe to do this conversion. */
>>> +    if (!flag_unsafe_math_optimizations)
>>> +      break;
>>> +  CASE_MATHFN (SQRT)
>>> +  CASE_MATHFN (FABS)
>>> +  CASE_MATHFN (LOGB)
>>>  #undef CASE_MATHFN
>>>      {
>>>        tree arg0 = strip_float_extensions (CALL_EXPR_ARG (expr, 0));
>>> @@ -155,6 +158,27 @@ convert_to_real (tree type, tree expr)
>>>        if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (type))
>>>   newtype = TREE_TYPE (arg0);
>>>
>>> +      /* We consider to convert
>>> +
>>> +     (T1) sqrtT2 ((T2) exprT3)
>>> + to
>>> +     (T1) sqrtT4 ((T4) exprT3)
>>
>> Should this be
>>
>>           (T4) sqrtT4 ((T4) exprT3)
>
> T4 is not necessarily the same as T1. For the conversion:
>
>  (float) sqrtl ((long double) double_val)  ->  (float) sqrt (double_val)
>
> T4 is double and T1 is float.
>
>
>>> +
>>> +  , where T1 is TYPE, T2 is ITYPE, T3 is TREE_TYPE (ARG0),
>>> + and T4 is NEWTYPE.
>>
>> NEWTYPE is also the wider one between T1 and T3.
>
> Right. Actually this is easy to catch from the context just before
> this comment.
>
> tree newtype = type;
> if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (type))
>     newtype = TREE_TYPE (arg0);
>
>
>
> thanks,
> Cong
>
>
>>
>> David
>>
>>> All those types are of floating point types.
>>> + T4 (NEWTYPE) should be narrower than T2 (ITYPE). This conversion
>>> + is safe only if P1 >= P2*2+2, where P1 and P2 are precisions of
>>> + T2 and T4. See the following URL for a reference:
>>> + 
>>> http://stackoverflow.com/questions/9235456/determining-floating-point-square-root
>>> + */
>>> +      if (fcode == BUILT_IN_SQRT || fcode == BUILT_IN_SQRTL)
>>> + {
>>> +  int p1 = REAL_MODE_FORMAT (TYPE_MODE (itype))->p;
>>> +  int p2 = REAL_MODE_FORMAT (TYPE_MODE (newtype))->p;
>>> +  if (p1 < p2 * 2 + 2 && !flag_unsafe_math_optimizations)
>>> +    break;
>>> + }
>>> +
>>>        /* Be careful about integer to fp conversions.
>>>   These may overflow still.  */
>>>        if (FLOAT_TYPE_P (TREE_TYPE (arg0))
>>> Index: gcc/testsuite/gcc.c-torture/execute/20030125-1.c
>>> ===================================================================
>>> --- gcc/testsuite/gcc.c-torture/execute/20030125-1.c (revision 201891)
>>> +++ gcc/testsuite/gcc.c-torture/execute/20030125-1.c (working copy)
>>> @@ -44,11 +44,11 @@ __attribute__ ((noinline))
>>>  double
>>>  sin(double a)
>>>  {
>>> - abort ();
>>> + return a;
>>>  }
>>>  __attribute__ ((noinline))
>>>  float
>>>  sinf(float a)
>>>  {
>>> - return a;
>>> + abort ();
>>>  }
>>>
>>> On Wed, Sep 4, 2013 at 3:26 PM, Joseph S. Myers <jos...@codesourcery.com> 
>>> wrote:
>>>> On Wed, 4 Sep 2013, Xinliang David Li wrote:
>>>>
>>>>> > This patch submission still fails to pay attention to various of my
>>>>> > comments.
>>>>> >
>>>>>
>>>>> If you can provide inlined comments in the patch, that will be more
>>>>> useful and productive.
>>>>
>>>> I have explained things several times in this thread already.  I see no
>>>> point in repeating things when what I would say has already been said and
>>>> ignored.  As far as I can tell, this latest patch submission has taken one
>>>> line from the message it is in response to, and largely ignored the
>>>> following two paragraphs (including where I explicitly say that said line
>>>> should not appear literally in the source code at all).  But, repeating
>>>> what I said before yet again:
>>>>
>>>>   (but you should be referring to the relevant types
>>>>
>>>> The patch does not do properly that.  It refers explicitly to
>>>> long_double_type_node and double_type_node.
>>>>
>>>>   - "type", the type
>>>>   being converted to, "itype", the type of the function being called in the
>>>>   source code, "TREE_TYPE (arg0)", the type of the argument after 
>>>> extensions
>>>>   have been removed, and "newtype", computed from those
>>>>
>>>> The patch only engages with "type".  I suspect "newtype" is what it really
>>>> means there when using "type".  When it uses long_double_type_node and
>>>> double_type_node, those should be "itype".
>>>>
>>>>   - so you should have
>>>>   expressions like the above with two or more of those four types, but not
>>>>   with long_double_type_node directly).
>>>>
>>>> See above.  The patch uses long_double_type_node directly, contrary to
>>>> what I explicitly said.  You are free to disagree with something I say in
>>>> a review - but in that case you need to reply specifically to the review
>>>> comment explaining your rationale for disagreeing with it.  Just ignoring
>>>> the comment and not mentioning the disagreement wastes the time of
>>>> reviewers.
>>>>
>>>>   The patch submission will need to include a proper analysis to justify to
>>>>   the reader why the particular inequality with particular types from those
>>>>   four is correct in all cases where the relevant code may be executed.
>>>>
>>>> The comments only refer to "T1" and "T2" without explaining how they
>>>> relate to the original expression (three types) or the variables within
>>>> GCC dealing with various types (four types, because newtype gets
>>>> involved).  I said back in
>>>> <http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00161.html> and
>>>> <http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01384.html> that three types
>>>> are involved - when I say "the patch submission needs to include its own
>>>> analysis for the full generality of three types", again, it's
>>>> inappropriate for a patch to omit such an analysis without justification.
>>>> The patch submission needs to include an analysis that properly explains
>>>> the transformation involved and the conditions under which it is safe.
>>>> Maybe starting along the lines of:
>>>>
>>>> We have an expression of the form (T1) sqrtT2 ((T2) exprT3), where exprT3
>>>> has type T3 (TREE_TYPE (ARG0)), T2 is the type of the floating-point
>>>> square root function being used (ITYPE), T1 is TYPE and all these types
>>>> are binary floating-point types.  We wish to optimize if possible to an
>>>> expression of the form (T1) sqrtT4 ((T4) exprT3), where T4 (NEWTYPE) is
>>>> narrower than T2.  (Then explain the choice of T4 and the conditions under
>>>> which the transformation is safe, with appropriate references.)
>>>>
>>>> I suggest that for the next patch submission you (the patch submitter)
>>>> make sure you spend at least an hour properly understanding the issues and
>>>> all the previous messages in the thread and writing up the detailed,
>>>> coherent explanation of the transformation and analysis of the issues that
>>>> I asked for some time ago, as if for a reader who hasn't read any of this
>>>> thread or looked at this transformation in GCC before.  I've certainly
>>>> spent longer than that on review in this thread.  It's normal for a patch
>>>> involving anything at all tricky to take hours to write up (and also
>>>> normal for one to discover, in the course of writing the detailed coherent
>>>> analysis for people who aren't familiar with the code and the issues
>>>> involved, that there's in fact something wrong with the patch and it needs
>>>> revisiting before submission).
>>>>
>>>> --
>>>> Joseph S. Myers
>>>> jos...@codesourcery.com

Reply via email to