Re: [PATCH] Add a character size parameter to c_strlen/get_range_strlen

Jeff Law Tue, 21 Aug 2018 08:25:43 -0700

On 08/20/2018 05:23 PM, Martin Sebor wrote:
>> And I think you're hitting on some of issues already raised in the
>> thread.
>>
>> In this specific case though ISTM 4 is the right answer.  We casted to a
>> char *, so that's what we should be counting.  Or am I missing
>> something?  Also note that's the value we'd get from the strlen C
>> library call IIUC.
> 
> It is the right answer.  My point is that this is optimized
> but the change to c_strlen() prevents the same optimization
> in other similar cases.  For example, GCC 6 folds this into
> memcpy:
> 
>   __builtin_strcpy (d, (char*)L"\x12345678");
> 
> GCC 7 and 8 do too but get the byte count wrong (my bad).
> 
> Current trunk doesn't optimize it.  If restoring the original
> behavior is the intent (and not just fixing the counting bug)
> then c_strlen() should be fixed to fold this again.
> 
> (I'm not a fan of the strcpy to memcpy transformation because
> it makes other things difficult but that's beside the point.)
I care a whole lot less about optimization of a wide character strcpy
than I do about the larger issues around correctness or getting good
warnings.


Again, if you want to add these things to the testsuite as well (xfail'd
of course), that's fine.


> 
>>> The missing nul variant of the same test case isn't folded (it
>>> ends up calling the library strlen) but the bug cannot be
>>> detected using the modified c_strlen():
>>>
>>>   const __WCHAR_TYPE__ wc[1] = L"\x12345678";   // no nul
>>>
>>>   int f (void)
>>>   {
>>>     const char *p = (char*)wc;
>>>     return strlen (p);            // not folded
>>>   }
>>>
>>> To detect it, I'd have to use some other function, like
>>> c_getstr().  I could certainly do that but I don't think
>>> I should need to.
>> And that's on the list of things we're going to try and address next.  I
>> don't think it needs to be conflated with change which indicated to
>> c_strlen how to count.
> 
> The two are one and the same: if c_strlen() doesn't count bytes
> in arrays of wide characters (wchar_t, char16_t, or char32_t)
> then the original GCC behavior is not restored and the missing
> nul detection won't work for these "mixed type" cases.
My understanding is prior to the work in gcc-7/gcc-8 c_strlen always
counted in bytes, regardless of the incoming type.  That was, IMHO,
proper behavior.  In retrospect I should have caught the change to have
it count elements before it went in.

I'm a whole lot less concernd that we're missing an optimization on wide
character types than I am about restoring the behavior of c_strlen.

Optimizing for wide char types, can wait and when addressed need not do
so through c_strlen.


> 
>> I do hope you're getting all these testcases recorded.  I'd even support
>> installing them into the suite now, properly xfailed.  It's easier to
>> see how any patch under consideration affects the wider set of tests.
> 
> I haven't been recording any of them --  I have no idea where
> this is going.  As I've said, these patches prevent some of
> the work I have already submitted.  That's why I have been
> objecting to them, including adding the ELTSIZE argument to
> c_strlen().
I don't know where all the ends either.  It is certainly a possibility
that some of the already submitted work by you or Bernd will need to be
reworked and that some my be dropped.  It happens to all of us from time
to time.

The issues that have been raised in these threads are significant and
need to be resolved.

Jeff

Re: [PATCH] Add a character size parameter to c_strlen/get_range_strlen

Reply via email to