On 08/20/2018 05:23 PM, Martin Sebor wrote: >> And I think you're hitting on some of issues already raised in the >> thread. >> >> In this specific case though ISTM 4 is the right answer. We casted to a >> char *, so that's what we should be counting. Or am I missing >> something? Also note that's the value we'd get from the strlen C >> library call IIUC. > > It is the right answer. My point is that this is optimized > but the change to c_strlen() prevents the same optimization > in other similar cases. For example, GCC 6 folds this into > memcpy: > > __builtin_strcpy (d, (char*)L"\x12345678"); > > GCC 7 and 8 do too but get the byte count wrong (my bad). > > Current trunk doesn't optimize it. If restoring the original > behavior is the intent (and not just fixing the counting bug) > then c_strlen() should be fixed to fold this again. > > (I'm not a fan of the strcpy to memcpy transformation because > it makes other things difficult but that's beside the point.) I care a whole lot less about optimization of a wide character strcpy than I do about the larger issues around correctness or getting good warnings.
Again, if you want to add these things to the testsuite as well (xfail'd of course), that's fine. > >>> The missing nul variant of the same test case isn't folded (it >>> ends up calling the library strlen) but the bug cannot be >>> detected using the modified c_strlen(): >>> >>> const __WCHAR_TYPE__ wc[1] = L"\x12345678"; // no nul >>> >>> int f (void) >>> { >>> const char *p = (char*)wc; >>> return strlen (p); // not folded >>> } >>> >>> To detect it, I'd have to use some other function, like >>> c_getstr(). I could certainly do that but I don't think >>> I should need to. >> And that's on the list of things we're going to try and address next. I >> don't think it needs to be conflated with change which indicated to >> c_strlen how to count. > > The two are one and the same: if c_strlen() doesn't count bytes > in arrays of wide characters (wchar_t, char16_t, or char32_t) > then the original GCC behavior is not restored and the missing > nul detection won't work for these "mixed type" cases. My understanding is prior to the work in gcc-7/gcc-8 c_strlen always counted in bytes, regardless of the incoming type. That was, IMHO, proper behavior. In retrospect I should have caught the change to have it count elements before it went in. I'm a whole lot less concernd that we're missing an optimization on wide character types than I am about restoring the behavior of c_strlen. Optimizing for wide char types, can wait and when addressed need not do so through c_strlen. > >> I do hope you're getting all these testcases recorded. I'd even support >> installing them into the suite now, properly xfailed. It's easier to >> see how any patch under consideration affects the wider set of tests. > > I haven't been recording any of them -- I have no idea where > this is going. As I've said, these patches prevent some of > the work I have already submitted. That's why I have been > objecting to them, including adding the ELTSIZE argument to > c_strlen(). I don't know where all the ends either. It is certainly a possibility that some of the already submitted work by you or Bernd will need to be reworked and that some my be dropped. It happens to all of us from time to time. The issues that have been raised in these threads are significant and need to be resolved. Jeff