On 08/18/18 20:01, Martin Sebor wrote: > On 08/17/2018 05:01 PM, Bernd Edlinger wrote: >> On 08/17/18 22:17, Martin Sebor wrote: >>> On 08/17/2018 12:44 PM, Bernd Edlinger wrote: >>>> On 08/17/18 20:23, Martin Sebor wrote: >>>>> On 08/17/2018 06:14 AM, Joseph Myers wrote: >>>>>> On Fri, 17 Aug 2018, Jeff Law wrote: >>>>>> >>>>>>> On 08/16/2018 05:01 PM, Joseph Myers wrote: >>>>>>>> On Thu, 16 Aug 2018, Jeff Law wrote: >>>>>>>> >>>>>>>>> restores previous behavior. The sprintf bits want to count element >>>>>>>>> sized chunks, which for wchars is 4 bytes (that count will then be >>>>>>>> >>>>>>>>> /* Compute the range the argument's length can be in. */ >>>>>>>>> - fmtresult slen = get_string_length (arg); >>>>>>>>> + int count_by = dir.specifier == 'S' || dir.modifier == FMT_LEN_l ? >>>>>>>>> 4 : 1; >>>>>>>> >>>>>>>> I don't see how a hardcoded 4 is correct here. Surely you need to >>>>>>>> example >>>>>>>> wchar_type_node to determine its actual size for this target. >>>>>>> We did kick this around a little. IIRC Martin didn't think that it was >>>>>>> worth handling the 2 byte wchar case. >>>>> >>>>> Sorry, I think we may have miscommunicated -- I didn't think it >>>>> was useful to pass a size of the character type to the function. >>>>> I agree that passing in a hardcoded constant doesn't seem right >>>>> (even if GCC's wchar_t were always 4 bytes wide). >>>>> >>>>> I'm still not sure I see the benefit of passing in the expected >>>>> element size given that the patch causes c_strlen() fail when >>>>> the actual element size doesn't match ELTSIZE. If the caller >>>>> asks for the number of bytes in a const wchar_t array it should >>>>> get back the number bytes. (I could see it fail if the caller >>>>> asked for the number of words in a char array whose size was >>>>> not evenly divisible by wordsize.) >>>>> >>>> >>>> I think in this case c_strlen should use the type which the %S format >>>> uses at runtime, otherwise it will not have anything to do with >>>> the reality. >>> >>> %S is not what I'm talking about. >>> >>> Failing in the case I described (a caller asking for the size >>> in bytes of a constant object whose type is larger) prevents >>> callers that don't care about the type from detecting the actual >>> size of a constant. >>> >>> Specifically for sprintf, it means that the buffer overflow >>> below is no longer diagnosed: >>> >>> struct S { char a[2]; void (*pf)(void); }; >>> >>> void f (struct S *p) >>> { >>> const char *q = (char*)L"\x41424344\x45464748"; >>> >>> sprintf (p->a, "%s", q); >>> } >>> >>> There may be no other analyses that would benefit from this >>> ability today but there easily could be. There certainly >>> are optimizations that depend on c_getstr() returning >>> a pointer to the constant object regardless of its type >>> (memchr being one of them). >>> >> >> Yes, I agree. >> >> Coincidentally that is exactly what in my follow-up patch implements. >> See: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01005.html >> >> If you call c_getstr(x) you get a valid zero-terminated single-byte >> string or nothing. >> >> If you call c_getstr(x, &memsize) you get a pointer to a memory >> of the specified memsize, regardless of the underlying type. >> and whether or not zero terminated. > > Most functions in GCC call c_strlen() at some point to determine > the length of a string. They need to be able to detect the missing > nul -- it would double the amount of processing to have them also > call c_getstr() when c_strlen() fails to see if the failure happens > to be due to a missing nul -- it the overwhelming majority of cases > it won't be. >
Well, yes, if the c_strlen function fails, it would be cheap to have some error code for the failure reason. But that is something for later. Bernd.