On 08/17/18 22:17, Martin Sebor wrote: > On 08/17/2018 12:44 PM, Bernd Edlinger wrote: >> On 08/17/18 20:23, Martin Sebor wrote: >>> On 08/17/2018 06:14 AM, Joseph Myers wrote: >>>> On Fri, 17 Aug 2018, Jeff Law wrote: >>>> >>>>> On 08/16/2018 05:01 PM, Joseph Myers wrote: >>>>>> On Thu, 16 Aug 2018, Jeff Law wrote: >>>>>> >>>>>>> restores previous behavior. The sprintf bits want to count element >>>>>>> sized chunks, which for wchars is 4 bytes (that count will then be >>>>>> >>>>>>> /* Compute the range the argument's length can be in. */ >>>>>>> - fmtresult slen = get_string_length (arg); >>>>>>> + int count_by = dir.specifier == 'S' || dir.modifier == FMT_LEN_l ? 4 >>>>>>> : 1; >>>>>> >>>>>> I don't see how a hardcoded 4 is correct here. Surely you need to >>>>>> example >>>>>> wchar_type_node to determine its actual size for this target. >>>>> We did kick this around a little. IIRC Martin didn't think that it was >>>>> worth handling the 2 byte wchar case. >>> >>> Sorry, I think we may have miscommunicated -- I didn't think it >>> was useful to pass a size of the character type to the function. >>> I agree that passing in a hardcoded constant doesn't seem right >>> (even if GCC's wchar_t were always 4 bytes wide). >>> >>> I'm still not sure I see the benefit of passing in the expected >>> element size given that the patch causes c_strlen() fail when >>> the actual element size doesn't match ELTSIZE. If the caller >>> asks for the number of bytes in a const wchar_t array it should >>> get back the number bytes. (I could see it fail if the caller >>> asked for the number of words in a char array whose size was >>> not evenly divisible by wordsize.) >>> >> >> I think in this case c_strlen should use the type which the %S format >> uses at runtime, otherwise it will not have anything to do with >> the reality. > > %S is not what I'm talking about. > > Failing in the case I described (a caller asking for the size > in bytes of a constant object whose type is larger) prevents > callers that don't care about the type from detecting the actual > size of a constant. > > Specifically for sprintf, it means that the buffer overflow > below is no longer diagnosed: > > struct S { char a[2]; void (*pf)(void); }; > > void f (struct S *p) > { > const char *q = (char*)L"\x41424344\x45464748"; > > sprintf (p->a, "%s", q); > } > > There may be no other analyses that would benefit from this > ability today but there easily could be. There certainly > are optimizations that depend on c_getstr() returning > a pointer to the constant object regardless of its type > (memchr being one of them). >
Yes, I agree. Coincidentally that is exactly what in my follow-up patch implements. See: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01005.html If you call c_getstr(x) you get a valid zero-terminated single-byte string or nothing. If you call c_getstr(x, &memsize) you get a pointer to a memory of the specified memsize, regardless of the underlying type. and whether or not zero terminated. Bernd.