On 08/22/2018 02:14 AM, Richard Biener wrote: > On Wed, 22 Aug 2018, Bernd Edlinger wrote: > >> On 08/22/18 09:26, Richard Biener wrote: >>> On Wed, 22 Aug 2018, Bernd Edlinger wrote: >>> >>>> On 08/21/18 10:59, Richard Biener wrote: >>>>> On Tue, 21 Aug 2018, Bernd Edlinger wrote: >>>>> >>>>>> gcc -S -O2 -Wall -Wformat-overflow -ftrack-macro-expansion=0 >>>>>> -fshort-wchar builtin-sprintf-warn-20.c >>>>>> builtin-sprintf-warn-20.c: In function 'test': >>>>>> builtin-sprintf-warn-20.c:19:39: warning: hex escape sequence out of >>>>>> range >>>>>> 19 | ? (char*)L"\x4142\x4344" : (char*)L"\x41424344\x45464748"; >>>>>> | ^~~~~~~~~~~~~~~~~~~~~~~ >>>>>> >>>>>> Hmm, this test might create some noise on short-wchar targets. >>>>>> >>>>>> I would prefer a warning here, about the wrong type of the parameter. >>>>>> The buffer overflow is only a secondary thing. >>>>>> >>>>>> For constant objects like those, the GIMPLE type is still guaranteed to >>>>>> be reliable, >>>>>> right? >>>>> >>>>> TREE_TYPE of tcc_declaration and tcc_constant trees should more-or-less >>>>> (minus qualifications not affecting semantics) be those set by >>>>> frontends. >>>>> >>>> >>>> and in this case: >>>> >>>> const union >>>> { struct { >>>> wchar_t x[4]; >>>> }; >>>> struct { >>>> char z[8]; >>>> }; >>>> } u = {{L"123"}}; >>>> >>>> int test() >>>> { >>>> return __builtin_strlen(u.z); >>>> } >>>> >>>> >>>> string_constant works out the initializer for u.x >>>> which has a different type than u.z >>> >>> Yes. That's because it uses ctor-for-folding and friends. It's >>> a question of the desired semantics of string_constant whether >>> it should better return NULL_TREE in this case or whether the >>> caller has to deal with type mismatches. >>> >> >> Yes, absolutely. >> >> c_getstr needs to bail out if the string is not zero-terminated >> within the limits given by the decl, or the string_cst-type or whatever >> may help. >> >> Furthermore I also consider it possible that the byteoffset >> is not a multiple of eltsize. So fail in that case as well. >> >> >> I am currently boot-strapping a patch for this (pr87053): >> $ cat u.c >> const union >> { struct { >> char x[4]; >> char y[4]; >> }; >> struct { >> char z[8]; >> }; >> } u = {{"1234", "567"}}; >> >> int test() >> { >> return __builtin_strlen(u.z); >> } >> >> gets folded to 4. >> >> ... but unfortunately it will depend on my pr86714 fix which fixes >> the mem_size parameter returned from string_constant. >> >> Frankly, in the moment I feel like I fell in a deep deep hole. :-O > > /me too with > 100 mails in this and related threads unread ;) > > I thought Jeff applied the mem_size parameter thing but maybe it > was something else. *fetches coffee* Ah, Jeff is still "working" > on it. The patch that was applied adds a parameter to c_strlen to indicate how it should count (ie, bytes, 16bit wchars or 32bit wchars).
There was one hunk that was dependent upon an earlier, more controversial, patch from Bernd which is still under discussion. That hunk was twiddled to preserve the overall goal of Bernd's patch which added the how to count parameter without being dependent upon the older, more controversial patch. It's all confusing and it'd actually help if folks stopped issuing updated patches which try to handle more cases or which touch on the areas of code already being looked at. Otherwise the patches just get more tangled and we're more likely to end up with developers getting more entrenched in a particular approach. Jeff