On 08/23/18 00:50, Jeff Law wrote: > On 08/22/2018 02:14 AM, Richard Biener wrote: >> On Wed, 22 Aug 2018, Bernd Edlinger wrote: >> >>> On 08/22/18 09:26, Richard Biener wrote: >>>> On Wed, 22 Aug 2018, Bernd Edlinger wrote: >>>> >>>>> On 08/21/18 10:59, Richard Biener wrote: >>>>>> On Tue, 21 Aug 2018, Bernd Edlinger wrote: >>>>>> >>>>>>> gcc -S -O2 -Wall -Wformat-overflow -ftrack-macro-expansion=0 >>>>>>> -fshort-wchar builtin-sprintf-warn-20.c >>>>>>> builtin-sprintf-warn-20.c: In function 'test': >>>>>>> builtin-sprintf-warn-20.c:19:39: warning: hex escape sequence out of >>>>>>> range >>>>>>> 19 | ? (char*)L"\x4142\x4344" : (char*)L"\x41424344\x45464748"; >>>>>>> | ^~~~~~~~~~~~~~~~~~~~~~~ >>>>>>> >>>>>>> Hmm, this test might create some noise on short-wchar targets. >>>>>>> >>>>>>> I would prefer a warning here, about the wrong type of the parameter. >>>>>>> The buffer overflow is only a secondary thing. >>>>>>> >>>>>>> For constant objects like those, the GIMPLE type is still guaranteed to >>>>>>> be reliable, >>>>>>> right? >>>>>> >>>>>> TREE_TYPE of tcc_declaration and tcc_constant trees should more-or-less >>>>>> (minus qualifications not affecting semantics) be those set by >>>>>> frontends. >>>>>> >>>>> >>>>> and in this case: >>>>> >>>>> const union >>>>> { struct { >>>>> wchar_t x[4]; >>>>> }; >>>>> struct { >>>>> char z[8]; >>>>> }; >>>>> } u = {{L"123"}}; >>>>> >>>>> int test() >>>>> { >>>>> return __builtin_strlen(u.z); >>>>> } >>>>> >>>>> >>>>> string_constant works out the initializer for u.x >>>>> which has a different type than u.z >>>> >>>> Yes. That's because it uses ctor-for-folding and friends. It's >>>> a question of the desired semantics of string_constant whether >>>> it should better return NULL_TREE in this case or whether the >>>> caller has to deal with type mismatches. >>>> >>> >>> Yes, absolutely. >>> >>> c_getstr needs to bail out if the string is not zero-terminated >>> within the limits given by the decl, or the string_cst-type or whatever >>> may help. >>> >>> Furthermore I also consider it possible that the byteoffset >>> is not a multiple of eltsize. So fail in that case as well. >>> >>> >>> I am currently boot-strapping a patch for this (pr87053): >>> $ cat u.c >>> const union >>> { struct { >>> char x[4]; >>> char y[4]; >>> }; >>> struct { >>> char z[8]; >>> }; >>> } u = {{"1234", "567"}}; >>> >>> int test() >>> { >>> return __builtin_strlen(u.z); >>> } >>> >>> gets folded to 4. >>> >>> ... but unfortunately it will depend on my pr86714 fix which fixes >>> the mem_size parameter returned from string_constant. >>> >>> Frankly, in the moment I feel like I fell in a deep deep hole. :-O >> >> /me too with > 100 mails in this and related threads unread ;) >> >> I thought Jeff applied the mem_size parameter thing but maybe it >> was something else. *fetches coffee* Ah, Jeff is still "working" >> on it. > The patch that was applied adds a parameter to c_strlen to indicate how > it should count (ie, bytes, 16bit wchars or 32bit wchars). > > There was one hunk that was dependent upon an earlier, more > controversial, patch from Bernd which is still under discussion. That > hunk was twiddled to preserve the overall goal of Bernd's patch which > added the how to count parameter without being dependent upon the older, > more controversial patch. >
Which patch do you mean? > It's all confusing and it'd actually help if folks stopped issuing > updated patches which try to handle more cases or which touch on the > areas of code already being looked at. Otherwise the patches just get > more tangled and we're more likely to end up with developers getting > more entrenched in a particular approach. > > Jeff >