On Tue, 21 Aug 2018, Bernd Edlinger wrote: > gcc -S -O2 -Wall -Wformat-overflow -ftrack-macro-expansion=0 -fshort-wchar > builtin-sprintf-warn-20.c > builtin-sprintf-warn-20.c: In function 'test': > builtin-sprintf-warn-20.c:19:39: warning: hex escape sequence out of range > 19 | ? (char*)L"\x4142\x4344" : (char*)L"\x41424344\x45464748"; > | ^~~~~~~~~~~~~~~~~~~~~~~ > > Hmm, this test might create some noise on short-wchar targets. > > I would prefer a warning here, about the wrong type of the parameter. > The buffer overflow is only a secondary thing. > > For constant objects like those, the GIMPLE type is still guaranteed to be > reliable, > right?
TREE_TYPE of tcc_declaration and tcc_constant trees should more-or-less (minus qualifications not affecting semantics) be those set by frontends. Richard. > > Bernd. > > >>> I would just like the ability to get the length/size somehow > >>> so that the questionable code that started us down this path > >>> can be detected. This is not just the sprintf(d, "%s", L"1") > >>> kind of a mismatch but also the missing nul detection in cases > >>> like: > >> [ ... ] > >> I know. And ideally we'll be able to handle everything you want to. > >> But we also have to realize that sometimes we may have to punt. > >> > >> Also remember that incremental progress is (usually) good. > >> > >> > >>> > >>> const wchar_t a[1] = L"\xffffffff"; > >>> strcpy(d, (char*)a); > >> This touches on both the representational issue (excess elements in the > >> initializer) and how to handle a non-terminated string. Both are issues > >> we're debating right now. > >> > >>> > >>> I don't think this is necessarily an important use case to > >>> optimize for but it is one that GCC optimizes already nonetheless, > >>> and always has. For example, this has always been folded into 4 > >>> by GCC and still is even with the patch: > >>> > >>> const __WCHAR_TYPE__ wc[] = L"\x12345678"; > >>> > >>> int f (void) > >>> { > >>> const char *p = (char*)wc; > >>> return strlen (p); // folded to 4 > >>> } > >>> > >>> It's optimized because fold_const_call() relies on c_getstr() > >>> which returns the underlying byte representation of the wide > >>> string, and despite c_strlen() now trying to prevent it. > >> And I think you're hitting on some of issues already raised in the thread. > >> > >> In this specific case though ISTM 4 is the right answer. We casted to a > >> char *, so that's what we should be counting. Or am I missing > >> something? Also note that's the value we'd get from the strlen C > >> library call IIUC. > > > > It is the right answer. My point is that this is optimized > > but the change to c_strlen() prevents the same optimization > > in other similar cases. For example, GCC 6 folds this into > > memcpy: > > > > __builtin_strcpy (d, (char*)L"\x12345678"); > > > > GCC 7 and 8 do too but get the byte count wrong (my bad). > > > > Current trunk doesn't optimize it. If restoring the original > > behavior is the intent (and not just fixing the counting bug) > > then c_strlen() should be fixed to fold this again. > > > > (I'm not a fan of the strcpy to memcpy transformation because > > it makes other things difficult but that's beside the point.) > > > >>> The missing nul variant of the same test case isn't folded (it > >>> ends up calling the library strlen) but the bug cannot be > >>> detected using the modified c_strlen(): > >>> > >>> const __WCHAR_TYPE__ wc[1] = L"\x12345678"; // no nul > >>> > >>> int f (void) > >>> { > >>> const char *p = (char*)wc; > >>> return strlen (p); // not folded > >>> } > >>> > >>> To detect it, I'd have to use some other function, like > >>> c_getstr(). I could certainly do that but I don't think > >>> I should need to. > >> And that's on the list of things we're going to try and address next. I > >> don't think it needs to be conflated with change which indicated to > >> c_strlen how to count. > > > > The two are one and the same: if c_strlen() doesn't count bytes > > in arrays of wide characters (wchar_t, char16_t, or char32_t) > > then the original GCC behavior is not restored and the missing > > nul detection won't work for these "mixed type" cases. > > > >> I do hope you're getting all these testcases recorded. I'd even support > >> installing them into the suite now, properly xfailed. It's easier to > >> see how any patch under consideration affects the wider set of tests. > > > > I haven't been recording any of them -- I have no idea where > > this is going. As I've said, these patches prevent some of > > the work I have already submitted. That's why I have been > > objecting to them, including adding the ELTSIZE argument to > > c_strlen(). > > > > Martin > > -- Richard Biener <rguent...@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)