On 8/21/18 8:10 PM, Martin Sebor wrote: > It didn't seem like we were making progress in the debate about > warning for sprintf argument mismatches earlier today so I took > a couple of hours this afternoon to prototype one of the solutions > I was trying to describe. It mostly keeps the existing interface > and just extends c_strlen() and the other functions to pass in > an in-out argument describing the requested element size on input > and the constant string on output. The caller is responsible for > validating the string to make sure its type matches the expected > type. > > String functions like strcpy interested in the size of their > argument in bytes succeed even for a wide string argument and > are candidates for folding (this matches the original behavior). > The patch doesn't add any warning for mismatched calls to those > (such as strcpy(d, (char*)L"123");) but enhancing it to do that > would be just as "simple" as adding the missing nul detection. > > Calls to sprintf "%s" and "%ls" also succeed with mismatched > arguments but get a warning: > > warning: ‘%s’ invalid directive argument type ‘int[4]’ [-Wformat-overflow=] > 3 | __builtin_sprintf (d, "%s", (char*)L"123"); > | ^~ ~~~~~~ > warning: ‘%ls’ invalid directive argument type ‘char[4]’ > [-Wformat-overflow=] > 4 | __builtin_sprintf (d, "%ls", (__WCHAR_TYPE__*)"123"); > | ^~~ ~~~~~ > > FWIW, I chose the approach of adding a c_strlen_data structure > over adding yet another argument to c_strlen() and friends to > keep the argument list from getting too long and confusing > (like get_range_strlen). This should also make it easy to > retrofit the missig nul detection patch on top of it. > > I didn't take any time to add tests for the restored strcpy > (et al.) folding. > > If this looks like the general direction we can agree on (perhaps > with some tweaks, including not folding etc.) I will add those > tests, plus more for the various argument mismatches. > > Tested on x86_64-linux. > > Martin > > gcc-87034.diff > > PR tree-optimization/87034 - missing -Wformat-overflow on a sprintf %s with a > wide string > > gcc/ChangeLog: > > PR tree-optimization/87034 > * builtins.c (c_strlen): Add an argument. Optionally return string > to caller. > * builtins.h (c_strlen): Add an argument. > * gimple-fold.c (get_range_strlen): Replace argument with > c_strlen_data *. > (get_maxval_strlen): Adjust call to get_range_strlen. > (gimple_fold_builtin_strlen): Same. > * gimple-fold.h (c_strlen_data): New struct. > (get_range_strlen): Add optional argument. > * gimple-ssa-sprintf.c (get_string_length): Change argument type. > (format_string): Same. Adjust. > (format_directive): Diagnose incompatible arguments. > > gcc/testsuite/ChangeLog: > > PR tree-optimization/87034 > * gcc.dg/tree-ssa/builtin-sprintf-warn-20.c: Remove xfail. So just now starting to look at this... Sorry.
So hunks of this are no longer necessary as I ended up doing something similar to c_strlen's API. My c_strlen_data is out parameters only, but obviously could be used for an in parameter like ELTSIZE. > > - /* Determine the size of the string element. */ > - if (eltsize != tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (src))))) > - return NULL_TREE; > + unsigned eltsize = 1; > + if (pdata) > + { > + eltsize = pdata->eltsize; > > + tree srctype = TREE_TYPE (TREE_TYPE (src)); > + if (eltsize != tree_to_uhwi (TYPE_SIZE_UNIT (srctype))) > + pdata->string = src; > + } > + So if I'm reading this hunk correctly, we'll no longer return NULL_TREE when presented with an ELTSIZE that is not equal to the TYPE_SIZE_UNIT. That seems wrong. Conceptually the return value from c_strlen should be either the length as it would be computed by strlen or NULL_TREE indicating it's a case we can't handle (lack of NUL termination, mismatches on sizes of the elements, etc). In those cases where we return NULL_TREE, callers can dig into the c_strlen_data to get additional information for diagnostics and such. We don't currently provide the actual string within c_strlen_data, but we easily could. I think that's the biggest thing you need IIUC your patch correctly. > Index: gcc/gimple-ssa-sprintf.c > =================================================================== > --- gcc/gimple-ssa-sprintf.c (revision 263754) > +++ gcc/gimple-ssa-sprintf.c (working copy) [ ... ] There will be some textual conflicts with your #4/6, but nothing that shouldn't be easily resolvable. > @@ -2153,9 +2157,21 @@ format_string (const directive &dir, tree arg, vr_ > { > fmtresult res; > > - /* Compute the range the argument's length can be in. */ > - int count_by = dir.specifier == 'S' || dir.modifier == FMT_LEN_l ? 4 : 1; > - fmtresult slen = get_string_length (arg, count_by); > + /* Compute the range the argument's length can be in. On input > + to get_string_length set ELT to the size of the expected argument > + type; on output, if successful, expect the function to set ELT to > + the type of the actual argument. */ > + c_strlen_data data; > + data.eltsize = dir.specifier == 'S' || dir.modifier == FMT_LEN_l ? 4 : 1; > + data.string = NULL_TREE; > + fmtresult slen = get_string_length (arg, &data); > + if (data.string) > + { > + tree strtype = TREE_TYPE (data.string); > + if (tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (strtype))) != > data.eltsize) > + res.badarg = data.string; > + } > + This looks like the meat of what you're doing. There'll be slight conflicts when I fix the pending AIX issue (2 byte wide chars), but those will be trivially fixable I think. Essentially it looks like you always want to get the string back so you can query its type. I'd think we could bubble that up pretty easily. So overall I've got no problems with the direction here. It just needs some updating for the current trunk. jeff