On 07/30/2018 02:21 PM, Bernd Edlinger wrote: > On 07/30/18 21:52, Martin Sebor wrote: >> On 07/30/2018 09:24 AM, Bernd Edlinger wrote: >>> On 07/30/18 01:05, Martin Sebor wrote: >>>> On 07/29/2018 04:56 AM, Bernd Edlinger wrote: >>>>> Hi! >>>>> >>>>> This fixes two wrong code bugs where string_constant >>>>> returns over length string constants. Initializers >>>>> like that are rejected in C++, but valid in C. >>>> >>>> If by valid you are referring to declarations like the one in >>>> the added test: >>>> >>>> const char a[2][3] = { "1234", "xyz" }; >>>> >>>> then (as I explained), the excess elements in "1234" make >>>> the char[3] initialization and thus the test case undefined. >>>> I have resolved bug 86711 as invalid on those grounds. >>>> >>>> Bug 86711 has a valid test case that needs to be fixed, along >>>> with bug 86688 that I raised for the same underlying problem: >>>> considering the excess nul as part of the string. As has been >>>> discussed in a separate bug, rather than working around >>>> the excessively long strings in the middle-end, it would be >>>> preferable to avoid creating them to begin with. >>>> >>>> I'm already working on a fix for bug 86688, in part because >>>> I introduced the code change and also because I'm making other >>>> changes in this area -- bug 86552. Both of these in response >>>> to your comments. >>>> >>> >>> Sorry, I must admit, I have completely lost track on how many things >>> you are trying to work in parallel. >>> >>> Nevertheless I started to review you pr86552 patch here: >>> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01593.html >>> >>> But so far you did not respond to me. >>> >>> Well actually I doubt your patch does apply to trunk, >>> maybe you start to re-base that one, and post it again >>> instead? >> >> I read your comments and have been busy working on enhancing >> the patch (among other things). There are a large number of >> additional contexts where constant strings are expected and >> where a missing nul needs to be detected. Some include >> additional instances of strlen calls that my initial patch >> didn't handle, many more others that involve other string >> functions. I have posted an updated patch that applies >> cleanly and that handles the first set. >> >> There is also a class of problems involving constant character >> arrays initialized by a braced list, as in char [] = { x, y, z }; >> Those are currently not recognized as strings even if they are >> nul-terminated, but they are far more likely to be a source of >> these problems than string literals, certainly in C++ where >> string initializers must fit in the array. I am testing a patch >> to convert those into STRING_CST so they can be handled as well. >> >> Since initializing arrays with more elements than fit is >> undefined in C and since the behavior is undefined at compile >> time it seems to me that rejecting such initializers with >> a hard error (as opposed to a warning) would be appropriate >> and obviate having to deal with them in the middle-end. >> > > We do not want to change what is currently accepted by the > front end. period. ?!? I don't follow this. When we can detect an error, we should issue a suitable diagnostic. As is mentioned later in the thread, some diagnostics are considered "pedantic" and are conditional. But I don't see how you get to "We do not want to change what is currently accepted by the front end. period." That seems like far too strong of a statement.
I'm not sure I agree with this being a pedantic error only. See below... > > But there is no reason why ambiguous string constants > have to be passed to the middle end. Agreed -- if we're not outright rejecting, then we should present the middle end with something reasonable. But.... > > For instance char c[2] = "a\0"; should look like c[1] = "a"; > while c[2] = "aaa"; should look like c[2] = "aa"; varasm.c > will cut the excess precision off anyway. I get the first case. The second is less clear. One could argue that c[1] should be NUL or one could argue that c[1] should be 'a'. It's the inability to know what is "more correct" and the security implications of leaving the string unterminated that drives my opinion that this should not be a pedantic error, but instead a hard error. > > That is TREE_STRING_LENGTH (str) == 3 and TREE_STRING_POINTER(str) = "aa\0"; See above. That's going to leave an unterminated string in the resulting code. That has a negative security impact. > > I propose to have all STRING_CST always be created by the > FE with explicit nul termination, but the > TYPE_SIZE_UNIT (TREE_TYPE (str)) >= TREE_STRING_LENGTH(str) in normal case > (null-terminated) > TREE_SIZE_UNIT (TREE_TYPE (str)) < TREE_STRING_LENGTH(str) if non zero > terminated, > truncated in the initializer. > > Do you understand what I mean? I think you're starting from the wrong basis, namely that we shouldn't be issuing a hard error for excess initializers like this when we don't have a clear cut best way to handle it. jeff