On 07/30/2018 02:21 PM, Bernd Edlinger wrote:
On 07/30/18 21:52, Martin Sebor wrote:
On 07/30/2018 09:24 AM, Bernd Edlinger wrote:
On 07/30/18 01:05, Martin Sebor wrote:
On 07/29/2018 04:56 AM, Bernd Edlinger wrote:
Hi!

This fixes two wrong code bugs where string_constant
returns over length string constants.  Initializers
like that are rejected in C++, but valid in C.

If by valid you are referring to declarations like the one in
the added test:

     const char a[2][3] = { "1234", "xyz" };

then (as I explained), the excess elements in "1234" make
the char[3] initialization and thus the test case undefined.
I have resolved bug 86711 as invalid on those grounds.

Bug 86711 has a valid test case that needs to be fixed, along
with bug 86688 that I raised for the same underlying problem:
considering the excess nul as part of the string.  As has been
discussed in a separate bug, rather than working around
the excessively long strings in the middle-end, it would be
preferable to avoid creating them to begin with.

I'm already working on a fix for bug 86688, in part because
I introduced the code change and also because I'm making other
changes in this area -- bug 86552.  Both of these in response
to your comments.


Sorry, I must admit, I have completely lost track on how many things
you are trying to work in parallel.

Nevertheless I started to review you pr86552 patch here:
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01593.html

But so far you did not respond to me.

Well actually I doubt your patch does apply to trunk,
maybe you start to re-base that one, and post it again
instead?

I read your comments and have been busy working on enhancing
the patch (among other things).  There are a large number of
additional contexts where constant strings are expected and
where a missing nul needs to be detected.  Some include
additional instances of strlen calls that my initial patch
didn't handle, many more others that involve other string
functions.  I have posted an updated patch that applies
cleanly and that handles the first set.

There is also a class of problems involving constant character
arrays initialized by a braced list, as in char [] = { x, y, z };
Those are currently not recognized as strings even if they are
nul-terminated, but they are far more likely to be a source of
these problems than string literals, certainly in C++ where
string initializers must fit in the array.  I am testing a patch
to convert those into STRING_CST so they can be handled as well.

Since initializing arrays with more elements than fit is
undefined in C and since the behavior is undefined at compile
time it seems to me that rejecting such initializers with
a hard error (as opposed to a warning) would be appropriate
and obviate having to deal with them in the middle-end.


We do not want to change what is currently accepted by the
front end. period.

On whose behalf are you making such categorical statements?
It was Jakub and Richard's suggestion in bug 86714 to reject
the undefined excessive initializers and I happen to like
the idea.  I don't recall anyone making a decision about what
"we" do or don't want to change.

That said, if rejecting such initializers is not acceptable
an alternate solution that I believe Richard preferred is to
trim excess elements early on, e.g., during gimplification
(or at some other point after the front-end is done).  That's
okay with me too.


But there is no reason why ambiguous string constants
have to be passed to the middle end.

For instance char c[2] = "a\0"; should look like c[1] = "a";
while c[2] = "aaa"; should look like c[2] = "aa"; varasm.c
will cut the excess precision off anyway.

That is TREE_STRING_LENGTH (str) == 3 and TREE_STRING_POINTER(str) = "aa\0";

I propose to have all STRING_CST always be created by the
FE with explicit nul termination, but the
TYPE_SIZE_UNIT (TREE_TYPE (str)) >= TREE_STRING_LENGTH(str) in normal case 
(null-terminated)
TREE_SIZE_UNIT (TREE_TYPE (str)) < TREE_STRING_LENGTH(str) if non zero 
terminated,
truncated in the initializer.

Do you understand what I mean?

I don't insist on any particular internal representation as long
as it makes it possible to detect and diagnose common bugs, and
(ideally) also help mitigate their worst consequences.

Martin

Reply via email to