On 07/30/18 17:57, Jakub Jelinek wrote:
> On Mon, Jul 30, 2018 at 03:52:39PM +0000, Joseph Myers wrote:
>> On Mon, 30 Jul 2018, Bernd Edlinger wrote:
>>
>>> In the moment I would already be happy if all STRING_CSTs would
>>> be zero terminated.
>>
>> generic.texi says they need not be.  Making the STRING_CST contain only
>> the bytes of the initializer and not the trailing NUL in the C case where
>> the trailing NUL does not fit in the object initialized would of course
>> mean you get non-NUL-terminated STRING_CSTs for valid C code as well.
>
> One thing is whether TREE_STRING_LENGTH includes the trailing NUL byte,
> that doesn't need to be the case e.g. for the shortened initializers.
> The other thing is whether we as a convenience for the compiler's internals
> want to waste some memory for the NUL termination; I think we could avoid
> some bugs that way.
>

Yes, exactly.  Currently the middle-end tries to determine whether a
STRING_CST is NUL-terminated, but that check is broken for wide
characters; see for instance c_getstr:

  else if (string[string_length - 1] != '\0')
    {
      /* Support only properly NUL-terminated strings but handle
         consecutive strings within the same array, such as the six
         substrings in "1\0002\0003".  */
      return NULL;
    }

It would be much better if every string constant could be zero-terminated.
With always zero-terminated STRING_CSTs, the check for a
non-zero-terminated value becomes much easier:

  compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (init)),
                    TREE_STRING_LENGTH (init))


Bernd.
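
To illustrate the wide-character problem with the byte-wise test quoted from c_getstr, here is a minimal standalone sketch.  It is not GCC code: the helper names, the 4-byte element size and the little-endian byte layout are all assumptions chosen for the example.  It only shows that "last byte is zero" and "last element is zero" can disagree once an element is wider than one byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the quoted byte-wise test:
   returns 1 if the last *byte* of the buffer is zero.  */
static int
last_byte_is_nul (const unsigned char *bytes, size_t nbytes)
{
  return nbytes > 0 && bytes[nbytes - 1] == 0;
}

/* Hypothetical helper: returns 1 only if the last *element*
   (elt_size bytes wide) consists entirely of zero bytes.  */
static int
last_element_is_nul (const unsigned char *bytes, size_t nbytes,
                     size_t elt_size)
{
  if (elt_size == 0 || nbytes < elt_size || nbytes % elt_size != 0)
    return 0;
  for (size_t i = nbytes - elt_size; i < nbytes; i++)
    if (bytes[i] != 0)
      return 0;
  return 1;
}

/* Assumed layout: 4-byte little-endian elements 'a', 'b',
   with no terminating NUL element.  */
static const unsigned char wide_unterminated[] =
  { 0x61, 0, 0, 0,  0x62, 0, 0, 0 };

/* The same two elements followed by an all-zero terminator.  */
static const unsigned char wide_terminated[] =
  { 0x61, 0, 0, 0,  0x62, 0, 0, 0,  0, 0, 0, 0 };

/* Exercises both helpers on the sample data above.  */
static void
check_examples (void)
{
  /* The byte-wise test wrongly reports a terminator here, because
     the high bytes of 'b' are zero.  */
  assert (last_byte_is_nul (wide_unterminated, sizeof wide_unterminated));
  assert (!last_element_is_nul (wide_unterminated,
                                sizeof wide_unterminated, 4));

  /* With a real terminating element both tests agree.  */
  assert (last_byte_is_nul (wide_terminated, sizeof wide_terminated));
  assert (last_element_is_nul (wide_terminated, sizeof wide_terminated, 4));
}
```

Calling check_examples () from a test driver passes all four assertions.  The first one is the false positive: the buffer has no NUL element, yet its last byte is zero.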