On Wed, Aug 22, 2018 at 5:05 PM Martin Sebor <mse...@gmail.com> wrote:
>
> On 08/22/2018 06:48 AM, Richard Biener wrote:
> > On Wed, Aug 22, 2018 at 4:56 AM Martin Sebor <mse...@gmail.com> wrote:
> >>
> >> In the discussion of the fallout from the enhancement for pr71625
> >> it was pointed out that STRING_CSTs in Gimple dumps extend only
> >> to the first nul and don't include any subsequent characters,
> >> and that this makes the dumps harder to read and might give rise
> >> to the question whether the code is correct.
> >>
> >> In the attached patch I enhance the pretty_print_string() function
> >> to print the full contents of a STRING_CST, including any embedded
> >> nuls to make the dumps clearer.  I got rid of the single digit
> >> escapes like '\1' since they make the string look ambiguous.
> >> If TREE_STRING_LENGTH (node) is guaranteed to be non-zero these
> >> days the test for it being so may be redundant but I figured it's
> >> better to be safe than sorry.
> >>
> >> A further enhancement might be to also distinguish the type of
> >> the STRING_CST.
> >
> > And somehow indicate whether it is \0 terminated (just thinking of
> > the GIMPLE FE and how it parses string literals simply by relying
> > on libcpp).
> >
> > Can you write a not \0 terminated string literal in C?
>
> Yes: char a[2] = "12";

I thought they are fully defined in translation phase #1 ...

> I briefly considered making the terminating nul visible when
> there was one but there's at least one test that scans GIMPLE
> for the expected string and it failed so I didn't pursue this
> idea further.

Yes, I think it would just confuse people at the moment.  I thought
of somehow marking non-terminated literals, like "12"[nt] or so ...
or by using an alternate quote "12' (eh).  Anyway, not too important
I guess if you for a moment exclude the -fdump-XXX-gimple and
re-parsing with the GIMPLE FE case.

Richard.

> The tests can certainly be changed to look for the new pattern
> if we should decide to make a change here.  I'm not sure what
> would be best.  Printing "12\x00" for an ordinary nul-terminated
> literal looks like there might be an extra nul after the first
> one.  Leaving it out doesn't distinguish it from the unterminated
> array.  Using the braced-list notation (i.e., a = { '1', '2' }
> for unterminated arrays doesn't capture the fact that it's
> represented as STRING_CST).
>
> I'm open to suggestions.
>
> Martin
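
For illustration only, the trade-off above (printing every byte of a
string constant versus stopping at the first nul) can be sketched in
plain C.  This is not the patch and not GCC's pretty_print_string();
escape_bytes() is a hypothetical stand-alone helper, and its name,
signature and escaping rules are assumptions made for this example.
It walks an explicit length rather than relying on strlen(), and
writes non-printable bytes as two-digit \xNN escapes.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of GCC): copy all LEN bytes of STR into
   OUT, escaping backslash, double quote and non-printable bytes, so that
   embedded and trailing nuls stay visible.  Output is truncated at a
   whole-escape boundary if OUT is too small.  Returns the number of
   characters written, not counting the terminating nul.  */
static size_t
escape_bytes (const char *str, size_t len, char *out, size_t outsize)
{
  size_t n = 0;

  assert (out != NULL && outsize > 0);

  for (size_t i = 0; i < len; i++)
    {
      unsigned char c = (unsigned char) str[i];
      char tmp[5];   /* Longest escape is "\xNN" plus the nul.  */

      if (c == '\\' || c == '"')
        snprintf (tmp, sizeof tmp, "\\%c", c);
      else if (isprint (c))
        snprintf (tmp, sizeof tmp, "%c", c);
      else
        snprintf (tmp, sizeof tmp, "\\x%02x", (unsigned) c);

      size_t tlen = strlen (tmp);
      if (n + tlen >= outsize)
        break;       /* Not enough room for this escape; stop here.  */

      memcpy (out + n, tmp, tlen);
      n += tlen;
    }

  out[n] = '\0';
  return n;
}
```

Called with sizeof on the arrays, this gives 12 for char a[2] = "12"
and 12\x00 for char b[] = "12", which is exactly the distinction (and
the possible "extra nul" confusion) described above.  The output is
meant for reading only: in C source a \x escape consumes all following
hex digits, so text such as \x00cd would not re-parse as the same bytes.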