On Wed, Aug 22, 2018 at 5:05 PM Martin Sebor <mse...@gmail.com> wrote:
>
> On 08/22/2018 06:48 AM, Richard Biener wrote:
> > On Wed, Aug 22, 2018 at 4:56 AM Martin Sebor <mse...@gmail.com> wrote:
> >>
> >> In the discussion of the fallout from the enhancement for pr71625
> >> it was pointed out that STRING_CSTs in Gimple dumps extend only
> >> to the first nul and don't include any subsequent characters,
> >> and that this makes the dumps harder to read and might give rise
> >> to the question whether the code is correct.
> >>
> >> In the attached patch I enhance the pretty_print_string() function
> >> to print the full contents of a STRING_CST, including any embedded
> >> nuls to make the dumps clearer.  I got rid of the single digit
> >> escapes like '\1' since they make the string look ambiguous.
> >> If TREE_STRING_LENGTH (node) is guaranteed to be non-zero these
> >> days the test for it being so may be redundant but I figured it's
> >> better to be safe than sorry.
> >>
> >> A further enhancement might be to also distinguish the type of
> >> the STRING_CST.
> >
> > And somehow indicate whether it is \0 terminated (just thinking of
> > the GIMPLE FE and how it parses string literals simply by relying
> > on libcpp).
> >
> > Can you write a not \0 terminated string literal in C?
>
> Yes: char a[2] = "12";

I thought they are fully defined in translation phase #1 ...
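
For reference, a minimal C sketch of the case Martin cites. It assumes a C11 compiler (for static_assert) and is valid in C only; C++ rejects the first initializer. The array names are made up for illustration.

```c
#include <assert.h>

/* In C, a char array may be initialized by a string literal whose
   characters exactly fill it; the literal's terminating nul is then
   not stored.  Some compilers may warn about this initializer.  */
static const char unterminated[2] = "12";   /* holds '1', '2' only */

/* With the bound omitted, the array also gets the terminating nul.  */
static const char terminated[] = "12";      /* holds '1', '2', '\0' */

static_assert (sizeof unterminated == 2, "no room for the nul");
static_assert (sizeof terminated == 3, "nul is included");
```

Both objects are initialized from the same spelling, "12", which is why a dump that prints only the characters up to the first nul cannot tell them apart.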

> I briefly considered making the terminating nul visible when
> there was one but there's at least one test that scans GIMPLE
> for the expected string and it failed so I didn't pursue this
> idea further.

Yes, I think it would just confuse people at the moment.  I thought
of somehow marking non-terminated literals, like  "12"[nt] or so ...
or by using an alternate quote "12' (eh).

Anyway, not too important I guess if you for a moment exclude
the -fdump-XXX-gimple and re-parsing with the GIMPLE FE case.

Richard.

> The tests can certainly be changed to look for the new pattern
> if we should decide to make a change here.  I'm not sure what
> would be best.  Printing "12\x00" for an ordinary nul-terminated
> literal looks like there might be an extra nul after the first
> one.  Leaving it out doesn't distinguish it from the unterminated
> array.  Using the braced-list notation (i.e., a = { '1', '2' })
> for unterminated arrays doesn't capture the fact that it's
> represented as a STRING_CST.
>
> I'm open to suggestions.
>
> Martin
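
The trade-off Martin describes can be sketched with a small stand-alone formatter. This is not GCC's pretty_print_string(); format_counted_string is a hypothetical helper that assumes the caller passes the byte count explicitly (as TREE_STRING_LENGTH would supply it) and escapes every non-printable byte as \xNN.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: format LEN bytes of S into OUT
   as a quoted string, escaping non-printable bytes as \xNN.  Output is
   truncated if OUT is too small.  Returns the number of characters
   written, excluding the terminating nul.  */
static size_t
format_counted_string (const char *s, size_t len, char *out, size_t outsz)
{
  size_t n = 0;
  assert (out != NULL && outsz >= 3);   /* room for "" and the nul */
  out[n++] = '"';
  for (size_t i = 0; i < len; i++)
    {
      unsigned char c = (unsigned char) s[i];
      /* Worst case: 4 characters for this byte, plus the closing
         quote and the terminating nul.  */
      if (n + 6 > outsz)
        break;
      if (c == '"' || c == '\\')
        {
          out[n++] = '\\';
          out[n++] = (char) c;
        }
      else if (isprint (c))
        out[n++] = (char) c;
      else
        n += (size_t) snprintf (out + n, outsz - n, "\\x%02x", c);
    }
  out[n++] = '"';
  out[n] = '\0';
  return n;
}
```

Called with a length of 2 on the bytes of "12" this yields "12", and with a length of 3 (terminator included) it yields "12\x00". That makes the terminated and unterminated forms distinguishable, at the cost of the trailing escape on every ordinary literal that the message above is wary of.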
