Manuel López-Ibáñez wrote:
> * Fortran devs: Is this approach acceptable? The main idea is to have
> an output_buffer called pp_warning_buffer with the flush_p bit unset
> if we are buffering. When printing buffered warnings, use this
> output_buffer in the global_dc->printer instead of the (unbuffered
> one) used by the *_now variants. In principle this could support
> several buffered diagnostics, but Fortran only seems to buffer at most
> one.
I think the approach is fine. As the _now version overrides the buffer,
one could even make do with a single buffer: clear it, temporarily set
flush_p to true, and print the message. That could only collide with
buffered warnings (for error_now) and buffered errors (for warning_now),
and I cannot tell whether that is an issue. For buffered warnings with
error_now it probably isn't; for buffered errors with warning_now it
might be. Thus, having two buffers is probably better.
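
To make the two-buffer idea concrete, here is a minimal sketch in plain C.
It is not the GCC diagnostics machinery: struct sketch_buffer, sketch_emit
and the other names are hypothetical, and only the flush_p notion is taken
from the discussion above. It shows one buffer that holds messages back and
a second one that prints at once, so that a _now message cannot clobber a
held one.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an output buffer; only the flush_p idea
   comes from the discussion above.  */
struct sketch_buffer
{
  char text[128];  /* pending message, if any */
  bool flush_p;    /* true: print at once; false: hold the text back */
};

/* Everything that has been "printed" so far, kept in memory so the
   effect is easy to inspect.  */
char sketch_screen[512];

void
sketch_print (const char *msg)
{
  strncat (sketch_screen, msg,
           sizeof sketch_screen - strlen (sketch_screen) - 1);
}

/* Route MSG through BUF: print it or hold it, depending on flush_p.  */
void
sketch_emit (struct sketch_buffer *buf, const char *msg)
{
  if (buf->flush_p)
    sketch_print (msg);
  else
    snprintf (buf->text, sizeof buf->text, "%s", msg);
}

/* Drop a held message without printing it.  */
void
sketch_discard (struct sketch_buffer *buf)
{
  buf->text[0] = '\0';
}

/* Print a held message, then empty the buffer.  */
void
sketch_flush (struct sketch_buffer *buf)
{
  sketch_print (buf->text);
  sketch_discard (buf);
}
```

With one buffer created with flush_p false and a second one with flush_p
true, a message emitted through the second appears immediately while the
message held in the first stays intact until it is flushed or discarded.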
> The ugliest part is how to handle warningcount and werrorcount. I
> could handle this in the common machinery in a better way by storing
> DK_WERROR in the diagnostic->kind and checking it after printing.
I'm missing the "but". Does it require more modifications in common code,
and is that why you shy away from it? Or is there another reason?
> I can also hide the output_buffer switching inside two helper
> functions, but the helper function would need to use either a static
> variable or a global one to save and restore the tmp_buffer. I'm not
> sure that is better or worse (the current code uses a global pointer
> &cur_error_buffer, so perhaps I should have used a similar approach).
Me neither. The current approach is rather localized in error.c; thus,
it doesn't really matter.
> * Fortran devs #2: The testsuite is testing that the warning is
> eventually printed. However, I'm not sure it is testing when the
> warning is buffered and then discarded, is it? If not, how can I
> produce such a test?
Well, for nearly every Fortran program, the parser has to make multiple
attempts for at least one line. Thus, if you set a breakpoint
in gfc_error, you will see "syntax error" messages which never make it
to the screen. As the test suite checks for excess errors, nearly every
test-suite file tests this.
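
The buffer-then-discard pattern can be sketched roughly as follows. This is
a toy model, not gfortran's parser: the matchers, buffer_error and the
counters are all hypothetical. Each failed attempt buffers a message, the
buffer is dropped before the next attempt, and something only reaches the
screen if every attempt fails.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical single-slot error buffer plus two counters for inspection.  */
char pending_error[128];
int errors_buffered;
int errors_shown;

void
buffer_error (const char *msg)
{
  snprintf (pending_error, sizeof pending_error, "%s", msg);
  errors_buffered++;
}

typedef bool (*matcher_fn) (const char *line);

/* Toy matcher: anything containing '=' counts as an assignment.  */
bool
match_assignment (const char *line)
{
  if (strchr (line, '=') != NULL)
    return true;
  buffer_error ("Syntax error in assignment");
  return false;
}

/* Toy matcher: anything starting with "print" counts as a PRINT.  */
bool
match_print (const char *line)
{
  if (strncmp (line, "print", 5) == 0)
    return true;
  buffer_error ("Syntax error in PRINT statement");
  return false;
}

bool
parse_line (const char *line)
{
  matcher_fn matchers[] = { match_assignment, match_print };
  size_t i;

  for (i = 0; i < sizeof matchers / sizeof matchers[0]; i++)
    {
      pending_error[0] = '\0';  /* discard the previous attempt's error */
      if (matchers[i] (line))
        return true;
    }

  /* Every attempt failed: only now does the last buffered error show up.  */
  fprintf (stderr, "%s\n", pending_error);
  errors_shown++;
  return false;
}
```

Parsing "print *, 1" here buffers one error (from the assignment attempt)
and shows none, which is the situation a check for excess errors would
catch if the discard step were broken.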
* * *
With the patch in place, plus its gfc_error cousin, and after the
follow-up conversion work, only the following would remain:
* gfc_warning: a single occurrence of two locations in the same error
string (of ~120 calls)
* gfc_error: approx. 34 times two locations (of ~1630 calls)
* gfc_error_now_1: 11 times two locations
* In scanner.c:
  - gfc_warning_now_1: 6 calls
  - gfc_warning: 3
  - fprintf: 2
  - gfc_error: 0
Some of those can probably be simply converted; others either need to
remain, or one has to set up a proper location (currently, using
gfc_warning_now would ICE), or one creates a workaround and constructs
the error string manually (at least for the fprintf cases).
* * *
Regarding the use of libcpp: Currently, it is only used with explicit
preprocessing ("-cpp", file extension) and it is used to output into a
temporary file which is then read back. I experimented with
    token = cpp_get_token (cpp_in);
    const char *str = cpp_token_as_text (cpp_in, token);
which kind of works okay.
But I do not understand how one gets line breaks and spacing right.
For line breaks, I can use "flags & BOL" or a callback. But I'm still
trying to understand how one can get the number of spaces. "flags &
PREV_WHITE" seems to record only a single one and seems to fold all
kinds of whitespace (\t, \n and ' ') into a single type.
For fixed-form Fortran, spaces are essential. On punch cards,* only the
first 72 characters were read; as (some?) punch cards had 8 additional
columns, those were used for comments (e.g. to number the cards).
Hence, there is code out there which assumes that everything beyond
column 72 is ignored. Thus, the number of spaces is crucial, at
least with -fpreprocessed, if one wants to always go through libcpp.
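
To illustrate why collapsed spacing is harmful here, the following is a
small sketch of the 72-column cut. The function name and the constant are
made up for the example, and tabs are ignored entirely; it only shows that
whether trailing text is dropped depends on the exact column it sits in.

```c
#include <string.h>

/* Columns 1..72 are significant in this sketch; the name is hypothetical.  */
enum { FIXED_FORM_WIDTH = 72 };

/* Copy the part of LINE that a fixed-form reader would look at into OUT,
   which must hold at least FIXED_FORM_WIDTH + 1 bytes.  Returns the
   number of characters kept.  */
size_t
fixed_form_significant (const char *line, char *out)
{
  size_t len = strlen (line);

  if (len > FIXED_FORM_WIDTH)
    len = FIXED_FORM_WIDTH;
  memcpy (out, line, len);
  out[len] = '\0';
  return len;
}
```

A card label starting in column 73 is cut off, but if the run of blanks
before it has been collapsed to a single space, the same label ends up
inside the significant part and would be parsed as code.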
Looking at the current code, it seems to use a line-break callback with
SOURCE_COLUMN to reconstruct the indentation. I think saving the token's
src_column and the length of the token string, and subtracting their sum
from the new source location, will also work for mid-line spaces. However,
that's really a kludge, and it still doesn't permit one to distinguish
between spaces and tabs.
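
The column arithmetic could look roughly like this. It is a sketch with no
libcpp involved: struct sketch_token and rebuild_line are hypothetical, and
the 1-based start column of each token is assumed to be known already.
Every gap is padded with blanks, so the space/tab limitation mentioned above
is visible in the code as well.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record of a lexed token: its spelling and the 1-based
   column where it starts.  */
struct sketch_token
{
  const char *text;
  unsigned int column;
};

/* Rebuild one source line from N tokens by padding with blanks up to
   each token's start column.  OUT_SIZE must be at least 1.  Returns the
   length of the rebuilt line.  */
size_t
rebuild_line (const struct sketch_token *toks, size_t n,
              char *out, size_t out_size)
{
  size_t pos = 0;  /* characters written so far, i.e. current column - 1 */
  size_t i;

  for (i = 0; i < n; i++)
    {
      size_t len = strlen (toks[i].text);

      /* The gap is the new column minus (previous column + length).  */
      while (pos + 1 < toks[i].column && pos + 1 < out_size)
        out[pos++] = ' ';  /* always a blank: tabs cannot be recovered */

      if (pos + len >= out_size)
        break;
      memcpy (out + pos, toks[i].text, len);
      pos += len;
    }

  out[pos] = '\0';
  return pos;
}
```

For tokens "x", "=" and "1" at columns 7, 9 and 13, this gives six leading
blanks, one blank before "=" and three blanks before "1".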
Finally, the token handling seems to run into trouble with

    print *, "Hello &
    &world!"

where a single string extends over two lines.
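
For reference, what a Fortran-aware lexer has to do with such a pair of
lines is roughly the following. This is a simplified sketch that covers only
the case shown (a trailing '&' on the first line and a leading '&' on the
second); join_continuation is a made-up name, and comments, fixed form and
continuation outside of strings are not handled.

```c
#include <stddef.h>
#include <string.h>

/* Join FIRST (ending in '&') with SECOND (starting with optional blanks
   and '&') into OUT.  Returns 0 on success, -1 if the lines do not have
   that shape or OUT is too small.  */
int
join_continuation (const char *first, const char *second,
                   char *out, size_t out_size)
{
  size_t head = strlen (first);
  size_t tail;

  /* Skip trailing blanks, then require the trailing '&'.  */
  while (head > 0 && first[head - 1] == ' ')
    head--;
  if (head == 0 || first[head - 1] != '&')
    return -1;
  head--;  /* drop the '&' but keep the blanks in front of it */

  /* Skip leading blanks, then require the leading '&'.  */
  while (*second == ' ')
    second++;
  if (*second != '&')
    return -1;
  second++;

  tail = strlen (second);
  if (head + tail + 1 > out_size)
    return -1;
  memcpy (out, first, head);
  memcpy (out + head, second, tail + 1);  /* includes the terminating NUL */
  return 0;
}
```

Applied to the two lines above, the result is the single statement
print *, "Hello world!" with the blank after "Hello" preserved.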
Taking everything together, I think using libcpp for reading Fortran
files is a topic for GCC 6. Any suggestions on how to properly handle the
spacing?
For the lexing itself, one probably needs a Fortran mode, which can
recognize Fortran comments, continuation lines and some special
properties of Fortran strings. Once that is implemented, one could then
also turn off the traditional mode.
Tobias
* Disclaimer: I started with Fortran 95 and a rather object-oriented
code. Hence, I might have gotten the fine print of the Fortran history
wrong.