Manuel López-Ibáñez wrote:
* Fortran devs: Is this approach acceptable? The main idea is to have
an output_buffer called pp_warning_buffer with the flush_p bit unset
if we are buffering. When printing buffered warnings, use this
output_buffer in the global_dc->printer instead of the (unbuffered
one) used by the *_now variants. In principle this could support
several buffered diagnostics, but Fortran only seems to buffer at most
one.

I think the approach is fine. As the _now version overrides the buffer, one might even make do with a single buffer by clearing it, temporarily setting flush_p to true and printing the message. That could only collide with buffered warnings (for error_now) and buffered errors (for warning_now), and I can't tell whether that's an issue. For warnings/error_now it probably isn't; for errors/warning_now it might be. Thus, having two buffers is probably better.
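To make the two-buffer idea concrete, here is a minimal sketch in plain C. It is not the GCC code: struct msg_buffer, buffer_message and message_now are hypothetical stand-ins for output_buffer and the gfc_* entry points, and only the flush_p name is taken from the patch description. It shows a buffered message being held while flush_p is unset, and a "_now" message going out through its own unbuffered path without clobbering what is pending.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an output buffer; NOT GCC's output_buffer.  */
struct msg_buffer
{
  char text[256];
  bool flush_p;   /* true: print immediately; false: hold the text.  */
};

/* One buffer per diagnostic kind, so that a buffered warning and a
   buffered error cannot overwrite each other.  */
static struct msg_buffer warning_buf = { "", false };
static struct msg_buffer error_buf = { "", false };

/* Store MSG in BUF, replacing any earlier buffered message.  Print it
   right away only if flush_p is set.  Returns true if it was printed.  */
static bool
buffer_message (struct msg_buffer *buf, const char *msg)
{
  snprintf (buf->text, sizeof buf->text, "%s", msg);
  if (!buf->flush_p)
    return false;
  fprintf (stderr, "%s\n", buf->text);
  buf->text[0] = '\0';
  return true;
}

/* "_now" variant: uses a separate unbuffered path and leaves the
   buffered messages untouched.  */
static void
message_now (const char *msg)
{
  fprintf (stderr, "%s\n", msg);
}
```

With a single shared buffer, message_now would have to clear the pending text before printing, which is exactly the collision discussed above.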

The ugliest part is how to handle warningcount and werrorcount. I
could handle this in the common machinery in a better way by storing
DK_WERROR in the diagnostic->kind and checking it after printing.

I'm missing the "but". Does it require more modifications in common code and thus you shy away? Or is there another reason?

I can also hide the output_buffer switching inside two helper
functions, but the helper function would need to use either a static
variable or a global one to save and restore the tmp_buffer. I'm not
sure that is better or worse (the current code uses a global pointer
&cur_error_buffer, so perhaps I should have used a similar approach).

Me neither. The current approach is rather localized in error.c; thus, it doesn't really matter.

* Fortran devs #2: The testsuite is testing that the warning is
eventually printed. However, I'm not sure it is testing when the
warning is buffered and then discarded, is it? If not, how can I
produce such a test?

Well, for nearly every Fortran program, the parser has to make multiple attempts for at least one line. Thus, if you set a breakpoint in gfc_error, you will see "syntax error" messages which never make it to the screen. As the test suite checks for excess errors, nearly every test-suite file tests this.
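A toy illustration of why those "syntax error" messages never reach the screen, written as a sketch in plain C. None of this is gfortran code: the one-slot buffer, the two matchers and parse_line are all hypothetical, and the real matchers are far more involved. The point is only the control flow: every failed attempt buffers an error, a later successful attempt discards it, and the error is printed only if all attempts fail.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical one-slot error buffer (not the gfortran implementation).  */
static char pending_error[128];
static int errors_emitted;

static void
buffer_error (const char *msg)
{
  snprintf (pending_error, sizeof pending_error, "%s", msg);
}

static void
discard_error (void)
{
  pending_error[0] = '\0';
}

/* Print the buffered error, if any, and count it.  */
static void
flush_error (void)
{
  if (pending_error[0] != '\0')
    {
      fprintf (stderr, "%s\n", pending_error);
      errors_emitted++;
      pending_error[0] = '\0';
    }
}

/* Two toy matchers; each buffers a syntax error when it fails.  */
static int
match_assignment (const char *line)
{
  if (strchr (line, '=') != NULL)
    return 1;
  buffer_error ("Syntax error in assignment");
  return 0;
}

static int
match_print (const char *line)
{
  if (strncmp (line, "print", 5) == 0)
    return 1;
  buffer_error ("Syntax error in PRINT statement");
  return 0;
}

/* Try each matcher in turn.  A success discards whatever an earlier
   failed attempt buffered; only a complete failure flushes it.  */
static int
parse_line (const char *line)
{
  if (match_assignment (line) || match_print (line))
    {
      discard_error ();
      return 1;
    }
  flush_error ();
  return 0;
}
```

Parsing a valid PRINT statement here buffers "Syntax error in assignment" first and then throws it away, so an excess-errors check on the output would catch any case where the discard did not happen.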

* * *

With the patch in place, plus its gfc_error cousin, and after the follow-up conversion work, only the following would remain:

* gfc_warning: a single occurrence of two locations in the same error string (of ~120 calls)
* gfc_error: approx. 34 times two locations (of ~1630 calls)
* gfc_error_now_1: 11 times two locations

* In scanner.c:
- gfc_warning_now_1: 6 calls in scanner.c
- gfc_warning: 3
- fprintf: 2
- gfc_error: 0
Some of those can probably simply be converted; others either need to remain, or one has to set up a proper location (currently, using gfc_warning_now would ICE), or one creates a workaround and constructs the error string manually (at least for the fprintf cases).

* * *

Regarding the use of libcpp: Currently, it is only used with explicit preprocessing ("-cpp", file extension) and it is used to output into a temporary file which is then read back. I experimented with
  token = cpp_get_token (cpp_in);
  const char *str = cpp_token_as_text (cpp_in, token);
which kind of works okay.

But I do not understand how one gets linebreaks and spacing right. For linebreaks, I can use "flags & BOL" or a callback. But I'm still trying to understand how one can get the number of spaces. "flags & PREV_WHITE" seems to record only a single one and seems to convert all of them (\t, \n and ' ') into a single type.

For fixed-form Fortran, spaces are essential. On punch cards,* only the first 72 characters were read; as (some?) punch cards had an additional 8 characters, those were used for comments (e.g. to enumerate the cards). Hence, there is code out there which assumes that everything beyond 72 characters is ignored. Thus, the number of spaces is crucial – at least with -fpreprocessed, if one wants to always go through libcpp.
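The 72-column rule can be sketched in a few lines of C. This is an illustration only, not the scanner.c logic: significant_part and FIXED_FORM_WIDTH are hypothetical names, and a tab is counted as a single column, which is a simplification.

```c
#include <stddef.h>
#include <string.h>

#define FIXED_FORM_WIDTH 72   /* columns read from a fixed-form line */

/* Copy at most the first 72 columns of LINE (up to a newline, if any)
   into OUT, which holds OUT_SIZE bytes.  Returns the number of
   characters kept.  Simplification: a tab counts as one column.  */
static size_t
significant_part (const char *line, char *out, size_t out_size)
{
  size_t len = strcspn (line, "\n");

  if (out_size == 0)
    return 0;
  if (len > FIXED_FORM_WIDTH)
    len = FIXED_FORM_WIDTH;
  if (len >= out_size)
    len = out_size - 1;
  memcpy (out, line, len);
  out[len] = '\0';
  return len;
}
```

If a preprocessing step collapses a run of blanks into one, text from columns 73-80 shifts left into the significant part, which is why the exact number of spaces matters.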

Looking at the current code, it seems to use a line-break callback with SOURCE_COLUMN to reconstruct the indentation. I think saving the token src_column and the length of the token string and subtracting it from the new source location, will also work for mid-line spaces. However, that's really a kludge – and it still doesn't permit one to distinguish between spaces and tabs.
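A rough sketch of that column arithmetic, in plain C and deliberately independent of libcpp. struct tok and rebuild_line are hypothetical; the sketch assumes each token carries a 1-based source column and its spelling, and that all tokens belong to the same line.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical token record: 1-based source column plus spelling.
   This is NOT libcpp's cpp_token.  */
struct tok
{
  unsigned int src_column;
  const char *text;
};

/* Rebuild one source line from N tokens into OUT (OUT_SIZE bytes),
   padding with blanks so that each token starts at its recorded
   column.  Returns 0 on success, -1 if OUT is too small, a column is
   0, or two tokens overlap.  */
static int
rebuild_line (const struct tok *toks, size_t n, char *out, size_t out_size)
{
  size_t pos = 0;   /* characters written so far */

  for (size_t i = 0; i < n; i++)
    {
      size_t start, len;

      if (toks[i].src_column == 0)
        return -1;
      start = toks[i].src_column - 1;
      len = strlen (toks[i].text);
      if (start < pos || start + len >= out_size)
        return -1;
      /* Gap = new column minus (previous column + previous length).  */
      memset (out + pos, ' ', start - pos);
      memcpy (out + start, toks[i].text, len);
      pos = start + len;
    }
  if (pos >= out_size)
    return -1;
  out[pos] = '\0';
  return 0;
}
```

The gap is always refilled with blanks, so a tab in the original line is lost; that is the limitation mentioned above.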

Finally, the token handling seems to get in trouble with
  print *, "Hello &
      &world!"
where a single string extends over two lines.

Taking everything together, I think using libcpp for reading Fortran files is a topic for GCC 6. Any suggestions on how to properly handle the spacing?

For the lexing itself, one probably needs a Fortran mode, which can recognize Fortran comments, continuation lines and some special properties of Fortran strings. Once that is implemented, one could then also turn off the traditional mode.

Tobias

* Disclaimer: I started with Fortran 95 and a rather object-oriented code. Hence, I might have gotten the fine print of the Fortran history wrong.
