On 17.03.2025 at 18:25, James K. Lowden wrote:
On Sun, 16 Mar 2025 21:07:39 +0100
Simon Sobisch <simonsobi...@web.de> wrote:
This gives three reference-formats: "fixed" "free" and "extended". For
two of those we have seen the flags -ffixed-form and -ffree-form, so
I'd _guess_ the last one would be -fextended-form.
Question: Is there a reason to have multiple flags for that?
[I think this thread belongs in gcc@ because there's no patch to
discuss. I'm answering here for the sake of continuity.]
-ffixed-form and -ffree-form are the names gfortran uses. To get
"logical reference format" -- unlimited lines with the first 6 columns
ignored and indicator column 7, we have
-findicator-column=7
It's not a great name, not least because it seems
invertible but is not. ("-fno-indicator-column=n" makes no sense.)
OTOH it says what it means: the location of the indicator column, with
no mention of a line length limit (because there isn't one).
-fformat=fixed/free/extended/.../auto
The problem here IMO is the burden of names. Each combination of
left/right margin needs a name, all of which are arbitrary. "Extended"
from what, and to what? Every compiler seems to have its own
variation.
"extended" came from Bob's notes, not mine :-)
In general: words are just words, you can choose "arbitrary but
reasonable ones".
If -- if -- we were to support other formats I'd be inclined to use
-source-format from[-to]
so the user says where the indicator column is, and what the maximum
length is, if any. So,
-ffixed-form is -source-format 7-72
-ffree-form is -source-format 1
(logical ref) is -source-format 7
(no indicator) is -source-format 0
with the implied rule that, if the first column is 1, then '*' is
honored as a comment, else the character is part of the COBOL text.
That covers... only a very small subset.
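To make the proposal concrete, here is a minimal sketch of how a
"from[-to]" value could be parsed. Everything in it is hypothetical:
the names SourceFormat, parse_source_format and
star_in_column_one_is_comment are made up for illustration, and the
'*' rule is only my reading of the "implied rule" quoted above.

```python
from typing import NamedTuple, Optional


class SourceFormat(NamedTuple):
    indicator_column: int        # 0 means "no indicator column"
    right_margin: Optional[int]  # None means "no line length limit"


def parse_source_format(spec: str) -> SourceFormat:
    """Parse a hypothetical 'from[-to]' value such as '7-72' or '1'."""
    first, sep, last = spec.partition("-")
    indicator = int(first)                # raises ValueError on junk
    margin = int(last) if sep else None   # "to" part is optional
    if indicator < 0 or (margin is not None and margin <= indicator):
        raise ValueError(f"invalid source format: {spec!r}")
    return SourceFormat(indicator, margin)


def star_in_column_one_is_comment(fmt: SourceFormat) -> bool:
    # Assumed reading of the implied rule: '*' in column 1 is a comment
    # only when the format starts at column 1; otherwise it is COBOL text.
    return fmt.indicator_column == 1
```

With this, "7-72" yields an indicator column of 7 and a right margin of
72, while "1" and "7" have no right margin at all.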
Here is GnuCOBOL's documentation on source formats:
----------------------------------------------------------------------
@node Source format
@subsection Source format
GnuCOBOL supports fixed, free, Micro Focus' Variable, X/Open Free-form,
ICOBOL xCard and Free-form, ACUCOBOL-GT Terminal, and COBOLX source
formats. By default, the compiler tries to autodetect the format using
the indicator on the first line, using the fixed format for correct
indicators and the free format for incorrect ones. This can be
overridden either by the @code{>>SOURCE [FORMAT] [IS]
@{FIXED|FREE|COBOL85|VARIABLE|XOPEN|XCARD|CRT|TERMINAL|COBOLX|AUTO@}}
directive, or by one of the following options:
@table @code
@item -free, -F, -fformat=free
Free format. The program-text area starts in column 1 and
continues till the end of line (effectively 255 characters
in GnuCOBOL).
@item -fixed, -fformat=fixed
Fixed format. Source code is divided into: columns 1-6, the sequence
number area; column 7, the indicator area; columns 8-72, the
program-text area; and columns 73-80 as the reference area.
@footnote{Historically, fixed format was based on 80-character punch
cards.}
@item -fformat=cobol85
Fixed format with enforcements on the use of Area A.
@item -fformat=variable
Micro Focus' Variable format. Identical to the fixed format above
except for the program-text area which extends up to column 250 instead
of 72.
@item -fformat=xcard
ICOBOL xCard format. Variable format with right margin set at column
255 instead of 250.
@item -fformat=xopen
X/Open Free-form format. The program-text area may start in column 1
unless an indicator is present, and lines may contain up to 255
characters. Indicator for debugging lines is @samp{D } (D followed by
a space) instead of @samp{D} or @samp{d}.
@item -fformat=crt
ICOBOL Free-form format (CRT). Similar to the X/Open format above, with
lines containing up to 320 characters and single-character debugging
line indicators (@samp{D} or @samp{d}).
@item -fformat=terminal
ACUCOBOL-GT Terminal format. Similar to the CRT format above, with
indicator for debugging lines being @samp{\D} instead of @samp{D} or
@samp{d}. This format is mostly compatible with VAX COBOL terminal
source format.
@item -fformat=cobolx
COBOLX format. This format is similar to the CRT format above, except
that the indicator area is always present in column 1; the program-text
area starts in column 2 and extends up to the end of the record. Lines
may contain up to 255 characters.
@item -fformat=auto
Autodetection of format. The compiler will use the first line of the
file to detect whether the file is in fixed format (with a correct
indicator at position 7), or in free format.
@end table
Note that with source formats @code{XOPEN}, @code{CRT}, @code{TERMINAL},
and @code{COBOLX}, missing spaces are not inserted within continued
alphanumeric literals that are truncated before the right margin.
@emph{Area A} denotes the source code that spans between margin A and
margin B, and Area B spans from the latter to the end of the record.
@emph{Area A enforcement} checks the contents of Area A, and reports any
item that does not belong to the correct Area: this feature helps in
developing COBOL programs that are portable to actual mainframe
environments.
In general, division, section, and paragraph names must start in Area A.
In the @code{DATA DIVISION}, level numbers @samp{01} and @samp{77} must
also start in Area A. In the @code{PROCEDURE DIVISION}s, statements and
separator periods must fit within Area B. Every source format listed
above may be subject to Area A enforcement, except @code{FIXED},
@code{FREE}, and @code{XOPEN}.
Note that Area A enforcement enables recovery from missing periods
between paragraphs and sections.
----------------------------------------------------------------------
As you see, there are *a lot* of formats, and a single "from-to-col"
won't cover them.
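A rough illustration of why a column pair is not enough, using only a
subset of the formats and only the attributes stated in the
documentation quoted above. The Format class and the as_from_to helper
are made-up names for this sketch, not anything from cobc or gcobol.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Format:
    indicator_column: Optional[int]    # None: no fixed indicator column
    right_margin: int                  # last usable column of a line
    debug_indicators: Tuple[str, ...]  # debugging-line indicators


# Values transcribed from the quoted documentation (subset only).
FORMATS = {
    "fixed":    Format(7, 72, ("D", "d")),
    "variable": Format(7, 250, ("D", "d")),
    "xcard":    Format(7, 255, ("D", "d")),
    "xopen":    Format(None, 255, ("D ",)),
    "crt":      Format(None, 320, ("D", "d")),
    "terminal": Format(None, 320, ("\\D",)),
}


def as_from_to(fmt: Format) -> Tuple[Optional[int], int]:
    """Reduce a format to the only two things a from-to pair can express."""
    return (fmt.indicator_column, fmt.right_margin)


# crt and terminal collapse to the same pair although they differ.
collapsed = as_from_to(FORMATS["crt"]) == as_from_to(FORMATS["terminal"])
```

Here crt and terminal end up with the same from-to pair, yet they treat
debugging lines differently, so the pair alone cannot tell them apart.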
... and if gcobol implements -fformat - you don't even have to search
for nice words, just use cobc's ones and blame the GC guys for those
strange names.
The main point is: if you go with something like -fformat, you can add
more formats later easily without adding new flags (just values).
You _could_ also fall back to free for xopen and to fixed for everything
unknown as well (or similar) after raising a warning on unknown source
format (for both -fformat and >> SOURCE FORMAT IS).
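The fallback idea could look roughly like this. It is only a sketch:
resolve_format and the "supported" set are hypothetical, and the
assumption is a compiler that implements just fixed and free so far.

```python
import warnings


def resolve_format(name, supported=frozenset({"fixed", "free"})):
    """Map a requested format name to one the compiler implements.

    Unsupported names fall back to "free" for xopen and to "fixed" for
    everything else, after a warning (as suggested above).
    """
    name = name.lower()
    if name in supported:
        return name
    fallback = "free" if name == "xopen" else "fixed"
    warnings.warn(
        f"unsupported source format '{name}', using '{fallback}'"
    )
    return fallback
```

The same function could serve both the command-line option and the
>>SOURCE FORMAT IS directive, so the two stay consistent; adding a
format later only means extending the supported set.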
Side-note: the automatic choice of "extended" was confusing for me (and
for the initial attempt to compile the NIST suite), mostly because I
don't know of any other compiler that chooses _that_ format
automatically.
There must be some default, and auto-detection is a good one. It can
be improved without changing the command-line options. :-) I'm
looking forward to more examples before tweaking it.
As noted: most code "in production" is still in fixed-form
reference-format, either with or without area-a enforcement.
Therefore I'd suggest to use this reference-format if no other format
can be deduced (you can switch to free-form reference-format if the
indicator column 7 has no valid value, for example).
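As a sketch of that default: look at column 7 of the first line and
only leave fixed-form when the character there cannot be an indicator.
The function name and the set of "valid" indicator characters are my
assumptions here (space, '*', '/', '-', 'D', 'd'); a real compiler may
accept more and may look at more than one line.

```python
# Assumed set of valid fixed-form indicator characters.
VALID_INDICATORS = frozenset(" *-/Dd")


def detect_format(first_line: str) -> str:
    """Guess 'fixed' or 'free' from the first source line."""
    line = first_line.rstrip("\r\n")
    if len(line) < 7:
        # Nothing in column 7 to inspect: keep the fixed-form default.
        return "fixed"
    # Column 7 is index 6 when counting from zero.
    return "fixed" if line[6] in VALID_INDICATORS else "free"
```

A line such as "000100 IDENTIFICATION DIVISION." has a space in column 7
and stays fixed-form, while "IDENTIFICATION DIVISION." starting in
column 1 has an 'F' there and is treated as free-form. A free-form line
that happens to have a valid indicator character in column 7 is
misdetected by this simple check.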
Simon