Joseph, I am Bob Dubner, the other half of the development team for the
COBOL front end.  Conceptually, I regard the front end as having a blurry
line down the middle of it; Jim primarily does parsing, I generate the
GENERIC tree.

> -----Original Message-----
> From: Joseph Myers <josmy...@redhat.com>
> Sent: Thursday, December 19, 2024 15:18
> To: James K. Lowden <jklow...@schemamania.org>
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] COBOL 1/8 hdr: header files
> 
> On Wed, 18 Dec 2024, James K. Lowden wrote:
> 
> > On Mon, 16 Dec 2024 23:36:37 +0000 (UTC) Joseph Myers
> > <josmy...@redhat.com> wrote:
> >
> > > > +extern "C"  _Float128 __gg__float128_from_qualified_field
> > >
> > > I'm not entirely sure whether this is host or target code (you
> > > always need to be clear about which is which in GCC), but in any
> > > case, both hosts and targets without __int128 or _Float128 are
> > > supported in GCC.
> >
> > In preparing my comprehensive TODO list, these points need
> > clarification (for us both, I think).
> >
> > We are ignoring 32-bit architectures and rely on 128-bit numeric
> > support to meet ISO COBOL requirements.  I know there's a way to
> > enumerate supported targets but don't know how.  As of now, any
> > missing support is reported by the compiler when building gcobol.
> 
> Could you give more details of what is required or optional in ISO
> COBOL?  Looking at ISO/IEC 1989:2023 (and searching for "128"), I see,
> for example, in A.3 item 17, "The usages FLOAT-BINARY-32,
> FLOAT-BINARY-64 and FLOAT-BINARY-128 are dependent on the capabilities
> of the processor.".

In the document you mention, we have section "8.3.3.3.2 Fixed-point
numeric literals", which specifies that "...shall allow for fixed-point
numeric literals of 1 through 31 digits in length".  COBOL provides that
fixed-point values can have decimal places, but they are not stored as
floating point.  A data description of "PICTURE 99V999" means that the
data structure can hold five digits, with an implied decimal point at the
'V'.  If the DISPLAY usage is specified, then the value 12.345 is stored
as the characters "12345" (in ASCII, 0x31 through 0x35).  If a binary
USAGE is specified, then the binary value 12345 (0x3039) is stored in
memory.  The number of bytes, and whether or not it is stored as big- or
little-endian, is also determined by the data description.  Yes, all that
is part of the language.  Welcome to COBOL.
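
A minimal sketch of the two storage forms described above, assuming ASCII
and a two-byte big-endian binary field.  The helper names encode_display
and encode_binary_be16 are hypothetical and are not taken from libgcobol;
the value 12.345 under PICTURE 99V999 is carried as the scaled integer
12345.

```c
#include <stdint.h>

/* DISPLAY usage: one character per digit, zero-padded on the left.
   `out` must have room for ndigits + 1 bytes (digits plus a NUL).
   The implied decimal point at the 'V' is not stored anywhere.  */
static void encode_display(uint32_t scaled, int ndigits, char *out)
{
    for (int i = ndigits - 1; i >= 0; i--) {
        out[i] = (char)('0' + scaled % 10);
        scaled /= 10;
    }
    out[ndigits] = '\0';
}

/* Binary usage, assumed here to be two bytes stored big-endian:
   12345 is 0x3039, so the bytes are 0x30 followed by 0x39.  */
static void encode_binary_be16(uint16_t scaled, unsigned char out[2])
{
    out[0] = (unsigned char)(scaled >> 8);
    out[1] = (unsigned char)(scaled & 0xFF);
}
```

In both cases the position of the decimal point lives only in the data
description, not in the stored bytes.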

My implementation attempts to keep intermediate values small.  So, when at
run-time I am adding two values that both fit into 32-bit integers, I try
to do that.  If they get up to 10 or more digits, I switch to 64-bit
integers; when they get up to 20 or more digits I switch to __int128.
__int128 can hold numbers up to 38 digits, and that's the limit of our
implementation, which meets the requirement that a fixed-point number can
be [at least] 31 digits.
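
The width selection just described can be sketched roughly as follows.
The function name width_for_digits and the enum are hypothetical, and the
thresholds simply mirror the digit counts given above; the real code may
decide differently.

```c
/* Intermediate integer widths, per the digit thresholds in the text.  */
enum int_width {
    WIDTH_UNSUPPORTED = 0,   /* more than 38 digits: beyond __int128 */
    WIDTH_32 = 32,
    WIDTH_64 = 64,
    WIDTH_128 = 128
};

/* Pick the narrowest width for a value of `digits` decimal digits.
   Up to 9 digits always fit in a signed 32-bit integer.  A 19-digit
   magnitude fits in 64 bits only when treated as unsigned, which is
   an assumption of this sketch.  */
static enum int_width width_for_digits(int digits)
{
    if (digits < 10)
        return WIDTH_32;
    if (digits < 20)
        return WIDTH_64;
    if (digits <= 38)
        return WIDTH_128;
    return WIDTH_UNSUPPORTED;
}
```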

The following section, "8.3.3.3.3 Floating-point numeric literals",
requires 1 to 36 digits.  I assume it is no coincidence that 36 decimal
digits are what it takes to round-trip an IEEE 754 binary128 value.  The
ISO float-short, float-long, and float-extended correspond precisely with
the IEEE binary32, binary64, and binary128 definitions.  So, I used them.
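
As a rough check on the digit counts for these formats, the sketch below
derives two figures from the significand width p (24, 53, and 113 bits
for binary32, binary64, and binary128): the number of decimal digits
guaranteed to survive a trip through the format, and the number needed to
write any value of the format unambiguously.  The function names are
hypothetical, and log10(2) is approximated as 30103/100000 to avoid
floating point.

```c
/* Decimal digits guaranteed to survive decimal -> binary -> decimal:
   floor((p - 1) * log10(2)).  */
static int guaranteed_digits(int p)
{
    return (p - 1) * 30103 / 100000;
}

/* Decimal digits needed for binary -> decimal -> binary to be exact:
   ceil(1 + p * log10(2)), which equals 2 + floor(p * log10(2)) because
   p * log10(2) is never an integer for p > 0.  */
static int roundtrip_digits(int p)
{
    return 2 + p * 30103 / 100000;
}
```

For p = 113 these give 33 and 36 respectively, which matches the 36-digit
limit on floating-point literals.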

I have been speaking of what Jim and I call "run-time code", and what I
see here is referenced as the target code.

At compile-time (or on the host), we also do numeric calculations.  The
ISO specification allows for compile-time computations specified in the
source code.  In addition, at times I put initial values for the COBOL
variables into the run-time structures that are the COBOL variables.  In
order to create those CONSTRUCTOR nodes we have to do those calculations
at compile time, hence the use of __int128 and _Float128 in the host code.

In the run-time/host code, I have been using intTI_type_node for __int128,
and unsigned_intTI_type_node for unsigned __int128.  For floating point,
I've been using float32_type_node, float64_type_node, and
float128_type_node.

If there are recommendations as to what would work better across other
architectures, I am all ears.

As to how we arrived here:  I am very aware of, and a bit in awe of, GCC's
ability to create hosts on one set of architectures that themselves create
executables for other architectures.  Jim and I, however, have had plenty
to do just getting an Ubuntu/x86_64 version of GCC to create Ubuntu/x86_64
COBOL executables.

> 
> The corresponding C and C++ features are optional - some targets support
> them, some don't, the language doesn't require them to be supported.
> (I'm aware of a C++ proposal to require support for 128-bit integers,
> but I'm not sure of its current status.  If it went in, we'd need all
> architecture maintainers for 8-bit/16-bit/32-bit architectures to define
> the ABI for 128-bit integers on their target, in collaboration with the
> maintainers of any ABI document or other implementations.)
> 
> And having support for such features on the target is in any case
> independent of having it on the host.  You can build GCC to run on
> 32-bit Arm (no __int128 or _Float128) as the host, and generate code
> for AArch64 (has __int128 and _Float128) as the target.  It would be
> odd to require a 64-bit host for a particular language (if you need
> arithmetic within the compiler itself wider than natively supported on
> the host, we have both GCC's wide_int and GMP available; likewise,
> GCC's real* and MPFR for wider floating-point support).
> 
> If you require __int128 on the target, the toplevel / libgcobol
> configure code will need to handle building libgcobol only for the
> subset of multilibs for the target that have __int128, since lots of
> targets have both 32-bit multilibs (no __int128) and 64-bit multilibs
> (with __int128).
> 
> > Is there an architecture-feature database within gcc that lists which
> > ones support _Float128?
> 
> _Float128 is generally TFmode.  Look at the
> TARGET_SCALAR_MODE_SUPPORTED_P hooks to see which support TFmode.  The
> default hook supports it if it's used for long double
> (TARGET_C_MODE_FOR_FLOATING_TYPE hook).
> 
> Although most 64-bit targets do support _Float128, that isn't
> universally the case.  For example, powerpc64 big-endian doesn't.

You have given me much to ponder, and ponder it I will.

> 
> > > In general, target code - including headers - should not go under
> > > gcc/ at all.  And host code shouldn't be using __* identifiers as
> > > those are reserved.
> >
> > The above function is implemented in the runtime library.  It is
> > called from generated code, and from within the library.  We have many
> > such functions.  They have leading underscores because they're not
> > intended to be called by any user; that is, they're part of the
> > implementation. It's my understanding we *are* the implementation to
> > which such names are reserved.
> >
> > > whether this is host or target code
> >
> > I think "target" must be the answer? The function is not used to build
> > gcobol.  The built compiler emits code that calls that function, which
> > it requires be supplied by libgcobol.
> 
> OK, so this header should go in the libgcobol/ directory, not in
> gcc/cobol/ (which is where this patch version has it).  The same for any
> other headers declaring functions in libgcobol.
> 
> If there's any header that needs to be included in both the compiler and
> the library for some reason (e.g. if you need a header defining
> constants that are used by the library, and the compiler also needs to
> know when generating code), we'll need to look at that in more detail.
> But function declarations certainly should only be included in one of
> those two places: the compiler's headers should declare functions that
> are part of the compiler, the library's headers should declare
> functions that are part of the library.  And structure declarations
> can't readily be shared either simply because the host and target can
> have different types.

This observation of yours triggered a discussion between Jim and me.  We
agree.  It was convenient for me to put all of the .h files for libgcobol
in with the compiler code.  But we completely agree that the compiler and
the library can't use the same structures.  And currently they do not.

The compiler needs to know what the run-time structures are, because it
has to create the GENERIC that creates those structures.  But it doesn't
use the .h declarations; they actually appear only as comments in the C
code that creates the GENERIC that creates the structures.  Much of the
library code is accepting pointers to the structures that were created by
the GENERIC "code".  (I tend to think about creating GENERIC as if I were
writing assembly language.  I have reached a point where it is easy to do,
but I get confused when I think about it too much, and I trip over my
tongue a lot when I try to talk about it.)  We can, and will, separate out
the .h declarations as you describe.
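
To illustrate why a structure declaration is a library-side matter, here
is a small sketch of a run-time descriptor whose layout is pinned down
with fixed-width members.  The type sketch_field_t and its members are
hypothetical and are not the actual libgcobol structures; the point is
only that the offsets below are what a compiler would have to reproduce
when it builds the same structure in GENERIC, whatever the host's own
`long` or pointer sizes happen to be.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical run-time field descriptor, library side only.  */
typedef struct sketch_field {
    uint64_t capacity;  /* bytes of storage for the field */
    uint32_t digits;    /* total digits in the PICTURE */
    int32_t  rdigits;   /* digits to the right of the implied point */
} sketch_field_t;

/* The layout the compiler must match when it emits the structure.  */
_Static_assert(offsetof(sketch_field_t, capacity) == 0, "capacity offset");
_Static_assert(offsetof(sketch_field_t, digits) == 8, "digits offset");
_Static_assert(offsetof(sketch_field_t, rdigits) == 12, "rdigits offset");
_Static_assert(sizeof(sketch_field_t) == 16, "descriptor size");
```

A structure containing pointers or `long` members would not have this
property, since those sizes follow the target rather than the host.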

Pleased to meet you, by the way, and thanks very much for your help.

Bob Dubner
rdub...@symas.com

> 
> --
> Joseph S. Myers
> josmy...@redhat.com
