On Fri, Apr 4, 2025 at 11:50 PM Robert Dubner <rdub...@symas.com> wrote: > > Anybody who might have gotten interested should stand down. > > As usual, that analysis got me thinking. > > I got focused on where var_decl_return_code was being used. (I was wrong. > I made the mistake because I had just eliminated two sets of errors caused > by the optimization actually optimizing away things I need, so I had that > in the front of my brain.) Richard told me of something odd at the point > where var_decl_return was being established. I finally decided to look at > that. > > Turned out, ultimately, to be a SHORT / USHORT mismatch on the variables > being given to a MODIFY_EXPR. Apparently the optimization algorithms can > be extremely cranky about value types.
Yes. Unless you implement the GET_ALIAS_SET language hook even more so than when using C. But I expect COBOL to be a quite restrictive language with respect to typing - is there something like pointers so a COBOL program can access the bit-representation of some variable as a different type? Richard. > > In any event, with that straightened out, everything is working without > the flag_strict_aliasing modification. > > Thanks for asking, and thanks for listening. > > > -----Original Message----- > > From: Robert Dubner <rdub...@symas.com> > > Sent: Friday, April 4, 2025 16:02 > > To: Sam James <s...@gentoo.org> > > Cc: GCC Patches <gcc-patches@gcc.gnu.org> > > Subject: RE: [committed] cobol: Eliminate cobolworx UAT errors when > > compiling with -Os > > > > > -----Original Message----- > > > From: Sam James <mailto:s...@gentoo.org> > > > Sent: Friday, April 4, 2025 14:28 > > > To: Robert Dubner <mailto:rdub...@symas.com> > > > Cc: 'GCC Patches' <mailto:gcc-patches@gcc.gnu.org> > > > Subject: Re: [committed] cobol: Eliminate cobolworx UAT errors when > > > compiling with -Os > > > > > > Robert Dubner <mailto:rdub...@symas.com> writes: > > > > > > > From e70fe5ed46ab129a8b1da961c47d3fb75b11b988 Mon Sep 17 00:00:00 > 2001 > > > > From: Bob Dubner mailto:rdub...@symas.com > > > > Date: Fri, 4 Apr 2025 13:48:58 -0400 > > > > Subject: [PATCH] cobol: Eliminate cobolworx UAT errors when > compiling > > > with > > > > -Os > > > > > > > > Testcases compiled with -Os were failing because static functions > and > > > > static > > > > variables were being optimized away, because of improper data type > > > casts, > > > > and > > > > because strict aliasing (whatever that is) was resulting in some > loss > > > > of > > > > > > Are you unfamiliar with that from C and C++? See > > > https://gist.github.com/shafik/848ae25ee209f698763cffee272a58f8 (we > all > > > have our favourite documents to explain it) but I don't know if COBOL > is > > > amenable to the concept of TBAA. > > > > > > > data. > > > > These changes eliminate those known problems. > > > > > > I'd suggest that this should be accompanied by some question, > otherwise > > > it's going to live there forever and it's not necessarily right > (though > > > see above - if COBOL is incompatible with the idea, it might need > > > something along those lines, though not sure this is the right way of > > > doing that). > > > > > > That is, while you're free to approve your own COBOL patches, you're > > > also free to CC others and ask them for advice even if not explicit > > > approval before pushing them if something doesn't seem to be correct > or > > > is a hack. > > > > I am not any kind of compiler expert, except, possibly in one way: I > don't > > trust them. I have never trusted them. I don't trust them, and I don't > > trust the computers they run on. I regard compilers the same way a > > lion-tamer regards the big cats they work with every day. I treat them > > with firm respect; I expect them to behave the way they've been trained > -- > > and I never turn my back on them. So, no, I don't understand TBAA; I > had > > to look it up. I don't understand aliasing. > > > > The problem at hand is not a COBOL problem. It is a Bob problem. > Richard > > pointed me at the original root of the thing. I believe I have > addressed > > that. But the problem has not gone away. The "flag_strict_aliasing = > 0;" > > solution makes the symptoms go away. I spent a couple of hours messing > > with this, and I was unable to resolve it. So I have shrugged and used > > what I have. > > > > You probably have figured out by now that when I need help, it's because > > 1) I have missed something stupid, 2) I was just plain ignorant, or 3) > > It's a hard problem. > > > > And because it's a hard problem, it's hard to describe. But you sort of > > asked, so I will sort of try to explain what's going on. > > > > IBM-flavored COBOL has the concept of a RETURN-CODE, a global 16-bit > > integer that is shared by all PROGRAM-ID modules. (Each is implemented > as > > a C type function.) > > > > In GCOBOL, a COBOL variable is actually a structure with a data area, > > because COBOL variables have lots of metadata associated with them, > > including the type of storage, the amount of storage, in many cases the > > output format of the variable, whether or not they are signed, offsets > > from parent, number of digits, number of decimal places, and more. To > > make life easier, we implement RETURN-CODE as a COBOL variable, but its > > data area is a global 16-bit "short" defined in the libgcobol.so. > > > > So: Here we go: > > > > In libgcobol/constants.cc: > > short __gg__data_return_code = 0; > > > > In libgcobol/charmaps.h: > > extern short __gg__data_return_code ; > > > > In gcc/cobol/genapi.cc, I need to be able to generate GENERIC that > > accesses that variable. > > > > So, in gcc/genutil.h: > > extern tree var_decl_return_code; // short __gg__data_return_code > > > > In gcc/cobol/genutil.cc: > > tree var_decl_return_code; // short __gg__data_return_code > > > > In gcc/cobol/genapicc: > > SET_VAR_DECL(var_decl_return_code, SHORT, > "__gg__data_return_code"); > > > > That macro results in > > var_decl_return_code = gg_declare_variable( > > SHORT, > > "__gg__data_return_code", > > NULL_TREE, // Initial value > > vs_external_reference) > > > > SHORT is defined in gcc/cobol/gengen.h > > #define SHORT short_integer_type_node > > > > The gg_declare_variable() function is in gengen.cc. It creates a > var_decl > > for a SHORT with the name "__gg__data_return_code", and these > attributes: > > > > DECL_CONTEXT (var_decl) = gg_trans_unit.trans_unit_decl; > > TREE_USED(var_decl) = 1; > > DECL_EXTERNAL (var_decl) = 1; > > TREE_PUBLIC(var_decl) = 1; > > > > With that all in place, in genapi.cc we turn our attention to the > > parser_see_stop_run() routine. That routine uses a static integer named > > "..pssr_retval": > > > > static tree returned_value = gg_define_variable(INT, "..pssr_retval", > > vs_file_static); > > > > That variable gets set thusly: > > > > gg_assign(returned_value, gg_cast(INT, var_decl_return_code)); > > > > (gg_assign() creates a MODIFY_EXPR, gg_cast does "return > > fold_convert(type, var);" > > > > and that variable gets used like this: > > > > gg_exit(returned_value); > > > > gg_exit is found in gengen.cc: > > > > void > > gg_exit(tree exit_code) > > { > > tree the_call = > > build_call_expr_loc(location_from_lineno(), > > builtin_decl_explicit (BUILT_IN_EXIT), > > 1, > > exit_code); > > gg_append_statement(the_call); > > } > > > > So, we have the stage set. When the prior version of GCOBOL (without the > > flag_strict_aliasing=0) compiles this program > > > > PROGRAM-ID. PROG. > > PROCEDURE DIVISION. > > MOVE 1 TO RETURN-CODE > > STOP RUN. > > > > with -O0 or -O1, the executable terminates with 1. > > > > With -O2, -O3, or -Os, the executable terminates with 0. > > > > It was at this point that I gave up. As far as I can tell I am creating > a > > signed short, I am picking it up, I am casting it to a signed int, and I > > am passing that to exit(). > > > > I don't know what else to try. > > > > With flag_strict_aliasing=0, it works. Without it, it doesn't. > > > > I dislike needing that flag_strict_aliasing=0; it feels a bit like > turning > > my back to a tiger. But here we are. > > > > Remember: You asked. And thank you for asking. And if anybody can offer > > any insight as to what's going on here, it will be received with great > > enthusiasm. > > > > > > > > > > > > > > > gcc/cobol > > > > > > > > * cobol1.cc: (cobol_langhook_post_options): Implemented in order > > > > to set > > > > flag_strict_aliasing to zero. > > > > * genapi.cc: (set_user_status): Add comment. > > > > (parser_intrinsic_subst): Expand SHOW_PARSE information. > > > > (psa_global): Change names of return-code and upsi globals, > > > > (psa_FldLiteralA): Set DECL_PRESERVE_P for FldLiteralA. > > > > * gengen.cc: (show_type): Add POINTER type. > > > > (gg_define_function_with_no_parameters): Set DECL_PRESERVE_P for > > > > COBOL- > > > > style nested programs. (gg_array_of_bytes): Fix bad cast. > > > > > > > > libgcobol > > > > > > > > * charmaps.h: Change __gg__data_return_code to 'short' type. > > > > * constants.cc: Likewise. > > > > --- > > > > gcc/cobol/cobol1.cc | 19 +++++++++++++++++++ > > > > gcc/cobol/genapi.cc | 19 +++++++++++++++++-- > > > > gcc/cobol/gengen.cc | 12 ++++++++++-- > > > > libgcobol/charmaps.h | 2 +- > > > > libgcobol/constants.cc | 10 +++++----- > > > > 5 files changed, 52 insertions(+), 10 deletions(-) > > > > > > > > diff --git a/gcc/cobol/cobol1.cc b/gcc/cobol/cobol1.cc > > > > index 0d07c460d41..d175ab11e3f 100644 > > > > --- a/gcc/cobol/cobol1.cc > > > > +++ b/gcc/cobol/cobol1.cc > > > > @@ -646,6 +646,22 @@ cobol_get_sarif_source_language(const char *) > > > > return "cobol"; > > > > } > > > > > > > > +bool > > > > +cobol_langhook_post_options(const char**) > > > > + { > > > > + // This flag, when set to 0, results in calls to gg_exit working > > > > properly. > > > > > > Comments like this usually have a ??? or similar. > > > > > > > + // I don't know why it is necessary. There is something going on > > > with > > > > the > > > > + // definition of __gg__data_return_code in constants.cc, and > with > > > how > > > > it > > > > + // is used through var_decl_return_code in genapi.cc. Without > it, > > > the > > > > value > > > > + // delivered to exit@PLT is zero, and not __gg__data_return_code > > > > + // Dubner, 2025-04-04. > > > > + flag_strict_aliasing = 0; > > > > + > > > > + /* Returning false means that the backend should be used. */ > > > > + return false; > > > > + } > > > > + > > > > + > > > > #undef LANG_HOOKS_BUILTIN_FUNCTION > > > > #undef LANG_HOOKS_GETDECLS > > > > #undef LANG_HOOKS_GLOBAL_BINDINGS_P > > > > @@ -660,6 +676,7 @@ cobol_get_sarif_source_language(const char *) > > > > ////#undef LANG_HOOKS_TYPE_FOR_SIZE > > > > #undef LANG_HOOKS_SET_DECL_ASSEMBLER_NAME > > > > #undef LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE > > > > +#undef LANG_HOOKS_POST_OPTIONS > > > > > > > > // We use GCC in the name, not GNU, as others do, > > > > // because "GnuCOBOL" refers to a different GNU project. > > > > @@ -685,6 +702,8 @@ cobol_get_sarif_source_language(const char *) > > > > > > > > #define LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE > > > > cobol_get_sarif_source_language > > > > > > > > +#define LANG_HOOKS_POST_OPTIONS cobol_langhook_post_options > > > > + > > > > struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER; > > > > > > > > #include "gt-cobol-cobol1.h" > > > > diff --git a/gcc/cobol/genapi.cc b/gcc/cobol/genapi.cc > > > > index a0da6476e2a..fbe0bbc75dc 100644 > > > > --- a/gcc/cobol/genapi.cc > > > > +++ b/gcc/cobol/genapi.cc > > > > @@ -8806,6 +8806,10 @@ static > > > > void set_user_status(struct cbl_file_t *file) > > > > { > > > > // This routine sets the user_status, if any, to the > > > > cblc_file_t::status > > > > + > > > > + // We have to do it this way, because in the case where the > > > > file->user_status > > > > + // is in linkage, the memory addresses can end up pointing to the > > > wrong > > > > + // places > > > > if(file->user_status) > > > > { > > > > cbl_field_t *user_status = > > > > cbl_field_of(symbol_at(file->user_status)); > > > > @@ -10111,6 +10115,13 @@ parser_intrinsic_subst( cbl_field_t *f, > > > > SHOW_PARSE > > > > { > > > > SHOW_PARSE_HEADER > > > > + SHOW_PARSE_FIELD(" TO ", f) > > > > + for(size_t i=0; i<argc; i++) > > > > + { > > > > + SHOW_PARSE_INDENT > > > > + SHOW_PARSE_FIELD(" ", argv[i].orig.field) > > > > + SHOW_PARSE_FIELD(" ", argv[i].replacement.field) > > > > + } > > > > SHOW_PARSE_END > > > > } > > > > TRACE1 > > > > @@ -15908,12 +15919,12 @@ psa_global(cbl_field_t *new_var) > > > > > > > > if( strcmp(new_var->name, "RETURN-CODE") == 0 ) > > > > { > > > > - strcpy(ach, "__gg___11_return_code6"); > > > > + strcpy(ach, "__gg__return_code"); > > > > } > > > > > > > > if( strcmp(new_var->name, "UPSI-0") == 0 ) > > > > { > > > > - strcpy(ach, "__gg___6_upsi_04"); > > > > + strcpy(ach, "__gg__upsi"); > > > > } > > > > > > > > new_var->var_decl_node = > gg_declare_variable(cblc_field_type_node, > > > ach, > > > > NULL, vs_external_reference); > > > > @@ -16156,6 +16167,10 @@ psa_FldLiteralA(struct cbl_field_t *field ) > > > > field->data.initial, > > > > NULL_TREE, > > > > field->var_decl_node); > > > > + TREE_READONLY(field->var_decl_node) = 1; > > > > + TREE_USED(field->var_decl_node) = 1; > > > > + TREE_STATIC(field->var_decl_node) = 1; > > > > + DECL_PRESERVE_P (field->var_decl_node) = 1; > > > > nvar += 1; > > > > } > > > > TRACE1 > > > > diff --git a/gcc/cobol/gengen.cc b/gcc/cobol/gengen.cc > > > > index ffb64c8993d..e7a4e3c5165 100644 > > > > --- a/gcc/cobol/gengen.cc > > > > +++ b/gcc/cobol/gengen.cc > > > > @@ -375,6 +375,10 @@ show_type(tree type) > > > > static char ach[1024]; > > > > switch( TREE_CODE(type) ) > > > > { > > > > + case POINTER_TYPE: > > > > + sprintf(ach, "POINTER"); > > > > + break; > > > > + > > > > case VOID_TYPE: > > > > sprintf(ach, "VOID"); > > > > break; > > > > @@ -2548,6 +2552,10 @@ gg_define_function_with_no_parameters(tree > > > > return_type, > > > > DECL_CONTEXT (function_decl) = gg_trans_unit.trans_unit_decl; > > > > TREE_PUBLIC(function_decl) = 0; > > > > > > > > + // This function is file static, but nobody calls it, so > without > > > > + // intervention -O1+ optimizations will discard it. > > > > + DECL_PRESERVE_P (function_decl) = 1; > > > > + > > > > // Append this function to the list of functions and variables > > > > // associated with the computation module. > > > > gg_append_var_decl(function_decl); > > > > @@ -3358,8 +3366,8 @@ gg_array_of_size_t( size_t N, size_t *values) > > > > tree > > > > gg_array_of_bytes( size_t N, unsigned char *values) > > > > { > > > > - tree retval = gg_define_variable(build_pointer_type(UCHAR)); > > > > - gg_assign(retval, gg_cast(build_pointer_type(UCHAR), gg_malloc( > > > > build_int_cst_type(UCHAR, N * sizeof(unsigned char))))); > > > > + tree retval = gg_define_variable(UCHAR_P); > > > > + gg_assign(retval, gg_cast(UCHAR_P, gg_malloc( > > > > build_int_cst_type(SIZE_T, N * sizeof(unsigned char))))); > > > > for(size_t i=0; i<N; i++) > > > > { > > > > gg_assign(gg_array_value(retval, i), build_int_cst_type(UCHAR, > > > > values[i])); > > > > diff --git a/libgcobol/charmaps.h b/libgcobol/charmaps.h > > > > index 12968fdf928..6b4e9f5c4b4 100644 > > > > --- a/libgcobol/charmaps.h > > > > +++ b/libgcobol/charmaps.h > > > > @@ -297,7 +297,7 @@ extern unsigned char __gg__data_zeros[1] ; > > > > extern unsigned char __gg__data_high_values[1] ; > > > > extern unsigned char __gg__data_quotes[1] ; > > > > extern unsigned char __gg__data_upsi_0[2] ; > > > > -extern unsigned char __gg__data_return_code[2] ; > > > > +extern short __gg__data_return_code ; > > > > > > > > // These are the various hardcoded tables used for conversions. > > > > extern const unsigned short __gg__one_to_one_values[256]; > > > > diff --git a/libgcobol/constants.cc b/libgcobol/constants.cc > > > > index 026f919cacc..d37c791f1b3 100644 > > > > --- a/libgcobol/constants.cc > > > > +++ b/libgcobol/constants.cc > > > > @@ -288,7 +288,7 @@ struct cblc_field_t __gg___14_linage_counter6 = > { > > > > > > > > > > > > unsigned char __gg__data_upsi_0[2] = {0,0}; > > > > -struct cblc_field_t __gg___6_upsi_04 = { > > > > +struct cblc_field_t __gg__upsi = { > > > > .data = __gg__data_upsi_0 , > > > > .capacity = 2 , > > > > .allocated = 2 , > > > > @@ -307,9 +307,9 @@ struct cblc_field_t __gg___6_upsi_04 = { > > > > .dummy = 0 , > > > > }; > > > > > > > > -unsigned char __gg__data_return_code[2] = {0,0}; > > > > -struct cblc_field_t __gg___11_return_code6 = { > > > > - .data = __gg__data_return_code , > > > > +short __gg__data_return_code = 0; > > > > +struct cblc_field_t __gg__return_code = { > > > > + .data = (unsigned char *)&__gg__data_return_code , > > > > .capacity = 2 , > > > > .allocated = 2 , > > > > .offset = 0 , > > > > @@ -319,7 +319,7 @@ struct cblc_field_t __gg___11_return_code6 = { > > > > .parent = NULL, > > > > .occurs_lower = 0 , > > > > .occurs_upper = 0 , > > > > - .attr = 0x0 , > > > > + .attr = signable_e , > > > > .type = FldNumericBin5 , > > > > .level = 0 , > > > > .digits = 4 ,