As you have no doubt figured out, for input and output I am converting, as
best I can, from system locale to CP1252 for "ASCII" and CP1140 for
EBCDIC.

We can't use UTF-8 internally for most purposes. COBOL dates from before
the Cuban Missile Crisis, and it is built around the assumption that a
character is a byte is a character. Wide characters, and the
variable-length byte sequences of UTF-8, need not apply.
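
To make the byte-versus-character point concrete, here is a minimal
sketch. It is not code from gcc/cobol; utf8_code_points and fit_to_field
are hypothetical helpers, and fit_to_field only imitates a fixed-width
alphanumeric field (think PIC X(4)) that holds exactly that many bytes.

```cpp
#include <cstddef>
#include <string>

// Count UTF-8 code points by counting every byte that is NOT a
// continuation byte (continuation bytes look like 10xxxxxx).
std::size_t utf8_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Hypothetical stand-in for a fixed-width field: truncate or
// space-pad the input to exactly `width` bytes, with no knowledge
// of the encoding.
std::string fit_to_field(const std::string& s, std::size_t width) {
    std::string out = s.substr(0, width);
    out.resize(width, ' ');
    return out;
}
```

With "café" encoded as CP1252 ("caf\xE9") the string is 4 bytes and fits
a 4-byte field exactly. Encoded as UTF-8 ("caf\xC3\xA9") it is 5 bytes
for 4 code points, so fit_to_field cuts it to "caf\xC3" and leaves half
a character behind.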

The language does have provision for that newfangled stuff, but we haven't
implemented it as yet.

So, please, stick with the default 1252 for existing code -- as you noted,
changing the code page breaks some tests. Handling additional code pages
is a can we've been kicking down the road.

(Actually, I have a "can cannon" that I use to launch cans over the
horizon.   But don't tell anybody.)

> -----Original Message-----
> From: Iain Sandoe <iains....@gmail.com>
> Sent: Friday, March 21, 2025 06:06
> To: rdub...@symas.com; gcc-patches@gcc.gnu.org
> Subject: [PATCH] cobol: Address some iconv issues.
> 
> Darwin/macOS installed libiconv does not accept // trailers on
> conversion codes; (it does accept // with TRANSLIT etc after it)
> Anyway the current setting causes the init_iconv to fail - and
> then that SEGVs later.  So let's at least print a warning if we
> fail to init the conversion.
> 
> Secondly, using Windows code page 1252 as a default seems overly
> restrictive. Ideally, we should be using something like "char"
> which represents the prevailing charset for the locale.  However
> that causes testsuite fails, since the tests are expecting CP1252
> or similar - for Apple/Darwin, we should use ISO-8859-1 (the actual
> system, in common with most modern systems uses UTF-8).
> 
> NOTE I seem to be unable to use LC_ALL= to override this (but I
> did not attempt to sort that out so far).  This is just a patch to
> allow build to succeed on Darwin/macOS.
> 
> gcc/cobol/ChangeLog:
> 
>       * symbols.cc : Initialise standard_internal to ISO8859-1
>       for Apple/Darwin platforms.
>       (cbl_field_t::internalize): Print a warning if we fail to
>       initialise iconv.
> 
> Signed-off-by: Iain Sandoe <i...@sandoe.co.uk>
> ---
>  gcc/cobol/symbols.cc | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/cobol/symbols.cc b/gcc/cobol/symbols.cc
> index e078412e4ea..ebabcfd3070 100644
> --- a/gcc/cobol/symbols.cc
> +++ b/gcc/cobol/symbols.cc
> @@ -3566,7 +3566,12 @@ cbl_field_t::is_ascii() const {
>   * compilation, if it moves off the default, it adjusts only once, and
>   * never reverts.
>   */
> -static const char standard_internal[] = "CP1252//";
> +static const char standard_internal[] =
> +#if __APPLE__
> +"ISO8859-1";
> +#else
> +"CP1252//";
> +#endif
>  extern os_locale_t os_locale;
> 
>  static const char *
> @@ -3594,6 +3599,10 @@ cbl_field_t::internalize() {
>    static  iconv_t cd = iconv_open(tocode, fromcode);
>    static const size_t noconv = size_t(-1);
> 
> +  if (cd == (iconv_t)-1) {
> +    yywarn("failed iconv_open tocode = '%s' fromcode = %s", tocode, fromcode);
> +  }
> +
>    // Sat Mar 16 11:45:08 2024: require temporary environment for testing
>    if( getenv( "INTERNALIZE_NO") ) return data.initial;
> 
> --
> 2.39.2 (Apple Git-143)
