[patch. libgfortran] PR118571 UTF-8 output and the A edit descriptor

Jerry D Thu, 23 Jan 2025 13:21:10 -0800

Hello all,

The attached patch is straight forward. I spent more time on getting thetest case ready. Thanks Steve for finding this and narrowing down wherethe problem was.


Regression tested on x86_64-linux-gnu.

OK for trunk? What about a backport to 14?

Regards,

Jerry

Author: Jerry DeLisle <jvdeli...@gcc.gnu.org>
Date:   Thu Jan 23 12:58:14 2025 -0800

    Fortran: Fix UTF-8 output with A edit descriptor.

            This adjusts the source len for the case where the user has
            specified a field width with the A descriptor.

            PR libfortran/118571

    libgfortran/ChangeLog:

            * io/write.c (write_utf8_char4): Adjust the src_len to the
            format width w_len when greater than zero.

    gcc/testsuite/ChangeLog:

            * gfortran.dg/utf8_3.f03: New test.

diff --git a/gcc/testsuite/gfortran.dg/utf8_3.f03 b/gcc/testsuite/gfortran.dg/utf8_3.f03
new file mode 100644
index 00000000000..e1688149e5d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/utf8_3.f03
@@ -0,0 +1,57 @@
+! { dg-do run }
+! PR118571 UTF-8 output and the A edit descriptor.
+
+program test
+
+  use iso_fortran_env
+  
+  implicit none
+  
+  integer, parameter :: ucs4 = selected_char_kind('ISO_10646')
+  
+  character(kind=ucs4, len=1), parameter :: alpha = char(int(z'03B1'), ucs4)
+  character(kind=ucs4, len=1), parameter :: beta  = char(int(z'03B2'), ucs4)
+  character(kind=ucs4, len=1), parameter :: space = ucs4_' '
+  
+  integer fd
+  character(kind=ucs4,len=:), allocatable :: str
+  character(kind=ucs4,len=25) :: instr, correct
+  
+  fd = 42
+  
+  open (fd, encoding='UTF-8', status="scratch")
+  open (output_unit, encoding='UTF-8')
+  str = repeat(space,6)//alpha//beta//alpha//beta
+
+  write(fd,'(I4,1X,A)') len_trim(str), str
+  rewind(fd)
+  read(fd,'(a)') instr
+  if (trim(instr) /= ucs4_'  10 '//trim(str)) stop 1
+  
+  str = alpha // beta // alpha // beta
+  rewind(fd) 
+  write(fd,'(I4,1X,">",A,"<")')  len_trim(str(1:1)), str(1:1)
+  rewind(fd)
+  read(fd,'(a)') instr
+  if (trim(instr) /= ucs4_'   1 >'//alpha//ucs4_'<') stop 2
+
+  rewind(fd)
+  write(fd,*) len_trim(str(1:1)), str(1:1)
+  rewind(fd)
+  read(fd,'(a)') instr
+  if (trim(instr) /= ucs4_'           1 '//alpha) stop 3
+
+  rewind(fd)  
+  write(fd,'(I4,1X,">",A1,"<")')  len_trim(str(1:1)), str(1:1)
+  rewind(fd)
+  read(fd, '(a)') instr
+  if (trim(instr) /= ucs4_'   1 >'//alpha//ucs4_'<') stop 4
+
+  rewind(fd)  
+  write(fd,'(I4,1X,">",A1,"<")')  len_trim(str), str
+  rewind(fd)
+  read(fd, '(a)') instr
+  if (trim(instr) /= ucs4_'   4 >'//alpha//ucs4_'<') stop 5
+  close(fd)
+end program
+
diff --git a/libgfortran/io/write.c b/libgfortran/io/write.c
index 54312bf67e9..b36c5bef09e 100644
--- a/libgfortran/io/write.c
+++ b/libgfortran/io/write.c
@@ -177,7 +177,9 @@ write_utf8_char4 (st_parameter_dt *dtp, gfc_char4_t *source,
       break;
     }
 
-  /* Now process the remaining characters, one at a time.  */
+  /* Now process the remaining characters, one at a time. We need to
+     adjust the src_len if the user has specified a field width.  */
+  src_len = w_len > 0 ? w_len : src_len;
   for (j = k; j < src_len; j++)
     {
       c = source[j];

[patch. libgfortran] PR118571 UTF-8 output and the A edit descriptor

Reply via email to