On 16/07/2025 at 18:28, c8ef wrote:
On 7/16/2025 12:01 AM, c8ef wrote:
Hi Mikael,

On 7/14/2025 8:46 PM, Mikael Morin wrote:
On 13/07/2025 at 16:39, c8ef wrote:
Hi all,

I'm currently working on implementing the `split` procedure, which was added in Fortran 2023. Since its functionality is similar to that of the `scan` intrinsic, I've been studying the implementation of `scan` to better understand its mechanics. While investigating the source code, I've come across a couple of questions that I'm hoping you could help me with:

* Function Resolution

I noticed that `scan` resolves to `gfc_get_string("__scan_%d", string->ts.kind)`. However, based on my examination of `libgfortran` and `trans-intrinsic.cc`, it appears this ultimately forwards to the `string_scan` function. Could you please explain the significance of this resolution step? Is it critical to the current implementation, or is it perhaps a remnant of historical design?

No, it's not critical; I think the name can be used in compile-time errors, for example.  The name that really matters is the name of the function declaration that is passed to the middle-end (look for string_scan in trans-decl.cc).  Note that there are two variants: string_scan and string_scan_char4 (both on the front-end side and in libgfortran).

* Argument Passing

According to the Fortran specification, the `scan` intrinsic accepts arguments for `string`, `set`, `back`, and `kind`. Yet, the intrinsic implemented in `libgfortran` seems to take `charlen` and the actual pointers for both `string` and `set`. I've tried searching, but I haven't been able to pinpoint where this transformation from the specified arguments to the `libgfortran` expected arguments occurs. Any guidance on this would be greatly appreciated.

You can have a look in gfc_conv_intrinsic_function_args.

To arrive there, the call stack starts with gfc_conv_expr (the main expression translation entry point), then gfc_conv_function_expr, gfc_conv_intrinsic_function, there jump to the GFC_ISYM_SCAN switch case, then gfc_conv_intrinsic_index_scan_verify and finally gfc_conv_intrinsic_function_args.
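
To make the (length, pointer) convention concrete, here is a minimal C sketch of what a scan-like library routine looks like once each character argument has been lowered to a length plus a pointer. The names sketch_scan and sketch_scan_demo, and the use of size_t for lengths, are assumptions made for illustration; this is not the libgfortran source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a scan-like runtime routine.
   Each Fortran character argument arrives as (length, pointer).
   Returns the 1-based position of the first character of str that
   occurs in set (the last one if back is true), or 0 if none.  */
size_t sketch_scan(size_t slen, const char *str,
                   size_t setlen, const char *set, bool back)
{
    if (slen == 0 || setlen == 0)
        return 0;

    if (back) {
        /* Walk from the end of the string towards the start.  */
        for (size_t i = slen; i > 0; i--)
            for (size_t j = 0; j < setlen; j++)
                if (str[i - 1] == set[j])
                    return i;
    } else {
        /* Walk from the start of the string towards the end.  */
        for (size_t i = 0; i < slen; i++)
            for (size_t j = 0; j < setlen; j++)
                if (str[i] == set[j])
                    return i + 1;
    }
    return 0;
}

/* Usage: the same string and set as in the test case later in the thread.  */
void sketch_scan_demo(void)
{
    assert(sketch_scan(16, "one,last example", 2, ", ", false) == 4);
    assert(sketch_scan(16, "one,last example", 2, ", ", true) == 9);
}
```

The strings are not NUL-terminated on the Fortran side, which is why the length has to travel with the pointer.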

I hope it helps.

Thanks for your quick and helpful response! Based on your suggestions, I've written the following trans-intrinsic function and test case:

***

! { dg-do run }
program b
   CHARACTER (LEN=:), ALLOCATABLE :: INPUT
   CHARACTER (LEN=2) :: SET = ', '
   INTEGER P
   INPUT = "one,last example"
   P = 4
   ISTART = P + 1
   CALL SPLIT (INPUT, SET, P)
   IEND = P - 1
   PRINT '(T7,A)', INPUT (ISTART:IEND)
end program b
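
For reference, the behaviour this test expects from SPLIT can be sketched in C as follows. This is only a reading of the Fortran 2023 description of SPLIT, not the actual runtime implementation: the names sketch_split, in_set and sketch_split_demo are hypothetical, and the plain int used for POS is picked for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if c is one of the setlen delimiter characters in set.  */
static bool in_set(char c, size_t setlen, const char *set)
{
    for (size_t j = 0; j < setlen; j++)
        if (set[j] == c)
            return true;
    return false;
}

/* Hypothetical sketch of SPLIT; *pos is 1-based and updated in place.
   Forward: the leftmost delimiter after *pos, or slen + 1 if none.
   Backward: the rightmost delimiter before *pos, or 0 if none.
   Assumes *pos starts in the valid range for the chosen direction.  */
void sketch_split(size_t slen, const char *str,
                  size_t setlen, const char *set, int *pos, bool back)
{
    if (back) {
        for (int i = *pos - 1; i >= 1; i--)
            if (in_set(str[i - 1], setlen, set)) {
                *pos = i;
                return;
            }
        *pos = 0;
    } else {
        for (size_t i = (size_t) *pos + 1; i <= slen; i++)
            if (in_set(str[i - 1], setlen, set)) {
                *pos = (int) i;
                return;
            }
        *pos = (int) slen + 1;
    }
}

/* Usage: mirrors the test program above.  */
void sketch_split_demo(void)
{
    int p = 4;
    sketch_split(16, "one,last example", 2, ", ", &p, false);
    assert(p == 9);   /* the blank after "last" */
}
```

With P = 9 after the call, ISTART = 5 and IEND = 8, so the test should print "last".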


__attribute__((fn spec (". ")))
void b ()
{
   struct __st_parameter_dt dt_parm.0;
   static character(kind=1) set[1:2] = ", ";
   integer(kind=4) p;
   character(kind=1)[1:.input] * input;
   integer(kind=4) iend;
   integer(kind=4) p.3_1;
   character(kind=1) * _2;
   integer(kind=8) _4;
   integer(kind=8) _5;
   integer(kind=8) _20;

   <bb 2> [local count: 1073741824]:
   input_8 = __builtin_malloc (16);
   __builtin_memcpy (input_8, &"one,last example"[1]{lb: 1 sz: 1}, 16);
   p = 4;
   _gfortran_string_split (16, input_8, 2, &set, &p, 0);
   p.3_1 = p;
   iend_12 = p.3_1 + -1;
   dt_parm.0.common.filename = &"/home/c8ef/gcc/gcc/testsuite/gfortran.dg/split_1.f90"[1]{lb: 1 sz: 1};
   dt_parm.0.common.line = 11;
   dt_parm.0.format = &"(T7,A)"[1]{lb: 1 sz: 1};
   dt_parm.0.format_len = 6;
   dt_parm.0.common.flags = 4096;
   dt_parm.0.common.unit = 6;
   _gfortran_st_write (&dt_parm.0);
   _20 = (integer(kind=8)) iend_12;
   _2 = &(*input_8)[5]{lb: 1 sz: 1};
   _4 = _20 + -4;
   _5 = MAX_EXPR <_4, 0>;
   _gfortran_transfer_character_write (&dt_parm.0, _2, _5);
   _gfortran_st_write_done (&dt_parm.0);
   dt_parm.0 ={v} {CLOBBER(eos)};
   p ={v} {CLOBBER(eos)};
   return;
}

Even after looking at the optimized (-O1) dump, it is still unclear why *p = 16128084538487209988 instead of 4, at both -O0 and -O1.

How did you declare gfor_fndecl_string_split?
More precisely, what is the declared type of the POS argument?
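
A quick arithmetic check on the reported value shows why the POS type is worth looking at. This is an inference from the number alone, not something confirmed in this thread: 16128084538487209988 has low 32 bits equal to 4 (the value stored in p) and nonzero high 32 bits. That pattern is what one would expect on a little-endian target if the integer(kind=4) variable were read through a pointer to an 8-byte integer. The helper names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 32 bits of a 64-bit value.  */
uint32_t high32(uint64_t v)
{
    return (uint32_t) (v >> 32);
}

/* Lower 32 bits of a 64-bit value.  */
uint32_t low32(uint64_t v)
{
    return (uint32_t) (v & 0xFFFFFFFFu);
}

/* Decompose the value reported for *p.  */
void check_reported_value(void)
{
    const uint64_t reported = 16128084538487209988ULL;

    /* The low half is exactly the 4 that was stored in p.  */
    assert(low32(reported) == 4);
    /* The high half is nonzero, i.e. bytes that do not belong to p.  */
    assert(high32(reported) == 0xDFD27770u);
}
```

If that is what happens here, the upper half is just whatever sits next to p on the stack, so the exact value would vary between builds.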
