On 7/16/2025 12:01 AM, c8ef wrote:
Hi Mikael,
On 7/14/2025 8:46 PM, Mikael Morin wrote:
On 13/07/2025 at 16:39, c8ef wrote:
Hi all,
I'm currently working on implementing the `split` procedure, which
was added in Fortran 2023. Given its similar functionality to the
`scan` intrinsic, I've been learning the implementation of `scan` to
better understand its mechanics. During my investigation of the
source code, I've come across a couple of questions that I'm hoping
you could help me with:
* Function Resolution
I noticed that `scan` resolves to `gfc_get_string("__scan_%d",
string->ts.kind)`. However, based on my examination of
`libgfortran` and `trans-intrinsic.cc`, it appears this ultimately
forwards to the `string_scan` function. Could you please explain the
significance of this resolution step? Is it critical to the current
implementation, or is it perhaps a remnant of historical design?
No, it's not critical; I think the name can be used in compile-time
errors for example. The name that really matters is the name of the
function declaration that is passed to the middle-end (look for
string_scan in trans-decl.cc). Note that there are two variants:
string_scan, and string_scan_char4 (both on the front-end side and in
libgfortran).
* Argument Passing
According to the Fortran specification, the `scan` intrinsic accepts
arguments for `string`, `set`, `back`, and `kind`. Yet, the intrinsic
implemented in `libgfortran` seems to take `charlen` and the actual
pointers for both `string` and `set`. I've tried searching, but I
haven't been able to pinpoint where this transformation from the
specified arguments to the `libgfortran` expected arguments occurs.
Any guidance on this would be greatly appreciated.
You can have a look in gfc_conv_intrinsic_function_args.
To arrive there, the call stack starts with gfc_conv_expr (the main
expression translation entry point), then gfc_conv_function_expr,
gfc_conv_intrinsic_function, there jump to the GFC_ISYM_SCAN switch
case, then gfc_conv_intrinsic_index_scan_verify and finally
gfc_conv_intrinsic_function_args.
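To make the argument mapping concrete, here is a rough C sketch of the shape such a library routine takes once each character argument has been lowered to a (length, pointer) pair. It is not the libgfortran source: the name `sketch_string_scan`, the `size_t` length type and the `int` flag for `back` are assumptions made for illustration, and the real routine uses libgfortran's own types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a SCAN-like library routine.
   Each Fortran character argument arrives as a length plus a pointer.
   Returns the 1-based position of the first character of STR that is
   in SET (the last one if BACK is nonzero), or 0 if there is none.  */
size_t
sketch_string_scan (size_t slen, const char *str,
                    size_t setlen, const char *set, int back)
{
  assert (str != NULL || slen == 0);
  assert (set != NULL || setlen == 0);

  for (size_t n = 0; n < slen; n++)
    {
      /* Walk from the left, or from the right when BACK is set.  */
      size_t i = back ? slen - 1 - n : n;
      for (size_t j = 0; j < setlen; j++)
        if (str[i] == set[j])
          return i + 1;
    }
  return 0;
}
```

With the string used later in this thread, `sketch_string_scan (16, "one,last example", 2, ", ", 0)` returns 4, the position of the comma.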
I hope it helps.
Thanks for your quick and helpful response! Based on your suggestions,
I've written the following trans-intrinsic function and test case:
***
! { dg-do run }
program b
CHARACTER (LEN=:), ALLOCATABLE :: INPUT
CHARACTER (LEN=2) :: SET = ', '
INTEGER P
INPUT = "one,last example"
P = 4
ISTART = P + 1
CALL SPLIT (INPUT, SET, P)
IEND = P - 1
PRINT '(T7,A)', INPUT (ISTART:IEND)
end program b
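For reference, the result this test expects can be modelled in a few lines of C. This is a sketch of my reading of the Fortran 2023 description of `SPLIT`, not the implementation under discussion: the name `sketch_string_split` is hypothetical, and `pos` is a `size_t` here, whereas the real argument is a Fortran integer of whatever kind the caller declared.

```c
#include <assert.h>
#include <stddef.h>

/* Return nonzero if C is one of the SETLEN delimiter characters.  */
static int
sketch_is_delim (char c, size_t setlen, const char *set)
{
  for (size_t j = 0; j < setlen; j++)
    if (set[j] == c)
      return 1;
  return 0;
}

/* Hypothetical model of SPLIT (STRING, SET, POS [, BACK]).
   Positions are 1-based and *POS is updated in place.
   Forward:  smallest delimiter position greater than *POS,
             or SLEN + 1 if there is none.
   Backward: largest delimiter position less than *POS,
             or 0 if there is none.  */
void
sketch_string_split (size_t slen, const char *str,
                     size_t setlen, const char *set,
                     size_t *pos, int back)
{
  assert (pos != NULL);

  if (back)
    {
      /* Start just left of *POS, clamped to the string length.  */
      size_t i = *pos > slen + 1 ? slen : (*pos == 0 ? 0 : *pos - 1);
      for (; i >= 1; i--)
        if (sketch_is_delim (str[i - 1], setlen, set))
          {
            *pos = i;
            return;
          }
      *pos = 0;
    }
  else
    {
      for (size_t i = *pos + 1; i <= slen; i++)
        if (sketch_is_delim (str[i - 1], setlen, set))
          {
            *pos = i;
            return;
          }
      *pos = slen + 1;
    }
}
```

Starting from P = 4, this model leaves P = 9, so INPUT (ISTART:IEND) is INPUT (5:8), i.e. "last".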
__attribute__((fn spec (". ")))
void b ()
{
struct __st_parameter_dt dt_parm.0;
static character(kind=1) set[1:2] = ", ";
integer(kind=4) p;
character(kind=1)[1:.input] * input;
integer(kind=4) iend;
integer(kind=4) p.3_1;
character(kind=1) * _2;
integer(kind=8) _4;
integer(kind=8) _5;
integer(kind=8) _20;
<bb 2> [local count: 1073741824]:
input_8 = __builtin_malloc (16);
__builtin_memcpy (input_8, &"one,last example"[1]{lb: 1 sz: 1}, 16);
p = 4;
_gfortran_string_split (16, input_8, 2, &set, &p, 0);
p.3_1 = p;
iend_12 = p.3_1 + -1;
dt_parm.0.common.filename =
&"/home/c8ef/gcc/gcc/testsuite/gfortran.dg/split_1.f90"[1]{lb: 1 sz: 1};
dt_parm.0.common.line = 11;
dt_parm.0.format = &"(T7,A)"[1]{lb: 1 sz: 1};
dt_parm.0.format_len = 6;
dt_parm.0.common.flags = 4096;
dt_parm.0.common.unit = 6;
_gfortran_st_write (&dt_parm.0);
_20 = (integer(kind=8)) iend_12;
_2 = &(*input_8)[5]{lb: 1 sz: 1};
_4 = _20 + -4;
_5 = MAX_EXPR <_4, 0>;
_gfortran_transfer_character_write (&dt_parm.0, _2, _5);
_gfortran_st_write_done (&dt_parm.0);
dt_parm.0 ={v} {CLOBBER(eos)};
p ={v} {CLOBBER(eos)};
return;
}
Even after looking at the optimized (-O1) GIMPLE dump above, it is still
unclear why *p is 16128084538487209988 instead of 4 at both -O0 and -O1.
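One observation that may help narrow this down: the unexpected value is not arbitrary in its low half. Split into 32-bit words, it has a low word of exactly 4, the value stored in `p`, and an unrelated-looking high word. On a little-endian target, that pattern would be consistent with the position being read as a 64-bit integer through `&p` while `p` itself is only `integer(kind=4)`. This is a hypothesis, not something the dump confirms. The helper names below are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Low 32 bits of a 64-bit value.  */
uint32_t
sketch_low32 (uint64_t v)
{
  return (uint32_t) (v & 0xFFFFFFFFu);
}

/* High 32 bits of a 64-bit value.  */
uint32_t
sketch_high32 (uint64_t v)
{
  return (uint32_t) (v >> 32);
}

/* Reassemble a 64-bit value from its two 32-bit halves.  */
uint64_t
sketch_join32 (uint32_t high, uint32_t low)
{
  uint64_t v = ((uint64_t) high << 32) | low;
  assert (sketch_low32 (v) == low);   /* sanity check on the split */
  return v;
}
```

For 16128084538487209988, `sketch_low32` gives 4 and `sketch_high32` gives 3755112304, so the lower four bytes hold the expected value and only the upper four look like leftover memory.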