Hi Mikael,
On 7/14/2025 8:46 PM, Mikael Morin wrote:
On 13/07/2025 at 16:39, c8ef wrote:
Hi all,
I'm currently working on implementing the `split` procedure, which was
added in Fortran 2023. Given its similar functionality to the `scan`
intrinsic, I've been learning the implementation of `scan` to better
understand its mechanics. During my investigation of the source code,
I've come across a couple of questions that I'm hoping you could help
me with:
* Function Resolution
I noticed that `scan` resolves to
`gfc_get_string("__scan_%d", string->ts.kind)`. However, based on my
examination of `libgfortran` and
`trans-intrinsic.cc`, it appears this ultimately forwards to the
`string_scan` function. Could you please explain the significance of
this resolution step? Is it critical to the current implementation, or
is it perhaps a remnant of historical design?
No, it's not critical; I think the name can be used in compile-time
errors for example. The name that really matters is the name of the
function declaration that is passed to the middle-end (look for
string_scan in trans-decl.cc). Note that there are two variants:
string_scan, and string_scan_char4 (both on the front-end side and in
libgfortran).
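
As a rough illustration of the two names involved, here is a standalone sketch (not GCC code). `resolved_name` and `library_name` are hypothetical helpers, and the kind-to-routine mapping assumes only character kinds 1 and 4:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the front-end resolved name following the
   "__scan_%d" pattern quoted above.  */
static void
resolved_name (char *buf, size_t size, int kind)
{
  assert (buf != NULL && size > 0);
  snprintf (buf, size, "__scan_%d", kind);
}

/* Hypothetical helper: choose between the two variants mentioned above.
   Assumption: kind 4 maps to the char4 variant, anything else to the
   default one.  */
static const char *
library_name (int kind)
{
  return kind == 4 ? "string_scan_char4" : "string_scan";
}
```

For kind 1 this gives the pair `__scan_1` / `string_scan`, and for kind 4 the pair `__scan_4` / `string_scan_char4`.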
* Argument Passing
According to the Fortran specification, the `scan` intrinsic accepts
arguments for `string`, `set`, `back`, and `kind`. Yet, the intrinsic
implemented in `libgfortran` seems to take `charlen` and the actual
pointers for both `string` and `set`. I've tried searching, but I
haven't been able to pinpoint where this transformation from the
specified arguments to the `libgfortran` expected arguments occurs.
Any guidance on this would be greatly appreciated.
You can have a look in gfc_conv_intrinsic_function_args.
To arrive there, the call stack starts with gfc_conv_expr (the main
expression translation entry point), then gfc_conv_function_expr,
gfc_conv_intrinsic_function, there jump to the GFC_ISYM_SCAN switch
case, then gfc_conv_intrinsic_index_scan_verify and finally
gfc_conv_intrinsic_function_args.
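
To illustrate what the lowered call looks like on the library side, here is a minimal sketch, assuming each character argument arrives as a (length, pointer) pair. `sketch_string_scan` is a hypothetical stand-in written for this message, not the libgfortran source:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a library scan routine.  Each Fortran string
   is passed as a length plus a pointer; no NUL terminator is assumed.
   Returns the 1-based position of the first character of STR that occurs
   in SET (the last one if BACK is nonzero), or 0 if there is none.  */
static size_t
sketch_string_scan (size_t slen, const char *str,
                    size_t setlen, const char *set, int back)
{
  assert (str != NULL && set != NULL);

  if (slen == 0 || setlen == 0)
    return 0;

  if (back)
    {
      /* Walk from the end toward the start.  */
      for (size_t i = slen; i > 0; i--)
        if (memchr (set, str[i - 1], setlen) != NULL)
          return i;
    }
  else
    {
      /* Walk from the start toward the end.  */
      for (size_t i = 0; i < slen; i++)
        if (memchr (set, str[i], setlen) != NULL)
          return i + 1;
    }

  return 0;
}
```

With this model, `sketch_string_scan (7, "FORTRAN", 2, "TR", 0)` gives 3, and the same call with `back` set to 1 gives 5.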
I hope it helps.
Thanks for your quick and helpful response! Based on your suggestions,
I've written the following trans-intrinsic function and test case:
static tree
conv_intrinsic_split (gfc_code *code)
{
  stmtblock_t block;
  gfc_se se;
  tree stringlen, string;
  tree setlen, set;
  tree pos, back;
  tree tmp;

  gfc_start_block (&block);

  gfc_init_se (&se, NULL);
  gfc_conv_expr (&se, code->ext.actual->expr);
  gfc_conv_string_parameter (&se);
  stringlen = se.string_length;
  gfc_add_block_to_block (&block, &se.pre);
  gfc_add_block_to_block (&block, &se.post);
  string = se.expr;

  gfc_init_se (&se, NULL);
  gfc_conv_expr (&se, code->ext.actual->next->expr);
  gfc_conv_string_parameter (&se);
  setlen = se.string_length;
  gfc_add_block_to_block (&block, &se.pre);
  gfc_add_block_to_block (&block, &se.post);
  set = se.expr;

  gfc_init_se (&se, NULL);
  gfc_conv_expr (&se, code->ext.actual->next->next->expr);
  gfc_add_block_to_block (&block, &se.pre);
  gfc_add_block_to_block (&block, &se.post);
  pos = se.expr;
  pos = gfc_build_addr_expr (NULL_TREE, pos);

  back = build_int_cst (gfc_get_logical_type (4), 0);

  tmp = build_call_expr_loc (input_location, gfor_fndecl_string_split,
                             6, stringlen, string, setlen, set, pos, back);
  gfc_add_expr_to_block (&block, tmp);

  return gfc_finish_block (&block);
}
***
! { dg-do run }
program b
  CHARACTER (LEN=:), ALLOCATABLE :: INPUT
  CHARACTER (LEN=2) :: SET = ', '
  INTEGER P
  INPUT = "one,last example"
  P = 4
  ISTART = P + 1
  CALL SPLIT (INPUT, SET, P)
  IEND = P - 1
  PRINT '(T7,A)', INPUT (ISTART:IEND)
end program b
The execution test fails at optimization levels -O0 and -O1 but passes
with the other four option sets. The error message indicates that *pos
holds an unusually large value at -O0/-O1. Could this be due to a bug in
the translation function, or does the issue lie elsewhere?
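
For reference, the result the test should produce can be worked out with a small standalone C model of the SPLIT semantics. This is a sketch under stated assumptions: `sketch_split` is a hypothetical helper, not the libgfortran routine, and it returns the new position by value rather than updating it through a pointer:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of SPLIT.  Positions are 1-based.
   Forward (BACK == 0): return the position of the first character of SET
   found after POS, or SLEN + 1 if there is none.
   Backward (BACK != 0): return the position of the last character of SET
   found before POS, or 0 if there is none.  */
static long
sketch_split (size_t slen, const char *str,
              size_t setlen, const char *set, long pos, int back)
{
  long len = (long) slen;

  assert (str != NULL && set != NULL);

  if (back)
    {
      /* Clamp the starting point so we never read past the string.  */
      long start = pos - 1 > len ? len : pos - 1;
      for (long i = start; i >= 1; i--)
        if (memchr (set, str[i - 1], setlen) != NULL)
          return i;
      return 0;
    }

  /* Clamp the starting point so we never read before the string.  */
  long start = pos + 1 < 1 ? 1 : pos + 1;
  for (long i = start; i <= len; i++)
    if (memchr (set, str[i - 1], setlen) != NULL)
      return i;
  return len + 1;
}
```

For the test above (string "one,last example", set ", ", P = 4), this model returns 9, so ISTART:IEND is 5:8 and the expected output is "last".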
Thanks,
Yuao