Hi Mikael,

On 7/14/2025 8:46 PM, Mikael Morin wrote:
On 13/07/2025 at 16:39, c8ef wrote:
Hi all,

I'm currently working on implementing the `split` procedure, which was added in Fortran 2023. Since its functionality is similar to the `scan` intrinsic, I've been studying the implementation of `scan` to better understand its mechanics. While reading the source code, I came across a couple of questions that I hope you can help me with:

* Function Resolution

I noticed that `scan` resolves to `gfc_get_string("__scan_%d", string->ts.kind)`. However, based on my examination of `libgfortran` and `trans-intrinsic.cc`, it appears this ultimately forwards to the `string_scan` function. Could you please explain the significance of this resolution step? Is it critical to the current implementation, or is it perhaps a remnant of historical design?

No, it's not critical; I think the name can be used in compile-time errors, for example.  The name that really matters is the name of the function declaration that is passed to the middle-end (look for string_scan in trans-decl.cc).  Note that there are two variants: string_scan and string_scan_char4 (both on the front-end side and in libgfortran).

* Argument Passing

According to the Fortran specification, the `scan` intrinsic accepts arguments for `string`, `set`, `back`, and `kind`. Yet, the intrinsic implemented in `libgfortran` seems to take `charlen` and the actual pointers for both `string` and `set`. I've tried searching, but I haven't been able to pinpoint where this transformation from the specified arguments to the `libgfortran` expected arguments occurs. Any guidance on this would be greatly appreciated.

You can have a look in gfc_conv_intrinsic_function_args.

To arrive there, the call stack starts with gfc_conv_expr (the main expression translation entry point), then gfc_conv_function_expr, gfc_conv_intrinsic_function, there jump to the GFC_ISYM_SCAN switch case, then gfc_conv_intrinsic_index_scan_verify and finally gfc_conv_intrinsic_function_args.
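To make the shape of the lowered call concrete, here is a minimal sketch in plain C of a SCAN-style helper that receives each character argument as a (length, pointer) pair. The name scan_sketch and the use of size_t and int are assumptions for illustration only; the real libgfortran routine uses its own length and logical types and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a SCAN-style runtime helper (hypothetical name).
   Each Fortran character argument arrives as a (length, pointer) pair and
   BACK as a plain integer flag.  Returns the 1-based position of the first
   (or, when back is nonzero, the last) character of str that occurs in
   set, or 0 if there is none.  */
static size_t
scan_sketch (size_t slen, const char *str,
             size_t setlen, const char *set, int back)
{
  assert (str != NULL && set != NULL);

  if (slen == 0 || setlen == 0)
    return 0;

  if (back)
    {
      /* Walk from the end so the last match wins.  */
      for (size_t i = slen; i > 0; i--)
        if (memchr (set, str[i - 1], setlen) != NULL)
          return i;
    }
  else
    {
      /* Walk from the start so the first match wins.  */
      for (size_t i = 0; i < slen; i++)
        if (memchr (set, str[i], setlen) != NULL)
          return i + 1;
    }

  return 0;
}
```

With this sketch, a Fortran call such as SCAN ("one,last example", ", ") would correspond to scan_sketch (16, "one,last example", 2, ", ", 0); the strings are not NUL-terminated on the Fortran side, which is why the lengths travel separately.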

I hope it helps.

Thanks for your quick and helpful response! Based on your suggestions, I've written the following trans-intrinsic function and test case:

static tree
conv_intrinsic_split (gfc_code *code) {
  stmtblock_t block;
  gfc_se se;
  tree stringlen, string;
  tree setlen, set;
  tree pos, back;
  tree tmp;

  gfc_start_block (&block);

  gfc_init_se (&se, NULL);
  gfc_conv_expr(&se, code->ext.actual->expr);
  gfc_conv_string_parameter(&se);
  stringlen = se.string_length;
  gfc_add_block_to_block(&block, &se.pre);
  gfc_add_block_to_block(&block, &se.post);
  string = se.expr;

  gfc_init_se (&se, NULL);
  gfc_conv_expr(&se, code->ext.actual->next->expr);
  gfc_conv_string_parameter(&se);
  setlen = se.string_length;
  gfc_add_block_to_block(&block, &se.pre);
  gfc_add_block_to_block(&block, &se.post);
  set = se.expr;

  gfc_init_se (&se, NULL);
  gfc_conv_expr(&se, code->ext.actual->next->next->expr);
  gfc_add_block_to_block(&block, &se.pre);
  gfc_add_block_to_block(&block, &se.post);
  pos = se.expr;
  pos = gfc_build_addr_expr(NULL_TREE, pos);

  back = build_int_cst (gfc_get_logical_type (4), 0);

  tmp = build_call_expr_loc (input_location, gfor_fndecl_string_split, 6,
                             stringlen, string, setlen, set, pos, back);
  gfc_add_expr_to_block (&block, tmp);
  return gfc_finish_block (&block);
}

***

! { dg-do run }
program b
  CHARACTER (LEN=:), ALLOCATABLE :: INPUT
  CHARACTER (LEN=2) :: SET = ', '
  INTEGER P
  INPUT = "one,last example"
  P = 4
  ISTART = P + 1
  CALL SPLIT (INPUT, SET, P)
  IEND = P - 1
  PRINT '(T7,A)', INPUT (ISTART:IEND)
end program b

The execution test failed on optimization levels O0 and O1 but passed on the other four options. The error message indicates that *pos holds an unusually large value in O0/O1. Could this be due to a bug in the trans function, or does the issue lie elsewhere?
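For reference, the result the test expects can be modelled with a small standalone C sketch. It assumes the following reading of SPLIT: without BACK, POS becomes the position of the leftmost delimiter located after POS, or LEN(STRING)+1 if there is none; with BACK, POS becomes the rightmost delimiter located before POS, or 0. The name split_sketch and the int type chosen for pos are hypothetical and say nothing about what the library routine actually expects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of SPLIT (STRING, SET, POS, BACK) under the reading
   described above.  Positions are 1-based, and *pos is updated in place.  */
static void
split_sketch (size_t slen, const char *str,
              size_t setlen, const char *set, int *pos, int back)
{
  assert (str != NULL && set != NULL && pos != NULL);

  if (back)
    {
      /* POS must lie in 1 .. slen + 1 when searching backward.  */
      assert (*pos >= 1 && (size_t) *pos <= slen + 1);
      for (size_t i = (size_t) *pos - 1; i >= 1; i--)
        if (memchr (set, str[i - 1], setlen) != NULL)
          {
            *pos = (int) i;
            return;
          }
      *pos = 0;
    }
  else
    {
      /* POS must lie in 0 .. slen when searching forward.  */
      assert (*pos >= 0 && (size_t) *pos <= slen);
      for (size_t i = (size_t) *pos + 1; i <= slen; i++)
        if (memchr (set, str[i - 1], setlen) != NULL)
          {
            *pos = (int) i;
            return;
          }
      *pos = (int) slen + 1;
    }
}
```

Under these assumptions, starting from P = 4 with "one,last example" and the set ", " moves P to 9, so INPUT (5:8) is "last", which is the substring the test program prints.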

Thanks,
Yuao
