Re: [PATCH] fortran: implement conditional expression for fortran 2023

Tobias Burnus Tue, 26 Aug 2025 02:40:37 -0700

Hi Yuao,

Yuao Ma wrote:

I addressed some review comments and included additional feedback.

Thanks for the update. I know that the patch is incomplete, but as Istarted with bootstrapping, I start with the following. I bet the issuesare known to you:

First, your patch does not bootstrap* here. As the stage2 and stage3 arebuild with the current GCC, some -Werror= messages are enabled. And thisbuild fails with your patch:

../../../repos/gcc/gcc/fortran/trans-expr.cc:5338:10: error: enumerationvalue ‘EXPR_CONDITIONAL’ not handled in switch [-Werror=switch]


Likewise for 'module.cc'.

(*With --enable-bootstrap, GCC is build trice, once with the systemcompiler, then the just build stage1 compiler builds GCC withoutdebugging symbols and then stage2 builds stage3 with debugging symbols –and then stage2 and stage3 are checked to be identical after strippingthe debugging symbols.)

It seems as if the following functions need to handle it (might beincomplete and not all might be required):

dependency.cc: gfc_dep_compare_expr, gfc_check_dependency,contains_forall_index_p, gfc_contains_implied_index_p


expr.cc: gfc_traverse_expr, gfc_check_init_expr, check_restricted,

frontend-passes.cc: gfc_expr_walker and traverse_io_block

module.cc: not handled.

frontend-passes.cc:

openmp.cc: expr_references_sym

resolve.cc: gfc_resolve_expr

trans-expr.c: gfc_apply_interface_mapping_to_expr

* * *

… as said, you are aware of it:

Yes I am aware of that, which is why I'm a bit fear of implementingit. Every time we switch an expression type, we have to considerwhether a conditional expression should be a no-op or require specialhandling. This would, however, be a great opportunity to dive deepinto the gfortran codebase.


Yes, but usually, neither new EXPR_TYPES nor new switch statement pop up.

* * *

Regarding the diagnostic:

...

Talking about the location, I wonder whether
the expr->where should be not the beginning of the expr,
i.e. 'll' for the outer and the leftmost ':' for the inner
conditional, but at the (respective) '?'.

Additionally, I wonder whether in

+ gfc_error ("True and false expressions in conditionalexpression "

+                "must have the same declared type at %L",
+                &expr->where);

the '&expr->where' should be replaced by:

gfc_get_location_range (NULL, 0, &true_expr->where,0, &false_expr- >where, 0)

* * *

I am not very familiar with PDT in Fortran. I will learn about it andadd the relevant parts.

I think for now, it is the simplest to just reject expressions withBT_DERIVED + BT_CLASS wih a sorry (and a note to handle the kindarguments of PDT).


* * *

Thanks for pointing that out; that was my oversight. I obviouslyassumed conditional expressions don't require surrounding parentheses,unlike in C++. To parse this form, maybe we need to handle it in`match_primary`?
I have another question regarding this. The standard states that anyexpression in parentheses should be treated as a data entity. Does aconditional expression satisfy this requirement?


Well, data entity seems to be extremely wide, covering the result
of function calls and variables and other expressions and … [→ 5.4.3].

I think one place where '(…)' appeared in current code is:
'A' is a variable, '(A)' isn't but an expression. Hence:

integer :: A(-1:2)
lbound(A,dim=1) → -1 // lower bound of variable 'A'
lbound( (A), dim=1) → 1 // default lower bound of an expression is 1.

* * *

call myfunc(A) → subroutine may modify 'A'
call myfunc( (A) ) → argument is not definable, i.e. A may not be modified

* * *

While function calls are handled explicitly in the spec, the variable
vs. expression topic matters for code like:

C = (cond ? A : B )  ! expression or variable ?

Reason:

implicit none
integer, allocatable :: A(:), B(:), C(:)
allocate(A(-1:1))
A(:) = [1,2,3]

B = A
C = (A)
print *, lbound(B)
print *, lbound(C)
end

The example relies on (re)allocate on assignment and
prints '-1' (bound taken from a variable) and
'1' (bounds taken from an expression).

My reading is that for
  C = (cond ? A : B)
the lower bound should be '1' - while for function calls
like:
  lbound(cond ? A : B, dim=1)
the lower bound should be the one of A or B, respectively.

* * *

I don't know whether the '(...)' is just syntactical sugar,
added because of reasons like my example above or to aid parsing.

In fixed-form Fortran, the following line is ambiguous
      d o i = f(g) + 40 * N
     & ...

This can either start an assignment to an (possibly implicitly typed)
variable named 'doi'.
Or it can start a 'do' loop with loop variable 'i' ('do i = start,end,step').

The compiler only knows when it either hits the end of the statement or
a comma.

It might be that the added '(...)' helps in some cases that would otherwise
be ambiguous. (Albeit I don't see where it would be.)

* * *

R1002 conditional-expr is ( scalar-logical-expr ? expr [ : scalar-logical-expr ? expr ]... : expr )
But Fortran does not seem to require it for inner conditionalexpressions
in the 'else' branch:

print *, ( i>j ? i : j > k ? k)  ! OK
print *, ( i>j ? j > k ? j : k : m )  ! invalid
print *, ( i>j ? (j > k ? j : k) : m )  ! OK
Yes, you are right. I have fixed the issue, so now the above threeshould be valid.


Currently, the compiler accepts all three variants above
and A to E below. — But if I read R1002 correctly,
the second example above and 'D' below are invalid
and should be rejected.

I think your previous patch was already correctly handling it,
but due to the missing enforcement of the '(…)' requirement,
I am not completely sure.

In any case, I think R1002 means the following:

---------------------------------------
implicit none
logical :: cond, cond2, cond3
integer :: expr, expr2, expr3, expr4

! Fortran permits:
!   ( scalar-logical-expr ? expr [ : scalar-logical-expr ? expr ]... : expr )
!
! Namely:
!   ( scalar-logical-expr ? expr : expr )
print *, 'A:', ( cond ? expr : expr2 )

! Namely:
!   ( scalar-logical-expr ? expr : scalar-logical-expr ? expr : expr )
print *, 'B:', ( cond ? expr : cond2 ? expr2 : expr3 )

! Namely:
!   ( scalar-logical-expr ? expr : scalar-logical-expr ? expr : 
scalar-logical-expr ? expr : expr )
print *, 'C:', ( cond ? expr : cond2 ? expr2 : cond3 ? expr3 : expr4 )


! INVALID - error expected
!   ( scalar-logical-expr ? scalar-logical-expr ? expr : expr : expr )
print *, 'D:', ( cond ? cond2 ? expr : expr2 : expr3 )  ! { dg-error "..." }

! FIXED: Adding '( ... )'
!   ( scalar-logical-expr ? ( ... ) : expr )
print *, 'E:', ( cond ? ( cond2 ? expr : expr2 ) : expr3 )
end
---------------------------------------

For the cond-expr feature, I was initially focused on the backend codegeneration, but it looks like the frontend part (parsing & resolving)still needs some work. I'll take care of that ASAP.

Thanks. As said, having a first version with some 'sorry' which rejectthe cases to be handled later would be fine.


* * *

I do have a question about the backend, though. I've noticed that mostof the assignment scalarization happens in gfc_trans_assignment_1. Ifwe continue with our current approach for cond-expr, are we going toduplicate a lot of common logic? I'm curious if we can reuse some ofthat, like setting the string length or scalarizing the array.


I think we need do so some duplication as it might appear in multiple places
and we only want to evaluate function calls as few times as possible.

I hope that the changes are either small – or can be factored out.
But I have not really thought about it in depth. For simple scalars,
the current code seems to be fine.

For instance:
  call sub( (cond ? A : B) )
here, passing the address of A and B directly works.
While for
  call sub ( (cond ? A : B ) + 1)
a scalarized loop with a temporary is required.

For '(cond() ? A(g())%comp : B(h())%comp2(5:))'
the condition and (conditionally) g() and h() and
possibly also the effective array size should be evaluated
only once and before the scalarized loop.

That's not too different from existing code, except that
evaluations need to happen conditionally - and we have
to be careful as some code walking differs. For instance,
for:

integer, target :: A(10), B(10), C(:), D(10)
integer, allocatable :: C(:)
integer, pointer :: ptr(:)

C = (cond ? A : B)

a simple (mixing C style and Fortran-style syntax):
  tmp = cond ? &A : &B
  allocate(C( (cond ? size(A) : size(B)) ))
  C(i) = tmp(i)
will do.

But for
  ptr => B(::2)
  D = (cond ? A : ptr)

the values are only contiguous for 'A' but not for 'ptr'.

Thus, we either need:
  D(i) = cond ? A(i) : ptr(i)
putting the condition and the offset calculation into the loop.
or:
  stride = cond ? A.stride : ptr.stride
  tmp = cond ? A.base_addr : ptr.base_addr
  ...
  D(i) = *(tmp + i * stride);

where the second version is more efficient, but we have to think
harder how to implement it. At least, both variants need to have
the same array rank - or are assumed rank, but that implies that
it only appears as actual argument.

* * *

There should be already some code which handles early evaluation
(like for 'C = var(g())%array' - where g() is only evalulated
once), which we can also use when scalarizing. Some other code
requires more thinking, but could also be implemented later
like the last example.

Tobias

Re: [PATCH] fortran: implement conditional expression for fortran 2023

Reply via email to