Re: [patch, fortran] First part of Fortran's unsigned implementation

Andre Vehreschild Tue, 13 Aug 2024 02:40:08 -0700

Hi Thomas,

may I ask you to run contrib/check_GNU_style.py on your patch? At least on my
system more than lines 50 are reported. I am drawn to this style issues and find
it hard to digest the beef of the patch. That's my personal OCD unfortunately.


Furthermore (Sorry, I inserted your w/o cite and noted that after it was too
late. I have prefixed my remarks/questions with AV: to make them easier to
find. I hope this helps):

diff --git a/gcc/fortran/check.cc b/gcc/fortran/check.cc
index 2f50d84b876..1020ba5342f 100644
--- a/gcc/fortran/check.cc
+++ b/gcc/fortran/check.cc
@@ -7637,3 +7960,12 @@ gfc_check_storage_size (gfc_expr *a, gfc_expr *kind)

   return true;
 }
+
+/* Check two operands that either both or none of them can
+   be UNSIGNED.  */
+
+bool
+gfc_invalid_unsigned_ops (gfc_expr * op1, gfc_expr * op2)
+{
+  return (op1->ts.type == BT_UNSIGNED) + (op2->ts.type == BT_UNSIGNED) == 1;

AV: That's an interesting way to model an xor. Why not `op1.. ^ op2..`? Yes,
it's bitwise, but on bools.

+}
diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 7e8783a3690..9043fa321dc 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -2701,7 +2702,90 @@ descriptor occurred, use @code{INQUIRE} to get the file
position, count the characters up to the next @code{NEW_LINE} and then start
 reading from the position marked previously.

+@node Experimental features for Fortran 202Y
+@section Experimental features for Fortran 202Y
+@cindex Fortran 202Y

+GNU Fortran supports some experimental features which have been
+proposed and accepted by the J3 standards committee.  These
+exist to give users a chance to try them out, and to provide
+a reference implementation.
+
+As these features have not been finalized, there is a chance that the
+version in the upcoming standard will differ from what GNU Fortran
+currently implements.  Stability of these implementations is therefore
+not guaranteed.
+
+@menu
+* Unsigned integers::
+@end menu
+
+@node Unsigned integers
+@subsection Unsigned integers
+@cindex Unsigned integers
+GNU Fortran supports unsigned integers according to
+@uref{https://j3-fortran.org/doc/year/24/24-116.txt, J3/24-116}.  The
+data type is called @code{UNSIGNED}.  For an unsigned type with $n$ bits,
+it implements integer arithmetic modulo @code{2**n}, comparable to the
+@code{unsigned} data type in C.
+
+The data type has @code{KIND} numbers comparable to other Fortran data
+types, which can be selected via the @code{SELECTED_UNSIGNED_KIND}
+function.
+
+Mixed arithmetic, comparisoins and assignment between @code{UNSIGNED}

AV: ... comparisons ...

+and other types are only possible via explicit conversion.  Conversion
+from @code{UNSIGNED} to other types is done via type conversion
+functions like @code{INT} or @code{REAL}. Conversion from other types
+to @code{UNSIGNED} is done via @code{UINT}. Unsigned variables cannot be
+used as index variables in @code{DO} loops or as array indices.
+
+Unsigned numbers have a trailing @code{u} as suffix, optionally followed
+by a @code{KIND} number separated by an underscore.
+
+Input and output can be done using the @code{I}, @code{B}, @code{O}
+and @code{Z} descriptors, plus unformatted I/O.
+
+Here is a small, somewhat contrived example of their use:
+@smallexample
+program main
+  unsigned(kind=8) :: v
+  v = huge(v) - 32u_8
+  print *,v
+end program main
+@end smallexample
+which will output the number 18446744073709551583.
+
+Arithmetic operations work on unsigned integers, except for exponentiation,
+which is prohibited.  Unary minus is not permitted when @code{-predantic}

AV: ... @code{-pedantic}

+is in force; this prohibition is part of J3/24-116.txt.
+
+Generally, unsigned integers are only permitted as data in intrinsics.
+
+Unsigned numbers can be read and written using list-directed,
+formatted and unformatted I/O.  For formatted I/O, the @code{B},
+@code{I}, @code{O} and @code{Z} descriptors are valid.  Negative
+values and values which would overflow are rejected with
+@code{-pedantic}.
+
+As of now, the following intrinsics take unsigned arguments:
+@itemize @bullet
+@item @code{BLT}, @code{BLE}, @code{BGE} and @code{BGT}. These intrinsics
+      are actually redundant because comparison operators could be used
+      directly.
+@item @code{IAND}, @code{IOR}, @code{IEOR} and @code{NOT}
+@item @code{BIT_SIZE}, @code{DIGITS} and @code{HUGE}
+@item @code{DSHIFTL} and @code{DSHIFTR}
+@item @code{IBCLR}, @code{IBITS} and @code{IBITS}

AV: IBITS and IBITS ???

+@item @code{MIN} and @code{MAX}
+@item @code{ISHFT}, @code{ISHFTC}, @code{SHIFTL}, @code{SHIFTR} and
@code{SHIFTA}. +@item @code{MERGE_BITS}
+@item @code{MOD} and @code{MODULO}
+@item @code{MVBITS}
+@item @code{RANGE}
+@item @code{TRANSFER}
+@end itemize
+This list will grow in the near future.
 @c ---------------------------------------------------------------------
 @c ---------------------------------------------------------------------
 @c Mixed-Language Programming
diff --git a/gcc/fortran/intrinsic.cc b/gcc/fortran/intrinsic.cc
index 40f4c4f4b0b..926ac44dfd4 100644
--- a/gcc/fortran/intrinsic.cc
+++ b/gcc/fortran/intrinsic.cc
@@ -5316,7 +5347,8 @@ gfc_convert_type_warn (gfc_expr *expr, gfc_typespec *ts,
int eflag, int wflag, else if (from_ts.type == ts->type
               || (from_ts.type == BT_INTEGER && ts->type == BT_REAL)
               || (from_ts.type == BT_INTEGER && ts->type == BT_COMPLEX)
-              || (from_ts.type == BT_REAL && ts->type == BT_COMPLEX))
+              || (from_ts.type == BT_REAL && ts->type == BT_COMPLEX)
+              || (from_ts.type == BT_UNSIGNED && ts->type == BT_UNSIGNED))

AV: I don't get why converting from unsigned to unsigned is an issue here?

        {
          /* Larger kinds can hold values of smaller kinds without problems.
             Hence, only warn if target kind is smaller than the source
diff --git a/gcc/fortran/iresolve.cc b/gcc/fortran/iresolve.cc
index c63a4a8d38c..f466a473f15 100644
--- a/gcc/fortran/iresolve.cc
+++ b/gcc/fortran/iresolve.cc
@@ -895,11 +895,13 @@ void
 gfc_resolve_dshift (gfc_expr *f, gfc_expr *i, gfc_expr *j ATTRIBUTE_UNUSED,
                    gfc_expr *shift ATTRIBUTE_UNUSED)
 {
+  char c = i->ts.type == BT_INTEGER ? 'i' : 'u';
+
   f->ts = i->ts;
   if (f->value.function.isym->id == GFC_ISYM_DSHIFTL)
-    f->value.function.name = gfc_get_string ("dshiftl_i%d", f->ts.kind);
+    f->value.function.name = gfc_get_string ("dshiftl_%c%d", c, f->ts.kind);
   else if (f->value.function.isym->id == GFC_ISYM_DSHIFTR)
-    f->value.function.name = gfc_get_string ("dshiftr_i%d", f->ts.kind);
+    f->value.function.name = gfc_get_string ("dshiftr_%c%d", c, f->ts.kind);
   else
     gcc_unreachable ();
 }
@@ -1182,6 +1184,7 @@ gfc_resolve_iand (gfc_expr *f, gfc_expr *i, gfc_expr *j)
   /* If the kind of i and j are different, then g77 cross-promoted the
      kinds to the largest value.  The Fortran 95 standard requires the
      kinds to match.  */
+
   if (i->ts.kind != j->ts.kind)
     {
       if (i->ts.kind == gfc_kind_max (i, j))
@@ -1191,7 +1194,8 @@ gfc_resolve_iand (gfc_expr *f, gfc_expr *i, gfc_expr *j)
     }

   f->ts = i->ts;
-  f->value.function.name = gfc_get_string ("__iand_%d", i->ts.kind);
+  const char *name = i->ts.kind == BT_UNSIGNED ? "__iand_m_%d" : "__iand_%d";

AV: Why not using _(u|i)%d here, too, like above with dshift?

+  f->value.function.name = gfc_get_string (name, i->ts.kind);
 }



@@ -2213,7 +2239,8 @@ void
 gfc_resolve_not (gfc_expr *f, gfc_expr *i)
 {
   f->ts = i->ts;
-  f->value.function.name = gfc_get_string ("__not_%d", i->ts.kind);
+  const char *name = i->ts.kind == BT_UNSIGNED ? "__not_u_%d" : "__not_%d";

AV: And here another "style". Wouldn't a consistent one be more understandable
or did these function pre-exist and now are only reused? (Sorry, for my
ignorance).

+  f->value.function.name = gfc_get_string (name, i->ts.kind);
 }


diff --git a/gcc/fortran/primary.cc b/gcc/fortran/primary.cc
index 76f6bcb8a78..80cbf39a752 100644
--- a/gcc/fortran/primary.cc
+++ b/gcc/fortran/primary.cc
@@ -209,6 +209,44 @@ convert_integer (const char *buffer, int kind, int radix,
locus *where) }


+/* Convert an unsigned string to an expression node.  XXX:
+   This needs a calculation modulo 2^n.  TODO: Implement restriction
+   that no unary minus is permitted.  */
+static gfc_expr *
+convert_unsigned (const char *buffer, int kind, int radix, locus *where)
+{
+  gfc_expr *e;
+  const char *t;
+  int k;
+  arith rc;
+
+  e = gfc_get_constant_expr (BT_UNSIGNED, kind, where);
+  /* A leading plus is allowed, but not by mpz_set_str.  */
+  if (buffer[0] == '+')
+    t = buffer + 1;
+  else
+    t = buffer;
+
+  mpz_set_str (e->value.integer, t, radix);
+
+  k = gfc_validate_kind (BT_UNSIGNED, kind, false);
+
+  /* XXX Maybe move this somewhere else.  */

AV: How about replacing XXX by TODO and above? No one searches for XXX and TODOs
are quite good support by most recent IDEs.

+  rc = gfc_range_check (e);
+  if (rc != ARITH_OK)
+    {
+    if (pedantic)
+      gfc_error_now (gfc_arith_error (rc), &e->where);
+    else
+      gfc_warning (0, gfc_arith_error (rc), &e->where);
+    }
+
+  gfc_convert_mpz_to_unsigned (e->value.integer,
gfc_unsigned_kinds[k].bit_size,
+                              false);
+
+  return e;
+}
+
 /* Convert a real string to an expression node.  */

 static gfc_expr *
@@ -296,6 +334,71 @@ match_integer_constant (gfc_expr **result, int signflag)
   return MATCH_YES;
 }

+/* Match an unsigned constant (an integer with suffixed u).  No sign

AV: ... with suffix u)...

+   is currently accepted, in accordance with 24-116.txt, but that
+   could be changed later.  This is very much like the integer
+   constant matching above, but with enough differences to put it into
+   its own function.  */
+

diff --git a/gcc/fortran/simplify.cc b/gcc/fortran/simplify.cc
index 8ddd491de11..e339f7ebc06 100644
--- a/gcc/fortran/simplify.cc
+++ b/gcc/fortran/simplify.cc
@@ -3738,16 +3814,48 @@ gfc_simplify_idint (gfc_expr *e)
   return range_check (result, "IDINT");
 }

+gfc_expr *
+gfc_simplify_uint (gfc_expr *e, gfc_expr *k)
+{
+  gfc_expr *result = NULL;
+  int kind;
+
+  kind = get_kind (BT_INTEGER, k, "INT", gfc_default_integer_kind);

AV: May I ask you to add a comment, why the above is correct?

AV: I have skipped reviewing all testcases.

AV: Nothing to comment in the library part.

I have to admit, that I am only familiar to a small part of the code.
Therefore I hope that my initial comments will make it easier for a second
reviewer to comment on the Fortran-specific problems?!

Thanks for the patch and the big effort. I hope my comments help at least a bit.

Regards,
        Andre

On Mon, 12 Aug 2024 21:40:07 +0200
Thomas Koenig <tkoe...@netcologne.de> wrote:

> Hello world,
>
> the attached patch and ChangeLog show the current state of the UNSIGNED
> implementation for gfortran.  This pretty much follows J3/24-116.txt
> and implements the basic functionality, plus the non-array intrinsics.
> Some basic functionality is tested (see the attached test cases),
> but there are, with a very high probability, still quite a few bugs.
>
> However, given my problems with git and the branch, maybe the
> best strategy is to push this to master as soon as possible;
> I would then start working on the array intrinsics.
>
> Regarding where to put this:  Paul, you had the idea of making this
> dependent on a future standard plan.  I think we can do this, setting
> -funsigned when this is flag is set.
>
> Where to put the test cases: I currently have them in the main
> gfortran.dg directory.  A subdirectory might also be a good idea,
> but then somebody would have to help me withe DejaGnu code to
> put there.
>
> So... Comments? Suggestions? OK for master?
>
> Best regards
>
>       Thomas
>


--
Andre Vehreschild * Email: vehre ad gmx dot de

Re: [patch, fortran] First part of Fortran's unsigned implementation

Reply via email to