On Jun 23, 2023, Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:

> -ftrivial-auto-var-init has been in GCC since GCC12.  

*nod*.  IIRC I started designing hardbool in GCC10, but the first
complete implementation was for GCC11.

> The decision to use 0x00 for zero-initialization and 0xfe for
> pattern-initialization was discussed thoroughly during the design phase. -:)

*nod*, and it's a good one.

> Since this hardbool attribute will also be a security feature, users that
> seek security features of GCC will likely ask the same question for these
> two features.

> So, their interaction is better to be at least documented. That’s my
> main point.

*nod*.  At first, I thought the ideal place to clarify the issue was in
the documentation for the option, because there's nothing exceptional
about the option's behavior when it comes to hardbool specifically.  But
it doesn't hurt to mention it in both places, so I did.  How about the
incremental patchlet below (at the end)?

> For normal Boolean variables, 0x00 is false, this is a reasonable init
> value with zero-initialization.

*nod*.  I was surprised that (non-hardened) booleans get zero-initialized
even when pattern is requested, and not even consistently (e.g. boolean
fields of a larger struct would still get pattern-initialized IIUC).  I'd
have expected pattern to translate to nonzero and thus to true, rather
than false.

> For hardbool variables, what 0x00 represents if it’s not false or true
> value?

It depends on how hardbool is parameterized.  One may pick 0x00 or 0xFE
as the representations for true or false, or neither, in which case the
trivial initializer will end up as a trapping value.
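
To make the three outcomes concrete, here is a minimal sketch in C.  It
is only a model written for illustration, not GCC code: hb_state and
hb_classify are hypothetical names, and "trap" stands for whatever the
validity check does when the stored byte matches neither representation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: how one stored byte reads back through a hardened
   boolean type parameterized with FALSE_REP and TRUE_REP.  */
enum hb_state { HB_FALSE, HB_TRUE, HB_TRAP };

enum hb_state
hb_classify (uint8_t bits, uint8_t false_rep, uint8_t true_rep)
{
  /* The two representations must differ to be distinguishable.  */
  assert (false_rep != true_rep);

  if (bits == false_rep)
    return HB_FALSE;
  if (bits == true_rep)
    return HB_TRUE;
  return HB_TRAP; /* Neither representation: a read would trap.  */
}
```

In this model, with 0x5a and 0xa5 as the representations, both trivial
initializer bytes (0x00 and 0xfe) classify as HB_TRAP; with 0x00 and
0xfe picked for false and true, zero reads back as false and pattern
as true.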

>> I'd probably have arranged for the front-end to create the initializer
>> value, because expansion time is too late to figure it out: we may not
>> even have the front-end at hand any more, in case of lto compilation.

> Is the hardbool attribute information available during the rtl
> expansion phase?

It is in the sense that the attribute lives on, but c_hardbool_type_attr
is a frontend function, it cannot be called from e.g. lto1.

The hardbool attribute is also implemented in Ada, but there it only
affects validity checking in the front end: Boolean types in Ada are
Enumeration types, and there is standard syntax to specify the
representations for true and false.  AFAICT, once we translate GNAT IR
to GNU IR, hardened booleans would not be recognizable as boolean types.
Neither would non-hardened booleans with representation clauses.  So
handling these differently from other enumeration types, to make them
closer to booleans, would be a bit of a challenge, and a
backwards-compatibility issue (because such booleans have already been
handled in the present way since the introduction of -ftrivial-* back in
GCC12).

>> Now, I acknowledge that the decision to make implicit
>> zero-initialization of boolean types set them to the value for false,
>> rather than to all-bits-zero representation, is a departure from common
>> practice of zero-initialization yielding logical zero.

> Don’t quite understand the above, for normal boolean variables,

Sorry, I meant hardened boolean types.  This was WRT the design
decision that led to this bit in the documentation:

typedef char __attribute__ ((__hardbool__ (0x5a))) hbool;
[...]
static hbool zeroinit; /* False, stored as (char)0x5a.  */
auto hbool uninit;     /* Undefined, may trap.  */
  
> And this is a very reasonable initial value for Boolean variables,

Agreed.  The all-zeros bit pattern is not so great for booleans that use
alternate representations, though, such as the following standard Ada:

   type MyBool is new Boolean;
   for MyBool use (16#5a#, 16#a5#);
   for MyBool'Size use 8;

or for biased variables such as:

  X : Integer range 254 .. 507;
  for X'Size use 8; -- bits, so a biased representation is required.

Just to make things more interesting, I chose a range for X that causes
the compiler to represent 0xfe as 0x00 in the byte that holds X, but
that places the 0xfe pattern just out of the range :-) So with
-ftrivial-auto-var-init=zero, X = 254, whereas with
-ftrivial-auto-var-init=pattern, it fails validity checking, and might
come out as 508 if that's disabled.
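
The arithmetic behind those numbers can be sketched in C.  This is a
hypothetical model, not what GNAT generates: biased_decode and
biased_valid are made-up names, and it assumes the bias equals the lower
bound of the range, as the 254 -> 0x00 mapping above implies.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an 8-bit biased representation: the stored byte
   holds VALUE - LOW, so decoding adds the bias back.  */
int
biased_decode (uint8_t stored, int low)
{
  return low + stored;
}

/* Nonzero if VALUE lies in LOW .. HIGH, i.e. it would pass validity
   checking in this model.  */
int
biased_valid (int value, int low, int high)
{
  assert (low <= high);
  return value >= low && value <= high;
}
```

For X above, biased_decode (0x00, 254) gives 254, which is in range,
whereas biased_decode (0xfe, 254) gives 508, which biased_valid rejects
for the range 254 .. 507.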

> From my understanding, only with the introduction of “hardbool”
> attribute, all-bits-zero will not be equal to the
> logical false anymore. 

Ada booleans have long allowed nonzero representations for false.


diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 772209c1793e8..ae7867bb35696 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -8774,6 +8774,12 @@ on the bits held in the storage (re)used for the variable, if any, and
 on optimizations the compiler may perform on the grounds that using
 uninitialized values invokes undefined behavior.
 
+Users of @option{-ftrivial-auto-var-init} should be aware that the bit
+patterns used as its trivial initializers are @emph{not} converted to
+@code{hardbool} types, so using variables implicitly initialized by it
+may trap if the representation values chosen for @code{false} and
+@code{true} do not match the initializer.
+
 
 @cindex @code{may_alias} type attribute
 @item may_alias
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 296a9c178b195..e21f468a9c8f3 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13510,6 +13510,23 @@ The values used for pattern initialization might be changed in the future.
 
 The default is @samp{uninitialized}.
 
+Note that the initializer values, whether @samp{zero} or @samp{pattern},
+refer to data representation (in memory or machine registers), rather
+than to their interpretation as numerical values.  This distinction may
+be important in languages that support types with biases or implicit
+multipliers, and with such extensions as @samp{hardbool} (@pxref{Type
+Attributes}).  For example, a variable that uses 8 bits to represent
+(biased) quantities in the @code{range 160..400} will be initialized
+with the bit patterns @code{0x00} or @code{0xFE}, depending on
+@var{choice}, whether or not these representations stand for values in
+that range, and even if they do, the interpretation of the value held by
+the variable will depend on the bias.  A @samp{hardbool} variable that
+uses say @code{0x5A} and @code{0xA5} for @code{false} and @code{true},
+respectively, will trap with either @samp{choice} of trivial
+initializer, i.e., @samp{zero} initialization will not convert to the
+representation for @code{false}, even if it would for a @code{static}
+variable of the same type.
+
 You can control this behavior for a specific variable by using the variable
 attribute @code{uninitialized} (@pxref{Variable Attributes}).
 


-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist                       GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>
