On Jun 23, 2023, Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> -ftrivial-auto-var-init has been in GCC since GCC12.

*nod*.  IIRC I started designing hardbool in GCC10, but the first
complete implementation was for GCC11.

> The decision to use 0x00 for zero-initiation, 0xfe for pattern-initiation
> has been discussed thoroughly during the design phase. -:)

*nod*, and it's a good one.

> Since this hardbool attribute will also be a security feature, users that
> seek security features of GCC will likely ask the same question for these
> two features.

> So, their interaction is better to be at least documented. That’s my main
> point.

*nod*.  At first, I thought the ideal place to clarify the issue was in
the documentation for the option, because there's nothing exceptional
about the option's behavior when it comes to hardbool specifically.  But
it doesn't hurt to mention it in both places, so I did.

How about the incremental patchlet below (at the end)?

> For normal Boolean variables, 0x00 is false, this is a reasonable init
> value with zero-initialization.

*nod*.  I was surprised by zero initialization of (non-hardened)
booleans even when pattern is requested, but not consistently
(e.g. boolean fields of a larger struct would still get
pattern-initialized IIUC).  I'd have expected pattern would translate to
nonzero and thus to true, rather than false.

> For hardbool variables, what 0x00 represents if it’s not false or true
> value?

It depends on how hardbool is parameterized.  One may pick 0x00 or 0xFE
as the representations for true or false, or neither, in which case the
trivial initializer will end up as a trapping value.

>> I'd probably have arranged for the front-end to create the initializer
>> value, because expansion time is too late to figure it out: we may not
>> even have the front-end at hand any more, in case of lto compilation.

> Is the hardbool attribute information available during the rtl expansion
> phase?

It is, in the sense that the attribute lives on, but
c_hardbool_type_attr is a front-end function, so it cannot be called
from e.g. lto1.

The hardbool attribute is also implemented in Ada, but there it only
affects validity checking in the front end: Boolean types in Ada are
Enumeration types, and there is standard syntax to specify the
representations for true and false.  AFAICT, once we translate GNAT IR
to GNU IR, hardened booleans would not be recognizable as boolean types.
Even non-hardened booleans with representation clauses would not be.
So handling these differently from other enumeration types, to make them
closer to booleans, would be a bit of a challenge, and a
backwards-compatibility issue (because such booleans have already been
handled in the present way since the introduction of -ftrivial-* back
in GCC12).

>> Now, I acknowledge that the decision to make implicit
>> zero-initialization of boolean types set them to the value for false,
>> rather than to all-bits-zero representation, is a departure from common
>> practice of zero-initialization yielding logical zero.

> Dont’s quite understand the above, for normal boolean variables,

Sorry, I meant hardened boolean types.  This was WRT the design
decision that led to this bit in the documentation:

  typedef char __attribute__ ((__hardbool__ (0x5a))) hbool;
  [...]
  static hbool zeroinit; /* False, stored as (char)0x5a.  */
  auto hbool uninit;     /* Undefined, may trap.  */

> And this is a very reasonable initial value for Boolean variables,

Agreed.  The all-zeros bit pattern is not so great for booleans that use
alternate representations, though, such as the following standard Ada:

   type MyBool is new Boolean;
   for MyBool use (16#5a#, 16#a5#);
   for MyBool'Size use 8;

or for biased variables such as:

   X : Integer range 254 .. 507;
   for X'Size use 8; -- bits, so a biased representation is required.

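As a rough illustration of the two Ada declarations above (not part of
the patch), the following C sketch models how the raw bytes 0x00 and
0xfe used by -ftrivial-auto-var-init would be interpreted.  All names
in it (decode_biased, biased_is_valid, mybool_is_valid, self_check) are
hypothetical, and it assumes the biased representation stores
(value - 254) in the byte; it does not reproduce what GNAT actually
emits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of "X : Integer range 254 .. 507" held in 8 bits.
   Assumption: the stored byte is (value - X_FIRST), i.e. the bias is
   the lower bound of the range.  */
enum { X_FIRST = 254, X_LAST = 507 };

/* Interpret a raw byte as the biased value it stands for.  */
static int decode_biased (uint8_t raw)
{
  return X_FIRST + raw;
}

/* Model of a validity check: does the decoded value fall in range?  */
static bool biased_is_valid (uint8_t raw)
{
  int v = decode_biased (raw);
  return v >= X_FIRST && v <= X_LAST;
}

/* Hypothetical model of MyBool: 16#5a# for False, 16#a5# for True.  */
enum { MYBOOL_FALSE = 0x5a, MYBOOL_TRUE = 0xa5 };

/* Only the two chosen representations are valid MyBool values.  */
static bool mybool_is_valid (uint8_t raw)
{
  return raw == MYBOOL_FALSE || raw == MYBOOL_TRUE;
}

/* Check the cases discussed in the message.  */
static void self_check (void)
{
  /* Zero pattern: X reads as 254, which is in range.  */
  assert (decode_biased (0x00) == 254);
  assert (biased_is_valid (0x00));

  /* 0xfe pattern: X would read as 508, one past the upper bound.  */
  assert (decode_biased (0xfe) == 508);
  assert (!biased_is_valid (0xfe));

  /* Neither trivial initializer is a valid MyBool representation.  */
  assert (!mybool_is_valid (0x00));
  assert (!mybool_is_valid (0xfe));
}
```

Calling self_check () from a main function exercises these cases.  The
largest valid byte for X in this model is 0xfd (507 - 254), which is
why 0xfe lands just outside the range.
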
Just to make things more interesting, I chose a range for X that causes
the compiler to represent 0xfe (254) as 0x00 in the byte that holds X,
but that places the 0xfe pattern just out of the range :-)  So with
-ftrivial-auto-var-init=zero, X = 254, whereas with
-ftrivial-auto-var-init=pattern, it fails validity checking, and might
come out as 508 if that's disabled.

> From my understanding, only with the introduction of “hardbool”
> attribute, all-bits-zero will not be equal to the
> logical false anymore.

Ada booleans have long allowed nonzero representations for false.


diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 772209c1793e8..ae7867bb35696 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -8774,6 +8774,12 @@ on the bits held in the storage (re)used for the variable, if any, and
 on optimizations the compiler may perform on the grounds that using
 uninitialized values invokes undefined behavior.
 
+Users of @option{-ftrivial-auto-var-init} should be aware that the bit
+patterns used as its trivial initializers are @emph{not} converted to
+@code{hardbool} types, so using variables implicitly initialized by it
+may trap if the representations values chosen for @code{false} and
+@code{true} do not match the initializer.
+
 @cindex @code{may_alias} type attribute
 @item may_alias
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 296a9c178b195..e21f468a9c8f3 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13510,6 +13510,23 @@ The values used for pattern initialization might be changed in the future.
 
 The default is @samp{uninitialized}.
 
+Note that the initializer values, whether @samp{zero} or @samp{pattern},
+refer to data representation (in memory or machine registers), rather
+than to their interpretation as numerical values.  This distinction may
+be important in languages that support types with biases or implicit
+multipliers, and with such extensions as @samp{hardbool} (@pxref{Type
+Attributes}).  For example, a variable that uses 8 bits to represent
+(biased) quantities in the @code{range 160..400} will be initialized
+with the bit patterns @code{0x00} or @code{0xFE}, depending on
+@var{choice}, whether or not these representations stand for values in
+that range, and even if they do, the interpretation of the value held by
+the variable will depend on the bias.  A @samp{hardbool} variable that
+uses say @code{0X5A} and @code{0xA5} for @code{false} and @code{true},
+respectively, will trap with either @samp{choice} of trivial
+initializer, i.e., @samp{zero} initialization will not convert to the
+representation for @code{false}, even if it would for a @code{static}
+variable of the same type.
+
 You can control this behavior for a specific variable by using the variable
 attribute @code{uninitialized} (@pxref{Variable Attributes}).
 

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>