Sent in error (and not moderated).

On 03/02/2025 17:36, Prof Brian Ripley via R-devel wrote:
Tomas,

I am thinking of writing something for R-devel, and hope to have your input first.

I get moderated on R-devel as I am now subscribed as brian.ripley@R-project.org, which of course I cannot send from. So I am even more discouraged from posting there. (R-core is bad enough with Luke discouraging all innovation except by him and Simon completely misunderstanding the C23 status.)

Thanks,

Brian

----------------

There are several of these, and few guarantees for inter-working.

a) R's logical vectors, which include a value NA for its elements.
b) R's Rboolean type in C/C++

c) C++'s bool type
d) C23's bool type
e) C99's _Bool type to which bool is aliased if <stdbool.h> is included.
f) Fortran's LOGICAL type

a) is currently implemented as a C int (so 32-bit) type, with NA represented by the C value NA_LOGICAL, which is the same as NA_INTEGER.
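A minimal sketch of what this means for C code that reads such a value: NA has to be tested for before the int is collapsed to a two-valued type. The names MY_NA_LOGICAL and logical_to_bool are hypothetical, and the sketch assumes NA is coded as INT_MIN, as in current R; real code should use NA_LOGICAL from the R headers rather than a hard-coded constant.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for R's NA_LOGICAL; assumed here to be INT_MIN. */
#define MY_NA_LOGICAL INT_MIN

/* Convert an int-coded logical to a C bool.
   Returns 0 on success, or -1 if x is NA (in which case *out is untouched). */
int logical_to_bool(int x, bool *out)
{
    assert(out != NULL);
    if (x == MY_NA_LOGICAL)
        return -1;
    *out = (x != 0);   /* any non-NA, non-zero value counts as true */
    return 0;
}
```

The separate return code is one possible design: bool has no room for a third state, so NA must be reported out of band.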

b) is currently implemented as a C enum with two values.  I don't know of any guarantees on how that is stored, except that it is in char or an integer type -- however it seems common practice to use a 32-bit type (int or unsigned int would not be distinguishable).  (C23 §6.7.3.3)  Enums can have a specified data type, but we do not specify one.

C23 states that bool has 1 value bit and some padding bits (§6.2.6.2), so it can be stored in char-sized storage (i.e. bytes) or multiples thereof.  It also states that _Bool is an alternative name for bool.
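Since the storage sizes of b) and of bool are left to the compiler, one way to see what a given compiler does is to ask it. The following is a sketch only: my_boolean is a hypothetical two-valued enum shaped like Rboolean, not R's own definition, and the function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical two-valued enum with no specified underlying type. */
typedef enum { MY_FALSE = 0, MY_TRUE = 1 } my_boolean;

/* Storage sizes in bytes, as chosen by the compiler in use. */
size_t my_boolean_size(void) { return sizeof(my_boolean); }
size_t bool_size(void)       { return sizeof(bool); }

/* Code that passes a my_boolean* where an int* is expected relies on the
   two sizes matching.  Returns 1 if they match on this compiler, else 0. */
int my_boolean_is_int_sized(void)
{
    assert(sizeof(my_boolean) >= sizeof(char));
    return sizeof(my_boolean) == sizeof(int);
}
```

The results can differ between compilers and compiler flags, so a check like this is better done at build time than assumed.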

f) is compiler-dependent: for interoperability with C or R, code should use c_bool from iso_c_binding (Fortran 2003).  Fortran compilers store LOGICAL in compiler-dependent ways, and for a long time we got away with assuming that was equivalent to int (so LOGICAL values could be passed to and from Fortran as int* on the C/R side).  But sometime around GCC 8 they changed to int_least32_t, which on common platforms is the same as int but does not need to be.
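A sketch of the C side of the c_bool approach, assuming the Fortran caller declares the array as logical(c_bool), which interoperates with C's _Bool. The function count_true is hypothetical, and the Fortran interface in the comment is for orientation only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed Fortran-side interface (illustrative, not compiled here):
 *
 *   interface
 *     function count_true(x, n) bind(C, name="count_true") result(k)
 *       use iso_c_binding
 *       integer(c_int), value       :: n
 *       logical(c_bool), intent(in) :: x(n)
 *       integer(c_int)              :: k
 *     end function count_true
 *   end interface
 */

/* Count the true elements of an array of n C bools. */
int count_true(const bool *x, int n)
{
    assert(n == 0 || x != NULL);
    int k = 0;
    for (int i = 0; i < n; i++)
        if (x[i])
            k++;
    return k;
}
```

Declaring the argument as bool* rather than int* avoids depending on how the compiler stores default LOGICAL.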

It seems that in all cases coercion to an integer type coerces false values to 0 and true values to 1 (and this is guaranteed by C23 at least).  And C23 guarantees that when coercing from an integer type to bool zero values are coerced to false and non-zero ones to true (bool is 'an unsigned integer type').  However, that does not seem to be true for C++ as UB sanitizers warn on coercing values other than 0/1.
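The C rules above can be checked directly. This is a small sketch with hypothetical function names; it covers the C behaviour only and says nothing about C++.

```c
#include <assert.h>
#include <stdbool.h>

/* Integer -> bool: zero becomes false, any non-zero value becomes true. */
bool int_to_bool(int x)
{
    return (bool) x;
}

/* bool -> integer: false becomes 0, true becomes 1. */
int bool_to_int(bool b)
{
    return (int) b;
}

/* Exercise both directions, including values other than 0 and 1. */
void check_coercions(void)
{
    assert(int_to_bool(0) == false);
    assert(int_to_bool(1) == true);
    assert(int_to_bool(-7) == true);
    /* Conversion to bool tests for non-zero; it does not keep the low bits. */
    assert(int_to_bool(256) == true);
    assert(bool_to_int(int_to_bool(256)) == 1);
    assert(bool_to_int(false) == 0);
}
```

Note that this is a conversion of a value. Reading a bool object whose stored bytes are something other than 0 or 1 is a different matter and is not covered by this sketch.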

I believe it to be the intention that c), d) and e) have the same representation and interwork using the same compiler, but I could not find that documented and see signs that e) might differ in C17 and C23 modes.

----------------

I need to look again at the C and C++ standards which with my vision I need to do in very small chunks.  Oh for the vision I once had!



--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
