On IPA-PTA field sensitivity and pointer expressions
Hi, I am working on an alias analysis using the points-to information generated during IPA-PTA. If we look at the varmap varinfo_t array in gcc/tree-ssa-struct.c, most of the constraint variable info structs contain a non-null decl field which points to a valid tree in gimple (which is an SSA variable and a pointer). I am trying to find out a way to obtain points-to information for pointer expressions. By this, the concrete example I have in mind is answering the following question: What does `astruct->aptrfield` points to? Here I have a concrete example: #include struct A { char *f1; struct A *f2;}; int __GIMPLE(startwith("ipa-pta")) main (int argc, char * * argv) { struct A * p1; char * pc; int i; int _27; i_15 = 1; pc = malloc(100); // HEAP(1) p1 = malloc (16); // HEAP(2) p1->f1 = pc; p1->f2 = p1; _27 = (int) 0; return _27; } Will give the following correct points-to information: HEAP(1) = { } HEAP(2) = { HEAP(1) HEAP(2) } pc_30 = { HEAP(1) } p1_32 = { HEAP(2) } However, there does not seem to be information printed for neither: p1->f1 p1->f2 which I would expect (or like) something like: p1_32->0 = { HEAP(1) } p1_32->64 = { HEAP(2) } Looking more closely at the problem, I found that some varinfo_t have a non-null "complex" field. Which has an array of "complex" constraints used to handle offsets and dereferences in gimple. For this same gimple code, we have the following complex constraints for the variable p1_32: main.clobber = p1_32 + 64 *p1_32 = pc_30 *p1_32 + 64 = p1_32 It seems to me that I can probably parse these complex constraints to generate the answers which I want. Is this the way this is currently being handled in GCC or is there some other standard mechanism for this? Thanks!
Re: [patch, doc] Update people who can review gfortran patches
On 9/25/20 8:02 AM, Thomas Koenig via Fortran wrote: for review of its patches, gfortran relies on a group of people who can approve patches. Unfortuntately, many of them are not active. Others, who have the capability and who have acted as de facto approvers (without anybody minding) are missing. I think you want to link to https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=MAINTAINERS which is the official list. — At least additionally. I note the the HTML page states "contributions will be reviewed" and not "approved" – and (nearly) any review comment is welcome, be it from an experienced maintainer or just by someone who passes by. Still, it is a bit misleading if this list includes someone who is not an official reviewer. New reviewers get added by approval of the steering committee (https://gcc.gnu.org/steering.html, typically by some existing maintainer/reviewer proposing a new one – by sending a private email to one of the SC members.) Regarding Jakub: while he is a global reviewer and very active, for Fortran, he is mostly reviewing OpenMP and – to a lesser extent – OpenACC patches and only a very few other patches. Thus, listing him under 'contributions will be reviewed' is a bit misleading. Cheers, Tobias
Re: On IPA-PTA field sensitivity and pointer expressions
On Fri, Sep 25, 2020 at 9:05 AM Erick Ochoa wrote: > > Hi, > > I am working on an alias analysis using the points-to information > generated during IPA-PTA. If we look at the varmap varinfo_t array in > gcc/tree-ssa-struct.c, most of the constraint variable info structs > contain a non-null decl field which points to a valid tree in gimple > (which is an SSA variable and a pointer). I am trying to find out a way > to obtain points-to information for pointer expressions. By this, the > concrete example I have in mind is answering the following question: > > What does `astruct->aptrfield` points to? > > Here I have a concrete example: > > > #include > > struct A { char *f1; struct A *f2;}; > > int __GIMPLE(startwith("ipa-pta")) > main (int argc, char * * argv) > { >struct A * p1; >char * pc; >int i; >int _27; > >i_15 = 1; >pc = malloc(100); // HEAP(1) >p1 = malloc (16); // HEAP(2) >p1->f1 = pc; >p1->f2 = p1; >_27 = (int) 0; >return _27; > } > > > Will give the following correct points-to information: > > HEAP(1) = { } > HEAP(2) = { HEAP(1) HEAP(2) } > pc_30 = { HEAP(1) } > p1_32 = { HEAP(2) } > > However, there does not seem to be information printed for neither: > >p1->f1 >p1->f2 > > which I would expect (or like) something like: > >p1_32->0 = { HEAP(1) } >p1_32->64 = { HEAP(2) } > > Looking more closely at the problem, I found that some varinfo_t have a > non-null "complex" field. Which has an array of "complex" constraints > used to handle offsets and dereferences in gimple. For this same gimple > code, we have the following complex constraints for the variable p1_32: > > main.clobber = p1_32 + 64 > *p1_32 = pc_30 > *p1_32 + 64 = p1_32 The issue is that allocated storage is not tracked field-sensitive since we do not know it's layout at the point of allocation (where we allocate the HEAP variable). There are some exceptions, see what we do for by-reference parameters in create_variable_info_for_1: if (vi->only_restrict_pointers && !type_contains_placeholder_p (TREE_TYPE (decl_type)) && handle_param && !bitmap_bit_p (handled_struct_type, TYPE_UID (TREE_TYPE (decl_type { varinfo_t rvi; tree heapvar = build_fake_var_decl (TREE_TYPE (decl_type)); DECL_EXTERNAL (heapvar) = 1; if (var_can_have_subvars (heapvar)) bitmap_set_bit (handled_struct_type, TYPE_UID (TREE_TYPE (decl_type))); rvi = create_variable_info_for_1 (heapvar, "PARM_NOALIAS", true, true, handled_struct_type); if (var_can_have_subvars (heapvar)) bitmap_clear_bit (handled_struct_type, TYPE_UID (TREE_TYPE (decl_type))); rvi->is_restrict_var = 1; insert_vi_for_tree (heapvar, rvi); make_constraint_from (vi, rvi->id); make_param_constraints (rvi); where we create a heapvarwith a specific aggregate type. Generally make_heapvar (for the allocation case) allocates a variable without subfields: static varinfo_t make_heapvar (const char *name, bool add_id) { varinfo_t vi; tree heapvar; heapvar = build_fake_var_decl (ptr_type_node); DECL_EXTERNAL (heapvar) = 1; vi = new_var_info (heapvar, name, add_id); vi->is_heap_var = true; vi->is_unknown_size_var = true; vi->offset = 0; vi->fullsize = ~0; vi->size = ~0; vi->is_full_var = true; I've once had attempted to split (aka generate subfields) a variable on-demand during solving but that never worked well. So for specific cases like C++ new T we could create heapvars appropriately typed. But you have to double-check for correctness because of may_have_pointers and so on. > It seems to me that I can probably parse these complex constraints to > generate the answers which I want. Is this the way this is currently > being handled in GCC or is there some other standard mechanism for this? GCC is in the end only interested in points-to sets for SSA names which never have subfields. The missing subfields for aggregates simply make the points-to solution less precise. Richard. > Thanks!
Re: On IPA-PTA field sensitivity and pointer expressions
On 25/09/2020 13:30, Richard Biener wrote: On Fri, Sep 25, 2020 at 9:05 AM Erick Ochoa wrote: Hi, I am working on an alias analysis using the points-to information generated during IPA-PTA. If we look at the varmap varinfo_t array in gcc/tree-ssa-struct.c, most of the constraint variable info structs contain a non-null decl field which points to a valid tree in gimple (which is an SSA variable and a pointer). I am trying to find out a way to obtain points-to information for pointer expressions. By this, the concrete example I have in mind is answering the following question: What does `astruct->aptrfield` points to? Here I have a concrete example: #include struct A { char *f1; struct A *f2;}; int __GIMPLE(startwith("ipa-pta")) main (int argc, char * * argv) { struct A * p1; char * pc; int i; int _27; i_15 = 1; pc = malloc(100); // HEAP(1) p1 = malloc (16); // HEAP(2) p1->f1 = pc; p1->f2 = p1; _27 = (int) 0; return _27; } Will give the following correct points-to information: HEAP(1) = { } HEAP(2) = { HEAP(1) HEAP(2) } pc_30 = { HEAP(1) } p1_32 = { HEAP(2) } However, there does not seem to be information printed for neither: p1->f1 p1->f2 which I would expect (or like) something like: p1_32->0 = { HEAP(1) } p1_32->64 = { HEAP(2) } Looking more closely at the problem, I found that some varinfo_t have a non-null "complex" field. Which has an array of "complex" constraints used to handle offsets and dereferences in gimple. For this same gimple code, we have the following complex constraints for the variable p1_32: main.clobber = p1_32 + 64 *p1_32 = pc_30 *p1_32 + 64 = p1_32 The issue is that allocated storage is not tracked field-sensitive since we do not know it's layout at the point of allocation (where we allocate the HEAP variable). There are some exceptions, see what we do for by-reference parameters in create_variable_info_for_1: if (vi->only_restrict_pointers && !type_contains_placeholder_p (TREE_TYPE (decl_type)) && handle_param && !bitmap_bit_p (handled_struct_type, TYPE_UID (TREE_TYPE (decl_type { varinfo_t rvi; tree heapvar = build_fake_var_decl (TREE_TYPE (decl_type)); DECL_EXTERNAL (heapvar) = 1; if (var_can_have_subvars (heapvar)) bitmap_set_bit (handled_struct_type, TYPE_UID (TREE_TYPE (decl_type))); rvi = create_variable_info_for_1 (heapvar, "PARM_NOALIAS", true, true, handled_struct_type); if (var_can_have_subvars (heapvar)) bitmap_clear_bit (handled_struct_type, TYPE_UID (TREE_TYPE (decl_type))); rvi->is_restrict_var = 1; insert_vi_for_tree (heapvar, rvi); make_constraint_from (vi, rvi->id); make_param_constraints (rvi); where we create a heapvarwith a specific aggregate type. Generally make_heapvar (for the allocation case) allocates a variable without subfields: static varinfo_t make_heapvar (const char *name, bool add_id) { varinfo_t vi; tree heapvar; heapvar = build_fake_var_decl (ptr_type_node); DECL_EXTERNAL (heapvar) = 1; vi = new_var_info (heapvar, name, add_id); vi->is_heap_var = true; vi->is_unknown_size_var = true; vi->offset = 0; vi->fullsize = ~0; vi->size = ~0; vi->is_full_var = true; I've once had attempted to split (aka generate subfields) a variable on-demand during solving but that never worked well. So for specific cases like C++ new T we could create heapvars appropriately typed. But you have to double-check for correctness because of may_have_pointers and so on. Hi Richard, I am not 100% sure about the meaning of the may_have_pointers field. In the source it says: 270 /* True if this field may contain pointers. */ 271 unsigned int may_have_pointers : 1; Does that mean that (1) the field is a pointer type? (2) An integer which may have hold the value of a pointer? Or that (3) there are pointers pointing to that field? You mention that for C++ new T heap vars are typed appropriately, but we still need to check for correctness. If I understood correctly, this means that may_have_pointers means that there might be pointers pointing to that variable (?) Because the pointers could potentially be casted and therefore the layout be interpreted in a different way. This is just my interpretation of what you said, can you please confirm or clarify that bit? Another clarification point: you mentioned that the heap variables' layout needs to be known in order to do what I want. I think you are right, but let me continue the example. These are the points-to sets we have: HEAP(1) = { } HEAP(2) = { HEAP(1) HEAP(2) } pc_30 = { HEAP(1) } p1_32 = { HEAP(2) } These are the complex constraints of interest:
Re: On IPA-PTA field sensitivity and pointer expressions
On Fri, Sep 25, 2020 at 2:27 PM Erick Ochoa wrote: > > > > On 25/09/2020 13:30, Richard Biener wrote: > > On Fri, Sep 25, 2020 at 9:05 AM Erick Ochoa > > wrote: > >> > >> Hi, > >> > >> I am working on an alias analysis using the points-to information > >> generated during IPA-PTA. If we look at the varmap varinfo_t array in > >> gcc/tree-ssa-struct.c, most of the constraint variable info structs > >> contain a non-null decl field which points to a valid tree in gimple > >> (which is an SSA variable and a pointer). I am trying to find out a way > >> to obtain points-to information for pointer expressions. By this, the > >> concrete example I have in mind is answering the following question: > >> > >> What does `astruct->aptrfield` points to? > >> > >> Here I have a concrete example: > >> > >> > >> #include > >> > >> struct A { char *f1; struct A *f2;}; > >> > >> int __GIMPLE(startwith("ipa-pta")) > >> main (int argc, char * * argv) > >> { > >> struct A * p1; > >> char * pc; > >> int i; > >> int _27; > >> > >> i_15 = 1; > >> pc = malloc(100); // HEAP(1) > >> p1 = malloc (16); // HEAP(2) > >> p1->f1 = pc; > >> p1->f2 = p1; > >> _27 = (int) 0; > >> return _27; > >> } > >> > >> > >> Will give the following correct points-to information: > >> > >> HEAP(1) = { } > >> HEAP(2) = { HEAP(1) HEAP(2) } > >> pc_30 = { HEAP(1) } > >> p1_32 = { HEAP(2) } > >> > >> However, there does not seem to be information printed for neither: > >> > >> p1->f1 > >> p1->f2 > >> > >> which I would expect (or like) something like: > >> > >> p1_32->0 = { HEAP(1) } > >> p1_32->64 = { HEAP(2) } > >> > >> Looking more closely at the problem, I found that some varinfo_t have a > >> non-null "complex" field. Which has an array of "complex" constraints > >> used to handle offsets and dereferences in gimple. For this same gimple > >> code, we have the following complex constraints for the variable p1_32: > >> > >> main.clobber = p1_32 + 64 > >> *p1_32 = pc_30 > >> *p1_32 + 64 = p1_32 > > > > The issue is that allocated storage is not tracked field-sensitive since > > we do not know it's layout at the point of allocation (where we allocate > > the HEAP variable). There are some exceptions, see what we do > > for by-reference parameters in create_variable_info_for_1: > > > >if (vi->only_restrict_pointers > >&& !type_contains_placeholder_p (TREE_TYPE (decl_type)) > >&& handle_param > >&& !bitmap_bit_p (handled_struct_type, > > TYPE_UID (TREE_TYPE (decl_type > > { > >varinfo_t rvi; > >tree heapvar = build_fake_var_decl (TREE_TYPE (decl_type)); > >DECL_EXTERNAL (heapvar) = 1; > >if (var_can_have_subvars (heapvar)) > > bitmap_set_bit (handled_struct_type, > > TYPE_UID (TREE_TYPE (decl_type))); > >rvi = create_variable_info_for_1 (heapvar, "PARM_NOALIAS", true, > > true, handled_struct_type); > >if (var_can_have_subvars (heapvar)) > > bitmap_clear_bit (handled_struct_type, > >TYPE_UID (TREE_TYPE (decl_type))); > >rvi->is_restrict_var = 1; > >insert_vi_for_tree (heapvar, rvi); > >make_constraint_from (vi, rvi->id); > >make_param_constraints (rvi); > > > > where we create a heapvarwith a specific aggregate type. Generally > > make_heapvar (for the allocation case) allocates a variable without > > subfields: > > > > static varinfo_t > > make_heapvar (const char *name, bool add_id) > > { > >varinfo_t vi; > >tree heapvar; > > > >heapvar = build_fake_var_decl (ptr_type_node); > >DECL_EXTERNAL (heapvar) = 1; > > > >vi = new_var_info (heapvar, name, add_id); > >vi->is_heap_var = true; > >vi->is_unknown_size_var = true; > >vi->offset = 0; > >vi->fullsize = ~0; > >vi->size = ~0; > >vi->is_full_var = true; > > > > I've once had attempted to split (aka generate subfields) a variable > > on-demand during solving but that never worked well. > > > > So for specific cases like C++ new T we could create heapvars > > appropriately typed. But you have to double-check for correctness > > because of may_have_pointers and so on. > > Hi Richard, I am not 100% sure about the meaning of the > may_have_pointers field. In the source it says: > >270 /* True if this field may contain pointers. */ >271 unsigned int may_have_pointers : 1; > > Does that mean that (1) the field is a pointer type? (2) An integer > which may have hold the value of a pointer? Yes. > Or that (3) there are > pointers pointing to that field? No, that property doesn't exist as flag. > You mention that for C++ new T heap vars are typed appropriately, but we > still need to check for correctness. If I understood correctly, this > means that may_have_poin
[PATCH v2] : Add nitems() and snitems() macros
'nitems()' calculates the length of an array in number of items. It is safe: if a pointer is passed to the macro (or function, in C++), the compilation is broken due to: - In >= C11: _Static_assert() - In C89, C99: Negative anonymous bitfield - In C++: The template requires an array 'snitems()' is equivalent to nitems(), but it returns a 'ptrdiff_t' instead of a 'size_t'. It is useful for comparison with signed integer values. Some BSDs already provide a macro nitems() in , although it usually doesn't provide safety against pointers. This patch uses the same name for compatibility reasons, and to be the least disruptive with existing code. This patch also adds some other macros, which are required by 'nitems()': __is_same_type(_A, _B): Returns non-zero if the two input arguments are of the same type. __is_array(_Arr): Returns non-zero if the input argument is of an array type. __must_be(_Expr, _Msg): Allows using _Static_assert() everywhere an expression can be used. It evaluates '(int)0' or breaks the compilation. __must_be_array(_Arr): It evaluates to '(int)0' if the argument is of an array type. Else, it breaks compilation. __array_len(_Arr): It implements the basic sizeof division needed to calculate the array length. P.S.: I'd like to put this patch in the public domain. Signed-off-by: Alejandro Colomar --- misc/sys/param.h | 60 1 file changed, 60 insertions(+) diff --git a/misc/sys/param.h b/misc/sys/param.h index d7c319b157..88e95c2dba 100644 --- a/misc/sys/param.h +++ b/misc/sys/param.h @@ -102,5 +102,65 @@ #define MIN(a,b) (((a)<(b))?(a):(b)) #define MAX(a,b) (((a)>(b))?(a):(b)) +/* Macros related to the types of variables */ +# define __is_same_type(_A, _B) __builtin_types_compatible_p(__typeof__(_A), \ + __typeof__(_B)) +# define __is_array(_Arr) (!__is_same_type((_Arr), &(_Arr)[0])) + +/* Macros for embedding _Static_assert() in expressions */ +# if __STDC_VERSION__ >= 201112L +# define __must_be(_Expr, _Msg) ( \ +0 * (int)sizeof( \ + struct {\ +_Static_assert((_Expr), _Msg);\ +char _ISO_C_forbids_a_struct_with_no_members; \ + } \ +) \ +) +# else +# define __must_be(_Expr, _Msg) ( \ +0 * (int)sizeof( \ + struct {\ +int : (-!(_Expr)); \ +char _ISO_C_forbids_a_struct_with_no_members; \ + } \ +) \ +) +# endif + +# define __must_be_array(_Arr) __must_be(__is_array(_Arr), "Must be an array!") + +/* Macros for array sizes */ +#if defined(__cplusplus) +# if __cplusplus >= 201103L +template + constexpr inline std::size_t + nitems(const _Tp(&)[_Len]) __THROW + { +return _Len; + } + +template + constexpr inline std::ptrdiff_t + snitems(const _Tp(&)[_Len]) __THROW + { +return _Len; + } + +# else /* __cplusplus < 201103L */ +template + char + (&__nitems_chararr(const _Tp(&)[_Len]))[_Len]; + +# define nitems(_Arr) (sizeof(__nitems_chararr(_Arr))) +# define snitems(_Arr) (static_cast(nitems(_Arr))) +# endif /* __cplusplus < 201103L */ + +#else /* !defined(__cplusplus) */ +# define __array_len(_Arr) (sizeof(_Arr) / sizeof((_Arr)[0])) +# define nitems(_Arr) (__array_len(_Arr) + __must_be_array(_Arr)) +# define snitems(_Arr) ((ptrdiff_t)nitems(_Arr)) +#endif /* !defined(__cplusplus) */ + #endif /* sys/param.h */ -- 2.28.0
Re: [PATCH v2] : Add nitems() and snitems() macros
On 2020-09-25 15:20, Alejandro Colomar wrote: > 'nitems()' calculates the length of an array in number of items. > It is safe: if a pointer is passed to the macro (or function, in C++), > the compilation is broken due to: > - In >= C11: _Static_assert() > - In C89, C99: Negative anonymous bitfield > - In C++: The template requires an array > > 'snitems()' is equivalent to nitems(), > but it returns a 'ptrdiff_t' instead of a 'size_t'. > It is useful for comparison with signed integer values. > > Some BSDs already provide a macro nitems() in , > although it usually doesn't provide safety against pointers. > > This patch uses the same name for compatibility reasons, > and to be the least disruptive with existing code. > > This patch also adds some other macros, which are required by 'nitems()': > > __is_same_type(_A, _B): > Returns non-zero if the two input arguments are of the same type. > > __is_array(_Arr): > Returns non-zero if the input argument is of an array type. > > __must_be(_Expr, _Msg): > Allows using _Static_assert() everywhere an expression can be used. > It evaluates '(int)0' or breaks the compilation. > > __must_be_array(_Arr): > It evaluates to '(int)0' if the argument is of an array type. > Else, it breaks compilation. > > __array_len(_Arr): > It implements the basic sizeof division needed to calculate the array length. > > > P.S.: I'd like to put this patch in the public domain. > > > Signed-off-by: Alejandro Colomar > --- I patched my own system's with this, and while 'nitems()' works fine, I had to include in my main.c to be able to use 'snitems()', because I didn't have 'ptrdiff_t', eventhough already includes . I completely ignore the mechanisms behind system headers including other system headers. Moreover, I didn't find 'ptrdiff_t' defined in any of my systems headers I used 'user@debian:/usr/include$ grep -rn ptrdiff_t'. Does GCC do magic? What's the problem with that? How should I fix the patch? My system: Debian bullseye/sid; x86-64; gcc 10; libc 2.31-3 Thanks, Alex > misc/sys/param.h | 60 > 1 file changed, 60 insertions(+) > > diff --git a/misc/sys/param.h b/misc/sys/param.h > index d7c319b157..88e95c2dba 100644 > --- a/misc/sys/param.h > +++ b/misc/sys/param.h > @@ -102,5 +102,65 @@ > #define MIN(a,b) (((a)<(b))?(a):(b)) > #define MAX(a,b) (((a)>(b))?(a):(b)) > > +/* Macros related to the types of variables */ > +# define __is_same_type(_A, _B) __builtin_types_compatible_p(__typeof__(_A), \ > + __typeof__(_B)) > +# define __is_array(_Arr) (!__is_same_type((_Arr), &(_Arr)[0])) > + > +/* Macros for embedding _Static_assert() in expressions */ > +# if __STDC_VERSION__ >= 201112L > +# define __must_be(_Expr, _Msg) ( \ > +0 * (int)sizeof( \ > + struct { \ > +_Static_assert((_Expr), _Msg); \ > +char _ISO_C_forbids_a_struct_with_no_members; \ > + } \ > +) \ > +) > +# else > +# define __must_be(_Expr, _Msg) ( \ > +0 * (int)sizeof( \ > + struct { \ > +int : (-!(_Expr)); \ > +char _ISO_C_forbids_a_struct_with_no_members; \ > + } \ > +) \ > +) > +# endif > + > +# define __must_be_array(_Arr) __must_be(__is_array(_Arr), "Must be an array!") > + > +/* Macros for array sizes */ > +#if defined(__cplusplus) > +# if __cplusplus >= 201103L > +template > + constexpr inline std::size_t > + nitems(const _Tp(&)[_Len]) __THROW > + { > +return _Len; > + } > + > +template > + constexpr inline std::ptrdiff_t > + snitems(const _Tp(&)[_Len]) __THROW > + { > +return _Len; > + } > + > +# else /* __cplusplus < 201103L */ > +template > + char > + (&__nitems_chararr(const _Tp(&)[_Len]))[_Len]; > + > +# define nitems(_Arr) (sizeof(__nitems_chararr(_Arr))) > +# define snitems(_Arr) (static_cast(nitems(_Arr))) > +# endif /* __cplusplus < 201103L */ > + > +#else /* !defined(__cplusplus) */ > +# define __array_len(_Arr) (sizeof(_Arr) / sizeof((_Arr)[0])) > +# define nitems(_Arr) (__array_len(_Arr) + __must_be_array(_Arr)) > +# define snitems(_Arr) ((ptrdiff_t)nitems(_Arr)) > +#endif /* !defined(__cplusplus) */ > + > > #endif /* sys/param.h */ >
Re: [PATCH v2] : Add nitems() and snitems() macros
On 25/09/20 16:10 +0200, Alejandro Colomar wrote: On 2020-09-25 15:20, Alejandro Colomar wrote: 'nitems()' calculates the length of an array in number of items. It is safe: if a pointer is passed to the macro (or function, in C++), the compilation is broken due to: - In >= C11: _Static_assert() - In C89, C99: Negative anonymous bitfield - In C++: The template requires an array 'snitems()' is equivalent to nitems(), but it returns a 'ptrdiff_t' instead of a 'size_t'. It is useful for comparison with signed integer values. Some BSDs already provide a macro nitems() in , although it usually doesn't provide safety against pointers. This patch uses the same name for compatibility reasons, and to be the least disruptive with existing code. This patch also adds some other macros, which are required by 'nitems()': __is_same_type(_A, _B): Returns non-zero if the two input arguments are of the same type. __is_array(_Arr): Returns non-zero if the input argument is of an array type. __must_be(_Expr, _Msg): Allows using _Static_assert() everywhere an expression can be used. It evaluates '(int)0' or breaks the compilation. __must_be_array(_Arr): It evaluates to '(int)0' if the argument is of an array type. Else, it breaks compilation. __array_len(_Arr): It implements the basic sizeof division needed to calculate the array length. P.S.: I'd like to put this patch in the public domain. Signed-off-by: Alejandro Colomar --- I patched my own system's with this, and while 'nitems()' works fine, I had to include in my main.c to be able to use 'snitems()', because I didn't have 'ptrdiff_t', eventhough already includes . I completely ignore the mechanisms behind system headers including other system headers. Moreover, I didn't find 'ptrdiff_t' defined in any of my systems headers I used 'user@debian:/usr/include$ grep -rn ptrdiff_t'. Does GCC do magic? What's the problem with that? How should I fix the patch? Do you really need to provide snitems? Users can use (ptrdiff_t)nitems if needed, can't they? C++ provides std::ssize because there are reasons to want it in generic contexts when using the function on arbitrary container-like objects. But for an array size you know that ptrdiff_t is wide enough to represent the size of any array. Do you have a use case that requries snitems, or can you assume YAGNI?
Re: [PATCH v2] : Add nitems() and snitems() macros
Hello Jonathan, On 2020-09-25 16:48, Jonathan Wakely wrote: > Do you really need to provide snitems? > > Users can use (ptrdiff_t)nitems if needed, can't they? They can, but that adds casts in the code, which makes longer lines that are somewhat harder to read. To avoid that, users may sometimes omit the cast with possible UB. BTW, I use IMO, array indices should be declared as 'ptrdiff_t' always, and not 'size_t'. More generically, I use unsigned integer types for two reasons: bitwise operations, and library functions that require me to do so. I don't intend to force anyone with my opinion, of course, but if I were to choose a type for 'nitems()', it would be 'ptrdiff_t'. However, for legacy reasons people will expect that macro to be unsigned, so I'd have 'nitems()' unsigned, and then a signed version prefixed with an 's'. Some very interesting links about this topic: Bjarne Stroustrup (and others) about signed and unsigned integers: https://www.youtube.com/watch?v=Puio5dly9N8&t=12m56s https://www.youtube.com/watch?v=Puio5dly9N8&t=42m41s The two links above are two interesting moments of the same video. I guess that might be the reason they added std::ssize, BTW. Google's C++ Style Guide about unsigned integers: https://google.github.io/styleguide/cppguide.html#Integer_Types And the most voted StackOverflow answer to the question 'What is the correct type for array indexes in C?': https://stackoverflow.com/a/3174900/6872717 > > C++ provides std::ssize because there are reasons to want it in > generic contexts when using the function on arbitrary container-like > objects. But for an array size you know that ptrdiff_t is wide enough > to represent the size of any array.> > Do you have a use case that requries snitems, or can you assume YAGNI? > I have a few use cases: 1) int alx_gnuplot_set_style (struct Alx_Gnuplot *restrict gnuplot, int style, const char *restrict opt) { if (style < 0 || style >= ARRAY_SSIZE(styles)) return -1; if (alx_strlcpys(gnuplot->style, styles[style], ARRAY_SIZE(gnuplot->style), NULL)) return -1; if (opt) return alx_strbcatf(gnuplot->style, NULL, " %s", opt); return 0; } [https://github.com/alejandro-colomar/libalx/blob/master/src/extra/plot/setup.c] 2) I have many loops that access arrays; I'll just make up an example of how I normally access arrays: void foo(ptrdiff_t nmemb) { int arr[nmemb]; for (ptrdiff_t i = 0; i < ARRAY_SSIZE(arr); i++) arr[i] = i; } Grepping through my code, I have a similar number of ARRAY_SIZE() and ARRAY_SSIZE(). I could have '#define snitems(arr) ((ptrdiff_t)nitems(arr))' in my projects, but is it really necessary? Did I convince you? :-) Thanks, Alex
Re: [PATCH v2] : Add nitems() and snitems() macros
On 25/09/20 18:30 +0200, Alejandro Colomar via Libstdc++ wrote: Hello Jonathan, On 2020-09-25 16:48, Jonathan Wakely wrote: Do you really need to provide snitems? Users can use (ptrdiff_t)nitems if needed, can't they? They can, but that adds casts in the code, which makes longer lines that are somewhat harder to read. To avoid that, users may sometimes omit the cast with possible UB. BTW, I use IMO, array indices should be declared as 'ptrdiff_t' always, and not 'size_t'. More generically, I use unsigned integer types for two reasons: bitwise operations, and library functions that require me to do so. I don't intend to force anyone with my opinion, of course, but if I were to choose a type for 'nitems()', it would be 'ptrdiff_t'. However, for legacy reasons people will expect that macro to be unsigned, so I'd have 'nitems()' unsigned, and then a signed version prefixed with an 's'. Some very interesting links about this topic: Bjarne Stroustrup (and others) about signed and unsigned integers: https://www.youtube.com/watch?v=Puio5dly9N8&t=12m56s https://www.youtube.com/watch?v=Puio5dly9N8&t=42m41s The two links above are two interesting moments of the same video. I guess that might be the reason they added std::ssize, BTW. Yes, I'm aware of all the rationale. I already said that it makes sense in C++ where you have generic code. I am not convinced that it's necessary to add to when all it does is a cast from size_t to ptrdiff_t.
Re: [PATCH v2] : Add nitems() and snitems() macros
On 25/09/20 18:30 +0200, Alejandro Colomar via Libstdc++ wrote: I have a similar number of ARRAY_SIZE() and ARRAY_SSIZE(). I could have '#define snitems(arr) ((ptrdiff_t)nitems(arr))' in my projects, but is it really necessary? The barrier for adding something to glibc headers should be a LOT higher than "I could [do it in my own code], but is it really necessary?" Did I convince you? :-) No.
Re: [PATCH v2] : Add nitems() and snitems() macros
On 2020-09-25 19:39, Jonathan Wakely wrote: > Yes, I'm aware of all the rationale. I already said that it makes > sense in C++ where you have generic code. I am not convinced that it's > necessary to add to when all it does is a cast from > size_t to ptrdiff_t. > While I would prefer a signed version, I could live with only 'nitems()'. Having all the __must_be_array thing is the most important part. On 2020-09-25 19:42, Jonathan Wakely wrote: On 25/09/20 18:30 +0200, Alejandro Colomar via Libstdc++ wrote: I have a similar number of ARRAY_SIZE() and ARRAY_SSIZE(). I could have '#define snitems(arr) ((ptrdiff_t)nitems(arr))' in my projects, but is it really necessary? The barrier for adding something to glibc headers should be a LOT higher than "I could [do it in my own code], but is it really necessary?" Did I convince you? :-) No. Well, you convinced me :) I'll rewrite the patch, and the problem about will vanish. Cheers, Alex
[PATCH v3] : Add nitems()
'nitems()' calculates the length of an array in number of items. It is safe: if a pointer is passed to the macro (or function, in C++), the compilation is broken due to: - In >= C11: _Static_assert() - In C89, C99: Negative anonymous bitfield - In C++: The template requires an array Some BSDs already provide a macro nitems() in , although it usually doesn't provide safety against pointers. This patch uses the same name for compatibility reasons, and to be the least disruptive with existing code. This patch also adds some other macros, which are required by 'nitems()': __is_same_type(_A, _B): Returns non-zero if the two input arguments are of the same type. __is_array(_Arr): Returns non-zero if the input argument is of an array type. __must_be(_Expr, _Msg): Allows using _Static_assert() everywhere an expression can be used. It evaluates '(int)0' or breaks the compilation. __must_be_array(_Arr): It evaluates to '(int)0' if the argument is of an array type. Else, it breaks compilation. __array_len(_Arr): It implements the basic sizeof division needed to calculate the array length. P.S.: I'd like to put this patch in the public domain. Signed-off-by: Alejandro Colomar --- misc/sys/param.h | 51 1 file changed, 51 insertions(+) diff --git a/misc/sys/param.h b/misc/sys/param.h index d7c319b157..d88cbacc9c 100644 --- a/misc/sys/param.h +++ b/misc/sys/param.h @@ -102,5 +102,56 @@ #define MIN(a,b) (((a)<(b))?(a):(b)) #define MAX(a,b) (((a)>(b))?(a):(b)) +/* Macros related to the types of variables */ +# define __is_same_type(_A, _B) __builtin_types_compatible_p(__typeof__(_A), \ + __typeof__(_B)) +# define __is_array(_Arr) (!__is_same_type((_Arr), &(_Arr)[0])) + +/* Macros for embedding _Static_assert() in expressions */ +# if __STDC_VERSION__ >= 201112L +# define __must_be(_Expr, _Msg) ( \ +0 * (int)sizeof( \ + struct {\ +_Static_assert((_Expr), _Msg);\ +char _ISO_C_forbids_a_struct_with_no_members; \ + } \ +) \ +) +# else +# define __must_be(_Expr, _Msg) ( \ +0 * (int)sizeof( \ + struct {\ +int : (-!(_Expr)); \ +char _ISO_C_forbids_a_struct_with_no_members; \ + } \ +) \ +) +# endif + +# define __must_be_array(_Arr) __must_be(__is_array(_Arr), "Must be an array!") + +/* Macros for array sizes */ +#if defined(__cplusplus) +# if __cplusplus >= 201103L +template + constexpr inline std::size_t + nitems(const _Tp(&)[_Len]) __THROW + { +return _Len; + } + +# else /* __cplusplus < 201103L */ +template + char + (&__nitems_chararr(const _Tp(&)[_Len]))[_Len]; + +# define nitems(_Arr) (sizeof(__nitems_chararr(_Arr))) +# endif /* __cplusplus < 201103L */ + +#else /* !defined(__cplusplus) */ +# define __array_len(_Arr) (sizeof(_Arr) / sizeof((_Arr)[0])) +# define nitems(_Arr) (__array_len(_Arr) + __must_be_array(_Arr)) +#endif /* !defined(__cplusplus) */ + #endif /* sys/param.h */ -- 2.28.0
gcc-9-20200925 is now available
Snapshot gcc-9-20200925 is now available on https://gcc.gnu.org/pub/gcc/snapshots/9-20200925/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 9 git branch with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-9 revision 4fb606b503780b91ad79c203003dc41a32cfbab7 You'll find: gcc-9-20200925.tar.xzComplete GCC SHA256=7033f11873f2fc274c7d06c5ec8165e617819a565d7f174775b7500b4069d69a SHA1=df389691b5bf2a8d25e3eb01c735f165f2e05ffd Diffs from 9-20200918 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-9 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.