Re: Improve -Wflex-array-member-not-at-end changes.html wording |Plus: and warning bug? (was: [V2][PATCH] gcc-14/changes.html: Deprecate a GCC C extension on flexible array members.)

2023-10-01 Thread Qing Zhao
Hi, Tobias,

Sorry for the late reply.

I have been on vacation since Cauldron and will be back at work in mid-October.
I will look at this issue then.

Qing

> On Sep 25, 2023, at 2:24 PM, Tobias Burnus  wrote:
> 
> Hi all,
> 
> I stumbled over this as I found the wording in the release notes rather 
> unclear.
> 
> 
> First, the following gives only a -pedantic warning and not a 
> -Wflex-array-member-not-at-end:
> 
>  struct t { int b; int x[]; };
>  struct q { int b; struct t a[2]; int c; };
> 
> warning: invalid use of structure with flexible array member [-Wpedantic]
> 
> If I remove the "[2]", it shows additionally:
>  warning: structure containing a flexible array member is not at the end of 
> another structure [-Wflex-array-member-not-at-end]
> 
> It seems as if it should print the latter warning also inside the struct.
> 
> Qing? Joseph? Thoughts?
> 
> * * *
> 
> Secondly, if this is deprecated, shouldn't the warning then be enabled by,
> e.g., -Wall or otherwise made more prominent? (-std=?) Currently, one either
> has to find the new flag or use -pedantic.
> 
> Or is this not really regarded as deprecated? But then (IMHO) we should not 
> really claim so and just
> add the warning without deprecation.
> 
> BTW, clang-15 prints the -Wgnu-variable-sized-type-not-at-end warning by 
> default.
> 
> Joseph, all: Thoughts?
> 
> * * *
> 
> Cross ref: The patch adding the new warning is r14-2197-g070a6bf0bdc6761
> https://gcc.gnu.org/pipermail/gcc-cvs/2023-June/385730.html (cf. previously 
> in this thread)
> 
> 
> * * *
> 
> Regarding the changes.html wording:
> 
> On 07.08.23 16:22, Qing Zhao via Gcc-patches wrote:
> 
>> Comparing to the 1st version, the only change is to address Richard's
>> comment on refering a warning option for diagnosing deprecated behavior.
> ...
>> +++ b/htdocs/gcc-14/changes.html
>> @@ -30,7 +30,18 @@ a work-in-progress.
>>  
>>  Caveats
>>  
>> -  ...
>> +  C:
>> +  Support for the GCC extension, a structure containing a C99 flexible 
>> array
>> +  member, or a union containing such a structure, is not the last field 
>> of
>> +  another structure, is deprecated. Refer to
>> +  <a href="https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html">
>> +  Zero Length Arrays</a>.
> 
> ...
> 
> I find the first sentence difficult to read. What do you think of the 
> following?
> (It is hard to come up with some good wording.)
> 
> --- a/htdocs/gcc-14/changes.html
> +++ b/htdocs/gcc-14/changes.html
> @@ -31,9 +31,10 @@ a work-in-progress.
> Caveats
> 
>   C:
> -  Support for the GCC extension, a structure containing a C99 flexible 
> array
> -  member, or a union containing such a structure, is not the last field 
> of
> -  another structure, is deprecated. Refer to
> +  Support for the GCC extension that a structure containing a C99 
> flexible
> +  array (and any union containing a member of such structure) can be a
> +  member of a structure has been deprecated for the case that it is not
> +  the last member. Refer to
>   <a href="https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html">
>   Zero Length Arrays</a>.
>   Any code relying on this extension should be modifed to ensure that
> 
> 
> Tobias
> 
> PS:  C17 has:
> "A structure or union shall not contain a member with incomplete or function 
> type (hence, a structure
> shall not contain an instance of itself, but may contain a pointer to an 
> instance of itself), except that
> the last member of a structure with more than one named member may have 
> incomplete array type;
> such a structure (and any union containing, possibly recursively, a member 
> that is such a structure)
> shall not be a member of a structure or an element of an array."
> 
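To make the case under discussion concrete, here is a minimal sketch in C. The two problematic layouts from the message above are kept in a comment; the compiled part shows one possible restructuring, with the structure that has the flexible array member moved to the end of the enclosing structure. Note that nesting such a structure at all is still a GNU extension (see the C17 wording quoted above), and the names `q_ok` and `make_q` are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A structure ending in a C99 flexible array member.  */
struct t { int b; int x[]; };

/* The layouts discussed in the thread, kept as a comment so that this
   sketch compiles without diagnostics:
     struct q { int b; struct t a[2]; int c; };   only the -Wpedantic warning
     struct q { int b; struct t a;    int c; };   also reported by
                                                  -Wflex-array-member-not-at-end  */

/* Hypothetical restructuring: the FAM-containing structure is last.  */
struct q_ok { int b; int c; struct t a; };

/* Allocate a q_ok with room for NELEMS trailing ints, zero-initialized.  */
struct q_ok *make_q (int nelems)
{
  assert (nelems >= 0);
  struct q_ok *q = calloc (1, sizeof (struct q_ok)
                              + (size_t) nelems * sizeof (int));
  if (q)
    q->a.b = nelems;
  return q;
}
```

With this layout, `q->a.x[i]` for `i < nelems` refers to storage that belongs to the allocation, instead of overlapping a following member such as `c`.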



Re: [V3][PATCH 1/3] Provide counted_by attribute to flexible array member field (PR108896)

2023-10-18 Thread Qing Zhao
Hi, Sid, 

Thanks a lot for your time and effort to review this patch set!
And sorry for my late reply: I was on a long vacation immediately after
Cauldron and only came back this Monday.

See my reply embedded below:

> On Oct 5, 2023, at 2:51 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-08-25 11:24, Qing Zhao wrote:
>> Provide a new counted_by attribute to flexible array member field.
> 
> The obligatory "I can't ack the patch but here's a review" disclaimer :)
> 
>> 'counted_by (COUNT)'
>>  The 'counted_by' attribute may be attached to the flexible array
>>  member of a structure.  It indicates that the number of the
>>  elements of the array is given by the field named "COUNT" in the
>>  same structure as the flexible array member.  GCC uses this
>>  information to improve the results of the array bound sanitizer and
>>  the '__builtin_dynamic_object_size'.
>>  For instance, the following code:
>>   struct P {
>>     size_t count;
>>     char other;
>>     char array[] __attribute__ ((counted_by (count)));
>>   } *p;
>>  specifies that the 'array' is a flexible array member whose number
>>  of elements is given by the field 'count' in the same structure.
>>  The field that represents the number of the elements should have an
>>  integer type.  An explicit 'counted_by' annotation defines a
>>  relationship between two objects, 'p->array' and 'p->count', that
>>  'p->array' has _at least_ 'p->count' number of elements available.
>>  This relationship must hold even after any of these related objects
>>  are updated.  It's the user's responsibility to make sure this
>>  relationship to be kept all the time.  Otherwise the results of the
>>  array bound sanitizer and the '__builtin_dynamic_object_size' might
>>  be incorrect.
>>  For instance, in the following example, the allocated array has
>>  less elements than what's specified by the 'sbuf->count', this is
>>  an user error.  As a result, out-of-bounds access to the array
>>  might not be detected.
>>   #define SIZE_BUMP 10
>>   struct P *sbuf;
>>   void alloc_buf (size_t nelems)
>>   {
>>     sbuf = (struct P *) malloc (MAX (sizeof (struct P),
>>                                    (offsetof (struct P, array[0])
>>                                     + nelems * sizeof (char))));
>>     sbuf->count = nelems + SIZE_BUMP;
>>     /* This is invalid when the sbuf->array has less than sbuf->count
>>        elements.  */
>>   }
>>  In the following example, the 2nd update to the field 'sbuf->count'
>>  of the above structure will permit out-of-bounds access to the
>>  array 'sbuf->array' as well.
>>   #define SIZE_BUMP 10
>>   struct P *sbuf;
>>   void alloc_buf (size_t nelems)
>>   {
>>     sbuf = (struct P *) malloc (MAX (sizeof (struct P),
>>                                    (offsetof (struct P, array[0])
>>                                     + (nelems + SIZE_BUMP) * sizeof (char))));
>>     sbuf->count = nelems;
>>     /* This is valid when the sbuf->array has at least sbuf->count
>>        elements.  */
>>   }
>>   void use_buf (int index)
>>   {
>>     sbuf->count = sbuf->count + SIZE_BUMP + 1;
>>     /* Now the value of sbuf->count is larger than the number
>>        of elements of sbuf->array.  */
>>     sbuf->array[index] = 0;
>>     /* then the out-of-bound access to this array
>>        might not be detected.  */
>>   }
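The allocation formula used in the documentation examples above can be sketched as a small helper. This is only an illustration under stated assumptions: `p_alloc_size` and `p_alloc` are hypothetical names, the `counted_by` attribute itself is left out so that the sketch also builds with compilers that lack it, and `offsetof (struct P, array)` is used as an equivalent spelling of `offsetof (struct P, array[0])`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

struct P {
  size_t count;
  char other;
  char array[];  /* In the patch: __attribute__ ((counted_by (count))).  */
};

/* Bytes to allocate for NELEMS elements.  Because of tail padding,
   sizeof (struct P) can be larger than offsetof (struct P, array) plus
   a small array, so the larger of the two values is used.  */
size_t p_alloc_size (size_t nelems)
{
  return MAX (sizeof (struct P),
              offsetof (struct P, array) + nelems * sizeof (char));
}

/* Allocate the object and keep 'count' consistent with the allocation,
   which is the relationship the documentation asks the user to maintain.  */
struct P *p_alloc (size_t nelems)
{
  struct P *p = malloc (p_alloc_size (nelems));
  if (p)
    p->count = nelems;
  return p;
}
```

For small element counts the `sizeof (struct P)` term wins, which is why the formula is not simply the offset plus the array size.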
>> gcc/c-family/ChangeLog:
>>  PR C/108896
>>  * c-attribs.cc (handle_counted_by_attribute): New function.
>>  (attribute_takes_identifier_p): Add counted_by attribute to the list.
>>  * c-common.cc (c_flexible_array_member_type_p): ...To this.
>>  * c-common.h (c_flexible_array_member_type_p): New prototype.
>> gcc/c/ChangeLog:
>>  PR C/108896
>>  * c-decl.cc (flexible_array_member_type_p): Renamed and moved to...
>>  (add_flexible_array_elts_to_size): Use renamed function.
>>  (is_flexible_array_member_

Re: [V3][PATCH 1/3] Provide counted_by attribute to flexible array member field (PR108896)

2023-10-18 Thread Qing Zhao


>>> +   member FIELD_DECL is a valid field of the containing structure's 
>>> fieldlist,
>>> +   FIELDLIST, Report error and remove this attribute when it's not.  */
>>> +static void
>>> +verify_counted_by_attribute (tree fieldlist, tree field_decl)
>>> +{
>>> +  tree attr_counted_by = lookup_attribute ("counted_by",
>>> +   DECL_ATTRIBUTES (field_decl));
>>> +
>>> +  if (!attr_counted_by)
>>> +return;
>>> +
>>> +  /* If there is an counted_by attribute attached to the field,
>>> + verify it.  */
>>> +
>>> +  const char *fieldname
>>> += IDENTIFIER_POINTER (TREE_VALUE (TREE_VALUE (attr_counted_by)));
>>> +
>>> +  /* Verify the argument of the attrbute is a valid field of the
>> s/attrbute/attribute/
>>> + containing structure.  */
>>> +
>>> +  tree counted_by_field = get_named_field (fieldlist, fieldname);
>>> +
>>> +  /* Error when the field is not found in the containing structure.  */
>>> +  if (!counted_by_field)
>>> +{
>>> +  error_at (DECL_SOURCE_LOCATION (field_decl),
>>> +"%qE attribute argument not a field declaration"
>>> +" in the same structure, ignore it",
>>> +(get_attribute_name (attr_counted_by)));
>> Probably someone with English as a first language would make a better 
>> suggestion, but how about:
>>   Argument specified in %qE attribute is not a field declaration in the
>>   same structure, ignoring it.
>>> +
>>> +  DECL_ATTRIBUTES (field_decl)
>>> += remove_attribute ("counted_by", DECL_ATTRIBUTES (field_decl));
>>> +}
>>> +  else
>>> +  /* Error when the field is not with an integer type.  */
>> Suggest: Flag an error when the field is not of an integer type.
>>> +{
>>> +  while (TREE_CHAIN (counted_by_field))
>>> +counted_by_field = TREE_CHAIN (counted_by_field);
>>> +  tree real_field = TREE_VALUE (counted_by_field);
>>> +
>>> +  if (TREE_CODE (TREE_TYPE (real_field)) != INTEGER_TYPE)
>>> +{
>>> +  error_at (DECL_SOURCE_LOCATION (field_decl),
>>> + "%qE attribute argument not a field declaration"
>>> + " with integer type, ignore it",
>>> + (get_attribute_name (attr_counted_by)));
>> Suggest:
>>   Argument specified in %qE attribute is not of an integer type,
>>   ignoring it.
>>> +
>>> +  DECL_ATTRIBUTES (field_decl)
>>> += remove_attribute ("counted_by", DECL_ATTRIBUTES (field_decl));
>>> +}
>>> +}
>>> +
>>> +  return;
> 
> I forgot to mention the redundant return here.

Could you please clarify why the return here is redundant?
> 
>>> +}
>>>   /* Fill in the fields of a RECORD_TYPE or UNION_TYPE node, T.



Re: [V3][PATCH 1/3] Provide counted_by attribute to flexible array member field (PR108896)

2023-10-18 Thread Qing Zhao


> On Oct 18, 2023, at 11:18 AM, Siddhesh Poyarekar  wrote:
> 
> On 2023-10-18 10:51, Qing Zhao wrote:
>>>>> +   member FIELD_DECL is a valid field of the containing structure's 
>>>>> fieldlist,
>>>>> +   FIELDLIST, Report error and remove this attribute when it's not.  */
>>>>> +static void
>>>>> +verify_counted_by_attribute (tree fieldlist, tree field_decl)
>>>>> +{
>>>>> +  tree attr_counted_by = lookup_attribute ("counted_by",
>>>>> +   DECL_ATTRIBUTES (field_decl));
>>>>> +
>>>>> +  if (!attr_counted_by)
>>>>> +return;
>>>>> +
>>>>> +  /* If there is an counted_by attribute attached to the field,
>>>>> + verify it.  */
>>>>> +
>>>>> +  const char *fieldname
>>>>> += IDENTIFIER_POINTER (TREE_VALUE (TREE_VALUE (attr_counted_by)));
>>>>> +
>>>>> +  /* Verify the argument of the attrbute is a valid field of the
>>>> s/attrbute/attribute/
>>>>> + containing structure.  */
>>>>> +
>>>>> +  tree counted_by_field = get_named_field (fieldlist, fieldname);
>>>>> +
>>>>> +  /* Error when the field is not found in the containing structure.  */
>>>>> +  if (!counted_by_field)
>>>>> +{
>>>>> +  error_at (DECL_SOURCE_LOCATION (field_decl),
>>>>> +"%qE attribute argument not a field declaration"
>>>>> +" in the same structure, ignore it",
>>>>> +(get_attribute_name (attr_counted_by)));
>>>> Probably someone with English as a first language would make a better 
>>>> suggestion, but how about:
>>>>   Argument specified in %qE attribute is not a field declaration in the
>>>>   same structure, ignoring it.
>>>>> +
>>>>> +  DECL_ATTRIBUTES (field_decl)
>>>>> += remove_attribute ("counted_by", DECL_ATTRIBUTES (field_decl));
>>>>> +}
>>>>> +  else
>>>>> +  /* Error when the field is not with an integer type.  */
>>>> Suggest: Flag an error when the field is not of an integer type.
>>>>> +{
>>>>> +  while (TREE_CHAIN (counted_by_field))
>>>>> +counted_by_field = TREE_CHAIN (counted_by_field);
>>>>> +  tree real_field = TREE_VALUE (counted_by_field);
>>>>> +
>>>>> +  if (TREE_CODE (TREE_TYPE (real_field)) != INTEGER_TYPE)
>>>>> +{
>>>>> +  error_at (DECL_SOURCE_LOCATION (field_decl),
>>>>> + "%qE attribute argument not a field declaration"
>>>>> + " with integer type, ignore it",
>>>>> + (get_attribute_name (attr_counted_by)));
>>>> Suggest:
>>>>   Argument specified in %qE attribute is not of an integer type,
>>>>   ignoring it.
>>>>> +
>>>>> +  DECL_ATTRIBUTES (field_decl)
>>>>> += remove_attribute ("counted_by", DECL_ATTRIBUTES (field_decl));
>>>>> +}
>>>>> +}
>>>>> +
>>>>> +  return;
>>> 
>>> I forgot to mention the redundant return here.
>> Could you please clarify a little bit here, why the return here is redundant?
> 
> It's the last line in the function, so even without that statement the 
> function will return.
Oh, I see. :-)
Actually, I always put an explicit return there, even though it’s the last
line and the function would return implicitly.

Qing

> 
> Thanks,
> Sid



Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-18 Thread Qing Zhao


> On Oct 6, 2023, at 4:01 PM, Martin Uecker  wrote:
> 
> On Friday, 06.10.2023 at 06:50 -0400, Siddhesh Poyarekar wrote:
>> On 2023-10-06 01:11, Martin Uecker wrote:
>>> On Thursday, 05.10.2023 at 15:35 -0700, Kees Cook wrote:
>>>> On Thu, Oct 05, 2023 at 04:08:52PM -0400, Siddhesh Poyarekar wrote:
>>>>> 2. How would you handle signedness of the size field?  The size gets
>>>>> converted to sizetype everywhere it is used and overflows/underflows may
>>>>> produce interesting results.  Do you want to limit the types to unsigned or
>>>>> do you want to add a disclaimer in the docs?  The former seems like the
>>>>> *right* thing to do given that it is a new feature; best to enforce the
>>>>> cleaner habit at the outset.
>>>> 
>>>> The Linux kernel has a lot of "int" counters, so the goal is to catch
>>>> negative offsets just like too-large offsets at runtime with the sanitizer
>>>> and report 0 for __bdos. Refactoring all these to be unsigned is going
>>>> to take time since at least some of them use the negative values as
>>>> special values unrelated to array indexing. :(
>>>> 
>>>> So, perhaps if unsigned counters are worth enforcing, can this be a
>>>> separate warning the kernel can turn off initially?
>>>> 
>>> 
>>> I think unsigned counters are much more problematic than signed ones
>>> because wraparound errors are more difficult to find.
>>> 
>>> With unsigned you could potentially diagnose wraparound, but only if we
>>> add -fsanitize=unsigned-overflow *and* add mechanism to mark intentional
>>> wraparound *and* everybody adds this annotation after carefully screening
>>> their code *and* rewriting all operations such as (counter - 3) + 5
>>> where the wraparound in the intermediate expression is harmless.
>>> 
>>> For this reason, I do not think we should ever enforce some rule that
>>> the counter has to be unsigned.
>>> 
>>> What we could do, is detect *storing* negative values into the
>>> counter at run-time using UBSan. (but if negative values are
>>> used for special cases, one also should be able to turn this
>>> off).
>> 
>> All of the object size detection relies on object sizes being sizetype. 
>> The closest we could do with that is detect (sz != SIZE_MAX && sz > 
>> SIZE_MAX / 2), since allocators typically cannot allocate more than 
>> SIZE_MAX / 2.
> 
> I was talking about the counter in:
> 
> struct {
>  int counter;
>  char buf[] __counted_by__((counter))
> };
> 
> which could be checked to be positive either when stored to or 
> when buf is used.
> 
> And yes, we could also check the size of buf.  Not sure what is
> done for VLAs now, but I guess it could be similar.
> 
For VLAs, the bound expression can be either signed or unsigned.
However, we have added a sanitizer option, -fsanitize=vla-bound, to catch the
cases where the size of the VLA is not positive.

For example:

[opc@qinzhao-ol8u3-x86 Martin]$ cat t3.c
#include <stdio.h>
size_t foo(int m)
{
  char t[m];

  return sizeof(t);
}

int main()
{
  printf ("the sizeof flexm is %lu \n", foo(-1));
  return 0;
}
[opc@qinzhao-ol8u3-x86 Martin]$ sh t
/home/opc/Install/latest-d/bin/gcc -fsanitize=undefined -O2 -Wall -Wpedantic 
t3.c
t3.c:4:8: runtime error: variable length array bound evaluates to non-positive 
value -1
the sizeof flexm is 18446744073609551616 


We can do the same thing for “counted_by”, i.e.:

1. No specification for signed or unsigned for the counted_by field.
2. Add a sanitizer option -fsanitize=counted-by-bound to catch the cases when 
the size of the counted-by is not positive.
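As a rough illustration of why the signedness question matters, the sketch below contrasts a bounds check that converts a signed counter straight to size_t with one that rejects negative counters first. Both helpers (`naive_in_bounds` and `checked_in_bounds`) are hypothetical and written only for this example; they are not part of the patch set or of any proposed sanitizer implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Converting a negative counter to size_t wraps around to a huge value,
   so this check accepts almost every index when COUNTER is negative.  */
bool naive_in_bounds (int counter, size_t idx)
{
  return idx < (size_t) counter;
}

/* Reject negative counters before the conversion; this is the kind of
   condition a run-time check on the counter could report.  */
bool checked_in_bounds (int counter, size_t idx)
{
  if (counter < 0)
    return false;
  return idx < (size_t) counter;
}
```

With a counter of -1, the naive check treats index 1000 as valid, while the checked variant refuses it.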

Is this good enough?

Qing
> Best,
> Martin
> 
> 
> 
>> 
>> Sid



Re: [PATCH v2] gcc: Introduce -fhardened

2023-10-18 Thread Qing Zhao
Marek,

Sorry for the late comment (I just got back from a long vacation immediately
after Cauldron).

One question:

Is the option “-fhardened” for production builds or for development builds?

If it’s for development builds, then adding -ftrivial-auto-var-init=pattern is
reasonable, since the major purpose of -ftrivial-auto-var-init=pattern is
debugging; the runtime overhead of -ftrivial-auto-var-init=pattern is higher
than that of -ftrivial-auto-var-init=zero.

However, if it’s for production builds, then adding -ftrivial-auto-var-init=zero
is better, since the major purpose of -ftrivial-auto-var-init=zero is to
eliminate uninitialized automatic variables in production builds, and the
runtime overhead of =zero is smaller than that of =pattern.

Qing
> On Oct 11, 2023, at 4:48 PM, Marek Polacek  wrote:
> 
> On Tue, Sep 19, 2023 at 10:58:19AM -0400, Marek Polacek wrote:
>> On Mon, Sep 18, 2023 at 08:57:39AM +0200, Richard Biener wrote:
>>> On Fri, Sep 15, 2023 at 5:09 PM Marek Polacek via Gcc-patches
>>>  wrote:
 
>>>> Bootstrapped/regtested on x86_64-pc-linux-gnu, powerpc64le-unknown-linux-gnu,
>>>> and aarch64-unknown-linux-gnu; ok for trunk?
>>>> 
>>>> -- >8 --
>>>> In 
>>>> I proposed -fhardened, a new umbrella option that enables a reasonable set
>>>> of hardening flags.  The read of the room seems to be that the option
>>>> would be useful.  So here's a patch implementing that option.
>>>> 
>>>> Currently, -fhardened enables:
>>>> 
>>>>  -D_FORTIFY_SOURCE=3 (or =2 for older glibcs)
>>>>  -D_GLIBCXX_ASSERTIONS
>>>>  -ftrivial-auto-var-init=pattern
>>>>  -fPIE  -pie  -Wl,-z,relro,-z,now
>>>>  -fstack-protector-strong
>>>>  -fstack-clash-protection
>>>>  -fcf-protection=full (x86 GNU/Linux only)
>>>> 
>>>> -fhardened will not override options that were specified on the command line
>>>> (before or after -fhardened).  For example,
>>>> 
>>>> -D_FORTIFY_SOURCE=1 -fhardened
>>>> 
>>>> means that _FORTIFY_SOURCE=1 will be used.  Similarly,
>>>> 
>>>>  -fhardened -fstack-protector
>>>> 
>>>> will not enable -fstack-protector-strong.
>>>> 
>>>> In DW_AT_producer it is reflected only as -fhardened; it doesn't expand
>>>> to anything.  I think we need a better way to show what it actually
>>>> enables.
>>> 
>>> I do think we need to find a solution here to solve asserting compliance.
>> 
>> Fair enough.
>> 
>>> Maybe we can have -Whardened that will diagnose any altering of
>>> -fhardened by other options on the command-line or by missed target
>>> implementations?  People might for example use -fstack-protector
>>> but don't really want to make protection lower than requested with 
>>> -fhardened.
>>> 
>>> Any such conflict is much less apparent than when you use the
>>> flags -fhardened composes.
>> 
>> How about: --help=hardened says which options -fhardened attempts to
>> enable, and -Whardened warns when it didn't enable an option?  E.g.,
>> 
>>  -fstack-protector -fhardened -Whardened
>> 
>> would say that it didn't enable -fstack-protector-strong because
>> -fstack-protector was specified on the command line?
>> 
>> If !HAVE_LD_NOW_SUPPORT, --help=hardened probably doesn't even have to
>> list -z now, likewise for -z relro.
>> 
>> Unclear if -Whardened should be enabled by default, but probably yes?
> 
> Here's v2 which adds -Whardened (enabled by default).
> 
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> 
> -- >8 --
> In 
> I proposed -fhardened, a new umbrella option that enables a reasonable set
> of hardening flags.  The read of the room seems to be that the option
> would be useful.  So here's a patch implementing that option.
> 
> Currently, -fhardened enables:
> 
>  -D_FORTIFY_SOURCE=3 (or =2 for older glibcs)
>  -D_GLIBCXX_ASSERTIONS
>  -ftrivial-auto-var-init=pattern
>  -fPIE  -pie  -Wl,-z,relro,-z,now
>  -fstack-protector-strong
>  -fstack-clash-protection
>  -fcf-protection=full (x86 GNU/Linux only)
> 
> -fhardened will not override options that were specified on the command line
> (before or after -fhardened).  For example,
> 
> -D_FORTIFY_SOURCE=1 -fhardened
> 
> means that _FORTIFY_SOURCE=1 will be used.  Similarly,
> 
>  -fhardened -fstack-protector
> 
> will not enable -fstack-protector-strong.
> 
> In DW_AT_producer it is reflected only as -fhardened; it doesn't expand
> to anything.  This patch provides -Whardened, enabled by default, which
> warns when -fhardened couldn't enable a particular option.  I think most
> often it will say that _FORTIFY_SOURCE wasn't enabled because optimizations
> were not enabled.
> 
> gcc/c-family/ChangeLog:
> 
>   * c-opts.cc (c_finish_options): Maybe cpp_define _FORTIFY_SOURCE
>   and _GLIBCXX_ASSERTIONS.
> 
> gcc/ChangeLog:
> 
>   * common.opt (Whardened, fhardened): New options.
>   * config.in: Regenerate.
>   * config/bpf/bpf.cc: Include "opts.h".
>   (bpf_option_o

Re: [V3][PATCH 2/3] Use the counted_by atribute info in builtin object size [PR108896]

2023-10-18 Thread Qing Zhao
Hi, Sid,

Thanks a lot for the detailed comments.

See my responses embedded below.

Qing

> On Oct 5, 2023, at 4:01 PM, Siddhesh Poyarekar  wrote:
> 
> 
> 
> On 2023-08-25 11:24, Qing Zhao wrote:
>> Use the counted_by atribute info in builtin object size to compute the
>> subobject size for flexible array members.
>> gcc/ChangeLog:
>>  PR C/108896
>>  * tree-object-size.cc (addr_object_size): Use the counted_by
>>  attribute info.
>>  * tree.cc (component_ref_has_counted_by_p): New function.
>>  (component_ref_get_counted_by): New function.
>>  * tree.h (component_ref_has_counted_by_p): New prototype.
>>  (component_ref_get_counted_by): New prototype.
>> gcc/testsuite/ChangeLog:
>>  PR C/108896
>>  * gcc.dg/flex-array-counted-by-2.c: New test.
>>  * gcc.dg/flex-array-counted-by-3.c: New test.
>> ---
>>  .../gcc.dg/flex-array-counted-by-2.c  |  74 ++
>>  .../gcc.dg/flex-array-counted-by-3.c  | 210 ++
>>  gcc/tree-object-size.cc   |  37 ++-
>>  gcc/tree.cc   |  95 +++-
>>  gcc/tree.h|  10 +
>>  5 files changed, 418 insertions(+), 8 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-2.c
>>  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
>> diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-2.c 
>> b/gcc/testsuite/gcc.dg/flex-array-counted-by-2.c
>> new file mode 100644
>> index ..ec580c1f1f01
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-2.c
>> @@ -0,0 +1,74 @@
>> +/* test the attribute counted_by and its usage in
>> + * __builtin_dynamic_object_size.  */
>> +/* { dg-do run } */
>> +/* { dg-options "-O2" } */
>> +
>> +#include "builtin-object-size-common.h"
>> +
>> +#define expect(p, _v) do { \
>> +size_t v = _v; \
>> +if (p == v) \
>> +__builtin_printf ("ok:  %s == %zd\n", #p, p); \
>> +else \
>> +{  \
>> +  __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
>> +  FAIL (); \
>> +} \
>> +} while (0);
> 
> You're using this in a bunch of tests already; does it make sense to 
> consolidate it into builtin-object-size-common.h?
Will do this. 
> 
>> +
>> +struct flex {
>> +  int b;
>> +  int c[];
>> +} *array_flex;
>> +
>> +struct annotated {
>> +  int b;
>> +  int c[] __attribute__ ((counted_by (b)));
>> +} *array_annotated;
>> +
>> +struct nested_annotated {
>> +  struct {
>> +union {
>> +  int b;
>> +  float f;  
>> +};
>> +int n;
>> +  };
>> +  int c[] __attribute__ ((counted_by (b)));
>> +} *array_nested_annotated;
>> +
>> +void __attribute__((__noinline__)) setup (int normal_count, int attr_count)
>> +{
>> +  array_flex
>> += (struct flex *)malloc (sizeof (struct flex)
>> + + normal_count *  sizeof (int));
>> +  array_flex->b = normal_count;
>> +
>> +  array_annotated
>> += (struct annotated *)malloc (sizeof (struct annotated)
>> +  + attr_count *  sizeof (int));
>> +  array_annotated->b = attr_count;
>> +
>> +  array_nested_annotated
>> += (struct nested_annotated *)malloc (sizeof (struct nested_annotated)
>> + + attr_count *  sizeof (int));
>> +  array_nested_annotated->b = attr_count;
>> +
>> +  return;
>> +}
>> +
>> +void __attribute__((__noinline__)) test ()
>> +{
>> +expect(__builtin_dynamic_object_size(array_flex->c, 1), -1);
>> +expect(__builtin_dynamic_object_size(array_annotated->c, 1),
>> +   array_annotated->b * sizeof (int));
>> +expect(__builtin_dynamic_object_size(array_nested_annotated->c, 1),
>> +   array_nested_annotated->b * sizeof (int));
>> +}
> 
> Maybe another test where the allocation, size assignment and __bdos call 
> happen in the same function, where the allocator is not recognized by gcc:
> 
> void *
> __attribute__ ((noinline))
> alloc (size_t sz)
> {
>  return __builtin_malloc (sz);
> }
> 
> size_t test (size_t sz)
> {
>  array_annotated = alloc (sz);
>  array_annotated->b = sz;
>  return __builtin_dynamic_object_size (array_annotated->c, 1);
> }
> 
> The interesting thing to test (and ensure in the 

Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-18 Thread Qing Zhao


> On Oct 5, 2023, at 4:08 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-08-25 11:24, Qing Zhao wrote:
>> This is the 3rd version of the patch, per our discussion based on the
>> review comments for the 1st and 2nd version, the major changes in this
>> version are:
> 
> Hi Qing,
> 
> I hope the review was helpful.  Overall, a couple of things to consider:
> 
> 1. How would you handle potential reordering between assignment of the size 
> to the counted_by field with the __bdos call that may consume it? You'll 
> probably need to express some kind of dependency there or in the worst case, 
> insert a barrier to disallow reordering.

Good point! 

So, your example in the response to [V3][PATCH 2/3] Use the counted_by atribute 
info in builtin object size [PR108896]:
“
Maybe another test where the allocation, size assignment and __bdos call happen 
in the same function, where the allocator is not recognized by gcc:

void *
__attribute__ ((noinline))
alloc (size_t sz)
{
 return __builtin_malloc (sz);
}

size_t test (size_t sz)
{
 array_annotated = alloc (sz);
 array_annotated->b = sz;
 return __builtin_dynamic_object_size (array_annotated->c, 1);
}

The interesting thing to test (and ensure in the codegen) is that the 
assignment to array_annotated->b does not get reordered to below the 
__builtin_dynamic_object_size call since technically there is no data 
dependency between the two.
“
I will test this.

I am not sure whether the current GCC alias analysis is able to distinguish one
field of a structure from another field of the same structure. If it is, then
we need to add an explicit dependency edge from the write to
“array_annotated->b” to the call to
“__builtin_dynamic_object_size(array_annotated->c, 1)”.
I will check on this and see how to resolve the issue.

I guess the possible solution is that we can add an implicit ref to 
“array_annotated->b” at the call to 
“__builtin_dynamic_object_size(array_annotated->c, 1)” if the counted_by 
attribute is available. That should resolve the issue.
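For reference, the source-level ordering at stake can be sketched as below. This is only an illustration of what the program expects, not of how the compiler enforces it. The `COUNTED_BY` macro and the function `alloc_and_query` are hypothetical names; the attribute and the builtin are guarded with `__has_attribute` and `__has_builtin` on the assumption that some compilers reading this do not provide them, and the fallback value `(size_t) -1` mirrors the builtin's "unknown size" result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif

#if __has_attribute (counted_by)
# define COUNTED_BY(m) __attribute__ ((counted_by (m)))
#else
# define COUNTED_BY(m)  /* Attribute not available: plain flexible array.  */
#endif

struct annotated {
  int b;
  int c[] COUNTED_BY (b);
};

/* Allocate room for COUNT ints, store the count, then query the size.
   The store to p->b must be visible to the size query that follows it.  */
size_t alloc_and_query (struct annotated **out, int count)
{
  assert (out != NULL && count >= 0);
  struct annotated *p = malloc (sizeof (struct annotated)
                                + (size_t) count * sizeof (int));
  *out = p;
  if (!p)
    return 0;
  p->b = count;
#if __has_builtin (__builtin_dynamic_object_size)
  return __builtin_dynamic_object_size (p->c, 1);
#else
  return (size_t) -1;  /* Builtin not available: size unknown.  */
#endif
}
```

If the store to `p->b` were moved below the size query, an implementation that derives the size from the counter would read an indeterminate value, which is the situation the dependency discussed above is meant to rule out.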

Richard, what do you think about this?

> 
> 2. How would you handle signedness of the size field?  The size gets 
> converted to sizetype everywhere it is used and overflows/underflows may 
> produce interesting results.  Do you want to limit the types to unsigned or 
> do you want to add a disclaimer in the docs?  The former seems like the 
> *right* thing to do given that it is a new feature; best to enforce the 
> cleaner habit at the outset.

As I replied to Martin in another email, I plan to do the following to resolve 
this issue:

1. No specification for signed or unsigned for counted_by field.
2. Add a sanitizer option -fsanitize=counted-by-bound to catch the cases when 
the size of the counted-by is not positive.

Then, we will be consistent with the handling of VLAs.

So, I will not change anything for the current patch.
However, I will add the sanitizer option in a followup patch set.

Let me know your opinion.

thanks.

Qing

> 
> Thanks,
> Sid
> 
>> ***Against 1st version:
>> 1. change the name "element_count" to "counted_by";
>> 2. change the parameter for the attribute from a STRING to an
>> Identifier;
>> 3. Add logic and testing cases to handle anonymous structure/unions;
>> 4. Clarify documentation to permit the situation when the allocation
>> size is larger than what's specified by "counted_by"; at the same time,
>> it's a user error if the allocation size is smaller than what's specified
>> by "counted_by";
>> 5. Add a complete testing case for using counted_by attribute in
>> __builtin_dynamic_object_size when there is mismatch between the
>> allocation size and the value of "counted_by", the expected behavior
>> for each case and the explanation on why in the comments.
>> ***Against 2nd version:
>> 1. Identify a tree node sharing issue and fixed it in the routine
>>    "component_ref_get_counted_by" of tree.cc;
>> 2. Update the documentation and testing cases with the clear usage
>>    of the formula to compute the allocation size:
>> MAX (sizeof (struct A), offsetof (struct A, array[0]) + counted_by * sizeof(element))
>>    (the algorithm used in tree-object-size.cc is correct).
>> In this set of patches, the major functionality provided is:
>> 1. a new attribute "counted_by";
>> 2. use this new attribute in bound sanitizer;
>> 3. use this new attribute in dynamic object size for subobject size;
>> As discussed, I plan to add two more separate patches sets after this initial
>> patch set is approved and committed.
>> set 1. A new warning option and a new sanitizer option for the user error
>>   when the allocation size is smal

[version 2] Re: [PATCH][Middle-end]3rd patch of PR78809

2018-07-02 Thread Qing Zhao
Hi, Jeff,

thanks a lot for your review and comments.

I have addressed your comments, updated the patch, and retested on both
aarch64 and x86.

The major changes in this version compared to the previous version are:

1. In the routine “expand_builtin_memcmp”:
   * move the inlining transformation AFTER the warning is issued for
     -Wstringop-overflow;
   * only apply inlining when no warning is issued.
2. In the testsuite, add a new testcase strcmpopt_6.c for this case.
3. Update comments to:
   * capitalize the first word;
   * capitalize all the arguments.
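For readers unfamiliar with the transformation, the ChangeLog entry below describes the inline expansion as "a sequence of char comparison". At the source level, the effect is roughly the following sketch. It is a hand-written C analogue, not the RTL that builtins.c emits, and `inline_cmp_sketch` is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* Compare LENGTH bytes of VAR_STR against CONST_STR one byte at a time,
   as unsigned chars, and stop at the first difference.  When LENGTH covers
   the terminating NUL of CONST_STR, this also gives strcmp semantics: a
   shorter VAR_STR differs at its own NUL, so nothing past it is read.  */
int inline_cmp_sketch (const char *var_str, const char *const_str,
                       size_t length)
{
  assert (var_str != NULL && const_str != NULL);
  for (size_t i = 0; i < length; i++)
    {
      int diff = (unsigned char) var_str[i] - (unsigned char) const_str[i];
      if (diff != 0)
        return diff;
    }
  return 0;
}
```

A compiler can unroll this loop completely when LENGTH is a small constant, which is presumably what the --param builtin-string-cmp-inline-length limit in the ChangeLog controls.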

NOTE: the routines “expand_builtin_strcmp” and “expand_builtin_strncmp” are not
changed. The reason is that there is currently NO overflow checking in these
two routines. If we need overflow checking for them, I think a separate patch
is needed. If so, let me know, and I can work on a separate patch that issues
warnings for strcmp/strncmp when -Wstringop-overflow is specified.

The new patch is as follows; please take a look at it.

thanks.

Qing

gcc/ChangeLog

+2018-07-02  Qing Zhao  
+
+   PR middle-end/78809
+   * builtins.c (expand_builtin_memcmp): Inline the calls first
+   when result_eq is false.
+   (expand_builtin_strcmp): Inline the calls first.
+   (expand_builtin_strncmp): Likewise.
+   (inline_string_cmp): New routine. Expand a string compare
+   call by using a sequence of char comparison.
+   (inline_expand_builtin_string_cmp): New routine. Inline expansion
+   a call to str(n)cmp/memcmp.
+   * doc/invoke.texi (--param builtin-string-cmp-inline-length): New 
option.
+   * params.def (BUILTIN_STRING_CMP_INLINE_LENGTH): New.
+

gcc/testsuite/ChangeLog

+2018-07-02  Qing Zhao  
+
+   PR middle-end/78809
+   * gcc.dg/strcmpopt_5.c: New test.
+   * gcc.dg/strcmpopt_6.c: New test.
+



0001-3nd-Patch-for-PR78009.patch
Description: Binary data


> On Jun 28, 2018, at 12:10 AM, Jeff Law  wrote:
> So I still need to dig into this patch.  But I wanted to raise an
> potential issue and get yours and Martin's thoughts on it.
> 
> Martin (and others) have been working hard to improve GCC's ability to
> give good diagnostics for various problems with calls to mem* and str*
> functions (buffer overflows, restrict issues, etc).
> 
> One of the problems Martin has identified is early conversion of these
> calls into inlined direct operations.  If those conversions happen prior
> to the analysis for warnings we obviously can't issue any relevant warnings.
> 
> Please capitalize the first word in sentences like this.  This nit
> appears in most of your comments.
> 
> 
> So I believe you do inline expansion here prior to the checks for
> Wstringop_overflow.  I think you can safely move this code to after the
> warn_stringop_overflow checks.  Though you may want to make this code
> conditional on both calls to check_access returning true and avoiding
> your transformation if either or both calls return false.
> 
> Alternately you'd need to verify that inline_expand_builtin_string_cmp
> always returns false for cases which are going to generate a warning.
> But that seems a bit tougher to maintain over time if we were to add
> more warnings to this code.
> 
> When referring to arguments in comments, please capitalize them.  ie
> VAR_STR, CONST_STR, etc.
> 
> 
> But leave them as lower case in code fragments like this.
> 
> 
> So this generally looks pretty good.  THe biggest technical concern is
> making sure we're doing the right thing WRT issuing warnings.  You can
> tackle that problem by deferring inlining to a later point after
> warnings have been issued or by verifying that your routines do not
> inline in cases where warnings will be issued.  It may be worth adding
> testcases for these issues.
> 
> There's a large number of comments that need capitalization fixes.
> 
> Given there was no measured runtime performance impact, but slight
> improvements on codesize for values <= 3, let's go ahead with that as
> the default.
> 
> Can you address the issues above and repost for final review?
> 
> Thanks,
> jeff



Re: [PATCH][Middle-end]3rd patch of PR78809

2018-07-02 Thread Qing Zhao
Hi, Jeff,

thanks a lot for your review and comments.

I have addressed your comments, updated the patch, and retested on both
aarch64 and x86.

The major changes in this version compared to the previous version are:

1. in routine expand_builtin_memcmp:
* move the inlining transformation AFTER the warning is issued for
-Wstringop-overflow;
* only apply inlining when no warning is issued.
2. in the testsuite, add a new testcase strcmpopt_6.c for this case.
3. update comments to:
* capitalize the first word.
* capitalize all the arguments.

NOTE: the routines expand_builtin_strcmp and expand_builtin_strncmp are not
changed.
The reason is that there is currently NO overflow checking for these two routines.
If we need overflow checking for them, I think a separate patch is needed.
If so, let me know and I can work on that separate patch to issue
warnings for strcmp/strncmp when -Wstringop-overflow is specified.

The new patch is as follows; please take a look.

thanks.

Qing

gcc/ChangeLog

+2018-07-02  Qing Zhao  
+
+   PR middle-end/78809
+   * builtins.c (expand_builtin_memcmp): Inline the calls first
+   when result_eq is false.
+   (expand_builtin_strcmp): Inline the calls first.
+   (expand_builtin_strncmp): Likewise.
+   (inline_string_cmp): New routine. Expand a string compare
+   call by using a sequence of char comparison.
+   (inline_expand_builtin_string_cmp): New routine. Inline expansion
+   a call to str(n)cmp/memcmp.
+   * doc/invoke.texi (--param builtin-string-cmp-inline-length): New 
option.
+   * params.def (BUILTIN_STRING_CMP_INLINE_LENGTH): New.
+

gcc/testsuite/ChangeLog

+2018-07-02  Qing Zhao  
+
+   PR middle-end/78809
+   * gcc.dg/strcmpopt_5.c: New test.
+   * gcc.dg/strcmpopt_6.c: New test.
+



0001-3nd-Patch-for-PR78009.patch
Description: Binary data


> But leave them as lower case in code fragments like this.
> 
> 
> So this generally looks pretty good.  THe biggest technical concern is
> making sure we're doing the right thing WRT issuing warnings.  You can
> tackle that problem by deferring inlining to a later point after
> warnings have been issued or by verifying that your routines do not
> inline in cases where warnings will be issued.  It may be worth adding
> testcases for these issues.
> 
> There's a large number of comments that need capitalization fixes.
> 
> Given there was no measured runtime performance impact, but slight
> improvements on codesize for values <= 3, let's go ahead with that as
> the default.
> 
> Can you address the issues above and repost for final review?
> 
> Thanks,
> jeff



Re: [PATCH][Middle-end]3rd patch of PR78809

2018-07-05 Thread Qing Zhao
Hi,

I have sent two emails with the updated patches on 7/3:

https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00065.html
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00070.html

However, these two emails were not successfully forwarded to the
gcc-patches@gcc.gnu.org mailing list.

So I am sending the same email again here; hopefully it goes through this time.
Qing

Hi, Jeff,

thanks a lot for your review and comments.

I have addressed your comments, updated the patch, and retested on both
aarch64 and x86.

The major changes in this version compared to the previous version are:

1. in routine expand_builtin_memcmp:
* move the inlining transformation AFTER the warning is issued for
-Wstringop-overflow;
* only apply inlining when no warning is issued.
2. in the testsuite, add a new testcase strcmpopt_6.c for this case.
3. update comments to:
* capitalize the first word.
* capitalize all the arguments.

NOTE: the routines expand_builtin_strcmp and expand_builtin_strncmp are not
changed.
The reason is that there is currently NO overflow checking for these two routines.
If we need overflow checking for them, I think a separate patch is needed.
If so, let me know and I can work on that separate patch to issue
warnings for strcmp/strncmp when -Wstringop-overflow is specified.

The new patch is as follows; please take a look.

thanks.

Qing

gcc/ChangeLog

+2018-07-02  Qing Zhao  
+
+   PR middle-end/78809
+   * builtins.c (expand_builtin_memcmp): Inline the calls first
+   when result_eq is false.
+   (expand_builtin_strcmp): Inline the calls first.
+   (expand_builtin_strncmp): Likewise.
+   (inline_string_cmp): New routine. Expand a string compare
+   call by using a sequence of char comparison.
+   (inline_expand_builtin_string_cmp): New routine. Inline expansion
+   a call to str(n)cmp/memcmp.
+   * doc/invoke.texi (--param builtin-string-cmp-inline-length): New 
option.
+   * params.def (BUILTIN_STRING_CMP_INLINE_LENGTH): New.
+

gcc/testsuite/ChangeLog

+2018-07-02  Qing Zhao  
+
+   PR middle-end/78809
+   * gcc.dg/strcmpopt_5.c: New test.
+   * gcc.dg/strcmpopt_6.c: New test.
+



0001-3nd-Patch-for-PR78009.patch
Description: Binary data


> On Jun 28, 2018, at 12:10 AM, Jeff Law  wrote:
> 
> 
> So this generally looks pretty good.  THe biggest technical concern is
> making sure we're doing the right thing WRT issuing warnings.  You can
> tackle that problem by deferring inlining to a later point after
> warnings have been issued or by verifying that your routines do not
> inline in cases where warnings will be issued.  It may be worth adding
> testcases for these issues.
> 
> There's a large number of comments that need capitalization fixes.
> 
> Given there was no measured runtime performance impact, but slight
> improvements on codesize for values <= 3, let's go ahead with that as
> the default.
> 
> Can you address the issues above and repost for final review?
> 
> Thanks,
> jeff



Re: [PATCH][Middle-end]3rd patch of PR78809

2018-07-09 Thread Qing Zhao
Hi, Martin,

thanks a lot for your comments. 

> On Jul 5, 2018, at 11:30 AM, Martin Sebor  wrote:
> 
> One of the basic design principles that I myself have
> accidentally violated in the past is that warning options
> should not impact the emitted object code.  I don't think
> your patch actually does introduce this dependency by having
> the codegen depend on the result of check_access() -- I'm
> pretty sure the function is designed to do the validation
> irrespective of warning options and return based on
> the result of the validation and not based on whether
> a warning was issued.  But the choice of the variable name,
> no_overflow_warn, suggests that it does, in fact, have this
> effect.  So I would suggest to rename the variable and add
> a test that verifies that this dependency does not exist.

I agree that warning options should not impact the emitted object code,
and my current change seems to violate this principle.

In the following change:

+  bool no_overflow_warn = true;

   /* Diagnose calls where the specified length exceeds the size of either
      object.  */
   if (warn_stringop_overflow)
     {
       tree size = compute_objsize (arg1, 0);
-      if (check_access (exp, /*dst=*/NULL_TREE, /*src=*/NULL_TREE, len,
-                        /*maxread=*/NULL_TREE, size, /*objsize=*/NULL_TREE))
+      no_overflow_warn = check_access (exp, /*dst=*/NULL_TREE, /*src=*/NULL_TREE,
+                                       len, /*maxread=*/NULL_TREE, size,
+                                       /*objsize=*/NULL_TREE);
+      if (no_overflow_warn)
         {
           size = compute_objsize (arg2, 0);
-          check_access (exp, /*dst=*/NULL_TREE, /*src=*/NULL_TREE, len,
-                        /*maxread=*/NULL_TREE, size, /*objsize=*/NULL_TREE);
+          no_overflow_warn = check_access (exp, /*dst=*/NULL_TREE, /*src=*/NULL_TREE,
+                                           len, /*maxread=*/NULL_TREE, size,
+                                           /*objsize=*/NULL_TREE);
         }
     }

+  /* Due to the performance benefit, always inline the calls first
+     when result_eq is false.  */
+  rtx result = NULL_RTX;
+
+  if (!result_eq && fcode != BUILT_IN_BCMP && no_overflow_warn)
+    {
+      result = inline_expand_builtin_string_cmp (exp, target, true);
+      if (result)

The variable no_overflow_warn DEPENDS on the warning option
warn_stringop_overflow, and this variable is used to control code
generation. Such behavior seems to violate the principle mentioned above.

However, this is not a problem that can be easily fixed within the
current design, which, as I understand it, has the following issues:

1. the routine check_access issues warnings by default, so it seems
necessary to guard the call to this routine with the warning option;
2. the return value of check_access then has to depend on
the warning option.

To fix this, one approach is to rewrite check_access so that issuing the
warning is guarded inside the routine, with the warning option passed as an
additional parameter.

Let me know if I am missing anything so far.

> 
> Beyond that, an enhancement to this optimization that might
> be worth considering is inlining even non-constant calls
> with array arguments whose size is no greater than the limit.
> As in:
> 
>  extern char a[4], *b;
> 
>  int n = strcmp (a, b);
> 
> Because strcmp arguments are required to be nul-terminated
> strings, a's length above must be at most 3.  This is analogous
> to similar optimizations GCC performs, such as folding to zero
> calls to strlen() with one-element arrays.

Yes, I agree that this would be another good enhancement to the strcmp inlining.

However, it is not easy to integrate with my current patch. The major issue
is:

The inlined code for a strcmp call without a string constant will be
different from the inlined code for a strcmp call with a string constant,
so:

1. the default value of the threshold that controls the maximum string
length for inlining will differ from the one for a strcmp call with a
string constant; more experiments need to be run, and a new parameter
needs to be added to control this;
2. the inlined transformed code will be different from the current one.

Based on the above, I’d like to open a new PR to record this enhancement
and finish it with a new patch later.

What’s your opinion on this?
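
For reference, the array-bound idea Martin describes above can be sketched in plain C. This is only an illustration of the reasoning, not what GCC would emit. The function name strcmp_a4 and the macro A_SIZE are hypothetical, and the code assumes the first argument points to a 4-byte array that holds a nul-terminated string.

```c
#include <stddef.h>

/* Size of the array the first argument is assumed to point to.  */
#define A_SIZE 4

/* Compare the string in a 4-byte array A against B.  Because A must be
   nul-terminated within its A_SIZE bytes, its length is at most
   A_SIZE - 1, so the loop needs at most A_SIZE iterations.  */
int
strcmp_a4 (const char *a, const char *b)
{
  for (size_t i = 0; i < A_SIZE; i++)
    {
      /* Two loads per position: neither string is a constant.  */
      unsigned char ca = (unsigned char) a[i];
      unsigned char cb = (unsigned char) b[i];
      if (ca != cb)
        return (int) ca - (int) cb;
      if (ca == 0)
        return 0;
    }
  /* Not reached when A is nul-terminated within A_SIZE bytes.  */
  return 0;
}
```

The loop never reads past the terminating nul of either string, because it stops at the first differing byte or at a shared nul.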

Qing
> 
> Martin



Re: [PATCH][Middle-end]3rd patch of PR78809

2018-07-10 Thread Qing Zhao


> On Jul 9, 2018, at 3:25 PM, Martin Sebor  wrote:
> 
> check_access() calls warning_at() to issue warnings, and that
> function only issues warnings if they are enabled, so the guard
> isn't necessary to make it work this way.

Okay, I see.

Then, in the current code (for routine expand_builtin_memcmp):

  /* Diagnose calls where the specified length exceeds the size of either
     object.  */
  if (warn_stringop_overflow)
    {
      tree size = compute_objsize (arg1, 0);
      if (check_access (exp, /*dst=*/NULL_TREE, /*src=*/NULL_TREE, len,
                        /*maxread=*/NULL_TREE, size, /*objsize=*/NULL_TREE))
        {
          size = compute_objsize (arg2, 0);
          check_access (exp, /*dst=*/NULL_TREE, /*src=*/NULL_TREE, len,
                        /*maxread=*/NULL_TREE, size, /*objsize=*/NULL_TREE);
        }
    }

is the above condition on the variable warn_stringop_overflow unnecessary?
All the warnings inside check_access are controlled by OPT_Wstringop_overflow_.

Can I safely delete the above condition, if (warn_stringop_overflow)?

> 
>>> Beyond that, an enhancement to this optimization that might
>>> be worth considering is inlining even non-constant calls
>>> with array arguments whose size is no greater than the limit.
>>> As in:
>>> 
>>> extern char a[4], *b;
>>> 
>>> int n = strcmp (a, b);
>>> 
>>> Because strcmp arguments are required to be nul-terminated
>>> strings, a's length above must be at most 3.  This is analogous
>>> to similar optimizations GCC performs, such as folding to zero
>>> calls to strlen() with one-element arrays.
>> 
>> Yes, I agree that this will be another good enhancement to the strcmp 
>> inlining.
>> 
>> however, it’s not easy to be integrated with my current patch.  The major 
>> issue is:
>> 
>>   The inlined code for the strcmp call without string constant will be 
>> different than the inlined code for the
>> strcmp call with string constant,  then:
>> 
>>  1. the default value for the threshold that control the maximum length 
>> of the string length for inlining will
>> be different than the one for the strcmp call with string constant,  more 
>> experiments need to be run and a new parameter
>> need to be added to control this;
>>  2. the inlined transformed code will be different than the current one.
>> 
>> based on the above, I’d like to open a new PR to record this new enhancement 
>> and finish it with a new patch later.
>> 
>> what’s your opinion on this?
> 
> I'm not sure I see the issues above as problems and I would expect
> the non-constant optimization to naturally handle the constant case
> as well.  But if you prefer it that way, implementing the non-constant
> optimization in a separate step sounds reasonable to me.  It's your
> call.

The inlined code for a call to strcmp with a constant string needs only one
load instruction per byte, but for a call to strcmp without a constant
string there will be two load instructions per byte.
So the run-time performance impact will be different, and
we need separate default values for the maximum string length that enables
the transformation.

I will create a PR for this and add a new patch after this one.
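
To illustrate the one-load-per-byte point, here is a hand-written C equivalent of strcmp (s, "ab"). It is a sketch of the shape of the expansion only, not the RTL the patch generates, and the function name strcmp_with_ab is made up for this example.

```c
/* Hand-expanded equivalent of strcmp (s, "ab").  The bytes of the
   constant string appear as immediates, so only S is loaded from
   memory: one load per compared position.  */
int
strcmp_with_ab (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;
  int r;

  r = (int) p[0] - 'a';
  if (r != 0)
    return r;
  /* p[0] was 'a', so p[1] is still inside the string (or its nul).  */
  r = (int) p[1] - 'b';
  if (r != 0)
    return r;
  /* Compare against the terminating nul of the constant.  */
  return (int) p[2] - 0;
}
```

When both arguments are variables, each position needs a load from both strings, which is why a separate length threshold seems reasonable for that case.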

thanks.

Qing



Re: [PATCH][Middle-end]3rd patch of PR78809

2018-07-10 Thread Qing Zhao
Richard and Martin,

thanks for the info.

> On Jul 10, 2018, at 11:29 AM, Richard Biener  wrote:
>> Is the above condition on variable warn_stringop_overflow unnecessary?
>> all the warnings inside check_access are controlled by
>> OPT_Wstringop_overflow_.
> 
> Well, the condition certainly saves compile time. 



> On Jul 10, 2018, at 11:55 AM, Martin Sebor  wrote:
>> 
>> Is the above condition on variable warn_stringop_overflow unnecessary?
>> all the warnings inside check_access are controlled by 
>> OPT_Wstringop_overflow_.
>> 
>> can I safely delete the above condition if (warn_stringop_overflow)?
> 
> I think the check above is only there to avoid the overhead
> of the two calls to compute_objsize and check_access.  There
> are a few more like it in other functions in the file and
> they all should be safe to remove, but also safe to keep.
> (Some of them might make it easy to inadvertently introduce
> a dependency between the warning option and an optimization
> so that's something to consider.)

Currently, the condition is there to save compilation time.
However, for my patch, I need the return value of check_access to control
whether to invoke inlining or not; therefore, check_access must always be
called for code generation. The condition needs to be deleted.

Let me know if I am still missing anything here.
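
A toy model of the principle under discussion (warning options must not change code generation) might look like the following. All names here (check_access_model, should_inline, warn_enabled) are hypothetical stand-ins and not GCC's real functions or signatures. The point is only that validation always runs and the warning flag affects nothing but the diagnostic output.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of a checker: validate unconditionally, print a
   diagnostic only when WARN_ENABLED is true, and return the result of
   the validation itself.  */
bool
check_access_model (bool warn_enabled, size_t len, size_t objsize)
{
  bool ok = len <= objsize;
  if (!ok && warn_enabled)
    fprintf (stderr, "warning: length %zu exceeds object size %zu\n",
             len, objsize);
  return ok;
}

/* Decide whether to inline a comparison of LEN bytes over two objects.
   The decision depends only on the validation result, never on
   WARN_ENABLED.  */
bool
should_inline (bool warn_enabled, size_t len, size_t size1, size_t size2)
{
  bool no_overflow = check_access_model (warn_enabled, len, size1);
  if (no_overflow)
    no_overflow = check_access_model (warn_enabled, len, size2);
  return no_overflow;
}
```

Calling should_inline with the warning flag on or off gives the same answer for the same sizes, which is the property a test for this dependency would check.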

>>>> based on the above, I’d like to open a new PR to record this new
>>>> enhancement and finish it with a new patch later.
>>>>
>>>> what’s your opinion on this?
>>> 
>>> I'm not sure I see the issues above as problems and I would expect
>>> the non-constant optimization to naturally handle the constant case
>>> as well.  But if you prefer it that way, implementing the non-constant
>>> optimization in a separate step sounds reasonable to me.  It's your
>>> call.
>> 
>> the inlined code for call to strcmp with constant string will only have one 
>> load instruction for each byte, but for call to strcmp
>> without constant string, there will be  two load instructions for each byte. 
>>  So, the run time performance impact will be different.
>> we need separate default values of the maximum length of the string to 
>> enable the transformation.
> 
> You're right, that's true for builtins.c where all we have to
> work with is arrays with unknown contents and string literals.
> The strlen pass, on the other hand, has access to the lengths
> of even unknown strings.  That suggests that an even better
> place for the optimization might be the strlen pass where
> the folding could happen earlier and at a higher level, which
> might even obviate having to worry about the constant vs non-
> constant handling.

Yes, it looks like inlining calls to strcmp where all strings are variable
might need to be done in the strlen pass in order to get the necessary
information.

In addition, I still feel that these two inlinings could be separated.
The generated code for inlining a call to strcmp with a constant string
could be more optimal than that for a call to strcmp without constant
strings; the cost models are different.

I just created a PR for this work:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86467

> 
>> I will create a PR on this and add a new patch after this one.
> 
> Sure, one step at a time makes sense.  I don't think there is
> any harm in having the optimization in two places: builtins.c
> and strlen.

Thanks a lot for your suggestions.

Qing



[PATCH][Middle-end][version 3]3rd patch of PR78809

2018-07-11 Thread Qing Zhao
Hi,

This is the 3rd version of the patch for the last part of PR78809.

The major change in this version addresses the following concern raised by
Martin:

> One of the basic design principles that I myself have
> accidentally violated in the past is that warning options
> should not impact the emitted object code.  I don't think
> your patch actually does introduce this dependency by having
> the codegen depend on the result of check_access() -- I'm
> pretty sure the function is designed to do the validation
> irrespective of warning options and return based on
> the result of the validation and not based on whether
> a warning was issued.  But the choice of the variable name,
> no_overflow_warn, suggests that it does, in fact, have this
> effect.  So I would suggest to rename the variable and add
> a test that verifies that this dependency does not exist.

I have addressed this concern as follows, per our discussion:

1. in routine expand_builtin_memcmp,
* delete the condition if (warn_stringop_overflow) before check_access;
* rename the variable that holds the return value of check_access
to no_overflow.

2. in the testsuite, change the new testcase strcmpopt_6.c to inhibit inlining
when check_access detects an error (regardless of whether the warning option
is ON or not).

The following is the new patch, tested on both x86 and aarch64 with no regressions.

Okay for trunk?

thanks.

Qing

gcc/ChangeLog:

+2018-07-11  Qing Zhao  
+
+   PR middle-end/78809
+   * builtins.c (expand_builtin_memcmp): Inline the calls first
+   when result_eq is false.
+   (expand_builtin_strcmp): Inline the calls first.
+   (expand_builtin_strncmp): Likewise.
+   (inline_string_cmp): New routine. Expand a string compare 
+   call by using a sequence of char comparison.
+   (inline_expand_builtin_string_cmp): New routine. Inline expansion
+   a call to str(n)cmp/memcmp.
+   * doc/invoke.texi (--param builtin-string-cmp-inline-length): New 
option.
+   * params.def (BUILTIN_STRING_CMP_INLINE_LENGTH): New.
+

gcc/testsuite/ChangeLog:

+2018-07-11  Qing Zhao  
+
+   PR middle-end/78809
+   * gcc.dg/strcmpopt_5.c: New test.
+   * gcc.dg/strcmpopt_6.c: New test.
+



0001-3nd-Patch-for-PR78009.patch
Description: Binary data


> On Jul 5, 2018, at 10:46 AM, Qing Zhao  wrote:
> 
> Hi,
> 
> I have sent two emails with the updated patches on 7/3:
> 
> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00065.html
> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00070.html
> 
> however, these 2 emails  were not successfully forwarded to the 
> gcc-patches@gcc.gnu.org mailing list.
> 
> So, I am sending the same email again in this one, hopefully this time it can 
> go through.
> Qing
> 
> Hi, Jeff,
> 
> thanks a lot for your review and comments.
> 
> I have addressed your comments, updated the patch, and retested on both
> aarch64 and x86.
> 
> The major changes in this version compared to the previous version are:
> 
>   1. in routine expand_builtin_memcmp:
> * move the inlining transformation AFTER the warning is issued for
> -Wstringop-overflow;
> * only apply inlining when no warning is issued.
>   2. in the testsuite, add a new testcase strcmpopt_6.c for this case.
>   3. update comments to:
> * capitalize the first word.
> * capitalize all the arguments.
> 
> NOTE: the routines expand_builtin_strcmp and expand_builtin_strncmp are not
> changed.
> The reason is that there is currently NO overflow checking for these two
> routines.
> If we need overflow checking for them, I think a separate patch is needed.
> If so, let me know and I can work on that separate patch to issue
> warnings for strcmp/strncmp when -Wstringop-overflow is specified.



Re: [PATCH][Middle-end][version 3]3rd patch of PR78809

2018-07-13 Thread Qing Zhao
Thank you.

The patch was just committed to trunk as:

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=262636

Qing
> On Jul 12, 2018, at 12:03 PM, Jeff Law  wrote:
> 
>> 
>> gcc/ChangeLog:
>> 
>> +2018-07-11  Qing Zhao  
>> +
>> +PR middle-end/78809
>> +* builtins.c (expand_builtin_memcmp): Inline the calls first
>> +when result_eq is false.
>> +(expand_builtin_strcmp): Inline the calls first.
>> +(expand_builtin_strncmp): Likewise.
>> +(inline_string_cmp): New routine. Expand a string compare 
>> +call by using a sequence of char comparison.
>> +(inline_expand_builtin_string_cmp): New routine. Inline expansion
>> +a call to str(n)cmp/memcmp.
>> +* doc/invoke.texi (--param builtin-string-cmp-inline-length): New 
>> option.
>> +    * params.def (BUILTIN_STRING_CMP_INLINE_LENGTH): New.
>> +
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> +2018-07-11  Qing Zhao  
>> +
>> +PR middle-end/78809
>> +* gcc.dg/strcmpopt_5.c: New test.
>> +* gcc.dg/strcmpopt_6.c: New test.
> OK
> THanks
> 
> Jeff



[PATCH][Middle-end]change char type to unsigned char type when expanding strcmp/strncmp

2018-07-19 Thread Qing Zhao
Hi, 

As Wilco mentioned in PR78809 after I checked in the last part of 
implementation of inline strcmp:

See  http://www.iso-9899.info/n1570.html
 section 7.24.4:

"The sign of a nonzero value returned by the comparison functions memcmp, 
strcmp, and strncmp is determined 
by the sign of the difference between the values of the first pair of 
characters (both interpreted as unsigned char)
 that differ in the objects being compared."

Currently, my implementation uses the char type when expanding
strcmp/strncmp, and unsigned char when expanding memcmp.

According to the C standard, we should use unsigned char for all of
strcmp/strncmp/memcmp.
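
A small sketch of why the element type matters for the sign of the result. The helper names diff_as_signed and diff_as_unsigned are made up for this illustration. The conversion of an out-of-range value to signed char is implementation-defined, and the example assumes the usual two's-complement wraparound that GCC documents.

```c
/* Difference of two bytes when each is interpreted as signed char.
   Assumes the implementation-defined conversion wraps modulo 256.  */
int
diff_as_signed (unsigned char a, unsigned char b)
{
  return (int) (signed char) a - (int) (signed char) b;
}

/* Difference of two bytes interpreted as unsigned char, which is what
   C11 7.24.4 requires for memcmp, strcmp and strncmp.  */
int
diff_as_unsigned (unsigned char a, unsigned char b)
{
  return (int) a - (int) b;
}
```

For the bytes 0x80 and 0x01, the unsigned interpretation gives a positive result while the signed one gives a negative result, so an expansion using plain char on a target where char is signed can return the wrong sign.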

The change is quite simple, and I have tested it on x86, aarch64 and PowerPC
with no regressions.

Okay for trunk?

Qing

gcc/ChangeLog:

+2018-07-19  Qing Zhao  
+
+   * builtins.c (expand_builtin_memcmp): Delete the last parameter for
+   call to inline_expand_builtin_string_cmp.
+   (expand_builtin_strcmp): Likewise.
+   (expand_builtin_strncmp): Likewise.
+   (inline_string_cmp): Delete the last parameter, change char_type_node
+   to unsigned_char_type_node for strcmp/strncmp;
+   (inline_expand_builtin_string_cmp): Delete the last parameter.
+


78809C_uchar.patch
Description: Binary data


Re: [PATCH][Middle-end]change char type to unsigned char type when expanding strcmp/strncmp

2018-07-19 Thread Qing Zhao
Jakub,

thanks a lot for your review and comments.

> On Jul 19, 2018, at 12:31 PM, Jakub Jelinek  wrote:
> 
> On Thu, Jul 19, 2018 at 11:49:16AM -0500, Qing Zhao wrote:
>> As Wilco mentioned in PR78809 after I checked in the last part of 
>> implementation of inline strcmp:
>> 
>> See  http://www.iso-9899.info/n1570.html
>> section 7.24.4:
>> 
>> "The sign of a nonzero value returned by the comparison functions memcmp, 
>> strcmp, and strncmp is determined 
>> by the sign of the difference between the values of the first pair of 
>> characters (both interpreted as unsigned char)
>> that differ in the objects being compared."
>> 
>> currently, in my implementation, I used char type when expanding 
>> strcmp/strncmp, and unsigned char when expanding
>> memcmp.
>> 
>> from the C standard, we should use unsigned char for all 
>> strcmp/strncmp/memcmp.
>> 
>> the change is quite simple, and I have tested it on X86, aarch64 and 
>> powerPC, no regressions.
>> 
>> Okay for trunk?
> 
> If you expand it as (int) ((unsigned char *)p)[n] - (int) ((unsigned char 
> *)q)[n]
> then aren't you relying on int type to have wider precision than unsigned char
> (or unit_mode being narrower than mode)?

Do you mean that we should only expand it as (int) ((unsigned char *)p)[n] -
(int) ((unsigned char *)q)[n] when we are sure that the
int type is wider than unsigned char?

>  I don't see anywhere where you'd
> give up on doing the inline expansion on targets where e.g. lowest
> addressable unit would be 16-bit and int would be 16-bit too.

Even on these targets, is the char type still 8-bit?
If so, is the int type still wider than char?

> On targets where int is as wide as char, one would need to expand it instead
> as something like:
> if (((unsigned char *)p)[n] == ((unsigned char *)q)[n]) loop;
> ret = ((unsigned char *)p)[n] < ((unsigned char *)q)[n] ? -1 : 1;
> or similar or just use the library routine.


Even when the int type is as wide as char, expanding it as (int) ((unsigned char
*)p)[n] - (int) ((unsigned char *)q)[n]
should still be correct (even if not optimal), shouldn’t it?

Am I missing anything in this part?
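
The comparison-based form Jakub describes above, which does not rely on int being wider than unsigned char, can be sketched as a plain C loop. The function name bytes_cmp_no_widen is hypothetical and this is not the code GCC emits; it only shows the shape of the alternative.

```c
#include <stddef.h>

/* Compare N bytes of P and Q without subtracting.  The result is
   derived from an unsigned comparison, so it stays correct even if int
   is no wider than unsigned char.  */
int
bytes_cmp_no_widen (const void *p, const void *q, size_t n)
{
  const unsigned char *a = p;
  const unsigned char *b = q;

  for (size_t i = 0; i < n; i++)
    if (a[i] != b[i])
      return a[i] < b[i] ? -1 : 1;
  return 0;
}
```

The trade-off is an extra comparison and select per differing position instead of a single subtraction.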

> 
> Also:
>  var_rtx
>= adjust_address (var_rtx_array, TYPE_MODE (unit_type_node), offset);
>  const_rtx = c_readstr (const_str + offset, unit_mode);
>  rtx op0 = (const_str_n == 1) ? const_rtx : var_rtx;
>  rtx op1 = (const_str_n == 1) ? var_rtx : const_rtx;
> 
>  result = expand_simple_binop (mode, MINUS, op0, op1,
>result, is_memcmp ? 1 : 0, OPTAB_WIDEN);
> doesn't look correct to me, var_rtx and const_rtx here are in unit_mode,
> you need to convert those to mode before you can use those in
> expand_simple_binop, using
>  op0 = convert_modes (mode, unit_mode, op0, 1);
>  op1 = convert_modes (mode, unit_mode, op1, 1);
> before the expand_simple_binop.
> While expand_simple_binop is called with an unsignedp argument, that is
> meant for the cases where the expansion needs to widen it further, not for
> calling expand_simple_binop with arguments with known incorrect mode;
> furthermore, one of them being CONST_INT which has VOIDmode.

Thank you for raising this issue. Yes, I will update this part of the code as
you suggested.

Qing



Re: [PATCH][Middle-end]change char type to unsigned char type when expanding strcmp/strncmp

2018-07-19 Thread Qing Zhao


> On Jul 19, 2018, at 2:24 PM, Jakub Jelinek  wrote:
> 
> On Thu, Jul 19, 2018 at 02:06:16PM -0500, Qing Zhao wrote:
>>> If you expand it as (int) ((unsigned char *)p)[n] - (int) ((unsigned char 
>>> *)q)[n]
>>> then aren't you relying on int type to have wider precision than unsigned 
>>> char
>>> (or unit_mode being narrower than mode)?
>> 
>> do you imply that we should only expand it as  (int) ((unsigned char *)p)[n] 
>> - (int) ((unsigned char *)q)[n] when we are sure
>> int type is wider than unsigned char? 
> 
> Yes.
> 
>>> I don't see anywhere where you'd
>>> give up on doing the inline expansion on targets where e.g. lowest
>>> addressable unit would be 16-bit and int would be 16-bit too.
>> 
>> even on this targets, is char type still 8-bit?
>> then int type is still wider than char?
> 
> C requires that int is at least 16-bit wide, so the sizeof (int) == sizeof
> (char) case is only possible say with 16-bit char and 16-bit int, or 32-bit
> char and 32-bit int etc.
> 
>>> On targets where int is as wide as char, one would need to expand it instead
>>> as something like:
>>> if (((unsigned char *)p)[n] == ((unsigned char *)q)[n]) loop;
>>> ret = ((unsigned char *)p)[n] < ((unsigned char *)q)[n] ? -1 : 1;
>>> or similar or just use the library routine.
>> 
>> 
>> even when int type is as wide as char,  expand it as (int) ((unsigned char 
>> *)p)[n] - (int) ((unsigned char *)q)[n]
>> should still be correct (even though not optimal), doesn’t it?
> 
> No.  Consider p[n] being e.g. 1 and q[n] being __SCHAR_MAX__ + 3U and 16-bit
> int and 16-bit char.  Then (unsigned char) 0x0001 < (unsigned char) 0x8002,
> so it should return a negative number.  But (int) (0x0001U - 0x8002U) is
> 0x7fff, which is a positive int.  Now, if int is 17-bit and char is 16-bit,
> this works fine, because is then -0x8001 and thus negative.

Okay, I see now.
I really appreciate your detailed explanation.
> 
> The above really works only if int is at least one bit wider than unsigned
> char.

Then I will add a check to exclude the inlining when int is NOT wider than
unsigned char on the target.
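
Jakub's 16-bit example can be reproduced on an ordinary host by modelling the 16-bit char with uint16_t. This is only a simulation under that assumption; the helper names diff_same_width and diff_wider are made up for the illustration.

```c
#include <stdint.h>

/* Subtract and keep the result in a type exactly as wide as the
   operands (models a target where int and char are both 16-bit).  */
int16_t
diff_same_width (uint16_t a, uint16_t b)
{
  return (int16_t) (uint16_t) (a - b);
}

/* Subtract in a type wider than the operands (models a target where
   int has more precision than unsigned char).  */
int32_t
diff_wider (uint16_t a, uint16_t b)
{
  return (int32_t) a - (int32_t) b;
}
```

For a = 0x0001 and b = 0x8002, diff_same_width yields 0x7fff, which is positive even though a < b, while diff_wider yields -0x8001 and has the correct sign.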

Is the following the correct check (exp is the call to strcmp)?

 if (CHAR_TYPE_SIZE >= TYPE_PRECISION (TREE_TYPE (exp)))

 
Thanks.

Qing




[PATCH][Middle-end][version 2]change char type to unsigned char type when expanding strcmp/strncmp

2018-07-20 Thread Qing Zhao
Hi,

This is the 2nd version of the change; it mainly addresses Jakub’s comments:

1. Give up the inlining expansion for strcmp/strncmp/memcmp on a target
where the type of the call has the same or narrower precision than unsigned
char.
2. Add conversions to the two operands before expand_simple_binop.

and
3. Also update the comments of routine inline_string_cmp to reflect the conversions
in the expanded code.

I have tested on x86 and aarch64. No regressions.

Okay for trunk?

Qing

gcc/ChangeLog:

+2018-07-20  Qing Zhao  
+
+   * builtins.c (expand_builtin_memcmp): Delete the last parameter for
+   call to inline_expand_builtin_string_cmp.
+   (expand_builtin_strcmp): Likewise.
+   (expand_builtin_strncmp): Likewise.
+   (inline_string_cmp): Delete the last parameter, change char_type_node
+   to unsigned_char_type_node for strcmp/strncmp, add conversions to the
+   two operands.
+   (inline_expand_builtin_string_cmp): Delete the last parameter, give up
+   the inlining expansion on target where the type of the call has same or 
+   narrower presicion than unsigned char.
+



78809C_uchar.patch
Description: Binary data


> On Jul 19, 2018, at 12:31 PM, Jakub Jelinek  wrote:
> 
> If you expand it as (int) ((unsigned char *)p)[n] - (int) ((unsigned char 
> *)q)[n]
> then aren't you relying on int type to have wider precision than unsigned char
> (or unit_mode being narrower than mode)?  I don't see anywhere where you'd
> give up on doing the inline expansion on targets where e.g. lowest
> addressable unit would be 16-bit and int would be 16-bit too.
> On targets where int is as wide as char, one would need to expand it instead
> as something like:
> if (((unsigned char *)p)[n] == ((unsigned char *)q)[n]) loop;
> ret = ((unsigned char *)p)[n] < ((unsigned char *)q)[n] ? -1 : 1;
> or similar or just use the library routine.
> 
> Also:
>  var_rtx
>= adjust_address (var_rtx_array, TYPE_MODE (unit_type_node), offset);
>  const_rtx = c_readstr (const_str + offset, unit_mode);
>  rtx op0 = (const_str_n == 1) ? const_rtx : var_rtx;
>  rtx op1 = (const_str_n == 1) ? var_rtx : const_rtx;
> 
>  result = expand_simple_binop (mode, MINUS, op0, op1,
>result, is_memcmp ? 1 : 0, OPTAB_WIDEN);
> doesn't look correct to me, var_rtx and const_rtx here are in unit_mode,
> you need to convert those to mode before you can use those in
> expand_simple_binop, using
>  op0 = convert_modes (mode, unit_mode, op0, 1);
>  op1 = convert_modes (mode, unit_mode, op1, 1);
> before the expand_simple_binop.
> While expand_simple_binop is called with an unsignedp argument, that is
> meant for the cases where the expansion needs to widen it further, not for
> calling expand_simple_binop with arguments with known incorrect mode;
> furthermore, one of them being CONST_INT which has VOIDmode.
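
The compare-and-select form suggested above for targets where int is as wide as char can be written out at the source level. This is a memcmp-style sketch under that assumption, not the RTL expansion in builtins.c; the function name is hypothetical.

```c
#include <stddef.h>

/* Compare n bytes as unsigned char and return -1, 0 or 1 without
   subtracting, so the result does not rely on int being wider than
   unsigned char.  */
static int
cmp_bytes_no_subtract (const void *p, const void *q, size_t n)
{
  const unsigned char *a = p;
  const unsigned char *b = q;

  for (size_t i = 0; i < n; i++)
    if (a[i] != b[i])
      return a[i] < b[i] ? -1 : 1;   /* first differing byte decides */
  return 0;                          /* all n bytes are equal */
}
```

Only the sign of the result is meaningful here, which is all that strcmp, strncmp and memcmp promise.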



Re: [PATCH][Middle-end][version 2]change char type to unsigned char type when expanding strcmp/strncmp

2018-07-20 Thread Qing Zhao


> On Jul 20, 2018, at 9:59 AM, Jakub Jelinek  wrote:
> 
> On Fri, Jul 20, 2018 at 09:53:24AM -0500, Qing Zhao wrote:
>> +2018-07-20  Qing Zhao  
>> +
>> +   * builtins.c (expand_builtin_memcmp): Delete the last parameter for
>> +   call to inline_expand_builtin_string_cmp.
>> +   (expand_builtin_strcmp): Likewise.
>> +   (expand_builtin_strncmp): Likewise.
>> +   (inline_string_cmp): Delete the last parameter, change char_type_node
>> +   to unsigned_char_type_node for strcmp/strncmp, add conversions to the
>> +   two operands.
>> +   (inline_expand_builtin_string_cmp): Delete the last parameter, give 
>> up
>> +   the inlining expansion on target where the type of the call has same 
>> or 
>> +   narrower presicion than unsigned char.
> 
> s/presicion/precision/
> 
> Also in the patch, where there is another typo, s/of/or/.

Okay.
> 
> Ok for trunk with that fixed.

thanks a lot for the review.

Qing
> 
>   Jakub



Re: [PATCH][Middle-end][version 2]change char type to unsigned char type when expanding strcmp/strncmp

2018-07-20 Thread Qing Zhao
the patch was committed as:

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=262907

thanks.

Qing
> On Jul 20, 2018, at 9:59 AM, Jakub Jelinek  wrote:
> 
> On Fri, Jul 20, 2018 at 09:53:24AM -0500, Qing Zhao wrote:
>> +2018-07-20  Qing Zhao  
>> +
>> +   * builtins.c (expand_builtin_memcmp): Delete the last parameter for
>> +   call to inline_expand_builtin_string_cmp.
>> +   (expand_builtin_strcmp): Likewise.
>> +   (expand_builtin_strncmp): Likewise.
>> +   (inline_string_cmp): Delete the last parameter, change char_type_node
>> +   to unsigned_char_type_node for strcmp/strncmp, add conversions to the
>> +   two operands.
>> +   (inline_expand_builtin_string_cmp): Delete the last parameter, give 
>> up
>> +   the inlining expansion on target where the type of the call has same 
>> or 
>> +   narrower presicion than unsigned char.
> 
> s/presicion/precision/
> 
> Also in the patch, where there is another typo, s/of/or/.
> 
> Ok for trunk with that fixed.
> 
>   Jakub



[PATCH][Middle-end] disable strcmp/strncmp inlining with O2 below and Os

2018-07-25 Thread Qing Zhao
Hi,

As Wilco suggested, the newly added strcmp/strncmp inlining should only be 
enabled at -O2 and above.

This is a simple patch for this change.

Tested on both x86 and aarch64.

Okay for trunk?

Qing

gcc/ChangeLog:

+2018-07-25  Qing Zhao  
+
+   * builtins.c (inline_expand_builtin_string_cmp): Disable inlining
+   when optimization level is lower than 2 or optimize for size.
+   

gcc/testsuite/ChangeLog:

+2018-07-25  Qing Zhao  
+
+   * gcc.dg/strcmpopt_5.c: Change to O2 to enable the transformation.
+   * gcc.dg/strcmpopt_6.c: Likewise.
+



78809_O2.patch
Description: Binary data


Re: [PATCH][Middle-end] disable strcmp/strncmp inlining with O2 below and Os

2018-07-26 Thread Qing Zhao


> On Jul 26, 2018, at 3:26 AM, Richard Biener  wrote:
> 
> On Wed, 25 Jul 2018, Qing Zhao wrote:
> 
>> Hi,
>> 
>> As Wilco suggested, the new added strcmp/strncmp inlining should be only 
>> enabled with O2 and above.
>> 
>> this is the simple patch for this change.
>> 
>> tested on both X86 and aarch64.
>> 
>> Okay for thunk?
> 
> You should simply use
> 
>  if (optimize_insn_for_size_p ())
>return NULL_RTX;
> 
> to be properly profile-aware.  OK with that change.

thanks for the review.

I will make the change, retest it, and then commit it.

Qing
> 
> Richard.
> 



Re: Improve -Wflex-array-member-not-at-end changes.html wording |Plus: and warning bug? (was: [V2][PATCH] gcc-14/changes.html: Deprecate a GCC C extension on flexible array members.)

2023-10-19 Thread Qing Zhao
Hi, Tobias,

Sorry for the late reply (just came back from a long vacation after Cauldron).

And thank you for reporting this issue.

Please see my reply embedded below:

> On Sep 25, 2023, at 2:24 PM, Tobias Burnus  wrote:
> 
> Hi all,
> 
> I stumbled over this as I found the wording in the release notes rather 
> unclear.
> 
> 
> First, the following gives only a -pedantic warning and not a 
> -Wflex-array-member-not-at-end:
> 
>  struct t { int b; int x[]; };
>  struct q { int b; struct t a[2]; int c; };
> 
> warning: invalid use of structure with flexible array member [-Wpedantic]
> 
> If I remove the "[2]", it shows additionally:
>  warning: structure containing a flexible array member is not at the end of 
> another structure [-Wflex-array-member-not-at-end]

I think that the above behavior is correct, as Richard mentioned previously.  :-)

First, per C99, a structure with a flexible array member cannot be an element 
of an array. 
Therefore, struct t a[2] is an incorrect usage, and the warning:

warning: invalid use of structure with flexible array member [-Wpedantic]

is complaining about this standard violation.  The diagnostic message 
might need to be more specific, though, like the following:

warning: invalid use of structure with flexible array member as array element 
[-Wpedantic]

Then, after fixing this standard violation, the test case is as 
follows:

 struct t { int b; int x[]; };
 struct q { int b; struct t a; int c; };

Adding -Wflex-array-member-not-at-end will report the expected message:

/home/opc/Install/latest/bin/gcc -O  -Wflex-array-member-not-at-end t.c -S
t.c:2:29: warning: structure containing a flexible array member is not at the 
end of another structure [-Wflex-array-member-not-at-end]
2 |  struct q { int b; struct t a; int c; };
  | ^

So, I think that GCC’s behavior is correct. 

However, we might need to make the diagnostic message more accurate here (I can 
submit a small patch to improve this). 
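
For illustration, one way a user could silence the new warning is to move the structure with the flexible array member to the end of the enclosing structure. This is a sketch that assumes the field order is free to change (often not the case for ABI-fixed layouts); struct q_last is a hypothetical name. Nesting a structure with a flexible array member inside another structure remains an extension, so -pedantic may still warn.

```c
#include <stddef.h>

struct t { int b; int x[]; };   /* C99 flexible array member */

/* The structure containing the flexible array member is the last field,
   so no other field is laid out after the flexible array.  */
struct q_last { int b; int c; struct t a; };
```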

> 
> It seems as if it should print latter warning also inside the struct.
> 
> Qing? Joseph? Thoughts?
> 
> * * *
> 
> Secondly, if this is deprecated, shouldn't then the warning enabled by, e.g., 
> -Wall or made
> otherwise more prominent? (-std=?) - Currently, one either has to find the 
> new flag or use
> -pedantic.

Yes, agreed. However, I think that it might be better to delay this to the next 
GCC release to give users plenty of time to fix all the 
-Wflex-array-member-not-at-end warnings.  As far as I know, the Linux kernel 
exposed a lot of warnings when adding -Wflex-array-member-not-at-end, and kernel 
people are trying to fix all these warnings in the source base.  

> 
> Or is this not really regarded as deprecated? But then (IMHO) we should not 
> really claim so and just
> add the warning without deprecation.

I think that our final goal is to deprecate this ambiguous extension from GCC 
completely, but we need time to migrate users step by step. 
> 
> BTW; clang-15 prints the -Wgnu-variable-sized-type-not-at-end warning by 
> default.
> 
> Joseph, all: Thoughts?
> 
> * * *
> 
> Cross ref: The patch adding the new warning is r14-2197-g070a6bf0bdc6761
> https://gcc.gnu.org/pipermail/gcc-cvs/2023-June/385730.html (cf. previously 
> in this thread)
> 
> 
> * * *
> 
> Regarding the changes.html wording:
> 
> On 07.08.23 16:22, Qing Zhao via Gcc-patches wrote:
> 
>> Comparing to the 1st version, the only change is to address Richard's
>> comment on refering a warning option for diagnosing deprecated behavior.
> ...
>> +++ b/htdocs/gcc-14/changes.html
>> @@ -30,7 +30,18 @@ a work-in-progress.
>>  
>>  Caveats
>>  
>> -  ...
>> +  C:
>> +  Support for the GCC extension, a structure containing a C99 flexible 
>> array
>> +  member, or a union containing such a structure, is not the last field 
>> of
>> +  another structure, is deprecated. Refer to
>> +  https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html";>
>> +  Zero Length Arrays.
> 
> ...
> 
> I find the first sentence difficult to read. What do you think of the 
> following?
> (It is hard to come up with some good wording.)
> 
> --- a/htdocs/gcc-14/changes.html
> +++ b/htdocs/gcc-14/changes.html
> @@ -31,9 +31,10 @@ a work-in-progress.
> Caveats
> 
>   C:
> -  Support for the GCC extension, a structure containing a C99 flexible 
> array
> -  member, or a union containing such a structure, is not the last field 
> of
> -  another structure, is deprecated. Refer to
> +  Support for the GCC extension that a structure containing a C99 
> flexible
> +  array (and any union containing a member of such

HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-20 Thread Qing Zhao
Sid,

(Richard, can you please help me confirm this? Thanks a lot)

I studied a little bit more on the following question you raised during the 
review process:

For the following small testing case: 

  1 struct annotated {
  2   int foo;
  3   char array[] __attribute__((counted_by (foo)));
  4 };
  5 
  6 extern struct annotated * alloc_buf (int);
  7 
  8 int test (int sz)
  9 {
 10   struct annotated * array_annotated = alloc_buf (sz);
 11   array_annotated->foo = sz;
 12   return __builtin_dynamic_object_size (array_annotated->array, 1);
 13 }

Can the assignment of the size to the counted_by field at line 11 and the 
consumer of the size at line 12, the call to __bdos, be reordered by GCC? 

The following are my thoughts:

1. The __bdos computation passes (both pass_early_object_sizes and 
pass_object_sizes) are in the early stage of the SSA optimizations. Of these, 
pass_early_object_sizes happens before almost all the optimizations, so no 
reordering can have happened by the time it runs;

2. Then how about the pass “pass_object_sizes”?

   Immediately after pass_build_ssa, the IR for the routine “test” in  
SSA form is (compiled with -O3):

  1 int test (int sz)
  2 {
  3   struct annotated * array_annotated;
  4   char[0:] * _1;
  5   long unsigned int _2;
  6   int _8;
  7   
  8:
  9   array_annotated_6 = alloc_buf (sz_4(D));
 10   array_annotated_6->foo = sz_4(D);
 11   _1 = &array_annotated_6->array;
 12   _2 = __builtin_dynamic_object_size (_1, 1);
 13   _8 = (int) _2;
 14   return _8; 
 15 } 

In the above IR, the key portion is line 10 and line 11 (can these two 
lines be reordered by the SSA optimizations?):

 10   array_annotated_6->foo = sz_4(D);
 11   _1 = &array_annotated_6->array;

The major question here is whether the SSA optimizations are able to 
determine that the object “array_annotated_6->foo” at line 10 is independent of
the object “array_annotated_6->array” at line 11.

If the SSA optimizations can distinguish “array_annotated_6->foo” from 
“array_annotated_6->array”, then these two lines might be reordered.
Otherwise, these two lines will not be reordered by SSA optimizations.

I am not very familiar with the details of the SSA optimizations, but my guess 
is that two fields of the same structure might not be distinguished by the SSA 
optimizations, so line 10 and line 11 will not be reordered by SSA 
optimizations.

Richard, is my guess correct?

Thanks a lot for your help.

Qing

>> On Oct 5, 2023, at 4:08 PM, Siddhesh Poyarekar  wrote:
>> 
>> I hope the review was helpful.  Overall, a couple of things to consider:
>> 
>> 1. How would you handle potential reordering between assignment of the size 
>> to the counted_by field with the __bdos call that may consume it? You'll 
>> probably need to express some kind of dependency there or in the worst case, 
>> insert a barrier to disallow reordering.
> 
> Good point! 
> 
> So, your example in the respond to [V3][PATCH 2/3]Use the counted_by atribute 
> info in builtin object size [PR108896]:
> “
> Maybe another test where the allocation, size assignment and __bdos call 
> happen in the same function, where the allocator is not recognized by gcc:
> 
> void *
> __attribute__ ((noinline))
> alloc (size_t sz)
> {
> return __builtin_malloc (sz);
> }
> 
> void test (size_t sz)
> {
> array_annotated = alloc (sz);
> array_annotated->b = sz;
> return __builtin_dynamic_object_size (array_annotated->c, 1);
> }
> 
> The interesting thing to test (and ensure in the codegen) is that the 
> assignment to array_annotated->b does not get reordered to below the 
> __builtin_dynamic_object_size call since technically there is no data 
> dependency between the two.
> “
> Will test on this. 
> 
> Not sure whether the current GCC alias analysis is able to distinguish one 
> field of a structure from another field of the same structure, if YES, then
> We need to add an explicit dependency edge from the write to 
> “array_annotated->b” to the call to 
> “__builtin_dynamic_object_size(array_annotated->c,1)”.
> I will check on this and see how to resolve this issue.
> 
> I guess the possible solution is that we can add an implicit ref to 
> “array_annotated->b” at the call to 
> “__builtin_dynamic_object_size(array_annotated->c, 1)” if the counted_by 
> attribute is available. That should resolve the issue.
> 
> Richard, what do you think on this?
> 



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-20 Thread Qing Zhao


> On Oct 20, 2023, at 2:22 PM, Richard Biener  
> wrote:
> 
> 
> 
>> Am 20.10.2023 um 19:09 schrieb Qing Zhao :
>> 
>> Sid,
>> 
>> (Richard, can you please help me to make sure this? Thanks a lot)
>> 
>> I studied a little bit more on the following question you raised during the 
>> review process:
>> 
>> For the following small testing case: 
>> 
>> 1 struct annotated {
>> 2   int foo;
>> 3   char array[] __attribute__((counted_by (foo)));
>> 4 };
>> 5 
>> 6 extern struct annotated * alloc_buf (int);
>> 7 
>> 8 int test (int sz)
>> 9 {
>> 10   struct annotated * array_annotated = alloc_buf (sz);
>> 11   array_annotated->foo = sz;
>> 12   return __builtin_dynamic_object_size (array_annotated->array, 1);
>> 13 }
>> 
>> Whether the assignment of the size to the counted_by field at line 11 and 
>> the consumer of the size at line 12 at call to __bdos might be reordered by 
>> GCC? 
>> 
>> The following is my thought:
>> 
>> 1. _bdos computation passes (both pass_early_object_sizes and 
>> pass_object_sizes) are in the early stage of SSA optimizations. In which, 
>> pass_early_object_sizes happens before almost all the optimizations, no 
>> reordering is possible in this pass;
>> 
>> 2. Then how about the pass “pass_object_sizes”?
>> 
>>  Immediately after the pass_build_ssa,  the IR for the routine “test” is  
>> with the SSA form: (compiled with -O3):
>> 
>> 1 int test (int sz)
>> 2 {
>> 3   struct annotated * array_annotated;
>> 4   char[0:] * _1;
>> 5   long unsigned int _2;
>> 6   int _8;
>> 7   
>> 8:
>> 9   array_annotated_6 = alloc_buf (sz_4(D));
>> 10   array_annotated_6->foo = sz_4(D);
>> 11   _1 = &array_annotated_6->array;
>> 12   _2 = __builtin_dynamic_object_size (_1, 1);
>> 13   _8 = (int) _2;
>> 14   return _8; 
>> 15 } 
>> 
>> In the above IR, the key portion is line 10 and line 11: (whether these two 
>> lines might be reordered with SSA optimization?)
>> 
>> 10   array_annotated_6->foo = sz_4(D);
>> 11   _1 = &array_annotated_6->array;
>> 
>> The major question here is: whether the SSA optimizations are able to 
>> distinguish the object “array_annotated_6->foo” at line 10 is independent 
>> with
>> the object “array_annotated-_6->array” at line 11?
>> 
>> If the SSA optimizations can distinguish “array_annotated_6->foo” from 
>> “array_annotated_6->array”, then these two lines might be reordered.
>> Otherwise, these two lines will not be reordered by SSA optimizations.
>> 
>> I am not very familiar with the details of the SSA optimizations, but my 
>> guess is, two fields of the same structure might not be distinguished by the 
>> SSA optimizations, then line 10 and line 11 will not be reordered by SSA 
>> optimizations.
>> 
>> Richard, is my guess correct?
> 
> There is no data dependence between the memory access and the address 
> computation so nothing prevents the reordering.  

Okay, I see.  Then:

10   array_annotated_6->foo = sz_4(D);
11   _1 = &array_annotated_6->array;

Line 10 and line 11 could be reordered.

And then
10   array_annotated_6->foo = sz_4(D);
12   _2 = __builtin_dynamic_object_size (_1, 1);

Line 10 and 12 could be reordered too.

Then what’s the best way to add such a data dependence in the IR?

How about the following:

  Add one more parameter to __builtin_dynamic_object_size(), i.e. 

__builtin_dynamic_object_size (_1,1,array_annotated->foo)

when we see that the structure field has the counted_by attribute. 

Then we can enforce such a data dependence and avoid potential reordering.

What’s your opinion? Do you have another suggestion for the solution?

Qing



> If you put another same bos call before the access I expect the addresses to be 
> CSEd, effectively moving the later before the access.
> 
> Richard 
> 
>> Thanks a lot for your help.
>> 
>> Qing
>> 
>>>>> On Oct 5, 2023, at 4:08 PM, Siddhesh Poyarekar  
>>>>> wrote:
>>>> 
>>>> I hope the review was helpful.  Overall, a couple of things to consider:
>>>> 
>>>> 1. How would you handle potential reordering between assignment of the 
>>>> size to the counted_by field with the __bdos call that may consume it? 
>>>> You'll probably need to express some kind of dependency there or in the 
>>>> worst case, insert a barrier to disallow reordering.
>>> 
>>> Good point! 
>>> 
>

Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-20 Thread Qing Zhao


> On Oct 20, 2023, at 2:34 PM, Kees Cook  wrote:
> 
> On Fri, Oct 20, 2023 at 11:50:11AM +0200, Martin Uecker wrote:
>> Am Donnerstag, dem 19.10.2023 um 16:33 -0700 schrieb Kees Cook:
>>> On Wed, Oct 18, 2023 at 09:11:43PM +, Qing Zhao wrote:
>>>> As I replied to Martin in another email, I plan to do the following to 
>>>> resolve this issue:
>>>> 
>>>> 1. No specification for signed or unsigned for counted_by field.
>>>> 2. Add a sanitizer option -fsanitize=counted-by-bound to catch the cases 
>>>> when the size of the counted-by is not positive.
>>> 
>>> I don't understand why this needs to be a runtime sanitizer. The
>>> signedness is known at compile time, so I would expect a -W option.
>> 
>> The signedness of the type but not of the value.
>> 
>> But I would not want to have a warning for signed 
>> counter  types by default because I would prefer
>> to use signed types (for various reasons including
>> better overflow detection).
>> 
>>> Or
>>> do you mean you'd split up -fsanitize=bounds between unsigned and signed
>>> indexes? I'd find that kind of awkward for the kernel... but I feel like
>>> I've misunderstood something. :)
>>> 
>>> -Kees
>> 
>> The idea would be to detect at run-time the case
>> if  x->buf  is used at a time where   x->counter 
>> is negative and also when x->counter * sizeof(x->buf[0])
>> overflows or is too big.
>> 
>> This would be similar to
>> 
>> int a[n];
>> 
>> where it is detected at run-time if n is not-positive.
> 
> Right. I guess what I mean to say is that I would expect this case to
> already be caught by -fsanitize=bounds -- I don't see a reason to add an
> additional sanitizer option.
> 
> struct foo {
>   int count;
>   int array[] __counted_by(count);
> };
> 
>   foo->count = 5;
>   foo->array[0] = 1;  // ok
>   foo->array[10] = 1; // -fsanitize=bounds will catch this
>   foo->array[-10] = 1;// -fsanitize=bounds will catch this too
> 
> 

I just checked this test case with my GCC, and YES, -fsanitize=bounds indeed 
caught this error:

ttt_1.c:31:12: runtime error: index 10 out of bounds for type 'char [*]'
ttt_1.c:32:12: runtime error: index -10 out of bounds for type 'char [*]'
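
For comparison, the condition that the sanitizer is enforcing in Kees's example can be spelled out by hand. This is only a sketch of the check itself, not of how -fsanitize=bounds is implemented; the helper name is hypothetical and the attribute is left out so that the code builds with compilers that lack counted_by.

```c
#include <stdbool.h>

struct foo {
  int count;
  int array[];   /* in the thread this member carries __counted_by(count) */
};

/* True if idx is a valid index into f->array given the current count.  */
static bool
foo_index_in_bounds (const struct foo *f, long idx)
{
  return idx >= 0 && idx < (long) f->count;
}
```

With count set to 5, index 0 passes, while 10 and -10 are both rejected, matching the two runtime errors reported above.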

Qing


> -- 
> Kees Cook



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-20 Thread Qing Zhao


> On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-10-20 14:38, Qing Zhao wrote:
>> How about the following:
>>   Add one more parameter to __builtin_dynamic_object_size(), i.e
>> __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
>> When we see the structure field has counted_by attribute.
> 
> Or maybe add a barrier preventing any assignments to array_annotated->foo 
> from being reordered below the __bdos call? Basically an __asm__ with 
> array_annotated->foo in the clobber list ought to do it I think.

Maybe just adding array_annotated->foo to the use list of the call to 
__builtin_dynamic_object_size would be enough?

But I am not sure how to implement this at the TREE level. Is there a 
USE_LIST/CLOBBER_LIST for each call?  If so, I can simply add the counted_by 
field “array_annotated->foo” to the USE_LIST of the call to __bdos.

This might be the simplest solution?

Qing

> 
> It may not work for something like this though:
> 
> static size_t
> get_size_of (void *ptr)
> {
>  return __bdos (ptr, 1);
> }
> 
> void
> foo (size_t sz)
> {
>  array_annotated = __builtin_malloc (sz);
>  array_annotated->foo = sz;
> 
>  ...
>  __builtin_printf ("%zu\n", get_size_of (array_annotated->foo));
>  ...
> }
> 
> because the call to get_size_of () may not have been inlined that early.
> 
> The more fool-proof alternative may be to put a compile time barrier right 
> below the assignment to array_annotated->foo; I reckon you could do that 
> early in the front end by marking the size identifier and then tracking 
> assignments to that identifier.  That may have a slight runtime performance 
> overhead since it may prevent even legitimate reordering.  I can't think of 
> another alternative at the moment...
> 
> Sid
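
At the source level, the compile-time barrier mentioned above is usually written as an empty GNU-style asm with a "memory" clobber. The sketch below shows what a user could write by hand; whether the compiler should insert something equivalent for counted_by fields is exactly what this thread is debating. The macro and function names are hypothetical.

```c
struct annotated {
  int foo;
  char array[];
};

/* Empty asm with a "memory" clobber (GNU extension): the compiler may not
   move memory accesses across it.  It emits no instructions.  */
#define COMPILER_BARRIER() __asm__ __volatile__ ("" ::: "memory")

/* Store the count, then place a barrier so that later reads of memory
   are not moved above the store by the compiler.  */
static void
set_count (struct annotated *p, int count)
{
  p->foo = count;
  COMPILER_BARRIER ();
}
```

As noted in the message, such a barrier may also block reorderings that would have been legitimate.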



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Qing Zhao


> On Oct 23, 2023, at 3:57 AM, Richard Biener  
> wrote:
> 
> On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao  wrote:
>> 
>> 
>> 
>>> On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar  wrote:
>>> 
>>> On 2023-10-20 14:38, Qing Zhao wrote:
>>>> How about the following:
>>>>  Add one more parameter to __builtin_dynamic_object_size(), i.e
>>>> __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
>>>> When we see the structure field has counted_by attribute.
>>> 
>>> Or maybe add a barrier preventing any assignments to array_annotated->foo 
>>> from being reordered below the __bdos call? Basically an __asm__ with 
>>> array_annotated->foo in the clobber list ought to do it I think.
>> 
>> Maybe just adding the array_annotated->foo to the use list of the call to 
>> __builtin_dynamic_object_size should be enough?
>> 
>> But I am not sure how to implement this in the TREE level, is there a 
>> USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add the 
>> counted_by field “array_annotated->foo” to the USE_LIST of the call to 
>> __bdos?
>> 
>> This might be the simplest solution?
> 
> If the dynamic object size is derived of a field then I think you need to
> put the "load" of that memory location at the point (as argument)
> of the __bos call right at parsing time.  I know that's awkward because
> you try to play tricks "discovering" that field only late, but that's not
> going to work.

Is it better to do this in the gimplification phase instead of the FE? 

VLA decls are handled in the gimplification phase; the size calculation and the 
call to alloca are all generated during this phase (gimplify_vla_decl).

For __bdos calls, we can add an additional argument if the type of the first 
argument includes the counted_by attribute, i.e.

** During gimplification:
For a call to __builtin_dynamic_object_size (ptr, type),
check whether the type of ptr includes the counted_by attribute; if so, change 
the call to
__builtin_dynamic_object_size (ptr, type, counted_by field)

Then the correct data dependence should be represented well in the IR.

** During the object size phase:

The call to __builtin_dynamic_object_size will become an expression that 
includes the counted_by field, or -1/0 when we cannot decide the size; the 
correct data dependence will be kept even after the call to 
__builtin_dynamic_object_size is gone. 
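
The effect of the extra argument can be pictured at the source level with an ordinary function. This is only an analogy under stated assumptions: size_from_count is a hypothetical stand-in, not a GCC builtin, and returning 0 for a negative count is an arbitrary choice made for this sketch.

```c
#include <stddef.h>

struct annotated {
  int foo;
  char array[];
};

/* Hypothetical stand-in for the proposed three-argument form.  The count
   is an ordinary argument, so the call depends on the load of the field.  */
static size_t
size_from_count (const char *array, int type, int count)
{
  (void) array;   /* unused in this sketch */
  (void) type;    /* unused in this sketch */
  return count < 0 ? 0 : (size_t) count;   /* elements are char, size 1 */
}

static size_t
test_size (struct annotated *p, int sz)
{
  p->foo = sz;
  /* The load of p->foo feeds the call, so it must observe the store.  */
  return size_from_count (p->array, 1, p->foo);
}
```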


> 
> A related issue is that assignment to the field and storage allocation
> are not tied together

Yes, this is different from a VLA, for which the size assignment and the storage 
allocation are generated and tied together by the compiler.

For the flexible array member, the storage allocation and the size assignment 
are both done by the user. So, we need to clarify this requirement in the 
documentation to guide users to write correct code.  We might also need to 
provide tools (warnings and a sanitizer option) to help users catch such 
coding errors.
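
As an example of the kind of user code such documentation might recommend, the allocation and the size assignment can be kept together in one helper. This is a sketch only: annotated_alloc is a hypothetical name, and the counted_by attribute is omitted so that the code builds with compilers that do not support it.

```c
#include <stddef.h>
#include <stdlib.h>

struct annotated {
  int foo;        /* number of elements in array */
  char array[];   /* would carry counted_by (foo) */
};

/* Allocate room for count elements and record the count right away, so
   the size field is never out of step with the storage.  */
static struct annotated *
annotated_alloc (int count)
{
  if (count < 0)
    return NULL;

  struct annotated *p = malloc (sizeof *p + (size_t) count);
  if (p != NULL)
    p->foo = count;
  return p;
}
```

The caller releases the object with free () as usual.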

> - if there's no use of the size data we might
> remove the store of it as dead.

Yes, when __bdos cannot decide the size, we need to remove the dead store to 
the field.
I guess that the compiler should be able to do this automatically?

thanks.

Qing
> 
> Of course I guess __bos then behaves like sizeof ().
> 
> Richard.
> 
>> 
>> Qing
>> 
>>> 
>>> It may not work for something like this though:
>>> 
>>> static size_t
>>> get_size_of (void *ptr)
>>> {
>>> return __bdos (ptr, 1);
>>> }
>>> 
>>> void
>>> foo (size_t sz)
>>> {
>>> array_annotated = __builtin_malloc (sz);
>>> array_annotated->foo = sz;
>>> 
>>> ...
>>> __builtin_printf ("%zu\n", get_size_of (array_annotated->foo));
>>> ...
>>> }
>>> 
>>> because the call to get_size_of () may not have been inlined that early.
>>> 
>>> The more fool-proof alternative may be to put a compile time barrier right 
>>> below the assignment to array_annotated->foo; I reckon you could do that 
>>> early in the front end by marking the size identifier and then tracking 
>>> assignments to that identifier.  That may have a slight runtime performance 
>>> overhead since it may prevent even legitimate reordering.  I can't think of 
>>> another alternative at the moment...
>>> 
>>> Sid



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Qing Zhao


> On Oct 23, 2023, at 8:34 AM, Richard Biener  
> wrote:
> 
> On Mon, Oct 23, 2023 at 1:27 PM Siddhesh Poyarekar  
> wrote:
>> 
>> On 2023-10-23 03:57, Richard Biener wrote:
>>> On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao  wrote:
>>>> 
>>>> 
>>>> 
>>>>> On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar  
>>>>> wrote:
>>>>> 
>>>>> On 2023-10-20 14:38, Qing Zhao wrote:
>>>>>> How about the following:
>>>>>>   Add one more parameter to __builtin_dynamic_object_size(), i.e
>>>>>> __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
>>>>>> When we see the structure field has counted_by attribute.
>>>>> 
>>>>> Or maybe add a barrier preventing any assignments to array_annotated->foo 
>>>>> from being reordered below the __bdos call? Basically an __asm__ with 
>>>>> array_annotated->foo in the clobber list ought to do it I think.
>>>> 
>>>> Maybe just adding the array_annotated->foo to the use list of the call to 
>>>> __builtin_dynamic_object_size should be enough?
>>>> 
>>>> But I am not sure how to implement this in the TREE level, is there a 
>>>> USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add the 
>>>> counted_by field “array_annotated->foo” to the USE_LIST of the call to 
>>>> __bdos?
>>>> 
>>>> This might be the simplest solution?
>>> 
>>> If the dynamic object size is derived of a field then I think you need to
>>> put the "load" of that memory location at the point (as argument)
>>> of the __bos call right at parsing time.  I know that's awkward because
>>> you try to play tricks "discovering" that field only late, but that's not
>>> going to work.
>>> 
>>> A related issue is that assignment to the field and storage allocation
>>> are not tied together - if there's no use of the size data we might
>>> remove the store of it as dead.
>> 
>> Maybe the trick then is to treat the size data as volatile?  That ought
>> to discourage reordering and also prevent elimination of the "dead" store?
> 
> But we are an optimizing compiler, not a static analysis machine, so I
> fail to see how this is a useful suggestion.
> 
> I think Martins suggestion to approach this as a language extension
> is more useful and would make it easier to handle this?

I agree that making this a language extension is a better and cleaner 
approach.

As we discussed before, the major issues with the language extension approach 
are:
1. It is harder for existing source code to adopt due to the potential 
ABI/API change.
2. It takes much more effort and much longer to be accepted.

In addition to the above issues, I guess the same issue exists even with a 
language extension, since for a FAM it’s the user (not the compiler) who 
allocates the storage for the FAM. (Should we also move this into the compiler 
for the language extension? Then the existing source code would need to be 
changed a lot to adopt the new language extension.)

As a result, the size and the storage allocation still cannot be guaranteed to 
be tied together.

Qing

> 
> Richard.
> 
>> Thanks,
>> Sid



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Qing Zhao


> On Oct 23, 2023, at 11:57 AM, Richard Biener  
> wrote:
> 
> 
> 
>> Am 23.10.2023 um 16:56 schrieb Qing Zhao :
>> 
>> 
>> 
>>> On Oct 23, 2023, at 3:57 AM, Richard Biener  
>>> wrote:
>>> 
>>>> On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao  wrote:
>>>> 
>>>> 
>>>> 
>>>>> On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar  
>>>>> wrote:
>>>>> 
>>>>> On 2023-10-20 14:38, Qing Zhao wrote:
>>>>>> How about the following:
>>>>>> Add one more parameter to __builtin_dynamic_object_size(), i.e
>>>>>> __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
>>>>>> When we see the structure field has counted_by attribute.
>>>>> 
>>>>> Or maybe add a barrier preventing any assignments to array_annotated->foo 
>>>>> from being reordered below the __bdos call? Basically an __asm__ with 
>>>>> array_annotated->foo in the clobber list ought to do it I think.
>>>> 
>>>> Maybe just adding the array_annotated->foo to the use list of the call to 
>>>> __builtin_dynamic_object_size should be enough?
>>>> 
>>>> But I am not sure how to implement this in the TREE level, is there a 
>>>> USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add the 
>>>> counted_by field “array_annotated->foo” to the USE_LIST of the call to 
>>>> __bdos?
>>>> 
>>>> This might be the simplest solution?
>>> 
>>> If the dynamic object size is derived of a field then I think you need to
>>> put the "load" of that memory location at the point (as argument)
>>> of the __bos call right at parsing time.  I know that's awkward because
>>> you try to play tricks "discovering" that field only late, but that's not
>>> going to work.
>> 
>> Is it better to do this at gimplification phase instead of FE? 
>> 
>> VLA decls are handled in gimplification phase, the size calculation and call 
>> to alloca are all generated during this phase. (gimplify_vla_decl).
>> 
>> For __bdos calls, we can add an additional argument if the object’s first 
>> argument’s type include the counted_by attribute, i.e
>> 
>> ***During gimplification, 
>> For a call to __builtin_dynamic_object_size (ptr, type)
>> Check whether the type of ptr includes counted_by attribute, if so, change 
>> the call to
>> __builtin_dynamic_object_size (ptr, type, counted_by field)
>> 
>> Then the correct data dependence should be represented well in the IR.
>> 
>> **During object size phase,
>> 
>> The call to __builtin_dynamic_object_size will become an expression includes 
>> the counted_by field or -1/0 when we cannot decide the size, the correct 
>> data dependence will be kept even the call to __builtin_dynamic_object_size 
>> is gone. 
> 
> But the whole point of the BOS pass is to derive information that is not 
> available at parsing time, and that’s the cases you are after.  The case 
> where the connection to the field with the length is apparent during parsing 
> is easy - you simply insert a load of the value before the BOS call.

Yes, this is true. 
I prefer to implement this in the gimplification phase since I am more 
familiar with the code there. (I think that implementing it in gimplification 
should be very similar to implementing it in the FE? Or am I missing anything 
here?)

Joseph, if I implement this in the FE, where in the FE should I look? 

Thanks a lot for the help.

Qing

>  For the late case there’s no way to invent data flow dependence without 
> inadvertently pessimizing optimization.
> 
> Richard 
> 
>> 
>>> 
>>> A related issue is that assignment to the field and storage allocation
>>> are not tied together
>> 
>> Yes, this is different from VLA, in which, the size assignment and the 
>> storage allocation are generated and tied together by the compiler.
>> 
>> For the flexible array member, the storage allocation and the size 
>> assignment are all done by the user. So, We need to clarify such requirement 
>>  in the document to guide user to write correct code.  And also, we might 
>> need to provide tools (warnings and sanitizer option) to help users to catch 
>> such coding error.
>> 
>>> - if there's no use of the size data we might
>>> remove the store of it as dead.
>> 
>> Yes, when __bdos cannot decide the size, we need to remove the dead store to 
>> 

Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Qing Zhao


> On Oct 20, 2023, at 3:54 PM, Martin Uecker  wrote:
> 
> Am Freitag, dem 20.10.2023 um 18:48 + schrieb Qing Zhao:
>> 
>>> On Oct 20, 2023, at 2:34 PM, Kees Cook  wrote:
>>> 
>>> On Fri, Oct 20, 2023 at 11:50:11AM +0200, Martin Uecker wrote:
>>>> Am Donnerstag, dem 19.10.2023 um 16:33 -0700 schrieb Kees Cook:
>>>>> On Wed, Oct 18, 2023 at 09:11:43PM +, Qing Zhao wrote:
>>>>>> As I replied to Martin in another email, I plan to do the following to 
>>>>>> resolve this issue:
>>>>>> 
>>>>>> 1. No specification for signed or unsigned for counted_by field.
>>>>>> 2. Add a sanitizer option -fsanitize=counted-by-bound to catch the cases 
>>>>>> when the size of the counted-by is not positive.
>>>>> 
>>>>> I don't understand why this needs to be a runtime sanitizer. The
>>>>> signedness is known at compile time, so I would expect a -W option.
>>>> 
>>>> The signedness of the type but not of the value.
>>>> 
>>>> But I would not want to have a warning for signed 
>>>> counter  types by default because I would prefer
>>>> to use signed types (for various reasons including
>>>> better overflow detection).
>>>> 
>>>>> Or
>>>>> do you mean you'd split up -fsanitize=bounds between unsigned and signed
>>>>> indexes? I'd find that kind of awkward for the kernel... but I feel like
>>>>> I've misunderstood something. :)
>>>>> 
>>>>> -Kees
>>>> 
>>>> The idea would be to detect at run-time the case
>>>> if  x->buf  is used at a time where   x->counter 
>>>> is negative and also when x->counter * sizeof(x->buf[0])
>>>> overflows or is too big.
>>>> 
>>>> This would be similar to
>>>> 
>>>> int a[n];
>>>> 
>>>> where it is detected at run-time if n is not-positive.
>>> 
>>> Right. I guess what I mean to say is that I would expect this case to
>>> already be caught by -fsanitize=bounds -- I don't see a reason to add an
>>> additional sanitizer option.
>>> 
>>> struct foo {
>>> int count;
>>> int array[] __counted_by(count);
>>> };
>>> 
>>> foo->count = 5;
>>> foo->array[0] = 1;  // ok
>>> foo->array[10] = 1; // -fsanitize=bounds will catch this
>>> foo->array[-10] = 1;// -fsanitize=bounds will catch this too
>>> 
>>> 
>> 
>> just checked this testing case with my GCC, and YES, -fsanitize=bounds 
>> indeed caught this error:
>> 
>> ttt_1.c:31:12: runtime error: index 10 out of bounds for type 'char [*]'
>> ttt_1.c:32:12: runtime error: index -10 out of bounds for type 'char [*]'
>> 
> 
> Yes, but I thought we were discussing the case where count is
> set to a negative value:
> 
> foo->count = -1;
> int x = foo->array[3]; // UBSan should diagnose this
> 
> And also the case when foo->array becomes too big.

Oops, yes, you are right. 
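
For reference, the cases Martin describes (a negative counter, or 
count * sizeof(element) overflowing) amount to a run-time check along the 
following lines. This is only a sketch: counted_by_access_ok is a hypothetical 
helper, not something the sanitizer emits, and it assumes the 
__builtin_mul_overflow builtin is available.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper mirroring the checks discussed above: COUNT is the
   (signed) counter value, ELEM_SIZE the element size, and IDX the index
   being accessed.  */
bool
counted_by_access_ok (long count, size_t elem_size, long idx)
{
  size_t bytes;

  if (count < 0)          /* negative counter: the array has no valid size */
    return false;
  if (__builtin_mul_overflow ((size_t) count, elem_size, &bytes))
    return false;         /* count * elem_size does not fit in size_t */
  return idx >= 0 && idx < count;
}
```

With count = -1, every index is rejected, which is the case that 
-fsanitize=bounds alone does not cover in the example above.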

Thanks.

Qing
> 
> Martin



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Qing Zhao


> On Oct 23, 2023, at 2:06 PM, Martin Uecker  wrote:
> 
> Am Montag, dem 23.10.2023 um 16:37 +0000 schrieb Qing Zhao:
>> 
>>> On Oct 23, 2023, at 11:57 AM, Richard Biener  
>>> wrote:
>>> 
>>> 
>>> 
>>>> Am 23.10.2023 um 16:56 schrieb Qing Zhao :
>>>> 
>>>> 
>>>> 
>>>>> On Oct 23, 2023, at 3:57 AM, Richard Biener  
>>>>> wrote:
>>>>> 
>>>>>> On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao  wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar  
>>>>>>> wrote:
>>>>>>> 
>>>>>>> On 2023-10-20 14:38, Qing Zhao wrote:
>>>>>>>> How about the following:
>>>>>>>> Add one more parameter to __builtin_dynamic_object_size(), i.e
>>>>>>>> __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
>>>>>>>> When we see the structure field has counted_by attribute.
>>>>>>> 
>>>>>>> Or maybe add a barrier preventing any assignments to 
>>>>>>> array_annotated->foo from being reordered below the __bdos call? 
>>>>>>> Basically an __asm__ with array_annotated->foo in the clobber list 
>>>>>>> ought to do it I think.
>>>>>> 
>>>>>> Maybe just adding the array_annotated->foo to the use list of the call 
>>>>>> to __builtin_dynamic_object_size should be enough?
>>>>>> 
>>>>>> But I am not sure how to implement this in the TREE level, is there a 
>>>>>> USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add the 
>>>>>> counted_by field “array_annotated->foo” to the USE_LIST of the call to 
>>>>>> __bdos?
>>>>>> 
>>>>>> This might be the simplest solution?
>>>>> 
>>>>> If the dynamic object size is derived of a field then I think you need to
>>>>> put the "load" of that memory location at the point (as argument)
>>>>> of the __bos call right at parsing time.  I know that's awkward because
>>>>> you try to play tricks "discovering" that field only late, but that's not
>>>>> going to work.
>>>> 
>>>> Is it better to do this at gimplification phase instead of FE? 
>>>> 
>>>> VLA decls are handled in gimplification phase, the size calculation and 
>>>> call to alloca are all generated during this phase. (gimplify_vla_decl).
>>>> 
>>>> For __bdos calls, we can add an additional argument if the object’s first 
>>>> argument’s type include the counted_by attribute, i.e
>>>> 
>>>> ***During gimplification, 
>>>> For a call to __builtin_dynamic_object_size (ptr, type)
>>>> Check whether the type of ptr includes counted_by attribute, if so, change 
>>>> the call to
>>>> __builtin_dynamic_object_size (ptr, type, counted_by field)
>>>> 
>>>> Then the correct data dependence should be represented well in the IR.
>>>> 
>>>> **During object size phase,
>>>> 
>>>> The call to __builtin_dynamic_object_size will become an expression 
>>>> includes the counted_by field or -1/0 when we cannot decide the size, the 
>>>> correct data dependence will be kept even the call to 
>>>> __builtin_dynamic_object_size is gone. 
>>> 
>>> But the whole point of the BOS pass is to derive information that is not 
>>> available at parsing time, and that’s the cases you are after.  The case 
>>> where the connection to the field with the length is apparent during 
>>> parsing is easy - you simply insert a load of the value before the BOS call.
>> 
>> Yes, this is true. 
>> I prefer to implement this in gimplification phase since I am more familiar 
>> with the code there.. (I think that implementing it in gimplification should 
>> be very similar as implementing it in FE? Or do I miss anything here?)
>> 
>> Joseph, if implement this in FE, where in the FE I should look at? 
>> 
> 
> We should aim for a good integration with the BDOS pass, so
> that it can propagate the information further, e.g. the 
> following should work:
> 
> struct { int L; char buf[] __counted_by(L) } x;
> x.L = N;
> x.buf = ...;
> char *p = &

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Qing Zhao


> On Oct 23, 2023, at 2:31 PM, Martin Uecker  wrote:
> 
> Am Montag, dem 23.10.2023 um 20:06 +0200 schrieb Martin Uecker:
>> Am Montag, dem 23.10.2023 um 16:37 +0000 schrieb Qing Zhao:
>>> 
>>>> On Oct 23, 2023, at 11:57 AM, Richard Biener  
>>>> wrote:
>>>> 
>>>> 
>>>> 
>>>>> Am 23.10.2023 um 16:56 schrieb Qing Zhao :
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Oct 23, 2023, at 3:57 AM, Richard Biener  
>>>>>> wrote:
>>>>>> 
>>>>>>> On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao  wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar  
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> On 2023-10-20 14:38, Qing Zhao wrote:
>>>>>>>>> How about the following:
>>>>>>>>> Add one more parameter to __builtin_dynamic_object_size(), i.e
>>>>>>>>> __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
>>>>>>>>> When we see the structure field has counted_by attribute.
>>>>>>>> 
>>>>>>>> Or maybe add a barrier preventing any assignments to 
>>>>>>>> array_annotated->foo from being reordered below the __bdos call? 
>>>>>>>> Basically an __asm__ with array_annotated->foo in the clobber list 
>>>>>>>> ought to do it I think.
>>>>>>> 
>>>>>>> Maybe just adding the array_annotated->foo to the use list of the call 
>>>>>>> to __builtin_dynamic_object_size should be enough?
>>>>>>> 
>>>>>>> But I am not sure how to implement this in the TREE level, is there a 
>>>>>>> USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add the 
>>>>>>> counted_by field “array_annotated->foo” to the USE_LIST of the call to 
>>>>>>> __bdos?
>>>>>>> 
>>>>>>> This might be the simplest solution?
>>>>>> 
>>>>>> If the dynamic object size is derived of a field then I think you need to
>>>>>> put the "load" of that memory location at the point (as argument)
>>>>>> of the __bos call right at parsing time.  I know that's awkward because
>>>>>> you try to play tricks "discovering" that field only late, but that's not
>>>>>> going to work.
>>>>> 
>>>>> Is it better to do this at gimplification phase instead of FE? 
>>>>> 
>>>>> VLA decls are handled in gimplification phase, the size calculation and 
>>>>> call to alloca are all generated during this phase. (gimplify_vla_decl).
>>>>> 
>>>>> For __bdos calls, we can add an additional argument if the object’s first 
>>>>> argument’s type include the counted_by attribute, i.e
>>>>> 
>>>>> ***During gimplification, 
>>>>> For a call to __builtin_dynamic_object_size (ptr, type)
>>>>> Check whether the type of ptr includes counted_by attribute, if so, 
>>>>> change the call to
>>>>> __builtin_dynamic_object_size (ptr, type, counted_by field)
>>>>> 
>>>>> Then the correct data dependence should be represented well in the IR.
>>>>> 
>>>>> **During object size phase,
>>>>> 
>>>>> The call to __builtin_dynamic_object_size will become an expression 
>>>>> includes the counted_by field or -1/0 when we cannot decide the size, the 
>>>>> correct data dependence will be kept even the call to 
>>>>> __builtin_dynamic_object_size is gone. 
>>>> 
>>>> But the whole point of the BOS pass is to derive information that is not 
>>>> available at parsing time, and that’s the cases you are after.  The case 
>>>> where the connection to the field with the length is apparent during 
>>>> parsing is easy - you simply insert a load of the value before the BOS 
>>>> call.
>>> 
>>> Yes, this is true. 
>>> I prefer to implement this in gimplification phase since I am more familiar 
>>> with the code there.. (I think that implementing it in gimplification 
>>> should be very similar as implementing it in FE? 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Qing Zhao


> On Oct 23, 2023, at 2:43 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-10-23 14:06, Martin Uecker wrote:
>> We should aim for a good integration with the BDOS pass, so
>> that it can propagate the information further, e.g. the
>> following should work:
>> struct { int L; char buf[] __counted_by(L) } x;
>> x.L = N;
>> x.buf = ...;
>> char *p = &x->f;
>> __bdos(p) -> N
>> So we need to be smart on how we provide the size
>> information for x->f to the backend.
>> This would also be desirable for the language extension.
> 
> This is essentially why there need to be frontend rules constraining 
> reordering and reachability semantics of x.L, thus restricting DSE and 
> reordering for it.

My understanding is that restricting DSE and reordering should be done through 
proper data flow information: with a new argument added to the BDOS call, the 
correct data flow information would be maintained, and then the DSE and 
reordering would not happen. 

I don’t quite understand what kind of frontend rules should be added to 
constrain reordering and reachability semantics. Can you explain this a little 
more? Do you mean adding some rules or requirements to the new attribute that 
users of the attribute should follow in their source code? 

>  This is not really a __bdos/__bos question, because that bit is trivial; if 
> the structure is visible, the value is simply x.L.  This is also why adding a 
> reference to x.L in __bos/__bdos is not sufficient or even possible in, e.g. 
> the above case you note.

I am a little confused here. Are we discussing how to resolve the potential 
reordering issue in the following:

"
struct annotated {
  size_t foo;
  char array[] __attribute__((counted_by (foo)));
};

  p->foo = 10;
  size = __builtin_dynamic_object_size (p->array,1);
"?

Or a bigger issue?
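
For concreteness, a self-contained version of the snippet above might look 
like the following. It is a sketch only: query_size, COUNTED_BY and OBJ_SIZE 
are hypothetical names, and the two macros fall back to no attribute and to 
__builtin_object_size when the compiler does not report support for 
counted_by or __builtin_dynamic_object_size.

```c
#include <stddef.h>
#include <stdlib.h>

#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(m) __attribute__((counted_by(m)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(m)
#endif

/* Fall back to the static builtin where the dynamic one is missing.  */
#if defined(__has_builtin)
# if __has_builtin(__builtin_dynamic_object_size)
#  define OBJ_SIZE(p, t) __builtin_dynamic_object_size ((p), (t))
# endif
#endif
#ifndef OBJ_SIZE
# define OBJ_SIZE(p, t) __builtin_object_size ((p), (t))
#endif

struct annotated
{
  size_t foo;
  char array[] COUNTED_BY(foo);
};

/* The store to p->foo has to stay ahead of the size query: if the query
   is expanded to a load of p->foo, moving or deleting the store changes
   the result.  Returns (size_t) -1 when the size is unknown.  */
size_t
query_size (size_t n)
{
  struct annotated *p = malloc (sizeof (struct annotated) + n);
  if (p == NULL)
    return (size_t) -1;
  p->foo = n;
  size_t size = OBJ_SIZE (p->array, 1);
  free (p);
  return size;
}
```

The value returned depends on the compiler and the optimization level, so the 
sketch makes no promise about it beyond "a size, or (size_t) -1 for unknown".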

Qing

> 
> Thanks,
> Sid



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Qing Zhao


> On Oct 23, 2023, at 3:37 PM, Martin Uecker  wrote:
> 
> Am Montag, dem 23.10.2023 um 19:00 + schrieb Qing Zhao:
>> 
>>> On Oct 23, 2023, at 2:31 PM, Martin Uecker  wrote:
>>> 
>>> Am Montag, dem 23.10.2023 um 20:06 +0200 schrieb Martin Uecker:
>>>> Am Montag, dem 23.10.2023 um 16:37 +0000 schrieb Qing Zhao:
>>>>> 
>>>>>> On Oct 23, 2023, at 11:57 AM, Richard Biener 
>>>>>>  wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> Am 23.10.2023 um 16:56 schrieb Qing Zhao :
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On Oct 23, 2023, at 3:57 AM, Richard Biener 
>>>>>>>>  wrote:
>>>>>>>> 
>>>>>>>>> On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao  
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar 
>>>>>>>>>>  wrote:
>>>>>>>>>> 
>>>>>>>>>> On 2023-10-20 14:38, Qing Zhao wrote:
>>>>>>>>>>> How about the following:
>>>>>>>>>>> Add one more parameter to __builtin_dynamic_object_size(), i.e
>>>>>>>>>>> __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
>>>>>>>>>>> When we see the structure field has counted_by attribute.
>>>>>>>>>> 
>>>>>>>>>> Or maybe add a barrier preventing any assignments to 
>>>>>>>>>> array_annotated->foo from being reordered below the __bdos call? 
>>>>>>>>>> Basically an __asm__ with array_annotated->foo in the clobber list 
>>>>>>>>>> ought to do it I think.
>>>>>>>>> 
>>>>>>>>> Maybe just adding the array_annotated->foo to the use list of the 
>>>>>>>>> call to __builtin_dynamic_object_size should be enough?
>>>>>>>>> 
>>>>>>>>> But I am not sure how to implement this in the TREE level, is there a 
>>>>>>>>> USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add the 
>>>>>>>>> counted_by field “array_annotated->foo” to the USE_LIST of the call 
>>>>>>>>> to __bdos?
>>>>>>>>> 
>>>>>>>>> This might be the simplest solution?
>>>>>>>> 
>>>>>>>> If the dynamic object size is derived of a field then I think you need 
>>>>>>>> to
>>>>>>>> put the "load" of that memory location at the point (as argument)
>>>>>>>> of the __bos call right at parsing time.  I know that's awkward because
>>>>>>>> you try to play tricks "discovering" that field only late, but that's 
>>>>>>>> not
>>>>>>>> going to work.
>>>>>>> 
>>>>>>> Is it better to do this at gimplification phase instead of FE? 
>>>>>>> 
>>>>>>> VLA decls are handled in gimplification phase, the size calculation and 
>>>>>>> call to alloca are all generated during this phase. (gimplify_vla_decl).
>>>>>>> 
>>>>>>> For __bdos calls, we can add an additional argument if the object’s 
>>>>>>> first argument’s type include the counted_by attribute, i.e
>>>>>>> 
>>>>>>> ***During gimplification, 
>>>>>>> For a call to __builtin_dynamic_object_size (ptr, type)
>>>>>>> Check whether the type of ptr includes counted_by attribute, if so, 
>>>>>>> change the call to
>>>>>>> __builtin_dynamic_object_size (ptr, type, counted_by field)
>>>>>>> 
>>>>>>> Then the correct data dependence should be represented well in the IR.
>>>>>>> 
>>>>>>> **During object size phase,
>>>>>>> 
>>>>>>> The call to __builtin_dynamic_object_size will become an expression 
>>>>>>> includes the counted_by field or -1/0 when we cannot decide the size, 
>>>

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Qing Zhao
Hi, Sid,

I really appreciate your example and detailed explanation. Very helpful.
I think that this is an excellent example to show (almost) all the issues we 
need to consider.

I slightly modified this example to make it compilable and runnable, as 
follows: 
(I still cannot make the incorrect reordering or DSE happen, but the potential 
for reordering is there…)

#include <stddef.h>  /* for size_t */

struct A
{
  size_t size;
  char buf[] __attribute__((counted_by(size)));
};

/* Query the object size through an opaque pointer.  */
static size_t
get_size_from (void *ptr)
{
  return __builtin_dynamic_object_size (ptr, 1);
}

void
foo (size_t sz)
{
  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
  obj->size = sz;
  obj->buf[0] = 2;
  /* Cast so that the argument matches %d; "unknown" prints as -1.  */
  __builtin_printf ("%d\n", (int) get_size_from (obj->buf));
  return;
}

int main ()
{
  foo (20);
  return 0;
}

With my GCC, it compiled and ran:

[opc@qinzhao-ol8u3-x86 ]$  /home/opc/Install/latest-d/bin/gcc -O1 t5.c
[opc@qinzhao-ol8u3-x86 ]$ ./a.out
20

Situation 1: With -O1 and above, the routine “get_size_from” is inlined into 
“foo”. The call to __bdos is therefore in the same routine as the 
instantiation of the object, and the TYPE information, together with the 
counted_by attribute attached to the TYPE of the object, can be used by the 
__bdos call to compute the final object size. 

[opc@qinzhao-ol8u3-x86]$  /home/opc/Install/latest-d/bin/gcc -O0  t5.c
[opc@qinzhao-ol8u3-x86 ]$ ./a.out
-1

Situation 2: With -O0, the routine “get_size_from” is NOT inlined into “foo”. 
The call to __bdos is therefore not in the same routine as the instantiation 
of the object. As a result, the TYPE information and the attached counted_by 
information of the object can NOT be used by the __bdos call. 
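
Situation 2 can also be reproduced independently of the optimization level by 
keeping the query out of line. A minimal sketch, assuming a hypothetical 
helper get_size_opaque and a hypothetical OBJ_SIZE macro that falls back to 
__builtin_object_size when __builtin_dynamic_object_size is not reported as 
available:

```c
#include <stddef.h>

#if defined(__has_builtin)
# if __has_builtin(__builtin_dynamic_object_size)
#  define OBJ_SIZE(p, t) __builtin_dynamic_object_size ((p), (t))
# endif
#endif
#ifndef OBJ_SIZE
# define OBJ_SIZE(p, t) __builtin_object_size ((p), (t))
#endif

/* noinline keeps the query in a separate routine, so only a bare pointer
   is visible here and the size cannot be determined.  */
__attribute__((noinline)) size_t
get_size_opaque (void *ptr)
{
  return OBJ_SIZE (ptr, 1);
}
```

Since nothing about the pointed-to object is visible inside get_size_opaque, 
the builtin yields (size_t) -1 there, which matches the -1 printed above.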

Keep the above two situations in mind; we will refer to them below:

1. First,  the problem we are trying to resolve is:

(Your description):

>  the reordering of __bdos w.r.t. initialization of the size parameter but to 
> also account for DSE of the assignment, we can abstract this problem to that 
> of DFA being unable to see implicit use of the size parameter in the __bdos 
> call.

This is basically correct, with the following exception:

The implicit use of the size parameter in the __bdos call is not always there. 
It ONLY exists WHEN the __bdos call can be evaluated to an expression of the 
size parameter in the “objsz” phase, i.e., “Situation 1” of the above 
example. 

In “Situation 2”, when __bdos does not see the TYPE of the real object, it 
does not see the counted_by information from the TYPE, and therefore it 
cannot evaluate the size of the object through the counted_by information. 
As a result, the implicit use of the size parameter in the __bdos call does 
NOT exist at all. The optimizer can freely reorder the initialization of the 
size parameter with the __bdos call since there is no data flow dependency 
between the two. 

With this exception in mind, we can see that your proposed “option 2” (making 
the type of size “volatile”) is too conservative: it will disable many 
optimizations unnecessarily, even though it’s safe and simple to implement. 

As a compiler optimization person for many, many years, I really don’t want to 
take this approach at this moment.  -:)
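
To make the cost of “volatile” concrete, compare the two loops in the sketch 
below. The struct and function names are hypothetical and chosen only for 
illustration; the point is that a volatile size field has to be re-read on 
every evaluation, while an ordinary field may be loaded once.

```c
#include <stddef.h>

struct plain_buf    { size_t size;          char buf[]; };
struct volatile_buf { volatile size_t size; char buf[]; };

/* p->size is an ordinary load: the compiler is free to read it once and
   keep it in a register for the whole loop.  */
size_t
sum_plain (const struct plain_buf *p)
{
  size_t sum = 0;
  for (size_t i = 0; i < p->size; i++)
    sum += (unsigned char) p->buf[i];
  return sum;
}

/* p->size is volatile: every evaluation of the loop condition has to
   perform a real load, even though nothing here writes to it.  */
size_t
sum_volatile (const struct volatile_buf *p)
{
  size_t sum = 0;
  for (size_t i = 0; i < p->size; i++)
    sum += (unsigned char) p->buf[i];
  return sum;
}
```

Both functions compute the same result; only the freedom left to the 
optimizer differs, and that applies to every use of the field, not just the 
ones near a __bdos call.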

2. Some facts I’d like to mention:

A.  The potential for incorrect reordering (or CSE) ONLY exists in the TREE 
optimization stage. By the RTL stage, the __bdos call has already been 
replaced by an expression of the size parameter or a constant, so the data 
dependency is already explicit in the IR. I believe that the data flow 
analysis in the RTL stage should pick up the data dependency correctly; no 
special handling is needed in RTL.

B. If the __bdos call cannot see the real object, it has no way to get the 
“counted_by” field from the TYPE of the real object. So, if we try to add the 
implicit use of the “counted_by” field to the __bdos call, the object 
instantiation must be in the same routine as the __bdos call. Both the FE 
and the gimplification phase are too early to do this work. 

3. Then, what’s the best approach to resolve this problem?

There have been several suggestions so far:

A.  Add an additional argument, the size parameter, to __bdos, 
  A.1, during FE;
  A.2, during gimplification phase;
B.  Encode the implicit USE in the type of size, by making the size “volatile”;
C.  Encode the implicit USE in the type of buf, then update the optimization 
passes to use this implicit USE encoded in the type of buf.

As I explained above, 
** Approach A (both A.1 and A.2) does not work;
** Approach B will have a big performance impact; I’d prefer not to take this 
approach at this moment.
** Approach C will require a lot of changes in GCC, and is also not really 
necessary since the ONLY implicit use of the size parameter is in the __bdos 
call when __bdos can see the real object.

So, all the above pro

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Qing Zhao


> On Oct 24, 2023, at 5:03 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-10-24 16:30, Qing Zhao wrote:
>> Situation 2: With O0, the routine “get_size_from” was NOT inlined into 
>> “foo”, therefore, the call to __bdos is Not in the same routine as the 
>> instantiation of the object, As a result, the TYPE info and the attached 
>> counted_by info of the object can NOT be USED by the __bdos call.
> 
> But __bos/__bdos are barely useful without optimization; you need a minimum 
> of -O1.  You're right that if the call is never inlined then we don't care 
> because the __bdos call does not get expanded to obj->size.
> 
> However, the point of situation 2 is that the TYPE info cannot be used by the 
> __bdos call *only for a while* (i.e. until the call gets inlined) and that 
> window is an opportunity for the reordering/DSE to break things.

The main point of situation 2 I tried to make is this: there are situations 
where obj->size is not used at all by __bdos, so marking it as volatile is too 
conservative and unnecessarily prevents useful optimizations.  -:)

Qing
> 
> Thanks.
> Sid



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Qing Zhao


> On Oct 24, 2023, at 4:38 PM, Martin Uecker  wrote:
> 
> Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
>> Hi, Sid,
>> 
>> Really appreciate for your example and detailed explanation. Very helpful.
>> I think that this example is an excellent example to show (almost) all the 
>> issues we need to consider.
>> 
>> I slightly modified this example to make it to be compilable and run-able, 
>> as following: 
>> (but I still cannot make the incorrect reordering or DSE happening, anyway, 
>> the potential reordering possibility is there…)
>> 
>> #include <stddef.h>  /* for size_t */
>> 
>> struct A
>> {
>>   size_t size;
>>   char buf[] __attribute__((counted_by(size)));
>> };
>> 
>> /* Query the object size through an opaque pointer.  */
>> static size_t
>> get_size_from (void *ptr)
>> {
>>   return __builtin_dynamic_object_size (ptr, 1);
>> }
>> 
>> void
>> foo (size_t sz)
>> {
>>   struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>>   obj->size = sz;
>>   obj->buf[0] = 2;
>>   /* Cast so that the argument matches %d; "unknown" prints as -1.  */
>>   __builtin_printf ("%d\n", (int) get_size_from (obj->buf));
>>   return;
>> }
>> 
>> int main ()
>> {
>>   foo (20);
>>   return 0;
>> }
>> 
>> With my GCC, it was compiled and worked:
>> [opc@qinzhao-ol8u3-x86 ]$  /home/opc/Install/latest-d/bin/gcc -O1 t5.c
>> [opc@qinzhao-ol8u3-x86 ]$ ./a.out
>> 20
>> Situation 1: With O1 and above, the routine “get_size_from” was inlined into 
>> “foo”, therefore, the call to __bdos is in the same routine as the 
>> instantiation of the object, and the TYPE information and the attached 
>> counted_by attribute information in the TYPE of the object can be USED by 
>> the __bdos call to compute the final object size. 
>> 
>> [opc@qinzhao-ol8u3-x86]$  /home/opc/Install/latest-d/bin/gcc -O0  t5.c
>> [opc@qinzhao-ol8u3-x86 ]$ ./a.out
>> -1
>> Situation 2: With O0, the routine “get_size_from” was NOT inlined into 
>> “foo”, therefore, the call to __bdos is Not in the same routine as the 
>> instantiation of the object, As a result, the TYPE info and the attached 
>> counted_by info of the object can NOT be USED by the __bdos call. 
>> 
>> Keep in mind of the above 2 situations, we will refer them in below:
>> 
>> 1. First,  the problem we are trying to resolve is:
>> 
>> (Your description):
>> 
>>> the reordering of __bdos w.r.t. initialization of the size parameter but to 
>>> also account for DSE of the assignment, we can abstract this problem to 
>>> that of DFA being unable to see implicit use of the size parameter in the 
>>> __bdos call.
>> 
>> basically is correct.  However, with the following exception:
>> 
>> The implicit use of the size parameter in the __bdos call is not always 
>> there, it ONLY exists WHEN the __bdos is able to evaluated to an expression 
>> of the size parameter in the “objsz” phase, i.e., the “Situation 1” of the 
>> above example. 
>> In the “Situation 2”, when the __bdos does not see the TYPE of the real 
>> object,  it does not see the counted_by information from the TYPE, 
>> therefore,  it is not able to evaluate the size of the object through the 
>> counted_by information.  As a result, the implicit use of the size parameter 
>> in the __bdos call does NOT exist at all.  The optimizer can freely reorder 
>> the initialization of the size parameter with the __bdos call since there is 
>> no data flow dependency between these two. 
>> 
>> With this exception in mind, we can see that your proposed “option 2” 
>> (making the type of size “volatile”) is too conservative, it will  disable 
>> many optimizations  unnecessarily, even though it’s safe and simple to 
>> implement. 
>> 
>> As a compiler optimization person for many many years, I really don’t want 
>> to take this approach at this moment.  -:)
>> 
>> 2. Some facts I’d like to mention:
>> 
>> A.  The incorrect reordering (or CSE) potential ONLY exists in the TREE 
>> optimization stage. During RTL stage,  the __bdos call has already been 
>> replaced by an expression of the size parameter or a constant, the data 
>> dependency is explicitly in the IR already.  I believe that the data 
>> analysis in RTL stage should pick up the data dependency correctly, No 
>> special handling is needed in RTL.
>> 
>> B. If the __bdos call cannot see the real object , it has no way to get the 
>> “counted_by” field from the TYPE of the real object. So, if we try to add 
>

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Qing Zhao


> On Oct 24, 2023, at 7:56 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-10-24 18:51, Qing Zhao wrote:
>> Thanks for the proposal!
>> So what you suggested is:
>> For every x.buf,  change it as a __builtin_with_size(x.buf, x.L) in the FE, 
>> then the call to the _bdos (x.buf, 1) will
>> Become:
>>_bdos(__builtin_with_size(x.buf, x.L), 1)?
>> Then the implicit use of x.L in _bdos(x.buf.1) will become explicit?
> 
> Oops, I think Martin and I fell off-list in a subthread.  I clarified that my 
> comment was that any such annotation at object reference is probably too late 
> and hence not the right place for it; basically it has the same problems as 
> the option A in your comment.  A better place to reinforce such a 
> relationship would be the allocation+initialization site instead.

I think Martin’s proposal might work, it’s different than the option A:

A.  Add an additional argument, the size parameter,  to __bdos, 
 A.1, during FE;
 A.2, during gimplification phase;

Option A targets the __bdos call and tries to encode the implicit use into the 
call; this will not work when the real object has not been instantiated at the 
call site.

However, Martin’s proposal targets the FAM itself; it will enhance the 
FAM access naturally with the size information. Such a FAM access with size 
info will be propagated to the __bdos site later through inlining, etc., and then 
tree-object-size can use the size information at that point. At the same time, 
the implicit use of the size is recorded correctly. 

So, I think that this proposal is natural and reasonable.
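
To make the idea concrete, here is a minimal sketch in plain C that only models the proposal; it is not how GCC implements it. The names sized_ref, fam_access, size_from and alloc_A are hypothetical and exist only for this illustration. The point it tries to show is that the size is read where the FAM is accessed and then travels with the pointer, so a later size query does not need to see the original object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct A {
    size_t size;
    char buf[];   /* flexible array member, conceptually counted_by(size) */
};

/* Hypothetical stand-in for the annotated pointer: the address of the FAM
   together with the element count that was read when the FAM was accessed. */
struct sized_ref {
    char *ptr;
    size_t count;
};

/* Models "__builtin_with_size (x->buf, x->size)": the load of x->size is an
   explicit operand here, so the use of the size is visible in the code. */
static struct sized_ref fam_access(struct A *x)
{
    assert(x != NULL);
    struct sized_ref r = { x->buf, x->size };
    return r;
}

/* Models what a size query can answer once the size travels with the pointer. */
static size_t size_from(struct sized_ref r)
{
    return r.count;
}

/* Allocates an A with room for n chars and records n in the size field.
   Returns NULL if the allocation fails. */
static struct A *alloc_A(size_t n)
{
    struct A *obj = malloc(sizeof(struct A) + n * sizeof(char));
    if (obj != NULL)
        obj->size = n;
    return obj;
}
```

In this model, a store to obj->size made after fam_access() does not change what size_from() reports for the reference that was already taken, which matches the "size is evaluated at the access to buf" semantics discussed in this thread.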

Qing
> 
> Thanks,
> Sid



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Qing Zhao


> On Oct 25, 2023, at 6:39 AM, Martin Uecker  wrote:
> 
> Am Mittwoch, dem 25.10.2023 um 12:25 +0200 schrieb Richard Biener:
>> 
>>> Am 25.10.2023 um 10:16 schrieb Martin Uecker :
>>> 
>>> Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
>>>> 
>>>>>> Am 24.10.2023 um 22:38 schrieb Martin Uecker :
>>>>> 
>>>>> Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
>>>>>> Hi, Sid,
>>>>>> 
>>>>>> Really appreciate for your example and detailed explanation. Very 
>>>>>> helpful.
>>>>>> I think that this example is an excellent example to show (almost) all 
>>>>>> the issues we need to consider.
>>>>>> 
>>>>>> I slightly modified this example to make it to be compilable and 
>>>>>> run-able, as following: 
>>>>>> (but I still cannot make the incorrect reordering or DSE happening, 
>>>>>> anyway, the potential reordering possibility is there…)
>>>>>> 
>>>>>> 1 #include 
>>>>>> 2 struct A
>>>>>> 3 {
>>>>>> 4  size_t size;
>>>>>> 5  char buf[] __attribute__((counted_by(size)));
>>>>>> 6 };
>>>>>> 7 
>>>>>> 8 static size_t
>>>>>> 9 get_size_from (void *ptr)
>>>>>> 10 {
>>>>>> 11  return __builtin_dynamic_object_size (ptr, 1);
>>>>>> 12 }
>>>>>> 13 
>>>>>> 14 void
>>>>>> 15 foo (size_t sz)
>>>>>> 16 {
>>>>>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
>>>>>> sizeof(char));
>>>>>> 18  obj->size = sz;
>>>>>> 19  obj->buf[0] = 2;
>>>>>> 20  __builtin_printf ("%d\n", get_size_from (obj->buf));
>>>>>> 21  return;
>>>>>> 22 }
>>>>>> 23 
>>>>>> 24 int main ()
>>>>>> 25 {
>>>>>> 26  foo (20);
>>>>>> 27  return 0;
>>>>>> 28 }
>>>>>> 
>>>>>> With my GCC, it was compiled and worked:
>>>>>> [opc@qinzhao-ol8u3-x86 ]$  /home/opc/Install/latest-d/bin/gcc -O1 t5.c
>>>>>> [opc@qinzhao-ol8u3-x86 ]$ ./a.out
>>>>>> 20
>>>>>> Situation 1: With O1 and above, the routine “get_size_from” was inlined 
>>>>>> into “foo”, therefore, the call to __bdos is in the same routine as the 
>>>>>> instantiation of the object, and the TYPE information and the attached 
>>>>>> counted_by attribute information in the TYPE of the object can be USED 
>>>>>> by the __bdos call to compute the final object size. 
>>>>>> 
>>>>>> [opc@qinzhao-ol8u3-x86]$  /home/opc/Install/latest-d/bin/gcc -O0  t5.c
>>>>>> [opc@qinzhao-ol8u3-x86 ]$ ./a.out
>>>>>> -1
>>>>>> Situation 2: With O0, the routine “get_size_from” was NOT inlined into 
>>>>>> “foo”, therefore, the call to __bdos is Not in the same routine as the 
>>>>>> instantiation of the object, As a result, the TYPE info and the attached 
>>>>>> counted_by info of the object can NOT be USED by the __bdos call. 
>>>>>> 
>>>>>> Keep the above 2 situations in mind; we will refer to them below:
>>>>>> 
>>>>>> 1. First,  the problem we are trying to resolve is:
>>>>>> 
>>>>>> (Your description):
>>>>>> 
>>>>>>> the reordering of __bdos w.r.t. initialization of the size parameter 
>>>>>>> but to also account for DSE of the assignment, we can abstract this 
>>>>>>> problem to that of DFA being unable to see implicit use of the size 
>>>>>>> parameter in the __bdos call.
>>>>>> 
>>>>>> basically is correct.  However, with the following exception:
>>>>>> 
>>>>>> The implicit use of the size parameter in the __bdos call is not always 
>>>>>> there, it ONLY exists WHEN the __bdos is able to be evaluated to an 
>>>>>> expression of the size parameter in the “objsz” phase, i.e., the 
>>>>>> “Situation 1” of the above example. 
>>>>>> In the “Situation 2”, when the __bdos does no

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Qing Zhao


> On Oct 25, 2023, at 7:13 AM, Richard Biener  
> wrote:
> 
> 
> 
>> Am 25.10.2023 um 12:47 schrieb Martin Uecker :
>> 
>> Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
>>>> On 2023-10-25 04:16, Martin Uecker wrote:
>>>> Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
>>>>> 
>>>>>> Am 24.10.2023 um 22:38 schrieb Martin Uecker :
>>>>>> 
>>>>>> Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
>>>>>>> Hi, Sid,
>>>>>>> 
>>>>>>> Really appreciate for your example and detailed explanation. Very 
>>>>>>> helpful.
>>>>>>> I think that this example is an excellent example to show (almost) all 
>>>>>>> the issues we need to consider.
>>>>>>> 
>>>>>>> I slightly modified this example to make it to be compilable and 
>>>>>>> run-able, as following:
>>>>>>> (but I still cannot make the incorrect reordering or DSE happening, 
>>>>>>> anyway, the potential reordering possibility is there…)
>>>>>>> 
>>>>>>> 1 #include 
>>>>>>> 2 struct A
>>>>>>> 3 {
>>>>>>> 4  size_t size;
>>>>>>> 5  char buf[] __attribute__((counted_by(size)));
>>>>>>> 6 };
>>>>>>> 7
>>>>>>> 8 static size_t
>>>>>>> 9 get_size_from (void *ptr)
>>>>>>> 10 {
>>>>>>> 11  return __builtin_dynamic_object_size (ptr, 1);
>>>>>>> 12 }
>>>>>>> 13
>>>>>>> 14 void
>>>>>>> 15 foo (size_t sz)
>>>>>>> 16 {
>>>>>>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
>>>>>>> sizeof(char));
>>>>>>> 18  obj->size = sz;
>>>>>>> 19  obj->buf[0] = 2;
>>>>>>> 20  __builtin_printf ("%d\n", get_size_from (obj->buf));
>>>>>>> 21  return;
>>>>>>> 22 }
>>>>>>> 23
>>>>>>> 24 int main ()
>>>>>>> 25 {
>>>>>>> 26  foo (20);
>>>>>>> 27  return 0;
>>>>>>> 28 }
>>>>>>> 
>>> 
>>> 
>>> 
>>>>> When it’s set I suppose.  Turn
>>>>> 
>>>>> X.l = n;
>>>>> 
>>>>> Into
>>>>> 
>>>>> X.l = __builtin_with_size (x.buf, n);
>>>> 
>>>> It would turn
>>>> 
>>>> some_variable = (&) x.buf
>>>> 
>>>> into
>>>> 
>>>> some_variable = __builtin_with_size ( (&) x.buf, x.len)
>>>> 
>>>> 
>>>> So the later access to x.buf and not the initialization
>>>> of a member of the struct (which is too early).
>>>> 
>>> 
>>> Hmm, so with Qing's example above, are you suggesting the transformation 
>>> be to foo like so:
>>> 
>>> 14 void
>>> 15 foo (size_t sz)
>>> 16 {
>>> 16.5  void * _1;
>>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>>> 18  obj->size = sz;
>>> 19  obj->buf[0] = 2;
>>> 19.5  _1 = __builtin_with_size (obj->buf, obj->size);
>>> 20  __builtin_printf ("%d\n", get_size_from (_1));
>>> 21  return;
>>> 22 }
>>> 
>>> If yes then this could indeed work.  I think I got thrown off by the 
>>> reference to __bdos.
>> 
>> Yes. I think it is important to evaluate the size at the
>> access to buf and not at the allocation, because the point is to 
>> recover it from the size member even when the compiler can't 
>> see the original allocation.
> 
> But if the access is through a pointer without the attribute visible even the 
> Frontend cannot recover?  We’d need to force type correctness and give up on 
> indirecting through an int * when it can refer to two different container 
> types.

We might need to issue warnings when this happens?

>  The best we can do I think is mark allocation sites and hope for some basic 
> code hygiene (not clobbering size or array pointer through pointers without 
> the appropriately attributed type)
I guess that we need to clarify the requirement in the documentation, and also 
issue warnings when the source code has such issues.

Qing
> 
>> Evaluating at this point requires that the size is correctly set
>> before the access to the FAM and the user has to make sure 
>> this is the case. But to me this requirement would make sense.
>> 
>> Semantically, it could also make sense to evaluate the size at a
>> later time.  But then the reordering becomes problematic again.
>> 
>> Also I think this would make this feature generally more useful.
>> For example, it could also work for other pointers in the struct
>> and not just for FAMs.  In this case, the struct may already be
>> freed when BDOS is called, so it might also not be possible to
>> access the size member at a later time.
>> 
>> Martin
>> 
>> 
>>> 
>> 



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Qing Zhao


> On Oct 25, 2023, at 10:50 AM, Siddhesh Poyarekar  wrote:
> 
> On 2023-10-25 09:27, Qing Zhao wrote:
>>> On Oct 24, 2023, at 7:56 PM, Siddhesh Poyarekar  wrote:
>>> 
>>> On 2023-10-24 18:51, Qing Zhao wrote:
>>>> Thanks for the proposal!
>>>> So what you suggested is:
>>>> For every x.buf,  change it as a __builtin_with_size(x.buf, x.L) in the 
>>>> FE, then the call to the _bdos (x.buf, 1) will
>>>> Become:
>>>>_bdos(__builtin_with_size(x.buf, x.L), 1)?
>>>> Then the implicit use of x.L in _bdos(x.buf, 1) will become explicit?
>>> 
>>> Oops, I think Martin and I fell off-list in a subthread.  I clarified that 
>>> my comment was that any such annotation at object reference is probably too 
>>> late and hence not the right place for it; basically it has the same 
>>> problems as the option A in your comment.  A better place to reinforce such 
>>> a relationship would be the allocation+initialization site instead.
>> I think Martin’s proposal might work, it’s different than the option A:
>> A.  Add an additional argument, the size parameter,  to __bdos,
>>  A.1, during FE;
>>  A.2, during gimplification phase;
>> Option A targets the __bdos call and tries to encode the implicit use into the 
>> call; this will not work when the real object has not been instantiated at 
>> the call site.
>> However, Martin’s proposal targets the FAM itself; it will enhance 
>> the FAM access naturally with the size information. Such a FAM access with 
>> size info will be propagated to the __bdos site later through inlining, etc., 
>> and then tree-object-size can use the size information at that point. At the 
>> same time, the implicit use of the size is recorded correctly.
>> So, I think that this proposal is natural and reasonable.
> 
> Ack, we discussed this later in the thread and I agree[1].  Richard still has 
> concerns[2] that I think may be addressed by putting __builtin_with_size at 
> the point where the reference to x.buf escapes, but I'm not very sure about 
> that.
> 
> Oh, and Martin suggested using __builtin_with_size more generally[3] in 
> bugzilla to address attribute inlining issues and we have high level 
> consensus for a __builtin_with_access instead, which associates access type 
> in addition to size with the target object.  For the purposes of counted_by, 
> access type could simply be -1.

Yes, I read all the discussions in the comments of PR96503 
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503), and I do agree that this 
is a good idea. 

I would prefer to name the new builtin
__builtin_with_access_and_size
instead of
__builtin_with_access

All the attributes, “alloc_size”, “access”, and the new “counted_by” for FAM, 
could be converted to this builtin consistently, and even later 
extensions, for example a “counted_by” attribute for general pointers, could use 
the same builtin. 

SOMETYPE *ptr = __builtin_with_access_and_size (SOMETYPE *ptr, size_t size, int access)

In the above, 

1. SOMETYPE will be the type of the pointee of “ptr”, it could be a real type 
or void.

2. “size”

If SOMETYPE is a real type, the “size” will be the number of elements of the 
type;
If SOMETYPE is void, the “size” will be the number of bytes.   

3. “access”

-1: Unknown access semantics
0: none
1: read_only
2: write_only
3: read_write

For the “counted_by” and “alloc_size” attributes, the “access” will be -1. 
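
As an illustration only, the “size” and “access” conventions listed above could be modeled in plain C roughly as follows. This is a sketch, not part of any patch: the enum and function names are made up for this example, and passing elem_size == 0 is just this sketch's way to stand for a void pointee.

```c
#include <assert.h>
#include <stddef.h>

/* Access codes as listed above; the enumerator names are illustrative. */
enum access_mode {
    ACCESS_UNKNOWN    = -1,
    ACCESS_NONE       = 0,
    ACCESS_READ_ONLY  = 1,
    ACCESS_WRITE_ONLY = 2,
    ACCESS_READ_WRITE = 3
};

/* Returns nonzero if code is one of the five access codes above. */
static int access_mode_is_valid(int code)
{
    return code >= ACCESS_UNKNOWN && code <= ACCESS_READ_WRITE;
}

/* Converts the "size" argument into a byte count.
   elem_size == 0 models a void pointee, where "size" is already in bytes;
   otherwise "size" is a number of elements of elem_size bytes each.
   Returns (size_t)-1, the "unknown" value, if the product would overflow. */
static size_t annotated_bytes(size_t size, size_t elem_size)
{
    if (elem_size == 0)
        return size;
    if (size > (size_t)-1 / elem_size)
        return (size_t)-1;
    return size * elem_size;
}
```

With these definitions, annotated_bytes(20, sizeof(int)) gives the byte size of 20 ints, while annotated_bytes(20, 0) simply gives 20 bytes.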

Qing
> 
> Thanks,
> Sid
> 
> 
> [1] 
> https://inbox.sourceware.org/gcc-patches/73af949c-3caa-4b11-93ce-3064b95a9...@gotplt.org/T/#m4f3cafa489493180e258fd62aca0196a5f244039
> 
> [2] 
> https://inbox.sourceware.org/gcc-patches/73af949c-3caa-4b11-93ce-3064b95a9...@gotplt.org/T/#mcf226f891621db8b640deaedd8942bb8519010f3
> 
> [3] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503#c6



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Qing Zhao


> On Oct 25, 2023, at 11:38 AM, Richard Biener  
> wrote:
> 
> 
> 
>> Am 25.10.2023 um 16:50 schrieb Siddhesh Poyarekar :
>> 
>> On 2023-10-25 09:27, Qing Zhao wrote:
>>>>> On Oct 24, 2023, at 7:56 PM, Siddhesh Poyarekar  
>>>>> wrote:
>>>> 
>>>> On 2023-10-24 18:51, Qing Zhao wrote:
>>>>> Thanks for the proposal!
>>>>> So what you suggested is:
>>>>> For every x.buf,  change it as a __builtin_with_size(x.buf, x.L) in the 
>>>>> FE, then the call to the _bdos (x.buf, 1) will
>>>>> Become:
>>>>>   _bdos(__builtin_with_size(x.buf, x.L), 1)?
>>>>> Then the implicit use of x.L in _bdos(x.buf, 1) will become explicit?
>>>> 
>>>> Oops, I think Martin and I fell off-list in a subthread.  I clarified that 
>>>> my comment was that any such annotation at object reference is probably 
>>>> too late and hence not the right place for it; basically it has the same 
>>>> problems as the option A in your comment.  A better place to reinforce 
>>>> such a relationship would be the allocation+initialization site instead.
>>> I think Martin’s proposal might work, it’s different than the option A:
>>> A.  Add an additional argument, the size parameter,  to __bdos,
>>> A.1, during FE;
>>> A.2, during gimplification phase;
>>> Option A targets the __bdos call and tries to encode the implicit use into the 
>>> call; this will not work when the real object has not been instantiated at 
>>> the call site.
>>> However, Martin’s proposal targets the FAM itself; it will enhance 
>>> the FAM access naturally with the size information. Such a FAM access 
>>> with size info will be propagated to the __bdos site later through inlining, 
>>> etc., and then tree-object-size can use the size information at that point. 
>>> At the same time, the implicit use of the size is recorded correctly.
>>> So, I think that this proposal is natural and reasonable.
>> 
>> Ack, we discussed this later in the thread and I agree[1].  Richard still 
>> has concerns[2] that I think may be addressed by putting __builtin_with_size 
>> at the point where the reference to x.buf escapes, but I'm not very sure 
>> about that.
>> 
>> Oh, and Martin suggested using __builtin_with_size more generally[3] in 
>> bugzilla to address attribute inlining issues and we have high level 
>> consensus for a __builtin_with_access instead, which associates access type 
>> in addition to size with the target object.  For the purposes of counted_by, 
>> access type could simply be -1.
> 
> Btw, I’d like to see some hard numbers on the amount of extra false positives 
> this will cause a well as the effect on generated code before putting this in 
> mainline and effectively needing to support it forever. 

What do you mean by the “extra false positives”? 

For the code generation impact:

turning the original  x.buf 
into a builtin function call
__builtin_with_access_and_size(x.buf, x.L, -1)

might inhibit some optimizations from happening before the builtin is evaluated 
into object size info (phase  .objsz1).  I guess there might be some 
performance impact. 

However, if we mark this builtin as PURE, NOTHROW, etc., then the negative 
performance impact should be reduced to a minimum? 

Qing

> 
> Richard 
> 
>> Thanks,
>> Sid
>> 
>> 
>> [1] 
>> https://inbox.sourceware.org/gcc-patches/73af949c-3caa-4b11-93ce-3064b95a9...@gotplt.org/T/#m4f3cafa489493180e258fd62aca0196a5f244039
>> 
>> [2] 
>> https://inbox.sourceware.org/gcc-patches/73af949c-3caa-4b11-93ce-3064b95a9...@gotplt.org/T/#mcf226f891621db8b640deaedd8942bb8519010f3
>> 
>> [3] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503#c6



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Qing Zhao


> On Oct 25, 2023, at 6:06 PM, Kees Cook  wrote:
> 
> On Wed, Oct 25, 2023 at 01:27:29PM +0000, Qing Zhao wrote:
>> A.  Add an additional argument, the size parameter,  to __bdos, 
>> A.1, during FE;
>> A.2, during gimplification phase;
> 
> I just wanted to clarify that this is all just an "internal" detail,
> yes?

YES!

> i.e. the __bdos() used in C code is unchanged?

there should be no change to the user interface. 

> 
> For example, the Linux kernel can still use __bdos() without knowing
> the count member ahead of time (otherwise it kind of defeats the purpose).
I don’t quite understand this; could you clarify? 

(Anyway, the bottom line is no change to the user interface, we just discuss 
the internal implementation inside GCC) -:)

Qing
> 
> -- 
> Kees Cook



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Qing Zhao


> On Oct 26, 2023, at 1:21 AM, Jakub Jelinek  wrote:
> 
> On Wed, Oct 25, 2023 at 07:03:43PM +0000, Qing Zhao wrote:
>> For the code generation impact:
>> 
>> turning the original  x.buf 
>> to a builtin function call
>> __builtin_with_access_and_size(x.buf, x.L, -1)
>> 
>> might inhibit some optimizations from happening before the builtin is
>> evaluated into object size info (phase  .objsz1).  I guess there might be
>> some performance impact.
>> 
>> However, if we mark this builtin as PURE, NOTHROW, etc, then the negative
>> performance impact will be reduced to minimum?
> 
> You can't drop it during objsz1 pass though, otherwise __bdos wouldn't
> be able to figure out the dynamic sizes in case of normal (non-early)
> inlining - caller takes address of a counted_by array, passes it down
> to callee which is only inlined late and uses __bdos, or callee takes address
> and returns it and caller uses __bdos, etc. - so it would need to be objsz2.

I guess that I didn’t explain it very clearly previously. Let me explain again:

My understanding is that there is an “early_objsz” phase and then a later “objsz1” 
phase for -O[1|2|3]. 
For -Og, there is “early_objsz” and then a later “objsz2”. 

So, the “objsz1” I mentioned (for the case -O[1|2|3])  should be the same as 
the “objsz2” you mentioned above?  -:)
It’s the second objsz phase. 

In the second objsz phase, I believe that all the inlining (including early 
inlining and IPA inlining) has already been applied?
> 
> And while the builtin (or if it is an internal detail rather than user
> accessible builtin an internal function)

Okay, I will use an “internal function” instead of a “builtin function”. 

> could be even const/nothrow/leaf if
> the arguments contain the loads from the structure 2 fields, I'm afraid it
> will still have huge code generation impact, prevent tons of pre-IPA
> optimizations.  And it will need some work to handle it properly during
> inlining heuristics, because in GIMPLE the COMPONENT_REF loads aren't gimple
> values, so it wouldn't be just the builtin/internal-fn call to be ignored,
> but also the count load from memory.

Are you worried that the potential additional LOADs will change the inlining 
decision, since the inlining heuristic depends on the # of loads from memory? 

In addition to the # of loads, the # of instructions and the # of calls in 
the function might increase too; will these have an impact on the inlining 
decision? 

In addition to the inlining decision, is there any impact on other IPA 
optimizations? 

thanks.

Qing


> 
>   Jakub
> 



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Qing Zhao


> On Oct 26, 2023, at 4:56 AM, Richard Biener  
> wrote:
> 
> On Thu, Oct 26, 2023 at 7:22 AM Jakub Jelinek  wrote:
>> 
>> On Wed, Oct 25, 2023 at 07:03:43PM +0000, Qing Zhao wrote:
>>> For the code generation impact:
>>> 
>>> turning the original  x.buf
>>> to a builtin function call
>>> __builtin_with_access_and_size(x.buf, x.L, -1)
>>> 
>>> might inhibit some optimizations from happening before the builtin is
>>> evaluated into object size info (phase  .objsz1).  I guess there might be
>>> some performance impact.
>>> 
>>> However, if we mark this builtin as PURE, NOTHROW, etc, then the negative
>>> performance impact will be reduced to minimum?
>> 
>> You can't drop it during objsz1 pass though, otherwise __bdos wouldn't
>> be able to figure out the dynamic sizes in case of normal (non-early)
>> inlining - caller takes address of a counted_by array, passes it down
>> to callee which is only inlined late and uses __bdos, or callee takes address
>> and returns it and caller uses __bdos, etc. - so it would need to be objsz2.
>> 
>> And while the builtin (or if it is an internal detail rather than user
>> accessible builtin an internal function) could be even const/nothrow/leaf if
>> the arguments contain the loads from the structure 2 fields, I'm afraid it
>> will still have huge code generation impact, prevent tons of pre-IPA
>> optimizations.  And it will need some work to handle it properly during
>> inlining heuristics, because in GIMPLE the COMPONENT_REF loads aren't gimple
>> values, so it wouldn't be just the builtin/internal-fn call to be ignored,
>> but also the count load from memory.
> 
> I think we want to track the value, not the "memory" in the builtin call,
> so GIMPLE would be
> 
> _1 = x.L;
> .. = __builtin_with_access_and_size (&x.buf, _1, -1);

Before adding the __builtin_with_access_and_size, the code is:

&x.buf

After inserting the built-in, it becomes:

_1 = x.L;
__builtin_with_access_and_size (&x.buf, _1, -1).


So, the total # of instructions, the # of LOADs, and the # of calls will all 
increase.
This will definitely have an impact on the inlining decision.

> 
> also please make sure to use an internal function for
> __builtin_with_access_and_size,
> I don't think we want to expose this to users - it's an implementation detail.

Okay, will define it as an internal function (add it to internal-fn.def). -:)

Qing
> 
> Richard.
> 
>> 
>>Jakub
>> 



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Qing Zhao


> On Oct 26, 2023, at 5:20 AM, Martin Uecker  wrote:
> 
> Am Donnerstag, dem 26.10.2023 um 10:45 +0200 schrieb Richard Biener:
>> On Wed, Oct 25, 2023 at 8:16 PM Martin Uecker  wrote:
>>> 
>>> Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener:
>>>> 
>>>>> Am 25.10.2023 um 12:47 schrieb Martin Uecker :
>>>>> 
>>>>> Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
>>>>>>> On 2023-10-25 04:16, Martin Uecker wrote:
>>>>>>> Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
>>>>>>>> 
>>>>>>>>> Am 24.10.2023 um 22:38 schrieb Martin Uecker :
>>>>>>>>> 
>>>>>>>>> Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
>>>>>>>>>> Hi, Sid,
>>>>>>>>>> 
>>>>>>>>>> Really appreciate for your example and detailed explanation. Very 
>>>>>>>>>> helpful.
>>>>>>>>>> I think that this example is an excellent example to show (almost) 
>>>>>>>>>> all the issues we need to consider.
>>>>>>>>>> 
>>>>>>>>>> I slightly modified this example to make it to be compilable and 
>>>>>>>>>> run-able, as following:
>>>>>>>>>> (but I still cannot make the incorrect reordering or DSE happening, 
>>>>>>>>>> anyway, the potential reordering possibility is there…)
>>>>>>>>>> 
>>>>>>>>>> 1 #include 
>>>>>>>>>> 2 struct A
>>>>>>>>>> 3 {
>>>>>>>>>> 4  size_t size;
>>>>>>>>>> 5  char buf[] __attribute__((counted_by(size)));
>>>>>>>>>> 6 };
>>>>>>>>>> 7
>>>>>>>>>> 8 static size_t
>>>>>>>>>> 9 get_size_from (void *ptr)
>>>>>>>>>> 10 {
>>>>>>>>>> 11  return __builtin_dynamic_object_size (ptr, 1);
>>>>>>>>>> 12 }
>>>>>>>>>> 13
>>>>>>>>>> 14 void
>>>>>>>>>> 15 foo (size_t sz)
>>>>>>>>>> 16 {
>>>>>>>>>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
>>>>>>>>>> sizeof(char));
>>>>>>>>>> 18  obj->size = sz;
>>>>>>>>>> 19  obj->buf[0] = 2;
>>>>>>>>>> 20  __builtin_printf ("%d\n", get_size_from (obj->buf));
>>>>>>>>>> 21  return;
>>>>>>>>>> 22 }
>>>>>>>>>> 23
>>>>>>>>>> 24 int main ()
>>>>>>>>>> 25 {
>>>>>>>>>> 26  foo (20);
>>>>>>>>>> 27  return 0;
>>>>>>>>>> 28 }
>>>>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>>> When it’s set I suppose.  Turn
>>>>>>>> 
>>>>>>>> X.l = n;
>>>>>>>> 
>>>>>>>> Into
>>>>>>>> 
>>>>>>>> X.l = __builtin_with_size (x.buf, n);
>>>>>>> 
>>>>>>> It would turn
>>>>>>> 
>>>>>>> some_variable = (&) x.buf
>>>>>>> 
>>>>>>> into
>>>>>>> 
>>>>>>> some_variable = __builtin_with_size ( (&) x.buf, x.len)
>>>>>>> 
>>>>>>> 
>>>>>>> So the later access to x.buf and not the initialization
>>>>>>> of a member of the struct (which is too early).
>>>>>>> 
>>>>>> 
>>>>>> Hmm, so with Qing's example above, are you suggesting the transformation
>>>>>> be to foo like so:
>>>>>> 
>>>>>> 14 void
>>>>>> 15 foo (size_t sz)
>>>>>> 16 {
>>>>>> 16.5  void * _1;
>>>>>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
>>>>&g

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Qing Zhao


> On Oct 26, 2023, at 10:05 AM, Richard Biener  
> wrote:
> 
> 
> 
>> Am 26.10.2023 um 12:14 schrieb Martin Uecker :
>> 
>> Am Donnerstag, dem 26.10.2023 um 11:20 +0200 schrieb Martin Uecker:
>>>> Am Donnerstag, dem 26.10.2023 um 10:45 +0200 schrieb Richard Biener:
>>>> On Wed, Oct 25, 2023 at 8:16 PM Martin Uecker  wrote:
>>>>> 
>>>>> Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener:
>>>>>> 
>>>>>>> Am 25.10.2023 um 12:47 schrieb Martin Uecker :
>>>>>>> 
>>>>>>> Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
>>>>>>>>> On 2023-10-25 04:16, Martin Uecker wrote:
>>>>>>>>> Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
>>>>>>>>>> 
>>>>>>>>>>> Am 24.10.2023 um 22:38 schrieb Martin Uecker :
>>>>>>>>>>> 
>>>>>>>>>>> Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
>>>>>>>>>>>> Hi, Sid,
>>>>>>>>>>>> 
>>>>>>>>>>>> Really appreciate for your example and detailed explanation. Very 
>>>>>>>>>>>> helpful.
>>>>>>>>>>>> I think that this example is an excellent example to show (almost) 
>>>>>>>>>>>> all the issues we need to consider.
>>>>>>>>>>>> 
>>>>>>>>>>>> I slightly modified this example to make it to be compilable and 
>>>>>>>>>>>> run-able, as following:
>>>>>>>>>>>> (but I still cannot make the incorrect reordering or DSE 
>>>>>>>>>>>> happening, anyway, the potential reordering possibility is there…)
>>>>>>>>>>>> 
>>>>>>>>>>>> 1 #include 
>>>>>>>>>>>> 2 struct A
>>>>>>>>>>>> 3 {
>>>>>>>>>>>> 4  size_t size;
>>>>>>>>>>>> 5  char buf[] __attribute__((counted_by(size)));
>>>>>>>>>>>> 6 };
>>>>>>>>>>>> 7
>>>>>>>>>>>> 8 static size_t
>>>>>>>>>>>> 9 get_size_from (void *ptr)
>>>>>>>>>>>> 10 {
>>>>>>>>>>>> 11  return __builtin_dynamic_object_size (ptr, 1);
>>>>>>>>>>>> 12 }
>>>>>>>>>>>> 13
>>>>>>>>>>>> 14 void
>>>>>>>>>>>> 15 foo (size_t sz)
>>>>>>>>>>>> 16 {
>>>>>>>>>>>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
>>>>>>>>>>>> sizeof(char));
>>>>>>>>>>>> 18  obj->size = sz;
>>>>>>>>>>>> 19  obj->buf[0] = 2;
>>>>>>>>>>>> 20  __builtin_printf ("%d\n", get_size_from (obj->buf));
>>>>>>>>>>>> 21  return;
>>>>>>>>>>>> 22 }
>>>>>>>>>>>> 23
>>>>>>>>>>>> 24 int main ()
>>>>>>>>>>>> 25 {
>>>>>>>>>>>> 26  foo (20);
>>>>>>>>>>>> 27  return 0;
>>>>>>>>>>>> 28 }
>>>>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>> When it’s set I suppose.  Turn
>>>>>>>>>> 
>>>>>>>>>> X.l = n;
>>>>>>>>>> 
>>>>>>>>>> Into
>>>>>>>>>> 
>>>>>>>>>> X.l = __builtin_with_size (x.buf, n);
>>>>>>>>> 
>>>>>>>>> It would turn
>>>>>>>>> 
>>>>>>>>> some_variable = (&) x.buf
>>>>>>>>> 
>>>>>>>>> into
>>>>>>>>> 
>>>>>>>>> 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Qing Zhao


> On Oct 26, 2023, at 1:05 PM, Martin Uecker  wrote:
> 
> Am Donnerstag, dem 26.10.2023 um 16:41 + schrieb Qing Zhao:
>> 
>>> On Oct 26, 2023, at 5:20 AM, Martin Uecker  wrote:
>>> 
>>> Am Donnerstag, dem 26.10.2023 um 10:45 +0200 schrieb Richard Biener:
>>>> On Wed, Oct 25, 2023 at 8:16 PM Martin Uecker  wrote:
>>>>> 
>>>>> Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener:
>>>>>> 
>>>>>>> Am 25.10.2023 um 12:47 schrieb Martin Uecker :
>>>>>>> 
>>>>>>> Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
>>>>>>>>> On 2023-10-25 04:16, Martin Uecker wrote:
>>>>>>>>> Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
>>>>>>>>>> 
>>>>>>>>>>> Am 24.10.2023 um 22:38 schrieb Martin Uecker :
>>>>>>>>>>> 
>>>>>>>>>>> Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
>>>>>>>>>>>> Hi, Sid,
>>>>>>>>>>>> 
>>>>>>>>>>>> Really appreciate for your example and detailed explanation. Very 
>>>>>>>>>>>> helpful.
>>>>>>>>>>>> I think that this example is an excellent example to show (almost) 
>>>>>>>>>>>> all the issues we need to consider.
>>>>>>>>>>>> 
>>>>>>>>>>>> I slightly modified this example to make it to be compilable and 
>>>>>>>>>>>> run-able, as following:
>>>>>>>>>>>> (but I still cannot make the incorrect reordering or DSE 
>>>>>>>>>>>> happening, anyway, the potential reordering possibility is there…)
>>>>>>>>>>>> 
>>>>>>>>>>>> 1 #include 
>>>>>>>>>>>> 2 struct A
>>>>>>>>>>>> 3 {
>>>>>>>>>>>> 4  size_t size;
>>>>>>>>>>>> 5  char buf[] __attribute__((counted_by(size)));
>>>>>>>>>>>> 6 };
>>>>>>>>>>>> 7
>>>>>>>>>>>> 8 static size_t
>>>>>>>>>>>> 9 get_size_from (void *ptr)
>>>>>>>>>>>> 10 {
>>>>>>>>>>>> 11  return __builtin_dynamic_object_size (ptr, 1);
>>>>>>>>>>>> 12 }
>>>>>>>>>>>> 13
>>>>>>>>>>>> 14 void
>>>>>>>>>>>> 15 foo (size_t sz)
>>>>>>>>>>>> 16 {
>>>>>>>>>>>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
>>>>>>>>>>>> sizeof(char));
>>>>>>>>>>>> 18  obj->size = sz;
>>>>>>>>>>>> 19  obj->buf[0] = 2;
>>>>>>>>>>>> 20  __builtin_printf ("%d\n", get_size_from (obj->buf));
>>>>>>>>>>>> 21  return;
>>>>>>>>>>>> 22 }
>>>>>>>>>>>> 23
>>>>>>>>>>>> 24 int main ()
>>>>>>>>>>>> 25 {
>>>>>>>>>>>> 26  foo (20);
>>>>>>>>>>>> 27  return 0;
>>>>>>>>>>>> 28 }
>>>>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>> When it’s set I suppose.  Turn
>>>>>>>>>> 
>>>>>>>>>> X.l = n;
>>>>>>>>>> 
>>>>>>>>>> Into
>>>>>>>>>> 
>>>>>>>>>> X.l = __builtin_with_size (x.buf, n);
>>>>>>>>> 
>>>>>>>>> It would turn
>>>>>>>>> 
>>>>>>>>> some_variable = (&) x.buf
>>>>>>>>> 
>>>>>>>>> into
>>>>>>>>> 
>>>>>>>>> some_variable = _

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Qing Zhao
I guess that what Kees wanted, the "fill the array without knowing the actual 
final size" code pattern, as follows:

>>  struct foo *f;
>>  char *p;
>>  int i;
>> 
>>  f = alloc(maximum_possible);
>>  f->count = 0;
>>  p = f->buf;
>> 
>>  for (i = 0; data_is_available() && i < maximum_possible; i++) {
>>  f->count ++;
>>  p[i] = next_data_item();
>>  }

is actually a dynamic array, or more accurately, a bounded-size dynamic array 
(but not a dynamically allocated array as we have discussed so far):

https://en.wikipedia.org/wiki/Dynamic_array

This kind of dynamic array is also called a growable array or resizable array; 
its size can be changed during its lifetime. 

For VLA or FAM, I believe that they are both dynamically allocated arrays, i.e., 
even though the size is not known at compilation time, the size
will be fixed after the array is allocated. 

I am not sure whether C has support for such dynamic arrays? Or whether it’s easy 
to provide dynamic array support in C?
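
For illustration, a bounded-size dynamic array can be written by hand in plain C on top of a FAM, roughly as in the sketch below. The names bounded_buf, bounded_alloc and bounded_push are hypothetical and not taken from the kernel or from any patch in this thread. The capacity is fixed when the object is allocated, and only the count grows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct bounded_buf {
    size_t capacity;   /* fixed at allocation time */
    size_t count;      /* number of valid items, never exceeds capacity */
    char data[];       /* flexible array member holding the items */
};

/* Allocates an empty buffer with room for `capacity` items.
   Returns NULL if the allocation fails. */
static struct bounded_buf *bounded_alloc(size_t capacity)
{
    struct bounded_buf *b = malloc(sizeof *b + capacity * sizeof(char));
    if (b != NULL) {
        b->capacity = capacity;
        b->count = 0;
    }
    return b;
}

/* Appends one item. Returns 1 on success and 0 if the buffer is full.
   The count is bumped before the store, as in the loop quoted above, so the
   written index is always below the current count. */
static int bounded_push(struct bounded_buf *b, char item)
{
    assert(b != NULL);
    if (b->count == b->capacity)
        return 0;
    b->count++;
    b->data[b->count - 1] = item;
    return 1;
}
```

In this sketch the allocation size never changes after bounded_alloc(); what changes over the lifetime of the object is only how many of the allocated elements are considered valid.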

Qing


> On Oct 26, 2023, at 12:45 PM, Martin Uecker  wrote:
> 
> Am Donnerstag, dem 26.10.2023 um 09:13 -0700 schrieb Kees Cook:
>> On Thu, Oct 26, 2023 at 10:15:10AM +0200, Martin Uecker wrote:
>>> but not this:
>>> 
>>> x->count = 11;
>>> char *p = &x->buf;
>>> x->count = 1;
>>> p[10] = 1; // !
>> 
>> This seems fine to me -- it's how I'd expect it to work: "10" is beyond
>> "1".
> 
> Note that the store would be allowed.
> 
>> 
>>> (because the pointer is passed around the
>>> store to the counter)
>>> 
>>> and also here the second store is then irrelevant
>>> for the access:
>>> 
>>> x->count = 10;
>>> char* p = &x->buf;
>>> ...
>>> x->count = 1; // somewhere else
>>> 
>>> p[9] = 1; // ok, because count matters when buf was accessed.
>> 
>> This is less great, but I can understand why it happens. "p" loses the
>> association with "x". It'd be nice if "p" had a way to retain that it
>> was just an alias for x->buf, so future p access would check count.
> 
> The problem is not to discover that p is an alias to x->buf, 
> but that it seems difficult to make sure that stores to 
> x->count are not reordered relative to the final access to
> p[i] you want to check, so that you then get the right value.
> 
>> 
>> But this appears to be an existing limitation in other areas where an
>> assignment will cause the loss of object association. (I've run into
>> this before.) It's just more surprising in the above example because in
>> the past the loss of association would cause __bdos() to revert back to
>> "SIZE_MAX" results ("I don't know the size") rather than an "outdated"
>> size, which may get us into unexpected places...
>> 
>>> IMHO this makes sense also from the user side and
>>> are the desirable semantics we discussed before.
>>> 
>>> But can you take a look at this?
>>> 
>>> 
>>> This should simulate it fairly well:
>>> https://godbolt.org/z/xq89aM7Gr
>>> 
>>> (the call to the noinline function would go away,
>>> but not necessarily its impact on optimization)
>> 
>> Yeah, this example should be a very rare situation: a leaf function is
>> changing the characteristics of the struct but returning a buffer within
>> it to the caller. The more likely glitch would be from:
>> 
>> int main()
>> {
>>  struct foo *f = foo_alloc(7);
>>  char *p = FAM_ACCESS(f, size, buf);
>> 
>>  printf("%ld\n", __builtin_dynamic_object_size(p, 0));
>>  test1(f); // or just "f->count = 10;" no function call needed
>>  printf("%ld\n", __builtin_dynamic_object_size(p, 0));
>> 
>>  return 0;
>> }
>> 
>> which reports:
>> 7
>> 7
>> 
>> instead of:
>> 7
>> 10
>> 
>> This kind of "get an alias" situation is pretty common in the kernel
>> as a way to have a convenient "handle" to the array. In the case of a
>> "fill the array without knowing the actual final size" code pattern,
>> things would immediately break:
>> 
>>  struct foo *f;
>>  char *p;
>>  int i;
>> 
>>  f = alloc(maximum_possible);
>>  f->count = 0;
>>  p = f->buf;
>> 
>>  for (i = 0; data_is_available() && i < maximum_possible; i++) {
>>      f->count++;
>>      p[i] = next_data_item();
>>  }
>> 
>> Now perhaps the problem here is that "count" cannot be used for a count
>> of "logically valid members in the array" but must always be a count of
>> "allocated member space in the array", which I guess is tolerable, but
>> isn't ideal -- I'd like to catch logic bugs in addition to allocation
>> bugs, but the latter is certainly much more important to catch.
> 
> Maybe we could have a warning when f->buf is not directly
> accessed.
> 
> Martin
> 
>> 
> 



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-27 Thread Qing Zhao


> On Oct 27, 2023, at 3:21 AM, Martin Uecker  wrote:
> 
> Am Donnerstag, dem 26.10.2023 um 19:57 + schrieb Qing Zhao:
>> I guess that what Kees wanted, ""fill the array without knowing the actual 
>> final size" code pattern”, as following:
>> 
>>>>struct foo *f;
>>>>char *p;
>>>>int i;
>>>> 
>>>>f = alloc(maximum_possible);
>>>>f->count = 0;
>>>>p = f->buf;
>>>> 
>>>>for (i; data_is_available() && i < maximum_possible; i++) {
>>>>f->count ++;
>>>>p[i] = next_data_item();
>>>>}
>> 
>> actually is a dynamic array, or more accurately, Bounded-size dynamic array: 
>> ( but not a dynamic allocated array as we discussed so far)
>> 
>> https://en.wikipedia.org/wiki/Dynamic_array
>> 
>> This dynamic array, also is called growable array, or resizable array, whose 
>> size can 
>> be changed during the lifetime. 
>> 
>> For VLA or FAM, I believe that they are both dynamic allocated array, i.e, 
>> even though the size is not know at the compilation time, but the size
>> will be fixed after the array is allocated. 
>> 
>> I am not sure whether C has support to such Dynamic array? Or whether it’s 
>> easy to provide dynamic array support in C?
> 
> It is possible to support dynamic arrays in C even with
> good checking, but not safely using the pattern above
> where you derive a pointer which you later use independently.
> 
> While we could track the connection to the original struct,
> the necessary synchronization between the counter and the
> access to the buffer is difficult.  I do not see how this
> could be supported with reasonable effort and cost.
> 
> 
> But with this restriction in mind, we can do a lot in C.
> For example, see my experimental (!) container library
> which has vector type.
> https://github.com/uecker/noplate/blob/main/test.c
> You can get an array view for the vector (which then
> also can decay to a pointer), so it interoperates nicely
> with C but you can get good bounds checking.
> 
> 
> But once you derive a pointer and pass it on, it gets
> difficult.  But if you want safety, you simply have to
> avoid this in code.

So, for the following modified code (without the additional pointer “p”):

struct foo
{
 size_t count;
 char buf[] __attribute__((counted_by(count)));
};

struct foo *f;
int i;  

f = alloc(maximum_possible);
f->count = 0;

for (i = 0; data_is_available() && i < maximum_possible; i++) {
  f->count++;
  f->buf[i] = next_data_item();
}

Should support for the dynamic array be possible in this form?


> 
> What we could potentially do is add restrictions so 
> that the access to buf always has to go via x->buf 
> or you get at least a warning.

Are the following two restrictions on the user enough?

1. Accesses to buf should always go via x->buf; the address should not be
   assigned to another, independent pointer through which buf is then accessed.
2. The user needs to keep the counter and the accesses to the buffer
   synchronized at all times.
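
The two restrictions above can be sketched in user code. This is only an
illustration under assumptions: the helper names foo_append and foo_get and
the COUNTED_BY wrapper macro are hypothetical, and the macro expands to
nothing on compilers that do not know the attribute.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: expand to the attribute only where the compiler knows it.  */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__ ((counted_by (member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

struct foo
{
  size_t count;
  char buf[] COUNTED_BY (count);
};

/* Restriction 2: update the counter first, then store.
   Restriction 1: the store goes through f->buf, never through an alias.  */
static bool
foo_append (struct foo *f, size_t maximum_possible, char c)
{
  assert (f != NULL);
  if (f->count >= maximum_possible)
    return false;
  f->count++;
  f->buf[f->count - 1] = c;
  return true;
}

/* Each read re-checks the index against the current value of f->count.  */
static bool
foo_get (const struct foo *f, size_t i, char *out)
{
  assert (f != NULL && out != NULL);
  if (i >= f->count)
    return false;
  *out = f->buf[i];
  return true;
}
```

Because no pointer derived from f->buf outlives a single statement, every
access sees the counter value that is current at that point.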


Qing
> 
> Martin
> 
> 
> 
> 
>> 
>> Qing
>> 
>> 
>>> On Oct 26, 2023, at 12:45 PM, Martin Uecker  wrote:
>>> 
>>> Am Donnerstag, dem 26.10.2023 um 09:13 -0700 schrieb Kees Cook:
>>>> On Thu, Oct 26, 2023 at 10:15:10AM +0200, Martin Uecker wrote:
>>>>> but not this:
>>>>> 
>>>>> x->count = 11;
>>>>> char *p = &x->buf;
>>>>> x->count = 1;
>>>>> p[10] = 1; // !
>>>> 
>>>> This seems fine to me -- it's how I'd expect it to work: "10" is beyond
>>>> "1".
>>> 
>>> Note that the store would be allowed.
>>> 
>>>> 
>>>>> (because the pointer is passed around the
>>>>> store to the counter)
>>>>> 
>>>>> and also here the second store is then irrelevant
>>>>> for the access:
>>>>> 
>>>>> x->count = 10;
>>>>> char* p = &x->buf;
>>>>> ...
>>>>> x->count = 1; // somewhere else
>>>>> 
>>>>> p[9] = 1; // ok, because count matter when buf was accesssed.
>>>> 
>>>> This is less great, but I can understand why it happens. "p" loses the
>>>> association with "x". It'd be nice if "p" had to way to retain that it
>>>>

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-27 Thread Qing Zhao


> On Oct 27, 2023, at 10:53 AM, Martin Uecker  wrote:
> 
> Am Freitag, dem 27.10.2023 um 14:32 + schrieb Qing Zhao:
>> 
>>> On Oct 27, 2023, at 3:21 AM, Martin Uecker  wrote:
>>> 
>>> Am Donnerstag, dem 26.10.2023 um 19:57 + schrieb Qing Zhao:
>>>> I guess that what Kees wanted, ""fill the array without knowing the actual 
>>>> final size" code pattern”, as following:
>>>> 
>>>>>>  struct foo *f;
>>>>>>  char *p;
>>>>>>  int i;
>>>>>> 
>>>>>>  f = alloc(maximum_possible);
>>>>>>  f->count = 0;
>>>>>>  p = f->buf;
>>>>>> 
>>>>>>  for (i; data_is_available() && i < maximum_possible; i++) {
>>>>>>  f->count ++;
>>>>>>  p[i] = next_data_item();
>>>>>>  }
>>>> 
>>>> actually is a dynamic array, or more accurately, Bounded-size dynamic 
>>>> array: ( but not a dynamic allocated array as we discussed so far)
>>>> 
>>>> https://en.wikipedia.org/wiki/Dynamic_array
>>>> 
>>>> This dynamic array, also is called growable array, or resizable array, 
>>>> whose size can 
>>>> be changed during the lifetime. 
>>>> 
>>>> For VLA or FAM, I believe that they are both dynamic allocated array, i.e, 
>>>> even though the size is not know at the compilation time, but the size
>>>> will be fixed after the array is allocated. 
>>>> 
>>>> I am not sure whether C has support to such Dynamic array? Or whether it’s 
>>>> easy to provide dynamic array support in C?
>>> 
>>> It is possible to support dynamic arrays in C even with
>>> good checking, but not safely using the pattern above
>>> where you derive a pointer which you later use independently.
>>> 
>>> While we could track the connection to the original struct,
>>> the necessary synchronization between the counter and the
>>> access to the buffer is difficult.  I do not see how this
>>> could be supported with reasonable effort and cost.
>>> 
>>> 
>>> But with this restriction in mind, we can do a lot in C.
>>> For example, see my experimental (!) container library
>>> which has vector type.
>>> https://github.com/uecker/noplate/blob/main/test.c
>>> You can get an array view for the vector (which then
>>> also can decay to a pointer), so it interoperates nicely
>>> with C but you can get good bounds checking.
>>> 
>>> 
>>> But once you derive a pointer and pass it on, it gets
>>> difficult.  But if you want safety, you just have to 
>>> to simply avoid this in code. 
>> 
>> So, for the following modified code: (without the additional pointer “p”)
>> 
>> struct foo
>> {
>> size_t count;
>> char buf[] __attribute__((counted_by(count)));
>> };
>> 
>> struct foo *f;
>> int i;  
>> 
>> f = alloc(maximum_possible);
>> f->count = 0;
>> 
>> for (i; data_is_available() && i < maximum_possible; i++) {
>>  f->count ++;  
>>  f->buf[i] = next_data_item();
>> }   
>> 
>> The support for dynamic array should be possible? 
> 
> With the design we discussed this should work because
> __builtin_with_access (or whatever) it reads:
> 
> f = alloc(maximum_possible);
> f->count = 0;
> 
> for (i = 0; data_is_available() && i < maximum_possible; i++) {
>  f->count ++;  
>  __builtin_with_access(f->buf, f->count)[i] = next_data_item();
> }   
> 

Yes, with the data flow, each access should get the latest value of f->count.
>> 
>> 
>>> 
>>> What we could potentially do is add restrictions so 
>>> that the access to buf always has to go via x->buf 
>>> or you get at least a warning.
>> 
>> Are the following two restrictions to the user enough:
>> 
>> 1. The access to buf should always go via x->buf, 
>>no assignment to another independent pointer 
>>and access buf through this new pointer.
> 
> Yes, maybe. One could also try to be smarter.
> 
> For example, one could warn only when &f->buf is
> assigned to another pointer and one of the
> following conditions is fulfilled:
> 
> - the pointer escapes from the local context 
> 
> - there is a store to f->counter in the
> local context that does not dominate &f->buf.
> 
> Then Kees' example would work too in most cases.

I guess that we might need to come up with a list of concrete restrictions on
the user, and list these restrictions in the user documentation.

Since dynamic array support is quite important to the kernel (is this true,
Kees?), we might need to include such support in our design from the beginning.

> 
> But I would probably wait until we have some
> initial experience with this feature.

You mean after we have an initial implementation of “builtin_with_size”?
Yes, at this moment I think that the “builtin_with_size” approach is the best
one. Just some details need more thought before the real implementation.  -:)

Qing
> 
> Martin
> 
>> 2.  User need to keep the synchronization between
>>  the counter and the access to the buffer all the time.



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-27 Thread Qing Zhao
About where we should insert the new __builtin_with_access_and_size:

> On Oct 26, 2023, at 2:54 PM, Qing Zhao  wrote:
> 
> 
> 
>> On Oct 26, 2023, at 10:05 AM, Richard Biener  
>> wrote:
>> 
>> 
>> 
>>> Am 26.10.2023 um 12:14 schrieb Martin Uecker :
>>> 
>>> Am Donnerstag, dem 26.10.2023 um 11:20 +0200 schrieb Martin Uecker:
>>>>> Am Donnerstag, dem 26.10.2023 um 10:45 +0200 schrieb Richard Biener:
>>>>> On Wed, Oct 25, 2023 at 8:16 PM Martin Uecker  wrote:
>>>>>> 
>>>>>> Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener:
>>>>>>> 
>>>>>>>> Am 25.10.2023 um 12:47 schrieb Martin Uecker :
>>>>>>>> 
>>>>>>>> Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
>>>>>>>>>> On 2023-10-25 04:16, Martin Uecker wrote:
>>>>>>>>>> Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
>>>>>>>>>>> 
>>>>>>>>>>>> Am 24.10.2023 um 22:38 schrieb Martin Uecker :
>>>>>>>>>>>> 
>>>>>>>>>>>> Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
>>>>>>>>>>>>> Hi, Sid,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Really appreciate for your example and detailed explanation. Very 
>>>>>>>>>>>>> helpful.
>>>>>>>>>>>>> I think that this example is an excellent example to show 
>>>>>>>>>>>>> (almost) all the issues we need to consider.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I slightly modified this example to make it to be compilable and 
>>>>>>>>>>>>> run-able, as following:
>>>>>>>>>>>>> (but I still cannot make the incorrect reordering or DSE 
>>>>>>>>>>>>> happening, anyway, the potential reordering possibility is there…)
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 1 #include 
>>>>>>>>>>>>> 2 struct A
>>>>>>>>>>>>> 3 {
>>>>>>>>>>>>> 4  size_t size;
>>>>>>>>>>>>> 5  char buf[] __attribute__((counted_by(size)));
>>>>>>>>>>>>> 6 };
>>>>>>>>>>>>> 7
>>>>>>>>>>>>> 8 static size_t
>>>>>>>>>>>>> 9 get_size_from (void *ptr)
>>>>>>>>>>>>> 10 {
>>>>>>>>>>>>> 11  return __builtin_dynamic_object_size (ptr, 1);
>>>>>>>>>>>>> 12 }
>>>>>>>>>>>>> 13
>>>>>>>>>>>>> 14 void
>>>>>>>>>>>>> 15 foo (size_t sz)
>>>>>>>>>>>>> 16 {
>>>>>>>>>>>>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
>>>>>>>>>>>>> sizeof(char));
>>>>>>>>>>>>> 18  obj->size = sz;
>>>>>>>>>>>>> 19  obj->buf[0] = 2;
>>>>>>>>>>>>> 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
>>>>>>>>>>>>> 21  return;
>>>>>>>>>>>>> 22 }
>>>>>>>>>>>>> 23
>>>>>>>>>>>>> 24 int main ()
>>>>>>>>>>>>> 25 {
>>>>>>>>>>>>> 26  foo (20);
>>>>>>>>>>>>> 27  return 0;
>>>>>>>>>>>>> 28 }
>>>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>> When it’s set I suppose.  Turn
>>>>>>>>>>> 
>>>>>>>>>>> X.l = n;
>>>>>>>>>>> 
>>>>>>>>>>> Into
>>>>>>>>>&g

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-27 Thread Qing Zhao
Okay, thanks for the explanation.
We will keep this in mind.

Qing

> On Oct 27, 2023, at 1:19 PM, Kees Cook  wrote:
> 
> On Fri, Oct 27, 2023 at 03:10:22PM +0000, Qing Zhao wrote:
>> Since  the dynamic array support is quite important to the kernel (is this 
>> true, Kees? ),
>> We might need to include such support into our design in the beginning. 
> 
> tl;dr: We don't need "dynamic array support" in the 1st version of 
> __counted_by
> 
> I'm not sure it's as strong as "quite important", but it is a code
> pattern that exists. The vast majority of FAM usage is run-time fixed,
> in the sense that the allocation matches the usage. Only sometimes do we
> over-allocate and then slowly fill it up like I've shown.
> 
> So really my thoughts on this are to bring light to the usage pattern
> in the hopes that we don't make it an impossible thing to do. And if
> it's a limitation of the initial version of __counted_by, the kernel can
> still use it: it will just need to use __counted_by strictly for
> allocation sizes, not "usage" size:
> 
> struct foo {
>   int allocated;
>   int used;
>   int array[] __counted_by(allocated); // would be nice to use "used"
> };
> 
>   struct foo *p;
> 
>   p = alloc(sizeof(*p) + sizeof(*p->array) * max_items);
>   p->allocated = max_items;
>   p->used = 0;
> 
>   while (data_available() && p->used < p->allocated)
>       p->array[p->used++] = next_datum();
> 
> With this, we'll still catch p->array accesses beyond "allocated",
> but other code in the kernel won't catch "invalid data" accesses for
> p->array beyond "used". (i.e. we still have memory corruption protection,
> just not logic error protection.)
> 
> We can deal with aliasing in the future if we want to expand to catching
> logic errors.
> 
> I should note that we don't get logic error protection from things like
> ARM's Memory Tagging Extension either -- it only tracks allocation size
> (and is very expensive to change as the "used" part of an allocation
> grows), so this isn't an unreasonable condition for __counted_by to
> require as well.
> 
> -- 
> Kees Cook



RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-10-31 Thread Qing Zhao
Hi, 

I wrote a summary based on our extensive discussion; hopefully it can serve as
an informal proposal.

Please take a look and let me know of any comments or suggestions.

There are some (???) marks in sections 3.2 and 3.6; those are questions where I
am seeking help.  -:)

Thanks again for all the help.

Qing.


Represent the missing dependence for the "counted_by" attribute and its 
consumers 

Qing Zhao

10/30/2023
==

The whole discussion is at:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633783.html

1. The problem

There is a data dependency between the size assignment and the implicit use of
the size information in __builtin_dynamic_object_size, and this dependency is
missing in the IL (line 11 and line 13 in the example below). Without it, the
compiler may reorder the code incorrectly or apply other invalid
transformations.

  1 struct A
  2 {
  3  size_t size;
  4  char buf[] __attribute__((counted_by(size)));
  5 };
  6 
  7 size_t 
  8 foo (size_t sz)
  9 {
 10  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
 11  obj->size = sz;
 12  obj->buf[0] = 2;
 13  return __builtin_dynamic_object_size (obj->buf, 1);
 14 }
  
Please see a more complicated example in Appendix 1.

We need to represent such data dependency correctly in the IL. 
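
For readers who want to inspect what their own compiler does with the example
above, here is a compilable variant. It is a sketch under assumptions: the
COUNTED_BY and OBJECT_SIZE wrapper macros are hypothetical, fall back to no
attribute and to __builtin_object_size on older compilers, and the value
returned depends on the compiler version and optimization level.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumption: use the attribute only where the compiler knows it.  */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__ ((counted_by (member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

/* Assumption: fall back to the static builtin on older compilers.  */
#if defined(__has_builtin)
# if __has_builtin(__builtin_dynamic_object_size)
#  define OBJECT_SIZE(p) __builtin_dynamic_object_size ((p), 1)
# endif
#endif
#ifndef OBJECT_SIZE
# define OBJECT_SIZE(p) __builtin_object_size ((p), 1)
#endif

struct A
{
  size_t size;
  char buf[] COUNTED_BY (size);
};

/* Returns the size the compiler derives for obj->buf,
   or (size_t) -1 when it cannot tell.  */
size_t
foo (size_t sz)
{
  if (sz == 0)
    return (size_t) -1;
  struct A *obj = malloc (sizeof (struct A) + sz * sizeof (char));
  if (obj == NULL)
    return (size_t) -1;
  obj->size = sz;            /* must not be moved past the query below */
  obj->buf[0] = 2;
  size_t n = OBJECT_SIZE (obj->buf);
  free (obj);
  return n;
}
```

Nothing in the source text of the query mentions obj->size, which is why the
dependency between the store to the size field and the query has to be made
explicit somewhere in the IL.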

2. The solution:

2.1 Summary

* Add a new internal function "ACCESS_WITH_SIZE" to carry the size information
for every FAM field access;
* In the C FE, replace every FAM field access whose TYPE has the "counted_by"
attribute with the new internal function "ACCESS_WITH_SIZE";
* In every consumer of the size information, for example, BDOS or the array
bound sanitizer, query the size information or ACCESS_MODE information from the
new internal function;
* When the size information and the "ACCESS_MODE" information are no longer
used, possibly at the 2nd object size phase, replace the internal function
with the actual FAM field access;
* Adjust the inlining heuristics and some SSA passes to mitigate the impact on
the optimizer and code generation.

2.2 The new internal function 

  .ACCESS_WITH_SIZE (PTR, SIZE, ACCESS_MODE)

INTERNAL_FN (ACCESS_WITH_SIZE, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)

which returns "PTR", the same as its 1st argument;

1st argument "PTR": Pointer to the object;
2nd argument "SIZE": The size of the pointed object, 
  if the pointee of the "PTR" has a
* real type, it's the number of the elements of the type;
* void type, it's the number of bytes; 
3rd argument "ACCESS_MODE": 
  -1: Unknown access semantics
   0: none
   1: read_only
   2: write_only
   3: read_write

Notes:
  A. This new internal function is intended for more general use by all 3
attributes, "access", "alloc_size", and the new "counted_by", to attach the
"size" and "access_mode" information to the corresponding pointer (in order to
resolve PR96503, etc.: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503).
  B. For the "counted_by" and "alloc_size" attributes, the 3rd argument will be -1.

  C. In this writeup, we focus on the implementation details for the
"counted_by" attribute. However, this function should be ready to be used by
"access" and "alloc_size" without issue.

2.3 A new semantic requirement in the user documentation of "counted_by"

For the following structure including a FAM with a counted_by attribute:

  struct A
  {
   size_t size;
   char buf[] __attribute__((counted_by(size)));
  };

for any object with such type:

  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));

The size field should be set before the first reference to the FAM field.

This requirement on the user guarantees that the first reference to the FAM
knows the size of the FAM.

We need to add this additional requirement to the user documentation.

2.4 Replace FAM field accesses with the new function ACCESS_WITH_SIZE

In C FE:

for every reference to a FAM, for example, "obj->buf" in the small example,
  check whether the corresponding FIELD_DECL has a "counted_by" attribute?
  if YES, replace the reference to "obj->buf" with a call to
  .ACCESS_WITH_SIZE (obj->buf, obj->size, -1); 
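
The effect of this replacement can be modelled at the source level. The
function access_with_size below is a hypothetical user-level stand-in, not the
internal function itself: it only shows that the pointer is returned unchanged
while the size becomes an explicit operand of the access. The attribute is left
off the struct so that the sketch compiles everywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct A
{
  size_t size;
  char buf[];   /* counted_by (size) in the proposal; omitted here */
};

/* Hypothetical model of .ACCESS_WITH_SIZE (PTR, SIZE, ACCESS_MODE):
   returns PTR unchanged.  SIZE and ACCESS_MODE are carried as operands
   so that the access visibly depends on them.  */
static char *
access_with_size (char *ptr, size_t size, int access_mode)
{
  (void) size;
  (void) access_mode;
  return ptr;
}

/* What the user writes:          obj->buf[0] = 2;
   What the rewrite resembles:    the statement in the body below.  */
static void
store_first (struct A *obj)
{
  assert (obj != NULL && obj->size > 0);
  access_with_size (obj->buf, obj->size, -1)[0] = 2;
}
```

Since obj->size is now read as an argument of the call, a store to obj->size
can no longer be moved across the access without changing that argument.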

2.5 Query the size info 

There are multiple consumers of the size info (and ACCESS_MODE info):

  * __builtin_dynamic_object_size;
  * array bound sanitizer;

in these consumers, get the size info from the 2nd argument of the call to
ACCESS_WITH_SIZE (PTR, SIZE, -1)

2.6 Eliminate the internal function when not useful anymore

After the last consumer of the size information in the ACCESS_WITH_SIZE, We 
should replace the internal call wi

Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-10-31 Thread Qing Zhao


> On Oct 31, 2023, at 1:35 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-10-31 12:26, Qing Zhao wrote:
>> Hi,
>> I wrote a summary based on our extensive discussion, hopefully this can be 
>> served as an informal proposal.
>> Please take a look at it and let me know any comment or suggestion.
>> There are some (???) in the section 3.2 and 3.6, those are my questions 
>> seeking for help.  -:)
>> Thanks again for all the help.
>> Qing.
>> 
>> Represent the missing dependence for the "counted_by" attribute and its 
>> consumers
>> Qing Zhao
>> 10/30/2023
>> ==
>> The whole discussion is at:
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633783.html
>> 1. The problem
>> There is a data dependency between the size assignment and the implicit use 
>> of the size information in the __builtin_dynamic_object_size that is missing 
>> in the IL (line 11 and line 13 in the below example). Such information 
>> missing will result incorrect code reordering and other code transformations.
>>   1 struct A
>>   2 {
>>   3  size_t size;
>>   4  char buf[] __attribute__((counted_by(size)));
>>   5 };
>>   6
>>   7 size_t
>>   8 foo (size_t sz)
>>   9 {
>>  10  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>>  11  obj->size = sz;
>>  12  obj->buf[0] = 2;
>>  13  return __builtin_dynamic_object_size (obj->buf, 1);
>>  14 }
>>   Please see a more complicate example in the Appendex 1.
>> We need to represent such data dependency correctly in the IL.
>> 2. The solution:
>> 2.1 Summary
>> * Add a new internal function "ACCESS_WITH_SIZE" to carry the size 
>> information for every FAM field access;
>> * In C FE, Replace every FAM field access whose TYPE has the "counted_by" 
>> attribute with the new internal function "ACCESS_WITH_SIZE";
>> * In every consumer of the size information, for example, BDOS or array 
>> bound sanitizer, query the size information or ACCESS_MODE information from 
>> the new internal function;
>> * When the size information and the "ACCESS_MODE" information are not used 
>> anymore, possibly at the 2nd object size phase, replace the internal 
>> function with the actual FAM field access;
>> * Some adjustment to inlining heuristic and some SSA passes to mitigate the 
>> impact to the optimizer and code generation.
>> 2.2 The new internal function
>>   .ACCESS_WITH_SIZE (PTR, SIZE, ACCESS_MODE)
>> INTERNAL_FN (ACCESS_WITH_SIZE, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
>> which returns the "PTR" same as the 1st argument;
>> 1st argument "PTR": Pointer to the object;
>> 2nd argument "SIZE": The size of the pointed object,
>>   if the pointee of the "PTR" has a
>> * real type, it's the number of the elements of the type;
>> * void type, it's the number of bytes;
>> 3rd argument "ACCESS_MODE":
>>   -1: Unknown access semantics
>>0: none
>>1: read_only
>>2: write_only
>>3: read_write
>> NOTEs,
>>   A. This new internal function is intended for a more general use from all 
>> the 3 attributes, "access", "alloc_size", and the new "counted_by", to 
>> encode the "size" and "access_mode" information to the corresponding 
>> pointer. (in order to resolve PR96503, etc. 
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503)
>>   B. For "counted_by" and "alloc_size" attributes, the 3rd argument will be 
>> -1.
>>   C. In this wrieup, we focus on the implementation details for the 
>> "counted_by" attribute. However, this function should be ready to be used by 
>> "access" and "alloc_size" without issue.
>> 2.3 A new semantic requirement in the user documentation of "counted_by"
>> For the following structure including a FAM with a counted_by attribute:
>>   struct A
>>   {
>>size_t size;
>>char buf[] __attribute__((counted_by(size)));
>>   };
>> for any object with such type:
>>   struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>> The setting to the size field should be done before the first reference to 
>> the FAM field.
> 
> A more flexible specification could be stating that validation for a 
> reference to the FAM field will use the latest value assigned t

Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-01 Thread Qing Zhao


> On Oct 31, 2023, at 6:14 PM, Joseph Myers  wrote:
> 
> On Tue, 31 Oct 2023, Qing Zhao wrote:
> 
>> 2.3 A new semantic requirement in the user documentation of "counted_by"
>> 
>> For the following structure including a FAM with a counted_by attribute:
>> 
>>  struct A
>>  {
>>   size_t size;
>>   char buf[] __attribute__((counted_by(size)));
>>  };
>> 
>> for any object with such type:
>> 
>>  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>> 
>> The setting to the size field should be done before the first reference 
>> to the FAM field.
>> 
>> Such requirement to the user will guarantee that the first reference to 
>> the FAM knows the size of the FAM.
>> 
>> We need to add this additional requirement to the user document.
> 
> Make sure the manual is very specific about exactly when size is 
> considered to be an accurate representation of the space available for buf 
> (given that, after malloc or realloc, it's going to be temporarily 
> inaccurate).  If the intent is that inaccurate size at such a time means 
> undefined behavior, say so explicitly.

Yes, good point. We need to define this clearly from the beginning.
We need to say explicitly that

the size of the FAM is defined by the latest “counted_by” value, and it is
undefined behavior if the size field has not been set when the FAM is
referenced.

Is the above good enough?


> 
>> 2.4 Replace FAM field accesses with the new function ACCESS_WITH_SIZE
>> 
>> In C FE:
>> 
>> for every reference to a FAM, for example, "obj->buf" in the small example,
>>  check whether the corresponding FIELD_DECL has a "counted_by" attribute?
>>  if YES, replace the reference to "obj->buf" with a call to
>>  .ACCESS_WITH_SIZE (obj->buf, obj->size, -1); 
> 
> This seems plausible - but you should also consider the case of static 
> initializers - remember the GNU extension for statically allocated objects 
> with flexible array members (unless you're not allowing it with 
> counted_by).
> 
> static struct A x = { sizeof "hello", "hello" };
> static char *y = &x.buf;
> 
> I'd expect that to be valid - and unless you say such a usage is invalid, 

At this moment, I think that this should be valid.

I.e., the following:

struct A
{
 size_t size;
 char buf[] __attribute__((counted_by(size)));
};

static struct A x = {sizeof "hello", "hello"};

should be valid, and x.size represents the number of elements of x.buf.
Both x.size and x.buf are initialized statically.

> you should avoid the replacement in such a static initializer context when 
> the FAM reference is to an object with a constant address (if 
> .ACCESS_WITH_SIZE would not act as an lvalue whose address is a constant 
> expression; if it works fine as a constant-address lvalue, then the 
> replacement would be OK).

Then, if such usage of “counted_by” is valid, we need to replace the FAM
reference with a call to .ACCESS_WITH_SIZE as well.
Otherwise the “counted_by” relationship will be lost to the middle end.

With the current definition of .ACCESS_WITH_SIZE

PTR = .ACCESS_WITH_SIZE (PTR, SIZE, ACCESS_MODE)

isn’t the PTR (the return value of the call) an lvalue?

Qing
> 
> -- 
> Joseph S. Myers
> jos...@codesourcery.com



Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-01 Thread Qing Zhao


> On Nov 1, 2023, at 11:00 AM, Martin Uecker  wrote:
> 
> Am Mittwoch, dem 01.11.2023 um 14:47 + schrieb Qing Zhao:
>> 
>>> On Oct 31, 2023, at 6:14 PM, Joseph Myers  wrote:
>>> 
>>> On Tue, 31 Oct 2023, Qing Zhao wrote:
>>> 
>>>> 2.3 A new semantic requirement in the user documentation of "counted_by"
>>>> 
>>>> For the following structure including a FAM with a counted_by attribute:
>>>> 
>>>> struct A
>>>> {
>>>>  size_t size;
>>>>  char buf[] __attribute__((counted_by(size)));
>>>> };
>>>> 
>>>> for any object with such type:
>>>> 
>>>> struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>>>> 
>>>> The setting to the size field should be done before the first reference 
>>>> to the FAM field.
>>>> 
>>>> Such requirement to the user will guarantee that the first reference to 
>>>> the FAM knows the size of the FAM.
>>>> 
>>>> We need to add this additional requirement to the user document.
>>> 
>>> Make sure the manual is very specific about exactly when size is 
>>> considered to be an accurate representation of the space available for buf 
>>> (given that, after malloc or realloc, it's going to be temporarily 
>>> inaccurate).  If the intent is that inaccurate size at such a time means 
>>> undefined behavior, say so explicitly.
>> 
>> Yes, good point. We need to define this clearly in the beginning. 
>> We need to explicit say that 
>> 
>> the size of the FAM is defined by the latest “counted_by” value. And it’s an 
>> undefined behavior when the size field is not defined when the FAM is 
>> referenced.
> 
> It is defined by the latest "counted_by" value before x.buf
> is referenced, but not the latest before x.buf is dereferenced.

Then:

The size of the FAM is defined by the latest “counted_by” value before the FAM
is referenced.
It is undefined behavior if the “counted_by” value has not been initialized
before the FAM is referenced.
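
One way to picture this "value at the point of reference" rule is a pointer
that carries its bound with it. The following is a hypothetical user-level
model only: struct view, buf_ref and view_store are illustrative names and not
part of the proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct foo
{
  size_t count;
  char buf[];   /* counted_by (count) in the proposal; omitted here */
};

/* A pointer into buf together with the bound that was current
   when the pointer was formed.  */
struct view
{
  char *ptr;
  size_t size;
};

/* Referencing x->buf: the bound is taken from x->count at this point.  */
static struct view
buf_ref (struct foo *x)
{
  assert (x != NULL);
  struct view v = { x->buf, x->count };
  return v;
}

/* Dereferencing: checked against the bound captured at reference time,
   not against the current x->count.  */
static bool
view_store (struct view v, size_t i, char c)
{
  if (i >= v.size)
    return false;
  v.ptr[i] = c;
  return true;
}
```

With x->count set to 11, a view taken, and x->count then lowered to 1, a store
to index 10 through the old view is still accepted, while a store through a
freshly taken view is rejected. This matches Martin's example quoted below.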

> 
>> 
>> Is the above good enough?
>> 
>> 
>>> 
>>>> 2.4 Replace FAM field accesses with the new function ACCESS_WITH_SIZE
>>>> 
>>>> In C FE:
>>>> 
>>>> for every reference to a FAM, for example, "obj->buf" in the small example,
>>>> check whether the corresponding FIELD_DECL has a "counted_by" attribute?
>>>> if YES, replace the reference to "obj->buf" with a call to
>>>> .ACCESS_WITH_SIZE (obj->buf, obj->size, -1); 
>>> 
>>> This seems plausible - but you should also consider the case of static 
>>> initializers - remember the GNU extension for statically allocated objects 
>>> with flexible array members (unless you're not allowing it with 
>>> counted_by).
>>> 
>>> static struct A x = { sizeof "hello", "hello" };
>>> static char *y = &x.buf;
>>> 
>>> I'd expect that to be valid - and unless you say such a usage is invalid, 
>> 
>> At this moment, I think that this should be valid.
>> 
>> I,e, the following:
>> 
>> struct A
>> {
>> size_t size;
>> char buf[] __attribute__((counted_by(size)));
>> };
>> 
>> static struct A x = {sizeof "hello", "hello"};
>> 
>> Should be valid, and x.size represents the number of elements of x.buf. 
>> Both x.size and x.buf are initialized statically. 
> 
> Joseph is talking about the compile-time initialization of y.

Okay. :-)
So this is the point where x.buf is referenced, and I think that replacing
this reference with a call to .ACCESS_WITH_SIZE is still needed.
Otherwise, the “counted_by” relationship will no longer be visible to the
middle end.


> 
>> 
>>> you should avoid the replacement in such a static initializer context when 
>>> the FAM reference is to an object with a constant address (if 
>>> .ACCESS_WITH_SIZE would not act as an lvalue whose address is a constant 
>>> expression; if it works fine as a constant-address lvalue, then the 
>>> replacement would be OK).
>> 
>> Then if such usage for the “counted_by” is valid, we need to replace the FAM 
>> reference by a call to  .ACCESS_WITH_SIZE as well.
>> Otherwise the “counted_by” relationship will be lost to the Middle end. 
>> 
>> With the current definition of .ACCESS_WITH_SIZE
>> 
>> PTR = .ACCESS_WITH_SIZE (PTR, SIZE, ACCESS_MODE)
>> 
>> Isn’t the PTR (return value of the call) a LVALUE? 
> 
> The question is whether we get an address constant
> that can be used for compile-time initialization.

Oh, I see.

So, now, PTR is already a constant in the FE, and the replacement will be

.ACCESS_WITH_SIZE (CONSTANT_ADDRESS, SIZE, ACCESS_MODE)

This looks awkward.
Should we allow this?

If it is not allowed, then the “counted_by” attribute will not work for static 
initialization. 

> 
> I think it would be good to collect a list of test
> cases and to include this example.

Yes, I will add this to the list of test cases.
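
A possible shape for such a test case, as a sketch only. The COUNTED_BY macro
and check_static_fam are names assumed for this example. Joseph's snippet
declares y as "char *"; here y is declared as a pointer to an array so that
its type matches &x.buf. Static initialization of a flexible array member is a
GNU extension.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Use counted_by only where the compiler supports it. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(m) __attribute__((counted_by(m)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(m)
#endif

struct A {
    size_t size;
    char buf[] COUNTED_BY(size);
};

/* GNU extension: statically initialized flexible array member. */
static struct A x = { sizeof "hello", "hello" };

/* Compile-time initialization from the address of the FAM. */
static char (*y)[] = &x.buf;

/* Hypothetical check: the counter and the FAM contents agree. */
static void check_static_fam(void)
{
    assert(x.size == sizeof "hello");
    assert(strcmp(*y, "hello") == 0);
    assert((char *)y == x.buf);
}
```

Whether the initializer of y is accepted once the reference is rewritten is
exactly the open question above, so this is a test candidate and not a claim
about the final behavior.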

Qing
> 
> Martin
> 
>> 
>> Qing
>>> 
>>> -- 
>>> Joseph S. Myers
>>> jos...@codesourcery.com



Help: which routine in C FE I should look at for the reference to a FAM field?

2023-11-01 Thread Qing Zhao
Joseph and Martin,

For the task of replacing every reference to a FAM field with a call to 
.ACCESS_WITH_SIZE, where in the C FE should I look?

Thanks a lot for the help.


Qing

Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-02 Thread Qing Zhao


> On Nov 2, 2023, at 3:57 AM, Richard Biener  wrote:
> 
> On Wed, Nov 1, 2023 at 3:47 PM Qing Zhao  wrote:
>> 
>> 
>> 
>>> On Oct 31, 2023, at 6:14 PM, Joseph Myers  wrote:
>>> 
>>> On Tue, 31 Oct 2023, Qing Zhao wrote:
>>> 
>>>> 2.3 A new semantic requirement in the user documentation of "counted_by"
>>>> 
>>>> For the following structure including a FAM with a counted_by attribute:
>>>> 
>>>> struct A
>>>> {
>>>>  size_t size;
>>>>  char buf[] __attribute__((counted_by(size)));
>>>> };
>>>> 
>>>> for any object with such type:
>>>> 
>>>> struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>>>> 
>>>> The setting to the size field should be done before the first reference
>>>> to the FAM field.
>>>> 
>>>> Such requirement to the user will guarantee that the first reference to
>>>> the FAM knows the size of the FAM.
>>>> 
>>>> We need to add this additional requirement to the user document.
>>> 
>>> Make sure the manual is very specific about exactly when size is
>>> considered to be an accurate representation of the space available for buf
>>> (given that, after malloc or realloc, it's going to be temporarily
>>> inaccurate).  If the intent is that inaccurate size at such a time means
>>> undefined behavior, say so explicitly.
>> 
>> Yes, good point. We need to define this clearly in the beginning.
>> We need to explicit say that
>> 
>> the size of the FAM is defined by the latest “counted_by” value. And it’s an 
>> undefined behavior when the size field is not defined when the FAM is 
>> referenced.
>> 
>> Is the above good enough?
>> 
>> 
>>> 
>>>> 2.4 Replace FAM field accesses with the new function ACCESS_WITH_SIZE
>>>> 
>>>> In C FE:
>>>> 
>>>> for every reference to a FAM, for example, "obj->buf" in the small example,
>>>> check whether the corresponding FIELD_DECL has a "counted_by" attribute?
>>>> if YES, replace the reference to "obj->buf" with a call to
>>>> .ACCESS_WITH_SIZE (obj->buf, obj->size, -1);
>>> 
>>> This seems plausible - but you should also consider the case of static
>>> initializers - remember the GNU extension for statically allocated objects
>>> with flexible array members (unless you're not allowing it with
>>> counted_by).
>>> 
>>> static struct A x = { sizeof "hello", "hello" };
>>> static char *y = &x.buf;
>>> 
>>> I'd expect that to be valid - and unless you say such a usage is invalid,
>> 
>> At this moment, I think that this should be valid.
>> 
>> I,e, the following:
>> 
>> struct A
>> {
>> size_t size;
>> char buf[] __attribute__((counted_by(size)));
>> };
>> 
>> static struct A x = {sizeof "hello", "hello"};
>> 
>> Should be valid, and x.size represents the number of elements of x.buf.
>> Both x.size and x.buf are initialized statically.
>> 
>>> you should avoid the replacement in such a static initializer context when
>>> the FAM reference is to an object with a constant address (if
>>> .ACCESS_WITH_SIZE would not act as an lvalue whose address is a constant
>>> expression; if it works fine as a constant-address lvalue, then the
>>> replacement would be OK).
>> 
>> Then if such usage for the “counted_by” is valid, we need to replace the FAM
>> reference by a call to  .ACCESS_WITH_SIZE as well.
>> Otherwise the “counted_by” relationship will be lost to the Middle end.
>> 
>> With the current definition of .ACCESS_WITH_SIZE
>> 
>> PTR = .ACCESS_WITH_SIZE (PTR, SIZE, ACCESS_MODE)
>> 
>> Isn’t the PTR (return value of the call) a LVALUE?
> 
> You probably want to specify that when a pointer to the array is taken the
> pointer has to be to the first array element (or do we want to mangle the
> 'size' accordingly for the instrumentation?).

Yes. Will add this into the user documentation.

>  You also want to specify that
> the 'size' associated with such pointer is assumed to be unchanging and
> after changing the size such pointer has to be re-obtained.

What do you mean by “re-obtained”? 

>  Plus that
> changes to the allocated object/size have to be performed through an
> lvalue where the containing type and thus the 'counted_by' attribute is
> visible.

Through an lvalue with the containing type?

Yes, will add this too. 
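
One possible reading of these two requirements, sketched under assumptions:
"re-obtained" is taken to mean forming the pointer again from the containing
object after the counter has changed, and the counter is only written through
an lvalue of the containing type. COUNTED_BY and grow_and_fill are
hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Use counted_by only where the compiler supports it. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(m) __attribute__((counted_by(m)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(m)
#endif

struct A {
    size_t size;
    char buf[] COUNTED_BY(size);
};

/* Hypothetical demo: grow the array and return the last byte written. */
static char grow_and_fill(size_t old_n, size_t new_n)
{
    assert(new_n > old_n);

    struct A *obj = malloc(sizeof(struct A) + old_n);
    if (obj == NULL)
        return 0;
    obj->size = old_n;
    char *p = obj->buf;              /* pointer tied to the old size */
    memset(p, 'a', old_n);

    struct A *grown = realloc(obj, sizeof(struct A) + new_n);
    if (grown == NULL) {
        free(obj);
        return 0;
    }
    obj = grown;
    obj->size = new_n;               /* update through the containing type */
    p = obj->buf;                    /* re-obtain the pointer after the change */
    memset(p + old_n, 'b', new_n - old_n);

    char last = p[new_n - 1];
    free(obj);
    return last;
}
```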


>  That is,
> 
> size_t *s = &a.size;
> *s = 1;
> 
> is invoking undefined behavior,

right.

> likewise modifying 'buf' (makes it a bit
> awkward since for example that wouldn't support using posix_memalign
> for allocation, though aligned_alloc would be fine).

Is there a small example of the undefined behavior in this case?

Qing
> 
> Richard.
> 
>> Qing
>>> 
>>> --
>>> Joseph S. Myers
>>> jos...@codesourcery.com
>> 



Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-02 Thread Qing Zhao


> On Nov 2, 2023, at 9:54 AM, Richard Biener  wrote:
> 
> On Thu, Nov 2, 2023 at 2:50 PM Qing Zhao  wrote:
>> 
>> 
>> 
>>> On Nov 2, 2023, at 3:57 AM, Richard Biener  
>>> wrote:
>>> 
>>> On Wed, Nov 1, 2023 at 3:47 PM Qing Zhao  wrote:
>>>> 
>>>> 
>>>> 
>>>>> On Oct 31, 2023, at 6:14 PM, Joseph Myers  wrote:
>>>>> 
>>>>> On Tue, 31 Oct 2023, Qing Zhao wrote:
>>>>> 
>>>>>> 2.3 A new semantic requirement in the user documentation of "counted_by"
>>>>>> 
>>>>>> For the following structure including a FAM with a counted_by attribute:
>>>>>> 
>>>>>> struct A
>>>>>> {
>>>>>> size_t size;
>>>>>> char buf[] __attribute__((counted_by(size)));
>>>>>> };
>>>>>> 
>>>>>> for any object with such type:
>>>>>> 
>>>>>> struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>>>>>> 
>>>>>> The setting to the size field should be done before the first reference
>>>>>> to the FAM field.
>>>>>> 
>>>>>> Such requirement to the user will guarantee that the first reference to
>>>>>> the FAM knows the size of the FAM.
>>>>>> 
>>>>>> We need to add this additional requirement to the user document.
>>>>> 
>>>>> Make sure the manual is very specific about exactly when size is
>>>>> considered to be an accurate representation of the space available for buf
>>>>> (given that, after malloc or realloc, it's going to be temporarily
>>>>> inaccurate).  If the intent is that inaccurate size at such a time means
>>>>> undefined behavior, say so explicitly.
>>>> 
>>>> Yes, good point. We need to define this clearly in the beginning.
>>>> We need to explicit say that
>>>> 
>>>> the size of the FAM is defined by the latest “counted_by” value. And it’s 
>>>> an undefined behavior when the size field is not defined when the FAM is 
>>>> referenced.
>>>> 
>>>> Is the above good enough?
>>>> 
>>>> 
>>>>> 
>>>>>> 2.4 Replace FAM field accesses with the new function ACCESS_WITH_SIZE
>>>>>> 
>>>>>> In C FE:
>>>>>> 
>>>>>> for every reference to a FAM, for example, "obj->buf" in the small 
>>>>>> example,
>>>>>> check whether the corresponding FIELD_DECL has a "counted_by" attribute?
>>>>>> if YES, replace the reference to "obj->buf" with a call to
>>>>>> .ACCESS_WITH_SIZE (obj->buf, obj->size, -1);
>>>>> 
>>>>> This seems plausible - but you should also consider the case of static
>>>>> initializers - remember the GNU extension for statically allocated objects
>>>>> with flexible array members (unless you're not allowing it with
>>>>> counted_by).
>>>>> 
>>>>> static struct A x = { sizeof "hello", "hello" };
>>>>> static char *y = &x.buf;
>>>>> 
>>>>> I'd expect that to be valid - and unless you say such a usage is invalid,
>>>> 
>>>> At this moment, I think that this should be valid.
>>>> 
>>>> I,e, the following:
>>>> 
>>>> struct A
>>>> {
>>>> size_t size;
>>>> char buf[] __attribute__((counted_by(size)));
>>>> };
>>>> 
>>>> static struct A x = {sizeof "hello", "hello"};
>>>> 
>>>> Should be valid, and x.size represents the number of elements of x.buf.
>>>> Both x.size and x.buf are initialized statically.
>>>> 
>>>>> you should avoid the replacement in such a static initializer context when
>>>>> the FAM reference is to an object with a constant address (if
>>>>> .ACCESS_WITH_SIZE would not act as an lvalue whose address is a constant
>>>>> expression; if it works fine as a constant-address lvalue, then the
>>>>> replacement would be OK).
>>>> 
>>>> Then if such usage for the “counted_by” is valid, we need to replace the 
>>>> FAM
>>>

Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-02 Thread Qing Zhao
Thanks a lot for raising these issues. 

If I understand correctly, the major question we need to answer is:

For the following example (Jakub mentioned this in an earlier message):

  1 struct S { int a; char b __attribute__((counted_by (a))) []; };
  2 struct S s;
  3 s.a = 5;
  4 char *p = &s.b[2];
  5 int i1 = __builtin_dynamic_object_size (p, 0);
  6 s.a = 3;
  7 int i2 = __builtin_dynamic_object_size (p, 0);

Should the 2nd __bdos call (line 7) get
A. the latest value of s.a (line 6) for its size?
Or B. the value when s.b was referenced (line 3, line 4)?

A should be more convenient for users of the dynamic array feature.
With B, the user has to modify the source code (adding code to “re-obtain” 
the pointer after the size was adjusted at line 6), as mentioned by Richard. 

This depends on how we design the new internal function .ACCESS_WITH_SIZE:

1. Size is passed by value to .ACCESS_WITH_SIZE, as currently designed.

PTR = .ACCESS_WITH_SIZE (PTR, SIZE, ACCESS_MODE)

2. Size is passed by reference to .ACCESS_WITH_SIZE, as Jakub suggested.

PTR = .ACCESS_WITH_SIZE (PTR, &SIZE, TYPEOFSIZE, ACCESS_MODE)

With 1, we can only provide B; the user needs to modify the source code to get 
the full dynamic array feature.
With 2, we can provide A; the user gets full support for dynamic arrays 
without restrictions on the source code. 
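
The difference between the two designs can be modeled in plain C. This is only
a model of the semantics and not how the compiler would implement it: the
struct and function names are hypothetical, and a fixed-size array stands in
for the FAM so that the object can live on the stack.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the example struct; b has fixed storage in this model. */
struct S { int a; char b[8]; };

/* Design 1: the size value is captured when the pointer is formed. */
struct ptr_by_value { char *ptr; size_t remaining; };

/* Design 2: only the address of the counter is captured. */
struct ptr_by_ref { char *ptr; size_t index; const int *count; };

static struct ptr_by_value take_by_value(struct S *s, size_t i)
{
    assert(i < sizeof s->b);
    size_t n = (size_t)s->a;
    struct ptr_by_value p = { &s->b[i], n > i ? n - i : 0 };
    return p;
}

static struct ptr_by_ref take_by_ref(struct S *s, size_t i)
{
    assert(i < sizeof s->b);
    struct ptr_by_ref p = { &s->b[i], i, &s->a };
    return p;
}

/* Model of __bdos for design 1: uses the value captured earlier. */
static size_t bdos_by_value(struct ptr_by_value p)
{
    return p.remaining;
}

/* Model of __bdos for design 2: reloads the counter at query time. */
static size_t bdos_by_ref(struct ptr_by_ref p)
{
    size_t n = (size_t)*p.count;
    return n > p.index ? n - p.index : 0;
}

/* Replays lines 3 to 7 of the example for both designs. */
static void replay_example(size_t out[4])
{
    struct S s;
    s.a = 5;
    struct ptr_by_value v = take_by_value(&s, 2);
    struct ptr_by_ref r = take_by_ref(&s, 2);
    out[0] = bdos_by_value(v);
    out[1] = bdos_by_ref(r);
    s.a = 3;
    out[2] = bdos_by_value(v);   /* answer B: still the old size */
    out[3] = bdos_by_ref(r);     /* answer A: follows the new s.a */
}
```

In this model both designs report 3 before the update; afterwards the by-value
design still reports 3 while the by-reference design reports 1.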

However, we have to pay an additional cost to support A by using 2, which 
includes:

1. .ACCESS_WITH_SIZE will become an escape point, which will further impact 
IPA optimizations and add runtime overhead. 
Then .ACCESS_WITH_SIZE will not be CONST, right? But will it still be PURE?

2. __builtin_dynamic_object_size will NOT be LEAF anymore.  This will also 
impact some IPA optimizations and add runtime overhead. 

I think the following factors should drive the decision:

1. How big is the performance impact?
2. How important is the dynamic array feature? Is adding some user 
restrictions, as Richard mentioned, a feasible way to support this feature?

Maybe we can implement 1 first and, if full support for dynamic arrays is 
needed, add 2 later? 
Or we can implement both, compare the performance difference, and then decide?

Qing




> On Nov 2, 2023, at 8:09 AM, Jakub Jelinek  wrote:
> 
> On Thu, Nov 02, 2023 at 12:52:50PM +0100, Richard Biener wrote:
>>> What I meant is to emit
>>> tmp_4 = .ACCESS_WITH_SIZE (&s.b[0], &s.a, (typeof (&s.a)) 0);
>>> p_5 = &tmp_4[2];
>>> i.e. don't associate the pointer with a value of the size, but with
>>> an address where to find the size (plus how large it is), basically escape
>>> pointer to the size at that point.  And __builtin_dynamic_object_size is 
>>> pure,
>>> so supposedly it can depend on what the escaped pointer points to.
>> 
>> Well, yeah - that would work but depend on .ACCESS_WITH_SIZE being an
>> escape point (quite bad IMHO)
> 
> That is why I've said we need to decide what cost we want to suffer because
> of that.
> 
>> and __builtin_dynamic_object_size being
>> non-const (that's probably not too bad).
> 
> It is already pure,leaf,nothrow (unlike __builtin_object_size which is 
> obviously
> const,leaf,nothrow).  Because under the hood, it can read memory when
> expanded.
> 
>>> We'd see that a particular pointer is size associated with &s.a address
>>> and would use that address cast to the type of the third argument (to
>>> preserve the exact pointer type on INTEGER_CST, though not sure, wouldn't
>>> VN CSE it anyway if one has say
>>> union U { struct S { int a; char b __attribute__((counted_by (a))) []; } s;
>>>  struct T { char c, d, e, f; char g __attribute__((counted_by (c))) 
>>> []; } t; };
>>> and
>>> .ACCESS_WITH_SIZE (&v.s.b[0], &v.s.a, (int *) 0);
>>> ...
>>> .ACCESS_WITH_SIZE (&v.t.g[0], &v.t.c, (int *) 0);
>>> ?
>> 
>> We'd probably CSE that - the usual issue of address-with-same-value.
>> 
>>> It would mean though that counted_by wouldn't be allowed to be a
>>> bit-field...
>> 
>> Yup.  We could also pass a pointer to the container though, that's good 
>> enough
>> for the escape, and pass the size by value in addition to that.
> 
> I was wondering about stuff like _BitInt.  But sure, counted_by is just an
> extension, we can just refuse counting by _BitInt in addition to counting by
> floating point, pointers, aggregates, bit-fields, or we could somehow encode
> all the needed type's properties numerically into an integral constant.
> Similarly for alias set (unless it uses 0 for reads).
> 
>   Jakub
> 



Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-02 Thread Qing Zhao


> On Nov 2, 2023, at 7:52 AM, Richard Biener  wrote:
> 
> On Thu, Nov 2, 2023 at 11:40 AM Jakub Jelinek  wrote:
>> 
>> On Thu, Nov 02, 2023 at 11:18:09AM +0100, Richard Biener wrote:
>>>> Or, if we want to pay further price, .ACCESS_WITH_SIZE could take as one of
>>>> the arguments not the size value, but its address.  Then at __bdos time
>>>> we would dereference that pointer to get the size.
>>>> So,
>>>> struct S { int a; char b __attribute__((counted_by (a))) []; };
>>>> struct S s;
>>>> s.a = 5;
>>>> char *p = &s.b[2];
>>>> int i1 = __builtin_dynamic_object_size (p, 0);
>>>> s.a = 3;
>>>> int i2 = __builtin_dynamic_object_size (p, 0);
>>>> would then yield 3 and 1 rather than 3 and 3.
>>> 
>>> I fail to see how we can get the __builtin_dynamic_object_size call
>>> data dependent on s.a, thus avoid re-ordering or even DSE of the
>>> store.
>> 
>> If &s.b[2] is lowered as
>> sz_1 = s.a;
>> tmp_2 = .ACCESS_WITH_SIZE (&s.b[0], sz_1);
>> p_3 = &tmp_2[2];
>> then sure, there is no way, you get the size from that point.
>> tree-object-size.cc tracking then determines that in a particular
>> case the pointer is size associated with sz_1 and use that value
>> as the size (with the usual adjustments for pointer arithmetics and the
>> like).
>> 
>> What I meant is to emit
>> tmp_4 = .ACCESS_WITH_SIZE (&s.b[0], &s.a, (typeof (&s.a)) 0);
>> p_5 = &tmp_4[2];
>> i.e. don't associate the pointer with a value of the size, but with
>> an address where to find the size (plus how large it is), basically escape
>> pointer to the size at that point.  And __builtin_dynamic_object_size is 
>> pure,
>> so supposedly it can depend on what the escaped pointer points to.
> 
> Well, yeah - that would work but depend on .ACCESS_WITH_SIZE being an
> escape point (quite bad IMHO) and __builtin_dynamic_object_size being
> non-const (that's probably not too bad).
> 
>> We'd see that a particular pointer is size associated with &s.a address
>> and would use that address cast to the type of the third argument (to
>> preserve the exact pointer type on INTEGER_CST, though not sure, wouldn't
>> VN CSE it anyway if one has say
>> union U { struct S { int a; char b __attribute__((counted_by (a))) []; } s;
>>  struct T { char c, d, e, f; char g __attribute__((counted_by (c))) 
>> []; } t; };
>> and
>> .ACCESS_WITH_SIZE (&v.s.b[0], &v.s.a, (int *) 0);
>> ...
>> .ACCESS_WITH_SIZE (&v.t.g[0], &v.t.c, (int *) 0);
>> ?
> 
> We'd probably CSE that - the usual issue of address-with-same-value.
> 
>> It would mean though that counted_by wouldn't be allowed to be a
>> bit-field...
> 
> Yup.  We could also pass a pointer to the container though, that's good enough
> for the escape, and pass the size by value in addition to that.

Could you explain a little bit more here? Would .ACCESS_WITH_SIZE then become

PTR = .ACCESS_WITH_SIZE (PTR, &PTR’s Container, SIZE, ACCESS_MODE)

?

> 
>>Jakub
>> 



Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-02 Thread Qing Zhao



> On Nov 2, 2023, at 8:09 AM, Jakub Jelinek  wrote:
> 
> On Thu, Nov 02, 2023 at 12:52:50PM +0100, Richard Biener wrote:
>>> What I meant is to emit
>>> tmp_4 = .ACCESS_WITH_SIZE (&s.b[0], &s.a, (typeof (&s.a)) 0);
>>> p_5 = &tmp_4[2];
>>> i.e. don't associate the pointer with a value of the size, but with
>>> an address where to find the size (plus how large it is), basically escape
>>> pointer to the size at that point.  And __builtin_dynamic_object_size is 
>>> pure,
>>> so supposedly it can depend on what the escaped pointer points to.
>> 
>> Well, yeah - that would work but depend on .ACCESS_WITH_SIZE being an
>> escape point (quite bad IMHO)
> 
> That is why I've said we need to decide what cost we want to suffer because
> of that.
> 
>> and __builtin_dynamic_object_size being
>> non-const (that's probably not too bad).
> 
> It is already pure,leaf,nothrow (unlike __builtin_object_size which is 
> obviously
> const,leaf,nothrow).  Because under the hood, it can read memory when
> expanded.
> 
>>> We'd see that a particular pointer is size associated with &s.a address
>>> and would use that address cast to the type of the third argument (to
>>> preserve the exact pointer type on INTEGER_CST, though not sure, wouldn't
>>> VN CSE it anyway if one has say
>>> union U { struct S { int a; char b __attribute__((counted_by (a))) []; } s;
>>>  struct T { char c, d, e, f; char g __attribute__((counted_by (c))) 
>>> []; } t; };
>>> and
>>> .ACCESS_WITH_SIZE (&v.s.b[0], &v.s.a, (int *) 0);
>>> ...
>>> .ACCESS_WITH_SIZE (&v.t.g[0], &v.t.c, (int *) 0);
>>> ?
>> 
>> We'd probably CSE that - the usual issue of address-with-same-value.
>> 
>>> It would mean though that counted_by wouldn't be allowed to be a
>>> bit-field...
>> 
>> Yup.  We could also pass a pointer to the container though, that's good 
>> enough
>> for the escape, and pass the size by value in addition to that.
> 
> I was wondering about stuff like _BitInt.  But sure, counted_by is just an
> extension, we can just refuse counting by _BitInt in addition to counting by
> floating point, pointers, aggregates, bit-fields, or we could somehow encode
> all the needed type's properties numerically into an integral constant.
> Similarly for alias set (unless it uses 0 for reads).

counted_by is currently limited to integer types. This should resolve this 
issue, right?

Qing
> 
>   Jakub
> 



Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-03 Thread Qing Zhao


> On Nov 3, 2023, at 2:22 AM, Jakub Jelinek  wrote:
> 
> On Fri, Nov 03, 2023 at 07:07:36AM +0100, Martin Uecker wrote:
>> Am Donnerstag, dem 02.11.2023 um 17:28 -0700 schrieb Bill Wendling:
>>> On Thu, Nov 2, 2023 at 1:36 PM Qing Zhao  wrote:
>>>> 
>>>> Thanks a lot for raising these issues.
>>>> 
>>>> If I understand correctly,  the major question we need to answer is:
>>>> 
>>>> For the following example: (Jakub mentioned this  in an early message)
>>>> 
>>>>  1 struct S { int a; char b __attribute__((counted_by (a))) []; };
>>>>  2 struct S s;
>>>>  3 s.a = 5;
>>>>  4 char *p = &s.b[2];
>>>>  5 int i1 = __builtin_dynamic_object_size (p, 0);
>>>>  6 s.a = 3;
>>>>  7 int i2 = __builtin_dynamic_object_size (p, 0);
>>>> 
>>>> Should the 2nd __bdos call (line 7) get
>>>> A. the latest value of s.a (line 6) for its size?
>>>> Or  B. the value when the s.b was referenced (line 3, line 4)?
>>>> 
>>> I personally think it should be (A). The user is specifically
>>> indicating that the size has somehow changed, and the compiler should
>>> behave accordingly.
>> 
>> 
>> One potential problem for A apart from the potential impact on
>> optimization is that the information may get lost more
>> easily. Consider:
>> 
>> char *p = &s.b[2];
>> f(&s);
>> int i = __bdos(p, 0);
>> 
>> If the compiler can not see into 'f', the information is lost
>> because f may have changed the size.
> 
> Why?  It doesn't really matter.  The options are
> A. p is at &s.b[2] associated with &s.a and int type (or size of int
>   or whatever); .ACCESS_WITH_SIZE can't be pure,

.ACCESS_WITH_SIZE will only load the size from its address; it does not write 
to memory.
So it can still be PURE, right? (It will not be CONST anymore.)

> but sure, for aliasing
>   POV we can describe it with more detail that it doesn't modify anything
>   in the pointed structure, just escapes the pointer;

If we need to do this, where in the GCC code do we need to add these details?

> __bdos can stay
>   leaf I believe;

That’s good!  (I thought __bdos would now call .ACCESS_WITH_SIZE?)

Qing

> and when expanding __bdos later on, it would just
>   dereference the associated pointer at that point (note, __bdos is
>   pure, so it has vuse but not vdef and can load from memory); if
>   f changes s.a, no problem, __bdos will load the changed value in there
> B. if .ACCESS_WITH_SIZE associates the pointer with the s.a value from that
>   point, .ACCESS_WITH_SIZE can be const, but obviously if f changes s.a,
>   __bdos later will use s.a value from the &s.b[2] spot
> 
>   Jakub
> 



Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-03 Thread Qing Zhao


> On Nov 3, 2023, at 10:46 AM, Jakub Jelinek  wrote:
> 
> On Fri, Nov 03, 2023 at 02:32:04PM +0000, Qing Zhao wrote:
>>> Why?  It doesn't really matter.  The options are
>>> A. p is at &s.b[2] associated with &s.a and int type (or size of int
>>>  or whatever); .ACCESS_WITH_SIZE can't be pure,
>> 
>> .ACCESS_WITH_SIZE will only load the size from its address, no any write to 
>> memory.
>> It still can be PURE, right? (It will not be CONST anymore).
> 
> No, it can't be pure.  Because for the IL purposes, it needs to be treated
> as if it saves that address of the counter into some unnamed global variable
> somewhere.

Okay. I see.

>> 
>>> but sure, for aliasing
>>>  POV we can describe it with more detail that it doesn't modify anything
>>>  in the pointed structure, just escapes the pointer;
>> 
>> If we need to do this, where in the gcc code we need to add these details?
> 
> I think ref_maybe_used_by_call_p_1/call_may_clobber_ref_p_1, but Richi is
> expert here.

I just checked these routines; it looks like some other non-pure internal 
functions are handled here too.
For example, 
  case IFN_UBSAN_BOUNDS:
  case IFN_UBSAN_VPTR:
  case IFN_UBSAN_OBJECT_SIZE:
  case IFN_UBSAN_PTR:
  case IFN_ASAN_CHECK:

This looks like the correct place to handle the new .ACCESS_WITH_SIZE. 
> 
>>> __bdos can stay
>>>  leaf I believe;
>> 
>> That’s good!  (I thought now _bdos will call .ACCESS_WITH_SIZE?)
> 
> No, it shouldn't call it obviously.  If tree-object-size.cc discovery tracks
> something to a pointer initialized by .ACCESS_WITH_SIZE call, then it should
> I believe recurse on the first argument of that call (say if one has
>  ptr_3 = malloc (sz_1);
>  ptr_2 = .ACCESS_WITH_SIZE (ptr_3, &ptr_3[4], ...);
> then supposedly __bdos later on should e.g. for 0/1 modes take minimum from
> ptr_3 (the size actually allocated)) and the the counter.

Yes, this is the situation I had in mind too. 
I thought this might remove the LEAF property from __bdos. :-) If not, that’s 
good.
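
As a rough model of the "take the minimum" rule Jakub describes for the
maximum-size modes: the function below is a sketch with hypothetical names and
does not correspond to any code in tree-object-size.cc. It assumes the number
of bytes known from the allocation is already available and that the counter
is loaded through a pointer at query time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model: the reported size is the smaller of what the allocation provides
   and what the counter (times the element size) claims. */
static size_t model_bdos_max(size_t alloc_bytes, const size_t *counter_ref,
                             size_t elem_size)
{
    assert(counter_ref != NULL);
    size_t count = *counter_ref;            /* loaded when queried */

    /* If count * elem_size would overflow, fall back to the allocation. */
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return alloc_bytes;

    size_t counted_bytes = count * elem_size;
    return counted_bytes < alloc_bytes ? counted_bytes : alloc_bytes;
}
```

With 16 bytes allocated, a counter of 5 single-byte elements yields 5, while a
counter of 32 is capped at 16.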

Qing
> 
>   Jakub
> 



Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-03 Thread Qing Zhao
So, based on the discussion so far, we will define .ACCESS_WITH_SIZE as 
follows:

 .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, ACCESS_MODE)

INTERNAL_FN (ACCESS_WITH_SIZE,  ECF_LEAF | ECF_NOTHROW, NULL)

which returns “REF_TO_OBJ”, the same as the 1st argument;

1st argument “REF_TO_OBJ”: Reference to the object;
2nd argument “REF_TO_SIZE”: Reference to the size of the object referenced by 
the 1st argument; 
 if the object that “REF_TO_OBJ” refers to has a
   * real type, the SIZE that “REF_TO_SIZE” refers to is the number of 
elements of that type;
   * void type, the SIZE that “REF_TO_SIZE” refers to is the number of bytes; 
3rd argument "ACCESS_MODE": 
 -1: Unknown access semantics
  0: none
  1: read_only
  2: write_only
  3: read_write

NOTES: 
 A. This new internal function is intended for more general use by all 3 
attributes, "access", "alloc_size", and the new "counted_by", to encode the 
"size" and "access_mode" information into the corresponding pointer (in order 
to resolve PR96503, etc.: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503).
 B. For the "counted_by" and "alloc_size" attributes, the 3rd argument will be 
-1.
 C. In this writeup, we focus on the implementation details for the 
"counted_by" attribute. However, this function should be ready to be used by 
"access" and "alloc_size" without issue. 

Although .ACCESS_WITH_SIZE is not PURE anymore, it only reads from the 2nd 
argument and does not modify anything in the pointed-to objects. So, we can 
adjust the IPA alias analysis phase with these details 
(ref_maybe_used_by_call_p_1/call_may_clobber_ref_p_1).

One more note: only integer types are allowed for the SIZE, and in 
tree-object-size.cc, all the SIZEs (in the attributes “access”, “alloc_size”, 
etc.) are converted to “sizetype”.  So, we don’t need to specify the type of 
the size for “REF_TO_SIZE”, since it is always an integer type and is always 
converted to “sizetype” internally. 

Let me know if you have any more comments or suggestions. 

Qing


On Nov 3, 2023, at 2:32 AM, Martin Uecker  wrote:
> 
> 
> Am Freitag, dem 03.11.2023 um 07:22 +0100 schrieb Jakub Jelinek:
>> On Fri, Nov 03, 2023 at 07:07:36AM +0100, Martin Uecker wrote:
>>> Am Donnerstag, dem 02.11.2023 um 17:28 -0700 schrieb Bill Wendling:
>>>> On Thu, Nov 2, 2023 at 1:36 PM Qing Zhao  wrote:
>>>>> 
>>>>> Thanks a lot for raising these issues.
>>>>> 
>>>>> If I understand correctly,  the major question we need to answer is:
>>>>> 
>>>>> For the following example: (Jakub mentioned this  in an early message)
>>>>> 
>>>>>  1 struct S { int a; char b __attribute__((counted_by (a))) []; };
>>>>>  2 struct S s;
>>>>>  3 s.a = 5;
>>>>>  4 char *p = &s.b[2];
>>>>>  5 int i1 = __builtin_dynamic_object_size (p, 0);
>>>>>  6 s.a = 3;
>>>>>  7 int i2 = __builtin_dynamic_object_size (p, 0);
>>>>> 
>>>>> Should the 2nd __bdos call (line 7) get
>>>>> A. the latest value of s.a (line 6) for its size?
>>>>> Or  B. the value when the s.b was referenced (line 3, line 4)?
>>>>> 
>>>> I personally think it should be (A). The user is specifically
>>>> indicating that the size has somehow changed, and the compiler should
>>>> behave accordingly.
>>> 
>>> 
>>> One potential problem for A apart from the potential impact on
>>> optimization is that the information may get lost more
>>> easily. Consider:
>>> 
>>> char *p = &s.b[2];
>>> f(&s);
>>> int i = __bdos(p, 0);
>>> 
>>> If the compiler can not see into 'f', the information is lost
>>> because f may have changed the size.
>> 
>> Why?  It doesn't really matter.  The options are
>> A. p is at &s.b[2] associated with &s.a and int type (or size of int
>>   or whatever); .ACCESS_WITH_SIZE can't be pure, but sure, for aliasing
>>   POV we can describe it with more detail that it doesn't modify anything
>>   in the pointed structure, just escapes the pointer; __bdos can stay
>>   leaf I believe; and when expanding __bdos later on, it would just
>>   dereference the associated pointer at that point (note, __bdos is
>>   pure, so it has vuse but not vdef and can load from memory); if
>>   f changes s.a, no problem, __bdos will load the changed value in there
> 
> Ah, right. Because of the reload it doesn't matter. 
> Thank you for the explanation!
> 
> Martin
> 
>> B. if .ACCESS_WITH_SIZE associates the pointer with the s.a value from that
>>   point, .ACCESS_WITH_SIZE can be const, but obviously if f changes s.a,
>>   __bdos later will use s.a value from the &s.b[2] spot



Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-03 Thread Qing Zhao


> On Nov 3, 2023, at 12:30 PM, Jakub Jelinek  wrote:
> 
> On Fri, Nov 03, 2023 at 04:20:57PM +0000, Qing Zhao wrote:
>> So, based on the discussion so far, We will define the .ACCESS_WITH_SIZE as 
>> following:
>> 
>> .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, ACCESS_MODE)
>> 
>> INTERNAL_FN (ACCESS_WITH_SIZE,  ECF_LEAF | ECF_NOTHROW, NULL)
>> 
>> which returns the “REF_TO_OBJ" same as the 1st argument;
>> 
>> 1st argument “REF_TO_OBJ": Reference to the object;
>> 2nd argument “REF_TO_SIZE”:  Reference to size of the object referenced by 
>> the 1st argument, 
>> if the object that the “REF_TO_OBJ” refered has a
>>   * real type, the SIZE that the “REF_TO_SIZE” referred is the number of the 
>> elements of the type;
>>   * void type, the SIZE that the “REF_TO_SIZE” referred is number of bytes; 
> 
> No, you can't do this.  Conversions between pointers are mostly useless in
> GIMPLE, , so you can't make decisions based on TREE_TYPE (TREE_TYPE (fnarg))
> as it could have some random completely unrelated type.
> So, the multiplication factor needs to be encoded in the arguments rather
> than derived from REF_TO_OBJ's type, and similarly the size of what
> REF_TO_SIZE points to needs to be encoded somewhere.

Okay, I see, so 2 more arguments to the new function.
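
A rough sketch of what these two extra arguments could carry, written as a
plain C model. The function name, parameter names, and the restriction to
unsigned counters of 1, 2, 4, or 8 bytes are assumptions for this illustration
and not the agreed interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model: compute the byte size described by a counter without looking at
   any pointer types.  counter_width says how many bytes to load from
   ref_to_size; elem_size is the multiplication factor.  Returns (size_t)-1
   for a width this model does not handle.  Overflow of the product is not
   handled in this sketch. */
static size_t counted_bytes(const void *ref_to_size, size_t counter_width,
                            size_t elem_size)
{
    assert(ref_to_size != NULL);
    uint64_t count;

    switch (counter_width) {
    case 1: { uint8_t v;  memcpy(&v, ref_to_size, sizeof v); count = v; break; }
    case 2: { uint16_t v; memcpy(&v, ref_to_size, sizeof v); count = v; break; }
    case 4: { uint32_t v; memcpy(&v, ref_to_size, sizeof v); count = v; break; }
    case 8: { uint64_t v; memcpy(&v, ref_to_size, sizeof v); count = v; break; }
    default:
        return (size_t)-1;   /* unknown counter width */
    }
    return (size_t)count * elem_size;
}
```

The point of the model is that both the width of the counter and the element
size arrive as explicit arguments, so nothing depends on the static type of
either pointer.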

Qing
> 
>> 3rd argument "ACCESS_MODE": 
>> -1: Unknown access semantics
>>  0: none
>>  1: read_only
>>  2: write_only
>>  3: read_write
> 
>   Jakub
> 



Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-03 Thread Qing Zhao


> On Nov 2, 2023, at 8:13 PM, Bill Wendling  wrote:
> 
> On Thu, Nov 2, 2023 at 1:00 AM Richard Biener
>  wrote:
>> 
>> On Wed, Nov 1, 2023 at 3:47 PM Qing Zhao  wrote:
>>> 
>>> 
>>> 
>>>> On Oct 31, 2023, at 6:14 PM, Joseph Myers  wrote:
>>>> 
>>>> On Tue, 31 Oct 2023, Qing Zhao wrote:
>>>> 
>>>>> 2.3 A new semantic requirement in the user documentation of "counted_by"
>>>>> 
>>>>> For the following structure including a FAM with a counted_by attribute:
>>>>> 
>>>>> struct A
>>>>> {
>>>>>  size_t size;
>>>>>  char buf[] __attribute__((counted_by(size)));
>>>>> };
>>>>> 
>>>>> for any object with such type:
>>>>> 
>>>>> struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>>>>> 
>>>>> The setting to the size field should be done before the first reference
>>>>> to the FAM field.
>>>>> 
>>>>> Such requirement to the user will guarantee that the first reference to
>>>>> the FAM knows the size of the FAM.
>>>>> 
>>>>> We need to add this additional requirement to the user document.
>>>> 
>>>> Make sure the manual is very specific about exactly when size is
>>>> considered to be an accurate representation of the space available for buf
>>>> (given that, after malloc or realloc, it's going to be temporarily
>>>> inaccurate).  If the intent is that inaccurate size at such a time means
>>>> undefined behavior, say so explicitly.
>>> 
>>> Yes, good point. We need to define this clearly in the beginning.
>>> We need to explicitly say that
>>> 
>>> the size of the FAM is defined by the latest “counted_by” value. And it’s 
>>> an undefined behavior when the size field is not defined when the FAM is 
>>> referenced.
>>> 
>>> Is the above good enough?
>>> 
>>> 
>>>> 
>>>>> 2.4 Replace FAM field accesses with the new function ACCESS_WITH_SIZE
>>>>> 
>>>>> In C FE:
>>>>> 
>>>>> for every reference to a FAM, for example, "obj->buf" in the small 
>>>>> example,
>>>>> check whether the corresponding FIELD_DECL has a "counted_by" attribute?
>>>>> if YES, replace the reference to "obj->buf" with a call to
>>>>> .ACCESS_WITH_SIZE (obj->buf, obj->size, -1);
>>>> 
>>>> This seems plausible - but you should also consider the case of static
>>>> initializers - remember the GNU extension for statically allocated objects
>>>> with flexible array members (unless you're not allowing it with
>>>> counted_by).
>>>> 
>>>> static struct A x = { sizeof "hello", "hello" };
>>>> static char *y = &x.buf;
>>>> 
>>>> I'd expect that to be valid - and unless you say such a usage is invalid,
>>> 
>>> At this moment, I think that this should be valid.
>>> 
>>> I.e., the following:
>>> 
>>> struct A
>>> {
>>> size_t size;
>>> char buf[] __attribute__((counted_by(size)));
>>> };
>>> 
>>> static struct A x = {sizeof "hello", "hello"};
>>> 
>>> Should be valid, and x.size represents the number of elements of x.buf.
>>> Both x.size and x.buf are initialized statically.
>>> 
>>>> you should avoid the replacement in such a static initializer context when
>>>> the FAM reference is to an object with a constant address (if
>>>> .ACCESS_WITH_SIZE would not act as an lvalue whose address is a constant
>>>> expression; if it works fine as a constant-address lvalue, then the
>>>> replacement would be OK).
>>> 
>>> Then if such usage for the “counted_by” is valid, we need to replace the FAM
>>> reference by a call to  .ACCESS_WITH_SIZE as well.
>>> Otherwise the “counted_by” relationship will be lost to the Middle end.
>>> 
>>> With the current definition of .ACCESS_WITH_SIZE
>>> 
>>> PTR = .ACCESS_WITH_SIZE (PTR, SIZE, ACCESS_MODE)
>>> 
>>> Isn’t the PTR (return value of the call) a LVALUE?
>> 
>> You probably want to specify that when a pointer to the array is taken the
>> pointer has to be 
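
The usage rule from section 2.3 quoted above (set the size field before the
first reference to the FAM) could look like the following in user code. This
is a minimal sketch under assumptions: COUNTED_BY, alloc_a and a_get are
hypothetical helper names, and the attribute is only applied when the
compiler reports support for it via __has_attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Use counted_by only where the compiler knows the attribute.  */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

struct A
{
  size_t size;
  char buf[] COUNTED_BY (size);
};

/* Allocate an object with room for SZ elements.  The size field is set
   before buf is referenced for the first time.  */
static struct A *
alloc_a (size_t sz)
{
  struct A *obj = malloc (sizeof (struct A) + sz * sizeof (char));
  if (obj == NULL)
    return NULL;
  obj->size = sz;             /* set the count first ...  */
  memset (obj->buf, 0, sz);   /* ... then reference the FAM.  */
  return obj;
}

/* Read element I, checking it against the current count.  */
static char
a_get (const struct A *obj, size_t i)
{
  assert (obj != NULL && i < obj->size);
  return obj->buf[i];
}
```

A caller would write struct A *a = alloc_a (4); and later free (a);.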

Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-03 Thread Qing Zhao
Yes, after today’s discussion, I think we agreed on 

1. Passing the size field by reference to .ACCESS_WITH_SIZE, as Jakub suggested.
2. Then the compiler should always be able to use the latest value of the size
field for the reference to the FAM.

As a result, there is no need to add code to the source to re-obtain the
pointer.

I will update the proposal one more time.

thanks.

Qing

> On Nov 2, 2023, at 8:28 PM, Bill Wendling  wrote:
> 
> On Thu, Nov 2, 2023 at 1:36 PM Qing Zhao  wrote:
>> 
>> Thanks a lot for raising these issues.
>> 
>> If I understand correctly,  the major question we need to answer is:
>> 
>> For the following example: (Jakub mentioned this  in an early message)
>> 
>>  1 struct S { int a; char b __attribute__((counted_by (a))) []; };
>>  2 struct S s;
>>  3 s.a = 5;
>>  4 char *p = &s.b[2];
>>  5 int i1 = __builtin_dynamic_object_size (p, 0);
>>  6 s.a = 3;
>>  7 int i2 = __builtin_dynamic_object_size (p, 0);
>> 
>> Should the 2nd __bdos call (line 7) get
>>   A. the latest value of s.a (line 6) for its size?
>> Or B. the value when s.b was referenced (line 3, line 4)?
>> 
> I personally think it should be (A). The user is specifically
> indicating that the size has somehow changed, and the compiler should
> behave accordingly.
> 
>> A should be more convenient for the user to use the dynamic array feature.
>> With B, the user has to modify the source code (to add code to “re-obtain”
>> the pointer after the size was adjusted at line 6) as mentioned by Richard.
>> 
>> This depends on how we design the new internal function .ACCESS_WITH_SIZE
>> 
>> 1. Size is passed by value to .ACCESS_WITH_SIZE as we currently designed.
>> 
>> PTR = .ACCESS_WITH_SIZE (PTR, SIZE, ACCESS_MODE)
>> 
>> 2. Size is passed by reference to .ACCESS_WITH_SIZE as Jakub suggested.
>> 
>> PTR = .ACCESS_WITH_SIZE(PTR, &SIZE, TYPEOFSIZE, ACCESS_MODE)
>> 
>> With 1, we can only provide B; the user needs to modify the source code to
>> get the full feature of dynamic array.
>> With 2, we can provide A; the user will get full support for the dynamic
>> array without restrictions in the source code.
>> 
> My understanding of ACCESS_WITH_SIZE is that it's there to add an
> explicit reference to SIZE so that the optimizers won't reorder the
> code incorrectly. If that's the case, then it should act as if
> ACCESS_WITH_SIZE wasn't even there (i.e. it's just a pointer
> dereference into the FAM). We get that with (2) it appears. It would
> be a major headache to make the user go throughout their code base to
> ensure that SIZE was either unmodified, or if it was that extra code
> must be added to ensure the expected behavior.
> 
>> However, We have to pay additional cost for supporting A by using 2, which 
>> includes:
>> 
>> 1. .ACCESS_WITH_SIZE will become an escape point, which will further impact 
>> the IPA optimizations, more runtime overhead.
>>    Then .ACCESS_WITH_SIZE will not be CONST, right? But it will still be PURE?
>> 
>> 2. __builtin_dynamic_object_size will NOT be LEAF anymore.  This will also 
>> impact some IPA optimizations, more runtime overhead.
>> 
>> I think the following are the factors that make the decision:
>> 
>> 1. How big is the performance impact?
>> 2. How important is the dynamic array feature? Is adding some user
>> restrictions, as Richard mentioned, feasible to support this feature?
>> 
>> Maybe we can implement 1 first, if the full support to the dynamic array is 
>> needed, we can add 2 then?
>> Or, we can implement both, and compare the performance difference, then 
>> decide?
>> 
>> Qing
>> 
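
The difference between the two designs discussed above (size passed by value
versus by reference) can be modeled in plain C. This is a hypothetical
user-level model, not GCC's implementation: the struct and function names are
made up for illustration, and a fixed-size array stands in for the flexible
array member so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for "struct S { int a; char b[] counted_by (a); }".  */
struct S { int a; char b[8]; };

/* Design 1: the count is copied when the pointer is formed.  */
struct ptr_by_value { char *p; size_t offset; int count; };

/* Design 2: the pointer keeps a reference to the count field.  */
struct ptr_by_ref { char *p; size_t offset; const int *count; };

/* Bytes remaining from OFFSET to the end of an array of COUNT chars.  */
static size_t
remaining (int count, size_t offset)
{
  assert (count >= 0);
  return (size_t) count > offset ? (size_t) count - offset : 0;
}

static size_t
size_by_value (struct ptr_by_value r)
{
  return remaining (r.count, r.offset);
}

static size_t
size_by_ref (struct ptr_by_ref r)
{
  return remaining (*r.count, r.offset);   /* re-reads the count */
}
```

Following the seven-line example: with s.a = 5 and a pointer to s.b[2], both
models report 3. After s.a = 3, the by-value model still reports 3 (answer
B), while the by-reference model reports 1 (answer A).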



[PATCH][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-22 Thread Qing Zhao
__builtin_clear_padding(&object) will clear all the padding bits of the object.
Actually, it doesn't involve any use of a user variable. Therefore, users do
not expect any uninitialized warning from it. It's reasonable to suppress
uninitialized warnings for all newly created uses from __builtin_clear_padding
folding.

The patch has been bootstrapped and regress tested on both x86 and aarch64.

Okay for trunk?

Thanks.

Qing

==
From cf6620005f55d4a1f782332809445c270d22cf86 Mon Sep 17 00:00:00 2001
From: qing zhao 
Date: Mon, 21 Feb 2022 16:38:31 +
Subject: [PATCH] Suppress uninitialized warnings for new created uses from
 __builtin_clear_padding folding [PR104550]

__builtin_clear_padding(&object) will clear all the padding bits of the object.
Actually, it doesn't involve any use of a user variable. Therefore, users do
not expect any uninitialized warning from it. It's reasonable to suppress
uninitialized warnings for all newly created uses from __builtin_clear_padding
folding.

PR middle-end/104550

gcc/ChangeLog:

* gimple-fold.cc (clear_padding_flush): Suppress warnings for new
created uses.
(clear_padding_emit_loop): Likewise.
(clear_padding_type): Likewise.
(gimple_fold_builtin_clear_padding): Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/auto-init-pr104550-1.c: New test.
* gcc.dg/auto-init-pr104550-2.c: New test.
* gcc.dg/auto-init-pr104550-3.c: New test.
---
 gcc/gimple-fold.cc  | 31 +++--
 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c | 10 +++
 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c | 11 
 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c | 11 
 4 files changed, 55 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 16f02c2d098..1e18ba3465a 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -4296,6 +4296,7 @@ clear_padding_flush (clear_padding_struct *buf, bool full)
 build_int_cst (buf->alias_type,
buf->off + padding_end
- padding_bytes));
+ suppress_warning (dst, OPT_Wuninitialized);
  gimple *g = gimple_build_assign (dst, src);
  gimple_set_location (g, buf->loc);
  gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
@@ -4341,6 +4342,7 @@ clear_padding_flush (clear_padding_struct *buf, bool full)
  tree dst = build2_loc (buf->loc, MEM_REF, atype,
 buf->base,
 build_int_cst (buf->alias_type, off));
+ suppress_warning (dst, OPT_Wuninitialized);
  gimple *g = gimple_build_assign (dst, src);
  gimple_set_location (g, buf->loc);
  gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
@@ -4370,6 +4372,7 @@ clear_padding_flush (clear_padding_struct *buf, bool full)
atype = build_aligned_type (type, buf->align);
  tree dst = build2_loc (buf->loc, MEM_REF, atype, buf->base,
 build_int_cst (buf->alias_type, off));
+ suppress_warning (dst, OPT_Wuninitialized);
  tree src;
  gimple *g;
  if (all_ones
@@ -4420,6 +4423,7 @@ clear_padding_flush (clear_padding_struct *buf, bool full)
 build_int_cst (buf->alias_type,
buf->off + end
- padding_bytes));
+ suppress_warning (dst, OPT_Wuninitialized);
  gimple *g = gimple_build_assign (dst, src);
  gimple_set_location (g, buf->loc);
  gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
@@ -4620,14 +4624,18 @@ clear_padding_emit_loop (clear_padding_struct *buf, tree type,
   gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
   clear_padding_type (buf, type, buf->sz, for_auto_init);
   clear_padding_flush (buf, true);
-  g = gimple_build_assign (buf->base, POINTER_PLUS_EXPR, buf->base,
-  size_int (buf->sz));
+  tree rhs = fold_build2 (POINTER_PLUS_EXPR, TREE_TYPE (buf->base),
+ buf->base, size_int (buf->sz));
+  suppress_warning (rhs, OPT_Wuninitialized);
+  g = gimple_build_assign (buf->base, rhs);
   gimple_set_location (g, buf->loc);
   gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
   g = gimple_build_label (l2);
   gimple_set_location (g, buf->loc);
   gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
-  g = gimple_

Re: [PATCH][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-23 Thread Qing Zhao
Hi, Richard,

> On Feb 23, 2022, at 1:38 AM, Richard Biener  wrote:
> 
> On Tue, 22 Feb 2022, Qing Zhao wrote:
> 
>> __builtin_clear_padding(&object) will clear all the padding bits of the 
>> object.
>> actually, it doesn't involve any use of an user variable. Therefore, users do
>> not expect any uninitialized warning from it. It's reasonable to suppress
>> uninitialized warnings for all new created uses from __builtin_clear_padding
>> folding.
>> 
>> The patch has been bootstrapped and regress tested on both x86 and aarch64.
>> 
>> Okay for trunk?
>> 
>> Thanks.
>> 
>> Qing
>> 
>> ==
>> From cf6620005f55d4a1f782332809445c270d22cf86 Mon Sep 17 00:00:00 2001
>> From: qing zhao 
>> Date: Mon, 21 Feb 2022 16:38:31 +
>> Subject: [PATCH] Suppress uninitialized warnings for new created uses from
>> __builtin_clear_padding folding [PR104550]
>> 
>> __builtin_clear_padding(&object) will clear all the padding bits of the 
>> object.
>> actually, it doesn't involve any use of an user variable. Therefore, users do
>> not expect any uninitialized warning from it. It's reasonable to suppress
>> uninitialized warnings for all new created uses from __builtin_clear_padding
>> folding.
>> 
>>  PR middle-end/104550
>> 
>> gcc/ChangeLog:
>> 
>>  * gimple-fold.cc (clear_padding_flush): Suppress warnings for new
>>  created uses.
>>  (clear_padding_emit_loop): Likewise.
>>  (clear_padding_type): Likewise.
>>  (gimple_fold_builtin_clear_padding): Likewise.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * gcc.dg/auto-init-pr104550-1.c: New test.
>>  * gcc.dg/auto-init-pr104550-2.c: New test.
>>  * gcc.dg/auto-init-pr104550-3.c: New test.
>> ---
>> gcc/gimple-fold.cc  | 31 +++--
>> gcc/testsuite/gcc.dg/auto-init-pr104550-1.c | 10 +++
>> gcc/testsuite/gcc.dg/auto-init-pr104550-2.c | 11 
>> gcc/testsuite/gcc.dg/auto-init-pr104550-3.c | 11 
>> 4 files changed, 55 insertions(+), 8 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c
>> 
>> diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
>> index 16f02c2d098..1e18ba3465a 100644
>> --- a/gcc/gimple-fold.cc
>> +++ b/gcc/gimple-fold.cc
>> @@ -4296,6 +4296,7 @@ clear_padding_flush (clear_padding_struct *buf, bool 
>> full)
>>   build_int_cst (buf->alias_type,
>>  buf->off + padding_end
>>  - padding_bytes));
>> +  suppress_warning (dst, OPT_Wuninitialized);
>>gimple *g = gimple_build_assign (dst, src);
>>gimple_set_location (g, buf->loc);
>>gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
>> @@ -4341,6 +4342,7 @@ clear_padding_flush (clear_padding_struct *buf, bool 
>> full)
>>tree dst = build2_loc (buf->loc, MEM_REF, atype,
>>   buf->base,
>>   build_int_cst (buf->alias_type, off));
>> +  suppress_warning (dst, OPT_Wuninitialized);
>>gimple *g = gimple_build_assign (dst, src);
>>gimple_set_location (g, buf->loc);
>>gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
>> @@ -4370,6 +4372,7 @@ clear_padding_flush (clear_padding_struct *buf, bool 
>> full)
>>  atype = build_aligned_type (type, buf->align);
>>tree dst = build2_loc (buf->loc, MEM_REF, atype, buf->base,
>>   build_int_cst (buf->alias_type, off));
>> +  suppress_warning (dst, OPT_Wuninitialized);
>>tree src;
>>gimple *g;
>>if (all_ones
>> @@ -4420,6 +4423,7 @@ clear_padding_flush (clear_padding_struct *buf, bool 
>> full)
>>   build_int_cst (buf->alias_type,
>>  buf->off + end
>>  - padding_bytes));
>> +  suppress_warning (dst, OPT_Wuninitialized);
>>gimple *g = gimple_build_assign (dst, src);
>>gimple_set_location (g, buf->loc);
>>gsi_insert_before (buf->gsi, g, 

Re: [PATCH][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-23 Thread Qing Zhao


> On Feb 23, 2022, at 11:49 AM, Jakub Jelinek  wrote:
> 
> On Wed, Feb 23, 2022 at 05:33:57PM +0000, Qing Zhao wrote:
>> From my understanding, __builtin_clear_padding (&object), does not _use_ any 
>> variable,
>> therefore, no uninitialized usage warning should be emitted for it. 
> 
> __builtin_clear_padding (&object)
> sometimes expands to roughly:
> *(int *)((char *)&object + 32) = 0;
> etc., in that case it shouldn't be suppressed in any way, it doesn't read
> anything, only stores.
> Or at other times it is:
> *(int *)((char *)&object + 32) &= 0xfec7dab1;
> etc., in that case it reads bytes from the object which can be
> uninitialized, we mask some bits off and store.

Okay, I see. 
So, warnings should be suppressed only for the MEM_REF that is read first.
Only one of the 4 MEM_REFs qualifies, namely the following one (line 4371 and
then line 4382):

4371   tree dst = build2_loc (buf->loc, MEM_REF, atype, buf->base,
4372  build_int_cst (buf->alias_type, off));
4373   tree src;
4374   gimple *g;
4375   if (all_ones
4376   && nonzero_first == start
4377   && nonzero_last == start + eltsz)
4378 src = build_zero_cst (type);
4379   else
4380 {
4381   src = make_ssa_name (type);
4382   g = gimple_build_assign (src, unshare_expr (dst));
4383   gimple_set_location (g, buf->loc);
4384   gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
4385   tree mask = native_interpret_expr (type,
4386  buf->buf + i + start,
4387  eltsz);
4388   gcc_assert (mask && TREE_CODE (mask) == INTEGER_CST);
4389   mask = fold_build1 (BIT_NOT_EXPR, type, mask);
4390   tree src_masked = make_ssa_name (type);
4391   g = gimple_build_assign (src_masked, BIT_AND_EXPR,
4392src, mask);
4393   gimple_set_location (g, buf->loc);
4394   gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
4395   src = src_masked;
4396 }
4397   g = gimple_build_assign (dst, src);


None of the other 3 MEM_REFs is read, so we can just exclude them from
warning suppression, right?
Another question: for the above MEM_REF, should I suppress the warning for
“dst” at line 4371, or for “unshare_expr (dst)” at line 4382?

I think that we should suppress the warning for the latter, i.e.
“unshare_expr (dst)” at line 4382, right?

> 
> It is similar to what object.bitfld = 3; expands to,
> but usually only after the uninit pass.  Though, we have the
> optimize_bit_field_compare optimization, that is done very early
> and I wonder what uninit does about that.  Perhaps it ignores
> BIT_FIELD_REFs, I'd need to check that.

Yes, I see that uninitialized warning specially handles BIT_INSERT_EXPR as: 
(tree-ssa-uninit.cc)

 573   /* Do not warn if the result of the access is then used for
 574  a BIT_INSERT_EXPR. */
 575   if (lhs && TREE_CODE (lhs) == SSA_NAME)
 576 FOR_EACH_IMM_USE_FAST (luse_p, liter, lhs)
 577   {
 578 gimple *use_stmt = USE_STMT (luse_p);
 579 /* BIT_INSERT_EXPR first operand should not be considered
 580a use for the purpose of uninit warnings.  */
 
> 
> Anyway, if we want to disable uninit warnings for __builtin_clear_padding,
> we should do that with suppress_warning on the read stmts that load
> a byte (or more adjacent ones) before they are masked off and stored again,
> so that we don't warn about that.

In addition to these read stmts, shall we suppress warnings for the following:

/* Emit a runtime loop:
   for (; buf.base != end; buf.base += sz)
 __builtin_clear_padding (buf.base);  */

static void
clear_padding_emit_loop (clear_padding_struct *buf, tree type,
 tree end, bool for_auto_init)
{

i.e., should we suppress warnings for the above “buf.base != end” and
“buf.base += sz”?

Or is there no need to suppress warnings for them, since they just read the
address of the object, not the object itself?

thanks.

Qing

> 
>   Jakub
> 
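
The two expansions Jakub describes above (a pure store when a whole word is
padding, and a read-mask-store when a word mixes data and padding) can be
written out in ordinary C. This is a hand-written sketch of the shape of the
generated code, not what the folder emits: the function names, offsets and
mask value are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Case 1: the whole 32-bit word at OFF is padding.
   Only a store is needed; nothing is read from the object.  */
static void
clear_padding_store (unsigned char *obj, size_t off)
{
  uint32_t zero = 0;
  assert (obj != NULL);
  memcpy (obj + off, &zero, sizeof zero);
}

/* Case 2: the word at OFF mixes data and padding bits.
   The word is read first (this is the read that could trigger an
   uninitialized warning), the padding bits are masked off, and the
   result is stored back.  KEEP_MASK has 1 bits for the data bits.  */
static void
clear_padding_masked (unsigned char *obj, size_t off, uint32_t keep_mask)
{
  uint32_t word;
  assert (obj != NULL);
  memcpy (&word, obj + off, sizeof word);
  word &= keep_mask;
  memcpy (obj + off, &word, sizeof word);
}
```

Only the second function contains a load from the object, which is why only
that read needs its warning suppressed.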



Fwd: [PATCH 1/2][middle-end/102276] Don't emit switch-unreachable warnings for -ftrivial-auto-var-init (PR102276)

2022-02-23 Thread Qing Zhao
Ping.

Qing

Begin forwarded message:

From: Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org>
Subject: [PATCH 1/2][middle-end/102276] Don't emit switch-unreachable warnings 
for -ftrivial-auto-var-init (PR102276)
Date: February 19, 2022 at 10:22:43 AM CST
To: Richard Biener <rguent...@suse.de>, Jakub Jelinek <ja...@redhat.com>
Cc: gcc-patches Paul A Clarke via <gcc-patches@gcc.gnu.org>, Kees Cook 
<keesc...@chromium.org>
Reply-To: Qing Zhao <qing.z...@oracle.com>

Hi,

Per our discussion in the bug report 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102276

We decided to go with the following solution:

1. avoid emitting switch-unreachable warnings for -ftrivial-auto-var-init;
2. add a new option -Wtrivial-auto-var-init to emit warnings for the
switch-unreachable cases to suggest the user modify the source code;
3. update the documentation of -ftrivial-auto-var-init for the limitation on
switch-unreachable cases and introduce the new option -Wtrivial-auto-var-init.

with the above 1, we can resolve the current immediate issue of spurious 
warnings of using -ftrivial-auto-var-init to make kernel build succeed;
with the above 2, we provide the user a way to know that 
-ftrivial-auto-var-init has limitation on the switch-unreachable cases, and 
user should modify the source code to avoid this problem;
with the above 3, we will provide the user a clear documentation of the 
-ftrivial-auto-var-init and also provide suggestions how to resolve this issue.
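
The kind of source change a user could make for the switch-unreachable case
might look like the sketch below. It is an illustration under assumptions:
store_answer, f_inside and f_hoisted are hypothetical names, and store_answer
stands in for the external g() of the test case so the example is
self-contained.

```c
#include <assert.h>

/* Stand-in for g(): writes through the pointer and returns the value.  */
static int
store_answer (int *p)
{
  assert (p != 0);
  *p = 42;
  return *p;
}

/* Original shape: x is declared between the controlling expression and
   the first label, so control never passes through the declaration and
   an initialization placed there would never execute.  */
static int
f_inside (void)
{
  switch (0)
    {
      int x;
    default:
      return store_answer (&x);
    }
}

/* Rewritten shape: the declaration is hoisted out of the switch, so
   control does pass through it.  */
static int
f_hoisted (void)
{
  int x;
  switch (0)
    {
    default:
      return store_answer (&x);
    }
}
```

Both functions return the same value here; the difference is only whether
the declaration of x lies on an executed path.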

There are two patches included for this bug.  This is the first one.

The patches have been bootstrapped and regression tested on both x86 and aarch64.

Okay for commit?

Thanks.

Qing.

===

From 65bc9607ff35ad49e5501ec5c392293c5b6358d0 Mon Sep 17 00:00:00 2001
From: Qing Zhao <qing.z...@oracle.com>
Date: Fri, 18 Feb 2022 15:35:53 +
Subject: [PATCH 1/2] Don't emit switch-unreachable warnings for
-ftrivial-auto-var-init (PR102276)

for the following testing case:
 1 int g(int *);
 2 int f1()
 3 {
 4 switch (0) {
 5 int x;
 6 default:
 7 return g(&x);
 8 }
 9 }
compiling with -O -ftrivial-auto-var-init causes spurious warning:
warning: statement will never be executed [-Wswitch-unreachable]
   5 | int x;
 | ^
This is due to the compiler-generated initialization at the point of
the declaration.

We could avoid the warning by adjusting the routine
"maybe_warn_switch_unreachable" to exclude the following cases:

when
flag_auto_var_init > AUTO_INIT_UNINITIALIZED
And
call to .DEFERRED_INIT

2022-02-18 Qing Zhao  <qing.z...@oracle.com>
gcc/ChangeLog:

* gimplify.cc (maybe_warn_switch_unreachable): Don't warn for
compiler-generated initializations for -ftrivial-auto-var-init.

gcc/testsuite/ChangeLog:

* gcc.dg/auto-init-pr102276-1.c: New test.
* gcc.dg/auto-init-pr102276-2.c: New test.
---
gcc/gimplify.cc |  8 -
gcc/testsuite/gcc.dg/auto-init-pr102276-1.c | 38 +
gcc/testsuite/gcc.dg/auto-init-pr102276-2.c | 38 +
3 files changed, 83 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-1.c
create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-2.c

diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index f570daa015a..4e3bbf5314d 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -2103,7 +2103,13 @@ maybe_warn_switch_unreachable (gimple_seq seq)
 && TREE_CODE (gimple_goto_dest (stmt)) == LABEL_DECL
 && DECL_ARTIFICIAL (gimple_goto_dest (stmt)))
/* Don't warn for compiler-generated gotos.  These occur
-   in Duff's devices, for example.  */;
+   in Duff's devices, for example.  */
+ ;
+  else if ((flag_auto_var_init > AUTO_INIT_UNINITIALIZED)
+ && (gimple_call_internal_p (stmt, IFN_DEFERRED_INIT)))
+ /* Don't warn for compiler-generated initializations for
+  -ftrivial-auto-var-init.  */
+ ;
  else
warning_at (gimple_location (stmt), OPT_Wswitch_unreachable,
   "statement will never be executed");
diff --git a/gcc/testsuite/gcc.dg/auto-init-pr102276-1.c b/gcc/testsuite/gcc.dg/auto-init-pr102276-1.c
new file mode 100644
index 000..d574926e0c8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/auto-init-pr102276-1.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wall -ftrivial-auto-var-init=zero" } */
+
+int g(int *);
+int f()
+{
+switch (0) {
+int x;  /* { dg-bogus "statement will never be executed" } */
+default:
+return g(&x);
+}
+}
+
+int g1(int);
+int f1()
+{
+switch (0) {
+int x; /* { dg-bogus "statement will never 

Fwd: [PATCH 2/2][middle-end/102276] Adding -Wtrivial-auto-var-init and update documentation.

2022-02-23 Thread Qing Zhao
Ping...

Qing

Begin forwarded message:

From: Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org>
Subject: [PATCH 2/2][middle-end/102276] Adding -Wtrivial-auto-var-init and 
update documentation.
Date: February 19, 2022 at 10:24:09 AM CST
To: Richard Biener <rguent...@suse.de>, Jakub Jelinek <ja...@redhat.com>
Cc: gcc-patches Paul A Clarke via <gcc-patches@gcc.gnu.org>, Kees Cook 
<keesc...@chromium.org>
Reply-To: Qing Zhao <qing.z...@oracle.com>

Hi,

This is the 2nd patch for fixing pr102276.

Adding -Wtrivial-auto-var-init and update documentation.

Adding a new warning option -Wtrivial-auto-var-init to report cases when
-ftrivial-auto-var-init cannot initialize the auto variable. At the same
time, update documentation for -ftrivial-auto-var-init to connect it with
the new warning option -Wtrivial-auto-var-init,  and add documentation
for -Wtrivial-auto-var-init.

Bootstrapped and regression tested on both x86 and aarch64.

Okay for committing?

thanks.

Qing.

==
From 4346890b8f4258489c4841f1992ba3ce816d7689 Mon Sep 17 00:00:00 2001
From: Qing Zhao <qing.z...@oracle.com>
Date: Fri, 18 Feb 2022 15:53:15 +
Subject: [PATCH 2/2] Adding -Wtrivial-auto-var-init and update documentation.

Adding a new warning option -Wtrivial-auto-var-init to report cases when
-ftrivial-auto-var-init cannot initialize the auto variable. At the same
time, update documentation for -ftrivial-auto-var-init to connect it with
the new warning option -Wtrivial-auto-var-init,  and add documentation
for -Wtrivial-auto-var-init.

2022-02-18 Qing Zhao  <qing.z...@oracle.com>
gcc/ChangeLog:

* common.opt (-Wtrivial-auto-var-init): New option.
* doc/invoke.texi (-Wtrivial-auto-var-init): Document new option.
(-ftrivial-auto-var-init): Update option;
* gimplify.cc (maybe_warn_switch_unreachable): Rename...
(maybe_warn_switch_unreachable_and_auto_init): ...to this.
(gimplify_switch_expr): Call new function.

gcc/testsuite/ChangeLog:

* gcc.dg/auto-init-pr102276-3.c: New test.
* gcc.dg/auto-init-pr102276-4.c: New test.
---
gcc/common.opt  |   4 +
gcc/doc/invoke.texi |  14 ++-
gcc/gimplify.cc | 100 +++-
gcc/testsuite/gcc.dg/auto-init-pr102276-3.c |  40 
gcc/testsuite/gcc.dg/auto-init-pr102276-4.c |  40 
5 files changed, 175 insertions(+), 23 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-3.c
create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-4.c

diff --git a/gcc/common.opt b/gcc/common.opt
index c21e5273ae3..22c95dbfa49 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -801,6 +801,10 @@ Wtrampolines
Common Var(warn_trampolines) Warning
Warn whenever a trampoline is generated.

+Wtrivial-auto-var-init
+Common Var(warn_trivial_auto_var_init) Warning Init(0)
+Warn about where -ftrivial-auto-var-init cannot initialize the auto variable.
+
Wtype-limits
Common Var(warn_type_limits) Warning EnabledBy(Wextra)
Warn if a comparison is always true or always false due to the limited range of the data type.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e1a00c80307..c61a5b4b4a5 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -399,7 +399,7 @@ Objective-C and Objective-C++ Dialects}.
-Wswitch  -Wno-switch-bool  -Wswitch-default  -Wswitch-enum @gol
-Wno-switch-outside-range  -Wno-switch-unreachable  -Wsync-nand @gol
-Wsystem-headers  -Wtautological-compare  -Wtrampolines  -Wtrigraphs @gol
--Wtsan -Wtype-limits  -Wundef @gol
+-Wtrivial-auto-var-init -Wtsan -Wtype-limits  -Wundef @gol
-Wuninitialized  -Wunknown-pragmas @gol
-Wunsuffixed-float-constants  -Wunused @gol
-Wunused-but-set-parameter  -Wunused-but-set-variable @gol
@@ -6953,6 +6953,14 @@ This warning is enabled by default for C and C++ programs.
Warn when @code{__sync_fetch_and_nand} and @code{__sync_nand_and_fetch}
built-in functions are used.  These functions changed semantics in GCC 4.4.

+@item -Wtrivial-auto-var-init
+@opindex Wtrivial-auto-var-init
+@opindex Wno-trivial-auto-var-init
+Warn when @code{-ftrivial-auto-var-init} cannot initialize the automatic
+variable.  A common situation is an automatic variable that is declared
+between the controlling expression and the first case label of a @code{switch}
+statement.
+
@item -Wunused-but-set-parameter
@opindex Wunused-but-set-parameter
@opindex Wno-unused-but-set-parameter
@@ -12314,6 +12322,10 @@ initializer as uninitialized, @option{-Wuninitialized} and
warning messages on such automatic variables.
With this option, GCC will also initialize any padding of automatic variables
that have structure or union types to zeroes.
+However, the current implementation cannot initialize automatic variables that
+are declared between the controlling expression and the first case of a
+

Re: [PATCH][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-24 Thread Qing Zhao


> On Feb 24, 2022, at 2:46 AM, Richard Biener  wrote:
> 
> On Wed, 23 Feb 2022, Qing Zhao wrote:
> 
>> 
>> 
>>> On Feb 23, 2022, at 11:49 AM, Jakub Jelinek  wrote:
>>> 
>>> On Wed, Feb 23, 2022 at 05:33:57PM +, Qing Zhao wrote:
>>>> From my understanding, __builtin_clear_padding (&object), does not _use_ 
>>>> any variable,
>>>> therefore, no uninitialized usage warning should be emitted for it. 
>>> 
>>> __builtin_clear_padding (&object)
>>> sometimes expands to roughly:
>>> *(int *)((char *)&object + 32) = 0;
>>> etc., in that case it shouldn't be suppressed in any way, it doesn't read
>>> anything, only stores.
>>> Or at other times it is:
>>> *(int *)((char *)&object + 32) &= 0xfec7dab1;
>>> etc., in that case it reads bytes from the object which can be
>>> uninitialized, we mask some bits off and store.
>> 
>> Okay, I see. 
>> So, warnings should be suppressed only for the MEM_REF that is read first.
>> Only one of the 4 MEM_REFs qualifies, namely the following one (line 4371
>> and then line 4382):
>> 
>> 4371   tree dst = build2_loc (buf->loc, MEM_REF, atype, buf->base,
>> 4372  build_int_cst (buf->alias_type, off));
>> 4373   tree src;
>> 4374   gimple *g;
>> 4375   if (all_ones
>> 4376   && nonzero_first == start
>> 4377   && nonzero_last == start + eltsz)
>> 4378 src = build_zero_cst (type);
>> 4379   else
>> 4380 {
>> 4381   src = make_ssa_name (type);
>> 4382   g = gimple_build_assign (src, unshare_expr (dst));
>> 4383   gimple_set_location (g, buf->loc);
>> 4384   gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
>> 4385   tree mask = native_interpret_expr (type,
>> 4386  buf->buf + i + start,
>> 4387  eltsz);
>> 4388   gcc_assert (mask && TREE_CODE (mask) == INTEGER_CST);
>> 4389   mask = fold_build1 (BIT_NOT_EXPR, type, mask);
>> 4390   tree src_masked = make_ssa_name (type);
>> 4391   g = gimple_build_assign (src_masked, BIT_AND_EXPR,
>> 4392src, mask);
>> 4393   gimple_set_location (g, buf->loc);
>> 4394   gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
>> 4395   src = src_masked;
>> 4396 }
>> 4397   g = gimple_build_assign (dst, src);
>> 
>> 
>> None of the other 3 MEM_REFs is read, so we can just exclude them from
>> warning suppression, right?
>> Another question: for the above MEM_REF, should I suppress the warning for
>> “dst” at line 4371, or for “unshare_expr (dst)” at line 4382?
>> 
>> I think that we should suppress the warning for the latter, i.e.
>> “unshare_expr (dst)” at line 4382, right?
> 
> Yes, the one that's put into the GIMPLE stmt.

Okay.
> 
>>> 
>>> It is similar to what object.bitfld = 3; expands to,
>>> but usually only after the uninit pass.  Though, we have the
>>> optimize_bit_field_compare optimization, that is done very early
>>> and I wonder what uninit does about that.  Perhaps it ignores
>>> BIT_FIELD_REFs, I'd need to check that.
>> 
>> Yes, I see that uninitialized warning specially handles BIT_INSERT_EXPR as: 
>> (tree-ssa-uninit.cc)
>> 
>>   /* Do not warn if the result of the access is then used for
>>      a BIT_INSERT_EXPR. */
>>   if (lhs && TREE_CODE (lhs) == SSA_NAME)
>>     FOR_EACH_IMM_USE_FAST (luse_p, liter, lhs)
>>       {
>>         gimple *use_stmt = USE_STMT (luse_p);
>>         /* BIT_INSERT_EXPR first operand should not be considered
>>            a use for the purpose of uninit warnings.  */
> 
> That follows the COMPLEX_EXPR handling I think.
> 
>>> 
>>> Anyway, if we want to disable uninit warnings for __builtin_clear_padding,
>>> we should do that with suppress_warning on the read stmts that load
>>> a byte (or more adjacent ones) before they are masked off and stored again,
>>> so that we don't warn about that.
>> 
>> In addition to these read stmts, shall we suppress warnings for the following:
>> 
>> /* Emit a runtime loop:
>>   for (; buf.base != end; buf.base += sz)
>> __builtin_clear_padding (buf.base);  */
>> 
>> static void
>> clear_padding_emit_loop (clear_padding_struct *buf, tree type,
>> tree end, bool for_auto_init)
>> {
>> 
>> i.e., should we suppress warnings for the above “buf.base != end” and
>> “buf.base += sz”?
>> 
>> There is no need to suppress warnings for them, since they only read the
>> address of the object, not the object itself, right?
> 
> No need to suppress those indeed.

agreed.
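
As a rough plain-C analogue of the runtime loop discussed above (a sketch only; `clear_elem_padding`, `clear_padding_loop`, and the one-byte mask are hypothetical stand-ins, not what clear_padding_emit_loop actually emits), the loop control touches only addresses, while the element body is the only place that reads object memory:

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-element padding clear.  */
static void
clear_elem_padding (unsigned char *elem, unsigned char padding_mask)
{
  elem[0] = (unsigned char) (elem[0] & (unsigned char) ~padding_mask);
}

/* Analogue of:
     for (; buf.base != end; buf.base += sz)
       __builtin_clear_padding (buf.base);
   Returns the number of elements visited.  */
static size_t
clear_padding_loop (unsigned char *base, unsigned char *end, size_t sz,
                    unsigned char padding_mask)
{
  size_t n = 0;
  /* "base != end" and "base += sz" compare and advance pointers; neither
     dereferences the object, so there is nothing for the uninit pass to
     complain about here.  */
  for (; base != end; base += sz)
    {
      /* The only read of object memory happens inside this call.  */
      clear_elem_padding (base, padding_mask);
      n++;
    }
  return n;
}
```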

thanks.

Will send out the new version soon.

Qing
> 
> Richard.



[PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-24 Thread Qing Zhao
Hi, 

This is the 2nd version for this bug per our discussion.

Compared to the previous patch, this patch ONLY suppresses warnings for the 
fake read that was introduced with folding. 
The patch has been bootstrapped and regress tested on both x86 and aarch64.
Okay for trunk?
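
For readers who have not used the builtin, the following is a small user-level sketch of what __builtin_clear_padding does to the struct used in the new tests. It is an illustration only and not part of the patch; the function name `bits_left_after_clear` and the `HAVE_CLEAR_PADDING` macro are made up here, and the code assumes a compiler that provides `__has_builtin` (otherwise it falls back to returning -1).

```c
#include <string.h>

struct vx_audio_level {
  int has_monitor_level : 1;
};

#if defined(__has_builtin)
#  if __has_builtin(__builtin_clear_padding)
#    define HAVE_CLEAR_PADDING 1
#  endif
#endif

/* Fills an object with all-ones bytes, clears its padding, and returns
   the number of bits still set, or -1 if the builtin is unavailable.  */
static int
bits_left_after_clear (void)
{
#ifdef HAVE_CLEAR_PADDING
  struct vx_audio_level info;
  unsigned char bytes[sizeof info];
  int count = 0;

  memset (&info, 0xff, sizeof info);
  __builtin_clear_padding (&info);
  memcpy (bytes, &info, sizeof info);
  for (size_t i = 0; i < sizeof info; i++)
    for (unsigned char b = bytes[i]; b != 0; b >>= 1)
      count += b & 1;
  return count;
#else
  return -1;
#endif
}
```

Only the single bit-field bit is a value bit, so when the builtin is available the count should be 1. The point relevant to this patch is that the user never reads `info` in the call itself; any read is introduced by the folding.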

Thanks.

Qing

==
From a858be0fd979e05a6f54ac340e34bf85ddbc7067 Mon Sep 17 00:00:00 2001
From: Qing Zhao 
Date: Wed, 23 Feb 2022 23:45:10 +
Subject: [PATCH] Suppress uninitialized warnings for new created uses from 
 __builtin_clear_padding folding [PR104550]

__builtin_clear_padding (&object) clears all the padding bits of the object.
It does not actually involve any use of a user variable, so users do not
expect any uninitialized warning from it.  It is reasonable to suppress
uninitialized warnings for all newly created uses from the
__builtin_clear_padding folding.

PR middle-end/104550

gcc/ChangeLog:

* gimple-fold.cc (clear_padding_flush): Suppress warnings for new
created uses.

gcc/testsuite/ChangeLog:

* gcc.dg/auto-init-pr104550-1.c: New test.
* gcc.dg/auto-init-pr104550-2.c: New test.
* gcc.dg/auto-init-pr104550-3.c: New test.
---
 gcc/gimple-fold.cc  |  7 ++-
 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c | 10 ++
 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c | 11 +++
 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c | 11 +++
 4 files changed, 38 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 16f02c2d098..e11a775ad9f 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -4379,7 +4379,12 @@ clear_padding_flush (clear_padding_struct *buf, bool full)
  else
{
  src = make_ssa_name (type);
- g = gimple_build_assign (src, unshare_expr (dst));
+ tree tmp_dst = unshare_expr (dst);
+ /* The folding introduces a read from the tmp_dst, we should
+prevent uninitialized warning analysis from issuing warning
+for such fake read.  */
+ suppress_warning (tmp_dst, OPT_Wuninitialized);
+ g = gimple_build_assign (src, tmp_dst);
  gimple_set_location (g, buf->loc);
  gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
  tree mask = native_interpret_expr (type,
diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
new file mode 100644
index 000..a08110c3a17
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
@@ -0,0 +1,10 @@
+/* PR 104550*/
+/* { dg-do compile } */
+/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=pattern" } */
+struct vx_audio_level {
+ int has_monitor_level : 1;
+};
+
+void vx_set_monitor_level() {
+ struct vx_audio_level info; /* { dg-bogus "info" "is used uninitialized" } */
+}
diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
new file mode 100644
index 000..2c395b32d32
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
@@ -0,0 +1,11 @@
+/* PR 104550 */
+/* { dg-do compile } */
+/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=zero" } */
+struct vx_audio_level {
+ int has_monitor_level : 1;
+};
+
+void vx_set_monitor_level() {
+ struct vx_audio_level info; 
+ __builtin_clear_padding (&info);  /* { dg-bogus "info" "is used uninitialized" } */
+}
diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c b/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c
new file mode 100644
index 000..9893e37f12d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c
@@ -0,0 +1,11 @@
+/* PR 104550 */
+/* { dg-do compile } */
+/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=pattern" } */
+struct vx_audio_level {
+ int has_monitor_level : 1;
+};
+
+void vx_set_monitor_level() {
+ struct vx_audio_level info;   /* { dg-bogus "info" "is used uninitialized" } */
+ __builtin_clear_padding (&info);  /* { dg-bogus "info" "is used uninitialized" } */
+}
-- 
2.27.0



Re: [PATCH 1/2][middle-end/102276] Don't emit switch-unreachable warnings for -ftrivial-auto-var-init (PR102276)

2022-02-24 Thread Qing Zhao



> On Feb 24, 2022, at 4:10 AM, Richard Biener  wrote:
> 
> On Sat, 19 Feb 2022, Qing Zhao wrote:
> 
>> Hi,
>> 
>> Per our discussion in the bug report 
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102276
>> 
>> We decided to go with the following solution:
>> 
>> 1. avoid emitting switch-unreachable warnings for -ftrivial-auto-var-init;
>> 2. add a new option -Wtrivial-auto-var-init to emit warnings for the
>> switch-unreachable cases, suggesting that the user modify the source code;
>> 3. update the documentation of -ftrivial-auto-var-init to describe the
>> limitation on switch-unreachable cases and introduce the new option
>> -Wtrivial-auto-var-init.
>> 
>> With 1, we resolve the immediate issue of spurious warnings when
>> -ftrivial-auto-var-init is used, so that the kernel build succeeds;
>> with 2, we give the user a way to learn that -ftrivial-auto-var-init has a
>> limitation on the switch-unreachable cases and that the source code should
>> be modified to avoid the problem;
>> with 3, we give the user clear documentation of -ftrivial-auto-var-init
>> together with suggestions on how to resolve this issue.
>> 
>> There are two patches included for this bug.  This is the first one.
>> 
>> The patches have been bootstrapped and regression tested on both x86 and 
>> aarch64.
>> 
>> Okay for commit?
>> 
>> Thanks.
>> 
>> Qing.
>> 
>> ===
>> 
>> From 65bc9607ff35ad49e5501ec5c392293c5b6358d0 Mon Sep 17 00:00:00 2001
>> From: Qing Zhao 
>> Date: Fri, 18 Feb 2022 15:35:53 +
>> Subject: [PATCH 1/2] Don't emit switch-unreachable warnings for
>> -ftrivial-auto-var-init (PR102276)
>> 
>> for the following testing case:
>>  1 int g(int *);
>>  2 int f1()
>>  3 {
>>  4 switch (0) {
>>  5 int x;
>>  6 default:
>>  7 return g(&x);
>>  8 }
>>  9 }
>> compiling with -O -ftrivial-auto-var-init causes spurious warning:
>> warning: statement will never be executed [-Wswitch-unreachable]
>>5 | int x;
>>  |     ^
>> This is due to the compiler-generated initialization at the point of
>> the declaration.
>> 
>> We could avoid the warning by adjusting the routine
>> "maybe_warn_switch_unreachable" to exclude the following cases:
>> 
>> when
>> flag_auto_var_init > AUTO_INIT_UNINITIALIZED
>> And
>> call to .DEFERRED_INIT
>> 
>> 2022-02-18 Qing Zhao  
>> gcc/ChangeLog:
>> 
>>  * gimplify.cc (maybe_warn_switch_unreachable): Don't warn for
>>  compiler-generated initializations for -ftrivial-auto-var-init.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * gcc.dg/auto-init-pr102276-1.c: New test.
>>  * gcc.dg/auto-init-pr102276-2.c: New test.
>> ---
>> gcc/gimplify.cc |  8 -
>> gcc/testsuite/gcc.dg/auto-init-pr102276-1.c | 38 +
>> gcc/testsuite/gcc.dg/auto-init-pr102276-2.c | 38 +
>> 3 files changed, 83 insertions(+), 1 deletion(-)
>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-1.c
>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-2.c
>> 
>> diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
>> index f570daa015a..4e3bbf5314d 100644
>> --- a/gcc/gimplify.cc
>> +++ b/gcc/gimplify.cc
>> @@ -2103,7 +2103,13 @@ maybe_warn_switch_unreachable (gimple_seq seq)
>>&& TREE_CODE (gimple_goto_dest (stmt)) == LABEL_DECL
>>&& DECL_ARTIFICIAL (gimple_goto_dest (stmt)))
>>  /* Don't warn for compiler-generated gotos.  These occur
>> -   in Duff's devices, for example.  */;
>> +   in Duff's devices, for example.  */
>> +;
>> +  else if ((flag_auto_var_init > AUTO_INIT_UNINITIALIZED)
>> +&& (gimple_call_internal_p (stmt, IFN_DEFERRED_INIT)))
>> +/* Don't warn for compiler-generated initializations for
>> +  -ftrivial-auto-var-init.  */
>> +;
> 
> I think you want to instead skip these in warn_switch_unreachable_r
> since otherwise a .DEFERRED_INIT can silence the warning for a real
> stmt following it that is not reachable.

Oh, yeah, you are right.
Will fix this.
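
For reference, the source-level rewrite that item 2 of the plan asks users to apply can be sketched as follows. This is a minimal illustration and not part of either patch; the body of `g` and the name `f1_hoisted` are assumptions made up for the sketch.

```c
/* Hypothetical callee for the sketch: stores a value through P.  */
static int
g (int *p)
{
  *p = 42;
  return *p;
}

/* Same logic as the f1 test case, but with the declaration hoisted out
   of the switch body.  A compiler-generated initialization placed at the
   declaration is now on a reachable path, instead of sitting between the
   controlling expression and the first case label.  */
static int
f1_hoisted (void)
{
  int x;
  switch (0)
    {
    default:
      return g (&x);
    }
}
```

The behavior of the function is unchanged; only the position of the declaration moves, which gives -ftrivial-auto-var-init a place where its initialization can actually execute.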

Thanks.

Qing



Re: [PATCH 2/2][middle-end/102276] Adding -Wtrivial-auto-var-init and update documentation.

2022-02-24 Thread Qing Zhao



> On Feb 24, 2022, at 4:16 AM, Richard Biener  wrote:
> 
> On Sat, 19 Feb 2022, Qing Zhao wrote:
> 
>> Hi,
>> 
>> This is the 2nd patch for fixing pr102276.
>> 
>> Adding -Wtrivial-auto-var-init and update documentation.
>> 
>> Adding a new warning option -Wtrivial-auto-var-init to report cases when
>> -ftrivial-auto-var-init cannot initialize the auto variable. At the same
>> time, update documentation for -ftrivial-auto-var-init to connect it with
>> the new warning option -Wtrivial-auto-var-init,  and add documentation
>> for -Wtrivial-auto-var-init.
>> 
>> Bootstrapped and regression tested on both x86 and aarch64.
>> 
>> Okay for committing?
>> 
>> thanks.
>> 
>> Qing.
>> 
>> ==
>> From 4346890b8f4258489c4841f1992ba3ce816d7689 Mon Sep 17 00:00:00 2001
>> From: Qing Zhao 
>> Date: Fri, 18 Feb 2022 15:53:15 +
>> Subject: [PATCH 2/2] Adding -Wtrivial-auto-var-init and update documentation.
>> 
>> Adding a new warning option -Wtrivial-auto-var-init to report cases when
>> -ftrivial-auto-var-init cannot initialize the auto variable. At the same
>> time, update documentation for -ftrivial-auto-var-init to connect it with
>> the new warning option -Wtrivial-auto-var-init,  and add documentation
>> for -Wtrivial-auto-var-init.
>> 
>> 2022-02-18 Qing Zhao  
>> gcc/ChangeLog:
>> 
>>  * common.opt (-Wtrivial-auto-var-init): New option.
>>  * doc/invoke.texi (-Wtrivial-auto-var-init): Document new option.
>>  (-ftrivial-auto-var-init): Update option;
>>  * gimplify.cc (maybe_warn_switch_unreachable): Rename...
>>  (maybe_warn_switch_unreachable_and_auto_init): ...to this.
>>  (gimplify_switch_expr): Call new function.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * gcc.dg/auto-init-pr102276-3.c: New test.
>>  * gcc.dg/auto-init-pr102276-4.c: New test.
>> ---
>> gcc/common.opt  |   4 +
>> gcc/doc/invoke.texi |  14 ++-
>> gcc/gimplify.cc | 100 +++-
>> gcc/testsuite/gcc.dg/auto-init-pr102276-3.c |  40 
>> gcc/testsuite/gcc.dg/auto-init-pr102276-4.c |  40 
>> 5 files changed, 175 insertions(+), 23 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-3.c
>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-4.c
>> 
>> diff --git a/gcc/common.opt b/gcc/common.opt
>> index c21e5273ae3..22c95dbfa49 100644
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -801,6 +801,10 @@ Wtrampolines
>> Common Var(warn_trampolines) Warning
>> Warn whenever a trampoline is generated.
>> 
>> +Wtrivial-auto-var-init
>> +Common Var(warn_trivial_auto_var_init) Warning Init(0)
>> +Warn about where -ftrivial-auto-var-init cannot initialize the auto variable.
>> +
> 
> Warn about cases where ... initialize a variable.

Okay. 

> 
>> Wtype-limits
>> Common Var(warn_type_limits) Warning EnabledBy(Wextra)
>> Warn if a comparison is always true or always false due to the limited range of the data type.
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index e1a00c80307..c61a5b4b4a5 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -399,7 +399,7 @@ Objective-C and Objective-C++ Dialects}.
>> -Wswitch  -Wno-switch-bool  -Wswitch-default  -Wswitch-enum @gol
>> -Wno-switch-outside-range  -Wno-switch-unreachable  -Wsync-nand @gol
>> -Wsystem-headers  -Wtautological-compare  -Wtrampolines  -Wtrigraphs @gol
>> --Wtsan -Wtype-limits  -Wundef @gol
>> +-Wtrivial-auto-var-init -Wtsan -Wtype-limits  -Wundef @gol
>> -Wuninitialized  -Wunknown-pragmas @gol
>> -Wunsuffixed-float-constants  -Wunused @gol
>> -Wunused-but-set-parameter  -Wunused-but-set-variable @gol
>> @@ -6953,6 +6953,14 @@ This warning is enabled by default for C and C++ programs.
>> Warn when @code{__sync_fetch_and_nand} and @code{__sync_nand_and_fetch}
>> built-in functions are used.  These functions changed semantics in GCC 4.4.
>> 
>> +@item -Wtrivial-auto-var-init
>> +@opindex Wtrivial-auto-var-init
>> +@opindex Wno-trivial-auto-var-init
>> +Warn when @code{-ftrivial-auto-var-init} cannot initialize the automatic
>> +variable.  A common situation is an automatic variable that is declared
>> +between the controlling expression and the first case label of a @code{switch}
>> +statement.
>> +
>> @item -Wunused-but-set-paramete

Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-24 Thread Qing Zhao
I briefly checked all the uses of suppress_warning in the current GCC sources
and see that most of them are not guarded by any condition.

So the current change should be fine and should not introduce new issues. :-)

Another thing: if we use “warning_enabled_at” as the guard, I just checked, and
this routine is not used anywhere right now, so it is possible that the
routine itself has a bug.  Since we are now in stage 4, we might need to be
conservative.

Based on this, I think it might be better to put the change in as it is right
now.
If we think that every “suppress_warning” call needs to be guarded by a
specific condition, we can do that in an early stage of GCC 13.

What’s your opinion?

Qing


> On Feb 24, 2022, at 9:13 AM, Jakub Jelinek  wrote:
> 
> On Thu, Feb 24, 2022 at 04:00:33PM +0100, Richard Biener wrote:
>>>> --- a/gcc/gimple-fold.cc
>>>> +++ b/gcc/gimple-fold.cc
>>>> @@ -4379,7 +4379,12 @@ clear_padding_flush (clear_padding_struct *buf, bool full)
>>>>   else
>>>> {
>>>>   src = make_ssa_name (type);
>>>> -g = gimple_build_assign (src, unshare_expr (dst));
>>>> +tree tmp_dst = unshare_expr (dst);
>>>> +/* The folding introduces a read from the tmp_dst, we should
>>>> +   prevent uninitialized warning analysis from issuing warning
>>>> +   for such fake read.  */
>>>> +suppress_warning (tmp_dst, OPT_Wuninitialized);
>>> 
>>> I wonder if we shouldn't guard the suppress_warning call on
>>>   if (warn_uninitialized || warn_maybe_uninitialized)
>>> because those warnings aren't on by default and the suppress_warning stuff,
>>> especially when it could be done for many loads from the builtin means
>>> populating hash tables with those.
>> 
>> Maybe that's something suppress_warning should do then?  OTOH you
> 
> Well, OPT_Wuninitialized is an argument why it can't.  The suppression
> is using a single OPT_W*, but there are multiple different warnings
> that care about that suppression, and suppress_warning can't know about it.
> 
>> don't know whether you're suppressing a warning in a region with
>> -Wno-uninitialized but that's inlined into a -Wuninitialized
>> function where then the false diagnostic pops up if we didn't
>> suppress the warning ...
> 
> I think both -Wuninitialized and -Wmaybe-uninitialized aren't
> Optimization or PerFunction, so they are global options.
> On the other side, they can be locally changed through pragmas.
> 
> Maybe we could use
>  if (warning_enabled_at (buf->loc, OPT_Wuninitialized)
>  || warning_enabled_at (buf->loc, OPT_Wmaybe_uninitialized))
> if uninit pass uses the gimple_location of the read, that shouldn't
> be really changing...
> 
>   Jakub
> 



[PATCH][V3][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-24 Thread Qing Zhao
Hi, Jakub and Richard:

This is the 3rd version of the patch. The major changes compared to the
previous version are:

1. Add warning_enabled_at guard before “suppress_warning”
2. Add location to the call to __builtin_clear_padding for auto init.

The patch has been bootstrapped and regress tested on both x86 and aarch64.
Okay for trunk?

Thanks.

Qing

==
From 8314ded4ca0f59c5a3ec431c9c3768fcaf2a0949 Mon Sep 17 00:00:00 2001
From: Qing Zhao 
Date: Thu, 24 Feb 2022 22:38:38 +
Subject: [PATCH] Suppress uninitialized warnings for new created uses from
 __builtin_clear_padding folding [PR104550]

__builtin_clear_padding (&object) clears all the padding bits of the object.
It does not actually involve any use of a user variable, so users do not
expect any uninitialized warning from it.  It is reasonable to suppress
uninitialized warnings for all newly created uses from the
__builtin_clear_padding folding.

PR middle-end/104550

gcc/ChangeLog:

* gimple-fold.cc (clear_padding_flush): Suppress warnings for new
created uses.
* gimplify.cc (gimple_add_padding_init_for_auto_var): Set
location for new created call to __builtin_clear_padding.

gcc/testsuite/ChangeLog:

* gcc.dg/auto-init-pr104550-1.c: New test.
* gcc.dg/auto-init-pr104550-2.c: New test.
* gcc.dg/auto-init-pr104550-3.c: New test.
---
 gcc/gimple-fold.cc  | 11 ++-
 gcc/gimplify.cc |  1 +
 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c | 10 ++
 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c | 11 +++
 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c | 11 +++
 5 files changed, 43 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 16f02c2d098d..67b4963ffd96 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -62,6 +62,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "attribs.h"
 #include "asan.h"
 #include "diagnostic-core.h"
+#include "diagnostic.h"
 #include "intl.h"
 #include "calls.h"
 #include "tree-vector-builder.h"
@@ -4379,7 +4380,15 @@ clear_padding_flush (clear_padding_struct *buf, bool full)
  else
{
  src = make_ssa_name (type);
- g = gimple_build_assign (src, unshare_expr (dst));
+ tree tmp_dst = unshare_expr (dst);
+ /* The folding introduces a read from the tmp_dst, we should
+prevent uninitialized warning analysis from issuing warning
+for such fake read.  */
+ if (warning_enabled_at (buf->loc, OPT_Wuninitialized)
+ || warning_enabled_at (buf->loc,
+OPT_Wmaybe_uninitialized))
+   suppress_warning (tmp_dst, OPT_Wuninitialized);
+ g = gimple_build_assign (src, tmp_dst);
  gimple_set_location (g, buf->loc);
  gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
  tree mask = native_interpret_expr (type,
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index f570daa015a5..977cf458f858 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -1823,6 +1823,7 @@ gimple_add_padding_init_for_auto_var (tree decl, bool is_vla,
 
   gimple *call = gimple_build_call (fn, 2, addr_of_decl,
build_one_cst (TREE_TYPE (addr_of_decl)));
+  gimple_set_location (call, EXPR_LOCATION (decl));
   gimplify_seq_add_stmt (seq_p, call);
 }
 
diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
new file mode 100644
index ..a08110c3a170
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
@@ -0,0 +1,10 @@
+/* PR 104550*/
+/* { dg-do compile } */
+/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=pattern" } */
+struct vx_audio_level {
+ int has_monitor_level : 1;
+};
+
+void vx_set_monitor_level() {
+ struct vx_audio_level info; /* { dg-bogus "info" "is used uninitialized" } */
+}
diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
new file mode 100644
index ..2c395b32d322
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
@@ -0,0 +1,11 @@
+/* PR 104550 */
+/* { dg-do compile } */
+/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=zero" } */
+struct vx_audio_level {
+ int has_monitor_level : 1;
+};
+
+void vx_set_monitor_level() {
+ struct vx_audio_level info; 
+ __builtin_clear_padding (&info);  /* { dg-bogus "info" &

Re: [PATCH][V3][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-25 Thread Qing Zhao


> On Feb 25, 2022, at 2:38 AM, Richard Biener  wrote:
> 
> On Fri, 25 Feb 2022, Qing Zhao wrote:
> 
>> Hi, Jakub and Richard:
>> 
>> This is the 3rd version of the patch. The major changes compared to the
>> previous version are:
>> 
>> 1. Add warning_enabled_at guard before “suppress_warning”
>> 2. Add location to the call to __builtin_clear_padding for auto init.
>> 
>> The patch has been bootstrapped and regress tested on both x86 and aarch64.
>> Okay for trunk?
>> 
>> Thanks.
>> 
>> Qing
>> 
>> ======
>> From 8314ded4ca0f59c5a3ec431c9c3768fcaf2a0949 Mon Sep 17 00:00:00 2001
>> From: Qing Zhao 
>> Date: Thu, 24 Feb 2022 22:38:38 +
>> Subject: [PATCH] Suppress uninitialized warnings for new created uses from
>> __builtin_clear_padding folding [PR104550]
>> 
>> __builtin_clear_padding (&object) clears all the padding bits of the object.
>> It does not actually involve any use of a user variable, so users do not
>> expect any uninitialized warning from it.  It is reasonable to suppress
>> uninitialized warnings for all newly created uses from the
>> __builtin_clear_padding folding.
>> 
>>  PR middle-end/104550
>> 
>> gcc/ChangeLog:
>> 
>>  * gimple-fold.cc (clear_padding_flush): Suppress warnings for new
>>  created uses.
>>  * gimplify.cc (gimple_add_padding_init_for_auto_var): Set
>>  location for new created call to __builtin_clear_padding.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * gcc.dg/auto-init-pr104550-1.c: New test.
>>  * gcc.dg/auto-init-pr104550-2.c: New test.
>>  * gcc.dg/auto-init-pr104550-3.c: New test.
>> ---
>> gcc/gimple-fold.cc  | 11 ++-
>> gcc/gimplify.cc |  1 +
>> gcc/testsuite/gcc.dg/auto-init-pr104550-1.c | 10 ++
>> gcc/testsuite/gcc.dg/auto-init-pr104550-2.c | 11 +++
>> gcc/testsuite/gcc.dg/auto-init-pr104550-3.c | 11 +++
>> 5 files changed, 43 insertions(+), 1 deletion(-)
>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c
>> 
>> diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
>> index 16f02c2d098d..67b4963ffd96 100644
>> --- a/gcc/gimple-fold.cc
>> +++ b/gcc/gimple-fold.cc
>> @@ -62,6 +62,7 @@ along with GCC; see the file COPYING3.  If not see
>> #include "attribs.h"
>> #include "asan.h"
>> #include "diagnostic-core.h"
>> +#include "diagnostic.h"
>> #include "intl.h"
>> #include "calls.h"
>> #include "tree-vector-builder.h"
>> @@ -4379,7 +4380,15 @@ clear_padding_flush (clear_padding_struct *buf, bool full)
>>else
>>  {
>>src = make_ssa_name (type);
>> -  g = gimple_build_assign (src, unshare_expr (dst));
>> +  tree tmp_dst = unshare_expr (dst);
>> +  /* The folding introduces a read from the tmp_dst, we should
>> + prevent uninitialized warning analysis from issuing warning
>> + for such fake read.  */
>> +  if (warning_enabled_at (buf->loc, OPT_Wuninitialized)
>> +  || warning_enabled_at (buf->loc,
>> + OPT_Wmaybe_uninitialized))
>> +suppress_warning (tmp_dst, OPT_Wuninitialized);
>> +  g = gimple_build_assign (src, tmp_dst);
> 
> So what about just gimple_set_no_warning (g, true); ?  (sorry for
> the ping-pong between us three...)

This didn’t work.  The small test case still failed.  This is because
tree-ssa-uninit.cc checks get_no_uninit_warning (RHS), not the whole stmt.

We could update tree-ssa-uninit.cc to check the whole stmt, but I am not sure
whether doing so might introduce other issues.

Qing

> 
>>gimple_set_location (g, buf->loc);
>>gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
>>tree mask = native_interpret_expr (type,
>> diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
>> index f570daa015a5..977cf458f858 100644
>> --- a/gcc/gimplify.cc
>> +++ b/gcc/gimplify.cc
>> @@ -1823,6 +1823,7 @@ gimple_add_padding_init_for_auto_var (tree decl, bool is_vla,
>> 
>>   gimple *call = gimple_build_call (fn, 2, addr_of_decl,
>> 

Re: [PATCH 2/2][middle-end/102276] Adding -Wtrivial-auto-var-init and update documentation.

2022-02-25 Thread Qing Zhao



> On Feb 25, 2022, at 6:43 AM, Richard Biener  wrote:
> 
> On Thu, 24 Feb 2022, Qing Zhao wrote:
> 
>> 
>> 
>>> On Feb 24, 2022, at 4:16 AM, Richard Biener  wrote:
>>> 
>>> On Sat, 19 Feb 2022, Qing Zhao wrote:
>>> 
>>>> Hi,
>>>> 
>>>> This is the 2nd patch for fixing pr102276.
>>>> 
>>>> Adding -Wtrivial-auto-var-init and update documentation.
>>>> 
>>>> Adding a new warning option -Wtrivial-auto-var-init to report cases when
>>>> -ftrivial-auto-var-init cannot initialize the auto variable. At the same
>>>> time, update documentation for -ftrivial-auto-var-init to connect it with
>>>> the new warning option -Wtrivial-auto-var-init,  and add documentation
>>>> for -Wtrivial-auto-var-init.
>>>> 
>>>> Bootstrapped and regression tested on both x86 and aarch64.
>>>> 
>>>> Okay for committing?
>>>> 
>>>> thanks.
>>>> 
>>>> Qing.
>>>> 
>>>> ==
>>>> From 4346890b8f4258489c4841f1992ba3ce816d7689 Mon Sep 17 00:00:00 2001
>>>> From: Qing Zhao 
>>>> Date: Fri, 18 Feb 2022 15:53:15 +
>>>> Subject: [PATCH 2/2] Adding -Wtrivial-auto-var-init and update 
>>>> documentation.
>>>> 
>>>> Adding a new warning option -Wtrivial-auto-var-init to report cases when
>>>> -ftrivial-auto-var-init cannot initialize the auto variable. At the same
>>>> time, update documentation for -ftrivial-auto-var-init to connect it with
>>>> the new warning option -Wtrivial-auto-var-init,  and add documentation
>>>> for -Wtrivial-auto-var-init.
>>>> 
>>>> 2022-02-18 Qing Zhao  
>>>> gcc/ChangeLog:
>>>> 
>>>>* common.opt (-Wtrivial-auto-var-init): New option.
>>>>* doc/invoke.texi (-Wtrivial-auto-var-init): Document new option.
>>>>(-ftrivial-auto-var-init): Update option;
>>>>* gimplify.cc (maybe_warn_switch_unreachable): Rename...
>>>>(maybe_warn_switch_unreachable_and_auto_init): ...to this.
>>>>(gimplify_switch_expr): Call new function.
>>>> 
>>>> gcc/testsuite/ChangeLog:
>>>> 
>>>>* gcc.dg/auto-init-pr102276-3.c: New test.
>>>>* gcc.dg/auto-init-pr102276-4.c: New test.
>>>> ---
>>>> gcc/common.opt  |   4 +
>>>> gcc/doc/invoke.texi |  14 ++-
>>>> gcc/gimplify.cc | 100 +++-
>>>> gcc/testsuite/gcc.dg/auto-init-pr102276-3.c |  40 
>>>> gcc/testsuite/gcc.dg/auto-init-pr102276-4.c |  40 
>>>> 5 files changed, 175 insertions(+), 23 deletions(-)
>>>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-3.c
>>>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-4.c
>>>> 
>>>> diff --git a/gcc/common.opt b/gcc/common.opt
>>>> index c21e5273ae3..22c95dbfa49 100644
>>>> --- a/gcc/common.opt
>>>> +++ b/gcc/common.opt
>>>> @@ -801,6 +801,10 @@ Wtrampolines
>>>> Common Var(warn_trampolines) Warning
>>>> Warn whenever a trampoline is generated.
>>>> 
>>>> +Wtrivial-auto-var-init
>>>> +Common Var(warn_trivial_auto_var_init) Warning Init(0)
>>>> +Warn about where -ftrivial-auto-var-init cannot initialize the auto variable.
>>>> +
>>> 
>>> Warn about cases where ... initialize a variable.
>> 
>> Okay. 
>> 
>>> 
>>>> Wtype-limits
>>>> Common Var(warn_type_limits) Warning EnabledBy(Wextra)
>>>> Warn if a comparison is always true or always false due to the limited range of the data type.
>>>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>>>> index e1a00c80307..c61a5b4b4a5 100644
>>>> --- a/gcc/doc/invoke.texi
>>>> +++ b/gcc/doc/invoke.texi
>>>> @@ -399,7 +399,7 @@ Objective-C and Objective-C++ Dialects}.
>>>> -Wswitch  -Wno-switch-bool  -Wswitch-default  -Wswitch-enum @gol
>>>> -Wno-switch-outside-range  -Wno-switch-unreachable  -Wsync-nand @gol
>>>> -Wsystem-headers  -Wtautological-compare  -Wtrampolines  -Wtrigraphs @gol
>>>> --Wtsan -Wtype-limits  -Wundef @gol
>>>> +-Wtrivial-auto-var-init -Wtsan -Wtype-limits  -Wundef @gol
>>>> -Wuninitial

Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-25 Thread Qing Zhao
Hi, 

After studying the discussion so far and the corresponding code for
suppress_warning, I think the following suggestion
is the best approach right now for this issue:

>   SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION);
>   suppress_warning (tmp_dst, OPT_Wuninitialized);
> with a comment explaining why we do that?


The reason is:

After “SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION)”, 

/* Enable, or by default disable, a warning for the expression.
   The wildcard OPT of -1 controls all warnings.  */

void
suppress_warning (tree expr, opt_code opt /* = all_warnings */,
                  bool supp /* = true */)
{
  if (opt == no_warning)
    return;

  const location_t loc = get_location (expr);

  if (!RESERVED_LOCATION_P (loc))
    supp = suppress_warning_at (loc, opt, supp) || supp;
  set_no_warning_bit (expr, supp);
}

suppress_warning will NOT call “suppress_warning_at”, so no hash-table operation
is involved.  It simply calls “set_no_warning_bit” to set the no_warning bit on
the MEM_REF expr.

Later, in the routine “maybe_warn_operand” in tree-ssa-uninit.cc,
“get_no_uninit_warning” likewise simply checks the no_warning bit of the MEM_REF
to decide whether the warning needs to be issued.

This resolved all the concerns we have so far.
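
The control flow described above can be modeled with a small, self-contained C sketch. Everything here is a toy stand-in invented for illustration (`toy_expr`, `toy_table_entries`, `toy_suppress_warning`); in particular, treating a reserved location as "less than RESERVED_LOCATION_COUNT" is an assumption about how RESERVED_LOCATION_P behaves, not a copy of the GCC definition.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; the real GCC types are far richer.  */
typedef unsigned int location_t;
enum { UNKNOWN_LOCATION = 0, RESERVED_LOCATION_COUNT = 2 };

struct toy_expr {
  location_t loc;
  bool no_warning;   /* models the per-expression no-warning bit */
};

/* Models the location-keyed suppression table by counting how many
   entries would have been created.  */
static size_t toy_table_entries;

/* Assumed behavior of RESERVED_LOCATION_P for this sketch.  */
static bool
toy_reserved_location_p (location_t loc)
{
  return loc < RESERVED_LOCATION_COUNT;
}

/* Mirrors the shape of the quoted suppress_warning.  */
static void
toy_suppress_warning (struct toy_expr *expr)
{
  if (!toy_reserved_location_p (expr->loc))
    toy_table_entries++;      /* stands in for suppress_warning_at */
  expr->no_warning = true;    /* stands in for set_no_warning_bit */
}
```

Calling `toy_suppress_warning` on an expression whose location is `UNKNOWN_LOCATION` sets the bit and leaves `toy_table_entries` untouched, whereas an expression with an ordinary location also adds a table entry. That is the property the SET_EXPR_LOCATION approach relies on.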

I will prepare the patch based on this approach.

Let me know your opinion.

Qing

> On Feb 25, 2022, at 4:04 AM, Jakub Jelinek via Gcc-patches 
>  wrote:
> 
> On Fri, Feb 25, 2022 at 10:31:50AM +0100, Richard Biener wrote:
>> I think it's used as fallback for UNKNOWN_LOCATION, but if we "invent"
>> a creative location for the artificial stmt it will of course
>> affect other stmts/expressions using that location.
>> 
>>> I think it will work.
>> 
>> Yes, I think so.  OTOH the uninit pass does
>> 
>>  /* Avoid warning if we've already done so or if the warning has been
>> suppressed.  */
>>  if (((warning_suppressed_p (context, OPT_Wuninitialized)
>>|| (gimple_assign_single_p (context)
>>&& get_no_uninit_warning (gimple_assign_rhs1 (context)
>>  || (var && get_no_uninit_warning (var))
>>  || (var_name_str
>>  && warning_suppressed_p (var_def_stmt, OPT_Wuninitialized)))
>>return;
>> 
>> that's a mighty complicated way to test and I'm not sure we get
>> to the bit on the stmt reliably.  So maybe TREE_NO_WARNING on the
>> reference (or making sure it has UNKNOWN_LOCATION and using
>> suppress_warning on it) is a better idea after all...
> 
> So
>   SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION);
>   suppress_warning (tmp_dst, OPT_Wuninitialized);
> with a comment explaining why we do that?
> LGTM.
> 
>   Jakub
> 



[PATCH][V4][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-25 Thread Qing Zhao
Hi, 

This is the 4th version based on the discussion so far.

The major change is:

> SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION);
>   suppress_warning (tmp_dst, OPT_Wuninitialized);
> with a comment explaining why we do that.
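
For readers who have not used the builtin, the following is a minimal
user-level sketch of what __builtin_clear_padding does. The struct and the
function name are invented for this illustration, and the __has_builtin guard
is there because not every compiler provides the builtin; the amount of
padding in the struct depends on the target ABI.

```c
#include <assert.h>
#include <string.h>

/* Illustrative struct; on most ABIs there are padding bytes between
   'tag' and 'value', but that depends on the target.  */
struct padded
{
  char tag;
  int value;
};

/* Fill the whole object with 0xff, store the fields, then clear the
   padding bits if the compiler provides the builtin.  Returns 1 when
   the builtin was used and 0 otherwise.  */
int
fill_and_clear_padding (struct padded *p)
{
  assert (p != NULL);
  memset (p, 0xff, sizeof *p);
  p->tag = 'x';
  p->value = 42;
#if defined (__has_builtin)
# if __has_builtin (__builtin_clear_padding)
  /* Only the padding is cleared; the field values are kept.  */
  __builtin_clear_padding (p);
  return 1;
# endif
#endif
  return 0;
}
```

The builtin only writes to padding, so 'tag' and 'value' keep their values
after the call.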


The patch has been bootstrapped and regress tested on both x86 and aarch64.
Okay for trunk?

Thanks.

Qing

=
From 276975e60827942f0dd4043ce5f52e600327d6a8 Mon Sep 17 00:00:00 2001
From: Qing Zhao 
Date: Thu, 24 Feb 2022 22:38:38 +
Subject: [PATCH] Suppress uninitialized warnings for new created uses from
 __builtin_clear_padding folding [PR104550]

__builtin_clear_padding(&object) clears all the padding bits of the object.
It does not actually involve any use of a user variable, so users do not
expect any uninitialized warning from it. It is therefore reasonable to
suppress uninitialized warnings for all newly created uses from
__builtin_clear_padding folding.

PR middle-end/104550

gcc/ChangeLog:

* gimple-fold.cc (clear_padding_flush): Suppress warnings for new
created uses.

gcc/testsuite/ChangeLog:

* gcc.dg/auto-init-pr104550-1.c: New test.
* gcc.dg/auto-init-pr104550-2.c: New test.
* gcc.dg/auto-init-pr104550-3.c: New test.
---
 gcc/gimple-fold.cc  | 12 +++-
 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c | 10 ++
 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c | 11 +++
 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c | 11 +++
 4 files changed, 43 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 16f02c2d098d..c9179abb27ed 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -4379,7 +4379,17 @@ clear_padding_flush (clear_padding_struct *buf, bool 
full)
  else
{
  src = make_ssa_name (type);
- g = gimple_build_assign (src, unshare_expr (dst));
+ tree tmp_dst = unshare_expr (dst);
+ /* The folding introduces a read from the tmp_dst, we should
+prevent uninitialized warning analysis from issuing warning
+for such fake read.  In order to suppress warning only for
+this expr, we should set the location of tmp_dst to
+UNKNOWN_LOCATION first, then suppress_warning will call
+set_no_warning_bit to set the no_warning flag only for
+tmp_dst.  */
+ SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION);
+ suppress_warning (tmp_dst, OPT_Wuninitialized);
+ g = gimple_build_assign (src, tmp_dst);
  gimple_set_location (g, buf->loc);
  gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
  tree mask = native_interpret_expr (type,
diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c 
b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
new file mode 100644
index ..a08110c3a170
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
@@ -0,0 +1,10 @@
+/* PR 104550*/
+/* { dg-do compile } */
+/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=pattern" } */
+struct vx_audio_level {
+ int has_monitor_level : 1;
+};
+
+void vx_set_monitor_level() {
+ struct vx_audio_level info; /* { dg-bogus "info" "is used uninitialized" } */
+}
diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c 
b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
new file mode 100644
index ..2c395b32d322
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
@@ -0,0 +1,11 @@
+/* PR 104550 */
+/* { dg-do compile } */
+/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=zero" } */
+struct vx_audio_level {
+ int has_monitor_level : 1;
+};
+
+void vx_set_monitor_level() {
+ struct vx_audio_level info; 
+ __builtin_clear_padding (&info);  /* { dg-bogus "info" "is used 
uninitialized" } */ 
+}
diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c 
b/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c
new file mode 100644
index ..9893e37f12d8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c
@@ -0,0 +1,11 @@
+/* PR 104550 */
+/* { dg-do compile } */
+/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=pattern" } */
+struct vx_audio_level {
+ int has_monitor_level : 1;
+};
+
+void vx_set_monitor_level() {
+ struct vx_audio_level info;   /* { dg-bogus "info" "is used uninitialized" } 
*/
+ __builtin_clear_padding (&info);  /* { dg-bogus "info" "is used 
uninitialized" } */ 
+}
-- 
2.27.0



Re: PING*2 : Fwd: [PATCH][gcov-profile/91971]Profile directory concatenated with object file path

2019-10-23 Thread Qing Zhao
Thank you!

Just committed the change at:

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=277344

Qing
> On Oct 23, 2019, at 5:15 AM, Martin Liška  wrote:
> 
> On 10/21/19 5:32 PM, Qing Zhao wrote:
>> Please let me know whether this patch is reasonable or not.
> 
> The patch is fine. Please add PR entry to the ChangeLog and
> install the patch.
> 
> Thanks,
> Martin



Re: [PATCH v6 0/5]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-03-01 Thread Qing Zhao
Ping on this patch set.

Thanks a lot!

Qing

> On Feb 16, 2024, at 14:47, Qing Zhao  wrote:
> 
> Hi,
> 
> This is the 6th version of the patch.
> 
> Compared with the 5th version, the only difference is:
> 
> 1. Add the 6th argument to .ACCESS_WITH_SIZE
>   to carry the TYPE of the flexible array.
>   Such information is needed during tree-object-size.cc.
> 
>   Previously, we used the result type of the routine
>   .ACCESS_WITH_SIZE to decide the element type of the
>   original array. However, the result type of the routine
>   might be changed during tree optimizations due to
>   possible type casting in the source code.
> 
> 
> Compared with the 4th version, the major differences are:
> 
> 1. Change the return type of the routine .ACCESS_WITH_SIZE 
>   FROM:
> Pointer to the type of the element of the flexible array;
>   TO:
> Pointer to the type of the flexible array;
>And then wrap the call with an indirection reference. 
> 
> 2. Adjust all other parts for this change (this will simplify the bounds
> sanitizer instrumentation);
> 
> 3. Add the fixes to the kernel building failures, which include:
>A. The operator “typeof” cannot return correct type for a->array; 
>B. The operator “&” cannot return correct address for a->array;
> 
> 4. Correctly handle the case when the value of “counted-by” is zero or
> negative, as follows:
>   4.1. Update the counted-by doc with the following:
>When the counted-by field is assigned a negative integer value, the 
> compiler will treat the value as zero. 
>   4.2. Adjust __bdos and array bound sanitizer to handle correctly when 
> “counted-by” is zero. 
> 
> 
> It is based on the following proposal:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635884.html
> Represent the missing dependence for the "counted_by" attribute and its 
> consumers
> 
> **The summary of the proposal is:
> 
> * Add a new internal function ".ACCESS_WITH_SIZE" to carry the size 
> information for every reference to a FAM field;
> * In C FE, Replace every reference to a FAM field whose TYPE has the 
> "counted_by" attribute with the new internal function ".ACCESS_WITH_SIZE";
> * In every consumer of the size information, for example, BDOS or array bound 
> sanitizer, query the size information or ACCESS_MODE information from the new 
> internal function;
> * When expanding to RTL, replace the internal function with the actual 
> reference to the FAM field;
> * Some adjustment to ipa alias analysis, and other SSA passes to mitigate the 
> impact to the optimizer and code generation.
> 
> 
> **The new internal function
> 
>  .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, TYPE_OF_SIZE, 
> ACCESS_MODE, TYPE_OF_REF)
> 
> INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
> 
> which returns the "REF_TO_OBJ" same as the 1st argument;
> 
> Both the return type and the type of the first argument of this function have 
> been converted from the incomplete array type to the corresponding pointer 
> type.
> 
> The call to .ACCESS_WITH_SIZE is wrapped with an INDIRECT_REF, whose type is 
> the original incomplete array type.
> 
> Please see the following link for why:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html
> 
> 1st argument "REF_TO_OBJ": The reference to the object;
> 2nd argument "REF_TO_SIZE": The reference to the size of the object,
> 3rd argument "CLASS_OF_SIZE": The size referenced by the REF_TO_SIZE 
> represents
>   0: unknown;
>   1: the number of the elements of the object type;
>   2: the number of bytes;
> 4th argument "TYPE_OF_SIZE": A constant 0 with the TYPE of the object
>  referenced by REF_TO_SIZE
> 5th argument "ACCESS_MODE":
>  -1: Unknown access semantics
>   0: none
>   1: read_only
>   2: write_only
>   3: read_write
> 6th argument "TYPE_OF_REF": A constant 0 with the pointer TYPE to
>  the original flexible array type.
> 
> ** The Patch sets included:
> 
> 1. Provide counted_by attribute to flexible array member field;
>  which includes:
>  * "counted_by" attribute documentation;
>  * C FE handling of the new attribute;
>syntax checking, error reporting;
>  * testing cases;
> 
> 2. Convert "counted_by" attribute to/from .ACCESS_WITH_SIZE.
>  which includes:
>  * The definition of the new internal function .ACCESS_WITH_SIZE in 
> internal-fn.def.
>  * C FE converts every reference to a FAM with "counte

Re: [PATCH v6 1/5] Provide counted_by attribute to flexible array member field (PR108896)

2024-03-13 Thread Qing Zhao
Sid,

Thanks a lot for taking the time to review the code.
See my reply below:

On Mar 11, 2024, at 10:57, Siddhesh Poyarekar  wrote:

On 2024-02-16 14:47, Qing Zhao wrote:
 return true;
   else
 return targetm.attribute_takes_identifier_p (attr_id);
@@ -2806,6 +2811,53 @@ handle_strict_flex_array_attribute (tree *node, tree 
name,
   return NULL_TREE;
 }
 +/* Handle a "counted_by" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_counted_by_attribute (tree *node, tree name,
+  tree args, int ARG_UNUSED (flags),
+  bool *no_add_attrs)
+{
+  tree decl = *node;
+  tree argval = TREE_VALUE (args);
+
+  /* This attribute only applies to field decls of a structure.  */
+  if (TREE_CODE (decl) != FIELD_DECL)
+{
+  error_at (DECL_SOURCE_LOCATION (decl),
+ "%qE attribute may not be specified for non-field"
+ " declaration %q+D", name, decl);
+  *no_add_attrs = true;
+}
+  /* This attribute only applies to field with array type.  */
+  else if (TREE_CODE (TREE_TYPE (decl)) != ARRAY_TYPE)
+{
+  error_at (DECL_SOURCE_LOCATION (decl),
+ "%qE attribute may not be specified for a non-array field",
+ name);
+  *no_add_attrs = true;
+}
+  /* This attribute only applies to a C99 flexible array member type.  */
+  else if (! c_flexible_array_member_type_p (TREE_TYPE (decl)))
+{
+  error_at (DECL_SOURCE_LOCATION (decl),
+ "%qE attribute may not be specified for a non"
+ " flexible array member field",
+ name);
+  *no_add_attrs = true;
+}

How about "not allowed" instead of "may not be specified"?

Okay, will update them.

+  /* The argument should be an identifier.  */
+  else if (TREE_CODE (argval) != IDENTIFIER_NODE)
+{
+  error_at (DECL_SOURCE_LOCATION (decl),
+ "%<counted_by%> argument not an identifier");
+  *no_add_attrs = true;
+}

This validates that the attribute applies only to a C99 flexible array member
of a structure and that the argument is an identifier node.  OK.
verify_counted_by_attribute does more extensive validation on argval.
Yes.
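
As a user-level illustration of the form these checks accept (a field with
C99 flexible array type whose attribute argument is a plain identifier), here
is a small sketch. The struct, the helper names and the COUNTED_BY wrapper
macro are all invented for this example; the macro expands to nothing on
compilers that do not report the attribute through __has_attribute.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper macro: use counted_by only where available.  */
#if defined (__has_attribute)
# if __has_attribute (counted_by)
#  define COUNTED_BY(member) __attribute__ ((counted_by (member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

/* The attribute sits on a flexible array member and names another
   field of the same structure by identifier.  */
struct packet
{
  size_t len;
  char payload[] COUNTED_BY (len);
};

/* Allocate a packet with room for LEN payload bytes and set the
   counter before the array is ever accessed.  */
struct packet *
packet_alloc (size_t len)
{
  struct packet *p = malloc (sizeof (struct packet) + len);
  if (p != NULL)
    p->len = len;
  return p;
}

/* Store C at index I; the index must be below the counter.  */
void
packet_set (struct packet *p, size_t i, char c)
{
  assert (p != NULL && i < p->len);
  p->payload[i] = c;
}
```

A caller would do something like "struct packet *p = packet_alloc (4);"
followed by packet_set (p, 3, 'z') and free (p).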

+
+  return NULL_TREE;
+}
+
 /* Handle a "weak" attribute; arguments as in
struct attribute_spec.handler.  */
 diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index e15eff698dfd..56d828e3dfaf 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -9909,6 +9909,19 @@ c_common_finalize_early_debug (void)
   (*debug_hooks->early_global_decl) (cnode->decl);
 }
 +/* Determine whether TYPE is a ISO C99 flexible array memeber type "[]".  */

s/memeber/member/
Okay, will update it.

+bool
+c_flexible_array_member_type_p (const_tree type)
+{
+  if (TREE_CODE (type) == ARRAY_TYPE
+  && TYPE_SIZE (type) == NULL_TREE
+  && TYPE_DOMAIN (type) != NULL_TREE
+  && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) == NULL_TREE)
+return true;
+
+  return false;
+}
+

Moved from c/c-decl.cc.  OK.

 /* Get the LEVEL of the strict_flex_array for the ARRAY_FIELD based on the
values of attribute strict_flex_array and the flag_strict_flex_arrays.  */
 unsigned int
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 2d5f53998855..3e0eed0548b0 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -904,6 +904,7 @@ extern tree fold_for_warn (tree);
 extern tree c_common_get_narrower (tree, int *);
 extern bool get_attribute_operand (tree, unsigned HOST_WIDE_INT *);
 extern void c_common_finalize_early_debug (void);
+extern bool c_flexible_array_member_type_p (const_tree);
 extern unsigned int c_strict_flex_array_level_of (tree);
 extern bool c_option_is_from_cpp_diagnostics (int);
 extern tree c_hardbool_type_attr_1 (tree, tree *, tree *);
diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index fe20bc21c926..4348123502e4 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -5301,19 +5301,6 @@ set_array_declarator_inner (struct c_declarator *decl,
   return decl;
 }
 -/* Determine whether TYPE is a ISO C99 flexible array memeber type "[]".  */
-static bool
-flexible_array_member_type_p (const_tree type)
-{
-  if (TREE_CODE (type) == ARRAY_TYPE
-  && TYPE_SIZE (type) == NULL_TREE
-  && TYPE_DOMAIN (type) != NULL_TREE
-  && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) == NULL_TREE)
-return true;
-
-  return false;
-}
-
 /* Determine whether TYPE is a one-element array type "[1]".  */
 static bool
 one_element_array_type_p (const_tree type)
@@ -5350,7 +5337,7 @@ add_flexible_array_elts_to_size (tree decl, tree init)
 elt = CONSTRUCTOR_ELTS (init)->last ().value;
   type = TREE_TYPE (elt);
-  if (flexible_array_member_type_p (type))
+  if (c_flexible_array_member_type_p (type))
 {
   complete_array_type (&type, elt, false);
   DECL_SIZE (decl)
@@ -9317,7 +9304,7 @@ is_flexible_array_member

Re: [PATCH v6 2/5] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE.

2024-03-13 Thread Qing Zhao


> On Mar 11, 2024, at 13:09, Siddhesh Poyarekar  wrote:
> 
> 
> 
> On 2024-02-16 14:47, Qing Zhao wrote:
>> Including the following changes:
>> * The definition of the new internal function .ACCESS_WITH_SIZE
>>   in internal-fn.def.
>> * C FE converts every reference to a FAM with a "counted_by" attribute
>>   to a call to the internal function .ACCESS_WITH_SIZE.
>>   (build_component_ref in c-typeck.cc)
>>   This includes the case when the object is statically allocated and
>>   initialized.
>>   In order to make this work, the routines initializer_constant_valid_p_1
>>   and output_constant in varasm.cc are updated to handle calls to
>>   .ACCESS_WITH_SIZE.
>>   (initializer_constant_valid_p_1 and output_constant in varasm.cc)
>>   However, for the reference inside "offsetof", the "counted_by" attribute is
>>   ignored since it's not useful at all.
>>   (c_parser_postfix_expression in c/c-parser.cc)
>>   In addition to "offsetof", for the reference inside operator "typeof" and
>>   "alignof", we ignore the counted_by attribute too.
>>   When building ADDR_EXPR for the .ACCESS_WITH_SIZE in C FE,
>>   replace the call with its first argument.
>> * Convert every call to .ACCESS_WITH_SIZE to its first argument.
>>   (expand_ACCESS_WITH_SIZE in internal-fn.cc)
>> * Adjust alias analysis to exclude the new internal from clobbering anything.
>>   (ref_maybe_used_by_call_p_1 and call_may_clobber_ref_p_1 in 
>> tree-ssa-alias.cc)
>> * Adjust dead code elimination to eliminate the call to .ACCESS_WITH_SIZE
>>   when its LHS is eliminated as dead code.
>>   (eliminate_unnecessary_stmts in tree-ssa-dce.cc)
>> * Provide the utility routines to check the call is .ACCESS_WITH_SIZE and
>>   get the reference from the call to .ACCESS_WITH_SIZE.
>>   (is_access_with_size_p and get_ref_from_access_with_size in tree.cc)
>> gcc/c/ChangeLog:
>>  * c-parser.cc (c_parser_postfix_expression): Ignore the counted-by
>>  attribute when build_component_ref inside offsetof operator.
>>  * c-tree.h (build_component_ref): Add one more parameter.
>>  * c-typeck.cc (build_counted_by_ref): New function.
>>  (build_access_with_size_for_counted_by): New function.
>>  (build_component_ref): Check the counted-by attribute and build
>>  call to .ACCESS_WITH_SIZE.
>>  (build_unary_op): When building ADDR_EXPR for
>> .ACCESS_WITH_SIZE, use its first argument.
>> (lvalue_p): Accept call to .ACCESS_WITH_SIZE.
>> gcc/ChangeLog:
>>  * internal-fn.cc (expand_ACCESS_WITH_SIZE): New function.
>>  * internal-fn.def (ACCESS_WITH_SIZE): New internal function.
>>  * tree-ssa-alias.cc (ref_maybe_used_by_call_p_1): Special case
>>  IFN_ACCESS_WITH_SIZE.
>>  (call_may_clobber_ref_p_1): Special case IFN_ACCESS_WITH_SIZE.
>>  * tree-ssa-dce.cc (eliminate_unnecessary_stmts): Eliminate the call
>>  to .ACCESS_WITH_SIZE when its LHS is dead.
>>  * tree.cc (process_call_operands): Adjust side effect for function
>>  .ACCESS_WITH_SIZE.
>>  (is_access_with_size_p): New function.
>>  (get_ref_from_access_with_size): New function.
>>  * tree.h (is_access_with_size_p): New prototype.
>>  (get_ref_from_access_with_size): New prototype.
>>  * varasm.cc (initializer_constant_valid_p_1): Handle call to
>>  .ACCESS_WITH_SIZE.
>>  (output_constant): Handle call to .ACCESS_WITH_SIZE.
>> gcc/testsuite/ChangeLog:
>>  * gcc.dg/flex-array-counted-by-2.c: New test.
>> ---
>>  gcc/c/c-parser.cc |  10 +-
>>  gcc/c/c-tree.h|   2 +-
>>  gcc/c/c-typeck.cc | 128 +-
>>  gcc/internal-fn.cc|  36 +
>>  gcc/internal-fn.def   |   4 +
>>  .../gcc.dg/flex-array-counted-by-2.c  | 112 +++
>>  gcc/tree-ssa-alias.cc |   2 +
>>  gcc/tree-ssa-dce.cc   |   5 +-
>>  gcc/tree.cc   |  25 +++-
>>  gcc/tree.h|   8 ++
>>  gcc/varasm.cc |  10 ++
>>  11 files changed, 332 insertions(+), 10 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-2.c
>> diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
>> index c31349dae2ff..a6ed5ac43bb1 100644
>> --- a/gcc/c/c-parser.cc
>> +++ b/gcc/c/

Re: [PATCH v6 3/5] Use the .ACCESS_WITH_SIZE in builtin object size.

2024-03-13 Thread Qing Zhao


On Mar 11, 2024, at 13:11, Siddhesh Poyarekar  wrote:



On 2024-02-16 14:47, Qing Zhao wrote:
gcc/ChangeLog:
* tree-object-size.cc (access_with_size_object_size): New function.
(call_object_size): Call the new function.
gcc/testsuite/ChangeLog:
* gcc.dg/builtin-object-size-common.h: Add a new macro EXPECT.
* gcc.dg/flex-array-counted-by-3.c: New test.
* gcc.dg/flex-array-counted-by-4.c: New test.
* gcc.dg/flex-array-counted-by-5.c: New test.
---
 .../gcc.dg/builtin-object-size-common.h   |  11 ++
 .../gcc.dg/flex-array-counted-by-3.c  |  63 +++
 .../gcc.dg/flex-array-counted-by-4.c  | 178 ++
 .../gcc.dg/flex-array-counted-by-5.c  |  48 +
 gcc/tree-object-size.cc   |  59 ++
 5 files changed, 359 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-5.c
diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-common.h 
b/gcc/testsuite/gcc.dg/builtin-object-size-common.h
index 66ff7cdd953a..b677067c6e6b 100644
--- a/gcc/testsuite/gcc.dg/builtin-object-size-common.h
+++ b/gcc/testsuite/gcc.dg/builtin-object-size-common.h
@@ -30,3 +30,14 @@ unsigned nfails = 0;
   __builtin_abort ();   \
 return 0;   \
   } while (0)
+
+#define EXPECT(p, _v) do {   \
+  size_t v = _v;   \
+  if (p == v)   \
+__builtin_printf ("ok:  %s == %zd\n", #p, p);   \
+  else   \
+{   \
+  __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v);   \
+  FAIL ();   \
+}   \
+} while (0);
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c 
b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
new file mode 100644
index ..0066c32ca808
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
@@ -0,0 +1,63 @@
+/* test the attribute counted_by and its usage in
+ * __builtin_dynamic_object_size.  */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include "builtin-object-size-common.h"
+
+struct flex {
+  int b;
+  int c[];
+} *array_flex;
+
+struct annotated {
+  int b;
+  int c[] __attribute__ ((counted_by (b)));
+} *array_annotated;
+
+struct nested_annotated {
+  struct {
+union {
+  int b;
+  float f;
+};
+int n;
+  };
+  int c[] __attribute__ ((counted_by (b)));
+} *array_nested_annotated;
+
+void __attribute__((__noinline__)) setup (int normal_count, int attr_count)
+{
+  array_flex
+= (struct flex *)malloc (sizeof (struct flex)
+  + normal_count *  sizeof (int));
+  array_flex->b = normal_count;
+
+  array_annotated
+= (struct annotated *)malloc (sizeof (struct annotated)
+   + attr_count *  sizeof (int));
+  array_annotated->b = attr_count;
+
+  array_nested_annotated
+= (struct nested_annotated *)malloc (sizeof (struct nested_annotated)
+  + attr_count *  sizeof (int));
+  array_nested_annotated->b = attr_count;
+
+  return;
+}
+
+void __attribute__((__noinline__)) test ()
+{
+EXPECT(__builtin_dynamic_object_size(array_flex->c, 1), -1);
+EXPECT(__builtin_dynamic_object_size(array_annotated->c, 1),
+array_annotated->b * sizeof (int));
+EXPECT(__builtin_dynamic_object_size(array_nested_annotated->c, 1),
+array_nested_annotated->b * sizeof (int));
+}
+
+int main(int argc, char *argv[])
+{
+  setup (10,10);
+  test ();
+  DONE ();
+}
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c 
b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
new file mode 100644
index ..3ce7f3545549
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
@@ -0,0 +1,178 @@
+/* test the attribute counted_by and its usage in
+__builtin_dynamic_object_size: what's the correct behavior when the
+allocation size mismatched with the value of counted_by attribute?
+we should always use the latest value that is hold by the counted_by
+field.  */
+/* { dg-do run } */
+/* { dg-options "-O -fstrict-flex-arrays=3" } */
+
+#include "builtin-object-size-common.h"
+
+struct annotated {
+  size_t foo;
+  char others;
+  char array[] __attribute__((counted_by (foo)));
+};
+
+#define noinline __attribute__((__noinline__))
+#define SIZE_BUMP 10
+#define MAX(a, b) ((a) > (b) ? (a) : (b))
+
+/* In general, Due to type casting, the type for the pointee of a pointer
+   does not say anything about the object it points to,
+   So, __builtin_object_size can not directly use the type of the pointee
+   to decide the size of the object the pointer points to.
+
+   there are only two reliable ways:
+   A. observed allocations  (call to the allocation functions in the routine)
+   B. observed accesses (read or write access to the location of the
+ pointer points to)
+
+   that provide information about the type/existence of an object at
+   the

Re: [PATCH v6 4/5] Use the .ACCESS_WITH_SIZE in bound sanitizer.

2024-03-13 Thread Qing Zhao


On Mar 11, 2024, at 13:15, Siddhesh Poyarekar  wrote:



On 2024-02-16 14:47, Qing Zhao wrote:
gcc/c-family/ChangeLog:
* c-ubsan.cc (get_bound_from_access_with_size): New function.
(ubsan_instrument_bounds): Handle call to .ACCESS_WITH_SIZE.
gcc/testsuite/ChangeLog:
* gcc.dg/ubsan/flex-array-counted-by-bounds-2.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds-3.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds.c: New test.
---
 gcc/c-family/c-ubsan.cc   | 42 +
 .../ubsan/flex-array-counted-by-bounds-2.c| 45 ++
 .../ubsan/flex-array-counted-by-bounds-3.c| 34 ++
 .../ubsan/flex-array-counted-by-bounds.c  | 46 +++
 4 files changed, 167 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-3.c
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c
diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index 940982819ddf..164b29845b3a 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -376,6 +376,40 @@ ubsan_instrument_return (location_t loc)
   return build_call_expr_loc (loc, t, 1, build_fold_addr_expr_loc (loc, data));
 }
 +/* Get the tree that represented the number of counted_by, i.e, the maximum
+   number of the elements of the object that the call to .ACCESS_WITH_SIZE
+   points to, this number will be the bound of the corresponding array.  */
+static tree
+get_bound_from_access_with_size (tree call)
+{
+  if (!is_access_with_size_p (call))
+return NULL_TREE;
+
+  tree ref_to_size = CALL_EXPR_ARG (call, 1);
+  unsigned int type_of_size = TREE_INT_CST_LOW (CALL_EXPR_ARG (call, 2));

Again for consistency, this should probably be class_of_size.

Okay, I will update this consistently with the change related to the 3rd 
argument.

+  tree type = TREE_TYPE (CALL_EXPR_ARG (call, 3));
+  tree size = fold_build2 (MEM_REF, type, unshare_expr (ref_to_size),
+build_int_cst (ptr_type_node, 0));
+  /* If size is negative value, treat it as zero.  */
+  if (!TYPE_UNSIGNED (type))
+  {
+tree cond = fold_build2 (LT_EXPR, boolean_type_node,
+  unshare_expr (size), build_zero_cst (type));
+size = fold_build3 (COND_EXPR, type, cond,
+ build_zero_cst (type), size);
+  }
+
+  /* Only when type_of_size is 1,i.e, the number of the elements of
+ the object type, return the size.  */
+  if (type_of_size != 1)
+return NULL_TREE;
+  else
+size = fold_convert (sizetype, size);
+
+  return size;
+}
+
+
 /* Instrument array bounds for ARRAY_REFs.  We create special builtin,
that gets expanded in the sanopt pass, and make an array dimension
of it.  ARRAY is the array, *INDEX is an index to the array.
@@ -401,6 +435,14 @@ ubsan_instrument_bounds (location_t loc, tree array, tree 
*index,
&& COMPLETE_TYPE_P (type)
&& integer_zerop (TYPE_SIZE (type)))
  bound = build_int_cst (TREE_TYPE (TYPE_MIN_VALUE (domain)), -1);
+  else if (INDIRECT_REF_P (array)
+&& is_access_with_size_p ((TREE_OPERAND (array, 0
+ {
+   bound = get_bound_from_access_with_size ((TREE_OPERAND (array, 0)));
+   bound = fold_build2 (MINUS_EXPR, TREE_TYPE (bound),
+bound,
+build_int_cst (TREE_TYPE (bound), 1));
+ }

This will wrap if bound == 0, maybe that needs to be special-cased.  And maybe 
also add a test for it below.

Will check on this to see whether a new test is needed.

Thanks a lot for the review.

Qing

   else
  return NULL_TREE;
 }
diff --git a/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c 
b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
new file mode 100644
index ..148934975ee5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
@@ -0,0 +1,45 @@
+/* test the attribute counted_by and its usage in
+   bounds sanitizer combined with VLA.  */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+/* { dg-output "index 11 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 20 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 11 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 10 out of bounds for type 'int 
\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+
+#include <stdlib.h>
+
+void __attribute__((__noinline__)) setup_and_test_vla (int n, int m)
+{
+   struct foo {
+   int n;
+   int p[][n] __attribute__((counted_by(n)));
+   } *f;
+
+   f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n]));
+   f->n = m;
+   f->p[m][n-1]=1;
+   return;
+}
+
+void __attribute__((__noinline__)) setup_and_test_v

Re: [PATCH v6 4/5] Use the .ACCESS_WITH_SIZE in bound sanitizer.

2024-03-15 Thread Qing Zhao


On Mar 13, 2024, at 15:19, Qing Zhao  wrote:



On Mar 11, 2024, at 13:15, Siddhesh Poyarekar  wrote:



On 2024-02-16 14:47, Qing Zhao wrote:
gcc/c-family/ChangeLog:
* c-ubsan.cc (get_bound_from_access_with_size): New function.
(ubsan_instrument_bounds): Handle call to .ACCESS_WITH_SIZE.
gcc/testsuite/ChangeLog:
* gcc.dg/ubsan/flex-array-counted-by-bounds-2.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds-3.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds.c: New test.
---
 gcc/c-family/c-ubsan.cc   | 42 +
 .../ubsan/flex-array-counted-by-bounds-2.c| 45 ++
 .../ubsan/flex-array-counted-by-bounds-3.c| 34 ++
 .../ubsan/flex-array-counted-by-bounds.c  | 46 +++
 4 files changed, 167 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-3.c
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c
diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index 940982819ddf..164b29845b3a 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -376,6 +376,40 @@ ubsan_instrument_return (location_t loc)
   return build_call_expr_loc (loc, t, 1, build_fold_addr_expr_loc (loc, data));
 }
 +/* Get the tree that represented the number of counted_by, i.e, the maximum
+   number of the elements of the object that the call to .ACCESS_WITH_SIZE
+   points to, this number will be the bound of the corresponding array.  */
+static tree
+get_bound_from_access_with_size (tree call)
+{
+  if (!is_access_with_size_p (call))
+return NULL_TREE;
+
+  tree ref_to_size = CALL_EXPR_ARG (call, 1);
+  unsigned int type_of_size = TREE_INT_CST_LOW (CALL_EXPR_ARG (call, 2));

Again for consistency, this should probably be class_of_size.

Okay, I will update this consistently with the change related to the 3rd 
argument.

+  tree type = TREE_TYPE (CALL_EXPR_ARG (call, 3));
+  tree size = fold_build2 (MEM_REF, type, unshare_expr (ref_to_size),
+build_int_cst (ptr_type_node, 0));
+  /* If size is negative value, treat it as zero.  */
+  if (!TYPE_UNSIGNED (type))
+  {
+tree cond = fold_build2 (LT_EXPR, boolean_type_node,
+  unshare_expr (size), build_zero_cst (type));
+size = fold_build3 (COND_EXPR, type, cond,
+ build_zero_cst (type), size);
+  }
+
+  /* Only when type_of_size is 1,i.e, the number of the elements of
+ the object type, return the size.  */
+  if (type_of_size != 1)
+return NULL_TREE;
+  else
+size = fold_convert (sizetype, size);
+
+  return size;
+}
+
+
 /* Instrument array bounds for ARRAY_REFs.  We create special builtin,
that gets expanded in the sanopt pass, and make an array dimension
of it.  ARRAY is the array, *INDEX is an index to the array.
@@ -401,6 +435,14 @@ ubsan_instrument_bounds (location_t loc, tree array, tree 
*index,
&& COMPLETE_TYPE_P (type)
&& integer_zerop (TYPE_SIZE (type)))
  bound = build_int_cst (TREE_TYPE (TYPE_MIN_VALUE (domain)), -1);
+  else if (INDIRECT_REF_P (array)
+&& is_access_with_size_p ((TREE_OPERAND (array, 0
+ {
+   bound = get_bound_from_access_with_size ((TREE_OPERAND (array, 0)));
+   bound = fold_build2 (MINUS_EXPR, TREE_TYPE (bound),
+bound,
+build_int_cst (TREE_TYPE (bound), 1));
+ }

This will wrap if bound == 0, maybe that needs to be special-cased.  And maybe 
also add a test for it below.

Will check on this to see whether a new test is needed.

Checked: the current code handles the case when bound == 0 correctly.
I have just added a new test case for this.
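
One way to see why the wrap-around need not be a problem is the small model
below. It is only a sketch in plain C with made-up model_* names, and it
assumes that the consumer of the bound adds one back in the same unsigned
type before comparing against the index; whether ubsan_instrument_bounds
does exactly that is an assumption of this example, not something taken from
the patch. Under that assumption, (count - 1) + 1 is count again modulo the
width of size_t, so a count of zero rejects every index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Treat a negative counter as zero, then convert to size_t.  */
static size_t
model_count (long counted_by_value)
{
  return counted_by_value < 0 ? 0 : (size_t) counted_by_value;
}

/* count - 1 in unsigned arithmetic; wraps to SIZE_MAX for count == 0.  */
static size_t
model_max_index (size_t count)
{
  return count - 1;
}

/* Assumed form of the check: add one back and compare.  For
   max_index == SIZE_MAX the sum wraps to 0, so no index passes.  */
static bool
model_index_in_bounds (size_t index, size_t max_index)
{
  return index < max_index + 1;
}

/* Sanitizer-like store that traps on an out-of-bounds index.  */
static void
model_checked_store (int *array, long counted_by_value, size_t index, int v)
{
  size_t max_index = model_max_index (model_count (counted_by_value));
  assert (model_index_in_bounds (index, max_index));
  array[index] = v;
}
```

In this model a counter of 10 accepts index 9 and rejects index 10, while a
counter of 0 (or any negative value) rejects index 0 as well.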

thanks.

Qing

Thanks a lot for the review.

Qing

   else
  return NULL_TREE;
 }
diff --git a/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c 
b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
new file mode 100644
index ..148934975ee5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
@@ -0,0 +1,45 @@
+/* test the attribute counted_by and its usage in
+   bounds sanitizer combined with VLA.  */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+/* { dg-output "index 11 out of bounds for type 'int \\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 20 out of bounds for type 'int \\\[\\\*\\\]\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 11 out of bounds for type 'int \\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 10 out of bounds for type 'int \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+
+#include 
+
+void __attribute__((__noinline__)) setup_and_test_vla (int n, int m)
+{
+   struct foo {
+   int n;
+   int p[][n] __attribute__((counted_by(n)));
+   } *f;
+
+

Re: [PATCH v6 3/5] Use the .ACCESS_WITH_SIZE in builtin object size.

2024-03-18 Thread Qing Zhao


On Mar 13, 2024, at 15:17, Qing Zhao  wrote:



On Mar 11, 2024, at 13:11, Siddhesh Poyarekar  wrote:



On 2024-02-16 14:47, Qing Zhao wrote:
gcc/ChangeLog:
* tree-object-size.cc (access_with_size_object_size): New function.
(call_object_size): Call the new function.
gcc/testsuite/ChangeLog:
* gcc.dg/builtin-object-size-common.h: Add a new macro EXPECT.
* gcc.dg/flex-array-counted-by-3.c: New test.
* gcc.dg/flex-array-counted-by-4.c: New test.
* gcc.dg/flex-array-counted-by-5.c: New test.
---
 .../gcc.dg/builtin-object-size-common.h   |  11 ++
 .../gcc.dg/flex-array-counted-by-3.c  |  63 +++
 .../gcc.dg/flex-array-counted-by-4.c  | 178 ++
 .../gcc.dg/flex-array-counted-by-5.c  |  48 +
 gcc/tree-object-size.cc   |  59 ++
 5 files changed, 359 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-5.c
diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-common.h b/gcc/testsuite/gcc.dg/builtin-object-size-common.h
index 66ff7cdd953a..b677067c6e6b 100644
--- a/gcc/testsuite/gcc.dg/builtin-object-size-common.h
+++ b/gcc/testsuite/gcc.dg/builtin-object-size-common.h
@@ -30,3 +30,14 @@ unsigned nfails = 0;
   __builtin_abort ();   \
 return 0;   \
   } while (0)
+
+#define EXPECT(p, _v) do {   \
+  size_t v = _v;   \
+  if (p == v)   \
+__builtin_printf ("ok:  %s == %zd\n", #p, p);   \
+  else   \
+{   \
+  __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v);   \
+  FAIL ();   \
+}   \
+} while (0);
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
new file mode 100644
index ..0066c32ca808
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
@@ -0,0 +1,63 @@
+/* test the attribute counted_by and its usage in
+ * __builtin_dynamic_object_size.  */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include "builtin-object-size-common.h"
+
+struct flex {
+  int b;
+  int c[];
+} *array_flex;
+
+struct annotated {
+  int b;
+  int c[] __attribute__ ((counted_by (b)));
+} *array_annotated;
+
+struct nested_annotated {
+  struct {
+union {
+  int b;
+  float f;
+};
+int n;
+  };
+  int c[] __attribute__ ((counted_by (b)));
+} *array_nested_annotated;
+
+void __attribute__((__noinline__)) setup (int normal_count, int attr_count)
+{
+  array_flex
+= (struct flex *)malloc (sizeof (struct flex)
+  + normal_count *  sizeof (int));
+  array_flex->b = normal_count;
+
+  array_annotated
+= (struct annotated *)malloc (sizeof (struct annotated)
+   + attr_count *  sizeof (int));
+  array_annotated->b = attr_count;
+
+  array_nested_annotated
+= (struct nested_annotated *)malloc (sizeof (struct nested_annotated)
+  + attr_count *  sizeof (int));
+  array_nested_annotated->b = attr_count;
+
+  return;
+}
+
+void __attribute__((__noinline__)) test ()
+{
+EXPECT(__builtin_dynamic_object_size(array_flex->c, 1), -1);
+EXPECT(__builtin_dynamic_object_size(array_annotated->c, 1),
+array_annotated->b * sizeof (int));
+EXPECT(__builtin_dynamic_object_size(array_nested_annotated->c, 1),
+array_nested_annotated->b * sizeof (int));
+}
+
+int main(int argc, char *argv[])
+{
+  setup (10,10);
+  test ();
+  DONE ();
+}
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
new file mode 100644
index ..3ce7f3545549
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
@@ -0,0 +1,178 @@
+/* test the attribute counted_by and its usage in
+__builtin_dynamic_object_size: what's the correct behavior when the
+allocation size mismatched with the value of counted_by attribute?
+we should always use the latest value that is hold by the counted_by
+field.  */
+/* { dg-do run } */
+/* { dg-options "-O -fstrict-flex-arrays=3" } */
+
+#include "builtin-object-size-common.h"
+
+struct annotated {
+  size_t foo;
+  char others;
+  char array[] __attribute__((counted_by (foo)));
+};
+
+#define noinline __attribute__((__noinline__))
+#define SIZE_BUMP 10
+#define MAX(a, b) ((a) > (b) ? (a) : (b))
+
+/* In general, Due to type casting, the type for the pointee of a pointer
+   does not say anything about the object it points to,
+   So, __builtin_object_size can not directly use the type of the pointee
+   to decide the size of the object the pointer points to.
+
+   there are only two reliable ways:
+   A. observed allocations  (call to the allocation functions in the routine)
+   B. observed accesses (read or write access to the location of the
+ pointer points to)
+
+   that provide information abo

Re: [PATCH v6 3/5] Use the .ACCESS_WITH_SIZE in builtin object size.

2024-03-18 Thread Qing Zhao


> On Mar 18, 2024, at 12:30, Siddhesh Poyarekar  wrote:
> 
> On 2024-03-18 12:28, Qing Zhao wrote:
>>>> This should probably bail out if object_size_type & OST_DYNAMIC == 0.
>>> Okay. Will add this.
>> When adding this to access_with_size_object_size, I found some old bugs in
>> early_object_sizes_execute_one, and fixed them as well.
>
> Would you be able to isolate these fixes and post them separately?  I reckon
> they would be relevant for gcc 14 too.

Yes, that’s a good idea, I can do that.
There is no specific test case for it, though.

thanks.

Qing

> 
> Thanks,
> Sid



[PATCH][tree-object-size]Pass OST_DYNAMIC information to early_object_size phase

2024-03-19 Thread Qing Zhao
Currently, the OST_DYNAMIC information is not passed to the
early_object_sizes phase. Pass this information to it, and adjust the code
and test case accordingly.

Bootstrapped and regression-tested on both x86 and aarch64 with no issues.

Okay for trunk?

thanks.

Qing

gcc/ChangeLog:

* tree-object-size.cc (early_object_sizes_execute_one): Add one more
argument is_dynamic.
(object_sizes_execute): Call early_object_sizes_execute_one with one
more argument.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-dynamic-object-size-10.c: Update testing case.
---
 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-10.c |  4 ++--
 gcc/tree-object-size.cc   | 11 ---
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-10.c b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-10.c
index 3a2d9821a44e..3c5430b51358 100644
--- a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-10.c
+++ b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-10.c
@@ -7,5 +7,5 @@
 
 /* early_objsz should resolve __builtin_dynamic_object_size like
__builtin_object_size.  */
-/* { dg-final { scan-tree-dump "maximum object size 21" "early_objsz" } } */
-/* { dg-final { scan-tree-dump "maximum subobject size 16" "early_objsz" } } */
+/* { dg-final { scan-tree-dump "maximum dynamic object size 21" "early_objsz" } } */
+/* { dg-final { scan-tree-dump "maximum dynamic subobject size 16" "early_objsz" } } */
diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
index 018fbc30cbb6..57739eed3abf 100644
--- a/gcc/tree-object-size.cc
+++ b/gcc/tree-object-size.cc
@@ -2050,7 +2050,8 @@ do_valueize (tree t)
since we're only looking for constant bounds.  */
 
 static void
-early_object_sizes_execute_one (gimple_stmt_iterator *i, gimple *call)
+early_object_sizes_execute_one (gimple_stmt_iterator *i, gimple *call,
+   bool is_dynamic)
 {
   tree ost = gimple_call_arg (call, 1);
   tree lhs = gimple_call_lhs (call);
@@ -2060,9 +2061,12 @@ early_object_sizes_execute_one (gimple_stmt_iterator *i, gimple *call)
 return;
 
   unsigned HOST_WIDE_INT object_size_type = tree_to_uhwi (ost);
+  if (is_dynamic)
+object_size_type |= OST_DYNAMIC;
+
   tree ptr = gimple_call_arg (call, 0);
 
-  if (object_size_type != 1 && object_size_type != 3)
+  if ((object_size_type & OST_SUBOBJECT) == 0)
 return;
 
   if (TREE_CODE (ptr) != ADDR_EXPR && TREE_CODE (ptr) != SSA_NAME)
@@ -2071,6 +2075,7 @@ early_object_sizes_execute_one (gimple_stmt_iterator *i, gimple *call)
   tree type = TREE_TYPE (lhs);
   tree bytes;
   if (!compute_builtin_object_size (ptr, object_size_type, &bytes)
+  || (TREE_CODE (bytes) != INTEGER_CST)
   || !int_fits_type_p (bytes, type))
 return;
 
@@ -2153,7 +2158,7 @@ object_sizes_execute (function *fun, bool early)
 __builtin_dynamic_object_size too.  */
  if (early)
{
- early_object_sizes_execute_one (&i, call);
+ early_object_sizes_execute_one (&i, call, dynamic);
  continue;
}
 
-- 
2.31.1



Re: [PATCH][tree-object-size]Pass OST_DYNAMIC information to early_object_size phase

2024-03-19 Thread Qing Zhao


On Mar 19, 2024, at 09:46, Jakub Jelinek  wrote:

> On Tue, Mar 19, 2024 at 01:14:51PM +, Qing Zhao wrote:
>> Currently, the OST_DYNAMIC information is not passed to
>> early_object_sizes phase. Pass this information to it, and adjust the code
>> and testing case accordingly.
>
> Can you explain why you think this is desirable?
> Having both passes emit the dynamic instrumentation is IMHO undesirable,
> the first pass exists just to catch subobject properties which are later
> lost.

Okay, thanks for the comments. This makes good sense to me. So the dynamic
information was intended to be ignored in the early pass.

I will try to fix the original issue (for the counted-by patches) in a
different way.

> In any case, if this isn't a regression fix, it isn't suitable for
> stage4, seems quite risky.

Agreed.

thanks.

Qing






[PATCH v7 0/5] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-03-20 Thread Qing Zhao
Hi,

This is the 7th version of the patch.

Compared with the 6th version, the differences are:

Updates per Siddhesh's comments:
1. Update the error messages in "handle_counted_by_attribute",
   then update the test cases accordingly;
2. Update the error messages in "verify_counted_by_attribute",
   then update the test cases accordingly;
3. Update the documentation of "counted_by" in extend.texi;
4. For the 3rd argument of ACCESS_WITH_SIZE, change it as follows:
+   3rd argument CLASS_OF_SIZE: The size referenced by the REF_TO_SIZE represents
+ 0: the number of bytes;
+ 1: the number of the elements of the object type;

   Update all other places accordingly.
5. Update the comments of the routine "access_with_size_object_size";
   bail out if (object_size_type & OST_DYNAMIC) == 0 in this routine;
   rename the variable "type_of_size" to "class_of_size" for
   consistency.
6. Add one more test case for the bounds sanitizer to handle the case when
   the counted-by field is zero.


It is based on the following original proposal:

https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635884.html
Represent the missing dependence for the "counted_by" attribute and its
consumers

**The summary of the proposal is:

* Add a new internal function ".ACCESS_WITH_SIZE" to carry the size
  information for every reference to a FAM field;
* In the C FE, replace every reference to a FAM field whose TYPE has the
  "counted_by" attribute with the new internal function ".ACCESS_WITH_SIZE";
* In every consumer of the size information, for example, BDOS or the array
  bounds sanitizer, query the size information or ACCESS_MODE information
  from the new internal function;
* When expanding to RTL, replace the internal function with the actual
  reference to the FAM field;
* Some adjustments to IPA alias analysis and other SSA passes to mitigate
  the impact on the optimizer and code generation.


**The new internal function

  .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, TYPE_OF_SIZE,
                     ACCESS_MODE, TYPE_OF_REF)

INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)

which returns "REF_TO_OBJ", the same as its 1st argument.

Both the return type and the type of the first argument of this function have
been converted from the incomplete array type to the corresponding pointer type.

The call to .ACCESS_WITH_SIZE is wrapped with an INDIRECT_REF, whose type is
the original incomplete array type.

Please see the following links for why:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html

1st argument "REF_TO_OBJ": The reference to the object;
2nd argument "REF_TO_SIZE": The reference to the size of the object;
3rd argument "CLASS_OF_SIZE": The size referenced by REF_TO_SIZE represents
   0: the number of bytes;
   1: the number of the elements of the object type;
4th argument "TYPE_OF_SIZE": A constant 0 with the TYPE of the object
  referenced by REF_TO_SIZE;
5th argument "ACCESS_MODE":
  -1: Unknown access semantics
   0: none
   1: read_only
   2: write_only
   3: read_write
6th argument "TYPE_OF_REF": A constant 0 with the pointer TYPE to
  the original flexible array type.

** The patch set includes:

1. Provide the counted_by attribute for flexible array member fields,
  which includes:
  * "counted_by" attribute documentation;
  * C FE handling of the new attribute:
    syntax checking, error reporting;
  * test cases;

2. Convert the "counted_by" attribute to/from .ACCESS_WITH_SIZE,
  which includes:
  * The definition of the new internal function .ACCESS_WITH_SIZE in
    internal-fn.def.
  * The C FE converts every reference to a FAM with the "counted_by"
    attribute to a call to the internal function .ACCESS_WITH_SIZE
    (build_component_ref in c/c-typeck.cc).
    This includes the case when the object is statically allocated and
    initialized.
    To make this work, we should update initializer_constant_valid_p_1
    and output_constant in varasm.cc to include calls to
    .ACCESS_WITH_SIZE.

    However, for a reference inside "offsetof", ignore the "counted_by"
    attribute since it is not useful there (c_parser_postfix_expression
    in c/c-parser.cc).
    In addition to "offsetof", for a reference inside the operators
    "typeof" and "alignof", we ignore the counted_by attribute too.
    When building an ADDR_EXPR for the .ACCESS_WITH_SIZE in the C FE,
    replace the call with its first argument.

  * Convert every call to .ACCESS_WITH_SIZE to its first argument
    (expand_ACCESS_WITH_SIZE in internal-fn.cc).
  * Adjust alias analysis to exclude the new internal function from
    clobbering anything
    (ref_maybe_used_by_call_p_1 and call_may_clobber_ref_p_1 in
    tree-ssa-alias.cc).
  * Adjust dead code elimination to eliminate the call to
    .ACCESS_WITH_SIZE when its LHS is eliminated as dead code
    (eliminate_unnecessary_stmts in tree-ssa-dce.cc).
