On IPA PTA field sensitivity and pointer expression (part 2)

2020-09-28 Thread Erick Ochoa

Hi,

previously I sent an e-mail inquiring about the state of points-to 
information of structure variables allocated in the heap. It was brought 
to my attention that heap variables do not have a size to model and 
therefore IPA-PTA is not able to provide field sensitivity.


I now understand better how field sensitivity is modeled in IPA-PTA and 
the way size is needed in order to compute the correct solution. 
However, I am now trying to compute the points-to analysis for pointer 
expressions for stack allocated struct variables. I am trying to answer 
the question:


What does `temp->f1` points to? For the following simple example without 
heap allocated memory.


```c
struct A { char* f0; char *f1; struct A *f2;};

int __GIMPLE(startwith("ipa-pta"))
main (int argc, char * * argv)
{
  struct A p1;
  char * pc;
  char c;
  char *cast;
  struct A*temp;
  char *temp2;
  int i;
  int _27;

  i_15 = 1;
  pc = &c;
  p1.f1 = pc;
  p1.f2 = &p1;
  _27 = 0;
  cast = pc;
  temp = p1.f2;
  temp2 = temp->f1;
  return _27;
}
```

There are two question I have regarding this example. The first one is 
that IPA-PTA will determine that temp2 points to { c p1 } while I think 
it should only point to { c } and I'm trying to understand why. The 
second thing is that, I am still unsure how to get points-to information 
for pointer expressions like temp->f1.


Details:

IPA-PTA correctly points out that the structure p1 and structure pointer 
temp can point to both { c and p1 }


```
c = { }
p1 = { c p1 } same as temp_33
temp_33 = { c p1 }
```

I believe this is because p1 is a the whole struct variable, and temp_33 
is also modeling the whole struct variable. (in other words *temp_33+64 
points-to c, *temp_33+128 points-to p1. Note that nothing is in field f0)


However, in the case of temp2, we have the following points-to information:


```
temp2_34 = { c p1 }
```

which I believe is an over approximation. Looking at the constraints 
generated, we see that temp2_34 was assigned the following constraint


temp2_34 = *temp_33 + 64

And that means that the method do_sd_constraint should have been used to 
compute the correct points to information. Looking at the the method, 
and adding some print statements, it is clear to me that the problem 
with this imprecision is that temp_33 may point to { c } in its second 
field. However, isn't GCC supposed to take into account field 
information in this case? I believe that in order to make this more 
precise we need a change in the get_varinfo API to something that takes 
into account offsets and gets the solution for pointer expressions.


Instead of this line
  else if (v->may_have_pointers
   && add_graph_edge (graph, lhs, t))
flag |= bitmap_ior_into (sol, get_varinfo (t)->solution);

something like:

  else if (v->may_have_pointers
   && add_graph_edge (graph, lhs, t))
flag |= bitmap_ior_into (sol, get_varinfo (t, 
roffset)->solution);


This seems to me that it is already a known issue and it might be 
described accurately by this comment.


  TODO: Adding offsets to pointer-to-structures can be handled (IE not 
punted

  on and turned into anything), but isn't.  You can just see what offset
  inside the pointed-to struct it's going to access.

So, I just want to confirm, does this comment refer concretely to what 
I'm trying to do? And does this mean that in order to accomplish an API 
similar to what I described, would I need to create new constraint 
variables? (One new constraint variable for each field in all pointer to 
struct variables)


Thanks!


Re: On IPA PTA field sensitivity and pointer expression (part 2)

2020-09-28 Thread Erick Ochoa




On 28/09/2020 14:25, Erick Ochoa wrote:

Hi,

previously I sent an e-mail inquiring about the state of points-to 
information of structure variables allocated in the heap. It was brought 
to my attention that heap variables do not have a size to model and 
therefore IPA-PTA is not able to provide field sensitivity.


I now understand better how field sensitivity is modeled in IPA-PTA and 
the way size is needed in order to compute the correct solution. 
However, I am now trying to compute the points-to analysis for pointer 
expressions for stack allocated struct variables. I am trying to answer 
the question:


What does `temp->f1` points to? For the following simple example without 
heap allocated memory.


```c
struct A { char* f0; char *f1; struct A *f2;};

int __GIMPLE(startwith("ipa-pta"))
main (int argc, char * * argv)
{
   struct A p1;
   char * pc;
   char c;
   char *cast;
   struct A*temp;
   char *temp2;
   int i;
   int _27;

   i_15 = 1;
   pc = &c;
   p1.f1 = pc;
   p1.f2 = &p1;
   _27 = 0;
   cast = pc;
   temp = p1.f2;
   temp2 = temp->f1;
   return _27;
}
```

There are two question I have regarding this example. The first one is 
that IPA-PTA will determine that temp2 points to { c p1 } while I think 
it should only point to { c } and I'm trying to understand why. The 
second thing is that, I am still unsure how to get points-to information 
for pointer expressions like temp->f1.


Details:

IPA-PTA correctly points out that the structure p1 and structure pointer 
temp can point to both { c and p1 }


```
c = { }
p1 = { c p1 } same as temp_33
temp_33 = { c p1 }
```

I believe this is because p1 is a the whole struct variable, and temp_33 
is also modeling the whole struct variable. (in other words *temp_33+64 
points-to c, *temp_33+128 points-to p1. Note that nothing is in field f0)


However, in the case of temp2, we have the following points-to information:


```
temp2_34 = { c p1 }
```

which I believe is an over approximation. Looking at the constraints 
generated, we see that temp2_34 was assigned the following constraint


temp2_34 = *temp_33 + 64

And that means that the method do_sd_constraint should have been used to 
compute the correct points to information. Looking at the the method, 
and adding some print statements, it is clear to me that the problem 
with this imprecision is that temp_33 may point to { c } in its second 
field. 


Small correction: temp_33 may point to p1 in its third field.

However, isn't GCC supposed to take into account field
information in this case? I believe that in order to make this more 
precise we need a change in the get_varinfo API to something that takes 
into account offsets and gets the solution for pointer expressions.


Instead of this line
   else if (v->may_have_pointers
    && add_graph_edge (graph, lhs, t))
     flag |= bitmap_ior_into (sol, get_varinfo (t)->solution);

something like:

   else if (v->may_have_pointers
    && add_graph_edge (graph, lhs, t))
     flag |= bitmap_ior_into (sol, get_varinfo (t, 
roffset)->solution);


This seems to me that it is already a known issue and it might be 
described accurately by this comment.


   TODO: Adding offsets to pointer-to-structures can be handled (IE not 
punted

   on and turned into anything), but isn't.  You can just see what offset
   inside the pointed-to struct it's going to access.

So, I just want to confirm, does this comment refer concretely to what 
I'm trying to do? And does this mean that in order to accomplish an API 
similar to what I described, would I need to create new constraint 
variables? (One new constraint variable for each field in all pointer to 
struct variables)


Thanks!


Re: First set of patches to allow changing the long double default were posted:

2020-09-28 Thread Joseph Myers
On Thu, 24 Sep 2020, Michael Meissner wrote:

> As per the discussion in this thread, I did decide to keep things to two types
> within the compiler.  This means that an explicit __float128 or _Float128 will
> use the same type node and TFmode as long double uses if the default for long
> double is IEEE 128-bit.

I'm not sure which patch in the series is supposed to be implementing 
that, but C requires that long double and _Float128 are distinct types (so 
you can use _Generic to choose between them, for example) even if they 
have the same ABI.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH v4] : Add nitems()

2020-09-28 Thread Alejandro Colomar via Gcc
'nitems()' calculates the length of an array in number of items.
It is safe: if a pointer is passed to the macro (or function, in C++),
the compilation is broken due to:
 - In >= C11: _Static_assert()
 - In C89, C99: Negative anonymous bitfield
 - In C++: The template requires an array

Some BSDs already provide a macro nitems() in ,
although it usually doesn't provide safety against pointers.

This patch uses the same name for compatibility reasons,
and to be the least disruptive with existing code.

This patch also adds some other macros, which are required by 'nitems()':

__is_same_type(a, b):
Returns non-zero if the two input arguments are of the same type.

__is_array(arr):
Returns non-zero if the input argument is of an array type.

__must_be(expr, msg):
Allows using _Static_assert() everywhere an expression can be used.
It evaluates '(int)0' or breaks the compilation.

__must_be_array(arr):
It evaluates to '(int)0' if the argument is of an array type.
Else, it breaks compilation.

__nitems(arr):
It implements the basic sizeof division needed to calculate the array length.


P.S.: I'd like to put this patch in the public domain.

Signed-off-by: Alejandro Colomar 
---

A few changes since v3:

- Macros don't need reserved names in their parameters,
so I simplified those names.

- I fixed some wrong indentation levels.

- Renamed __array_len() to __nitems() for consistency.


 misc/sys/param.h | 47 +++
 1 file changed, 47 insertions(+)

diff --git a/misc/sys/param.h b/misc/sys/param.h
index d7c319b157..08d4093961 100644
--- a/misc/sys/param.h
+++ b/misc/sys/param.h
@@ -102,5 +102,52 @@
 #define MIN(a,b) (((a)<(b))?(a):(b))
 #define MAX(a,b) (((a)>(b))?(a):(b))
 
+/* Macros related to the types of variables */
+#define __is_same_type(a, b)  \
+   __builtin_types_compatible_p(__typeof__(a), __typeof__(b))
+#define __is_array(arr)(!__is_same_type((arr), &(arr)[0]))
+
+/* Macros for embedding _Static_assert() in expressions */
+#if __STDC_VERSION__ >= 201112L
+# define __must_be(expr, msg)   ( \
+0 * (int)sizeof(  \
+  struct {\
+_Static_assert((expr), msg);  \
+char _ISO_C_forbids_a_struct_with_no_members; \
+  }   \
+) \
+)
+#else
+# define __must_be(expr, msg)   ( \
+0 * (int)sizeof(  \
+  struct {\
+int  : (-!(expr));\
+char _ISO_C_forbids_a_struct_with_no_members; \
+  }   \
+) \
+)
+#endif
+#define __must_be_array(arr)   __must_be(__is_array(arr), "Must be an array!")
+
+/* Macros for array sizes */
+#if defined(__cplusplus)
+# if __cplusplus >= 201103L
+template
+  constexpr inline std::size_t
+  nitems(const _Tp(&)[_Len]) __THROW
+  {
+return _Len;
+  }
+# else /* __cplusplus < 201103L */
+template
+  char
+  (&__nitems_chararr(const _Tp(&)[_Len]))[_Len];
+#  define nitems(arr)  (sizeof(__nitems_chararr(arr)))
+# endif /* __cplusplus < 201103L */
+#else /* !defined(__cplusplus) */
+# define __nitems(arr) (sizeof((arr)) / sizeof((arr)[0]))
+# define nitems(arr)   (__nitems(arr) + __must_be_array(arr))
+#endif /* !defined(__cplusplus) */
+
 
 #endif  /* sys/param.h */
-- 
2.28.0



Re: First set of patches to allow changing the long double default were posted:

2020-09-28 Thread Michael Meissner via Gcc
On Mon, Sep 28, 2020 at 04:38:51PM +, Joseph Myers wrote:
> On Thu, 24 Sep 2020, Michael Meissner wrote:
> 
> > As per the discussion in this thread, I did decide to keep things to two 
> > types
> > within the compiler.  This means that an explicit __float128 or _Float128 
> > will
> > use the same type node and TFmode as long double uses if the default for 
> > long
> > double is IEEE 128-bit.
> 
> I'm not sure which patch in the series is supposed to be implementing 
> that, but C requires that long double and _Float128 are distinct types (so 
> you can use _Generic to choose between them, for example) even if they 
> have the same ABI.

Hmmm, but others said that it would complicate things if __float128 were
different than long double.  I've implemented it both ways (I would have to dig
up the patches to have separate types).

And doing so might break all of the glibc efforts to provide dual symbols.

I could obviously use __float128 to be the same as long double, and _Float128
being a unique type if long double is IEEE.  And since C++ doesn't have
_Float128, it would 'work' without introducing a new mangling name.

I don't think there is a solution that doesn't break something.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: First set of patches to allow changing the long double default were posted:

2020-09-28 Thread Jakub Jelinek via Gcc
On Mon, Sep 28, 2020 at 05:07:55PM -0400, Michael Meissner wrote:
> On Mon, Sep 28, 2020 at 04:38:51PM +, Joseph Myers wrote:
> > On Thu, 24 Sep 2020, Michael Meissner wrote:
> > 
> > > As per the discussion in this thread, I did decide to keep things to two 
> > > types
> > > within the compiler.  This means that an explicit __float128 or _Float128 
> > > will
> > > use the same type node and TFmode as long double uses if the default for 
> > > long
> > > double is IEEE 128-bit.
> > 
> > I'm not sure which patch in the series is supposed to be implementing 
> > that, but C requires that long double and _Float128 are distinct types (so 
> > you can use _Generic to choose between them, for example) even if they 
> > have the same ABI.
> 
> Hmmm, but others said that it would complicate things if __float128 were
> different than long double.  I've implemented it both ways (I would have to 
> dig
> up the patches to have separate types).
> 
> And doing so might break all of the glibc efforts to provide dual symbols.
> 
> I could obviously use __float128 to be the same as long double, and _Float128
> being a unique type if long double is IEEE.  And since C++ doesn't have
> _Float128, it would 'work' without introducing a new mangling name.

That seems preferrable to me.

Jakub



Re: {standard input}:1174: Error: inappropriate arguments for opcode 'mpydu'

2020-09-28 Thread Nicolas Pitre
On Sun, 27 Sep 2020, Rong Chen wrote:

> Hi Nicolas,
> 
> Thanks for the feedback, the error still remains with gcc 10.2.0:

I've created the simplest test case that can be. You won't believe it.

Test case:

$ cat test.c
unsigned int test(unsigned int x, unsigned long long y)
{
y /= 0x2000;
if (x > 1)
y *= x;
return y;
}
$ export 
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/0day/gcc-9.3.0-nolibc/arc-elf/libexec/gcc/arc-elf/9.3.0
$ ~/0day/gcc-9.3.0-nolibc/arc-elf/bin/arc-elf-gcc -mcpu=hs38 -mbig-endian -O2 
-c test.c
/tmp/cc0GAomh.s: Assembler messages:
/tmp/cc0GAomh.s:21: Error: inappropriate arguments for opcode 'mpydu'

I know nothing about ARC. Please anyone take it over from here.


Nicolas


Re: First set of patches to allow changing the long double default were posted:

2020-09-28 Thread Joseph Myers
On Mon, 28 Sep 2020, Michael Meissner via Gcc wrote:

> > I'm not sure which patch in the series is supposed to be implementing 
> > that, but C requires that long double and _Float128 are distinct types (so 
> > you can use _Generic to choose between them, for example) even if they 
> > have the same ABI.
> 
> Hmmm, but others said that it would complicate things if __float128 were
> different than long double.  I've implemented it both ways (I would have to 
> dig
> up the patches to have separate types).
> 
> And doing so might break all of the glibc efforts to provide dual symbols.
> 
> I could obviously use __float128 to be the same as long double, and _Float128
> being a unique type if long double is IEEE.  And since C++ doesn't have
> _Float128, it would 'work' without introducing a new mangling name.

My comment is specifically about _Float128, in C; it doesn't say anything 
about what should happen for C++ (though it would be nice to resolve the 
ICEs that occur when _FloatN / _FloatNx types leak into C++ code via 
built-in functions, bug 85518).  Maybe it makes sense for the type (or at 
least the type used by the built-in functions and __float128, since 
_Float128 itself can't be accessed directly in C++) to be the same in the 
C++ case.

The existing code in build_common_tree_nodes that creates all the _FloatN 
/ _FloatNx type nodes always creates a new node with make_node.  And since 
I didn't spot anything in your patch series that changes that, I couldn't 
see how that patch series actually does make them the same type.

-- 
Joseph S. Myers
jos...@codesourcery.com