Re: [PATCH, ARM] PR68674 Fix LTO support for neon builtins and error catching

Kyrill Tkachov Thu, 10 Dec 2015 02:19:58 -0800


On 10/12/15 10:11, Christian Bruel wrote:



On 12/10/2015 10:59 AM, Kyrill Tkachov wrote:


On 10/12/15 09:26, Christian Bruel wrote:

Hi Kyrill,

On 12/09/2015 06:32 PM, Kyrill Tkachov wrote:

Hi Christian,

On 08/12/15 12:53, Christian Bruel wrote:

Hi,

The order of the NEON builtins construction has led to complications since the 
attribute target support. This was not a problem when driven from the command 
line, but was causing various issues when the builtins was mixed between fpu
configurations or when used with LTO.

Firstly the builtin functions was not initialized before the parsing of 
functions, leading to wrong type initializations.

Then error catching code when a builtin was used without the proper fpu flags 
was incomprehensible for the user, for instance

#include "arm_neon.h"

int8x8_t a, b;
int16x8_t e;

void
main()
{
    e = (int16x8_t)__builtin_neon_vaddlsv8qi (a, b);
}

compiled with default options (without -mfpu=neon -mfloat-abi=hard) gave pages 
of

/arm-none-eabi/6.0.0/include/arm_neon.h:39:9: error: unknown type name 
'__simd64_int8_t'
   typedef __simd64_int8_t int8x8_t;
...
...
arm_neon.h:4724:3: error: can't convert a vector of type 'poly64x2_t {aka 
__vector(4) int}' to type 'int' which has different size
     return (poly64x2_t)__builtin_neon_vsli_nv2di ((int64x2_t) __a, (int64x2_t) 
__b, __c);
     ^~~~~~
...
... and one for each arm_neon.h lines..

by postponing the check into arm_expand_builtin, we now emit something more 
useful:

testo.c: In function 'main':
testo.c:9:7: error: '__builtin_neon_vaddlsv8qi' neon builtin is not supported 
in this configuration.
     e = (int16x8_t)__builtin_neon_vaddlsv8qi (a, b);
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

One small side effect to note: The total memory allocated is 370k bigger when 
neon is not used, so this support will have a follow-up to make their 
initialization lazy. But I'd like first to stabilize the stuff for stage3 (or 
get it
pre-approved if the memory is an issue)

tested without new failures with {,-mfpu=vfp,-mfpu=neon}{,-march=armv7-a\}
(a few tests that was fail are now unsupported)


I agree, the vector types (re)initialisation is a tricky part.
I've seen similar issues in the aarch64 work for target attributes

    bool
    arm_vector_mode_supported_p (machine_mode mode)
    {
-  /* Neon also supports V2SImode, etc. listed in the clause below.  */
-  if (TARGET_NEON && (mode == V2SFmode || mode == V4SImode || mode == V8HImode
+  if (mode == V2SFmode || mode == V4SImode || mode == V8HImode
          || mode == V4HFmode || mode == V16QImode || mode == V4SFmode
-      || mode == V2DImode || mode == V8HFmode))
-    return true;
-
-  if ((TARGET_NEON || TARGET_IWMMXT)
-      && ((mode == V2SImode)
-      || (mode == V4HImode)
-      || (mode == V8QImode)))
+      || mode == V2DImode || mode == V8HFmode
+      || mode == V2SImode || mode == V4HImode || mode == V8QImode)
        return true;


So this allows vector modes unconditionally for all targets/fpu configurations?
I was tempted to do that in aarch64 when I was encountering similar issues.
In the end what worked for me was re-laying out the vector types in 
SET_CURRENT_FUNCTION
if necessary (https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01084.html)


yes my assumption was that arm_init_neon_builtins () is now called for all 
targets, since the check is done at expand time and that the builtins need to 
be known by lto, with the vector type initialization, before they are expanded.
However at that time, lto streaming-in have not yet processed the attributes 
and TARGET_NEON is not set for the function.

I had a look at your re-layout, but I'm not sure. it feels like a hack. I think 
this should be solved first place during the builtin construction. Also 
set_current_function is too late, builtin_expand that will explode because of 
the
unknown modes.

But raise the point. In fact I was not really happy with this 
arm_vector_mode_supported_p neither as I was not sure about other contexts it 
can be called from and I cannot clearly claim that this change is always 
correct.


So the main usage of targetm.vector_mode_supported_p is in stor-layout.c and 
vector_type_mode in particular seems
to have a relevant comment:
   /* Vector types need to re-check the target flags each time we report
      the machine mode.  We need to do this because attribute target can
      change the result of vector_mode_supported_p and have_regs_of_mode
      on a per-function basis.  Thus the TYPE_MODE of a VECTOR_TYPE can
      change on a per-function basis.  */

I think that implies that it expects targetm.vector_mode_supported_p to reject 
vector modes in
contexts that don't support NEON...


yes, thanks for this clarification, that settles it. this part of my patch is 
rubbish :-)

I'd like to think about other way to set the vector modes from 
arm_init_neon_builtins before the target flags are known. I'm thinking about 
the lazy initialization at expand time, or using a contextual boolean flags. 
how does that sound ?


Laying out the vector types during arm_init_neon_builtins sounds more promising 
to me.
Changing layout of types during expand is risky, from what I remember.


I am thinking about the arm_builtin_decl hook, not expand. There is a bool 
initialize_p flag that seems perfect for the need. (apparently it's always true 
and never used by any other target)


Sounds promising. I'm not familiar with the callsites of targetm.builtin_decl, 
but if it does what we want
maybe it's worth pursuing.

Kyrill


In principle, the types and builtins created in arm_init_neon_builtins are only 
ever supposed to be used in
a NEON context, so I thought that just turning on NEON upon entry into 
arm_init_neon_builtins and resetting
it back upon exit would work. However, this won't work because we construct our 
builtin types by copying existing
type nodes (e.g. intQI_type_node) that have been laid out earlier by the midend 
(frontend?) assuming no NEON.

I wonder if we can explicitly layout these global types in the 
arm_init_neon_builtins context...

Thanks,
Kyrill

many thanks,

Christian


Kyrill

Re: [PATCH, ARM] PR68674 Fix LTO support for neon builtins and error catching

Reply via email to