On 8/26/21 13:04, Richard Biener wrote:
On Tue, Aug 24, 2021 at 3:04 PM Martin Liška <mli...@suse.cz> wrote:

On 8/24/21 14:13, Richard Biener wrote:
On Thu, Jul 1, 2021 at 3:13 PM Martin Liška <mli...@suse.cz> wrote:

On 10/23/20 1:47 PM, Martin Liška wrote:
Hey.

Hello.

I deferred the patch for GCC 12. Since the time, I messed up with options
I feel more familiar with the option handling. So ...


This is a follow-up of the discussion that happened in thread about 
no_stack_protector
attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html

The current optimize attribute works in the following way:
- 1) we take current global_options as base
- 2) maybe_default_options is called for the currently selected optimization 
level, which
        means all rules in default_options_table are executed
- 3) attribute values are applied (via decode_options)

So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and 
__attribute__((optimize("-fno-stack-protector")))
ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is 
default:
       /* -O1 and -Og optimizations.  */
       { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },

My patch handled and the current optimize attribute really behaves that same as 
appending attribute value
to the command line. So far so good. We should also reflect that in 
documentation entry which is quite
vague right now:

^^^ all these are still valid arguments, plus I'm adding a new test-case that 
tests that.

Hey.

There is also handle_common_deferred_options that's not called so any
option processed there should
probably be excempt from being set/unset in the optimize attribute?

Looking at the handled options, they have all Defer type and not Optimization.
Thus we should be fine.



"""
The optimize attribute is used to specify that a function is to be compiled 
with different optimization options than specified on the command line.
"""

I addressed that with documentation changes, should be more clear to users. 
Moreover, I noticed that we declare 'optimize' attribute
as something not for a production use:

"The optimize attribute should be used for debugging purposes only. It is not 
suitable in production code."

Are we sure about the statement? I know that e.g. glibc uses that.

Well, given we're changing behavior now that warning looks valid ;)

Yeah! True.

I'll also note that

"The optimize attribute arguments of a function behave
as if they were added to the command line options."

is still likely untrue, the global state init is complicated ;)

Sure, but the situation should be much closer to it :) Do you have a better 
wording?

Maybe "The intent is that the optimize attribute behaves as if the
arguments were
appended to the command line."

But as said originally below I'm not sure that this behavior is what people
expect.  If I say optimize("fno-tree-vectorize") then I do expect that to retain
other command-line arguments.

Yep, that's clear!

If I say optimize(1) I'm not sure I would expect
-ftree-vectorize on the command-line to prevail ;)

That's a very good question: theoretically yes, I can imagine various scenarios 
how can -Ox be used
in an attribute:

1) -O0 for a function that e.g. violate strict aliasing, or when one wants to 
debug a fn in an optimized binary
2) -Ofast for a function which is performance critical

On the other hand, my original motivation was a kernel compiler with:
-O2 -fno-omit-frame-pointer and 
__attribute__((optimize("-fno-stack-protector")))

It's not intuitive that you end up with -O2 -fno-stack-protector (and 
-fomit-frame-pointer).


Is google code search still a thing?  Can one search all of github
somehow?  I really
wonder how 'optimize' is used at the moment.  There are quite some optimize
attributes in the target part of the testsuite for example.  And
testsuite/gcc.dg/vect/bb-slp-41.c

I can investigate that.

suggests that optimize("-fno-tree-fre") preserves at least the
optimization level?

There's always the possibility to preserve the current behavior for 'optimize'

Yes, but we should somehow document the current weird behavior when it comes to 
-Ox options.

and add a new 'add_optimize' attribute that does the other thing.

Considering that ...





and we may want to handle -Ox in the attribute in a special way. I guess many 
macro/pragma users expect that

-O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
with -ftree-vectorize -O1 ?

This is my older suggestion and it will likely make it even much complicated. 
So ...

In theory it's just dropping the command-line but yes, the question is
what happens
to the non-optimization part of the command-line.  We obviously shouldn't drop
-std=gnu14 and it's side-effects.  So yes, even documenting exactly what this
would do is difficult ;)


As implemented your patch seems to turn it into -ftree-vectorize -O1.

Yes.

IIRC multiple optimize attributes apply
ontop of each other, and it makes sense to me that optimize (2),
optimize ("tree-vectorize") behaves the same
as optimize (2, "tree-vectorize").  I'm not sure this is still the
case after your patch?  Also consider

#pragma GCC optimize ("tree-vectorize")
void foo () { ...}

#pragma GCC optimize ("tree-loop-distribution")
void bar () {... }

I'd expect bar to have both vectorization and loop distribution
enabled? (note I didn't use push/pop here)

Yes, yes and yes. I'm going to verify it.


The situation with 'target' attribute is different. When parsing the attribute, 
we intentionally drop all existing target flags:

$ cat -n gcc/config/i386/i386-options.c
...
     1245                if (opt == IX86_FUNCTION_SPECIFIC_ARCH)
     1246                  {
     1247                    /* If arch= is set,  clear all bits in 
x_ix86_isa_flags,
     1248                       except for ISA_64BIT, ABI_64, ABI_X32, and 
CODE16
     1249                       and all bits in x_ix86_isa_flags2.  */
     1250                    opts->x_ix86_isa_flags &= (OPTION_MASK_ISA_64BIT
     1251                                               | OPTION_MASK_ABI_64
     1252                                               | OPTION_MASK_ABI_X32
     1253                                               | OPTION_MASK_CODE16);
     1254                    opts->x_ix86_isa_flags_explicit &= 
(OPTION_MASK_ISA_64BIT
     1255                                                        | 
OPTION_MASK_ABI_64
     1256                                                        | 
OPTION_MASK_ABI_X32
     1257                                                        | 
OPTION_MASK_CODE16);
     1258                    opts->x_ix86_isa_flags2 = 0;
     1259                    opts->x_ix86_isa_flags2_explicit = 0;
     1260                  }

That seems logical because target attribute is used for e.g. ifunc 
multi-versioning and one needs
to be sure all existing ISA flags are dropped. However, I noticed clang behaves 
differently:

$ cat hreset.c
#pragma GCC target "arch=geode"
#include <immintrin.h>
void foo(unsigned int eax)
{
     _hreset (eax);
}

$ clang hreset.c -mhreset  -c -O2 -m32
$ gcc hreset.c -mhreset  -c -O2 -m32
In file included from 
/home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/x86gprintrin.h:97,
                    from 
/home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/immintrin.h:27,
                    from hreset.c:2:
hreset.c: In function ‘foo’:
/home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/hresetintrin.h:39:1:
 error: inlining failed in call to ‘always_inline’ ‘_hreset’: target specific 
option mismatch
      39 | _hreset (unsigned int __EAX)
         | ^~~~~~~
hreset.c:5:3: note: called from here
       5 |   _hreset (eax);
         |   ^~~~~~~~~~~~~

Anyway, I think the current target attribute handling should be preserved.

I think this and the -O1 argument above suggests that there should be
a way to distinguish
two modes - add to the active set of options and starting from scratch.

Doing that would make it even crazier :)


Maybe it's over-designing things but do we want to preserve the
existing behavior
and instead add optimize ("+ftree-vectorize") and target ("+avx2") as
a way to amend
the state?

I prefer doing only the append mode (when one can still use -fno-foo for an 
explicit
drop of a flag).


OTOH as we're missing global_options re-init even with your patch we won't get
the defaults correct (aka what toplev::main does with init_options_struct and
the corresponding langhook).  Likewise if lang_hooks.init_options performs any
defaulting a later flag overrides and we override that with optimize() that
doesn't work - I'm thinking of things like flag_complex_method and -fcx-* flags.
So -O2 -fcx-fortran-rules on the command-line and optimize
("no-cx-fortran-rules")
to cancel the -fcx-fortran-rules switch wouldn't work?

In most cases it works. What's problematic about -fcx-fortran-rules is that it 
sets

    /* With -fcx-limited-range, we do cheap and quick complex arithmetic.  */
    if (flag_cx_limited_range)
      flag_complex_method = 0;

    /* With -fcx-fortran-rules, we do something in-between cheap and C99.  */
    if (flag_cx_fortran_rules)
      flag_complex_method = 1;

in process_options (called only for cmdline options) and not in

/* After all options at LOC have been read into OPTS and OPTS_SET,
     finalize settings of those options and diagnose incompatible
     combinations.  */
void
finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
                 location_t loc)

which is a place which is called once options are decoded (both from cmdline 
and when
combined with a attribute or pragma):

Yes, and that flag_complex_method is initialized via the langhook
mentioned, for example
c-family/c-opts.c has

/* Initialize options structure OPTS.  */
void
c_common_init_options_struct (struct gcc_options *opts)
{
   opts->x_flag_exceptions = c_dialect_cxx ();
   opts->x_warn_pointer_arith = c_dialect_cxx ();
   opts->x_warn_write_strings = c_dialect_cxx ();
   opts->x_flag_warn_unused_result = true;

   /* By default, C99-like requirements for complex multiply and divide.  */
   opts->x_flag_complex_method = 2;
}

so an attempt to "cancel" a command-line option that adjusted any of
the above will not
work because we're not re-initializing global_options appropriately.
But maybe we can
just do that?  That is, call

   /* Initialize global options structures; this must be repeated for
      each structure used for parsing options.  */
   init_options_struct (&global_options, &global_options_set);
   lang_hooks.init_options_struct (&global_options);

and

   /* Perform language-specific options initialization.  */
   lang_hooks.init_options (save_decoded_options_count, save_decoded_options);

as done by toplev.c?  Or if we do not want to do that store that state away
to an 'initialized_options/initialized_options_set' set of vars we can
copy from?

I think this one is a bit orthogonal to the suggested changes, right? We can 
hopefully implement it incrementally.

Martin


#1  0x0000000001b69da3 in finish_options (opts=opts@entry=0x26b13e0 <global_options>, 
opts_set=opts_set@entry=0x26afdc0 <global_options_set>, loc=loc@entry=258754) at 
/home/marxin/Programming/gcc/gcc/opts.c:1303

#2  0x0000000000dd9e3b in decode_options (opts=0x26b13e0 <global_options>, opts_set=0x26afdc0 
<global_options_set>, decoded_options=<optimized out>, 
decoded_options_count=decoded_options_count@entry=4, loc=258754, dc=0x26b2b00 
<global_diagnostic_context>,

      target_option_override_hook=0x0) at 
/home/marxin/Programming/gcc/gcc/opts-global.c:324

#3  0x0000000000921144 in parse_optimize_options (args=args@entry=<tree_list 
0x7ffff76e1910>, attr_p=attr_p@entry=false) at 
/home/marxin/Programming/gcc/gcc/c-family/c-common.c:5921

#4  0x0000000000972aab in handle_pragma_optimize (dummy=<optimized out>) at 
/home/marxin/Programming/gcc/gcc/c-family/c-pragma.c:993

#5  0x00000000008e3118 in c_parser_pragma (parser=0x7ffff7fbeab0, 
context=pragma_external, if_p=0x0) at 
/home/marxin/Programming/gcc/gcc/c/c-parser.c:12573


Martin


Thanks,
Richard.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin


I'm also planning to take a look at the target macro/attribute, I expect 
similar problems:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469

Thoughts?
Thanks,
Martin



Reply via email to