On Thu, Aug 26, 2021 at 2:39 PM Martin Liška <mli...@suse.cz> wrote:
>
> On 8/26/21 13:04, Richard Biener wrote:
> > On Tue, Aug 24, 2021 at 3:04 PM Martin Liška <mli...@suse.cz> wrote:
> >>
> >> On 8/24/21 14:13, Richard Biener wrote:
> >>> On Thu, Jul 1, 2021 at 3:13 PM Martin Liška <mli...@suse.cz> wrote:
> >>>>
> >>>> On 10/23/20 1:47 PM, Martin Liška wrote:
> >>>>> Hey.
> >>>>
> >>>> Hello.
> >>>>
> >>>> I deferred the patch for GCC 12. Since the time, I messed up with options
> >>>> I feel more familiar with the option handling. So ...
> >>>>
> >>>>>
> >>>>> This is a follow-up of the discussion that happened in thread about 
> >>>>> no_stack_protector
> >>>>> attribute: 
> >>>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
> >>>>>
> >>>>> The current optimize attribute works in the following way:
> >>>>> - 1) we take current global_options as base
> >>>>> - 2) maybe_default_options is called for the currently selected 
> >>>>> optimization level, which
> >>>>>         means all rules in default_options_table are executed
> >>>>> - 3) attribute values are applied (via decode_options)
> >>>>>
> >>>>> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer 
> >>>>> and __attribute__((optimize("-fno-stack-protector")))
> >>>>> ends basically with -O2 -fno-stack-protector because 
> >>>>> -fno-omit-frame-pointer is default:
> >>>>>        /* -O1 and -Og optimizations.  */
> >>>>>        { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
> >>>>>
> >>>>> My patch handled and the current optimize attribute really behaves that 
> >>>>> same as appending attribute value
> >>>>> to the command line. So far so good. We should also reflect that in 
> >>>>> documentation entry which is quite
> >>>>> vague right now:
> >>>>
> >>>> ^^^ all these are still valid arguments, plus I'm adding a new test-case 
> >>>> that tests that.
> >>
> >> Hey.
> >>
> >>> There is also handle_common_deferred_options that's not called so any
> >>> option processed there should
> >>> probably be excempt from being set/unset in the optimize attribute?
> >>
> >> Looking at the handled options, they have all Defer type and not 
> >> Optimization.
> >> Thus we should be fine.
> >>
> >>>
> >>>>>
> >>>>> """
> >>>>> The optimize attribute is used to specify that a function is to be 
> >>>>> compiled with different optimization options than specified on the 
> >>>>> command line.
> >>>>> """
> >>>>
> >>>> I addressed that with documentation changes, should be more clear to 
> >>>> users. Moreover, I noticed that we declare 'optimize' attribute
> >>>> as something not for a production use:
> >>>>
> >>>> "The optimize attribute should be used for debugging purposes only. It 
> >>>> is not suitable in production code."
> >>>>
> >>>> Are we sure about the statement? I know that e.g. glibc uses that.
> >>>
> >>> Well, given we're changing behavior now that warning looks valid ;)
> >>
> >> Yeah! True.
> >>
> >>> I'll also note that
> >>>
> >>> "The optimize attribute arguments of a function behave
> >>> as if they were added to the command line options."
> >>>
> >>> is still likely untrue, the global state init is complicated ;)
> >>
> >> Sure, but the situation should be much closer to it :) Do you have a 
> >> better wording?
> >
> > Maybe "The intent is that the optimize attribute behaves as if the
> > arguments were
> > appended to the command line."
> >
> > But as said originally below I'm not sure that this behavior is what people
> > expect.  If I say optimize("fno-tree-vectorize") then I do expect that to 
> > retain
> > other command-line arguments.
>
> Yep, that's clear!
>
> > If I say optimize(1) I'm not sure I would expect
> > -ftree-vectorize on the command-line to prevail ;)
>
> That's a very good question: theoretically yes, I can imagine various 
> scenarios how can -Ox be used
> in an attribute:
>
> 1) -O0 for a function that e.g. violate strict aliasing, or when one wants to 
> debug a fn in an optimized binary
> 2) -Ofast for a function which is performance critical
>
> On the other hand, my original motivation was a kernel compiler with:
> -O2 -fno-omit-frame-pointer and 
> __attribute__((optimize("-fno-stack-protector")))
>
> It's not intuitive that you end up with -O2 -fno-stack-protector (and 
> -fomit-frame-pointer).
>
> >
> > Is google code search still a thing?  Can one search all of github
> > somehow?  I really
> > wonder how 'optimize' is used at the moment.  There are quite some optimize
> > attributes in the target part of the testsuite for example.  And
> > testsuite/gcc.dg/vect/bb-slp-41.c
>
> I can investigate that.
>
> > suggests that optimize("-fno-tree-fre") preserves at least the
> > optimization level?
> >
> > There's always the possibility to preserve the current behavior for 
> > 'optimize'
>
> Yes, but we should somehow document the current weird behavior when it comes 
> to -Ox options.
>
> > and add a new 'add_optimize' attribute that does the other thing.
>
> Considering that ...
>
> >
> >>>
> >>>
> >>>>>
> >>>>> and we may want to handle -Ox in the attribute in a special way. I 
> >>>>> guess many macro/pragma users expect that
> >>>>>
> >>>>> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 
> >>>>> and not
> >>>>> with -ftree-vectorize -O1 ?
> >>
> >> This is my older suggestion and it will likely make it even much 
> >> complicated. So ...
> >
> > In theory it's just dropping the command-line but yes, the question is
> > what happens
> > to the non-optimization part of the command-line.  We obviously shouldn't 
> > drop
> > -std=gnu14 and it's side-effects.  So yes, even documenting exactly what 
> > this
> > would do is difficult ;)
> >
> >>>
> >>> As implemented your patch seems to turn it into -ftree-vectorize -O1.
> >>
> >> Yes.
> >>
> >>> IIRC multiple optimize attributes apply
> >>> ontop of each other, and it makes sense to me that optimize (2),
> >>> optimize ("tree-vectorize") behaves the same
> >>> as optimize (2, "tree-vectorize").  I'm not sure this is still the
> >>> case after your patch?  Also consider
> >>>
> >>> #pragma GCC optimize ("tree-vectorize")
> >>> void foo () { ...}
> >>>
> >>> #pragma GCC optimize ("tree-loop-distribution")
> >>> void bar () {... }
> >>>
> >>> I'd expect bar to have both vectorization and loop distribution
> >>> enabled? (note I didn't use push/pop here)
> >>
> >> Yes, yes and yes. I'm going to verify it.
> >>
> >>>
> >>>> The situation with 'target' attribute is different. When parsing the 
> >>>> attribute, we intentionally drop all existing target flags:
> >>>>
> >>>> $ cat -n gcc/config/i386/i386-options.c
> >>>> ...
> >>>>      1245                if (opt == IX86_FUNCTION_SPECIFIC_ARCH)
> >>>>      1246                  {
> >>>>      1247                    /* If arch= is set,  clear all bits in 
> >>>> x_ix86_isa_flags,
> >>>>      1248                       except for ISA_64BIT, ABI_64, ABI_X32, 
> >>>> and CODE16
> >>>>      1249                       and all bits in x_ix86_isa_flags2.  */
> >>>>      1250                    opts->x_ix86_isa_flags &= 
> >>>> (OPTION_MASK_ISA_64BIT
> >>>>      1251                                               | 
> >>>> OPTION_MASK_ABI_64
> >>>>      1252                                               | 
> >>>> OPTION_MASK_ABI_X32
> >>>>      1253                                               | 
> >>>> OPTION_MASK_CODE16);
> >>>>      1254                    opts->x_ix86_isa_flags_explicit &= 
> >>>> (OPTION_MASK_ISA_64BIT
> >>>>      1255                                                        | 
> >>>> OPTION_MASK_ABI_64
> >>>>      1256                                                        | 
> >>>> OPTION_MASK_ABI_X32
> >>>>      1257                                                        | 
> >>>> OPTION_MASK_CODE16);
> >>>>      1258                    opts->x_ix86_isa_flags2 = 0;
> >>>>      1259                    opts->x_ix86_isa_flags2_explicit = 0;
> >>>>      1260                  }
> >>>>
> >>>> That seems logical because target attribute is used for e.g. ifunc 
> >>>> multi-versioning and one needs
> >>>> to be sure all existing ISA flags are dropped. However, I noticed clang 
> >>>> behaves differently:
> >>>>
> >>>> $ cat hreset.c
> >>>> #pragma GCC target "arch=geode"
> >>>> #include <immintrin.h>
> >>>> void foo(unsigned int eax)
> >>>> {
> >>>>      _hreset (eax);
> >>>> }
> >>>>
> >>>> $ clang hreset.c -mhreset  -c -O2 -m32
> >>>> $ gcc hreset.c -mhreset  -c -O2 -m32
> >>>> In file included from 
> >>>> /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/x86gprintrin.h:97,
> >>>>                     from 
> >>>> /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/immintrin.h:27,
> >>>>                     from hreset.c:2:
> >>>> hreset.c: In function ‘foo’:
> >>>> /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/hresetintrin.h:39:1:
> >>>>  error: inlining failed in call to ‘always_inline’ ‘_hreset’: target 
> >>>> specific option mismatch
> >>>>       39 | _hreset (unsigned int __EAX)
> >>>>          | ^~~~~~~
> >>>> hreset.c:5:3: note: called from here
> >>>>        5 |   _hreset (eax);
> >>>>          |   ^~~~~~~~~~~~~
> >>>>
> >>>> Anyway, I think the current target attribute handling should be 
> >>>> preserved.
> >>>
> >>> I think this and the -O1 argument above suggests that there should be
> >>> a way to distinguish
> >>> two modes - add to the active set of options and starting from scratch.
> >>
> >> Doing that would make it even crazier :)
> >>
> >>>
> >>> Maybe it's over-designing things but do we want to preserve the
> >>> existing behavior
> >>> and instead add optimize ("+ftree-vectorize") and target ("+avx2") as
> >>> a way to amend
> >>> the state?
> >>
> >> I prefer doing only the append mode (when one can still use -fno-foo for 
> >> an explicit
> >> drop of a flag).
> >>
> >>>
> >>> OTOH as we're missing global_options re-init even with your patch we 
> >>> won't get
> >>> the defaults correct (aka what toplev::main does with init_options_struct 
> >>> and
> >>> the corresponding langhook).  Likewise if lang_hooks.init_options 
> >>> performs any
> >>> defaulting a later flag overrides and we override that with optimize() 
> >>> that
> >>> doesn't work - I'm thinking of things like flag_complex_method and -fcx-* 
> >>> flags.
> >>> So -O2 -fcx-fortran-rules on the command-line and optimize
> >>> ("no-cx-fortran-rules")
> >>> to cancel the -fcx-fortran-rules switch wouldn't work?
> >>
> >> In most cases it works. What's problematic about -fcx-fortran-rules is 
> >> that it sets
> >>
> >>     /* With -fcx-limited-range, we do cheap and quick complex arithmetic.  
> >> */
> >>     if (flag_cx_limited_range)
> >>       flag_complex_method = 0;
> >>
> >>     /* With -fcx-fortran-rules, we do something in-between cheap and C99.  
> >> */
> >>     if (flag_cx_fortran_rules)
> >>       flag_complex_method = 1;
> >>
> >> in process_options (called only for cmdline options) and not in
> >>
> >> /* After all options at LOC have been read into OPTS and OPTS_SET,
> >>      finalize settings of those options and diagnose incompatible
> >>      combinations.  */
> >> void
> >> finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
> >>                  location_t loc)
> >>
> >> which is a place which is called once options are decoded (both from 
> >> cmdline and when
> >> combined with a attribute or pragma):
> >
> > Yes, and that flag_complex_method is initialized via the langhook
> > mentioned, for example
> > c-family/c-opts.c has
> >
> > /* Initialize options structure OPTS.  */
> > void
> > c_common_init_options_struct (struct gcc_options *opts)
> > {
> >    opts->x_flag_exceptions = c_dialect_cxx ();
> >    opts->x_warn_pointer_arith = c_dialect_cxx ();
> >    opts->x_warn_write_strings = c_dialect_cxx ();
> >    opts->x_flag_warn_unused_result = true;
> >
> >    /* By default, C99-like requirements for complex multiply and divide.  */
> >    opts->x_flag_complex_method = 2;
> > }
> >
> > so an attempt to "cancel" a command-line option that adjusted any of
> > the above will not
> > work because we're not re-initializing global_options appropriately.
> > But maybe we can
> > just do that?  That is, call
> >
> >    /* Initialize global options structures; this must be repeated for
> >       each structure used for parsing options.  */
> >    init_options_struct (&global_options, &global_options_set);
> >    lang_hooks.init_options_struct (&global_options);
> >
> > and
> >
> >    /* Perform language-specific options initialization.  */
> >    lang_hooks.init_options (save_decoded_options_count, 
> > save_decoded_options);
> >
> > as done by toplev.c?  Or if we do not want to do that store that state away
> > to an 'initialized_options/initialized_options_set' set of vars we can
> > copy from?
>
> I think this one is a bit orthogonal to the suggested changes, right? We can 
> hopefully implement it incrementally.

It's I think required to get close to "behave as if appended to the
command-line".

Richard.

> Martin
>
> >
> >> #1  0x0000000001b69da3 in finish_options (opts=opts@entry=0x26b13e0 
> >> <global_options>, opts_set=opts_set@entry=0x26afdc0 <global_options_set>, 
> >> loc=loc@entry=258754) at /home/marxin/Programming/gcc/gcc/opts.c:1303
> >>
> >> #2  0x0000000000dd9e3b in decode_options (opts=0x26b13e0 <global_options>, 
> >> opts_set=0x26afdc0 <global_options_set>, decoded_options=<optimized out>, 
> >> decoded_options_count=decoded_options_count@entry=4, loc=258754, 
> >> dc=0x26b2b00 <global_diagnostic_context>,
> >>
> >>       target_option_override_hook=0x0) at 
> >> /home/marxin/Programming/gcc/gcc/opts-global.c:324
> >>
> >> #3  0x0000000000921144 in parse_optimize_options 
> >> (args=args@entry=<tree_list 0x7ffff76e1910>, attr_p=attr_p@entry=false) at 
> >> /home/marxin/Programming/gcc/gcc/c-family/c-common.c:5921
> >>
> >> #4  0x0000000000972aab in handle_pragma_optimize (dummy=<optimized out>) 
> >> at /home/marxin/Programming/gcc/gcc/c-family/c-pragma.c:993
> >>
> >> #5  0x00000000008e3118 in c_parser_pragma (parser=0x7ffff7fbeab0, 
> >> context=pragma_external, if_p=0x0) at 
> >> /home/marxin/Programming/gcc/gcc/c/c-parser.c:12573
> >>
> >>
> >> Martin
> >>
> >>>
> >>> Thanks,
> >>> Richard.
> >>>
> >>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>>>
> >>>> Ready to be installed?
> >>>> Thanks,
> >>>> Martin
> >>>>
> >>>>>
> >>>>> I'm also planning to take a look at the target macro/attribute, I 
> >>>>> expect similar problems:
> >>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469
> >>>>>
> >>>>> Thoughts?
> >>>>> Thanks,
> >>>>> Martin
> >>>>
> >>
>

Reply via email to