On Thu, Aug 26, 2021 at 2:39 PM Martin Liška <mli...@suse.cz> wrote: > > On 8/26/21 13:04, Richard Biener wrote: > > On Tue, Aug 24, 2021 at 3:04 PM Martin Liška <mli...@suse.cz> wrote: > >> > >> On 8/24/21 14:13, Richard Biener wrote: > >>> On Thu, Jul 1, 2021 at 3:13 PM Martin Liška <mli...@suse.cz> wrote: > >>>> > >>>> On 10/23/20 1:47 PM, Martin Liška wrote: > >>>>> Hey. > >>>> > >>>> Hello. > >>>> > >>>> I deferred the patch for GCC 12. Since the time, I messed up with options > >>>> I feel more familiar with the option handling. So ... > >>>> > >>>>> > >>>>> This is a follow-up of the discussion that happened in thread about > >>>>> no_stack_protector > >>>>> attribute: > >>>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html > >>>>> > >>>>> The current optimize attribute works in the following way: > >>>>> - 1) we take current global_options as base > >>>>> - 2) maybe_default_options is called for the currently selected > >>>>> optimization level, which > >>>>> means all rules in default_options_table are executed > >>>>> - 3) attribute values are applied (via decode_options) > >>>>> > >>>>> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer > >>>>> and __attribute__((optimize("-fno-stack-protector"))) > >>>>> ends basically with -O2 -fno-stack-protector because > >>>>> -fno-omit-frame-pointer is default: > >>>>> /* -O1 and -Og optimizations. */ > >>>>> { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 }, > >>>>> > >>>>> My patch handled and the current optimize attribute really behaves that > >>>>> same as appending attribute value > >>>>> to the command line. So far so good. We should also reflect that in > >>>>> documentation entry which is quite > >>>>> vague right now: > >>>> > >>>> ^^^ all these are still valid arguments, plus I'm adding a new test-case > >>>> that tests that. > >> > >> Hey. > >> > >>> There is also handle_common_deferred_options that's not called so any > >>> option processed there should > >>> probably be excempt from being set/unset in the optimize attribute? > >> > >> Looking at the handled options, they have all Defer type and not > >> Optimization. > >> Thus we should be fine. > >> > >>> > >>>>> > >>>>> """ > >>>>> The optimize attribute is used to specify that a function is to be > >>>>> compiled with different optimization options than specified on the > >>>>> command line. > >>>>> """ > >>>> > >>>> I addressed that with documentation changes, should be more clear to > >>>> users. Moreover, I noticed that we declare 'optimize' attribute > >>>> as something not for a production use: > >>>> > >>>> "The optimize attribute should be used for debugging purposes only. It > >>>> is not suitable in production code." > >>>> > >>>> Are we sure about the statement? I know that e.g. glibc uses that. > >>> > >>> Well, given we're changing behavior now that warning looks valid ;) > >> > >> Yeah! True. > >> > >>> I'll also note that > >>> > >>> "The optimize attribute arguments of a function behave > >>> as if they were added to the command line options." > >>> > >>> is still likely untrue, the global state init is complicated ;) > >> > >> Sure, but the situation should be much closer to it :) Do you have a > >> better wording? > > > > Maybe "The intent is that the optimize attribute behaves as if the > > arguments were > > appended to the command line." > > > > But as said originally below I'm not sure that this behavior is what people > > expect. If I say optimize("fno-tree-vectorize") then I do expect that to > > retain > > other command-line arguments. > > Yep, that's clear! > > > If I say optimize(1) I'm not sure I would expect > > -ftree-vectorize on the command-line to prevail ;) > > That's a very good question: theoretically yes, I can imagine various > scenarios how can -Ox be used > in an attribute: > > 1) -O0 for a function that e.g. violate strict aliasing, or when one wants to > debug a fn in an optimized binary > 2) -Ofast for a function which is performance critical > > On the other hand, my original motivation was a kernel compiler with: > -O2 -fno-omit-frame-pointer and > __attribute__((optimize("-fno-stack-protector"))) > > It's not intuitive that you end up with -O2 -fno-stack-protector (and > -fomit-frame-pointer). > > > > > Is google code search still a thing? Can one search all of github > > somehow? I really > > wonder how 'optimize' is used at the moment. There are quite some optimize > > attributes in the target part of the testsuite for example. And > > testsuite/gcc.dg/vect/bb-slp-41.c > > I can investigate that. > > > suggests that optimize("-fno-tree-fre") preserves at least the > > optimization level? > > > > There's always the possibility to preserve the current behavior for > > 'optimize' > > Yes, but we should somehow document the current weird behavior when it comes > to -Ox options. > > > and add a new 'add_optimize' attribute that does the other thing. > > Considering that ... > > > > >>> > >>> > >>>>> > >>>>> and we may want to handle -Ox in the attribute in a special way. I > >>>>> guess many macro/pragma users expect that > >>>>> > >>>>> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 > >>>>> and not > >>>>> with -ftree-vectorize -O1 ? > >> > >> This is my older suggestion and it will likely make it even much > >> complicated. So ... > > > > In theory it's just dropping the command-line but yes, the question is > > what happens > > to the non-optimization part of the command-line. We obviously shouldn't > > drop > > -std=gnu14 and it's side-effects. So yes, even documenting exactly what > > this > > would do is difficult ;) > > > >>> > >>> As implemented your patch seems to turn it into -ftree-vectorize -O1. > >> > >> Yes. > >> > >>> IIRC multiple optimize attributes apply > >>> ontop of each other, and it makes sense to me that optimize (2), > >>> optimize ("tree-vectorize") behaves the same > >>> as optimize (2, "tree-vectorize"). I'm not sure this is still the > >>> case after your patch? Also consider > >>> > >>> #pragma GCC optimize ("tree-vectorize") > >>> void foo () { ...} > >>> > >>> #pragma GCC optimize ("tree-loop-distribution") > >>> void bar () {... } > >>> > >>> I'd expect bar to have both vectorization and loop distribution > >>> enabled? (note I didn't use push/pop here) > >> > >> Yes, yes and yes. I'm going to verify it. > >> > >>> > >>>> The situation with 'target' attribute is different. When parsing the > >>>> attribute, we intentionally drop all existing target flags: > >>>> > >>>> $ cat -n gcc/config/i386/i386-options.c > >>>> ... > >>>> 1245 if (opt == IX86_FUNCTION_SPECIFIC_ARCH) > >>>> 1246 { > >>>> 1247 /* If arch= is set, clear all bits in > >>>> x_ix86_isa_flags, > >>>> 1248 except for ISA_64BIT, ABI_64, ABI_X32, > >>>> and CODE16 > >>>> 1249 and all bits in x_ix86_isa_flags2. */ > >>>> 1250 opts->x_ix86_isa_flags &= > >>>> (OPTION_MASK_ISA_64BIT > >>>> 1251 | > >>>> OPTION_MASK_ABI_64 > >>>> 1252 | > >>>> OPTION_MASK_ABI_X32 > >>>> 1253 | > >>>> OPTION_MASK_CODE16); > >>>> 1254 opts->x_ix86_isa_flags_explicit &= > >>>> (OPTION_MASK_ISA_64BIT > >>>> 1255 | > >>>> OPTION_MASK_ABI_64 > >>>> 1256 | > >>>> OPTION_MASK_ABI_X32 > >>>> 1257 | > >>>> OPTION_MASK_CODE16); > >>>> 1258 opts->x_ix86_isa_flags2 = 0; > >>>> 1259 opts->x_ix86_isa_flags2_explicit = 0; > >>>> 1260 } > >>>> > >>>> That seems logical because target attribute is used for e.g. ifunc > >>>> multi-versioning and one needs > >>>> to be sure all existing ISA flags are dropped. However, I noticed clang > >>>> behaves differently: > >>>> > >>>> $ cat hreset.c > >>>> #pragma GCC target "arch=geode" > >>>> #include <immintrin.h> > >>>> void foo(unsigned int eax) > >>>> { > >>>> _hreset (eax); > >>>> } > >>>> > >>>> $ clang hreset.c -mhreset -c -O2 -m32 > >>>> $ gcc hreset.c -mhreset -c -O2 -m32 > >>>> In file included from > >>>> /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/x86gprintrin.h:97, > >>>> from > >>>> /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/immintrin.h:27, > >>>> from hreset.c:2: > >>>> hreset.c: In function ‘foo’: > >>>> /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/hresetintrin.h:39:1: > >>>> error: inlining failed in call to ‘always_inline’ ‘_hreset’: target > >>>> specific option mismatch > >>>> 39 | _hreset (unsigned int __EAX) > >>>> | ^~~~~~~ > >>>> hreset.c:5:3: note: called from here > >>>> 5 | _hreset (eax); > >>>> | ^~~~~~~~~~~~~ > >>>> > >>>> Anyway, I think the current target attribute handling should be > >>>> preserved. > >>> > >>> I think this and the -O1 argument above suggests that there should be > >>> a way to distinguish > >>> two modes - add to the active set of options and starting from scratch. > >> > >> Doing that would make it even crazier :) > >> > >>> > >>> Maybe it's over-designing things but do we want to preserve the > >>> existing behavior > >>> and instead add optimize ("+ftree-vectorize") and target ("+avx2") as > >>> a way to amend > >>> the state? > >> > >> I prefer doing only the append mode (when one can still use -fno-foo for > >> an explicit > >> drop of a flag). > >> > >>> > >>> OTOH as we're missing global_options re-init even with your patch we > >>> won't get > >>> the defaults correct (aka what toplev::main does with init_options_struct > >>> and > >>> the corresponding langhook). Likewise if lang_hooks.init_options > >>> performs any > >>> defaulting a later flag overrides and we override that with optimize() > >>> that > >>> doesn't work - I'm thinking of things like flag_complex_method and -fcx-* > >>> flags. > >>> So -O2 -fcx-fortran-rules on the command-line and optimize > >>> ("no-cx-fortran-rules") > >>> to cancel the -fcx-fortran-rules switch wouldn't work? > >> > >> In most cases it works. What's problematic about -fcx-fortran-rules is > >> that it sets > >> > >> /* With -fcx-limited-range, we do cheap and quick complex arithmetic. > >> */ > >> if (flag_cx_limited_range) > >> flag_complex_method = 0; > >> > >> /* With -fcx-fortran-rules, we do something in-between cheap and C99. > >> */ > >> if (flag_cx_fortran_rules) > >> flag_complex_method = 1; > >> > >> in process_options (called only for cmdline options) and not in > >> > >> /* After all options at LOC have been read into OPTS and OPTS_SET, > >> finalize settings of those options and diagnose incompatible > >> combinations. */ > >> void > >> finish_options (struct gcc_options *opts, struct gcc_options *opts_set, > >> location_t loc) > >> > >> which is a place which is called once options are decoded (both from > >> cmdline and when > >> combined with a attribute or pragma): > > > > Yes, and that flag_complex_method is initialized via the langhook > > mentioned, for example > > c-family/c-opts.c has > > > > /* Initialize options structure OPTS. */ > > void > > c_common_init_options_struct (struct gcc_options *opts) > > { > > opts->x_flag_exceptions = c_dialect_cxx (); > > opts->x_warn_pointer_arith = c_dialect_cxx (); > > opts->x_warn_write_strings = c_dialect_cxx (); > > opts->x_flag_warn_unused_result = true; > > > > /* By default, C99-like requirements for complex multiply and divide. */ > > opts->x_flag_complex_method = 2; > > } > > > > so an attempt to "cancel" a command-line option that adjusted any of > > the above will not > > work because we're not re-initializing global_options appropriately. > > But maybe we can > > just do that? That is, call > > > > /* Initialize global options structures; this must be repeated for > > each structure used for parsing options. */ > > init_options_struct (&global_options, &global_options_set); > > lang_hooks.init_options_struct (&global_options); > > > > and > > > > /* Perform language-specific options initialization. */ > > lang_hooks.init_options (save_decoded_options_count, > > save_decoded_options); > > > > as done by toplev.c? Or if we do not want to do that store that state away > > to an 'initialized_options/initialized_options_set' set of vars we can > > copy from? > > I think this one is a bit orthogonal to the suggested changes, right? We can > hopefully implement it incrementally.
It's I think required to get close to "behave as if appended to the command-line". Richard. > Martin > > > > >> #1 0x0000000001b69da3 in finish_options (opts=opts@entry=0x26b13e0 > >> <global_options>, opts_set=opts_set@entry=0x26afdc0 <global_options_set>, > >> loc=loc@entry=258754) at /home/marxin/Programming/gcc/gcc/opts.c:1303 > >> > >> #2 0x0000000000dd9e3b in decode_options (opts=0x26b13e0 <global_options>, > >> opts_set=0x26afdc0 <global_options_set>, decoded_options=<optimized out>, > >> decoded_options_count=decoded_options_count@entry=4, loc=258754, > >> dc=0x26b2b00 <global_diagnostic_context>, > >> > >> target_option_override_hook=0x0) at > >> /home/marxin/Programming/gcc/gcc/opts-global.c:324 > >> > >> #3 0x0000000000921144 in parse_optimize_options > >> (args=args@entry=<tree_list 0x7ffff76e1910>, attr_p=attr_p@entry=false) at > >> /home/marxin/Programming/gcc/gcc/c-family/c-common.c:5921 > >> > >> #4 0x0000000000972aab in handle_pragma_optimize (dummy=<optimized out>) > >> at /home/marxin/Programming/gcc/gcc/c-family/c-pragma.c:993 > >> > >> #5 0x00000000008e3118 in c_parser_pragma (parser=0x7ffff7fbeab0, > >> context=pragma_external, if_p=0x0) at > >> /home/marxin/Programming/gcc/gcc/c/c-parser.c:12573 > >> > >> > >> Martin > >> > >>> > >>> Thanks, > >>> Richard. > >>> > >>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > >>>> > >>>> Ready to be installed? > >>>> Thanks, > >>>> Martin > >>>> > >>>>> > >>>>> I'm also planning to take a look at the target macro/attribute, I > >>>>> expect similar problems: > >>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469 > >>>>> > >>>>> Thoughts? > >>>>> Thanks, > >>>>> Martin > >>>> > >> >