[gcc r14-9358] contrib: Update test_mklog to correspond to mklog
https://gcc.gnu.org/g:0c1ff8951c2f5ff5b0699bbfa7523f690deac713

commit r14-9358-g0c1ff8951c2f5ff5b0699bbfa7523f690deac713
Author: Filip Kastl
Date:   Thu Mar 7 13:23:49 2024 +0100

    contrib: Update test_mklog to correspond to mklog

    contrib/ChangeLog:

            * test_mklog.py: "Moved to..." -> "Move to..."

    Signed-off-by: Filip Kastl

Diff:
---
 contrib/test_mklog.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/test_mklog.py b/contrib/test_mklog.py
index b6210738e55..80e159fcca4 100755
--- a/contrib/test_mklog.py
+++ b/contrib/test_mklog.py
@@ -400,7 +400,7 @@ rename to gcc/ipa-icf2.c
 EXPECTED8 = '''\
 gcc/ChangeLog:
 
-	* ipa-icf.c: Moved to...
+	* ipa-icf.c: Move to...
 	* ipa-icf2.c: ...here.
 
 '''
[gcc r14-9383] MAINTAINERS: Fix order in Write After Aproval
https://gcc.gnu.org/g:1329dacdc0fbe7d43550294fe8b0323a6dc5ce9e

commit r14-9383-g1329dacdc0fbe7d43550294fe8b0323a6dc5ce9e
Author: Filip Kastl
Date:   Fri Mar 8 09:14:44 2024 +0100

    MAINTAINERS: Fix order in Write After Aproval

    ChangeLog:

            * MAINTAINERS: Fix order of names in Write After Aproval

    Signed-off-by: Filip Kastl

Diff:
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index a681518d704..8f64ee630b4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -448,9 +448,9 @@ Wei Guozhi
 Vineet Gupta
 Naveen H.S
 Mostafa Hagog
-Demin Han
 Jivan Hakobyan
 Andrew Haley
+Demin Han
 Frederik Harwath
 Stuart Hastings
 Michael Haubenwallner
[gcc r15-479] MAINTAINERS: Fix an entry using spaces instead of tabs
https://gcc.gnu.org/g:1a809280929fac9836ff31dcc0980ac8acee7631

commit r15-479-g1a809280929fac9836ff31dcc0980ac8acee7631
Author: Filip Kastl
Date:   Tue May 14 10:34:12 2024 +0200

    MAINTAINERS: Fix an entry using spaces instead of tabs

    In the MAINTAINERS file, names and emails are separated by tabs. One of
    the entries recently added used spaces. This patch corrects this.

    The check-MAINTAINERS.py script breaks a bit when this happens. This
    patch also adds warning about this situation into the script.

    ChangeLog:

            * MAINTAINERS: Use tabs between name and email.

    contrib/ChangeLog:

            * check-MAINTAINERS.py: Add warning about not using tabs.

    Signed-off-by: Filip Kastl

Diff:
---
 MAINTAINERS                  | 2 +-
 contrib/check-MAINTAINERS.py | 8 ++++++++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 361059fd55c6..8bb435dd54ea 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -738,7 +738,7 @@ Kwok Cheung Yeung
 Greta Yorsh
 David Yuste
 Adhemerval Zanella
-Xiao Zeng
+Xiao Zeng
 Dennis Zhang
 Yufeng Zhang
 Qing Zhao
diff --git a/contrib/check-MAINTAINERS.py b/contrib/check-MAINTAINERS.py
index 9f31a10bcffb..2bac67f08214 100755
--- a/contrib/check-MAINTAINERS.py
+++ b/contrib/check-MAINTAINERS.py
@@ -71,6 +71,14 @@ def check_group(name, lines):
             print(f'Line should not start with space: "{line}"')
             exit_code = 2
 
+        # Special-case some names
+        if line == 'James Norris':
+            continue
+
+        if '\t' not in line:
+            print(f'Name and email should be separated by tabs: "{line}"')
+            exit_code = 2
+
     lines = [line + '\n' for line in lines]
     sorted_lines = sorted(lines, key=sort_by_surname)
     if lines != sorted_lines:
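
The tab check added to check-MAINTAINERS.py can be tried in isolation. The following is only a minimal sketch: the function name lines_missing_tab and the sample entries (names and addresses) are made up for illustration and are not part of the script.

```python
def lines_missing_tab(lines):
    """Return the entries that do not contain a tab separator."""
    return [line for line in lines if '\t' not in line]


# Hypothetical entries; names and addresses are placeholders.
entries = [
    'Jane Doe\t<jane@example.org>',
    'John Roe    <john@example.org>',   # spaces instead of a tab
]

for line in lines_missing_tab(entries):
    print(f'Name and email should be separated by tabs: "{line}"')
```

Only the second entry is reported, because it is the only one without a tab character.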
[gcc r15-3256] gimple ssa: switchconv: Use __builtin_popcount and support more types in exp transform [PR116355]
https://gcc.gnu.org/g:1c4b9826bd0d5ac471543c68f097d80b1969f599

commit r15-3256-g1c4b9826bd0d5ac471543c68f097d80b1969f599
Author: Filip Kastl
Date:   Wed Aug 28 15:47:44 2024 +0200

    gimple ssa: switchconv: Use __builtin_popcount and support more types in exp transform [PR116355]

    The gen_pow2p function generates (a & -a) == a as a fallback for
    POPCOUNT (a) == 1. Not only is the bitmagic not equivalent to
    POPCOUNT (a) == 1 but it also introduces UB (consider signed
    a = INT_MIN).

    This patch rewrites gen_pow2p to always use __builtin_popcount instead.
    This means that what the end result GIMPLE code is gets decided by an
    already existing machinery in a later pass. That is a cleaner solution
    I think. This existing machinery also uses a ^ (a - 1) > a - 1 which is
    the correct bitmagic.

    While rewriting gen_pow2p I had to add logic for converting the
    operand's type to a type that __builtin_popcount accepts. I naturally
    also added this logic to gen_log2. Thanks to this, exponential index
    transform gains the capability to handle all operand types with
    precision at most that of long long int.

    gcc/ChangeLog:

            PR tree-optimization/116355
            * tree-switch-conversion.cc (can_log2): Add capability to suggest
            converting the operand to a different type.
            (gen_log2): Add capability to generate a conversion in case the
            operand is of a type incompatible with the logarithm operation.
            (can_pow2p): New function.
            (gen_pow2p): Rewrite to use __builtin_popcount instead of
            manually inserting an internal fn call or bitmagic. Also add
            capability to generate a conversion.
            (switch_conversion::is_exp_index_transform_viable): Call
            can_pow2p. Store types suggested by can_log2 and gen_log2.
            (switch_conversion::exp_index_transform): Params of gen_pow2p and
            gen_log2 changed so update their calls.
            * tree-switch-conversion.h: Add m_exp_index_transform_log2_type
            and m_exp_index_transform_pow2p_type to switch_conversion class
            to track type conversions needed to generate the "is power of 2"
            and logarithm operations.

gcc/testsuite/ChangeLog: PR tree-optimization/116355 * gcc.target/i386/switch-exp-transform-1.c: Don't test for presence of POPCOUNT internal fn after switch conversion. Test for it after __builtin_popcount has had a chance to get expanded. * gcc.target/i386/switch-exp-transform-3.c: Also test char and short. Signed-off-by: Filip Kastl Diff: --- .../gcc.target/i386/switch-exp-transform-1.c | 7 +- .../gcc.target/i386/switch-exp-transform-3.c | 98 - gcc/tree-switch-conversion.cc | 152 - gcc/tree-switch-conversion.h | 7 + 4 files changed, 227 insertions(+), 37 deletions(-) diff --git a/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c b/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c index 53d31460ba37..a8c9e03e515f 100644 --- a/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c +++ b/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c @@ -1,9 +1,10 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-switchconv -mpopcnt -mbmi" } */ +/* { dg-options "-O2 -fdump-tree-switchconv -fdump-tree-widening_mul -mpopcnt -mbmi" } */ /* Checks that exponential index transform enables switch conversion to convert this switch into an array lookup. Also checks that the "index variable is a - power of two" check has been generated. */ + power of two" check has been generated and that it has been later expanded + into an internal function. 
*/ int foo(unsigned bar) { @@ -29,4 +30,4 @@ int foo(unsigned bar) } /* { dg-final { scan-tree-dump "CSWTCH" "switchconv" } } */ -/* { dg-final { scan-tree-dump "POPCOUNT" "switchconv" } } */ +/* { dg-final { scan-tree-dump "POPCOUNT" "widening_mul" } } */ diff --git a/gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c b/gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c index 64a7b1461721..5011d1ebb0e8 100644 --- a/gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c +++ b/gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c @@ -3,10 +3,104 @@ /* Checks that the exponential index transformation is done for all these types of the index variable: + - (unsigned) char + - (unsigned) short - (unsigned) int - (unsigned) long - (unsigned) long long */ +int unopt_char(char bit_position) +{ +switch (bit_position) +{ +case (1 << 0): +return 0; +case (1 << 1): +return 1; +case (1 << 2): +return 2; +case (1 << 3): +return 3; +case (1 << 4): +
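
The three power-of-two checks named in the r15-3256 commit message can be compared side by side. This is a rough sketch in Python, not GCC code: the function names are invented here, and 32-bit unsigned arithmetic is emulated with a mask. Python integers do not overflow, so the signed INT_MIN undefined behaviour mentioned in the message cannot be reproduced this way; the sketch only shows one input, zero, where (a & -a) == a and POPCOUNT (a) == 1 disagree.

```python
BITS = 32
MASK = (1 << BITS) - 1  # emulate a 32-bit unsigned integer


def pow2p_popcount(a):
    """Reference check: exactly one bit is set."""
    return bin(a & MASK).count('1') == 1


def pow2p_and_neg(a):
    """The old fallback, (a & -a) == a, on emulated unsigned values."""
    a &= MASK
    return (a & (-a & MASK)) == a


def pow2p_xor(a):
    """The check the message calls correct: (a ^ (a - 1)) > (a - 1)."""
    a &= MASK
    a_minus_1 = (a - 1) & MASK  # wraps around to all ones when a is 0
    return (a ^ a_minus_1) > a_minus_1


for a in (0, 1, 6, 8, 1 << 31):
    print(a, pow2p_popcount(a), pow2p_and_neg(a), pow2p_xor(a))
```

For a equal to 0 the old fallback answers true while the popcount check answers false; the xor form agrees with the popcount check on every value printed here.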
[gcc r15-1862] MAINTAINERS: Fix order in DCO
https://gcc.gnu.org/g:4da5dc4be81b2797943fea44b0d40ac04700baee

commit r15-1862-g4da5dc4be81b2797943fea44b0d40ac04700baee
Author: Filip Kastl
Date:   Fri Jul 5 15:17:58 2024 +0200

    MAINTAINERS: Fix order in DCO

    ChangeLog:

            * MAINTAINERS: Fix order in Contributing under the DCO.

    Signed-off-by: Filip Kastl

Diff:
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index b4739f29107..762b91256c4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -766,6 +766,7 @@ Robin Dapp
 Robin Dapp
 Michal Jires
 Matthias Kretz
+Prathamesh Kulkarni
 Tim Lange
 Jeff Law
 Jeff Law
@@ -791,4 +792,3 @@ Jonathan Wakely
 Alexander Westbrooks
 Chung-Ju Wu
 Pengxuan Zheng
-Prathamesh Kulkarni
[gcc r14-9932] contrib/check-params-in-docs.py: Ignore target-specific params
https://gcc.gnu.org/g:e30e760b51b108786946e04a26e92531762b022d

commit r14-9932-ge30e760b51b108786946e04a26e92531762b022d
Author: Filip Kastl
Date:   Fri Apr 12 09:52:27 2024 +0200

    contrib/check-params-in-docs.py: Ignore target-specific params

    contrib/check-params-in-docs.py is a script that checks that all
    options reported with gcc --help=params are in gcc/doc/invoke.texi and
    vice versa.
    gcc/doc/invoke.texi lists target-specific params but gcc --help=params
    doesn't. This meant that the script would mistakenly complain about
    params missing from --help=params. Previously, the script was just set
    to ignore aarch64 and gcn params which solved this issue only for x86.
    This patch sets the script to ignore all target-specific params.

    contrib/ChangeLog:

            * check-params-in-docs.py: Ignore target specific params.

    Signed-off-by: Filip Kastl

Diff:
---
 contrib/check-params-in-docs.py | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/contrib/check-params-in-docs.py b/contrib/check-params-in-docs.py
index f7879dd8e08..ccdb8d72169 100755
--- a/contrib/check-params-in-docs.py
+++ b/contrib/check-params-in-docs.py
@@ -38,6 +38,9 @@ def get_param_tuple(line):
     description = line[i:].strip()
     return (name, description)
 
+def target_specific(param):
+    return param.split('-')[0] in ('aarch64', 'gcn', 'x86')
+
 
 parser = argparse.ArgumentParser()
 parser.add_argument('texi_file')
@@ -45,13 +48,16 @@ parser.add_argument('params_output')
 
 args = parser.parse_args()
 
-ignored = {'logical-op-non-short-circuit', 'gcn-preferred-vectorization-factor'}
-params = {}
+ignored = {'logical-op-non-short-circuit'}
+help_params = {}
 
 for line in open(args.params_output).readlines():
     if line.startswith(' ' * 2) and not line.startswith(' ' * 8):
         r = get_param_tuple(line)
-        params[r[0]] = r[1]
+        help_params[r[0]] = r[1]
+
+# Skip target-specific params
+help_params = [x for x in help_params.keys() if not target_specific(x)]
 
 # Find section in .texi manual with parameters
 texi = ([x.strip() for x in open(args.texi_file).readlines()])
@@ -66,14 +72,13 @@ for line in texi:
             texi_params.append(line[len(token):])
             break
 
-# skip digits
+# Skip digits
 texi_params = [x for x in texi_params if not x[0].isdigit()]
-# skip aarch64 params
-texi_params = [x for x in texi_params if not x.startswith('aarch64')]
-sorted_params = sorted(texi_params)
+# Skip target-specific params
+texi_params = [x for x in texi_params if not target_specific(x)]
 
 texi_set = set(texi_params) - ignored
-params_set = set(params.keys()) - ignored
+params_set = set(help_params) - ignored
 
 success = True
 extra = texi_set - params_set
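
The effect of the new target_specific filter on the comparison can be sketched on a small scale. The helper below is copied from the diff; the two input collections are sample stand-ins for the parsed invoke.texi and the gcc --help=params output, and the description string is a placeholder.

```python
def target_specific(param):
    """True if the param name starts with a target prefix (as in the patch)."""
    return param.split('-')[0] in ('aarch64', 'gcn', 'x86')


# Sample inputs standing in for invoke.texi and `gcc --help=params`.
texi_params = ['max-inline-insns-single', 'aarch64-sve-compare-costs',
               'gcn-preferred-vectorization-factor']
help_params = {'max-inline-insns-single': 'placeholder description'}

# Target-specific names are dropped on both sides before comparing.
texi_set = {p for p in texi_params if not target_specific(p)}
help_set = {p for p in help_params if not target_specific(p)}

print('extra in docs:', sorted(texi_set - help_set))
print('missing from docs:', sorted(help_set - texi_set))
```

With the filter applied to both sides, the target-specific entries documented in the manual no longer show up as extra.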
[gcc r15-2416] gimple ssa: Teach switch conversion to optimize powers of 2 switches
https://gcc.gnu.org/g:2b3533cd871f62923e7a4f06a826f37bf0f35c5c

commit r15-2416-g2b3533cd871f62923e7a4f06a826f37bf0f35c5c
Author: Filip Kastl
Date:   Tue Jul 30 18:40:29 2024 +0200

    gimple ssa: Teach switch conversion to optimize powers of 2 switches

    Sometimes a switch has case numbers that are powers of 2. Switch
    conversion usually isn't able to optimize these switches. This patch
    adds "exponential index transformation" to switch conversion. After
    switch conversion applies this transformation on the switch the index
    variable of the switch becomes the exponent instead of the whole value.
    For example:

    switch (i)
      {
        case (1 << 0): return 0;
        case (1 << 1): return 1;
        case (1 << 2): return 2;
        ...
        case (1 << 30): return 30;
        default: return 31;
      }

    gets transformed roughly into

    switch (log2(i))
      {
        case 0: return 0;
        case 1: return 1;
        case 2: return 2;
        ...
        case 30: return 30;
        default: return 31;
      }

    This enables switch conversion to further optimize the switch.

    This patch only enables this transformation if there are optabs for FFS
    so that the base 2 logarithm can be computed efficiently at runtime.

    gcc/ChangeLog:

            * tree-switch-conversion.cc (can_log2): New static function to
            check if gen_log2 can be used on current target.
            (gen_log2): New static function to generate efficient GIMPLE code
            for taking an exact base 2 log.
            (gen_pow2p): New static function to generate efficient GIMPLE
            code for checking if a value is a power of 2.
            (switch_conversion::switch_conversion): Track if the
            transformation happened.
            (switch_conversion::is_exp_index_transform_viable): New function
            to decide whether the transformation should be applied.
            (switch_conversion::exp_index_transform): New function to execute
            the transformation.
            (switch_conversion::gen_inbound_check): Don't remove the default
            BB if the transformation happened.
            (switch_conversion::expand): Execute the transform if it is
            viable. Skip the "sufficiently small case range" test if the
            transformation is going to be executed.
* tree-switch-conversion.h: Add is_exp_index_transform_viable and exp_index_transform. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/switch-3.c: Disable switch conversion. * gcc.target/i386/switch-exp-transform-1.c: New test. * gcc.target/i386/switch-exp-transform-2.c: New test. * gcc.target/i386/switch-exp-transform-3.c: New test. Signed-off-by: Filip Kastl Diff: --- gcc/testsuite/gcc.dg/tree-ssa/switch-3.c | 2 +- .../gcc.target/i386/switch-exp-transform-1.c | 32 ++ .../gcc.target/i386/switch-exp-transform-2.c | 35 +++ .../gcc.target/i386/switch-exp-transform-3.c | 148 ++ gcc/tree-switch-conversion.cc | 326 - gcc/tree-switch-conversion.h | 18 ++ 6 files changed, 555 insertions(+), 6 deletions(-) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/switch-3.c b/gcc/testsuite/gcc.dg/tree-ssa/switch-3.c index 44981e1d1861..83aae3843e91 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/switch-3.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/switch-3.c @@ -1,4 +1,4 @@ -/* { dg-options "-O2 -fdump-tree-switchlower1" } */ +/* { dg-options "-O2 -fdump-tree-switchlower1 -fdisable-tree-switchconv" } */ int cipher_to_alg(int cipher) { diff --git a/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c b/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c new file mode 100644 index ..53d31460ba37 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c @@ -0,0 +1,32 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-switchconv -mpopcnt -mbmi" } */ + +/* Checks that exponential index transform enables switch conversion to convert + this switch into an array lookup. Also checks that the "index variable is a + power of two" check has been generated. 
*/ + +int foo(unsigned bar) +{ +switch (bar) +{ +case (1 << 0): +return 1; +case (1 << 1): +return 2; +case (1 << 2): +return 3; +case (1 << 3): +return 4; +case (1 << 4): +return 8; +case (1 << 5): +return 13; +case (1 << 6): +return 21; +default: +return 0; +} +} + +/* { dg-final { scan-tree-dump "CSWTCH" "switchconv" } } */ +/* { dg-final { scan-tree-dump "POPCOUNT" "switchconv" } } */ diff --git a/gcc/testsuite/gcc.target/i386/switch-exp-transform-2.c b/gcc/testsuite/
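
The idea behind the exponential index transform can be modelled outside the compiler. The sketch below uses the case values of foo from switch-exp-transform-1.c; everything else (the function names, the table, the plain Python power-of-two test) is an illustrative assumption and says nothing about the GIMPLE that GCC actually emits.

```python
# Results for case (1 << k), taken from switch-exp-transform-1.c.
TABLE = [1, 2, 3, 4, 8, 13, 21]
DEFAULT = 0


def foo_switch(bar):
    """The switch as written: case labels are powers of two."""
    cases = {1 << k: v for k, v in enumerate(TABLE)}
    return cases.get(bar, DEFAULT)


def foo_transformed(bar):
    """Rough equivalent after the transform: check, log2, array lookup."""
    is_pow2 = bar > 0 and (bar & (bar - 1)) == 0
    if not is_pow2:
        return DEFAULT
    exponent = bar.bit_length() - 1  # exact log2 for a power of two
    return TABLE[exponent] if exponent < len(TABLE) else DEFAULT


for bar in (1, 2, 3, 64, 128):
    print(bar, foo_switch(bar), foo_transformed(bar))
```

Once the index is the exponent, the case labels form the dense range 0..6, which is what lets the lookup become a plain array access.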
[gcc r15-2434] testsuite: Adjust switch-exp-transform-3.c for 32bit
https://gcc.gnu.org/g:f40fd85c32c9ab4849065d0d14cd5a7ad67619b8

commit r15-2434-gf40fd85c32c9ab4849065d0d14cd5a7ad67619b8
Author: Filip Kastl
Date:   Wed Jul 31 13:40:45 2024 +0200

    testsuite: Adjust switch-exp-transform-3.c for 32bit

    32bit x86 CPUs won't natively support the FFS operation on a 64 bit
    type. Therefore, I'm setting the long long int part of the
    switch-exp-transform-3.c test to only execute with 64bit targets.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/switch-exp-transform-3.c: Set the long long int
            test to only execute with 64bit targets.

    Signed-off-by: Filip Kastl

Diff:
---
 gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c b/gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c
index c8fae70692e5..64a7b1461721 100644
--- a/gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c
+++ b/gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c
@@ -99,6 +99,8 @@ int unopt_unsigned_long(unsigned long bit_position)
     }
 }
 
+#ifdef __x86_64__
+
 int unopt_long_long(long long bit_position)
 {
     switch (bit_position)
@@ -145,4 +147,7 @@ int unopt_unsigned_long_long(unsigned long long bit_position)
     }
 }
 
-/* { dg-final { scan-tree-dump-times "Applying exponential index transform" 6 "switchconv" } } */
+#endif
+
+/* { dg-final { scan-tree-dump-times "Applying exponential index transform" 4 "switchconv" { target ia32 } } } */
+/* { dg-final { scan-tree-dump-times "Applying exponential index transform" 6 "switchconv" { target { ! ia32 } } } } */
[gcc r15-2723] gimple ssa: Fix a typo in gimple-ssa-sccopy.cc
https://gcc.gnu.org/g:bb30fdd3436987aee6a22610e1d22b091c7ded6e

commit r15-2723-gbb30fdd3436987aee6a22610e1d22b091c7ded6e
Author: Filip Kastl
Date:   Mon Aug 5 14:39:06 2024 +0200

    gimple ssa: Fix a typo in gimple-ssa-sccopy.cc

    Fixes a misplaced comment in gimple-ssa-sccopy.cc. The comment belongs
    to a bitmap definition but was instead placed before the beginning of a
    namespace block.

    gcc/ChangeLog:

            * gimple-ssa-sccopy.cc: Move a misplaced comment.

    Signed-off-by: Filip Kastl

Diff:
---
 gcc/gimple-ssa-sccopy.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/gimple-ssa-sccopy.cc b/gcc/gimple-ssa-sccopy.cc
index 138ee9a0ac48..191a4c0b451d 100644
--- a/gcc/gimple-ssa-sccopy.cc
+++ b/gcc/gimple-ssa-sccopy.cc
@@ -92,10 +92,11 @@ along with GCC; see the file COPYING3. If not see
      Braun, Buchwald, Hack, Leissa, Mallon, Zwinkau, 2013, LNCS vol. 7791,
      Section 3.2. */
 
+namespace {
+
 /* Bitmap tracking statements which were propagated to be removed at the end of
    the pass. */
 
-namespace {
 static bitmap dead_stmts;
 
 /* State of vertex during SCC discovery.
[gcc r15-2755] gimple ssa: Put SCCOPY logic into a class
https://gcc.gnu.org/g:af1010268f81fc891a6bbf8ed9d5b8a3b5ce44cb commit r15-2755-gaf1010268f81fc891a6bbf8ed9d5b8a3b5ce44cb Author: Filip Kastl Date: Tue Aug 6 15:19:11 2024 +0200 gimple ssa: Put SCCOPY logic into a class Currently the main logic of the sccopy pass is implemented as static functions. This patch instead puts the code into a class. This also gets rid of a global variable (dead_stmts). gcc/ChangeLog: * gimple-ssa-sccopy.cc (class scc_copy_prop): New class. (replace_scc_by_value): Put into... (scc_copy_prop::replace_scc_by_value): ...scc_copy_prop. (sccopy_visit_op): Put into... (scc_copy_prop::visit_op): ...scc_copy_prop. (sccopy_propagate): Put into... (scc_copy_prop::propagate): ...scc_copy_prop. (init_sccopy): Replace by... (scc_copy_prop::scc_copy_prop): ...the construtor. (finalize_sccopy): Replace by... (scc_copy_prop::~scc_copy_prop): ...the destructor. (pass_sccopy::execute): Use scc_copy_prop. Signed-off-by: Filip Kastl Diff: --- gcc/gimple-ssa-sccopy.cc | 66 +++- 1 file changed, 37 insertions(+), 29 deletions(-) diff --git a/gcc/gimple-ssa-sccopy.cc b/gcc/gimple-ssa-sccopy.cc index 191a4c0b451d..d9eaeab4abbe 100644 --- a/gcc/gimple-ssa-sccopy.cc +++ b/gcc/gimple-ssa-sccopy.cc @@ -94,11 +94,6 @@ along with GCC; see the file COPYING3. If not see namespace { -/* Bitmap tracking statements which were propagated to be removed at the end of - the pass. */ - -static bitmap dead_stmts; - /* State of vertex during SCC discovery. unvisited Vertex hasn't yet been popped from worklist. @@ -459,11 +454,33 @@ get_all_stmt_may_generate_copy (void) return result; } +/* SCC copy propagation + + 'scc_copy_prop::propagate ()' is the main function of this pass. */ + +class scc_copy_prop +{ +public: + scc_copy_prop (); + ~scc_copy_prop (); + void propagate (); + +private: + /* Bitmap tracking statements which were propagated so that they can be + removed at the end of the pass. 
*/ + bitmap dead_stmts; + + void visit_op (tree op, hash_set &outer_ops, + hash_set &scc_set, bool &is_inner, + tree &last_outer_op); + void replace_scc_by_value (vec scc, tree val); +}; + /* For each statement from given SCC, replace its usages by value VAL. */ -static void -replace_scc_by_value (vec scc, tree val) +void +scc_copy_prop::replace_scc_by_value (vec scc, tree val) { for (gimple *stmt : scc) { @@ -476,12 +493,12 @@ replace_scc_by_value (vec scc, tree val) fprintf (dump_file, "Replacing SCC of size %d\n", scc.length ()); } -/* Part of 'sccopy_propagate ()'. */ +/* Part of 'scc_copy_prop::propagate ()'. */ -static void -sccopy_visit_op (tree op, hash_set &outer_ops, -hash_set &scc_set, bool &is_inner, -tree &last_outer_op) +void +scc_copy_prop::visit_op (tree op, hash_set &outer_ops, +hash_set &scc_set, bool &is_inner, +tree &last_outer_op) { bool op_in_scc = false; @@ -539,8 +556,8 @@ sccopy_visit_op (tree op, hash_set &outer_ops, Braun, Buchwald, Hack, Leissa, Mallon, Zwinkau, 2013, LNCS vol. 7791, Section 3.2. */ -static void -sccopy_propagate () +void +scc_copy_prop::propagate () { auto_vec useful_stmts = get_all_stmt_may_generate_copy (); scc_discovery discovery; @@ -575,14 +592,12 @@ sccopy_propagate () for (j = 0; j < gimple_phi_num_args (phi); j++) { op = gimple_phi_arg_def (phi, j); - sccopy_visit_op (op, outer_ops, scc_set, is_inner, - last_outer_op); + visit_op (op, outer_ops, scc_set, is_inner, last_outer_op); } break; case GIMPLE_ASSIGN: op = gimple_assign_rhs1 (stmt); - sccopy_visit_op (op, outer_ops, scc_set, is_inner, - last_outer_op); + visit_op (op, outer_ops, scc_set, is_inner, last_outer_op); break; default: gcc_unreachable (); @@ -613,19 +628,13 @@ sccopy_propagate () } } -/* Called when pass execution starts. */ - -static void -init_sccopy (void) +scc_copy_prop::scc_copy_prop () { /* For propagated statements. */ dead_stmts = BITMAP_ALLOC (NULL); } -/* Called before pass execution ends. 
*/ - -static void -finalize_sccopy (void) +scc_copy_prop::~scc_copy_prop () { /* Remove all propagated statements. */ simple_dce_from_worklist (dead_stmts); @@ -668,9 +677,8 @@ public: unsigned pass_sccopy::execute (function *) { - init_sccopy (); - sccopy_propagate (); - finalize_sccopy (); + scc_copy_prop sccopy; + sccopy.pro
[gcc r15-4024] gimple ssa: Don't use __builtin_popcount in switch exp transform [PR116616]
https://gcc.gnu.org/g:ffc389cb11a2a61fb89b6034d3f3fe0896b29064

commit r15-4024-gffc389cb11a2a61fb89b6034d3f3fe0896b29064
Author: Filip Kastl
Date:   Wed Oct 2 14:14:44 2024 +0200

    gimple ssa: Don't use __builtin_popcount in switch exp transform [PR116616]

    Switch exponential transformation in the switch conversion pass
    currently generates

    tmp1 = __builtin_popcount (var);
    tmp2 = tmp1 == 1;

    when inserting code to determine if var is power of two. If the target
    doesn't support expanding the builtin as special instructions switch
    conversion relies on this whole pattern being expanded as bitmagic.
    However, it is possible that other GIMPLE optimizations move the two
    statements of the pattern apart. In that case the builtin becomes a
    libgcc call in the final binary. The call is slow and in case of
    freestanding programs can result in linking error (this bug was
    originally found while compiling Linux kernel).

    This patch modifies switch conversion to insert the bitmagic
    (var ^ (var - 1)) > (var - 1) instead of the builtin.

    gcc/ChangeLog:

            PR tree-optimization/116616
            * tree-switch-conversion.cc (can_pow2p): Remove this function.
            (gen_pow2p): Generate bitmagic instead of a builtin. Remove the
            TYPE parameter.
            (switch_conversion::is_exp_index_transform_viable): Don't call
            can_pow2p.
            (switch_conversion::exp_index_transform): Call gen_pow2p without
            the TYPE parameter.
            * tree-switch-conversion.h: Remove
            m_exp_index_transform_pow2p_type.

    gcc/testsuite/ChangeLog:

            PR tree-optimization/116616
            * gcc.target/i386/switch-exp-transform-1.c: Don't test for
            presence of the POPCOUNT internal fn call.

Signed-off-by: Filip Kastl Diff: --- .../gcc.target/i386/switch-exp-transform-1.c | 7 +- gcc/tree-switch-conversion.cc | 84 +- gcc/tree-switch-conversion.h | 6 +- 3 files changed, 23 insertions(+), 74 deletions(-) diff --git a/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c b/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c index a8c9e03e515f..4832f5b52c33 100644 --- a/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c +++ b/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c @@ -1,10 +1,8 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-switchconv -fdump-tree-widening_mul -mpopcnt -mbmi" } */ +/* { dg-options "-O2 -fdump-tree-switchconv -mbmi" } */ /* Checks that exponential index transform enables switch conversion to convert - this switch into an array lookup. Also checks that the "index variable is a - power of two" check has been generated and that it has been later expanded - into an internal function. */ + this switch into an array lookup. */ int foo(unsigned bar) { @@ -30,4 +28,3 @@ int foo(unsigned bar) } /* { dg-final { scan-tree-dump "CSWTCH" "switchconv" } } */ -/* { dg-final { scan-tree-dump "POPCOUNT" "widening_mul" } } */ diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc index c1332a260943..00426d46 100644 --- a/gcc/tree-switch-conversion.cc +++ b/gcc/tree-switch-conversion.cc @@ -133,75 +133,33 @@ gen_log2 (tree op, location_t loc, tree *result, tree type) return stmts; } -/* Is it possible to efficiently check that a value of TYPE is a power of 2? - - If yes, returns TYPE. If no, returns NULL_TREE. May also return another - type. This indicates that logarithm of the variable can be computed but - only after it is converted to this type. - - Also see gen_pow2p. */ - -static tree -can_pow2p (tree type) -{ - /* __builtin_popcount supports the unsigned type or its long and long long - variants. Choose the smallest out of those that can still fit TYPE. 
*/ - int prec = TYPE_PRECISION (type); - int i_prec = TYPE_PRECISION (unsigned_type_node); - int li_prec = TYPE_PRECISION (long_unsigned_type_node); - int lli_prec = TYPE_PRECISION (long_long_unsigned_type_node); - - if (prec <= i_prec) -return unsigned_type_node; - else if (prec <= li_prec) -return long_unsigned_type_node; - else if (prec <= lli_prec) -return long_long_unsigned_type_node; - else -return NULL_TREE; -} - -/* Build a sequence of gimple statements checking that OP is a power of 2. Use - special optabs if target supports them. Return the result as a - boolean_type_node ssa name through RESULT. Assumes that OP's value will - be non-negative. The generated check may give arbitrary answer for negative - values. - - Before computing the check, OP may have to be converted to another type. - This should be specified in TYPE. Use can_pow2p to decide what this type - should be. - - Should only be used if can_pow2p returns true for type of OP. */ +/* Build a sequence of g
[gcc r15-4372] MAINTAINERS: Fix name order
https://gcc.gnu.org/g:2813a5bc7af2865ee4d2e94bce59a7fdefeea0b3

commit r15-4372-g2813a5bc7af2865ee4d2e94bce59a7fdefeea0b3
Author: Filip Kastl
Date:   Wed Oct 16 08:50:46 2024 +0200

    MAINTAINERS: Fix name order

    ChangeLog:

            * MAINTAINERS: Fix Write After Approval name order.

    Signed-off-by: Filip Kastl

Diff:
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index cf1cf78e16cb..269ac2ea6b49 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -760,7 +760,6 @@ Ankur Saini	arsenic
 Hariharan Sandanagobalane	hariharans
 Richard Sandiford	rsandifo
 Iain Sandoe	iains
-Feng Xue	fxue
 Duncan Sands	baldrick
 Sujoy Saraswati	ssaraswati
 Trevor Saunders	tbsaunde
@@ -880,6 +879,7 @@ Ruoyao Xi	xry111
 Mingjie Xing	xmj
 Chenghua Xu	paulhua
 Li Xu	-
+Feng Xue	fxue
 Canqun Yang	canqun
 Fei Yang	fyang
 Jeffrey Yasskin	jyasskin
[gcc r15-5204] i386: Add -mveclibabi=aocl [PR56504]
https://gcc.gnu.org/g:99ec0eb32a03506142f30c158276b4131aa73fe8

commit r15-5204-g99ec0eb32a03506142f30c158276b4131aa73fe8
Author: Filip Kastl
Date:   Wed Nov 13 16:11:14 2024 +0100

    i386: Add -mveclibabi=aocl [PR56504]

    We currently support generating vectorized math calls to the AMD core
    math library (ACML) (-mveclibabi=acml). That library is end-of-life and
    its successor is the math library from AMD Optimizing CPU Libraries
    (AOCL).

    This patch adds support for AOCL (-mveclibabi=aocl). That significantly
    broadens the range of vectorized math functions optimized for AMD CPUs
    that GCC can generate calls to.

    See the edit to invoke.texi for a complete list of added functions.
    Compared to the list of functions in AOCL LibM docs I left out these
    vectorized function families:

    - sincos and all functions working with arrays ... Because these
      functions have pointer arguments and that would require a bigger
      rework of ix86_veclibabi_aocl(). Also, I'm not sure if GCC even ever
      generates calls to these functions.
    - linearfrac ... Because these functions are specific to the AMD
      library. There's no equivalent glibc function nor GCC internal
      function nor GCC built-in.
    - powx, sqrt, fabs ... Because GCC doesn't vectorize these functions
      into calls and uses instructions instead.

    I also left amd_vrd2_expm1() (the AMD docs list the function but I
    wasn't able to link calls to it with the current version of the
    library).

    gcc/ChangeLog:

            PR target/56504
            * config/i386/i386-options.cc (ix86_option_override_internal):
            Add ix86_veclibabi_type_aocl case.
            * config/i386/i386-options.h (ix86_veclibabi_aocl): Add extern
            ix86_veclibabi_aocl().
            * config/i386/i386-opts.h (enum ix86_veclibabi): Add
            ix86_veclibabi_type_aocl into the ix86_veclibabi enum.
            * config/i386/i386.cc (ix86_veclibabi_aocl): New function.
            * config/i386/i386.opt: Add the 'aocl' type.
            * doc/invoke.texi: Document -mveclibabi=aocl.

    gcc/testsuite/ChangeLog:

            PR target/56504
            * gcc.target/i386/vectorize-aocl1.c: New test.

Signed-off-by: Filip Kastl Diff: --- gcc/config/i386/i386-options.cc | 4 + gcc/config/i386/i386-options.h | 1 + gcc/config/i386/i386-opts.h | 3 +- gcc/config/i386/i386.cc | 142 +++ gcc/config/i386/i386.opt| 3 + gcc/doc/invoke.texi | 57 -- gcc/testsuite/gcc.target/i386/vectorize-aocl1.c | 224 7 files changed, 418 insertions(+), 16 deletions(-) diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc index 603166d249c6..76a20179a365 100644 --- a/gcc/config/i386/i386-options.cc +++ b/gcc/config/i386/i386-options.cc @@ -2877,6 +2877,10 @@ ix86_option_override_internal (bool main_args_p, ix86_veclib_handler = &ix86_veclibabi_acml; break; + case ix86_veclibabi_type_aocl: + ix86_veclib_handler = &ix86_veclibabi_aocl; + break; + default: gcc_unreachable (); } diff --git a/gcc/config/i386/i386-options.h b/gcc/config/i386/i386-options.h index 0d448ef9f154..591a6152c012 100644 --- a/gcc/config/i386/i386-options.h +++ b/gcc/config/i386/i386-options.h @@ -60,6 +60,7 @@ void ix86_simd_clone_adjust (struct cgraph_node *node); extern tree (*ix86_veclib_handler) (combined_fn, tree, tree); extern tree ix86_veclibabi_svml (combined_fn, tree, tree); extern tree ix86_veclibabi_acml (combined_fn, tree, tree); +extern tree ix86_veclibabi_aocl (combined_fn, tree, tree); enum ix86_function_specific_strings { diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h index 35542b289363..69fcd82bf47f 100644 --- a/gcc/config/i386/i386-opts.h +++ b/gcc/config/i386/i386-opts.h @@ -87,7 +87,8 @@ enum asm_dialect { enum ix86_veclibabi { ix86_veclibabi_type_none, ix86_veclibabi_type_svml, - ix86_veclibabi_type_acml + ix86_veclibabi_type_acml, + ix86_veclibabi_type_aocl }; enum stack_protector_guard { diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 526c9df7618d..9d3d8abf7803 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -19882,6 +19882,148 @@ ix86_veclibabi_acml (combined_fn fn, tree type_out, tree type_in) return 
new_fndecl; } +/* Handler for an AOCL-LibM-style interface to + a library with vectorized intrinsics. */ + +tree +ix86_veclibabi_aocl (combined_fn fn, tree type_out, tree type_in) +{ + char name[20] = "amd_vr"; + int name_len = 6; + tree fntype, new_fndecl, args; + unsigned arity; + const char *bname; + machine_mode el_mode, in_mode; + int n, in_n; + + /* AOCL-LibM is 64bits only. It is also only suitable for unsafe ma
[gcc r15-3690] contrib: Set check-params-in-docs.py to skip tables of values of a param
https://gcc.gnu.org/g:4b7e6d5faa137f18a36d8c6323a8640e61ee48f1 commit r15-3690-g4b7e6d5faa137f18a36d8c6323a8640e61ee48f1 Author: Filip Kastl Date: Wed Sep 18 16:38:30 2024 +0200 contrib: Set check-params-in-docs.py to skip tables of values of a param Currently check-params-in-docs.py reports extra params being listed in invoke.texi. However, those aren't actual params but items in a table of possible values of the aarch64-autove-preference param. This patch changes check-params-in-docs.py to ignore similar tables. contrib/ChangeLog: * check-params-in-docs.py: Skip tables of values of a param. Remove code that skips items beginning with a number. Signed-off-by: Filip Kastl Diff: --- contrib/check-params-in-docs.py | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/contrib/check-params-in-docs.py b/contrib/check-params-in-docs.py index ccdb8d721696..102f0e64e989 100755 --- a/contrib/check-params-in-docs.py +++ b/contrib/check-params-in-docs.py @@ -66,14 +66,23 @@ texi = takewhile(lambda x: '@node Instrumentation Options' not in x, texi) texi = list(texi)[1:] texi_params = [] +skip = False for line in texi: +# Skip @table @samp sections of manual where values of a param are usually +# listed +if skip: +if line.startswith('@end table'): +skip = False +continue +elif line.startswith('@table @samp'): +skip = True +continue + for token in ('@item ', '@itemx '): if line.startswith(token): texi_params.append(line[len(token):]) break -# Skip digits -texi_params = [x for x in texi_params if not x[0].isdigit()] # Skip target-specific params texi_params = [x for x in texi_params if not target_specific(x)]
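The table-skipping loop from the diff above can be tried in isolation. This is only a minimal sketch, assuming the manual lines are already stripped of surrounding whitespace; the function name collect_item_names and the sample input (including the parameter names) are made up for illustration and are not part of check-params-in-docs.py.

```python
def collect_item_names(texi_lines):
    """Collect @item/@itemx names, ignoring everything inside '@table @samp'."""
    names = []
    skip = False
    for line in texi_lines:
        # While inside a table of values, drop lines until the table ends.
        if skip:
            if line.startswith('@end table'):
                skip = False
            continue
        elif line.startswith('@table @samp'):
            skip = True
            continue

        for token in ('@item ', '@itemx '):
            if line.startswith(token):
                names.append(line[len(token):])
                break
    return names


# Hypothetical excerpt: one param with a table of values, then another param.
sample = [
    '@item some-param',
    '@table @samp',
    '@item value-a',
    '@item value-b',
    '@end table',
    '@item other-param',
]
print(collect_item_names(sample))
```

Like the loop in the patch, the sketch keeps a single flag, so it assumes value tables are not nested inside each other.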
[gcc r15-5923] contrib: Fix 2 bugs in check-params-in-docs.py
https://gcc.gnu.org/g:15f5972e16e9a8f6ef0a372fdbe5359df3d0af1a commit r15-5923-g15f5972e16e9a8f6ef0a372fdbe5359df3d0af1a Author: Filip Kastl Date: Wed Dec 4 15:46:54 2024 +0100 contrib: Fix 2 bugs in check-params-in-docs.py In my last patch for check-params-in-docs.py I accidentally 1. left one occurrence of the 'help_params' variable not renamed 2. converted 'help_params' from a dict to a list These issues cause the script to error when encountering a parameter missing in docs. This patch should fix these issues. contrib/ChangeLog: * check-params-in-docs.py: 'params' -> 'help_params'. Don't convert 'help_params' to a list. Signed-off-by: Filip Kastl Diff: --- contrib/check-params-in-docs.py | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/contrib/check-params-in-docs.py b/contrib/check-params-in-docs.py index 102f0e64e989..5d5c64c14f77 100755 --- a/contrib/check-params-in-docs.py +++ b/contrib/check-params-in-docs.py @@ -57,7 +57,7 @@ for line in open(args.params_output).readlines(): help_params[r[0]] = r[1] # Skip target-specific params -help_params = [x for x in help_params.keys() if not target_specific(x)] +help_params = {x:y for x,y in help_params.items() if not target_specific(x)} # Find section in .texi manual with parameters texi = ([x.strip() for x in open(args.texi_file).readlines()]) @@ -87,7 +87,7 @@ for line in texi: texi_params = [x for x in texi_params if not target_specific(x)] texi_set = set(texi_params) - ignored -params_set = set(help_params) - ignored +params_set = set(help_params.keys()) - ignored success = True extra = texi_set - params_set @@ -101,7 +101,7 @@ if len(missing): print('Missing:') for m in missing: print('@item ' + m) -print(params[m]) +print(help_params[m]) print() success = False
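The dict-versus-list issue described in this message is easy to reproduce outside the script. The following is a rough sketch with made-up data; target_specific here is a hypothetical stand-in for the script's predicate, and the parameter names are invented.

```python
def target_specific(name):
    # Hypothetical stand-in for the script's predicate.
    return name.startswith('target-')


help_params = {
    'example-param': 'Example description.',
    'target-example': 'Target-specific example.',
}

# Variant before the fix: a list of names, so the descriptions are lost.
as_list = [x for x in help_params.keys() if not target_specific(x)]

# Variant after the fix: a dict comprehension keeps name -> description.
as_dict = {x: y for x, y in help_params.items() if not target_specific(x)}

try:
    as_list['example-param']
except TypeError as err:
    # Lists cannot be indexed by a parameter name.
    print('lookup in list fails:', err)

print(as_dict['example-param'])
```

Both variants yield the same set of names, which is presumably why the problem only surfaced once the script had to print the description of a missing parameter.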
[gcc r15-5933] params.opt: Fix typo
https://gcc.gnu.org/g:2a2f285ecd2cd681cadae305990ffb9e23e157cb commit r15-5933-g2a2f285ecd2cd681cadae305990ffb9e23e157cb Author: Filip Kastl Date: Thu Dec 5 11:23:13 2024 +0100 params.opt: Fix typo Add missing '=' after -param=cycle-accurate-model. gcc/ChangeLog: * params.opt: Add missing '=' after -param=cycle-accurate-model. Signed-off-by: Filip Kastl Diff: --- gcc/params.opt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/params.opt b/gcc/params.opt index f5cc71d0f493..5853bf02f9ee 100644 --- a/gcc/params.opt +++ b/gcc/params.opt @@ -66,7 +66,7 @@ Enable asan stack protection. Common Joined UInteger Var(param_asan_use_after_return) Init(1) IntegerRange(0, 1) Param Optimization Enable asan detection of use-after-return bugs. --param=cycle-accurate-model +-param=cycle-accurate-model= Common Joined UInteger Var(param_cycle_accurate_model) Init(1) IntegerRange(0, 1) Param Optimization Whether the scheduling description is mostly a cycle-accurate model of the target processor and is likely to be spill aggressively to fill any pipeline bubbles.
[gcc r15-5934] doc: Add store-forwarding-max-distance to invoke.texi
https://gcc.gnu.org/g:9755f5973473aa547063d1a97d47a409d237eb5b commit r15-5934-g9755f5973473aa547063d1a97d47a409d237eb5b Author: Filip Kastl Date: Thu Dec 5 11:27:26 2024 +0100 doc: Add store-forwarding-max-distance to invoke.texi gcc/ChangeLog: * doc/invoke.texi: Add store-forwarding-max-distance. Signed-off-by: Filip Kastl Diff: --- gcc/doc/invoke.texi | 5 + 1 file changed, 5 insertions(+) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index d2409a41d50a..4b1acf9b79c1 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -17122,6 +17122,11 @@ diagnostics. @item store-merging-max-size Maximum size of a single store merging region in bytes. +@item store-forwarding-max-distance +Maximum number of instruction distance that a small store forwarded to a larger +load may stall. Value '0' disables the cost checks for the +avoid-store-forwarding pass. + @item hash-table-verification-limit The number of elements for which hash table verification is done for each searched element.
[gcc r15-6120] gimple: Add limit after which slower switchlower algs are used [PR117091] [PR117352]
https://gcc.gnu.org/g:56946c801a7cf3a831a11870b7e11ba08bf9bd87 commit r15-6120-g56946c801a7cf3a831a11870b7e11ba08bf9bd87 Author: Filip Kastl Date: Wed Dec 11 19:57:04 2024 +0100 gimple: Add limit after which slower switchlower algs are used [PR117091] [PR117352] This patch adds a limit on the number of cases of a switch. When this limit is exceeded, switch lowering decides to use faster but less powerful algorithms. In particular this means that for finding bit tests switch lowering decides between the old dynamic programming O(n^2) algorithm and the new greedy algorithm that Andi Kleen recently added but then reverted due to PR117352. It also means that switch lowering may bail out on finding jump tables if the switch is too large. (Btw, it also may not bail! It can happen that the greedy algorithm finds some bit tests which then basically split the switch into multiple smaller switches, and those may be small enough to fit under the limit.) The limit is implemented as --param switch-lower-slow-alg-max-cases. Exceeding the limit is reported through -Wdisabled-optimization. This patch fixes the issue with the greedy algorithm described in PR117352. The problem was incorrect usage of the is_beneficial() heuristic. gcc/ChangeLog: PR middle-end/117091 PR middle-end/117352 * doc/invoke.texi: Add switch-lower-slow-alg-max-cases. * params.opt: Add switch-lower-slow-alg-max-cases. * tree-switch-conversion.cc (jump_table_cluster::find_jump_tables): Note in a comment that we are looking for jump tables in case sequences delimited by the already found bit tests. (bit_test_cluster::find_bit_tests): Decide between find_bit_tests_fast() and find_bit_tests_slow(). (bit_test_cluster::find_bit_tests_fast): New function. (bit_test_cluster::find_bit_tests_slow): New function. (switch_decision_tree::analyze_switch_statement): Report exceeding the limit. * tree-switch-conversion.h: Add find_bit_tests_fast() and find_bit_tests_slow(). 
Co-Authored-By: Andi Kleen Signed-off-by: Filip Kastl Diff: --- gcc/doc/invoke.texi | 3 ++ gcc/params.opt| 4 ++ gcc/tree-switch-conversion.cc | 112 +++--- gcc/tree-switch-conversion.h | 18 +++ 4 files changed, 130 insertions(+), 7 deletions(-) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 14afd1934bd2..3cb9a50b6909 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -16500,6 +16500,9 @@ Switch initialization conversion refuses to create arrays that are bigger than @option{switch-conversion-max-branch-ratio} times the number of branches in the switch. +@item switch-lower-slow-alg-max-cases +Maximum number of cases for slow switch lowering algorithms to be used. + @item max-partial-antic-length Maximum length of the partial antic set computed during the tree partial redundancy elimination optimization (@option{-ftree-pre}) when diff --git a/gcc/params.opt b/gcc/params.opt index 5853bf02f9ee..1c88d5212c40 100644 --- a/gcc/params.opt +++ b/gcc/params.opt @@ -1052,6 +1052,10 @@ Maximum number of instruction distance that a small store forwarded to a larger Common Joined UInteger Var(param_switch_conversion_branch_ratio) Init(8) IntegerRange(1, 65536) Param Optimization The maximum ratio between array size and switch branches for a switch conversion to take place. +-param=switch-lower-slow-alg-max-cases= +Common Joined UInteger Var(param_switch_lower_slow_alg_max_cases) Init(1000) IntegerRange(1, 10) Param Optimization +Maximum number of cases for slow switch lowering algorithms to be used. + -param=modref-max-bases= Common Joined UInteger Var(param_modref_max_bases) Init(32) Param Optimization Maximum number of bases stored in each modref tree. 
diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc index 3436c2a8b98c..b98e70cf7d16 100644 --- a/gcc/tree-switch-conversion.cc +++ b/gcc/tree-switch-conversion.cc @@ -54,6 +54,7 @@ Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA #include "tree-cfgcleanup.h" #include "hwint.h" #include "internal-fn.h" +#include "diagnostic-core.h" /* ??? For lang_hooks.types.type_for_mode, but is there a word_mode type in the GIMPLE type system that is language-independent? */ @@ -1641,6 +1642,11 @@ jump_table_cluster::find_jump_tables (vec &clusters) return clusters.copy (); unsigned l = clusters.length (); + + /* Note: l + 1 is the number of cases of the switch. */ + if (l + 1 > (unsigned) param_switch_lower_slow_alg_max_cases) +return clusters.copy (); + auto_vec min; min.reserve (l + 1); @@ -1771,16 +1777,80 @@ jump_table_cluster::is_beneficial (const vec &, return
[gcc r14-11431] gimple: sccopy: Don't increment i after vec::unordered_remove()
https://gcc.gnu.org/g:13950737746e6d6503ad7f1df5a8c47010857ff8 commit r14-11431-g13950737746e6d6503ad7f1df5a8c47010857ff8 Author: Filip Kastl Date: Thu Mar 20 11:54:59 2025 +0100 gimple: sccopy: Don't increment i after vec::unordered_remove() I increment the index variable in a loop even when I do vec::unordered_remove() which causes the vector traversal to miss some elements. Mikael notified me of this mistake I made in my last patch. gcc/ChangeLog: * gimple-ssa-sccopy.cc (scc_copy_prop::propagate): Don't increment after vec::unordered_remove(). Reported-by: Mikael Morin Signed-off-by: Filip Kastl (cherry picked from commit a1363f8dd8037d40e9fbf04c2ba8d6d3e7e5c269) Diff: --- gcc/gimple-ssa-sccopy.cc | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/gcc/gimple-ssa-sccopy.cc b/gcc/gimple-ssa-sccopy.cc index d4d06f3b13e7..f7e121992e5f 100644 --- a/gcc/gimple-ssa-sccopy.cc +++ b/gcc/gimple-ssa-sccopy.cc @@ -554,9 +554,11 @@ sccopy_propagate () get removed. That means parts of CFG get removed. Those may contain copy statements. For that reason we prune SCCs here. */ unsigned i; - for (i = 0; i < scc.length (); i++) + for (i = 0; i < scc.length ();) if (gimple_bb (scc[i]) == NULL) scc.unordered_remove (i); + else + i++; if (scc.is_empty ()) { scc.release ();
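The skipped-element problem described in this message can be modelled in a few lines. This is a sketch only: it assumes that vec::unordered_remove () fills the hole with the last element (a swap-with-last removal), and the helper names and data are invented for illustration.

```python
def unordered_remove(vec, i):
    """Model of a swap-with-last removal: O(1), does not preserve order."""
    vec[i] = vec[-1]
    vec.pop()


def prune_always_increment(vec, dead):
    """Loop shape before the fix: i advances on every iteration."""
    i = 0
    while i < len(vec):
        if vec[i] in dead:
            unordered_remove(vec, i)
        i += 1  # The element just moved into slot i is never examined.
    return vec


def prune_conditional_increment(vec, dead):
    """Loop shape after the fix: i advances only when nothing was removed."""
    i = 0
    while i < len(vec):
        if vec[i] in dead:
            unordered_remove(vec, i)  # Slot i is re-examined next iteration.
        else:
            i += 1
    return vec


dead = {'x', 'y'}
print(prune_always_increment(['a', 'x', 'b', 'y'], dead))
print(prune_conditional_increment(['a', 'x', 'b', 'y'], dead))
```

In the first call, removing 'x' moves 'y' into its slot and the loop steps past it, so 'y' survives the pruning. The second call removes both.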
[gcc r15-8468] gimple: sccopy: Don't increment i after vec::unordered_remove()
https://gcc.gnu.org/g:a1363f8dd8037d40e9fbf04c2ba8d6d3e7e5c269 commit r15-8468-ga1363f8dd8037d40e9fbf04c2ba8d6d3e7e5c269 Author: Filip Kastl Date: Thu Mar 20 11:54:59 2025 +0100 gimple: sccopy: Don't increment i after vec::unordered_remove() I increment the index variable in a loop even when I do vec::unordered_remove() which causes the vector traversal to miss some elements. Mikael notified me of this mistake I made in my last patch. gcc/ChangeLog: * gimple-ssa-sccopy.cc (scc_copy_prop::propagate): Don't increment after vec::unordered_remove(). Reported-by: Mikael Morin Signed-off-by: Filip Kastl Diff: --- gcc/gimple-ssa-sccopy.cc | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/gcc/gimple-ssa-sccopy.cc b/gcc/gimple-ssa-sccopy.cc index 298feb055711..ee2a7fa8a727 100644 --- a/gcc/gimple-ssa-sccopy.cc +++ b/gcc/gimple-ssa-sccopy.cc @@ -582,9 +582,11 @@ scc_copy_prop::propagate () get removed. That means parts of CFG get removed. Those may contain copy statements. For that reason we prune SCCs here. */ unsigned i; - for (i = 0; i < scc.length (); i++) + for (i = 0; i < scc.length ();) if (gimple_bb (scc[i]) == NULL) scc.unordered_remove (i); + else + i++; if (scc.is_empty ()) { scc.release ();
[gcc r16-349] gimple: Switch bit-test lowering testcases for the more powerful alg
https://gcc.gnu.org/g:8444c4cc7648f4396e2a3726677f909438e92c80 commit r16-349-g8444c4cc7648f4396e2a3726677f909438e92c80 Author: Filip Kastl Date: Thu May 1 15:32:36 2025 +0200 gimple: Switch bit-test lowering testcases for the more powerful alg This patch adds 2 testcases. One tests that GCC is able to create bit-test clusters of size 64. The other one contains two switches which GCC wouldn't completely cover with bit-test clusters before the changes from this patch set. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/switch-5.c: New test. * gcc.dg/tree-ssa/switch-6.c: New test. Signed-off-by: Filip Kastl Diff: --- gcc/testsuite/gcc.dg/tree-ssa/switch-5.c | 60 gcc/testsuite/gcc.dg/tree-ssa/switch-6.c | 51 +++ 2 files changed, 111 insertions(+) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/switch-5.c b/gcc/testsuite/gcc.dg/tree-ssa/switch-5.c new file mode 100644 index ..b05742cf153c --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/switch-5.c @@ -0,0 +1,60 @@ +/* { dg-do compile { target { { x86_64-*-* aarch64-*-* ia64-*-* powerpc64-*-* } && lp64 } } } */ +/* { dg-options "-O2 -fdump-tree-switchlower1" } */ + +int f0(); +int f1(); +int f2(); +int f3(); +int f4(); + +int foo(int a) +{ +switch (a) +{ +case 0: +case 2: +case 4: +case 6: +return f0(); +case 8: +return f1(); +case 10: +case 14: +case 16: +case 18: +return f2(); +case 12: +return f3(); +case 20: +return f4(); +} +return -1; +} + +/* { dg-final { scan-tree-dump ";; GIMPLE switch case clusters: BT:0-8 BT:10-20" "switchlower1" } } */ + +int bar(int a) +{ +switch (a) +{ +case 20: +case 18: +case 16: +case 14: +return f0(); +case 12: +return f1(); +case 10: +case 6: +case 4: +case 2: +return f2(); +case 8: +return f3(); +case 0: +return f4(); +} +return -1; +} + +/* { dg-final { scan-tree-dump ";; GIMPLE switch case clusters: BT:0-10 BT:12-20" "switchlower1" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/switch-6.c b/gcc/testsuite/gcc.dg/tree-ssa/switch-6.c new file mode 100644 index ..bbbc87462c40 --- /dev/null 
+++ b/gcc/testsuite/gcc.dg/tree-ssa/switch-6.c @@ -0,0 +1,51 @@ +/* { dg-do compile { target { { x86_64-*-* aarch64-*-* ia64-*-* powerpc64-*-* } && lp64 } } } */ +/* { dg-options "-O2 -fdump-tree-switchlower1 -fno-jump-tables" } */ + +/* Test that bit-test switch lowering can create cluster of size 64 (there was + an of-by-one error causing it to only do 63 before). */ + +int f(); + +int foo(int a) +{ +switch (a) +{ +case 0: +case 3: +case 5: +case 7: +case 9: +case 11: +case 13: +case 15: +case 17: +case 19: +case 21: +case 23: +case 25: +case 27: +case 29: +case 31: +case 33: +case 35: +case 37: +case 39: +case 41: +case 43: +case 45: +case 47: +case 49: +case 51: +case 53: +case 55: +case 57: +case 59: +case 61: +case 63: +return f(); +default: +return -1; +} +} + +/* { dg-final { scan-tree-dump ";; GIMPLE switch case clusters: BT:0-63" "switchlower1" } } */
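The off-by-one mentioned in the switch-6.c comment comes down to how the value range of a cluster is compared against the word size. The sketch below is a rough model, not the GCC code: the function names are hypothetical, the 64-bit word size is an assumption matching the lp64 targets the test is restricted to, and the two comparisons follow the ChangeLog wording of the accompanying change (">" instead of ">=").

```python
BITS_IN_WORD = 64  # Assumption: 64-bit word, as on the lp64 targets of the test.


def value_range(low, high):
    """Number of case values a cluster from low to high spans (inclusive)."""
    return high - low + 1


def rejected_before(low, high):
    # Hypothetical model of the old check: a range equal to the word size is refused.
    return value_range(low, high) >= BITS_IN_WORD


def rejected_after(low, high):
    # Hypothetical model of the corrected check: only larger ranges are refused.
    return value_range(low, high) > BITS_IN_WORD


# The cases in switch-6.c run from 0 to 63, i.e. a range of exactly 64 values.
print(value_range(0, 63))
print(rejected_before(0, 63), rejected_after(0, 63))
```

A cluster spanning case values 0 through 63 needs exactly one bit per value of a 64-bit word, which is why the expected dump contains the single cluster BT:0-63.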
[gcc r16-347] gimple: Make bit-test switch lowering more powerful
https://gcc.gnu.org/g:1381a5114788a2e9234ff54e0cd7a3c810f0d02d commit r16-347-g1381a5114788a2e9234ff54e0cd7a3c810f0d02d Author: Filip Kastl Date: Thu May 1 15:31:30 2025 +0200 gimple: Make bit-test switch lowering more powerful A reasonable goal for bit-test lowering is to produce the fewest clusters for a given switch (a cluster is basically a group of cases that can be handled by a constant number of operations). The current algorithm doesn't always give optimal solutions in that sense. This patch should fix this. The important thing is basically just to ask if a cluster is_beneficial() more proactively. The patch also has a fix for a mistake which made bit-test lowering only create clusters of size at most BITS_IN_WORD - 1. There are also some new comments that go into more detail on the dynamic programming algorithm. gcc/ChangeLog: * tree-switch-conversion.cc (bit_test_cluster::find_bit_tests): Modify the dynamic programming algorithm to take is_beneficial() into account earlier. To do this efficiently, copy some logic from is_beneficial() here. Add detailed comments about how the DP algorithm works. (bit_test_cluster::can_be_handled): Check that the cluster range is >, not >= BITS_IN_WORD. Remove the "vec &, unsigned, unsigned" overloaded variant since we no longer need it. (bit_test_cluster::is_beneficial): Add a comment that this function is closely tied to m_max_case_bit_tests. Remove the "vec &, unsigned, unsigned" overloaded variant since we no longer need it. * tree-switch-conversion.h: Remove the vec overloaded variants of bit_test_cluster::is_beneficial and bit_test_cluster::can_be_handled. 
Signed-off-by: Filip Kastl Diff: --- gcc/tree-switch-conversion.cc | 153 ++ gcc/tree-switch-conversion.h | 10 --- 2 files changed, 67 insertions(+), 96 deletions(-) diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc index a70274b03372..4f0be8c43f07 100644 --- a/gcc/tree-switch-conversion.cc +++ b/gcc/tree-switch-conversion.cc @@ -1783,58 +1783,98 @@ bit_test_cluster::find_bit_tests (vec &clusters, int max_c) if (!is_enabled () || max_c == 1) return clusters.copy (); + /* Dynamic programming algorithm. + + In: List of simple clusters + Out: List of simple clusters and bit test clusters such that each bit test + cluster can_be_handled() and is_beneficial() + + Tries to merge consecutive clusters into bigger (bit test) ones. Tries to + end up with as few clusters as possible. */ + unsigned l = clusters.length (); auto_vec min; min.reserve (l + 1); - min.quick_push (min_cluster_item (0, 0, 0)); + gcc_checking_assert (l > 0); + gcc_checking_assert (l <= INT_MAX); - unsigned bits_in_word = GET_MODE_BITSIZE (word_mode); + int bits_in_word = GET_MODE_BITSIZE (word_mode); - for (unsigned i = 1; i <= l; i++) + /* First phase: Compute the minimum number of clusters for each prefix of the + input list incrementally + + min[i] = (count, j, _) means that the prefix ending with the (i-1)-th + element can be made to contain as few as count clusters and that in such + clustering the last cluster is made up of input clusters [j, i-1] + (inclusive). */ + min.quick_push (min_cluster_item (0, 0, INT_MAX)); + min.quick_push (min_cluster_item (1, 0, INT_MAX)); + for (int i = 2; i <= (int) l; i++) { - /* Set minimal # of clusters with i-th item to infinite. */ - min.quick_push (min_cluster_item (INT_MAX, INT_MAX, INT_MAX)); + auto_vec unique_labels; /* Since each cluster contains at least one case number and one bit test cluster can cover at most bits_in_word case numbers, we don't need to look farther than bits_in_word clusters back. 
*/ - unsigned j; - if (i - 1 >= bits_in_word) - j = i - 1 - bits_in_word; - else - j = 0; - for (; j < i; j++) + for (int j = i - 1; j >= 0 && j >= i - bits_in_word; j--) { - if (min[j].m_count + 1 < min[i].m_count - && can_be_handled (clusters, j, i - 1)) - min[i] = min_cluster_item (min[j].m_count + 1, j, INT_MAX); - } + /* Consider creating a bit test cluster from input clusters [j, i-1] +(inclusive) */ - gcc_checking_assert (min[i].m_count != INT_MAX); + simple_cluster *sc = static_cast (clusters[j]); + unsigned label = sc->m_case_bb->index; + if (!unique_labels.contains (label)) + { + if (unique_labels.length () >= m_max_case_bit_tests) + /* is_beneficial() will be false for this and the following + iterations. */ + break; +
[gcc r16-348] gimple: Don't warn about using different algs for big switch lowering [PR117091]
https://gcc.gnu.org/g:c14560907a9586ad405f26ab937881eb08f39497 commit r16-348-gc14560907a9586ad405f26ab937881eb08f39497 Author: Filip Kastl Date: Thu May 1 15:32:07 2025 +0200 gimple: Don't warn about using different algs for big switch lowering [PR117091] We currently don't switch to a faster switch lowering algorithm when a switch is too big. This patch removes a warning about this. PR middle-end/117091 gcc/ChangeLog: * tree-switch-conversion.cc (switch_decision_tree::analyze_switch_statement): Remove warning about using different algorithms. Signed-off-by: Filip Kastl Diff: --- gcc/tree-switch-conversion.cc | 7 --- 1 file changed, 7 deletions(-) diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc index 4f0be8c43f07..dea217a01efb 100644 --- a/gcc/tree-switch-conversion.cc +++ b/gcc/tree-switch-conversion.cc @@ -2257,13 +2257,6 @@ switch_decision_tree::analyze_switch_statement () reset_out_edges_aux (m_switch); - if (l > (unsigned) param_switch_lower_slow_alg_max_cases) -warning_at (gimple_location (m_switch), OPT_Wdisabled_optimization, - "Using faster switch lowering algorithms. " - "Number of switch cases (%d) exceeds " - "%<--param=switch-lower-slow-alg-max-cases=%d%> limit.", - l, param_switch_lower_slow_alg_max_cases); - /* Find bit-test clusters. */ vec output = bit_test_cluster::find_bit_tests (clusters, max_c);
[gcc r16-346] gimple: Merge slow and fast bit-test switch lowering [PR117091]
https://gcc.gnu.org/g:5274db0c9b8c0e2d2879b237eb2ab576543b6c37 commit r16-346-g5274db0c9b8c0e2d2879b237eb2ab576543b6c37 Author: Filip Kastl Date: Thu May 1 15:30:52 2025 +0200 gimple: Merge slow and fast bit-test switch lowering [PR117091] PR117091 showed that bit-test switch lowering can take a lot of time. The algorithm was O(n^2). We therefore came up with a faster algorithm (O(n * BITS_IN_WORD)) and made GCC choose between the slow and the fast algorithm based on how big the switch is. Here I combine the algorithms so that we get the results of the slower algorithm in the faster asymptotic time. PR middle-end/117091 gcc/ChangeLog: * tree-switch-conversion.cc (bit_test_cluster::find_bit_tests_fast): Remove function. (bit_test_cluster::find_bit_tests_slow): Remove function. (bit_test_cluster::find_bit_tests): We don't need to decide between slow and fast so just put the modified (no longer) slow algorithm here. Signed-off-by: Filip Kastl Diff: --- gcc/tree-switch-conversion.cc | 107 +++--- 1 file changed, 17 insertions(+), 90 deletions(-) diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc index 39a8a893edde..a70274b03372 100644 --- a/gcc/tree-switch-conversion.cc +++ b/gcc/tree-switch-conversion.cc @@ -1773,92 +1773,38 @@ jump_table_cluster::is_beneficial (const vec &, return end - start + 1 >= case_values_threshold (); } -/* Find bit tests of given CLUSTERS, where all members of the vector are of - type simple_cluster. Use a fast algorithm that might not find the optimal - solution (minimal number of clusters on the output). New clusters are - returned. - - You should call find_bit_tests () instead of calling this function - directly. */ - -vec -bit_test_cluster::find_bit_tests_fast (vec &clusters) -{ - unsigned l = clusters.length (); - vec output; - - output.create (l); - - /* Look at sliding BITS_PER_WORD sized windows in the switch value space - and determine if they are suitable for a bit test cluster. 
Worst case - this can examine every value BITS_PER_WORD-1 times. */ - unsigned k; - for (unsigned i = 0; i < l; i += k) -{ - hash_set targets; - cluster *start_cluster = clusters[i]; - - /* Find the biggest k such that clusters i to i+k-1 can be turned into a -one big bit test cluster. */ - k = 0; - while (i + k < l) - { - cluster *end_cluster = clusters[i + k]; - - /* Does value range fit into the BITS_PER_WORD window? */ - HOST_WIDE_INT w = cluster::get_range (start_cluster->get_low (), - end_cluster->get_high ()); - if (w == 0 || w > BITS_PER_WORD) - break; - - /* Check for max # of targets. */ - if (targets.elements () == m_max_case_bit_tests - && !targets.contains (end_cluster->m_case_bb)) - break; - - targets.add (end_cluster->m_case_bb); - k++; - } - - if (is_beneficial (k, targets.elements ())) - { - output.safe_push (new bit_test_cluster (clusters, i, i + k - 1, - i == 0 && k == l)); - } - else - { - output.safe_push (clusters[i]); - /* ??? Might be able to skip more. */ - k = 1; - } -} - - return output; -} - /* Find bit tests of given CLUSTERS, where all members of the vector - are of type simple_cluster. Use a slow (quadratic) algorithm that always - finds the optimal solution (minimal number of clusters on the output). New - clusters are returned. - - You should call find_bit_tests () instead of calling this function - directly. */ + are of type simple_cluster. MAX_C is the approx max number of cases per + label. New clusters are returned. */ vec -bit_test_cluster::find_bit_tests_slow (vec &clusters) +bit_test_cluster::find_bit_tests (vec &clusters, int max_c) { + if (!is_enabled () || max_c == 1) +return clusters.copy (); + unsigned l = clusters.length (); auto_vec min; min.reserve (l + 1); min.quick_push (min_cluster_item (0, 0, 0)); + unsigned bits_in_word = GET_MODE_BITSIZE (word_mode); + for (unsigned i = 1; i <= l; i++) { /* Set minimal # of clusters with i-th item to infinite. 
*/ min.quick_push (min_cluster_item (INT_MAX, INT_MAX, INT_MAX)); - for (unsigned j = 0; j < i; j++) + /* Since each cluster contains at least one case number and one bit test +cluster can cover at most bits_in_word case numbers, we don't need to +look farther than bits_in_word clusters back. */ + unsigned j; + if (i - 1 >= bits_in_word) + j = i - 1 - bits_in_word; + else + j = 0; + for (; j < i; j++) {
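The idea of bounding the inner loop by the word size can be illustrated with a much simplified model of the dynamic programming. This is a sketch under stated assumptions, not the GCC implementation: simple clusters are plain (low, high) pairs that are sorted and do not overlap, a group is accepted whenever its value range fits into a word, and is_beneficial (), the limit on distinct labels and everything else in tree-switch-conversion.cc are ignored. All names are hypothetical.

```python
def min_cluster_count(cases, bits_in_word=64, windowed=True):
    """Fewest groups covering CASES, a sorted list of (low, high) pairs.

    A group of consecutive entries is accepted if it is a single entry or
    if its value range fits into bits_in_word values.
    """
    n = len(cases)
    inf = float('inf')
    best = [0] + [inf] * n  # best[i]: fewest groups covering cases[:i]
    for i in range(1, n + 1):
        # Every entry covers at least one value, so a group that starts more
        # than bits_in_word entries back cannot fit into a word anyway.
        if windowed and i - 1 >= bits_in_word:
            start = i - 1 - bits_in_word
        else:
            start = 0
        for j in range(start, i):
            single = (j == i - 1)
            span = cases[i - 1][1] - cases[j][0] + 1
            if (single or span <= bits_in_word) and best[j] + 1 < best[i]:
                best[i] = best[j] + 1
    return best[n]


# 100 single-value cases on the even numbers 0, 2, ..., 198.
evens = [(k, k) for k in range(0, 200, 2)]
print(min_cluster_count(evens, windowed=True))
print(min_cluster_count(evens, windowed=False))
```

With the window the inner loop runs at most bits_in_word + 1 times per entry instead of up to n times, while both variants return the same count on this model, since the start positions the window skips could never pass the range check.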
[gcc r14-11373] gimple: sccopy: Prune removed statements from SCCs [PR117919]
https://gcc.gnu.org/g:6ffbc711afbda9446df51fd2b542ecd61853283d commit r14-11373-g6ffbc711afbda9446df51fd2b542ecd61853283d Author: Filip Kastl Date: Sun Mar 2 06:39:17 2025 +0100 gimple: sccopy: Prune removed statements from SCCs [PR117919] While writing the sccopy pass I didn't realize that 'replace_uses_by ()' can remove portions of the CFG. This happens when replacing arguments of some statement results in the removal of an EH edge. Because of this, sccopy can then work with GIMPLE statements that aren't part of the IR anymore. In PR117919 this triggered an assertion within the pass which assumes that statements the pass works with are reachable. This patch tells the pass to notice when a statement isn't in the IR anymore and remove it from its worklist. PR tree-optimization/117919 gcc/ChangeLog: * gimple-ssa-sccopy.cc (scc_copy_prop::propagate): Prune statements that 'replace_uses_by ()' removed. gcc/testsuite/ChangeLog: * g++.dg/pr117919.C: New test. Signed-off-by: Filip Kastl (cherry picked from commit 5349aa2accdf34a7bf9cabd1447878aaadfc0e87) Diff: --- gcc/gimple-ssa-sccopy.cc | 13 +++ gcc/testsuite/g++.dg/pr117919.C | 52 + 2 files changed, 65 insertions(+) diff --git a/gcc/gimple-ssa-sccopy.cc b/gcc/gimple-ssa-sccopy.cc index 138ee9a0ac48..d4d06f3b13e7 100644 --- a/gcc/gimple-ssa-sccopy.cc +++ b/gcc/gimple-ssa-sccopy.cc @@ -550,6 +550,19 @@ sccopy_propagate () { vec scc = worklist.pop (); + /* When we do 'replace_scc_by_value' it may happen that some EH edges +get removed. That means parts of CFG get removed. Those may +contain copy statements. For that reason we prune SCCs here. 
*/ + unsigned i; + for (i = 0; i < scc.length (); i++) + if (gimple_bb (scc[i]) == NULL) + scc.unordered_remove (i); + if (scc.is_empty ()) + { + scc.release (); + continue; + } + auto_vec inner; hash_set outer_ops; tree last_outer_op = NULL_TREE; diff --git a/gcc/testsuite/g++.dg/pr117919.C b/gcc/testsuite/g++.dg/pr117919.C new file mode 100644 index ..fa2d9c9cd1e5 --- /dev/null +++ b/gcc/testsuite/g++.dg/pr117919.C @@ -0,0 +1,52 @@ +/* PR tree-optimization/117919 */ +/* { dg-do compile } */ +/* { dg-options "-O1 -fno-tree-forwprop -fnon-call-exceptions --param=early-inlining-insns=192 -std=c++20" } */ + +char _M_p, _M_construct___beg; +struct _Alloc_hider { + _Alloc_hider(char); +}; +long _M_string_length; +void _M_destroy(); +void _S_copy_chars(char *, char *, char *) noexcept; +char _M_local_data(); +struct Trans_NS___cxx11_basic_string { + _Alloc_hider _M_dataplus; + bool _M_is_local() { +if (_M_local_data()) + if (_M_string_length) +return true; +return false; + } + void _M_dispose() { +if (!_M_is_local()) + _M_destroy(); + } + char *_M_construct___end; + Trans_NS___cxx11_basic_string(Trans_NS___cxx11_basic_string &) + : _M_dataplus(0) { +struct _Guard { + ~_Guard() { _M_guarded->_M_dispose(); } + Trans_NS___cxx11_basic_string *_M_guarded; +} __guard0; +_S_copy_chars(&_M_p, &_M_construct___beg, _M_construct___end); + } +}; +namespace filesystem { +struct path { + path(); + Trans_NS___cxx11_basic_string _M_pathname; +}; +} // namespace filesystem +struct FileWriter { + filesystem::path path; + FileWriter() : path(path) {} +}; +struct LanguageFileWriter : FileWriter { + LanguageFileWriter(filesystem::path) {} +}; +int +main() { + filesystem::path output_file; + LanguageFileWriter writer(output_file); +}
[gcc r15-7779] gimple: sccopy: Prune removed statements from SCCs [PR117919]
https://gcc.gnu.org/g:5349aa2accdf34a7bf9cabd1447878aaadfc0e87 commit r15-7779-g5349aa2accdf34a7bf9cabd1447878aaadfc0e87 Author: Filip Kastl Date: Sun Mar 2 06:39:17 2025 +0100 gimple: sccopy: Prune removed statements from SCCs [PR117919] While writing the sccopy pass I didn't realize that 'replace_uses_by ()' can remove portions of the CFG. This happens when replacing arguments of some statement results in the removal of an EH edge. Because of this, sccopy can then work with GIMPLE statements that aren't part of the IR anymore. In PR117919 this triggered an assertion within the pass which assumes that statements the pass works with are reachable. This patch tells the pass to notice when a statement isn't in the IR anymore and remove it from its worklist. PR tree-optimization/117919 gcc/ChangeLog: * gimple-ssa-sccopy.cc (scc_copy_prop::propagate): Prune statements that 'replace_uses_by ()' removed. gcc/testsuite/ChangeLog: * g++.dg/pr117919.C: New test. Signed-off-by: Filip Kastl Diff: --- gcc/gimple-ssa-sccopy.cc | 13 +++ gcc/testsuite/g++.dg/pr117919.C | 52 + 2 files changed, 65 insertions(+) diff --git a/gcc/gimple-ssa-sccopy.cc b/gcc/gimple-ssa-sccopy.cc index 9f25fbaff365..7ffb5718ab6b 100644 --- a/gcc/gimple-ssa-sccopy.cc +++ b/gcc/gimple-ssa-sccopy.cc @@ -568,6 +568,19 @@ scc_copy_prop::propagate () { vec scc = worklist.pop (); + /* When we do 'replace_scc_by_value' it may happen that some EH edges +get removed. That means parts of CFG get removed. Those may +contain copy statements. For that reason we prune SCCs here. 
*/ + unsigned i; + for (i = 0; i < scc.length (); i++) + if (gimple_bb (scc[i]) == NULL) + scc.unordered_remove (i); + if (scc.is_empty ()) + { + scc.release (); + continue; + } + auto_vec inner; hash_set outer_ops; tree last_outer_op = NULL_TREE; diff --git a/gcc/testsuite/g++.dg/pr117919.C b/gcc/testsuite/g++.dg/pr117919.C new file mode 100644 index ..fa2d9c9cd1e5 --- /dev/null +++ b/gcc/testsuite/g++.dg/pr117919.C @@ -0,0 +1,52 @@ +/* PR tree-optimization/117919 */ +/* { dg-do compile } */ +/* { dg-options "-O1 -fno-tree-forwprop -fnon-call-exceptions --param=early-inlining-insns=192 -std=c++20" } */ + +char _M_p, _M_construct___beg; +struct _Alloc_hider { + _Alloc_hider(char); +}; +long _M_string_length; +void _M_destroy(); +void _S_copy_chars(char *, char *, char *) noexcept; +char _M_local_data(); +struct Trans_NS___cxx11_basic_string { + _Alloc_hider _M_dataplus; + bool _M_is_local() { +if (_M_local_data()) + if (_M_string_length) +return true; +return false; + } + void _M_dispose() { +if (!_M_is_local()) + _M_destroy(); + } + char *_M_construct___end; + Trans_NS___cxx11_basic_string(Trans_NS___cxx11_basic_string &) + : _M_dataplus(0) { +struct _Guard { + ~_Guard() { _M_guarded->_M_dispose(); } + Trans_NS___cxx11_basic_string *_M_guarded; +} __guard0; +_S_copy_chars(&_M_p, &_M_construct___beg, _M_construct___end); + } +}; +namespace filesystem { +struct path { + path(); + Trans_NS___cxx11_basic_string _M_pathname; +}; +} // namespace filesystem +struct FileWriter { + filesystem::path path; + FileWriter() : path(path) {} +}; +struct LanguageFileWriter : FileWriter { + LanguageFileWriter(filesystem::path) {} +}; +int +main() { + filesystem::path output_file; + LanguageFileWriter writer(output_file); +}
[gcc r15-7651] invoke.texi: Fix typo in the file-cache-lines param
https://gcc.gnu.org/g:a42374b60884d9ac4ff47e7787b32142526ac666 commit r15-7651-ga42374b60884d9ac4ff47e7787b32142526ac666 Author: Filip Kastl Date: Thu Feb 20 13:20:34 2025 +0100 invoke.texi: Fix typo in the file-cache-lines param file-cache-lines param was documented as file-cache-files. This fixes the typo. gcc/ChangeLog: * doc/invoke.texi: Fix typo file-cache-files -> file-cache-lines. Signed-off-by: Filip Kastl Diff: --- gcc/doc/invoke.texi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 0c7adc039b5d..bad49a017cc1 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -15787,7 +15787,7 @@ Max number of files in the file cache. The file cache is used to print source lines in diagnostics and do some source checks like @option{-Wmisleading-indentation}. -@item file-cache-files +@item file-cache-lines Max number of lines to index into file cache. When 0 this is automatically sized. The file cache is used to print source lines in diagnostics and do some source checks like @option{-Wmisleading-indentation}.
[gcc r16-514] testsuite: Disable bit tests in aarch64/pr99988.c
https://gcc.gnu.org/g:afaed441b5b096376afd15cb58c9a8a567fecdcf commit r16-514-gafaed441b5b096376afd15cb58c9a8a567fecdcf Author: Filip Kastl Date: Sat May 10 18:30:23 2025 +0200 testsuite: Disable bit tests in aarch64/pr99988.c My recent changes to bit-test switch lowering broke pr99988.c testcase. The testcase assumes a switch will be lowered using jump tables. Make the testcase run with -fno-bit-tests. Pushed as obvious. gcc/testsuite/ChangeLog: * gcc.target/aarch64/pr99988.c: Add -fno-bit-tests. Signed-off-by: Filip Kastl Diff: --- gcc/testsuite/gcc.target/aarch64/pr99988.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.target/aarch64/pr99988.c b/gcc/testsuite/gcc.target/aarch64/pr99988.c index 7cca49629446..c09ce67c0fa9 100644 --- a/gcc/testsuite/gcc.target/aarch64/pr99988.c +++ b/gcc/testsuite/gcc.target/aarch64/pr99988.c @@ -1,5 +1,5 @@ /* { dg-do compile { target lp64 } } */ -/* { dg-options "-O2 -mbranch-protection=standard" } */ +/* { dg-options "-O2 -mbranch-protection=standard -fno-bit-tests" } */ /* { dg-final { scan-assembler-times {bti j} 13 } } */ int a; int c();
[gcc r16-513] gimple: Don't assert that switch has nondefault cases during lowering [PR120080]
https://gcc.gnu.org/g:358a5aedf2b5b61f4edfc7964144355a4897dbb9 commit r16-513-g358a5aedf2b5b61f4edfc7964144355a4897dbb9 Author: Filip Kastl Date: Sat May 10 16:18:33 2025 +0200 gimple: Don't assert that switch has nondefault cases during lowering [PR120080] I have mistakenly assumed that switch lowering cannot encounter a switch with zero clusters. This patch removes the relevant assert and instead gives up bit-test lowering when this happens. PR tree-optimization/120080 gcc/ChangeLog: * tree-switch-conversion.cc (bit_test_cluster::find_bit_tests): Replace assert with return. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr120080.c: New test. Signed-off-by: Filip Kastl Diff: --- gcc/testsuite/gcc.dg/tree-ssa/pr120080.c | 26 ++ gcc/tree-switch-conversion.cc| 8 +--- 2 files changed, 31 insertions(+), 3 deletions(-) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr120080.c b/gcc/testsuite/gcc.dg/tree-ssa/pr120080.c new file mode 100644 index ..d71ef5e9dd05 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr120080.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-fgimple -O2" } */ + +void __GIMPLE (ssa,startwith("switchlower1")) +foo (int b) +{ + __BB(2): + switch (b) {default: L9; case 0: L5; case 5: L5; case 101: L5; } + + __BB(3): +L9: + switch (b) {default: L7; case 5: L6; case 101: L6; } + + __BB(4): +L6: + __builtin_unreachable (); + + __BB(5): +L7: + __builtin_trap (); + + __BB(6): +L5: + return; + +} diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc index dea217a01efb..bd4de966892c 100644 --- a/gcc/tree-switch-conversion.cc +++ b/gcc/tree-switch-conversion.cc @@ -1793,12 +1793,14 @@ bit_test_cluster::find_bit_tests (vec &clusters, int max_c) end up with as few clusters as possible. 
*/ unsigned l = clusters.length (); - auto_vec min; - min.reserve (l + 1); - gcc_checking_assert (l > 0); + if (l == 0) +return clusters.copy (); gcc_checking_assert (l <= INT_MAX); + auto_vec min; + min.reserve (l + 1); + int bits_in_word = GET_MODE_BITSIZE (word_mode); /* First phase: Compute the minimum number of clusters for each prefix of the
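The shape of the fix above (return early on empty input before sizing the per-prefix table) can be sketched in Python. This is an illustration under stated assumptions, not the algorithm in tree-switch-conversion.cc: `min_groups` is a hypothetical name, and the grouping criterion (a group's values must span less than `max_span`, assumed to be at least 1) is invented here to keep the example small.

```python
def min_groups(values, max_span):
    """Partition sorted VALUES into the fewest consecutive groups whose
    span is below MAX_SPAN (hypothetical criterion, MAX_SPAN >= 1)."""
    if not values:
        # Nothing to partition: give up early instead of asserting.
        return []

    n = len(values)
    inf = float("inf")
    best = [inf] * (n + 1)   # best[k]: fewest groups covering values[:k]
    start = [0] * (n + 1)    # start[k]: where the last group of that solution begins
    best[0] = 0

    # Compute the minimum number of groups for each prefix.
    for end in range(1, n + 1):
        for begin in range(end):
            fits = values[end - 1] - values[begin] < max_span
            if fits and best[begin] + 1 < best[end]:
                best[end] = best[begin] + 1
                start[end] = begin

    # Walk the table backwards to recover the groups themselves.
    groups = []
    end = n
    while end > 0:
        groups.append(values[start[end]:end])
        end = start[end]
    groups.reverse()
    return groups
```

With the case values from the new test, `min_groups([0, 5, 101], 64)` yields two groups, and an empty input simply comes back as an empty list.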
[gcc r16-2245] fortran: Fix indentation
https://gcc.gnu.org/g:48479558b5d687460ab547625a98b851ec422476 commit r16-2245-g48479558b5d687460ab547625a98b851ec422476 Author: Filip Kastl Date: Tue Jul 15 08:39:00 2025 +0200 fortran: Fix indentation Move a block of code two spaces to the left. Committing as obvious. gcc/fortran/ChangeLog: * resolve.cc (resolve_select_type): Fix indentation. Signed-off-by: Filip Kastl Diff: --- gcc/fortran/resolve.cc | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc index 93df5d014fa2..c33bd17da2dc 100644 --- a/gcc/fortran/resolve.cc +++ b/gcc/fortran/resolve.cc @@ -11014,16 +11014,16 @@ resolve_select_type (gfc_code *code, gfc_namespace *old_ns) that does precisely this here (instead of using the 'global' one). */ - /* First check the derived type import status. */ - if (gfc_current_ns->import_state != IMPORT_NOT_SET - && (c->ts.type == BT_DERIVED || c->ts.type == BT_CLASS)) - { - st = gfc_find_symtree (gfc_current_ns->sym_root, - c->ts.u.derived->name); - if (!check_sym_import_status (c->ts.u.derived, st, NULL, old_code, - gfc_current_ns)) - error++; - } + /* First check the derived type import status. */ + if (gfc_current_ns->import_state != IMPORT_NOT_SET + && (c->ts.type == BT_DERIVED || c->ts.type == BT_CLASS)) + { + st = gfc_find_symtree (gfc_current_ns->sym_root, +c->ts.u.derived->name); + if (!check_sym_import_status (c->ts.u.derived, st, NULL, old_code, + gfc_current_ns)) + error++; + } const char * var_name = gfc_var_name_for_select_type_temp (orig_expr1); if (c->ts.type == BT_CLASS)
[gcc r16-2322] tree-ssa-structalias / pta: Fix some GNU coding style deviations
https://gcc.gnu.org/g:a33f9d1cb7f239b399e36dfecb2ed9919cab20f9 commit r16-2322-ga33f9d1cb7f239b399e36dfecb2ed9919cab20f9 Author: Filip Kastl Date: Thu Jul 17 14:30:11 2025 +0200 tree-ssa-structalias / pta: Fix some GNU coding style deviations Fix some deviations from GNU coding style in pta files as reported by contrib/check_GNU_style.py. Most of these are "dot, space, space, end of comment". Commiting as obvious. gcc/ChangeLog: * pta-andersen.cc (struct constraint_graph): Fix GNU style. (constraint_equal): Fix GNU style. (set_union_with_increment): Fix GNU style. (insert_into_complex): Fix GNU style. (merge_node_constraints): Fix GNU style. (unify_nodes): Fix GNU style. (do_ds_constraint): Fix GNU style. (scc_info::scc_info): Fix GNU style. (find_indirect_cycles): Fix GNU style. (equiv_class_lookup_or_add): Fix GNU style. (label_visit): Fix GNU style. (dump_pred_graph): Fix GNU style. (perform_var_substitution): Fix GNU style. (eliminate_indirect_cycles): Fix GNU style. (solve_graph): Fix GNU style. (solve_constraints): Fix GNU style. * tree-ssa-structalias.cc (first_vi_for_offset): Fix GNU style. (debug_constraint): Fix GNU style. * tree-ssa-structalias.h (struct constraint_expr): Fix GNU style. (struct variable_info): Fix GNU style. Signed-off-by: Filip Kastl Diff: --- gcc/pta-andersen.cc | 41 + gcc/tree-ssa-structalias.cc | 4 ++-- gcc/tree-ssa-structalias.h | 10 +- 3 files changed, 28 insertions(+), 27 deletions(-) diff --git a/gcc/pta-andersen.cc b/gcc/pta-andersen.cc index e4cc3af14711..0253f056d2ab 100644 --- a/gcc/pta-andersen.cc +++ b/gcc/pta-andersen.cc @@ -39,7 +39,7 @@ using namespace pointer_analysis; -/* Used for predecessor bitmaps. */ +/* Used for predecessor bitmaps. */ static bitmap_obstack predbitmap_obstack; /* Used for per-solver-iteration bitmaps. */ @@ -56,7 +56,7 @@ struct constraint_graph nodes in the variable map. */ unsigned int size; - /* Explicit successors of each node. */ + /* Explicit successors of each node. 
*/ bitmap *succs; /* Implicit predecessors of each node (Used for variable @@ -71,7 +71,7 @@ struct constraint_graph int *indirect_cycles; /* Representative node for a node. rep[a] == a unless the node has - been unified. */ + been unified. */ unsigned int *rep; /* Equivalence class representative for a label. This is used for @@ -330,7 +330,7 @@ constraint_equal (const constraint &a, const constraint &b) && constraint_expr_equal (a.rhs, b.rhs); } -/* Find a constraint LOOKFOR in the sorted constraint vector VEC */ +/* Find a constraint LOOKFOR in the sorted constraint vector VEC. */ static constraint_t constraint_vec_find (vec vec, @@ -430,8 +430,8 @@ solution_set_expand (bitmap set, bitmap *expanded) process. */ static bool -set_union_with_increment (bitmap to, bitmap delta, HOST_WIDE_INT inc, - bitmap *expanded_delta) +set_union_with_increment (bitmap to, bitmap delta, HOST_WIDE_INT inc, + bitmap *expanded_delta) { bool changed = false; bitmap_iterator bi; @@ -510,7 +510,7 @@ insert_into_complex (constraint_graph_t graph, /* Condense two variable nodes into a single variable node, by moving - all associated info from FROM to TO. Returns true if TO node's + all associated info from FROM to TO. Returns true if TO node's constraint set changes after the merge. */ static bool @@ -523,7 +523,7 @@ merge_node_constraints (constraint_graph_t graph, unsigned int to, gcc_checking_assert (find (from) == to); - /* Move all complex constraints from src node into to node */ + /* Move all complex constraints from src node into to node. */ FOR_EACH_VEC_ELT (graph->complex[from], i, c) { /* In complex constraints for node FROM, we may have either @@ -998,7 +998,7 @@ unify_nodes (constraint_graph_t graph, unsigned int to, unsigned int from, bitmap_set_bit (changed, to); } - /* Mark TO as changed if FROM was changed. If TO was already marked + /* Mark TO as changed if FROM was changed. If TO was already marked as changed, decrease the changed count. 
*/ if (update_changed @@ -1211,7 +1211,7 @@ do_ds_constraint (constraint_t c, bitmap delta, bitmap *expanded_delta) t = find (v->id); - if (solve_add_graph_edge (graph, t, rhs)) + if (solve_add_graph_edge (graph, t, rhs)) bitmap_set_bit (changed, t); } @@ -1281,7 +1281,7 @@ scc_info::scc_info (size_t size) : node_mapping[i] = i; } -/* Free an SCC info structure pointed to by SI */
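Most of the changes in this commit are described as "dot, space, space, end of comment", which is hard to see once runs of whitespace have been collapsed. The rule can be sketched as a small Python check. This is a hypothetical illustration and not the implementation of contrib/check_GNU_style.py; `comment_end_ok` is an invented name, and the check only looks at lines that end with `*/`.

```python
def comment_end_ok(line):
    """Return True if LINE does not close a comment, or closes it as '.  */'."""
    stripped = line.rstrip()
    if not stripped.endswith("*/"):
        return True          # not a comment-closing line, nothing to check
    before = stripped[:-2]
    if before.strip() == "":
        # A bare "*/" on its own line; the sentence ended on an earlier line,
        # so this sketch does not try to judge it.
        return True
    # GNU style: sentence-ending period, then exactly two spaces, then "*/".
    return before.endswith(".  ")
```

For example, `/* Used for predecessor bitmaps.  */` passes, while the same comment with a single space before `*/`, or with no trailing period, is flagged.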
[gcc r16-2324] tree-ssa-structalias / pta: Fix *more* GNU coding style deviations
https://gcc.gnu.org/g:fcf22e1764b3c242cf6b240b71a46403c7f112ae commit r16-2324-gfcf22e1764b3c242cf6b240b71a46403c7f112ae Author: Filip Kastl Date: Thu Jul 17 14:52:59 2025 +0200 tree-ssa-structalias / pta: Fix *more* GNU coding style deviations This continues my previous commit, where I fixed some deviations from GNU coding style in pta files. This should fix all the remaining issues that contrib/check_GNU_style.py can detect (excluding false positives). Commiting as obvious. gcc/ChangeLog: * tree-ssa-structalias.cc (lookup_vi_for_tree): Fix GNU style. (process_constraint): Fix GNU style. (get_constraint_for_component_ref): Fix GNU style. (get_constraint_for_1): Fix GNU style. (get_function_part_constraint): Fix GNU style. (handle_lhs_call): Fix GNU style. (find_func_aliases_for_builtin_call): Fix GNU style. (find_func_aliases): Fix GNU style. (find_func_clobbers): Fix GNU style. (struct shared_bitmap_hasher): Fix GNU style. (shared_bitmap_hasher::hash): Fix GNU style. (pt_solution_includes_global): Fix GNU style. (init_base_vars): Fix GNU style. (visit_loadstore): Fix GNU style. (compute_dependence_clique): Fix GNU style. (struct pt_solution): Fix GNU style. (ipa_pta_execute): Fix GNU style. Signed-off-by: Filip Kastl Diff: --- gcc/tree-ssa-structalias.cc | 71 +++-- 1 file changed, 36 insertions(+), 35 deletions(-) diff --git a/gcc/tree-ssa-structalias.cc b/gcc/tree-ssa-structalias.cc index b6af061d16bc..fd22a942c386 100644 --- a/gcc/tree-ssa-structalias.cc +++ b/gcc/tree-ssa-structalias.cc @@ -209,7 +209,7 @@ namespace pointer_analysis { /* Used for points-to sets. */ bitmap_obstack pta_obstack; -/* Used for oldsolution members of variables. */ +/* Used for oldsolution members of variables. */ bitmap_obstack oldpta_obstack; /* Table of variable info structures for constraint variables. @@ -734,7 +734,7 @@ lookup_vi_for_tree (tree t) return *slot; } -/* Return a printable name for DECL */ +/* Return a printable name for DECL. 
*/ static const char * alias_get_name (tree decl) @@ -922,10 +922,10 @@ process_constraint (constraint_t t) if (!get_varinfo (lhs.var)->may_have_pointers) return; - /* This can happen in our IR with things like n->a = *p */ + /* This can happen in our IR with things like n->a = *p. */ if (rhs.type == DEREF && lhs.type == DEREF && rhs.var != anything_id) { - /* Split into tmp = *rhs, *lhs = tmp */ + /* Split into tmp = *rhs, *lhs = tmp. */ struct constraint_expr tmplhs; tmplhs = new_scalar_tmp_constraint_exp ("doubledereftmp", true); process_constraint (new_constraint (tmplhs, rhs)); @@ -933,7 +933,7 @@ process_constraint (constraint_t t) } else if ((rhs.type != SCALAR || rhs.offset != 0) && lhs.type == DEREF) { - /* Split into tmp = &rhs, *lhs = tmp */ + /* Split into tmp = &rhs, *lhs = tmp. */ struct constraint_expr tmplhs; tmplhs = new_scalar_tmp_constraint_exp ("derefaddrtmp", true); process_constraint (new_constraint (tmplhs, rhs)); @@ -1101,7 +1101,7 @@ get_constraint_for_component_ref (tree t, vec *results, tree forzero; /* Some people like to do cute things like take the address of - &0->a.b */ + &0->a.b. */ forzero = t; while (handled_component_p (forzero) || INDIRECT_REF_P (forzero) @@ -1175,7 +1175,7 @@ get_constraint_for_component_ref (tree t, vec *results, { /* In languages like C, you can access one past the end of an array. You aren't allowed to dereference it, so we can -ignore this constraint. When we handle pointer subtraction, +ignore this constraint. When we handle pointer subtraction, we may have to do something cute here. */ if (maybe_lt (poly_uint64 (bitpos), get_varinfo (result.var)->fullsize) @@ -1212,7 +1212,7 @@ get_constraint_for_component_ref (tree t, vec *results, results->safe_push (cexpr); } else if (results->length () == 0) - /* Assert that we found *some* field there. The user couldn't be + /* Assert that we found *some* field there. The user couldn't be accessing *only* padding. 
*/ /* Still the user could access one past the end of an array embedded in a struct resulting in accessing *only* padding. */ @@ -1266,7 +1266,7 @@ get_constraint_for_component_ref (tree t, vec *results, /* Dereference the constraint expression CONS, and return the result. DEREF (ADDRESSOF) = SCALAR DEREF (SCALAR) = DEREF - DEREF (DEREF) = (temp = DEREF1; result = DEREF(temp)) + DEREF (DEREF) = (temp = DEREF1; result = DEREF (temp)) This is needed
[gcc r16-1764] contrib/mklog.py: Fix writing to a global variable
https://gcc.gnu.org/g:77ac2ca0fe69b4464050c293076b0fe8a32acd05 commit r16-1764-g77ac2ca0fe69b4464050c293076b0fe8a32acd05 Author: Filip Kastl Date: Sun Jun 29 10:16:35 2025 +0200 contrib/mklog.py: Fix writing to a global variable The last patch of mklog.py put top-level code into function 'main()'. Because of this, writing to global variable 'root' has to be preceded by explicitly declaring 'root' as global. Otherwise the write only has a local effect. Without this change, the '-d' cmdline flag would be broken. Committed as obvious. contrib/ChangeLog: * mklog.py: In 'main()', specify variable 'root' as global. Signed-off-by: Filip Kastl Diff: --- contrib/mklog.py | 1 + 1 file changed, 1 insertion(+) diff --git a/contrib/mklog.py b/contrib/mklog.py index 26d4156b0340..b841ef0ae97e 100755 --- a/contrib/mklog.py +++ b/contrib/mklog.py @@ -389,6 +389,7 @@ def main(): if args.input == '-': args.input = None if args.directory: +global root root = args.directory data = open(args.input, newline='\n') if args.input else sys.stdin
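The scoping behaviour that this commit message describes is easy to reproduce in isolation. The sketch below is not taken from mklog.py: only the variable name `root` comes from the commit, while the two functions and the path strings are hypothetical.

```python
root = "default-root"


def set_root_locally(directory):
    # Without a 'global' declaration this assignment binds a new local name;
    # the module-level 'root' is left untouched.
    root = directory


def set_root_globally(directory):
    # Declaring the name global makes the assignment rebind the module-level
    # variable, which is what the one-line fix does inside main().
    global root
    root = directory


set_root_locally("/tmp/a")
after_local = root      # still the original value

set_root_globally("/tmp/b")
after_global = root     # now reflects the assignment
```

Reading a module-level variable from inside a function needs no declaration; only assignment does, which is presumably why the problem appeared only after the top-level code had been moved into a function.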
[gcc r16-2728] invoke.texi: Update docs of -fdump-{rtl, tree}--
https://gcc.gnu.org/g:57a97725a5c493bd8cde0b0c5679099b1a23c795 commit r16-2728-g57a97725a5c493bd8cde0b0c5679099b1a23c795 Author: Filip Kastl Date: Mon Aug 4 08:32:39 2025 +0200 invoke.texi: Update docs of -fdump-{rtl,tree}-- This patch changes two things. Firstly, we document -fdump-rtl--graph and other such options under -fdump-tree. At least write a remark about this under -fdump-rtl. Secondly, the documentation incorrectly says that -fdump-tree--graph is not implemented. Change that. gcc/ChangeLog: * doc/invoke.texi: Add remark about -options being documented under -fdump-tree. Remove remark about -graph working only for RTL. Signed-off-by: Filip Kastl Diff: --- gcc/doc/invoke.texi | 17 ++--- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index c1e708beacf3..105a60d849f5 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -20612,18 +20612,22 @@ LTO output files. @opindex fdump-rtl-@var{pass} @item -d@var{letters} @itemx -fdump-rtl-@var{pass} -@itemx -fdump-rtl-@var{pass}=@var{filename} +@itemx -fdump-rtl-@var{pass}-@var{options} +@itemx -fdump-rtl-@var{pass}-@var{options}=@var{filename} Says to make debugging dumps during compilation at times specified by -@var{letters}. This is used for debugging the RTL-based passes of the +@var{letters} when using @option{-d} or by @var{pass} when using +@option{-fdump-rtl}. This is used for debugging the RTL-based passes of the compiler. Some @option{-d@var{letters}} switches have different meaning when @option{-E} is used for preprocessing. @xref{Preprocessor Options}, for information about preprocessor-specific dump options. -Debug dumps can be enabled with a @option{-fdump-rtl} switch or some -@option{-d} option @var{letters}. Here are the possible -letters for use in @var{pass} and @var{letters}, and their meanings: +The @samp{-@var{options}} form allows greater control over the details of the +dump. See @option{-fdump-tree}. 
+ +Here are actual instances of command-line options following these patterns and +their meanings: @table @gcctabopt @@ -21150,8 +21154,7 @@ GraphViz to @file{@var{file}.@var{passid}.@var{pass}.dot}. Each function in the file is pretty-printed as a subgraph, so that GraphViz can render them all in a single plot. -This option currently only works for RTL dumps, and the RTL is always -dumped in slim form. +RTL is always dumped in slim form. @item vops Enable showing virtual operands for every statement. @item lineno
[gcc r16-3238] MAINTAINERS, contrib: Appease check-MAINTAINERS.py (email order)
https://gcc.gnu.org/g:baa5cc8230738df1e738419c9b3ed0af405a5954 commit r16-3238-gbaa5cc8230738df1e738419c9b3ed0af405a5954 Author: Filip Kastl Date: Sun Aug 17 12:56:03 2025 +0200 MAINTAINERS, contrib: Appease check-MAINTAINERS.py (email order) The contrib/check-MAINTAINERS.py script sorts by surname, name, bugzilla handle and email (in this order). Document this. Switch around Andrew Pinski's entries in Contributing under DCO. Pushing as obvious. ChangeLog: * MAINTAINERS: Switch around Andrew Pinski's entries in Contributing under DCO. contrib/ChangeLog: * check-MAINTAINERS.py: Document the way the script sorts entries. Signed-off-by: Filip Kastl Diff: --- MAINTAINERS | 2 +- contrib/check-MAINTAINERS.py | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index 55e33427b916..6b9e4f30732d 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -961,8 +961,8 @@ Immad Mir Gaius Mulley Szabolcs Nagy Mikael Pettersson -Andrew Pinski Andrew Pinski +Andrew Pinski Siddhesh Poyarekar Ramana Radhakrishnan Navid Rahimi diff --git a/contrib/check-MAINTAINERS.py b/contrib/check-MAINTAINERS.py index ba2cdb401298..881c7570ac2d 100755 --- a/contrib/check-MAINTAINERS.py +++ b/contrib/check-MAINTAINERS.py @@ -19,8 +19,8 @@ # the Free Software Foundation, 51 Franklin Street, Fifth Floor, # Boston, MA 02110-1301, USA. -# Check that names in the file are sorted -# alphabetically by surname. +# Check that names in the file are sorted alphabetically by surname, name +# bugzilla handle and email (in this order). import locale import sys
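The ordering documented by this commit (surname, then name, then bugzilla handle, then email) can be expressed as a tuple sort key. The following Python sketch makes several assumptions that are not in the source: entries are already split into `(name, handle, email)` tuples, the surname is taken to be the last word of the name, comparison uses plain lowercasing and ignores locale, and all names and addresses are made up.

```python
def sort_key(entry):
    """Order by surname, given name(s), bugzilla handle, then email."""
    name, handle, email = entry
    # Hypothetical rule: everything before the last space is the given name.
    given, _, surname = name.rpartition(" ")
    return (surname.lower(), given.lower(), handle.lower(), email.lower())


# Made-up entries; two of them differ only in the email address.
entries = [
    ("Ann Baker", "abaker", "b@example.org"),
    ("Bob Adams", "badams", "bob@example.org"),
    ("Ann Baker", "abaker", "a@example.org"),
]

ordered = sorted(entries, key=sort_key)
```

Because tuples compare element by element, two entries with the same name and handle are ordered by email, which matches the situation the MAINTAINERS change addresses.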