gcc/ChangeLog:
* builtins.cc (expand_builtin_fork_or_exec): Check
profile_condition_flag.
* collect2.cc (main): Add -fno-profile-conditions to OBSTACK.
* common.opt: Add new options -fprofile-conditions and
* doc/gcov.texi: Add --conditions documentation.
* doc/invoke.texi: Add -fprofile-conditions documentation.
* gcc.cc: Link gcov on -fprofile-conditions.
* gcov-counter.def (GCOV_COUNTER_CONDS): New.
* gcov-dump.cc (tag_conditions): New.
* gcov-io.h (GCOV_TAG_CONDS): New.
(GCOV_TAG_CONDS_LENGTH): Likewise.
(GCOV_TAG_CONDS_NUM): Likewise.
* gcov.cc (class condition_info): New.
(condition_info::condition_info): New.
(condition_info::popcount): New.
(struct coverage_info): New.
(add_condition_counts): New.
(output_conditions): New.
(print_usage): Add -g, --conditions.
(process_args): Likewise.
(output_intermediate_json_line): Output conditions.
(read_graph_file): Read conditions counters.
(read_count_file): Read conditions counters.
(file_summary): Print conditions.
(accumulate_line_info): Accumulate conditions.
(output_line_details): Print conditions.
* ipa-inline.cc (can_early_inline_edge_p): Check
profile_condition_flag.
* ipa-split.cc (pass_split_functions::gate): Likewise.
* passes.cc (finish_optimization_passes): Likewise.
* profile.cc (find_conditions): New declaration.
(cov_length): Likewise.
(cov_blocks): Likewise.
(cov_masks): Likewise.
(cov_free): Likewise.
(instrument_decisions): New.
(read_thunk_profile): Control output to file.
(branch_prob): Call find_conditions, instrument_decisions.
(init_branch_prob): Add total_num_conds.
(end_branch_prob): Likewise.
* tree-profile.cc (struct conds_ctx): New.
(CONDITIONS_MAX_TERMS): New.
(EDGE_CONDITION): New.
(cmp_index_map): New.
(index_of): New.
(block_conditional_p): New.
(edge_conditional_p): New.
(single): New.
(single_edge): New.
(contract_edge): New.
(contract_edge_up): New.
(ancestors_of): New.
(struct outcomes): New.
(conditional_succs): New.
(condition_index): New.
(masking_vectors): New.
(cond_reachable_from): New.
(neighborhood): New.
(isolate_expression): New.
(emit_bitwise_op): New.
(make_index_map_visit): New.
(make_index_map): New.
(collect_conditions): New.
(yes): New.
(struct condcov): New.
(cov_length): New.
(cov_blocks): New.
(cov_masks): New.
(cov_free): New.
(find_conditions): New.
(instrument_decisions): New.
(tree_profiling): Check profile_condition_flag.
(pass_ipa_tree_profile::gate): Likewise.
gcc/testsuite/ChangeLog:
* lib/gcov.exp: Add condition coverage test function.
* g++.dg/gcov/gcov-18.C: New test.
* gcc.misc-tests/gcov-19.c: New test.
* gcc.misc-tests/gcov-20.c: New test.
* gcc.misc-tests/gcov-21.c: New test.
---
gcc/builtins.cc | 2 +-
gcc/collect2.cc | 7 +-
gcc/common.opt | 8 +
gcc/doc/gcov.texi | 37 +
gcc/doc/invoke.texi | 19 +
gcc/gcc.cc | 4 +-
gcc/gcov-counter.def | 3 +
gcc/gcov-dump.cc | 24 +
gcc/gcov-io.h | 3 +
gcc/gcov.cc | 197 +++-
gcc/ipa-inline.cc | 2 +-
gcc/ipa-split.cc | 3 +-
gcc/passes.cc | 3 +-
gcc/profile.cc | 84 +-
gcc/testsuite/g++.dg/gcov/gcov-18.C | 234 +++++
gcc/testsuite/gcc.misc-tests/gcov-19.c | 1249 ++++++++++++++++++++++++
gcc/testsuite/gcc.misc-tests/gcov-20.c | 22 +
gcc/testsuite/gcc.misc-tests/gcov-21.c | 16 +
gcc/testsuite/lib/gcov.exp | 191 +++-
gcc/tree-profile.cc | 1048 +++++++++++++++++++-
libgcc/libgcov-merge.c | 5 +
21 files changed, 3135 insertions(+), 26 deletions(-)
create mode 100644 gcc/testsuite/g++.dg/gcov/gcov-18.C
create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-19.c
create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-20.c
create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-21.c
@@ -392,6 +394,28 @@ tag_arcs (const char *filename ATTRIBUTE_UNUSED,
}
}
+static void
+tag_conditions (const char *filename ATTRIBUTE_UNUSED,
+ unsigned tag ATTRIBUTE_UNUSED, int length ATTRIBUTE_UNUSED,
+ unsigned depth)
There should be block comments before functions ...
@@ -134,6 +135,28 @@ public:
vector<unsigned> lines;
};
+class condition_info
... and data structures.
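For example (wording just a sketch):

  /* Book-keeping for a single condition (decision): the number of terms
     it contains and the term outcomes observed at run time.  */
  class condition_info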
+/* Output conditions (modified condition/decision coverage) */
+
+static int flag_conditions = 0;
This is really a bool. gcov has not been updated in a while; I think
you can simply convert the other flag_* variables to bools as well.
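A minimal sketch of that conversion for the flag quoted above:

  /* Output conditions (modified condition/decision coverage).  */
  static bool flag_conditions = false;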
diff --git a/gcc/ipa-split.cc b/gcc/ipa-split.cc
index 6730f4f9d0e..276ead617c9 100644
--- a/gcc/ipa-split.cc
+++ b/gcc/ipa-split.cc
@@ -1930,7 +1930,8 @@ pass_split_functions::gate (function *)
  /* When doing profile feedback, we want to execute the pass after profiling
     is read.  So disable one in early optimization. */
return (flag_partial_inlining
- && !profile_arc_flag && !flag_branch_probabilities);
+ && !profile_arc_flag && !flag_branch_probabilities
+ && !profile_condition_flag);
This should be unnecessary - profile_condition is not used for profile
feedback.
}
} // anon namespace
diff --git a/gcc/passes.cc b/gcc/passes.cc
index 6f894a41d22..02194fe286f 100644
--- a/gcc/passes.cc
+++ b/gcc/passes.cc
@@ -352,7 +352,8 @@ finish_optimization_passes (void)
gcc::dump_manager *dumps = m_ctxt->get_dumps ();
timevar_push (TV_DUMP);
-  if (profile_arc_flag || flag_test_coverage || flag_branch_probabilities)
+ if (profile_arc_flag || profile_condition_flag || flag_test_coverage
+ || flag_branch_probabilities)
{
dumps->dump_start (pass_profile_1->static_pass_number, NULL);
end_branch_prob ();
I think this whole code is unnecessary since branch probability is now an
IPA pass and runs on the whole program at once, so the stats done by
end_branch_prob can be output from the pass itself.
(Originally the pass was done as part of the optimization queue and we
needed a place to output the stats at the end of compilation.)
@@ -1397,10 +1415,18 @@ branch_prob (bool thunk)
/* Write the data from which gcov can reconstruct the basic block
graph and function line numbers (the gcno file). */
+ output_to_file = false;
if (coverage_begin_function (lineno_checksum, cfg_checksum))
{
gcov_position_t offset;
+      /* The condition coverage needs a deeper analysis to identify expressions
+       * of conditions, which means it is not yet ready to write to the gcno
+       * file. It will write its entries later, but needs to know if it do it
+       * in the first place, which is controlled by the return value of
+       * coverage_begin_function. */
No * on every line.
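i.e. the usual GNU style, reflowing the comment above:

  /* The condition coverage needs a deeper analysis to identify expressions
     of conditions, which means it is not yet ready to write to the gcno
     file.  ...  */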
+ output_to_file = true;
+
/* Basic block flags */
offset = gcov_write_tag (GCOV_TAG_BLOCKS);
gcov_write_unsigned (n_basic_blocks_for_fn (cfun));
+ if (coverage_counter_alloc (GCOV_COUNTER_CONDS, 2 * nconds))
+ {
+	  /* Add two extra variables to the function for the local
+	     accumulators, which are zero'd on the entry of a new conditional.
+	     The local accumulators are shared between decisions in order to
+	     use less stack space. */
Shouldn't we be able to allocate the stack space?
+ tree accu[2] = {
+ build_decl (UNKNOWN_LOCATION, VAR_DECL,
+ get_identifier ("__accu_t"), get_gcov_type ()),
+ build_decl (UNKNOWN_LOCATION, VAR_DECL,
+ get_identifier ("__accu_f"), get_gcov_type ()),
Can't those be just SSA names?
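A hedged sketch, assuming make_temp_ssa_name from tree-ssanames.h:

  tree accu[2] = {
    make_temp_ssa_name (get_gcov_type (), NULL, "__accu_t"),
    make_temp_ssa_name (get_gcov_type (), NULL, "__accu_f"),
  };

though each later update of an accumulator would then have to define a
fresh SSA version rather than reuse the decl, so it is not a drop-in
replacement.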
+ };
+
+ gcov_position_t offset {};
+ if (output_to_file)
+ offset = gcov_write_tag (GCOV_TAG_CONDS);
+
+ for (unsigned i = 0; i < nconds; ++i)
+ {
+ array_slice<basic_block> expr = cov_blocks (cov, i);
+ array_slice<gcov_type_unsigned> masks = cov_masks (cov, i);
+ gcc_assert (expr.is_valid ());
+ gcc_assert (masks.is_valid ());
+
+	  int terms = instrument_decisions (expr, i, accu, masks.begin ());
+ if (output_to_file)
+ {
+ gcov_write_unsigned (expr.front ()->index);
+ gcov_write_unsigned (terms);
+ }
+ }
+ if (output_to_file)
+ gcov_write_length (offset);
+ }
+ cov_free (cov);
+ }
+
/* For each edge not on the spanning tree, add counting code. */
if (profile_arc_flag
&& coverage_counter_alloc (GCOV_COUNTER_ARCS, num_instrumented))
{
unsigned n_instrumented;
- gimple_init_gcov_profiler ();
-
n_instrumented = instrument_edges (el);
gcc_assert (n_instrumented == num_instrumented);
if (flag_profile_values)
instrument_values (values);
-
- /* Commit changes done by instrumentation. */
- gsi_commit_edge_inserts ();
}
free_aux_for_edges ();
values.release ();
free_edge_list (el);
+ /* Commit changes done by instrumentation. */
+ gsi_commit_edge_inserts ();
+
coverage_end_function (lineno_checksum, cfg_checksum);
if (flag_branch_probabilities
&& (profile_status_for_fn (cfun) == PROFILE_READ))
@@ -1669,6 +1740,7 @@ init_branch_prob (void)
total_num_passes = 0;
total_num_times_called = 0;
total_num_branches = 0;
+ total_num_conds = 0;
for (i = 0; i < 20; i++)
total_hist_br_prob[i] = 0;
}
@@ -1708,5 +1780,7 @@ end_branch_prob (void)
(total_hist_br_prob[i] + total_hist_br_prob[19-i]) * 100
/ total_num_branches, 5*i, 5*i+5);
}
+ fprintf (dump_file, "Total number of conditions: %d\n",
+ total_num_conds);
}
}
diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index da300d5f9e8..c8b917afb9a 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -58,6 +58,8 @@ along with GCC; see the file COPYING3. If not see
#include "alloc-pool.h"
#include "symbol-summary.h"
#include "symtab-thunks.h"
+#include "cfganal.h"
+#include "cfgloop.h"
static GTY(()) tree gcov_type_node;
static GTY(()) tree tree_interval_profiler_fn;
@@ -73,6 +75,1046 @@ static GTY(()) tree ic_tuple_var;
static GTY(()) tree ic_tuple_counters_field;
static GTY(()) tree ic_tuple_callee_field;
+namespace
+{
Maybe add some overall comment on what this code is doing (which you
describe in the email). Also it may make sense to place it in a new
source file: it is long enough.
+/* Some context and reused instances between function calls. Large embedded
+   buffers are used to up-front request enough memory for most programs and
+   merge them into a single allocation at the cost of using more memory in the
+   average case. Some numbers from linux v5.13 which is assumed to be a
+   reasonably diverse code base: 75% of the functions in linux have less than
+   16 nodes in the CFG and approx 2.5% have more than 64 nodes. The functions
+   that go beyond a few dozen nodes tend to be very large (>100) and so 64
+   seems like a good balance.
+
+   This is really just a performance balance of the cost of allocation and
+   wasted memory. */
+struct conds_ctx
+{
+    /* Bitmap of the processed blocks. Bit n set means basic_block->index has
+       been processed either explicitly or as a part of an expression. */
+ auto_sbitmap marks;
+
+    /* This is both a reusable shared allocation which is also used to return
+       single expressions, which means it for most code should only hold a
+       couple of elements. */
+ auto_vec<basic_block, 32> blocks;
+
+    /* Map from basic_block->index to an ordering so that for a single
+       expression (a || b && c) => index_map[a] < index_map[b] < index_map[c].
+       The values do not have to be consecutive and can be interleaved by
+       values from other expressions, so comparisons only make sense for blocks
+       that belong to the same expression. */
+ auto_vec<int, 64> index_map;
+
+    /* Pre-allocate bitmaps and vectors for per-function book keeping. This is
+       pure instance reuse and the bitmaps carry no data between function
+       calls. */
+ auto_vec<basic_block, 64> B1;
+ auto_vec<basic_block, 64> B2;
+ auto_sbitmap G1;
+ auto_sbitmap G2;
+ auto_sbitmap G3;
+
+ explicit conds_ctx (unsigned size) noexcept (true) : marks (size),
+ G1 (size), G2 (size), G3 (size)
+ {
+ bitmap_clear (marks);
+ }
+
+    /* Mark a node as processed so nodes are not processed twice for example
+       in loops, gotos. */
+ void mark (const basic_block b) noexcept (true)
+ {
+ gcc_assert (!bitmap_bit_p (marks, b->index));
+ bitmap_set_bit (marks, b->index);
+ }
+
+ /* Mark nodes as processed so they are not processed twice. */
+ void mark (const vec<basic_block>& bs) noexcept (true)
+ {
+ for (const basic_block b : bs)
+ mark (b);
+ }
+
+    /* Check if all nodes are marked. A successful run should visit & mark
+       every reachable node exactly once. */
+    bool all_marked (const vec<basic_block>& reachable) const noexcept (true)
+ {
+ for (const basic_block b : reachable)
+ if (!bitmap_bit_p (marks, b->index))
+ return false;
+ return true;
+ }
+};
+
+/* Only instrument terms with fewer than number of bits in a (wide) gcov
+   integer, which is probably 64. The algorithm itself does not impose this
+   limitation, but it makes for a simpler implementation.
+
+   * Allocating the output data structure (coverage_counter_alloc ()) can
+     assume pairs of gcov_type_unsigned and not use a separate length field.
+   * A pair gcov_type_unsigned can be used as accumulators.
+   * Updating accumulators is can use the bitwise operations |=, &= and not
+     custom operators that work for arbitrary-sized bit-sets.
+
+   Most real-world code should be unaffected by this, but it is possible
+   (especially for generated code) to exceed this limit. */
+#define CONDITIONS_MAX_TERMS (sizeof (gcov_type_unsigned) * BITS_PER_UNIT)
You also need to consider the target's number of bits in the gcov type:
targets can change the 64-bit default to something else. There is
gcov_type_node, from which you can get the number of bits.
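Something like this untested sketch:

  #define CONDITIONS_MAX_TERMS (TYPE_PRECISION (gcov_type_node))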
+
+/* Special cases of the single_*_p and single_*_edge functions in basic-block.h
+   that don't consider exception handling or other complex edges. This helps
+   create a view of the CFG with only normal edges - if a basic block has both
+   an outgoing fallthrough and exceptional edge [1], it should be considered a
+   single-successor.
+
+   [1] if this is not possible, these functions can be removed and replaced by
+   their basic-block.h cousins. */
This should be achievable with -fnon-call-exceptions. For example, a
division can throw an exception then.
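For instance (a hypothetical test case): with -fexceptions
-fnon-call-exceptions the division below may trap, so its block gets an
exceptional successor in addition to the normal fallthrough edge:

  int f (int a, int b) { return a / b && b > 0; }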
+bool
+single (const vec<edge, va_gc> *edges)
Maybe single_p to keep naming similar to basic-block?
+{
+ int n = EDGE_COUNT (edges);
+ if (n == 0)
+ return false;
+
+ for (edge e : edges)
+ if (e->flags & EDGE_COMPLEX)
+ n -= 1;
+
+ return n == 1;
+}
+
+/* Get the single, non-complex edge. Behavior is undefined edges have more
+   than 1 non-complex edges. */
+edge
+single_edge (const vec<edge, va_gc> *edges)
+{
Perhaps, to keep behaviour consistent with basic-block.h, you want to add
gcc_checking_assert (single_p (edges));
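Something like the sketch below; the body is reconstructed purely for
illustration and assumes the single_p rename:

  edge
  single_edge (const vec<edge, va_gc> *edges)
  {
    gcc_checking_assert (single_p (edges));
    for (edge e : edges)
      if (!(e->flags & EDGE_COMPLEX))
        return e;
    return NULL;
  }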
+/* Find the set {ancestors (p) intersect G} where ancestors is the recursive
+   set of predecessors for p. Limiting to the ancestors that are also in G
+   (see cond_reachable_from) and by q is an optimization as ancestors outside G
+   have no effect when isolating expressions.
+
+   dfs_enumerate_from () does not work as the filter function needs edge
+   information and dfs_enumerate_from () only considers blocks. */
+void
+ancestors_of (basic_block p, basic_block q, const sbitmap G, sbitmap ancestors)
Coding style does not really like using uppercase letters for variables.
I see it was a set in the paper, but one can give it a bit more
descriptive name in the source code.
+{
+ if (!bitmap_bit_p (G, p->index))
+ return;
+
+ bitmap_set_bit (ancestors, p->index);
+ bitmap_set_bit (ancestors, q->index);
+ if (p == q)
+ return;
+
+ auto_vec<basic_block, 16> stack;
+ stack.safe_push (p);
+
+ while (!stack.is_empty ())
+ {
+ basic_block b = stack.pop ();
+ if (single (b->preds))
+ {
+ edge e = single_edge (b->preds);
+ e = contract_edge_up (e);
+ b = e->dest;
+ }
So this walks the chain of single-pred edges until something non-single
is found, and only those are added to the ancestors bitmap? Why are the
single_edge basic blocks not inserted into the ancestors bitmap?
+
+ for (edge e : b->preds)
+ {
+ basic_block src = e->src;
+ if (bitmap_bit_p (ancestors, e->src->index))
+ continue;
+ if (!bitmap_bit_p (G, e->src->index))
+ continue;
+ bitmap_set_bit (ancestors, src->index);
bitmap_set_bit returns a boolean value saying whether the bit was newly
set or was previously 1, so you can avoid the bitmap_bit_p above.
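Something like this sketch (the safe_push is assumed from the elided
rest of the loop):

  for (edge e : b->preds)
    {
      basic_block src = e->src;
      if (!bitmap_bit_p (G, src->index))
        continue;
      if (bitmap_set_bit (ancestors, src->index))
        stack.safe_push (src);
    }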
diff --git a/libgcc/libgcov-merge.c b/libgcc/libgcov-merge.c
index 5d6e17d1483..eed3556373b 100644
--- a/libgcc/libgcov-merge.c
+++ b/libgcc/libgcov-merge.c
@@ -33,6 +33,11 @@ void __gcov_merge_add (gcov_type *counters __attribute__ ((unused)),
 		       unsigned n_counters __attribute__ ((unused))) {}
#endif
+#ifdef L_gcov_merge_ior
+void __gcov_merge_ior (gcov_type *counters __attribute__ ((unused)),
+ unsigned n_counters __attribute__ ((unused))) {}
+#endif
Why is this necessary?
The profiling part looks good to me. I am worried about the pattern
matching of gimple to recognize conditionals, especially after the early
optimizations have run. Does it work reasonably with -O2?
Perhaps it would make sense to schedule this kind of profiling early,
shortly after build_ssa, which would make it incompatible with profile
feedback, but I think that would be OK.
Richi, can you please look at the gimple matching part?
Honza