https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105
Bug ID: 99105
Summary: profile streaming scales poorly to projects with many
source files
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: gcov-profile
Assignee: unassigned at gcc dot gnu.org
Reporter: hubicka at gcc dot gnu.org
CC: marxin at gcc dot gnu.org
Target Milestone: ---
Compared to clang, we need significantly more time to train Firefox (25
minutes compared to 7) and to run clang's
make check-clang
which takes 12 hours compared to 27 minutes.
Most of the time is spent in the kernel on IO. I suppose we really should consider
optionally producing per-binary rather than per-source-file profile data dumps
and omitting untrained parts of the program.
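To illustrate the difference in file traffic, here is a toy model of the read-merge-write done at process exit, comparing one profile file per source file with one per binary. It is only a sketch: the on-disk format (raw 64-bit counters), the `.prof` suffix and all function names are hypothetical and are not libgcov's. Omitting untrained parts is not modeled; the per-binary layout is assumed to be fixed across runs.

```python
import os
import struct


def read_counters(path):
    """Return the counters stored in `path` as a list of ints."""
    with open(path, "rb") as f:
        data = f.read()
    return list(struct.unpack(f"<{len(data) // 8}q", data))


def merge_into(path, counters):
    """Add `counters` to the values already in `path` (read, sum, write back)."""
    merged = list(counters)
    if os.path.exists(path):
        old = read_counters(path)
        merged = [a + b for a, b in zip(old, counters)]
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(merged)}q", *merged))


def dump_per_source(outdir, profile):
    """One file per source file; returns the number of files touched."""
    for source, counters in profile.items():
        merge_into(os.path.join(outdir, source + ".prof"), counters)
    return len(profile)


def dump_per_binary(outdir, binary, profile):
    """One file per binary, sources concatenated in sorted order."""
    flat = [c for source in sorted(profile) for c in profile[source]]
    merge_into(os.path.join(outdir, binary + ".prof"), flat)
    return 1
```

In this model every process exit touches as many files as the binary has source files in the first layout, and exactly one file in the second, regardless of how many short-lived processes the testsuite spawns.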
This is perf top output from running the llvm testsuite (first with and second
without kernel symbols). It seems merging of topn is high in the profile now.
Overhead Shared Object Symbol
8.58% libc-2.32.so [.] read
7.24% [kernel] [k] __x86_indirect_thunk_rax
7.15% [kernel] [k] entry_SYSCALL_64
6.43% [kernel] [k] __x64_sys_read
5.70% [kernel] [k] apparmor_file_permission
5.49% [kernel] [k] generic_file_buffered_read
4.45% [kernel] [k] btrfs_file_read_iter
4.10% [kernel] [k] syscall_return_via_sysret
3.31% [kernel] [k] new_sync_read
3.07% libc-2.32.so [.] _IO_file_xsgetn
2.77% [kernel] [k] find_get_entry
2.76% libc-2.32.so [.] _IO_fread
2.60% [kernel] [k] current_time
2.33% [kernel] [k] atime_needs_update
2.18% [kernel] [k] vfs_read
2.11% clang-11 [.] __gcov_merge_topn
2.02% [kernel] [k] pagecache_get_page
1.97% [kernel] [k] entry_SYSCALL_64_after_hwframe
1.89% clang-11 [.] gcov_read_words
1.76% [kernel] [k] __fsnotify_parent
1.67% [kernel] [k] syscall_exit_to_user_mode
1.60% [kernel] [k] ksys_read
1.40% [kernel] [k] security_file_permission
1.30% [kernel] [k] aa_file_perm
1.23% [kernel] [k] syscall_enter_from_user_mode
1.11% [kernel] [k] touch_atime
1.02% [kernel] [k] exit_to_user_mode_prepare
0.99% [kernel] [k] xas_load
0.95% [kernel] [k] xas_start
0.74% [kernel] [k] __fget_light
0.71% [kernel] [k] __fdget_pos
0.69% clang-11 [.] __gcov_read_counter
0.64% [kernel] [k] do_syscall_64
0.58% [kernel] [k] ktime_get_coarse_real_ts64
0.55% [kernel] [k] rw_verify_area
0.50% libc-2.32.so [.] _IO_sgetn
0.50% [kernel] [k] PageHuge
0.45% perf [.] rb_next
0.38% [kernel] [k] iov_iter_init
For a higher level overview, try: perf top --sort comm,dso
Overhead Shared Object Symbol
43.43% libc-2.32.so [.] read
12.00% libc-2.32.so [.] _IO_file_xsgetn
11.80% libc-2.32.so [.] _IO_fread
7.89% clang-11 [.] __gcov_merge_topn
7.28% clang-11 [.] gcov_read_words
2.32% clang-11 [.] __gcov_read_counter
2.28% libc-2.32.so [.] _IO_sgetn
2.08% FileCheck [.] __gcov_merge_topn
1.46% FileCheck [.] gcov_read_words
1.23% perf [.] rb_next
1.08% perf [.] __symbols__insert
0.87% libc-2.32.so [.] _IO_file_read
0.72% clang-11 [.] gcov_do_dump
0.38% FileCheck [.] __gcov_read_counter
0.28% perf [.] rust_demangle_callback
0.25% libc-2.32.so [.] _int_malloc
0.19% clang-11 [.] gcov_write_words
0.18% libc-2.32.so [.] __strchr_avx2
0.18% clang-11 [.] fread@plt
0.18% libc-2.32.so [.] __libc_calloc
0.17% perf [.] dso__load_sym
0.15% perf [.] symbol__new
0.14% perf [.] rb_insert_color
0.11% libc-2.32.so [.] __strlen_avx2
0.10% perf [.] 0x000000000087755b
0.08% libc-2.32.so [.] __memmove_avx_unaligned_erms
0.08% perf [.] evsel__parse_sample
0.07% libc-2.32.so [.] sysmalloc
0.07% perf [.] symbols__fixup_end
0.07% perf [.] eprintf
0.07% libc-2.32.so [.] __memset_avx2_unaligned_erms
0.07% libc-2.32.so [.] cfree@GLIBC_2.2.5
0.06% perf [.] bfd_demangle
0.06% perf [.] rust_demangle
0.05% perf [.] cplus_demangle
0.05% libc-2.32.so [.] _int_free
0.05% libpthread-2.32.so [.] __pthread_mutex_init
0.04% perf [.] cplus_demangle_v3
0.04% FileCheck [.] fread@plt
For a higher level overview, try: perf top --sort comm,dso
