On Fri, Apr 25, 2025 at 08:19:02PM +0530, Athira Rajeev wrote: > > On 14 Apr 2025, at 7:08 AM, Madhavan Srinivasan <ma...@linux.ibm.com> wrote: > > On 4/7/25 5:38 PM, Venkat Rao Bagalkote wrote: > >> On 07/04/25 12:10 am, Athira Rajeev wrote: > >>>> On 6 Apr 2025, at 10:04 PM, Likhitha Korrapati <likhi...@linux.ibm.com> > >>>> wrote:
> >>>> perf build break observed when using gcc 13-3 (FC39 ppc64le) > >>>> with the following error. > >>>> cpumap.c: In function 'perf_cpu_map__merge': > >>>> cpumap.c:414:20: error: argument 1 range [18446744069414584320, > >>>> 18446744073709551614] exceeds maximum object size 9223372036854775807 > >>>> [-Werror=alloc-size-larger-than=] > >>>> 414 | tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu)); > >>>> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > >>>> In file included from cpumap.c:4: > >>>> /usr/include/stdlib.h:672:14: note: in a call to allocation function > >>>> 'malloc' declared here > >>>> 672 | extern void *malloc (size_t __size) __THROW __attribute_malloc__ > >>>> | ^~~~~~ > >>>> cc1: all warnings being treated as errors > >>>> Error happens to be only in gcc13-3 and not in latest gcc 14. > >>>> Even though git-bisect pointed bad commit as: > >>>> 'commit f5b07010c13c ("libperf: Don't remove -g when EXTRA_CFLAGS are > >>>> used")', > >>>> issue is with tmp_len being "int". It holds number of cpus and making > >>>> it "unsigned int" fixes the issues. > >>>> After the fix: > >>>> CC util/pmu-flex.o > >>>> CC util/expr-flex.o > >>>> LD util/perf-util-in.o > >>>> LD perf-util-in.o > >>>> AR libperf-util.a > >>>> LINK perf > >>>> GEN python/perf.cpython-312-powerpc64le-linux-gnu.so > >>>> Signed-off-by: Likhitha Korrapati <likhi...@linux.ibm.com> > >>> Looks good to me > >>> Reviewed-by: Athira Rajeev <atraj...@linux.ibm.com> > >> Tested this patch on perf-tools-next repo, and this patch fixes the issue. > >> Tested-by: Venkat Rao Bagalkote <venka...@linux.ibm.com> > > Arnaldo, Namhyung, > > can you consider pulling this fix? since it is breaking the build in > > gcc13-3 or > > if you have any comments do let us know. This isn't the only place in that file where this pattern exists: ⬢ [acme@toolbx perf-tools-next]$ grep malloc tools/lib/perf/cpumap.c cpus = malloc(sizeof(*cpus) + sizeof(struct perf_cpu) * nr_cpus); tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu)); tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu)); ⬢ [acme@toolbx perf-tools-next]$ struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus) { RC_STRUCT(perf_cpu_map) *cpus; struct perf_cpu_map *result; if (nr_cpus == 0) return NULL; cpus = malloc(sizeof(*cpus) + sizeof(struct perf_cpu) * nr_cpus); int perf_cpu_map__merge(struct perf_cpu_map **orig, struct perf_cpu_map *other) { struct perf_cpu *tmp_cpus; int tmp_len; int i, j, k; struct perf_cpu_map *merged; if (perf_cpu_map__is_subset(*orig, other)) return 0; if (perf_cpu_map__is_subset(other, *orig)) { perf_cpu_map__put(*orig); *orig = perf_cpu_map__get(other); return 0; } tmp_len = __perf_cpu_map__nr(*orig) + __perf_cpu_map__nr(other); tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu)); struct perf_cpu_map *perf_cpu_map__intersect(struct perf_cpu_map *orig, struct perf_cpu_map *other) { struct perf_cpu *tmp_cpus; int tmp_len; int i, j, k; struct perf_cpu_map *merged = NULL; if (perf_cpu_map__is_subset(other, orig)) return perf_cpu_map__get(orig); if (perf_cpu_map__is_subset(orig, other)) return perf_cpu_map__get(other); tmp_len = max(__perf_cpu_map__nr(orig), __perf_cpu_map__nr(other)); tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu)); I'm trying to figure out why its only in perf_cpu_map__merge() that this triggers :-\ Maybe that max() call in perf_cpu_map__intersect() somehow makes the compiler happy. And in perf_cpu_map__alloc() all calls seems to validate it. But wouldn't turning this into a calloc() be better? Like: diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c index 4454a5987570cfbc..99d21618a252ac0e 100644 --- a/tools/lib/perf/cpumap.c +++ b/tools/lib/perf/cpumap.c @@ -411,7 +411,7 @@ int perf_cpu_map__merge(struct perf_cpu_map **orig, struct perf_cpu_map *other) } tmp_len = __perf_cpu_map__nr(*orig) + __perf_cpu_map__nr(other); - tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu)); + tmp_cpus = calloc(tmp_len, sizeof(struct perf_cpu)); if (!tmp_cpus) return -ENOMEM; ⬢ [acme@toolbx perf-tools-next]$ And better, do the max size that the compiler is trying to help us catch? - Arnaldo