On Wed, May 21, 2025 at 10:28 AM Likhitha Korrapati
<likhi...@linux.ibm.com> wrote:
>
> Hi Ian,
>
> On 5/21/25 21:15, Ian Rogers wrote:
> > On Wed, May 21, 2025 at 6:03 AM Likhitha Korrapati
> > <likhi...@linux.ibm.com> wrote:
> >>
> >> Hi Arnaldo,
> >>
> >> On 5/14/25 02:43, Arnaldo Carvalho de Melo wrote:
> >>> On Fri, May 02, 2025 at 01:14:32PM +0530, Mukesh Kumar Chaurasiya wrote:
> >>>> On Fri, Apr 25, 2025 at 02:46:43PM -0300, Arnaldo Carvalho de Melo wrote:
> >>>>> Maybe that max() call in perf_cpu_map__intersect() somehow makes the
> >>>>> compiler happy.
> >>>
> >>>>> And in perf_cpu_map__alloc() all calls seem to validate it.
> >>>
> >>>>> Like:
> >>>
> >>>>> +++ b/tools/lib/perf/cpumap.c
> >>>>> @@ -411,7 +411,7 @@ int perf_cpu_map__merge(struct perf_cpu_map **orig, struct perf_cpu_map *other)
> >>>>>           }
> >>>>>
> >>>>>           tmp_len = __perf_cpu_map__nr(*orig) + __perf_cpu_map__nr(other);
> >>>>> -       tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
> >>>>> +       tmp_cpus = calloc(tmp_len, sizeof(struct perf_cpu));
> >>>>>           if (!tmp_cpus)
> >>>>>                   return -ENOMEM;
> >>>
> >>>>> ⬢ [acme@toolbx perf-tools-next]$
> >>>
> >>>>> And better, do the max size that the compiler is trying to help us
> >>>>> catch?
> >>>
> >>>> Isn't it better to use perf_cpu_map__nr? That should fix this problem.
> >>>
> >>> Maybe, have you tried it?
> >>
> >> I have tried this method and it works.
> >>
> >> --- a/tools/lib/perf/cpumap.c
> >> +++ b/tools/lib/perf/cpumap.c
> >> @@ -410,7 +410,7 @@ int perf_cpu_map__merge(struct perf_cpu_map **orig, struct perf_cpu_map *other)
> >>                   return 0;
> >>           }
> >>
> >> -       tmp_len = max(__perf_cpu_map__nr(*orig), __perf_cpu_map__nr(other));
> >> +       tmp_len = perf_cpu_map__nr(*orig) +  perf_cpu_map__nr(other);
> >>           tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
> >>           if (!tmp_cpus)
> >>                   return -ENOMEM;
> >>
> >> I will send a V2 with this change if this looks good.
> >
> > How is this different from the existing code:
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/cpumap.c?h=perf-tools-next#n423
> > ```
> >          tmp_len = __perf_cpu_map__nr(*orig) + __perf_cpu_map__nr(other);
> >          tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
> >          if (!tmp_cpus)
> >                  return -ENOMEM;
> > ```
> >
> > Thanks,
> > Ian
>
> I gave the wrong diff. Here is the corrected diff.
>
> --- a/tools/lib/perf/cpumap.c
> +++ b/tools/lib/perf/cpumap.c
> @@ -410,7 +410,7 @@ int perf_cpu_map__merge(struct perf_cpu_map **orig, struct perf_cpu_map *other)
>                  return 0;
>          }
>
> -       tmp_len = __perf_cpu_map__nr(*orig) + __perf_cpu_map__nr(other);
> +       tmp_len = perf_cpu_map__nr(*orig) + perf_cpu_map__nr(other);
>          tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
>          if (!tmp_cpus)
>                  return -ENOMEM;
>
>
> I am using perf_cpu_map__nr instead of __perf_cpu_map__nr.

Ok, why is that a fix? The function declarations are near identical
and perf_cpu_map__nr is implemented in terms of __perf_cpu_map__nr:
```
static int __perf_cpu_map__nr(const struct perf_cpu_map *cpus)
{
        return RC_CHK_ACCESS(cpus)->nr;
}
int perf_cpu_map__nr(const struct perf_cpu_map *cpus)
{
        return cpus ? __perf_cpu_map__nr(cpus) : 1;
}
```
My guess is that being static allows all of the code to be analyzed in
the compilation unit, which is what produces the warning/error; your
change just defeats the analysis. The analysis could easily kick in
again with Link Time Optimization. I'd prefer making these `__nr`
functions return `unsigned int` or `size_t` over changes like this.
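
A minimal sketch of that idea, not the libperf code: the `sketch_*`
names and the simplified struct are hypothetical stand-ins (the real
`struct perf_cpu_map` is reference counted and read through
`RC_CHK_ACCESS()`). It only illustrates how an unsigned count keeps a
negative value from ever reaching the allocation size:
```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the libperf types. */
struct sketch_cpu {
	int cpu;
};

struct sketch_cpu_map {
	unsigned int nr;              /* entry count, cannot be negative */
	const struct sketch_cpu *map; /* entries sorted in ascending order */
};

/* An unsigned return type tells the compiler the count is never negative. */
static unsigned int sketch_cpu_map__nr(const struct sketch_cpu_map *cpus)
{
	return cpus->nr;
}

/*
 * Merge two sorted maps into a newly allocated sorted array, dropping
 * duplicates. Returns NULL on allocation failure; *out_nr receives the
 * number of entries written. The caller frees the result.
 */
static struct sketch_cpu *sketch_merge(const struct sketch_cpu_map *a,
				       const struct sketch_cpu_map *b,
				       size_t *out_nr)
{
	/* Widen to size_t before adding so the sum cannot wrap. */
	size_t tmp_len = (size_t)sketch_cpu_map__nr(a) + sketch_cpu_map__nr(b);
	struct sketch_cpu *tmp;
	size_t i = 0, j = 0, k = 0;

	assert(out_nr != NULL);

	/* calloc() checks the count * size multiplication for overflow. */
	tmp = calloc(tmp_len ? tmp_len : 1, sizeof(*tmp));
	if (!tmp)
		return NULL;

	while (i < a->nr && j < b->nr) {
		if (a->map[i].cpu <= b->map[j].cpu) {
			if (a->map[i].cpu == b->map[j].cpu)
				j++; /* same CPU in both maps, keep one copy */
			tmp[k++] = a->map[i++];
		} else {
			tmp[k++] = b->map[j++];
		}
	}
	while (i < a->nr)
		tmp[k++] = a->map[i++];
	while (j < b->nr)
		tmp[k++] = b->map[j++];

	*out_nr = k;
	return tmp;
}
```
With a signed `int` count, a negative value would be converted to a
huge `size_t` in the multiplication, which is presumably the case the
compiler is warning about; whether this shape silences the diagnostic
in the real code is untested here.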

Thanks,
Ian

> Thanks,
> Likhitha.
>
> >
> >> Thanks
> >> Likhitha.
> >>
> >>>
> >>>> One question I have, in perf_cpu_map__nr, the function is returning
> >>>> 1 in case *cpus is NULL. Is it ok to do that? wouldn't it cause problems?
> >>>
> >>> Indeed this better be documented, as by just looking at:
> >>>
> >>> int perf_cpu_map__nr(const struct perf_cpu_map *cpus)
> >>> {
> >>>           return cpus ? __perf_cpu_map__nr(cpus) : 1;
> >>> }
> >>>
> >>> It really doesn't make much sense to say that a NULL map has one entry.
> >>>
> >>> But the next functions are:
> >>>
> >>> bool perf_cpu_map__has_any_cpu_or_is_empty(const struct perf_cpu_map *map)
> >>> {
> >>>           return map ? __perf_cpu_map__cpu(map, 0).cpu == -1 : true;
> >>> }
> >>>
> >>> bool perf_cpu_map__is_any_cpu_or_is_empty(const struct perf_cpu_map *map)
> >>> {
> >>>           if (!map)
> >>>                   return true;
> >>>
> >>>           return __perf_cpu_map__nr(map) == 1 && __perf_cpu_map__cpu(map, 0).cpu == -1;
> >>> }
> >>>
> >>> bool perf_cpu_map__is_empty(const struct perf_cpu_map *map)
> >>> {
> >>>           return map == NULL;
> >>> }
> >>>
> >>> So it seems that a NULL cpu map means "any/all CPU" and a map with just
> >>> one entry would have as its content "-1" that would mean "any/all CPU".
> >>>
> >>> Ian did work on trying to simplify/clarify this, so maybe he can chime
> >>> in :-)
> >>>
> >>> - Arnaldo
> >>
> >
>
