Commit-ID:  c917e0f259908e75bd2a65877e25f9d90c22c848
Gitweb:     https://git.kernel.org/tip/c917e0f259908e75bd2a65877e25f9d90c22c848
Author:     Song Liu <songliubrav...@fb.com>
AuthorDate: Mon, 12 Mar 2018 09:59:43 -0700
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Tue, 20 Mar 2018 08:58:47 +0100

perf/cgroup: Fix child event counting bug

When a perf_event is attached to a parent cgroup, it should count events
for all of its child cgroups:

   parent_group   <---- perf_event
     \
      - child_group  <---- process(es)

However, in our tests, we found that such a perf_event does not report
reliable results. Here is an example case:

  # create cgroups
  mkdir -p /sys/fs/cgroup/p/c
  # start perf for parent group
  perf stat -e instructions -G "p"

  # on another console, run test process in child cgroup:
  stressapptest -s 2 -M 1000 & echo $! > /sys/fs/cgroup/p/c/cgroup.procs

  # after the test process is done, stopping perf in the first console shows:

       <not counted>      instructions              p

The instructions event should not be reported as "<not counted>", because
the process runs in the child cgroup.

We found this is because perf_event->cgrp and cpuctx->cgrp are not
identical, so perf_event->cgrp is not updated properly.

Fix this by also updating the perf_cgroup time and timestamp of every
ancestor cgroup.

Reported-by: Ephraim Park <ephiep...@fb.com>
Signed-off-by: Song Liu <songliubrav...@fb.com>
Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
Cc: <jo...@redhat.com>
Cc: <kernel-t...@fb.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Jiri Olsa <jo...@redhat.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Stephane Eranian <eran...@google.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Vince Weaver <vincent.wea...@maine.edu>
Link: http://lkml.kernel.org/r/20180312165943.1057894-1-songliubrav...@fb.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 kernel/events/core.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4b838470fac4..709a55b9ad97 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -724,9 +724,15 @@ static inline void __update_cgrp_time(struct perf_cgroup *cgrp)
 
 static inline void update_cgrp_time_from_cpuctx(struct perf_cpu_context *cpuctx)
 {
-       struct perf_cgroup *cgrp_out = cpuctx->cgrp;
-       if (cgrp_out)
-               __update_cgrp_time(cgrp_out);
+       struct perf_cgroup *cgrp = cpuctx->cgrp;
+       struct cgroup_subsys_state *css;
+
+       if (cgrp) {
+               for (css = &cgrp->css; css; css = css->parent) {
+                       cgrp = container_of(css, struct perf_cgroup, css);
+                       __update_cgrp_time(cgrp);
+               }
+       }
 }
 
 static inline void update_cgrp_time_from_event(struct perf_event *event)
@@ -754,6 +760,7 @@ perf_cgroup_set_timestamp(struct task_struct *task,
 {
        struct perf_cgroup *cgrp;
        struct perf_cgroup_info *info;
+       struct cgroup_subsys_state *css;
 
        /*
         * ctx->lock held by caller
@@ -764,8 +771,12 @@ perf_cgroup_set_timestamp(struct task_struct *task,
                return;
 
        cgrp = perf_cgroup_from_task(task, ctx);
-       info = this_cpu_ptr(cgrp->info);
-       info->timestamp = ctx->timestamp;
+
+       for (css = &cgrp->css; css; css = css->parent) {
+               cgrp = container_of(css, struct perf_cgroup, css);
+               info = this_cpu_ptr(cgrp->info);
+               info->timestamp = ctx->timestamp;
+       }
 }
 
 static DEFINE_PER_CPU(struct list_head, cgrp_cpuctx_list);
