Move the bpf_link_free call into delayed processing so we don't
need to wait for it when releasing the link.

For example, bpf_tracing_link_release can take a considerable
amount of time in bpf_trampoline_put due to the
synchronize_rcu_tasks call.
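
For context, bpf_link_put_deferred is the existing workqueue
callback that performs the actual free; the deferral pattern is
roughly the following (a sketch assuming the current in-tree helper
in kernel/bpf/syscall.c):

  static void bpf_link_put_deferred(struct work_struct *work)
  {
          struct bpf_link *link = container_of(work, struct bpf_link, work);

          /* Runs in workqueue (process) context, so the potentially
           * slow bpf_link_free path (and synchronize_rcu_tasks
           * underneath it) no longer blocks the task releasing
           * the link.
           */
          bpf_link_free(link);
  }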

This speeds up bpftrace release time in the following example:

Before:

 Performance counter stats for './src/bpftrace -ve kfunc:__x64_sys_s*
    { printf("test\n"); } i:ms:10 { printf("exit\n"); exit();}' (5 runs):

     3,290,457,628      cycles:k                                 ( +-  0.27% )
       933,581,973      cycles:u                                 ( +-  0.20% )

             50.25 +- 4.79 seconds time elapsed  ( +-  9.53% )

After:

 Performance counter stats for './src/bpftrace -ve kfunc:__x64_sys_s*
    { printf("test\n"); } i:ms:10 { printf("exit\n"); exit();}' (5 runs):

     2,535,458,767      cycles:k                                 ( +-  0.55% )
       940,046,382      cycles:u                                 ( +-  0.27% )

             33.60 +- 3.27 seconds time elapsed  ( +-  9.73% )

Signed-off-by: Jiri Olsa <jo...@kernel.org>
---
 kernel/bpf/syscall.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 1110ecd7d1f3..61ef29f9177d 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2346,12 +2346,8 @@ void bpf_link_put(struct bpf_link *link)
        if (!atomic64_dec_and_test(&link->refcnt))
                return;
 
-       if (in_atomic()) {
-               INIT_WORK(&link->work, bpf_link_put_deferred);
-               schedule_work(&link->work);
-       } else {
-               bpf_link_free(link);
-       }
+       INIT_WORK(&link->work, bpf_link_put_deferred);
+       schedule_work(&link->work);
 }
 
 static int bpf_link_release(struct inode *inode, struct file *filp)
-- 
2.26.2
