Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock

2024-08-08 Thread Oleg Nesterov
On 08/08, Liao, Chang wrote: > > > 在 2024/8/8 18:28, Oleg Nesterov 写道: > > --- x/kernel/events/uprobes.c > > +++ x/kernel/events/uprobes.c > > @@ -2308,9 +2308,10 @@ static void handle_singlestep(struct upr > > utask->state = UTASK_RUNNING; > > xol_free_insn_slot(current); > > > > - spin_

Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock

2024-08-08 Thread Liao, Chang
在 2024/8/8 18:28, Oleg Nesterov 写道: > On 08/08, Liao, Chang wrote: >> >> - pre_ssout() resets the deny signal flag >> >> - uprobe_deny_signal() sets the deny signal flag when TIF_SIGPENDING is >> cleared. >> >> - handle_singlestep() check the deny signal flag and restore >> TIF_SIGPENDIN

Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock

2024-08-08 Thread Oleg Nesterov
On 08/08, Liao, Chang wrote: > > - pre_ssout() resets the deny signal flag > > - uprobe_deny_signal() sets the deny signal flag when TIF_SIGPENDING is > cleared. > > - handle_singlestep() check the deny signal flag and restore TIF_SIGPENDING > if necessary. > > Does this approach look corre

Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock

2024-08-08 Thread Liao, Chang
在 2024/8/7 18:17, Oleg Nesterov 写道: > So. Liao, I am sorry, but I dislike your patch/approach in any case. > > UTASK_SSTEP_DENY_SIGNAL complicates the state machine. And I don't like the > fact > that set_thread_flag(TIF_SIGPENDING) is called twice, from handle_singlestep() > and uprobe_post_s

Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock

2024-08-07 Thread Oleg Nesterov
So. Liao, I am sorry, but I dislike your patch/approach in any case. UTASK_SSTEP_DENY_SIGNAL complicates the state machine. And I don't like the fact that set_thread_flag(TIF_SIGPENDING) is called twice, from handle_singlestep() and uprobe_post_sstep_notifier(), this complicates the logic even mor

Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock

2024-08-06 Thread Oleg Nesterov
On 08/06, Liao, Chang wrote: > > You're absolutely right. handle_signlestep() has chance to handle _DENY_SIGANL > unless it followed by setting TIF_UPROBE in uprobe_deny_signal(). This means > _DENY_SIGNAL is likey replaced during next uprobe single-stepping. > > I believe introducing _DENY_SIGNAL

Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock

2024-08-05 Thread Liao, Chang
在 2024/8/2 17:24, Oleg Nesterov 写道: > On 08/02, Liao, Chang wrote: >> >> >> 在 2024/8/1 22:06, Oleg Nesterov 写道: >>> On 08/01, Liao Chang wrote: @@ -2276,22 +2277,25 @@ static void handle_singlestep(struct uprobe_task *utask, struct pt_regs *regs) int err = 0;

Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock

2024-08-02 Thread Oleg Nesterov
On 08/02, Liao, Chang wrote: > > > 在 2024/8/1 22:06, Oleg Nesterov 写道: > > On 08/01, Liao Chang wrote: > >> > >> @@ -2276,22 +2277,25 @@ static void handle_singlestep(struct uprobe_task > >> *utask, struct pt_regs *regs) > >>int err = 0; > >> > >>uprobe = utask->active_uprobe; > >> - if (

Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock

2024-08-01 Thread Liao, Chang
在 2024/8/1 22:06, Oleg Nesterov 写道: > On 08/01, Liao Chang wrote: >> >> @@ -2276,22 +2277,25 @@ static void handle_singlestep(struct uprobe_task >> *utask, struct pt_regs *regs) >> int err = 0; >> >> uprobe = utask->active_uprobe; >> -if (utask->state == UTASK_SSTEP_ACK) >> +

Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock

2024-08-01 Thread Oleg Nesterov
On 08/01, Liao Chang wrote: > > @@ -2276,22 +2277,25 @@ static void handle_singlestep(struct uprobe_task > *utask, struct pt_regs *regs) > int err = 0; > > uprobe = utask->active_uprobe; > - if (utask->state == UTASK_SSTEP_ACK) > + switch (utask->state) { > + case UTASK_S

[PATCH] uprobes: Improve scalability by reducing the contention on siglock

2024-08-01 Thread Liao Chang
The profiling result of BPF selftest on ARM64 platform reveals the significant contention on the current->sighand->siglock within the handle_singlestep() is the scalability bottleneck. The reason is also very straightforward that all producer threads of benchmark have to contend the spinlock mentio