Since clearing a bit in thread_info is an atomic operation, the spinlock
is redundant and can be removed. Reducing lock contention is good for
performance.
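Concretely, the change boils down to dropping sighand->siglock around the
atomic bit clear in uprobe_deny_signal(). A reconstructed sketch of the hunk
(context lines are from memory, not copied from the posted diff):

	if (task_sigpending(t)) {
-		spin_lock_irq(&t->sighand->siglock);
		clear_tsk_thread_flag(t, TIF_SIGPENDING);
-		spin_unlock_irq(&t->sighand->siglock);
		...
	}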
Signed-off-by: Liao Chang
---
kernel/events/uprobes.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/kernel/events/uprobes.c b/kernel
f which are from 4.5M/s to 6.4M/s and 3.3M/s to 5.1M/s individually.
[1] https://lore.kernel.org/all/20240731214256.3588718-1-and...@kernel.org
[2] https://lore.kernel.org/all/20240727094405.1362496-1-liaocha...@huawei.com
Signed-off-by: Liao Chang
---
include/linux/uprobes.h | 1 +
ke
vaddr, &is_swbp);
> if (!uprobe) {
> if (is_swbp > 0) {
> /* No matching uprobe; signal SIGTRAP. */
> @@ -2223,6 +2239,7 @@ static void handle_swbp(struct pt_regs *regs)
> */
> instruction_pointer_set(regs, bp_vaddr);
> }
> + srcu_read_unlock(&uprobes_srcu, srcu_idx);
> return;
> }
>
> @@ -2258,12 +2275,12 @@ static void handle_swbp(struct pt_regs *regs)
> if (arch_uprobe_skip_sstep(&uprobe->arch, regs))
> goto out;
>
> - if (!pre_ssout(uprobe, regs, bp_vaddr))
> - return;
> + if (pre_ssout(uprobe, regs, bp_vaddr))
> + goto out;
>
Regardless of what pre_ssout() returns, it always reaches the label 'out', so the
if block is unnecessary (see the sketch after the quoted hunk below).
> - /* arch_uprobe_skip_sstep() succeeded, or restart if can't singlestep */
> out:
> - put_uprobe(uprobe);
> + /* arch_uprobe_skip_sstep() succeeded, or restart if can't singlestep */
> + srcu_read_unlock(&uprobes_srcu, srcu_idx);
> }
>
> /*
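To illustrate the comment about pre_ssout() above: since both outcomes fall
through to 'out', the tail of handle_swbp() could be written without the
branch at all. A sketch based only on the hunk quoted above (not a tested
change):

	if (arch_uprobe_skip_sstep(&uprobe->arch, regs))
		goto out;

	pre_ssout(uprobe, regs, bp_vaddr);
out:
	/* arch_uprobe_skip_sstep() succeeded, or restart if can't singlestep */
	srcu_read_unlock(&uprobes_srcu, srcu_idx);
}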
--
BR
Liao, Chang
On 2024/8/2 0:49, Andrii Nakryiko wrote:
> On Thu, Aug 1, 2024 at 5:23 AM Liao, Chang wrote:
>>
>>
>>
>> On 2024/8/1 5:42, Andrii Nakryiko wrote:
>>> To avoid unnecessarily taking a (brief) refcount on uprobe during
>>> breakpoint handling in handle_swbp
On 2024/8/1 22:06, Oleg Nesterov wrote:
> On 08/01, Liao Chang wrote:
>>
>> @@ -2276,22 +2277,25 @@ static void handle_singlestep(struct uprobe_task *utask, struct pt_regs *regs)
>> int err = 0;
>>
>> uprobe = utask->active_uprobe;
&
nt_enable(struct trace_event_call *call,
> diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> index 73a6b041bcce..928c73cde32e 100644
> --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> @@ -478,7 +478,8 @@ static void testmod_unregister_uprobe(void)
> mutex_lock(&testmod_uprobe_mutex);
>
> if (uprobe.uprobe) {
> - uprobe_unregister(uprobe.uprobe, &uprobe.consumer);
> + uprobe_unregister_nosync(uprobe.uprobe, &uprobe.consumer);
> + uprobe_unregister_sync();
> uprobe.offset = 0;
> uprobe.uprobe = NULL;
> }
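As I understand it, the point of the nosync/sync split is that a caller
tearing down several uprobes can pay for a single synchronization at the end.
A hypothetical sketch (the probes[] array and nr_probes count are made up for
illustration; only the two API calls come from the diff above):

	int i;

	for (i = 0; i < nr_probes; i++)
		uprobe_unregister_nosync(probes[i].uprobe, &probes[i].consumer);

	/* one grace-period wait covering all unregistrations above */
	uprobe_unregister_sync();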
--
BR
Liao, Chang
On 2024/8/6 4:01, Andrii Nakryiko wrote:
> On Fri, Aug 2, 2024 at 8:05 AM Andrii Nakryiko
> wrote:
>>
>> On Thu, Aug 1, 2024 at 7:41 PM Liao, Chang wrote:
>>>
>>>
>>>
>>> On 2024/8/1 5:42, Andrii Nakryiko wrote:
>>>> From:
On 2024/8/2 17:24, Oleg Nesterov wrote:
> On 08/02, Liao, Chang wrote:
>>
>>
>> On 2024/8/1 22:06, Oleg Nesterov wrote:
>>> On 08/01, Liao Chang wrote:
>>>>
>>>> @@ -2276,22 +2277,25 @@ static void handle_singlestep(struct uprobe_task
- pre_ssout() resets the deny signal flag.
- uprobe_deny_signal() sets the deny signal flag when TIF_SIGPENDING is
cleared.
- handle_singlestep() checks the deny signal flag and restores TIF_SIGPENDING
if necessary (see the sketch below).
Does this approach look correct to you, or do you have any other way to
implement the "flag"?
Thanks.
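To spell out the idea, a rough sketch of the three pieces above; the field
name 'signal_denied' and the exact call sites are my assumptions, not final
code:

	struct uprobe_task {
		...
		bool	signal_denied;	/* TIF_SIGPENDING was cleared during single-step */
	};

	/* pre_ssout(): start with a clean slate */
	utask->signal_denied = false;

	/* uprobe_deny_signal(): remember that we swallowed the pending signal */
	if (task_sigpending(t)) {
		clear_tsk_thread_flag(t, TIF_SIGPENDING);
		utask->signal_denied = true;
	}

	/* handle_singlestep(): give the signal back once the step is done */
	if (utask->signal_denied) {
		set_thread_flag(TIF_SIGPENDING);
		utask->signal_denied = false;
	}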
>
> Oleg.
>
> On 08/06, Oleg Nestero
e.
>
[...]
>>
>> (To clarify. In fact I think that a new TIF_ or even PF_ flag makes more sense,
>> afaics it can have more users. But I don't think that uprobes can provide enough
>> justification for that right now)
I also faced the same choice when Oleg suggested adding a new flag to track the
denied signal. Since I haven't encountered scenarios outside of uprobes that would
deny a signal, I'm not confident about introducing a new TIF_ flag without fully
understanding the potential impacts.
>>
>> Oleg.
>>
--
BR
Liao, Chang
he CC list for broader visibility and potential collaboration.
Thanks.
On 2024/7/27 17:44, Liao Chang wrote:
> The profiling results of the single-threaded mode of the selftests bench reveal
> performance bottlenecks in find_uprobe() and caches_clean_inval_pou() on
> ARM64. On my local testing mach
On 2024/8/8 18:28, Oleg Nesterov wrote:
> On 08/08, Liao, Chang wrote:
>>
>> - pre_ssout() resets the deny signal flag
>>
>> - uprobe_deny_signal() sets the deny signal flag when TIF_SIGPENDING is
>> cleared.
>>
>> - handle_single
Since clearing a bit in thread_info is an atomic operation, the spinlock
is redundant and can be removed. Reducing lock contention is good for
performance.
Acked-by: Oleg Nesterov
Signed-off-by: Liao Chang
---
kernel/events/uprobes.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/kernel
rg
[2] https://lore.kernel.org/all/20240727094405.1362496-1-liaocha...@huawei.com
Signed-off-by: Liao Chang
---
include/linux/uprobes.h | 1 +
kernel/events/uprobes.c | 8 +++++---
2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
ps://lore.kernel.org/all/20240801082407.1618451-1-liaocha...@huawei.com
Liao Chang (2):
uprobes: Remove redundant spinlock in uprobe_deny_signal()
uprobes: Remove the spinlock within handle_singlestep()
include/linux/uprobes.h | 1 +
kernel/events/uprobes.c | 10 +-
2 files changed
ing the patch until Andrii's changes settle down.
>
> Oleg.
>
>
> On 08/08, Liao, Chang wrote:
>>
>> Hi Andrii and Oleg.
>>
>> This patch I sent two weeks ago also aims to optimize the performance of
>> uprobe on arm64. I
On 2024/8/9 2:26, Andrii Nakryiko wrote:
> On Thu, Aug 8, 2024 at 1:45 AM Liao, Chang wrote:
>>
>> Hi Andrii and Oleg.
>>
>> This patch I sent two weeks ago also aims to optimize the performance of
>> uprobe on arm64. I noticed the recent discussion
On 2024/8/10 2:40, Andrii Nakryiko wrote:
> On Fri, Aug 9, 2024 at 11:34 AM Andrii Nakryiko
> wrote:
>>
>> On Fri, Aug 9, 2024 at 12:16 AM Liao, Chang wrote:
>>>
>>>
>>>
>>> On 2024/8/9 2:26, Andrii Nakryiko wrote:
>>>> On Thu, Aug 8,
On 2024/8/12 19:29, Oleg Nesterov wrote:
> On 08/09, Liao Chang wrote:
>>
>> --- a/include/linux/uprobes.h
>> +++ b/include/linux/uprobes.h
>> @@ -75,6 +75,7 @@ struct uprobe_task {
>>
>> struct uprobe *active_uprobe;
>>
On 2024/8/12 20:07, Oleg Nesterov wrote:
> On 08/09, Liao Chang wrote:
>>
>> Since clearing a bit in thread_info is an atomic operation, the spinlock
>> is redundant and can be removed. Reducing lock contention is good for
>> performance.
>
> My ack still
On 2024/8/13 1:49, Andrii Nakryiko wrote:
> On Mon, Aug 12, 2024 at 4:11 AM Liao, Chang wrote:
>>
>>
>>
>> On 2024/8/9 2:26, Andrii Nakryiko wrote:
>>> On Thu, Aug 8, 2024 at 1:45 AM Liao, Chang wrote:
>>>>
>>>> Hi Andrii and Oleg.
>>
On 2024/8/13 1:57, Andrii Nakryiko wrote:
> On Mon, Aug 12, 2024 at 5:05 AM Liao, Chang wrote:
>>
>>
>>
>> On 2024/8/10 2:40, Andrii Nakryiko wrote:
>>> On Fri, Aug 9, 2024 at 11:34 AM Andrii Nakryiko
>>> wrote:
>>>>
>>>> On Fri,
and 'ret'
variants have been significantly reduced. Because the emulation of the 'push'
instruction needs to access userspace memory, it spends more cycles than
the others.
[0]
https://lore.kernel.org/all/caef4bzao4eg6hr2hzxypn+7uer4chs0r99zln02ezz5yruv...@mail.gmail.com/
Signed-off-by:
rg
[2] https://lore.kernel.org/all/20240727094405.1362496-1-liaocha...@huawei.com
Acked-by: Oleg Nesterov
Signed-off-by: Liao Chang
---
include/linux/uprobes.h | 1 +
kernel/events/uprobes.c | 8 +++++---
2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/include/linux/uprobes.h b/in
s://lore.kernel.org/all/20240727094405.1362496-1-liaocha...@huawei.com
[3] https://lore.kernel.org/all/20240801082407.1618451-1-liaocha...@huawei.com
Liao Chang (2):
uprobes: Remove redundant spinlock in uprobe_deny_signal()
uprobes: Remove the spinlock within handle_singlestep()
include/linux/u
On 2024/8/15 2:42, Andrii Nakryiko wrote:
> On Tue, Aug 13, 2024 at 9:17 PM Liao, Chang wrote:
>>
>>
>>
>> On 2024/8/13 1:49, Andrii Nakryiko wrote:
>>> On Mon, Aug 12, 2024 at 4:11 AM Liao, Chang wrote:
>>>>
>>>>
>>>>
>>
On 2024/8/15 0:57, Andrii Nakryiko wrote:
> On Tue, Aug 13, 2024 at 9:17 PM Liao, Chang wrote:
>>
>>
>>
>> On 2024/8/13 1:49, Andrii Nakryiko wrote:
>>> On Mon, Aug 12, 2024 at 4:11 AM Liao, Chang wrote:
>>>>
>>>>
>>>>
>>
Hi, Mark
My bad for taking so long to reply. I generally agree with your suggestions on
STP emulation.
On 2024/8/15 17:58, Mark Rutland wrote:
> On Wed, Aug 14, 2024 at 08:03:56AM +0000, Liao Chang wrote:
>> As Andrii pointed out, the uprobe/uretprobe selftest bench run into a
>> c
On 2024/8/16 0:53, Andrii Nakryiko wrote:
> On Wed, Aug 14, 2024 at 7:58 PM Liao, Chang wrote:
>>
>>
>>
>> On 2024/8/15 2:42, Andrii Nakryiko wrote:
>>> On Tue, Aug 13, 2024 at 9:17 PM Liao, Chang wrote:
>>>>
>>>>
>>>>
>>
Hi, Mark
Would you like to discuss this patch further, or do you still believe emulating
STP to push FP/LR onto the stack in the kernel is not a good idea?
Thanks.
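For reference, my mental model of the emulation in question is roughly the
following sketch; the function name, the fixed 16-byte frame and the fault
handling are simplifying assumptions on my side, not the actual arm64 patch:

	/* emulate "stp x29, x30, [sp, #-16]!" at a uprobed function entry */
	static bool emulate_stp_fp_lr(struct pt_regs *regs)
	{
		u64 frame[2] = { regs->regs[29], regs->regs[30] };	/* FP, LR */
		unsigned long new_sp = regs->sp - sizeof(frame);

		/* push the pair onto the user stack; bail out if the write faults */
		if (copy_to_user((void __user *)new_sp, frame, sizeof(frame)))
			return false;

		regs->sp = new_sp;
		instruction_pointer_set(regs, instruction_pointer(regs) + 4);	/* skip STP */
		return true;
	}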
On 2024/8/21 15:55, Liao, Chang wrote:
> Hi, Mark
>
> My bad for taking so long to reply, I generally agree with your suggestions
On 2024/8/30 3:26, Andrii Nakryiko wrote:
> On Tue, Aug 27, 2024 at 4:34 AM Liao, Chang wrote:
>>
>> Hi, Mark
>>
>> Would you like to discuss this patch further, or do you still believe emulating
>> STP to push FP/LR onto the stack in the kernel is not a go
'SIE', and reach __find_get_block, which requires that
interrupts be enabled.
Fixing this is trivial: just restore the value of 'sstatus' in pt_regs
from the backup taken at 2) when the instruction being single-stepped causes a
page fault.
Fixes: c22b0bcb1dd02 ("riscv: A
'SIE' is masked in this new saved irqflag. After the kprobe is
serviced, the CPU 'sstatus' is restored with 'SIE' masked.
This overwritten 'sstatus' causes the BUG_ON() in __find_get_block.
This bug is already fixed on arm64 by Jisheng Zhang.
Fixes: c22b0bcb1dd
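In pseudo-patch form, the fix described above amounts to snapshotting
'sstatus' before masking SIE for the single-step and restoring it when the
stepped instruction faults; the names below (kcb->saved_status) mirror the
arm64 saved_irqflag approach and are illustrative rather than the exact riscv
code:

	/* setup_singlestep(): back up interrupt state, then mask SIE */
	kcb->saved_status = regs->status;
	regs->status &= ~SR_SIE;

	/*
	 * kprobe fault handling: the single-stepped instruction caused a page
	 * fault, so restore the backed-up 'sstatus' before the fault is handled.
	 */
	regs->status = kcb->saved_status;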