* Oleg Nesterov <o...@redhat.com> [2012-09-03 17:26:09]:

> Afaics the usage of update_debugctlmsr() and TIF_BLOCKSTEP in
> step.c was always very wrong.
> 
> 1. update_debugctlmsr() was simply unneeded. The child sleeps
>    TASK_TRACED, __switch_to_xtra(next_p => child) should notice
>    TIF_BLOCKSTEP and set/clear DEBUGCTLMSR_BTF after resume if
>    needed.
> 
> 2. It is wrong. The state of DEBUGCTLMSR_BTF bit in CPU register
>    should always match the state of current's TIF_BLOCKSTEP bit.
> 
> 3. Even get_debugctlmsr() + update_debugctlmsr() itself does not
>    look right. Irq can change other bits in MSR_IA32_DEBUGCTLMSR
>    register or the caller can be preempted in between.
> 
> 4. It is not safe to play with TIF_BLOCKSTEP if task != current.
>    DEBUGCTLMSR_BTF and TIF_BLOCKSTEP should always match each
>    other if the task is running. The tracee is stopped but it
>    can be SIGKILL'ed right before set/clear_tsk_thread_flag().
> 
> However, now that uprobes uses user_enable_single_step(current)
> we can't simply remove update_debugctlmsr(). So this patch adds
> the additional "task == current" check and disables irqs to avoid
> the race with interrupts/preemption.
> 
> Unfortunately this patch doesn't solve the last problem, we need
> another fix. Probably we should teach ptrace_stop() to set/clear
> single/block stepping after resume.
> 
> And afaics there is yet another problem: perf can play with
> MSR_IA32_DEBUGCTLMSR from nmi, this obviously means that even
> __switch_to_xtra() has problems.
> 
> Signed-off-by: Oleg Nesterov <o...@redhat.com>
> ---
>  arch/x86/kernel/step.c |   14 +++++++++++++-
>  1 files changed, 13 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/x86/kernel/step.c b/arch/x86/kernel/step.c
> index 7a51498..f89cdc6 100644
> --- a/arch/x86/kernel/step.c
> +++ b/arch/x86/kernel/step.c
> @@ -161,6 +161,16 @@ static void set_task_blockstep(struct task_struct *task, bool on)
>  {
>       unsigned long debugctl;
> 
> +     /*
> +      * Ensure irq/preemption can't change debugctl in between.
> +      * Note also that both TIF_BLOCKSTEP and debugctl should
> +      * be changed atomically wrt preemption.
> +      * FIXME: this means that set/clear TIF_BLOCKSTEP is simply
> +      * wrong if task != current, SIGKILL can wakeup the stopped
> +      * tracee and set/clear can play with the running task, this
> +      * can confuse the next __switch_to_xtra().
> +      */
> +     local_irq_disable();
>       debugctl = get_debugctlmsr();
>       if (on) {
>               debugctl |= DEBUGCTLMSR_BTF;
> @@ -169,7 +179,9 @@ static void set_task_blockstep(struct task_struct *task, bool on)
>               debugctl &= ~DEBUGCTLMSR_BTF;
>               clear_tsk_thread_flag(task, TIF_BLOCKSTEP);
>       }
> -     update_debugctlmsr(debugctl);
> +     if (task == current)
> +             update_debugctlmsr(debugctl);
> +     local_irq_enable();
>  }
> 
>  /*
> 
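
Point 3 of the description (the get_debugctlmsr() + update_debugctlmsr()
window) can be illustrated with a small user-space model. This is only a
sketch, not kernel code: fake_msr, the model_* helpers and the two
*_set_btf() functions are hypothetical names invented for the
illustration, and the "interrupt" is just a function call placed at a
fixed point. Only the two bit definitions are taken from the kernel
headers.

```c
#include <assert.h>

/* Bit positions as defined in the kernel's msr-index.h. */
#define DEBUGCTLMSR_LBR (1UL << 0)
#define DEBUGCTLMSR_BTF (1UL << 1)

/* Hypothetical stand-in for this CPU's MSR_IA32_DEBUGCTLMSR. */
static unsigned long fake_msr;

static unsigned long model_get_debugctlmsr(void)
{
	return fake_msr;
}

static void model_update_debugctlmsr(unsigned long val)
{
	fake_msr = val;
}

/* Hypothetical interrupt handler that turns on an unrelated bit. */
static void model_irq_enables_lbr(void)
{
	fake_msr |= DEBUGCTLMSR_LBR;
}

/* Unpatched ordering: the irq lands between the read and the write. */
unsigned long racy_set_btf(void)
{
	unsigned long debugctl;

	fake_msr = 0;
	debugctl = model_get_debugctlmsr();
	model_irq_enables_lbr();		/* irq fires inside the window */
	debugctl |= DEBUGCTLMSR_BTF;
	model_update_debugctlmsr(debugctl);	/* stale value drops LBR */
	return fake_msr;
}

/* Patched ordering: the irq is held off until the write is done. */
unsigned long serialized_set_btf(void)
{
	unsigned long debugctl;

	fake_msr = 0;
	/* local_irq_disable() would go here in the kernel */
	debugctl = model_get_debugctlmsr();
	debugctl |= DEBUGCTLMSR_BTF;
	model_update_debugctlmsr(debugctl);
	/* local_irq_enable(): the pending irq runs only now */
	model_irq_enables_lbr();
	return fake_msr;
}
```

In the racy ordering the LBR bit set by the handler is overwritten; in the
serialized ordering both bits survive. As the description notes, an NMI
is not held off by local_irq_disable(), so this model says nothing about
the perf case.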

The changes look simple and neat, but I would prefer that somebody with
better x86 knowledge comment on this.
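
To make points 1 and 2 of the description concrete, here is a minimal
user-space sketch of the invariant "the BTF bit in the CPU register
matches current's TIF_BLOCKSTEP". It is an assumption-laden model, not
the kernel implementation: struct model_task, model_msr, model_current
and the model_* functions are hypothetical, and model_switch_to() is a
simplified stand-in for what __switch_to_xtra() does with the flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit position as defined in the kernel's msr-index.h. */
#define DEBUGCTLMSR_BTF (1UL << 1)

/* Hypothetical task: only the TIF_BLOCKSTEP flag is modelled. */
struct model_task {
	bool tif_blockstep;
};

static unsigned long model_msr;		/* stand-in for the debugctl MSR */
static struct model_task *model_current;	/* task "running" on the CPU */

/*
 * Mirrors the patched set_task_blockstep(): the flag is always
 * updated, the register only when the task is the current one.
 */
void model_set_task_blockstep(struct model_task *task, bool on)
{
	unsigned long debugctl = model_msr;

	if (on)
		debugctl |= DEBUGCTLMSR_BTF;
	else
		debugctl &= ~DEBUGCTLMSR_BTF;
	task->tif_blockstep = on;

	if (task == model_current)
		model_msr = debugctl;
}

/* Simplified context switch: sync BTF with the incoming task's flag. */
void model_switch_to(struct model_task *next)
{
	if (next->tif_blockstep)
		model_msr |= DEBUGCTLMSR_BTF;
	else
		model_msr &= ~DEBUGCTLMSR_BTF;
	model_current = next;
}

/* The invariant from point 2 of the description. */
bool model_invariant_holds(void)
{
	return ((model_msr & DEBUGCTLMSR_BTF) != 0) ==
	       model_current->tif_blockstep;
}
```

With a tracer as the current task, enabling block stepping on a stopped
tracee leaves the register untouched, and the bit appears only when the
tracee is switched in. The model does not cover the SIGKILL wakeup race
from point 4.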

-- 
Thanks and Regards
Srikar
