On Tue, Sep 2, 2014 at 12:06 PM, David Vrabel <david.vra...@citrix.com>
wrote:
>
> From: Suresh Siddha <sbsid...@gmail.com>
>
> In non-eager FPU mode, a thread's FPU state is allocated on its first
> FPU use (in the context of the device-not-available exception). The
> allocation can block, so we have to enable interrupts (which were
> originally disabled when the exception happened), allocate the memory,
> and then disable interrupts again. While this saves 512 bytes or so
> per thread, it has some problems in general.
>
> a. Most future cases will use eager FPU anyway (because of processor
> features like xsaveopt, LWP, MPX, etc.), and those do the allocation
> at thread creation. It is nicer to have one common mechanism, as all
> the state save/restore code is shared. This avoids confusion and
> minimizes subtle bugs in the core context-switch code.
>
> b. If a parent thread uses the FPU, fork() allocates FPU state in the
> child and copies the parent's state into it. Shortly afterwards,
> exec() frees it again so it can be reallocated lazily on the child's
> first FPU use. This free/allocate cycle can be slower for some
> workloads.
>
> c. math_state_restore() is called from multiple places and is error
> prone if a caller expects interrupts to remain disabled throughout
> its execution. This can lead to subtle bugs like Ubuntu bug #1265841.
>
> The memory savings are small anyway, and the code complexity that
> introduces subtle bugs is not worth it. So remove the lazy (non-eager)
> FPU state allocation on first use.
>


ping?

>
> Signed-off-by: Suresh Siddha <sbsid...@gmail.com>
> Signed-off-by: David Vrabel <david.vra...@citrix.com>
> ---
> v3:
> - Rebase on 3.17-rc3.
>
> v2:
> - Tweak condition for allocating the first thread's FPU state.
> ---
>  arch/x86/kernel/i387.c    |   20 +++++++++++---------
>  arch/x86/kernel/process.c |    6 ------
>  arch/x86/kernel/traps.c   |   16 ++--------------
>  arch/x86/kernel/xsave.c   |    2 --
>  4 files changed, 13 insertions(+), 31 deletions(-)
>
> diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
> index a9a4229..956ea86 100644
> --- a/arch/x86/kernel/i387.c
> +++ b/arch/x86/kernel/i387.c
> @@ -5,6 +5,7 @@
>   *  General FPU state handling cleanups
>   *     Gareth Hughes <gar...@valinux.com>, May 2000
>   */
> +#include <linux/bootmem.h>
>  #include <linux/module.h>
>  #include <linux/regset.h>
>  #include <linux/sched.h>
> @@ -197,6 +198,16 @@ void fpu_init(void)
>
>         mxcsr_feature_mask_init();
>         xsave_init();
> +
> +       /*
> +        * Now we know the final size of the xsave data, allocate the
> +        * FPU state area for the first task. All other tasks have
> +        * this allocated in arch_dup_task_struct().
> +        */
> +       if (!current->thread.fpu.state)
> +               current->thread.fpu.state =
> +                       alloc_bootmem_align(xstate_size,
> +                               __alignof__(struct xsave_struct));
> +
>         eager_fpu_init();
>  }
>
> @@ -228,8 +239,6 @@ EXPORT_SYMBOL_GPL(fpu_finit);
>   */
>  int init_fpu(struct task_struct *tsk)
>  {
> -       int ret;
> -
>         if (tsk_used_math(tsk)) {
>                 if (cpu_has_fpu && tsk == current)
>                         unlazy_fpu(tsk);
> @@ -237,13 +246,6 @@ int init_fpu(struct task_struct *tsk)
>                 return 0;
>         }
>
> -       /*
> -        * Memory allocation at the first usage of the FPU and other state.
> -        */
> -       ret = fpu_alloc(&tsk->thread.fpu);
> -       if (ret)
> -               return ret;
> -
>         fpu_finit(&tsk->thread.fpu);
>
>         set_stopped_child_used_math(tsk);
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index f804dc9..4137f81 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -129,12 +129,6 @@ void flush_thread(void)
>         flush_ptrace_hw_breakpoint(tsk);
>         memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));
>         drop_init_fpu(tsk);
> -       /*
> -        * Free the FPU state for non xsave platforms. They get reallocated
> -        * lazily at the first use.
> -        */
> -       if (!use_eager_fpu())
> -               free_thread_xstate(tsk);
>  }
>
>  static void hard_disable_TSC(void)
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 0d0e922..36fc898 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -652,20 +652,8 @@ void math_state_restore(void)
>  {
>         struct task_struct *tsk = current;
>
> -       if (!tsk_used_math(tsk)) {
> -               local_irq_enable();
> -               /*
> -                * does a slab alloc which can sleep
> -                */
> -               if (init_fpu(tsk)) {
> -                       /*
> -                        * ran out of memory!
> -                        */
> -                       do_group_exit(SIGKILL);
> -                       return;
> -               }
> -               local_irq_disable();
> -       }
> +       if (!tsk_used_math(tsk))
> +               init_fpu(tsk);
>
>         __thread_fpu_begin(tsk);
>
> diff --git a/arch/x86/kernel/xsave.c b/arch/x86/kernel/xsave.c
> index 940b142..eed95d6 100644
> --- a/arch/x86/kernel/xsave.c
> +++ b/arch/x86/kernel/xsave.c
> @@ -677,8 +677,6 @@ void xsave_init(void)
>
>  static inline void __init eager_fpu_init_bp(void)
>  {
> -       current->thread.fpu.state =
> -           alloc_bootmem_align(xstate_size, __alignof__(struct xsave_struct));
>         if (!init_xstate_buf)
>                 setup_init_fpu_buf();
>  }
> --
> 1.7.10.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
