On 2018/12/18 23:32, Thomas Gleixner wrote: > On Tue, 18 Dec 2018, Li, Aubrey wrote: > >> On 2018/12/18 22:14, Thomas Gleixner wrote: >>> On Tue, 18 Dec 2018, Aubrey Li wrote: >>>> diff --git a/arch/x86/include/asm/fpu/internal.h >>>> b/arch/x86/include/asm/fpu/internal.h >>>> index a38bf5a1e37a..8778ac172255 100644 >>>> --- a/arch/x86/include/asm/fpu/internal.h >>>> +++ b/arch/x86/include/asm/fpu/internal.h >>>> @@ -411,6 +411,13 @@ static inline int copy_fpregs_to_fpstate(struct fpu >>>> *fpu) >>>> { >>>> if (likely(use_xsave())) { >>>> copy_xregs_to_kernel(&fpu->state.xsave); >>>> + >>>> + /* >>>> + * AVX512 state is tracked here because its use is >>>> + * known to slow the max clock speed of the core. >>>> + */ >>>> + if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512) >>>> + fpu->avx512_timestamp = jiffies_64; >>> >>> Even if unlikely this is incorrect when running a 32 bit kernel because >>> there jiffies_64 cannot be atomically loaded vs. a concurrent update. See >>> the comment in include/linux/jiffies.h right above the jiffies_64 >>> declaration. >>> >> Yeah, I noticed this, because this is under use_xsave() condition, also need >> valid AVX512 state, so a 32 bit kernel won't enter this branch. > > What exactly prevents a 32bit kernel from having the AVX512 feature bit > set? And if it cannot be set on 32bit, then why are you compiling that code > in at all? >
I misunderstood, you mean 32bit kernel, not 32bit machine. Theoretically 32bit kernel can use AVX512, but not sure if anyone use it like this. get_jiffies_64() includes jiffies_lock ops so not good in context switch. So I want to use raw jiffies_64 here. jiffies is a good candidate but it has wraparound overflow issue. Other time source are expensive here. Should I limit the code only running on 64bit kernel? Thanks, -Aubrey