On 04/29/19 20:23, Kinney, Michael D wrote:
> Hi,
> 
> Thanks for the detailed feedback.
> 
> I did consider moving the CPUID check up a level so
> AsmLfence() would always do the requested instruction.
> This would have been a larger patch set that would
> introduce an IA32 and X64 specific version of
> SpeculationBarrier().
> 
> I agree that a build time PCD would is a better
> approach.  It should have a default value of enabled,
> and only platforms that use CPUs that do not support
> LFENCE would avoid the call to AsmLfence() and use an
> alternate serializing instruction.
> 
> Here is the proposed PCD:
> 
> [PcdsFixedAtBuild]
>   ## Indicates the type of instruction sequence to use
>   #  for a speculation barrier.  The default instruction
>   #  sequence is LFENCE.<BR>
>   #   0x00 - No operation.<BR>
>   #   0x01 - LFENCE (IA32/X64).<BR>
>   #   0x02 - CPUID  (IA32/X64).<BR>
>   #   Other - reserved
>   # @Prompt Speculation Barrier Type.
>   gEfiMdePkgTokenSpaceGuid.PcdSpeculationBarrierType|0x01|UINT8|0x30001018

Looks great to me.

Thanks,
Laszlo

> 
> Best regards,
> 
> Mike
> 
>> -----Original Message-----
>> From: Gao, Liming
>> Sent: Monday, April 29, 2019 7:10 AM
>> To: devel@edk2.groups.io; ler...@redhat.com;
>> brian.john...@hpe.com; Kinney, Michael D
>> <michael.d.kin...@intel.com>
>> Subject: RE: [edk2-devel] [Patch 1/4] MdePkg/BaseLib:
>> Verify SSE2 support in IA32 AsmLfence()
>>
>>> -----Original Message-----
>>> From: devel@edk2.groups.io
>> [mailto:devel@edk2.groups.io] On Behalf Of Laszlo Ersek
>>> Sent: Monday, April 29, 2019 7:15 PM
>>> To: devel@edk2.groups.io; brian.john...@hpe.com;
>> Kinney, Michael D <michael.d.kin...@intel.com>
>>> Cc: Gao, Liming <liming....@intel.com>
>>> Subject: Re: [edk2-devel] [Patch 1/4] MdePkg/BaseLib:
>> Verify SSE2 support in IA32 AsmLfence()
>>>
>>> On 04/26/19 23:08, Brian J. Johnson wrote:
>>>> On 4/26/19 3:27 PM, Laszlo Ersek wrote:
>>>>> On 04/25/19 19:53, Michael D Kinney wrote:
>>>>>> Use CPUID in IA32 implementation of AsmLfence() to
>> verify
>>>>>> that SSE2 is supported before using the LFENCE
>> instruction.
>>>>>>
>>>>>> Cc: Liming Gao <liming....@intel.com>
>>>>>> Signed-off-by: Michael D Kinney
>> <michael.d.kin...@intel.com>
>>>>>> ---
>>>>>>   MdePkg/Library/BaseLib/Ia32/Lfence.nasm | 14
>> +++++++++++++-
>>>>>>   1 file changed, 13 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git
>> a/MdePkg/Library/BaseLib/Ia32/Lfence.nasm
>>>>>> b/MdePkg/Library/BaseLib/Ia32/Lfence.nasm
>>>>>> index 44478be35f..0a60ae1d57 100644
>>>>>> --- a/MdePkg/Library/BaseLib/Ia32/Lfence.nasm
>>>>>> +++ b/MdePkg/Library/BaseLib/Ia32/Lfence.nasm
>>>>>> @@ -1,5 +1,5 @@
>>>>>>
>>>>>> ;-------------------------------------------------
>> -----------------------------
>>>>>> ;
>>>>>> -; Copyright (c) 2018, Intel Corporation. All
>> rights reserved.<BR>
>>>>>> +; Copyright (c) 2018 - 2019, Intel Corporation.
>> All rights
>>>>>> reserved.<BR>
>>>>>>   ; SPDX-License-Identifier: BSD-2-Clause-Patent
>>>>>>   ;
>>>>>>   ; Module Name:
>>>>>> @@ -26,5 +26,17 @@
>>>>>>
>>>>>> ;-------------------------------------------------
>> -----------------------------
>>>>>>
>>>>>>   global ASM_PFX(AsmLfence)
>>>>>>   ASM_PFX(AsmLfence):
>>>>>> +    ;
>>>>>> +    ; Use CPUID instruction
>> (CPUID.01H:EDX.SSE2[bit 26] = 1) to test
>>>>>> whether the
>>>>>> +    ; processor supports SSE2
>> instruction.  Save/restore
>>>>>> non-volatile register
>>>>>> +    ; EBX that is modified by CPUID
>>>>>> +    ;
>>>>>> +    push    ebx
>>>>>> +    mov     eax, 1
>>>>>> +    cpuid
>>>>>> +    bt      edx, 26
>>>>>> +    jnc     Done
>>>>>>       lfence
>>>>>> +Done:
>>>>>> +    pop     ebx
>>>>>>       ret
>>>>>>
>>>>>
>>>>> The SDM seems to confirm that lfence depends on
>> SSE2.
>>>>>
>>>>> However, that raises another question:
>>>>>
>>>>> AsmLfence() is used for implementing
>> SpeculationBarrier() on IA32/X64.
>>>>> And so I wonder: the plaforms where lfence is
>> unavailable, do they *not*
>>>>> need a speculation barrier at all, or do they need
>> a *different*
>>>>> implementation?
>>>>>
>>>>> Thanks,
>>>>> Laszlo
>>>>
>>>> Several issues:
>>>>
>>>> This patch introduces stronger fencing than is
>> required.  The SDM says,
>>>> "Reads or writes cannot be reordered with I/O
>> instructions, locked
>>>> instructions, or serializing instructions." (vol 3a,
>> sec 8.2.2.)  The
>>>> cpuid instruction is a "serializing instruction"
>> (sec 8.3).  So the
>>>> cpuid is essentially a load fence plus a store fence
>> (plus other
>>>> functionality, such as draining the memory
>> queues.)  That makes the
>>>> lfence following it redundant.
>>>>
>>>> Cpuid is a fairly heavyweight instruction due to its
>> serializing
>>>> behavior.  It provides the load fencing semantics of
>> lfence, but can
>>>> introduce a significant performance hit if it's
>> called often.  Lfence is
>>>> a lot lighter weight instruction.  So using cpuid in
>> AsmLfence may make
>>>> it a lot slower than the caller expects.
>>>>
>>>> Also, skipping a fencing instruction if it's not
>> supported doesn't seem
>>>> like the right approach in any case.  The caller
>> expects AsmLfence to
>>>> provide certain fencing semantics... the
>> implementation isn't free to
>>>> just do nothing (unless it's on an architecture
>> where load fencing is
>>>> not required, since memory is always
>> ordered.)  Plus, the routine is
>>>> called "AsmLfence", so I'd expect it to always use
>> lfence, and cause an
>>>> exception if the instruction isn't implemented.  I'd
>> think the callers
>>>> should be modified to call AsmLfence or routines
>> using other
>>>> instructions (such as cpuid) to provide fencing
>> according to the CPU
>>>> architecture they are on.
>>>
>>> That appears the right approach to me -- let AsmLfence
>> do what its name
>>> says, and let platforms pick the right implementation
>> for
>>> SpeculationBarrier -- possibly at build time, possibly
>> at boot time.
>>>
>>>> Is there a way to make this a compile-time
>> decision?  I haven't tried to
>>>> track down all the callers of AsmLfence....
>>>
>>> Right now AsmLfence is only called from
>>> "MdePkg/Library/BaseLib/X86SpeculationBarrier.c".
>>>
>> This is a good suggestion to decide it at build time.
>> One PCD can be introduced to control its logic.
>>> SpeculationBarrier() is called in SMM drivers / lib
>> instances:
>>> - FaultTolerantWriteSmm,
>> FaultTolerantWriteStandaloneMm
>>> - SmmLockBoxLib
>>> - VariableSmm, VariableStandaloneMm
>>> - PiSmmCpuDxeSmm
>>>
>>> Given that there's a mode switch anyway, the runtime
>> cost of a simple
>>> CPUID-based implementation might be tolerable.
>>>
>>> Thanks,
>>> Laszlo
>>>
>>> 
> 


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#39836): https://edk2.groups.io/g/devel/message/39836
Mute This Topic: https://groups.io/mt/31345225/21656
Group Owner: devel+ow...@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to