On Mon, Mar 20, 2017 at 03:26:54PM -0500, Yazen Ghannam wrote:
> From: Yazen Ghannam <yazen.ghan...@amd.com>
> 
> Deferred errors on AMD systems may get an Action Optional severity with the
> goal of being handled by the SRAO notifier block. However, the process of
> determining if an address is usable is different between Intel and AMD. So
> define vendor-specific functions for this.
> 
> Also, check for the AO severity before determining if an address is usable
> to possibly save some cycles.
> 
> Signed-off-by: Yazen Ghannam <yazen.ghan...@amd.com>
> ---
> Link: 
> http://lkml.kernel.org/r/1486760120-60944-3-git-send-email-yazen.ghan...@amd.com
> 
> v1->v2:
> - New in v2. Based on v1 patch 3.
> - Update SRAO notifier block to handle errors from SMCA systems.
> 
>  arch/x86/kernel/cpu/mcheck/mce.c | 52 
> ++++++++++++++++++++++++++++++----------
>  1 file changed, 40 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c 
> b/arch/x86/kernel/cpu/mcheck/mce.c
> index 5e365a2..1a2669d 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -547,19 +547,49 @@ static void mce_report_event(struct pt_regs *regs)
>   * be somewhat complicated (e.g. segment offset would require an instruction
>   * parser). So only support physical addresses up to page granuality for now.
>   */
> -static int mce_usable_address(struct mce *m)
> +static int mce_usable_address_intel(struct mce *m, unsigned long *pfn)

So this function is basically saying whether the address is usable but
then it is also returning it if so.

And then it is using an I/O argument. Yuck.

So it sounds to me like this functionality needs redesign: something
like have a get_usable_address() function (the "mce_" prefix is not
really needed as it is static) which returns an invalid value when it
determines that it doesn't have a usable address and the address itself
if it succeeds.

>  {
> -     if (!(m->status & MCI_STATUS_MISCV) || !(m->status & MCI_STATUS_ADDRV))
> +     if (!(m->status & MCI_STATUS_MISCV))
>               return 0;
> -
> -     /* Checks after this one are Intel-specific: */
> -     if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
> -             return 1;
> -
>       if (MCI_MISC_ADDR_LSB(m->misc) > PAGE_SHIFT)
>               return 0;
>       if (MCI_MISC_ADDR_MODE(m->misc) != MCI_MISC_ADDR_PHYS)
>               return 0;
> +
> +     *pfn = m->addr >> PAGE_SHIFT;
> +     return 1;
> +}
> +
> +/* Only support this on SMCA systems and errors logged from a UMC. */
> +static int mce_usable_address_amd(struct mce *m, unsigned long *pfn)
> +{
> +     u8 umc;
> +     u16 nid = cpu_to_node(m->extcpu);
> +     u64 addr;
> +
> +     if (!mce_flags.smca)
> +             return 0;

So on !SMCA systems there'll be no usable address ever! Even with
MCI_STATUS_ADDRV set.

Please *test* your stuff on all affected hardware before submitting.

> +
> +     umc = find_umc_channel(m);
> +
> +     if (umc < 0 || umc_normaddr_to_sysaddr(m->addr, nid, umc, &addr))
> +             return 0;
> +
> +     *pfn = addr >> PAGE_SHIFT;
> +     return 1;
> +}
> +
> +static int mce_usable_address(struct mce *m, unsigned long *pfn)
> +{
> +     if (!(m->status & MCI_STATUS_ADDRV))
> +             return 0;

What happened to the MCI_STATUS_MISCV bit check?

> +     if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
> +             return mce_usable_address_intel(m, pfn);
> +
> +     if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
> +             return mce_usable_address_amd(m, pfn);
> +
>       return 1;

We definitely don't want to say that the address is usable on a third
vendor. It would be most likely a lie even if we never reach this code
on a third vendor.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

Reply via email to