[PATCH 2/6] EDAC/amd64: Gather hardware information early

2019-10-18 Thread Ghannam, Yazen
From: Yazen Ghannam Split out gathering hardware information from init_one_instance() into a separate function get_hardware_info(). This is necessary so that the information can be cached earlier and used to check if memory is populated and if ECC is enabled on a node. Signed-off-by: Yazen Ghan

[PATCH 4/6] EDAC/amd64: Use cached data when checking for ECC

2019-10-18 Thread Ghannam, Yazen
From: Yazen Ghannam ...now that the data is available earlier. Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20190821235938.118710-10-yazen.ghan...@amd.com rfc -> v1: * No change. drivers/edac/amd64_edac.c | 20 1 file changed, 8 insertions(+), 12 dele

[PATCH 1/6] EDAC/amd64: Make struct amd64_family_type global

2019-10-18 Thread Ghannam, Yazen
From: Yazen Ghannam The struct amd64_family_type doesn't change between multiple nodes and instances of the modules, so make it global. Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20190821235938.118710-9-yazen.ghan...@amd.com rfc -> v1: * New patch based on suggestion from

[PATCH 6/6] EDAC/amd64: Set grain per DIMM

2019-10-18 Thread Ghannam, Yazen
From: Yazen Ghannam The following commit introduced a warning on error reports without a non-zero grain value. 3724ace582d9 ("EDAC/mc: Fix grain_bits calculation") The amd64_edac_mod module does not provide a value, so the warning will be given on the first reported memory error. Set the gra

[PATCH 5/6] EDAC/amd64: Check for memory before fully initializing an instance

2019-10-18 Thread Ghannam, Yazen
From: Yazen Ghannam Return early before checking for ECC if the node does not have any populated memory. Free any cached hardware data before returning. Also, return 0 in this case since this is not a failure. Other nodes may have memory and the module should attempt to load an instance for them

[PATCH 0/6] AMD64 EDAC: Check for nodes without memory, etc.

2019-10-18 Thread Ghannam, Yazen
From: Yazen Ghannam Hi Boris, This set contains the next revision of the RFC patches I included with the last AMD64 EDAC updates. I dropped the RFC tags, and I added a couple of new patches. Most of these patches address the issue where the module checks and complains about DRAM ECC on nodes wit

RE: [PATCH 0/6] AMD64 EDAC: Check for nodes without memory, etc.

2019-10-21 Thread Ghannam, Yazen
> -Original Message- > From: linux-kernel-ow...@vger.kernel.org > On Behalf Of Borislav Petkov > Sent: Monday, October 21, 2019 10:48 AM > To: Ghannam, Yazen > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org > Subject: Re: [PATCH 0/6] AMD64 EDAC: Ch

[PATCH 3/5] EDAC/amd64: Recognize x16 Symbol Size

2019-02-19 Thread Ghannam, Yazen
From: Yazen Ghannam Future AMD systems may support x16 symbol sizes. Recognize if a system is using x16 symbol size. Also, simplify the print statement. Note that a x16 syndrome vector table is not necessary like with x4 or x8. This is because systems that support x16 symbol sizes will be SMCA

[PATCH 1/5] EDAC/amd64: Add Fam17hMod30h PCI IDs

2019-02-19 Thread Ghannam, Yazen
From: Yazen Ghannam Add the new Family 17h Model 30h PCI IDs to the AMD64 EDAC module. This also fixes a probe failure that appeared when some other PCI IDs for Fam17hMod30h were added to the AMD NB code. Fixes: be3518a16ef2 (x86/amd_nb: Add PCI device IDs for family 17h, model 30h) Signed-off-

[PATCH 5/5] EDAC/amd64: Adjust printed Chip Select sizes when interleaved

2019-02-19 Thread Ghannam, Yazen
From: Yazen Ghannam AMD systems may support Chip Select interleaving. However, on Fam17h+ this was not taken into account when printing the Chip Select sizes. Add support to detect if Chip Selects are interleaved on Fam17h+, and adjust the sizes accordingly. Signed-off-by: Yazen Ghannam --- d

[PATCH 4/5] EDAC/amd64: Support more than two Controllers for Chip Select handling

2019-02-19 Thread Ghannam, Yazen
From: Yazen Ghannam The struct chip_select array that's used for saving Chip Select bases and masks is fixed at length of two. There should be one struct chip_select for each controller, so this array should be increased to support systems that may have more than two controllers. Increase the si

[PATCH 2/5] EDAC/amd64: Support more than two UMCs

2019-02-19 Thread Ghannam, Yazen
From: Yazen Ghannam The first few models of Family 17h all had 2 UMCs per Die, so we treated this as a fixed value. However, future systems may have more UMCs per Die. Related to this, we were finding the channel number and base address of a UMC by matching on fixed, known values. However, a pat

[PATCH 3/5] EDAC, mce_amd: Add new error descriptions for some SMCA bank types

2019-02-01 Thread Ghannam, Yazen
From: Yazen Ghannam Some SMCA bank types on future systems will report new error types even though the bank type is not treated as a new version. These new error types will be reported by bits that are reserved in past systems. Add the new error descriptions to the lists in edac_mce_amd. Signed-of

[PATCH 4/5] EDAC, mce_amd: Match error descriptions to latest documentation

2019-02-01 Thread Ghannam, Yazen
From: Yazen Ghannam Update the error descriptions to match the latest documentation for easier searching. In some cases the changes are small and in other cases the changes may be total rewording of the description. No functional change. Signed-off-by: Yazen Ghannam --- drivers/edac/mce_amd.c

[PATCH 5/5] EDAC, mce_amd: Print ExtErrorCode and description on a single line

2019-02-01 Thread Ghannam, Yazen
From: Yazen Ghannam Save a log line by printing the extended error code and the description on a single line. This is similar to how errors are printed in other subsystems, e.g. "#, description". If we don't have a valid description then only the number/code is printed. Signed-off-by: Yazen Ghan

[PATCH 2/5] x86/MCE/AMD: Add new McaTypes for CS, PSP, and SMU

2019-02-01 Thread Ghannam, Yazen
From: Yazen Ghannam The existing CS, PSP, and SMU SMCA bank types will see new versions (as indicated by their McaTypes) in future SMCA systems. Add the new (HWID, MCATYPE) tuples for these new versions. Reuse the same names as the older versions, since they are logically the same to the user. S

[PATCH 0/5] Add new SMCA bank types

2019-02-01 Thread Ghannam, Yazen
From: Yazen Ghannam This series adds decoding for some new SMCA bank types (MP5, NBIO, and PCIE) and also for some new versions of existing bank types (CS, PSP, and SMU). This series also adds new error type descriptions for existing SMCA bank types that aren't getting a version bump. And also s

[PATCH 1/5] x86/MCE/AMD: Add new MP5, NBIO, and PCIE SMCA bank types

2019-02-01 Thread Ghannam, Yazen
From: Yazen Ghannam Add the (HWID, MCATYPE) tuples and names for the new MP5, NBIO, and PCIE SMCA bank types. Also, add their respective error descriptions to edac_mce_amd. Signed-off-by: Yazen Ghannam --- arch/x86/include/asm/mce.h| 3 +++ arch/x86/kernel/cpu/mce/amd.c | 12

RE: [PATCH 5/5] EDAC, mce_amd: Print ExtErrorCode and description on a single line

2019-02-04 Thread Ghannam, Yazen
> -Original Message- > From: Borislav Petkov > Sent: Sunday, February 3, 2019 6:22 AM > To: Ghannam, Yazen > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org; b...@suse.de; > tony.l...@intel.com; x...@kernel.org > Subject: Re: [PATCH 5/5] EDAC, mce_amd: P

RE: [PATCH v3 4/6] x86/MCE: Make number of MCA banks per_cpu

2019-05-21 Thread Ghannam, Yazen
> -Original Message- > From: Borislav Petkov > Sent: Saturday, May 18, 2019 6:26 AM > To: Ghannam, Yazen > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org; b...@suse.de; > tony.l...@intel.com; x...@kernel.org > Subject: Re: [PATCH v3 4/6] x86/MCE: Mak

RE: [PATCH v3 4/6] x86/MCE: Make number of MCA banks per_cpu

2019-05-22 Thread Ghannam, Yazen
> -Original Message- > From: linux-edac-ow...@vger.kernel.org On > Behalf Of Borislav Petkov > Sent: Tuesday, May 21, 2019 6:09 PM > To: Luck, Tony > Cc: Ghannam, Yazen ; linux-e...@vger.kernel.org; > linux-kernel@vger.kernel.org; x...@kernel.org > Subject: Re

[PATCH] x86/MCE: Statically allocate mce_banks_array

2019-05-23 Thread Ghannam, Yazen
From: Yazen Ghannam The MCE control data is stored in an array of struct mce_banks. This array has historically been shared by all CPUs and it was allocated dynamically during the first CPU's init sequence. However, starting with 5b0883f5c7be ("x86/MCE: Make mce_banks a per-CPU array")

RE: [PATCH] x86/MCE: Statically allocate mce_banks_array

2019-05-23 Thread Ghannam, Yazen
> -Original Message- > From: Borislav Petkov > Sent: Thursday, May 23, 2019 3:28 PM > To: Ghannam, Yazen > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org; > tony.l...@intel.com; x...@kernel.org > Subject: Re: [PATCH] x86/MCE: Statically allocate mce_b

RE: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in hardware

2019-05-17 Thread Ghannam, Yazen
> -Original Message- > From: linux-edac-ow...@vger.kernel.org On > Behalf Of Borislav Petkov > Sent: Friday, May 17, 2019 2:35 PM > To: Luck, Tony > Cc: Ghannam, Yazen ; linux-e...@vger.kernel.org; > linux-kernel@vger.kernel.org; x...@kernel.org > Subject: Re

RE: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in hardware

2019-05-23 Thread Ghannam, Yazen
> -Original Message- > From: Borislav Petkov > Sent: Friday, May 17, 2019 3:02 PM > To: Ghannam, Yazen > Cc: Luck, Tony ; linux-e...@vger.kernel.org; > linux-kernel@vger.kernel.org; x...@kernel.org > Subject: Re: [PATCH v3 5/6] x86/MCE: Save MCA control bits that g

[PATCH 5/5] x86/MCE: Save MCA control bits that get set in hardware

2019-04-07 Thread Ghannam, Yazen
From: Yazen Ghannam The OS is expected to write all bits in MCA_CTL. However, only implemented bits get set in the hardware. Read back MCA_CTL so that the value in the hardware is saved and reported through sysfs. Signed-off-by: Yazen Ghannam --- arch/x86/kernel/cpu/mce/core.c | 15 ++
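The read-back idea in this description can be illustrated with a small user-space model. This is only a sketch: `McaCtlRegister` and `init_bank_ctl` are hypothetical names, the implemented-bit mask is made up, and none of this is the kernel code from the patch.

```python
class McaCtlRegister:
    """Toy model of an MCA_CTL register (hypothetical, not a hardware API)."""

    def __init__(self, implemented_mask):
        # Bits outside this mask are not implemented and never latch.
        self.implemented_mask = implemented_mask
        self.value = 0

    def write(self, value):
        # Hardware keeps only the implemented bits of whatever is written.
        self.value = value & self.implemented_mask

    def read(self):
        return self.value


def init_bank_ctl(reg):
    """Write all ones, then read back so the saved value matches hardware."""
    reg.write(0xFFFF_FFFF_FFFF_FFFF)
    return reg.read()


if __name__ == "__main__":
    # Illustrative mask only: a bank that implements six control bits.
    saved = init_bank_ctl(McaCtlRegister(0x3F))
    print(hex(saved))
```

Saving the read-back value rather than the written one means that whatever is later reported reflects the bits the hardware actually accepted.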

[PATCH 2/5] x86/MCE: Handle MCA controls in a per_cpu way

2019-04-07 Thread Ghannam, Yazen
From: Yazen Ghannam Current AMD systems have unique MCA banks per logical CPU even though the type of the banks may all align to the same bank number. Each CPU will have control of a set of MCA banks in the hardware and these are not shared with other CPUs. For example, bank 0 may be the Load-St

[PATCH 1/5] x86/MCE: Make struct mce_banks[] static

2019-04-07 Thread Ghannam, Yazen
From: Yazen Ghannam The struct mce_banks[] array is only used in mce/core.c so move the definition of struct mce_bank to mce/core.c and make the array static. Also, change the "init" field to bool type. Signed-off-by: Yazen Ghannam --- arch/x86/kernel/cpu/mce/core.c | 11 ++- arch

[PATCH 3/5] x86/MCE/AMD: Don't cache block addresses on SMCA systems

2019-04-07 Thread Ghannam, Yazen
From: Yazen Ghannam On legacy systems, the addresses of the MCA_MISC* registers need to be recursively discovered based on a Block Pointer field in the registers. On Scalable MCA systems, the register space is fixed, and particular addresses can be derived by regular offsets for bank and registe
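Deriving an address "by regular offsets for bank and register" can be sketched as plain arithmetic. The base addresses and the per-bank stride below are assumed to match the SMCA MSR definitions in the kernel's msr-index.h and should be verified against the tree in use; `smca_misc_address` is a hypothetical helper, not a kernel function.

```python
# Assumed values (verify against arch/x86/include/asm/msr-index.h):
MSR_AMD64_SMCA_MC0_MISC0 = 0xC0002003  # MISC0 register of bank 0
MSR_AMD64_SMCA_MC0_MISC1 = 0xC000200A  # first extra MISC register of bank 0
SMCA_BANK_STRIDE = 0x10                # MSR address step between banks


def smca_misc_address(bank, block):
    """Compute the MCA_MISC MSR address for a bank/block without any lookup.

    Block 0 maps to MISC0; blocks 1 and up map to MISC1 onward.
    """
    if block == 0:
        return MSR_AMD64_SMCA_MC0_MISC0 + SMCA_BANK_STRIDE * bank
    return MSR_AMD64_SMCA_MC0_MISC1 + (block - 1) + SMCA_BANK_STRIDE * bank


if __name__ == "__main__":
    print(hex(smca_misc_address(bank=1, block=0)))
```

Because the result is a pure function of bank and block, there is nothing to discover recursively and therefore nothing that needs caching.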

[PATCH 4/5] x86/MCE: Make number of MCA banks per_cpu

2019-04-07 Thread Ghannam, Yazen
From: Yazen Ghannam The number of MCA banks is provided per logical CPU. Historically, this number has been the same across all CPUs, but this is not an architectural guarantee. Future AMD systems may have MCA bank counts that vary between logical CPUs in a system. This issue was partially addre

[PATCH 0/5] Handle MCA banks in a per_cpu way

2019-04-07 Thread Ghannam, Yazen
From: Yazen Ghannam The focus of this patchset is to define and use the MCA bank structures and bank count per logical CPU. With the exception of patch 4, this set applies to systems in production today. Patch 1: Moves the declaration of struct mce_banks[] to the only file where it's used. Patch 2: Spl

RE: [PATCH 5/6] acpi/cppc: Add support for optional CPPC registers

2019-03-29 Thread Ghannam, Yazen
@vger.kernel.org > Cc: Ghannam, Yazen ; l...@kernel.org; > viresh.ku...@linaro.org; Moore, Robert > ; Schmauss, Erik ; > r...@rjwysocki.net > Subject: Re: [PATCH 5/6] acpi/cppc: Add support for optional CPPC registers > > On Fri, 2019-03-22 at 20:26 +, Natarajan,

RE: [PATCH] tools/power turbostat: Make interval calculation per thread to reduce jitter

2019-04-23 Thread Ghannam, Yazen
> -Original Message- > From: linux-kernel-ow...@vger.kernel.org > On Behalf Of Ghannam, Yazen > Sent: Monday, March 25, 2019 12:33 PM > To: linux...@vger.kernel.org > Cc: Ghannam, Yazen ; linux-kernel@vger.kernel.org; > l...@kernel.org > Subject: [PATCH] too

[PATCH v3 2/6] x86/MCE: Handle MCA controls in a per_cpu way

2019-04-30 Thread Ghannam, Yazen
From: Yazen Ghannam Current AMD systems have unique MCA banks per logical CPU even though the type of the banks may all align to the same bank number. Each CPU will have control of a set of MCA banks in the hardware and these are not shared with other CPUs. For example, bank 0 may be the Load-St

[PATCH v3 4/6] x86/MCE: Make number of MCA banks per_cpu

2019-04-30 Thread Ghannam, Yazen
From: Yazen Ghannam The number of MCA banks is provided per logical CPU. Historically, this number has been the same across all CPUs, but this is not an architectural guarantee. Future AMD systems may have MCA bank counts that vary between logical CPUs in a system. This issue was partially addre

[PATCH v3 3/6] x86/MCE/AMD: Don't cache block addresses on SMCA systems

2019-04-30 Thread Ghannam, Yazen
From: Yazen Ghannam On legacy systems, the addresses of the MCA_MISC* registers need to be recursively discovered based on a Block Pointer field in the registers. On Scalable MCA systems, the register space is fixed, and particular addresses can be derived by regular offsets for bank and registe

[PATCH v3 0/6] Handle MCA banks in a per_cpu way

2019-04-30 Thread Ghannam, Yazen
From: Yazen Ghannam The focus of this patchset is to define and use the MCA bank structures and bank count per logical CPU. With the exception of patch 4, this set applies to systems in production today. Patch 1: Moves the declaration of struct mce_banks[] to the only file where it's used. Patch 2: Spl

[PATCH v3 6/6] x86/MCE: Treat MCE bank as initialized if control bits set in hardware

2019-04-30 Thread Ghannam, Yazen
From: Yazen Ghannam The OS is expected to write all bits to MCA_CTL for each bank. However, some banks may be unused in which case the registers for such banks are Read-as-Zero/Writes-Ignored. Also, the OS may not write any control bits because of quirks, etc. A bank can be considered uninitiali

[PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in hardware

2019-04-30 Thread Ghannam, Yazen
From: Yazen Ghannam The OS is expected to write all bits in MCA_CTL. However, only implemented bits get set in the hardware. Read back MCA_CTL so that the value in the hardware is saved and reported through sysfs. Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20190411201743.

[PATCH v3 1/6] x86/MCE: Make struct mce_banks[] static

2019-04-30 Thread Ghannam, Yazen
From: Yazen Ghannam The struct mce_banks[] array is only used in mce/core.c so move the definition of struct mce_bank to mce/core.c and make the array static. Also, change the "init" field to bool type. Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20190411201743.43195-2-yaz

RE: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in hardware

2019-06-07 Thread Ghannam, Yazen
> -Original Message- > From: Borislav Petkov > Sent: Monday, May 27, 2019 6:29 PM > To: Ghannam, Yazen > Cc: Luck, Tony ; linux-e...@vger.kernel.org; > linux-kernel@vger.kernel.org; x...@kernel.org > Subject: Re: [PATCH v3 5/6] x86/MCE: Save MCA control bits that g

RE: [PATCH] tools/power turbostat: Make interval calculation per thread to reduce jitter

2019-06-07 Thread Ghannam, Yazen
> -Original Message- > From: Ghannam, Yazen > Sent: Tuesday, April 23, 2019 12:53 PM > To: Ghannam, Yazen ; linux...@vger.kernel.org; > len.br...@intel.com > Cc: linux-kernel@vger.kernel.org; Len Brown > Subject: RE: [PATCH] tools/power turbostat: Make interval calc

RE: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in hardware

2019-06-07 Thread Ghannam, Yazen
> -Original Message- > From: linux-edac-ow...@vger.kernel.org On > Behalf Of Borislav Petkov > Sent: Friday, June 7, 2019 11:37 AM > To: Ghannam, Yazen > Cc: Luck, Tony ; linux-e...@vger.kernel.org; > linux-kernel@vger.kernel.org; x...@kernel.org > Subject: Re

RE: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in hardware

2019-06-07 Thread Ghannam, Yazen
> -Original Message- > From: linux-edac-ow...@vger.kernel.org On > Behalf Of Borislav Petkov > Sent: Friday, June 7, 2019 11:59 AM > To: Ghannam, Yazen > Cc: Luck, Tony ; linux-e...@vger.kernel.org; > linux-kernel@vger.kernel.org; x...@kernel.org > Subject: Re

[PATCH v4 2/5] x86/MCE: Make mce_banks a per-CPU array

2019-06-07 Thread Ghannam, Yazen
From: Yazen Ghannam Current AMD systems have unique MCA banks per logical CPU even though the type of the banks may all align to the same bank number. Each CPU will have control of a set of MCA banks in the hardware and these are not shared with other CPUs. For example, bank 0 may be the Load-St

[PATCH v4 0/5] Handle MCA banks in a per_cpu way

2019-06-07 Thread Ghannam, Yazen
From: Yazen Ghannam The focus of this patchset is to define and use the MCA bank structures and bank count per logical CPU. With the exception of patch 4, this set applies to systems in production today. Patch 1: Moves the declaration of struct mce_banks[] to the only file where it's used. Patch 2: Spl

[PATCH v4 4/5] x86/MCE: Make the number of MCA banks a per-CPU variable

2019-06-07 Thread Ghannam, Yazen
From: Yazen Ghannam The number of MCA banks is provided per logical CPU. Historically, this number has been the same across all CPUs, but this is not an architectural guarantee. Future AMD systems may have MCA bank counts that vary between logical CPUs in a system. This issue was partially addre

[PATCH v4 1/5] x86/MCE: Make struct mce_banks[] static

2019-06-07 Thread Ghannam, Yazen
From: Yazen Ghannam The struct mce_banks[] array is only used in mce/core.c so move its definition there and make it static. Also, change the "init" field to bool type. Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20190430203206.104163-2-yazen.ghan...@amd.com v3->v4: * No c

[PATCH v4 3/5] x86/MCE/AMD: Don't cache block addresses on SMCA systems

2019-06-07 Thread Ghannam, Yazen
From: Yazen Ghannam On legacy systems, the addresses of the MCA_MISC* registers need to be recursively discovered based on a Block Pointer field in the registers. On Scalable MCA systems, the register space is fixed, and particular addresses can be derived by regular offsets for bank and registe

[PATCH v4 5/5] x86/MCE: Determine MCA banks' init state properly

2019-06-07 Thread Ghannam, Yazen
From: Yazen Ghannam The OS is expected to write all bits to MCA_CTL for each bank, thus enabling error reporting in all banks. However, some banks may be unused in which case the registers for such banks are Read-as-Zero/Writes-Ignored. Also, the OS may avoid setting some control bits because of

[PATCH v2 3/7] EDAC/amd64: Initialize DIMM info for systems with more than two channels

2019-07-09 Thread Ghannam, Yazen
From: Yazen Ghannam Currently, the DIMM info for AMD Family 17h systems is initialized in init_csrows(). This function is shared with legacy systems, and it has a limit of two channel support. This prevents initialization of the DIMM info for a number of ranks, so there will be missing ranks in

[PATCH v2 0/7] AMD64 EDAC fixes

2019-07-09 Thread Ghannam, Yazen
From: Yazen Ghannam Hi Boris, This set contains a few fixes for some changes merged in v5.2. There are also a couple of fixes for older issues. In addition, there are a couple of patches to add support for Asymmetric Dual-Rank DIMMs. Thanks, Yazen Link: https://lkml.kernel.org/r/20190531234501

[PATCH v2 7/7] EDAC/amd64: Support Asymmetric Dual-Rank DIMMs

2019-07-09 Thread Ghannam, Yazen
From: Yazen Ghannam Future AMD systems will support "Asymmetric" Dual-Rank DIMMs. These are DIMMs where the ranks are of different sizes. The even rank will use the Primary Even Chip Select registers and the odd rank will use the Secondary Odd Chip Select registers. Recognize if a Secondary Odd

[PATCH v2 4/7] EDAC/amd64: Find Chip Select memory size using Address Mask

2019-07-09 Thread Ghannam, Yazen
From: Yazen Ghannam Chip Select memory size reporting on AMD Family 17h was recently fixed in order to account for interleaving. However, the current method is not robust. The Chip Select Address Mask can be used to find the memory size. There are a few cases. 1) For single-rank, use the addres

[PATCH v2 2/7] EDAC/amd64: Recognize DRAM device type with EDAC_CTL_CAP

2019-07-09 Thread Ghannam, Yazen
From: Yazen Ghannam AMD Family 17h systems support x4 and x16 DRAM devices. However, the device type is not checked when setting EDAC_CTL_CAP. Set the appropriate EDAC_CTL_CAP flag based on the device type. Fixes: 2d09d8f301f5 ("EDAC, amd64: Determine EDAC MC capabilities on Fam17h") Signed-off

[PATCH v2 6/7] EDAC/amd64: Cache secondary Chip Select registers

2019-07-09 Thread Ghannam, Yazen
From: Yazen Ghannam AMD Family 17h systems have a set of secondary Chip Select Base Addresses and Address Masks. These do not represent unique Chip Selects, rather they are used in conjunction with the primary Chip Select registers in certain use cases. Cache these secondary Chip Select register

[PATCH v2 1/7] EDAC/amd64: Support more than two controllers for chip selects handling

2019-07-09 Thread Ghannam, Yazen
From: Yazen Ghannam The struct chip_select array that's used for saving chip select bases and masks is fixed at length of two. There should be one struct chip_select for each controller, so this array should be increased to support systems that may have more than two controllers. Increase the si

[PATCH v2 5/7] EDAC/amd64: Decode syndrome before translating address

2019-07-09 Thread Ghannam, Yazen
From: Yazen Ghannam AMD Family 17h systems currently require address translation in order to report the system address of a DRAM ECC error. This is currently done before decoding the syndrome information. The syndrome information does not depend on the address translation, so the proper EDAC csro

RE: [PATCH 2/2] x86/MCE/AMD, EDAC/mce_amd: Don't report L1 BTB MCA errors on some Family 17h models

2019-03-11 Thread Ghannam, Yazen
> -Original Message- > From: linux-edac-ow...@vger.kernel.org On > Behalf Of Borislav Petkov > Sent: Monday, March 11, 2019 1:21 PM > To: Ghannam, Yazen > Cc: linux-e...@vger.kernel.org; Borislav Petkov ; Tony Luck > ; x...@kernel.org; linux- > ker...@vger.kern

RE: [PATCH] x86, mce: Fix machine_check_poll() tests for which errors to log

2019-03-11 Thread Ghannam, Yazen
> -Original Message- > From: linux-kernel-ow...@vger.kernel.org > On Behalf Of Tony Luck > Sent: Monday, March 11, 2019 1:51 PM > To: Borislav Petkov > Cc: Tony Luck ; x...@kernel.org; > linux-kernel@vger.kernel.org; Ashok Raj > Subject: [PATCH] x86, mce: Fix machine_check_poll() tests

RE: [PATCH] x86, mce: Fix machine_check_poll() tests for which errors to log

2019-03-11 Thread Ghannam, Yazen
> -Original Message- > From: Luck, Tony > Sent: Monday, March 11, 2019 3:42 PM > To: Ghannam, Yazen > Cc: Borislav Petkov ; x...@kernel.org; > linux-kernel@vger.kernel.org; Ashok Raj > Subject: Re: [PATCH] x86, mce: Fix machine_check_poll() tests for which > e

[PATCH 2/2] EDAC/mce_amd: Decode MCA_STATUS in bit definition order

2019-02-12 Thread Ghannam, Yazen
From: Yazen Ghannam Reorder how we decode the bits in MCA_STATUS to follow how they're defined in the register. The order is as follows (Bit : Decode): 61 : UC, 59 : MiscV, 58 : AddrV, 57 : PCC, 55 : TCC, 53 : SyndV, 46 : CECC, 45 : UECC, 44 : Deferred, 43 : Poison, 40 : Scrub. Signed-off-by: Yaze
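The bit order listed in this description can be expressed as a lookup table walked from the highest bit to the lowest. The sketch below is a user-space illustration only: the bit positions and labels come from the list in the message, while `MCA_STATUS_FLAGS` and `decode_mca_status` are hypothetical names and not the decoder in drivers/edac/mce_amd.c.

```python
# (bit position, label) pairs in register definition order, highest bit first.
MCA_STATUS_FLAGS = [
    (61, "UC"),
    (59, "MiscV"),
    (58, "AddrV"),
    (57, "PCC"),
    (55, "TCC"),
    (53, "SyndV"),
    (46, "CECC"),
    (45, "UECC"),
    (44, "Deferred"),
    (43, "Poison"),
    (40, "Scrub"),
]


def decode_mca_status(status):
    """Return the labels of all set flags, in the table's order."""
    return [name for bit, name in MCA_STATUS_FLAGS if status & (1 << bit)]


if __name__ == "__main__":
    # Made-up status value with UC, AddrV and UECC set.
    example = (1 << 61) | (1 << 58) | (1 << 45)
    print(decode_mca_status(example))
```

Keeping the table in register order means the printed flags always appear in the same sequence as the register definition, which makes output easier to compare against documentation.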

[PATCH 1/2] EDAC/mce_amd: Decode MCA_STATUS[Scrub] bit

2019-02-12 Thread Ghannam, Yazen
From: Yazen Ghannam Previous AMD systems have had a bit in MCA_STATUS to indicate that an error was detected on a scrub operation. However, this bit was defined differently within different banks and families/models. Starting with Family 17h, MCA_STATUS[40] is either Reserved/Read-as-Zero or def

[PATCH v2 2/2] x86/MCE/AMD: Don't report L1 BTB MCA errors on some Family 17h models

2019-03-21 Thread Ghannam, Yazen
From: Yazen Ghannam AMD Family 17h Models 10h-2Fh may report a high number of L1 BTB MCA errors under certain conditions. The errors are benign and can safely be ignored. However, the high error rate may cause the MCA threshold counter to overflow causing a high rate of thresholding interrupts. I

[PATCH v2 1/2] x86/MCE: Add function to allow filtering of MCA errors

2019-03-21 Thread Ghannam, Yazen
From: Yazen Ghannam Some systems may report spurious MCA errors. In general, spurious MCA errors may be disabled by clearing a particular bit in MCA_CTL. However, clearing a bit in MCA_CTL may not be recommended for some errors, so the only option is to ignore them. An MCA error is printed and h

RE: [PATCH v3 1/6] EDAC/amd64: Add Family 17h Model 30h PCI IDs

2019-03-21 Thread Ghannam, Yazen
Hi Boris, Any comments on this set? Thanks, Yazen > -Original Message- > From: linux-edac-ow...@vger.kernel.org On > Behalf Of Ghannam, Yazen > Sent: Thursday, February 28, 2019 9:36 AM > To: linux-e...@vger.kernel.org > Cc: Ghannam, Yazen ; linux-kernel@vge

[PATCH] ACPI / processor: Set P_LVL{2,3} idle state descriptions

2019-02-17 Thread Ghannam, Yazen
From: Yazen Ghannam The ACPI idle driver will fallback to using the legacy P_LVL* SystemIO method of entering C-states if the _CST method is disabled and P_BLK is defined. However, in this case the C2 and C3 states won't have a description set, so the user will see "" when reading the description

Re: [PATCH v2 1/2] x86/MCE: Add function to allow filtering of MCA errors

2019-03-22 Thread Ghannam, Yazen
On 3/22/2019 12:24 PM, Borislav Petkov wrote: > On Thu, Mar 21, 2019 at 08:25:17PM +0000, Ghannam, Yazen wrote: >> From: Yazen Ghannam >> >> Some systems may report spurious MCA errors. In general, spurious MCA >> errors may be disabled by clearing a particu

Re: [PATCH v2 2/2] x86/MCE/AMD: Don't report L1 BTB MCA errors on some Family 17h models

2019-03-22 Thread Ghannam, Yazen
On 3/22/2019 12:34 PM, Borislav Petkov wrote: > On Thu, Mar 21, 2019 at 08:25:18PM +0000, Ghannam, Yazen wrote: >> From: Yazen Ghannam >> >> AMD Family 17h Models 10h-2Fh may report a high number of L1 BTB MCA >> errors under certain conditions. The errors are benign a

RE: [PATCH v2 2/2] x86/MCE/AMD: Don't report L1 BTB MCA errors on some Family 17h models

2019-03-22 Thread Ghannam, Yazen
> -Original Message- > From: linux-edac-ow...@vger.kernel.org On > Behalf Of Borislav Petkov > Sent: Friday, March 22, 2019 2:32 PM > To: Ghannam, Yazen > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org; > tony.l...@intel.com; x...@kernel.org; ra

[PATCH v3 2/3] x86/MCE/AMD: Don't report L1 BTB MCA errors on some Family 17h models

2019-03-22 Thread Ghannam, Yazen
From: Yazen Ghannam AMD Family 17h Models 10h-2Fh may report a high number of L1 BTB MCA errors under certain conditions. The errors are benign and can safely be ignored. However, the high error rate may cause the MCA threshold counter to overflow causing a high rate of thresholding interrupts. I

[PATCH v3 3/3] x86/MCE: Group AMD function prototypes in <asm/mce.h>

2019-03-22 Thread Ghannam, Yazen
From: Yazen Ghannam There are two groups of "ifdef CONFIG_X86_MCE_AMD" function prototypes in <asm/mce.h>. Merge these two groups. No functional change. Signed-off-by: Yazen Ghannam --- v2->v3 * This patch is new and unrelated to the other two. I just happened to notice this issue when making other ch

[PATCH v3 1/3] x86/MCE: Add function to allow filtering of MCA errors

2019-03-22 Thread Ghannam, Yazen
From: Yazen Ghannam Some systems may report spurious MCA errors. In general, spurious MCA errors may be disabled by clearing a particular bit in MCA_CTL. However, clearing a bit in MCA_CTL may not be recommended for some errors, so the only option is to ignore them. An MCA error is printed and h

Re: [PATCH v3 2/3] x86/MCE/AMD: Don't report L1 BTB MCA errors on some Family 17h models

2019-03-22 Thread Ghannam, Yazen
On 3/22/2019 3:28 PM, Ghannam, Yazen wrote: > From: Yazen Ghannam > > AMD Family 17h Models 10h-2Fh may report a high number of L1 BTB MCA > errors under certain conditions. The errors are benign and can safely be > ignored. However, the high error rate may cause the MCA thresho

RE: [PATCH v3 2/3] x86/MCE/AMD: Don't report L1 BTB MCA errors on some Family 17h models

2019-03-22 Thread Ghannam, Yazen
> -Original Message- > From: linux-edac-ow...@vger.kernel.org On > Behalf Of Borislav Petkov > Sent: Friday, March 22, 2019 3:55 PM > To: Ghannam, Yazen > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org; > tony.l...@intel.com; x...@kernel.org; ra

RE: [PATCH v3 3/6] EDAC/amd64: Support more than two Unified Memory Controllers

2019-03-23 Thread Ghannam, Yazen
> -Original Message- > From: linux-edac-ow...@vger.kernel.org On > Behalf Of Borislav Petkov > Sent: Saturday, March 23, 2019 7:16 AM > To: Ghannam, Yazen > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org > Subject: Re: [PATCH v3 3/6] EDAC/amd64:

[PATCH v4 2/2] x86/MCE/AMD: Don't report L1 BTB MCA errors on some Family 17h models

2019-03-25 Thread Ghannam, Yazen
From: Yazen Ghannam AMD Family 17h Models 10h-2Fh may report a high number of L1 BTB MCA errors under certain conditions. The errors are benign and can safely be ignored. However, the high error rate may cause the MCA threshold counter to overflow causing a high rate of thresholding interrupts. I

[PATCH v4 1/2] x86/MCE: Add function to allow filtering of MCA errors

2019-03-25 Thread Ghannam, Yazen
From: Yazen Ghannam Some systems may report spurious MCA errors. In general, spurious MCA errors may be disabled by clearing a particular bit in MCA_CTL. However, clearing a bit in MCA_CTL may not be recommended for some errors, so the only option is to ignore them. An MCA error is printed and h

[PATCH] tools/power turbostat: Make interval calculation per thread to reduce jitter

2019-03-25 Thread Ghannam, Yazen
From: Yazen Ghannam Turbostat currently normalizes TSC and other values by dividing by an interval. This interval is the delta between the start of one global (all counters on all CPUs) sampling and the start of another. However, this introduces a lot of jitter into the data. In order to reduce
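The jitter described here comes from dividing a thread's counter delta by an interval that was not measured on that thread. The sketch below contrasts the two normalizations. All numbers are invented for illustration, the function names are hypothetical, and this is not turbostat's code.

```python
def rate_with_global_interval(delta_count, global_start, global_end):
    """Normalize by the interval between two global sampling starts."""
    return delta_count / (global_end - global_start)


def rate_with_thread_interval(delta_count, thread_start, thread_end):
    """Normalize by the interval between this thread's own two samples."""
    return delta_count / (thread_end - thread_start)


if __name__ == "__main__":
    # Illustrative case: a counter ticking at 1000 per second. The global
    # sampling passes start at t=0.0 and t=1.0, but this thread is actually
    # read at t=0.25 and t=1.5, so its counter advanced over 1.25 seconds.
    delta = 1250
    print(rate_with_global_interval(delta, 0.0, 1.0))
    print(rate_with_thread_interval(delta, 0.25, 1.5))
```

With the global interval the thread's rate is overstated whenever its own reads are spaced further apart than the global starts, and understated when they are closer; using the thread's own timestamps removes that error.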

[PATCH] EDAC/amd64: Use maximum channel count for the EDAC channel layer size

2019-03-25 Thread Ghannam, Yazen
From: Yazen Ghannam The AMD64 EDAC module currently hardcodes the EDAC channel layer size (count) to two. Future AMD systems may have more channels than this. Set the EDAC channel layer size equal to the maximum number of channels possible for the system. On Family 17h and later, this is set in th

[PATCH 1/2] x86/MCE/AMD: Export smca_get_bank_type()

2019-03-07 Thread Ghannam, Yazen
From: Yazen Ghannam Export the smca_get_bank_type() function so it can be used in the AMD MCE decoder module. Cc: # 4.14.x Signed-off-by: Yazen Ghannam --- arch/x86/include/asm/mce.h| 1 + arch/x86/kernel/cpu/mce/amd.c | 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a

[PATCH 2/2] x86/MCE/AMD, EDAC/mce_amd: Don't report L1 BTB MCA errors on some Family 17h models

2019-03-07 Thread Ghannam, Yazen
From: Yazen Ghannam AMD Family 17h Models 10h-2Fh may report a high number of L1 BTB MCA errors under certain conditions. The errors are benign and can safely be ignored. However, the high error rate may cause the MCA threshold counter to overflow causing a high rate of thresholding interrupts. I

RE: [PATCH v3 3/8] efi: Decode IA32/X64 Processor Error Info Structure

2018-04-02 Thread Ghannam, Yazen
> -Original Message- > From: Ard Biesheuvel > Sent: Friday, March 30, 2018 7:25 AM > To: Ghannam, Yazen > Cc: Borislav Petkov ; linux-...@vger.kernel.org; linux- > ker...@vger.kernel.org; x...@kernel.org; tony.l...@intel.com > Subject: Re: [PATCH v3 3/8] efi: Deco

RE: [PATCH] x86/MCE, EDAC/mce_amd: Save all aux registers on SMCA systems

2018-04-20 Thread Ghannam, Yazen
> -Original Message- > From: Borislav Petkov > Sent: Wednesday, April 18, 2018 1:14 PM > To: Ghannam, Yazen > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org; > tony.l...@intel.com; x...@kernel.org > Subject: Re: [PATCH] x86/MCE, EDAC/mce_amd: Save

RE: [PATCH 3/3] x86/MCE/AMD: Get address from already initialized block

2018-04-17 Thread Ghannam, Yazen
> -Original Message- > From: linux-edac-ow...@vger.kernel.org ow...@vger.kernel.org> On Behalf Of Johannes Hirte > Sent: Monday, April 16, 2018 7:56 AM > To: Ghannam, Yazen > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org; b...@suse.de; > t

RE: [PATCH] x86/MCE, EDAC/mce_amd: Save all aux registers on SMCA systems

2018-04-17 Thread Ghannam, Yazen
> -Original Message- > From: Borislav Petkov > Sent: Tuesday, April 17, 2018 1:21 PM > To: Ghannam, Yazen > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org; > tony.l...@intel.com; x...@kernel.org > Subject: Re: [PATCH] x86/MCE, EDAC/mce_amd: Save all au

RE: [RFC PATCH] x86/CPU/AMD: Bring back Compute Unit ID

2017-02-01 Thread Ghannam, Yazen
> -Original Message- > From: Borislav Petkov [mailto:b...@alien8.de] > Sent: Wednesday, February 1, 2017 3:03 PM > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c > index 548da5a8013e..f06fa338076b 100644 > --- a/arch/x86/kernel/smpboot.c > +++ b/arch/x86/kernel/smpboot.c

RE: [RFC PATCH] x86/CPU/AMD: Bring back Compute Unit ID

2017-02-01 Thread Ghannam, Yazen
> -Original Message- > From: Borislav Petkov [mailto:b...@alien8.de] > Sent: Wednesday, February 1, 2017 4:44 PM > > > To get around this we can set cu_id for all TOPOEXT systems, and update > > cpu_core_id, etc. for SMT enabled systems. This way we can just change > > cpu_core_id to cu_id

RE: [PATCH v2 4/4] x86/mce: Add AMD SMCA support to SRAO notifier

2017-03-21 Thread Ghannam, Yazen
> -Original Message- > From: Ghannam, Yazen > Sent: Monday, March 20, 2017 4:27 PM [...] > +/* Only support this on SMCA systems and errors logged from a UMC. */ > +static int mce_usable_address_amd(struct mce *m, unsigned long *pfn) { > + u8 umc; This should be

RE: [PATCH v2 4/4] x86/mce: Add AMD SMCA support to SRAO notifier

2017-03-22 Thread Ghannam, Yazen
> -Original Message- > From: Borislav Petkov [mailto:b...@alien8.de] > Sent: Wednesday, March 22, 2017 5:13 PM > To: Ghannam, Yazen > Cc: linux-e...@vger.kernel.org; Tony Luck ; > x...@kernel.org; linux-kernel@vger.kernel.org > Subject: Re: [PATCH v2 4/4] x86/mce: Ad

RE: [PATCH v2 2/4] x86/mce/AMD; EDAC,amd64: Move find_umc_channel() to AMD mcheck

2017-03-22 Thread Ghannam, Yazen
> -Original Message- > From: Borislav Petkov [mailto:b...@alien8.de] > Sent: Wednesday, March 22, 2017 5:17 PM > To: Ghannam, Yazen > Cc: linux-e...@vger.kernel.org; Tony Luck ; > x...@kernel.org; linux-kernel@vger.kernel.org > Subject: Re: [PATCH v2 2/4] x86/mce/

RE: [RFC PATCH] x86/CPU/AMD: Bring back Compute Unit ID

2017-02-02 Thread Ghannam, Yazen
> -Original Message- > From: Borislav Petkov [mailto:b...@alien8.de] > Sent: Thursday, February 2, 2017 7:11 AM > > Context switches have dropped, cache misses are the same and we have a > rise in cpu-migrations. That last bit is interesting and I don't have an > answer yet. Maybe peterz h

RE: [RFC PATCH] x86/CPU/AMD: Bring back Compute Unit ID

2017-02-02 Thread Ghannam, Yazen
> -Original Message- > From: Borislav Petkov [mailto:b...@alien8.de] > Sent: Thursday, February 2, 2017 1:11 PM > > Yazen, what BD generation is your machine? > The processors are revision C0. Also, I forgot to mention it's a 2P G34 system. Thanks, Yazen

RE: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in hardware

2019-05-16 Thread Ghannam, Yazen
> -Original Message- > From: Luck, Tony > Sent: Thursday, May 16, 2019 10:52 AM > To: Ghannam, Yazen > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org; b...@suse.de; > x...@kernel.org > Subject: Re: [PATCH v3 5/6] x86/MCE: Save MCA control bits that g

RE: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in hardware

2019-05-16 Thread Ghannam, Yazen
> -Original Message- > From: linux-edac-ow...@vger.kernel.org On > Behalf Of Borislav Petkov > Sent: Thursday, May 16, 2019 11:57 AM > To: Ghannam, Yazen > Cc: Luck, Tony ; linux-e...@vger.kernel.org; > linux-kernel@vger.kernel.org; x...@kernel.org > Subject: Re

RE: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in hardware

2019-05-16 Thread Ghannam, Yazen
> -Original Message- > From: linux-edac-ow...@vger.kernel.org On > Behalf Of Borislav Petkov > Sent: Thursday, May 16, 2019 12:21 PM > To: Ghannam, Yazen > Cc: Luck, Tony ; linux-e...@vger.kernel.org; > linux-kernel@vger.kernel.org; x...@kernel.org > Subject: Re

RE: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in hardware

2019-05-17 Thread Ghannam, Yazen
> -Original Message- > From: linux-edac-ow...@vger.kernel.org On > Behalf Of Borislav Petkov > Sent: Friday, May 17, 2019 5:10 AM > To: Luck, Tony > Cc: Ghannam, Yazen ; linux-e...@vger.kernel.org; > linux-kernel@vger.kernel.org; x...@kernel.org > Subject: Re

RE: [PATCH 2/8] EDAC/amd64: Support more than two controllers for chip selects handling

2019-06-13 Thread Ghannam, Yazen
> -Original Message- > From: Borislav Petkov > Sent: Thursday, June 13, 2019 9:17 AM > To: Ghannam, Yazen > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org > Subject: Re: [PATCH 2/8] EDAC/amd64: Support more than two controllers for > chip selects hand

RE: [PATCH 1/8] EDAC/amd64: Fix number of DIMMs and Chip Select bases/masks on Family17h

2019-06-13 Thread Ghannam, Yazen
> -Original Message- > From: Borislav Petkov > Sent: Thursday, June 13, 2019 8:58 AM > To: Ghannam, Yazen > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org > Subject: Re: [PATCH 1/8] EDAC/amd64: Fix number of DIMMs and Chip Select > bases/masks on Famil

RE: [PATCH 2/8] EDAC/amd64: Support more than two controllers for chip selects handling

2019-06-14 Thread Ghannam, Yazen
> -Original Message- > From: linux-kernel-ow...@vger.kernel.org > On Behalf Of Borislav Petkov > Sent: Thursday, June 13, 2019 5:23 PM > To: Ghannam, Yazen > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org > Subject: Re: [PATCH 2/8] EDAC/amd64:
