>
> On 12.11.23 01:46, Phillip Susi wrote:
> > I had been testing some things on a post 6.6-rc5 kernel for a week or
> > two and then when I pulled to a post 6.6 release kernel, I found that
> > system suspend was broken. It seems that the radeon driver failed to
> > suspend, leaving the display d
The ACPI Boot Error Record Table (BERT) is used by the kernel to
report errors that occurred in a previous boot. On some modern AMD
systems, errors within the BERT are reported through the
x86 Common Platform Error Record (CPER) format which consists of one
or more Processor Context In
From: Avadhut Naik
Currently, exporting additional machine check error information
involves adding new fields at the end of struct mce.
This additional information can then be consumed through mcelog or
tracepoint.
However, as new MSRs are being added (and will be added in t
AMD systems with the SUCCOR feature can send an APIC LVT interrupt for
deferred errors. The LVT offset is 0x2 by convention, i.e. this is the
default as listed in hardware documentation.
However, the MCA registers may list a different LVT offset for this
interrupt. The kernel should honor the valu
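To make the idea of a register-provided LVT offset concrete, here is a minimal C sketch that extracts an offset field from a raw register value and compares it with the conventional default of 0x2. The field position (bits 7:4) and both helper names are assumptions made for illustration; they are not taken from the patch.

```c
#include <stdint.h>

/* Default deferred-error LVT offset by convention (per the text above). */
#define DFR_LVT_OFFSET_DEFAULT 0x2u

/* ASSUMPTION: the offset field sits in bits 7:4 of the register value. */
#define DFR_LVT_OFFSET_SHIFT 4
#define DFR_LVT_OFFSET_MASK  0xFu

/* Hypothetical helper: return the LVT offset reported by the hardware. */
static inline uint32_t dfr_lvt_offset(uint64_t intr_cfg)
{
	return (uint32_t)((intr_cfg >> DFR_LVT_OFFSET_SHIFT) & DFR_LVT_OFFSET_MASK);
}

/* Hypothetical helper: nonzero if the reported offset is not the default. */
static inline int dfr_lvt_offset_is_nondefault(uint64_t intr_cfg)
{
	return dfr_lvt_offset(intr_cfg) != DFR_LVT_OFFSET_DEFAULT;
}
```

A caller would program the APIC LVT entry using the value returned by dfr_lvt_offset() rather than a hard-coded 0x2.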
A new "FRU Text in MCA" feature is defined where the Field Replaceable
Unit (FRU) Text for a device is represented by a string in the new
MCA_SYND1 and MCA_SYND2 registers. This feature is supported per MCA
bank, and it is advertised by the McaFruTextInMca bit (MCA_CONFIG[9]).
The FRU Text is popu
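As a rough illustration of the feature described above, the following C sketch checks the McaFruTextInMca bit (MCA_CONFIG[9]) and assembles a 16-byte string from the two syndrome registers. The helper name, the byte order (least significant byte first), and the register order (MCA_SYND1, then MCA_SYND2) are assumptions, not details confirmed by the patch.

```c
#include <stdint.h>

/* From the text above: McaFruTextInMca is MCA_CONFIG[9]. */
#define MCA_CONFIG_FRUTEXT (1ULL << 9)

/* Two 64-bit registers hold 16 bytes of text. */
#define FRU_TEXT_LEN 16

/*
 * Hypothetical helper. ASSUMPTION: the string is stored as raw bytes,
 * MCA_SYND1 first and MCA_SYND2 second, least significant byte first.
 * 'out' must have room for FRU_TEXT_LEN + 1 bytes.
 * Returns 1 if FRU text was copied, 0 if the bank does not advertise it.
 */
static int fru_text_from_synd(uint64_t config, uint64_t synd1, uint64_t synd2,
			      char out[FRU_TEXT_LEN + 1])
{
	int i;

	if (!(config & MCA_CONFIG_FRUTEXT))
		return 0;

	for (i = 0; i < 8; i++) {
		out[i]     = (char)((synd1 >> (8 * i)) & 0xFF);
		out[i + 8] = (char)((synd2 >> (8 * i)) & 0xFF);
	}
	out[FRU_TEXT_LEN] = '\0';

	return 1;
}
```

Because the feature is per bank, a caller would run this check against each bank's own MCA_CONFIG value before interpreting that bank's syndrome registers as text.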
AMD systems optionally support MCA Thresholding which provides the
ability for hardware to send an interrupt when a set error threshold is
reached. This feature counts errors of all severities, but it is
commonly used to report correctable errors with an interrupt rather than
polling.
Scalable MCA
From: Avadhut Naik
AMD's Scalable MCA systems, viz. Genoa, will include two new registers:
MCA_SYND1 and MCA_SYND2.
These registers will include supplemental error information in addition
to the existing MCA_SYND register. The data within the registers is
considered valid if MCA_STATUS[SyndV] is s
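The validity rule above can be sketched in C as a guard around the supplemental registers. This is only an illustration: the SyndV bit position (bit 53), the struct, and the function name are assumptions, and the sketch assumes the condition is that the SyndV bit is set.

```c
#include <stdint.h>

/* ASSUMPTION: SyndV is bit 53 of MCA_STATUS. */
#define MCA_STATUS_SYNDV (1ULL << 53)

/* Hypothetical container for the supplemental syndrome data. */
struct synd_info {
	uint64_t synd1;
	uint64_t synd2;
	int valid;	/* nonzero only if SyndV was set */
};

/* Hypothetical helper: keep SYND1/SYND2 only when MCA_STATUS marks them valid. */
static struct synd_info read_synd(uint64_t status, uint64_t synd1, uint64_t synd2)
{
	struct synd_info info = { 0, 0, 0 };

	if (status & MCA_STATUS_SYNDV) {
		info.synd1 = synd1;
		info.synd2 = synd2;
		info.valid = 1;
	}

	return info;
}
```

Zeroing the fields when the bit is clear keeps stale register contents from being reported as error data.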
AMD systems optionally support a Deferred error interrupt. The interrupt
should be used as another signal to trigger MCA polling. This is similar
to how other MCA interrupts are handled.
Deferred errors do not require any special handling related to the
interrupt, e.g. resetting or rearming the in
Scalable MCA systems use the per-bank MCA_CONFIG register to enable
deferred error interrupts. This is done as part of SMCA configuration.
Currently, the deferred error interrupt handler is set up after SMCA
configuration.
Move the deferred error interrupt handler set up before SMCA
configuration
The "long names" for SMCA banks are only used by the MCE decoder module.
Move them out of the arch code and into the decoder module.
Signed-off-by: Yazen Ghannam
---
arch/x86/include/asm/mce.h    |  1 -
arch/x86/kernel/cpu/mce/amd.c | 74 ++-
drivers/edac/mce_am
Switch to bitops to help with clarity. Also, avoid an unnecessary
wrmsr() for SMCA systems.
Use the updated name for MSR 0xC000_0410 to match the documentation for
Family 0x17 and later systems.
This MSR is used for setting up both Deferred and MCA Thresholding
interrupts on current systems. So r
The current SMCA configuration function does more than just configure
SMCA features. It also detects and caches the SMCA bank types.
However, the bank type caching flow will be removed during the init path
clean up.
Define a new function that only configures SMCA features. This will
operate on th
AMD systems optionally support MCA Thresholding. This feature is
discovered by checking capability bits in the MCA_MISC* registers.
Currently, MCA Thresholding is set up in two passes. The first is during
CPU init where available banks are detected, and the "bank_map" variable
is updated. The seco
AMD systems optionally support an MCA Thresholding interrupt. The
interrupt should be used as another signal to trigger MCA polling. This
is similar to how the Intel Corrected Machine Check interrupt (CMCI) is
handled.
AMD MCA Thresholding is managed using the MCA_MISC registers within an
MCA bank
Scalable MCA systems use values within the MCA_IPID register to describe
a bank's type. Other information is not needed.
Currently, the bank types are cached during boot and this information is
used during boot and run time. The cached values are per-CPU and
per-bank. The boot path needs the cache
The type of a Scalable MCA bank should be determined solely using the
values in its MCA_IPID register.
Define and use a helper function to determine if a bank represents a GPU
Unified Memory Controller (UMC), and where the exact bank type is not
needed.
Use bitops and rename old mask until remov
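A helper of the kind described above might look like the following C sketch, which derives the answer from the MCA_IPID value alone. The field positions (HardwareID in bits 43:32, McaType in bits 63:48), the GPU UMC identifier values, and all names are assumptions for illustration and may not match the patch.

```c
#include <stdint.h>

/* ASSUMPTION: HardwareID is MCA_IPID[43:32] and McaType is MCA_IPID[63:48]. */
static inline uint32_t ipid_hwid(uint64_t ipid)
{
	return (uint32_t)((ipid >> 32) & 0xFFF);
}

static inline uint32_t ipid_mcatype(uint64_t ipid)
{
	return (uint32_t)((ipid >> 48) & 0xFFFF);
}

/* ASSUMPTION: illustrative identifier values for a GPU UMC bank. */
#define GPU_UMC_HWID    0x96u
#define GPU_UMC_MCATYPE 0x1u

/* Hypothetical helper: nonzero if the IPID value describes a GPU UMC bank. */
static inline int bank_is_gpu_umc(uint64_t ipid)
{
	return ipid_hwid(ipid) == GPU_UMC_HWID &&
	       ipid_mcatype(ipid) == GPU_UMC_MCATYPE;
}
```

Since the check needs only the register value, it works without any per-CPU or per-bank cache of bank types.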
Scalable MCA systems use values in the MCA_IPID register to describe the
type of hardware for an MCA bank. This information is used when
bank-specific actions or decoding are needed. Otherwise,
microarchitectural information, like MCA_STATUS bits, should be used.
Currently, the bank type informati
Quirks break micro-architectural definitions. Therefore, quirk
conditions don't need to follow micro-architectural requirements.
Currently, there is a quirk to filter some errors from the
Instruction Fetch (IF) unit on specific models. The IF unit is
represented by MCA bank 1 for these models. Rel
Current AMD systems may report MCA errors using the ACPI Boot Error
Record Table (BERT). The BERT entries for MCA errors will be an x86
Common Platform Error Record (CPER) with an MSR register context that
matches the MCAX/SMCA register space.
However, the BERT will not necessarily be processed on
Generally, MCA information for an error is gathered on the CPU that
reported the error. In this case, CPU-specific information from the
running CPU will be correct.
However, this will be incorrect if the MCA information is gathered while
running on a CPU that didn't report the error. One example i
AMD systems generally allow MCA "simulation" where MCA registers can be
written with valid data and the full MCA handling flow can be tested by
software.
However, the Platform on Scalable MCA systems may prevent software
from writing data to the MCA registers. There is no architectural way to
det
Hi all,
This set is a collection of logically independent updates that make
changes to common code. I've collected them to resolve conflicts and
ordering. Furthermore, this is the first half of a larger set. The
second half is focused on refactoring the AMD MCA Thresholding feature
support. So I d
Setting register to force ordering to prevent read/write or write/read
hazards for un-cached modes.
Signed-off-by: Alex Sierra
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 22 +--
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c |  8 +++
.../include/asic_reg/gc/gc_11_0_0