On 4/3/2025 10:48 AM, Alex Deucher wrote:
On Wed, Apr 2, 2025 at 11:12 PM Mario Limonciello <supe...@kernel.org> wrote:
From: Mario Limonciello <mario.limoncie...@amd.com>
An AMD RX580, when paired with an AMD Phenom II, has problems with overheating. This is due to
I don't think this is entirely accurate. I think the GPU gets hot
because the device hangs due to a problem with changing the PCIe
clocks.
changes with PCIe dynamic switching introduced by commit 466a7d115326e
("drm/amd: Use the first non-dGPU PCI device for BW limits").
To avoid the risk of other issues with old hardware, require at least Zen
hardware on the AMD side to enable PCIe dynamic switching.
I'm pretty sure PCIe reclocking worked on pre-Zen hardware. We've
supported this on our GPUs going back at least 15 or more years. I
suspect the actual problem is that some links may not reliably train
at the full bandwidth on some motherboards. Forcing a higher link
speed may cause problems.
It seems odd to me that it would advertise a higher link speed than it
could train at.
Maybe it would be better to limit the max
PCIe link rate to whatever the link is currently trained to. IIRC,
PCIe links will train at the fastest link possible by default. The
previous behavior was to limit the max clock to the slowest link in
the topology to save power, but then we changed it to use the fastest
link possible based on the PCIe link caps. Perhaps limiting it to the
fastest currently trained link rate would be better.
I mean that's essentially what happens when
amdgpu_device_pcie_dynamic_switching_supported() returns that it doesn't
work.
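The idea of capping the link rate at whatever the link is currently trained to can be sketched in plain C as below. This is only a userspace model, not driver code: clamp_max_gen() and LNKSTA_CLS_MASK are hypothetical names made up for illustration, and the sketch assumes the usual encoding of the Current Link Speed field in the low four bits of the PCIe Link Status register (1 = 2.5 GT/s, 2 = 5 GT/s, 3 = 8 GT/s). Treating a reading of 0 as "unknown" and keeping the capability-based limit is an assumption as well.

```c
#include <stdint.h>

/* Low four bits of the PCIe Link Status register: Current Link Speed.
 * Hypothetical local name; Linux calls this mask PCI_EXP_LNKSTA_CLS. */
#define LNKSTA_CLS_MASK 0x000f

/*
 * Hypothetical helper: return the PCIe generation the driver should
 * treat as its maximum, given the generation advertised by the link
 * capabilities (cap_gen) and the raw Link Status register (lnksta).
 */
static unsigned int clamp_max_gen(unsigned int cap_gen, uint16_t lnksta)
{
	unsigned int cur_gen = lnksta & LNKSTA_CLS_MASK;

	/* Assumption: 0 is not a valid trained speed, so keep the cap. */
	if (cur_gen == 0)
		return cap_gen;

	/* Never go above the speed the link actually trained to. */
	return cur_gen < cap_gen ? cur_gen : cap_gen;
}
```

With this model, a link that advertises Gen3 but trained at Gen1 would be limited to Gen1, so nothing would ever request a speed the board has not already shown it can hold.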
If your theory is right, maybe what we really need is a pile of DMI
quirks for the motherboards that are having this problem.
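For reference, a quirk list of that kind might look roughly like the userspace model below. In the kernel this would normally be a struct dmi_system_id table checked with dmi_check_system(); the sketch only mimics the lookup. The names board_quirk, no_dynamic_switching and board_blocks_dynamic_switching() are hypothetical, and the table entry is a placeholder, not a board known to be affected.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical quirk entry: DMI board vendor and board name strings. */
struct board_quirk {
	const char *vendor;
	const char *board;
};

/* Placeholder entry only; no real affected board is listed here. */
static const struct board_quirk no_dynamic_switching[] = {
	{ "Example Vendor", "EXAMPLE-BOARD-1" },
};

/*
 * Hypothetical helper: return true if the given board is on the list
 * of boards where PCIe dynamic switching should stay disabled.
 * Uses exact string comparison on both fields.
 */
static bool board_blocks_dynamic_switching(const char *vendor,
					   const char *board)
{
	size_t i;

	if (!vendor || !board)
		return false;

	for (i = 0; i < sizeof(no_dynamic_switching) /
			sizeof(no_dynamic_switching[0]); i++) {
		if (!strcmp(vendor, no_dynamic_switching[i].vendor) &&
		    !strcmp(board, no_dynamic_switching[i].board))
			return true;
	}
	return false;
}
```

The trade-off with this approach is maintenance: every newly reported board needs its own entry, whereas a CPU-family check covers a whole class of systems at once.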
Alex
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4098
Fixes: 466a7d115326e ("drm/amd: Use the first non-dGPU PCI device for BW limits")
Signed-off-by: Mario Limonciello <mario.limoncie...@amd.com>
---
v2:
* Cover more hardware
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a30111d2c3ea0..caa44ee788c8f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1854,6 +1854,9 @@ bool amdgpu_device_seamless_boot_supported(struct amdgpu_device *adev)
*
* https://edc.intel.com/content/www/us/en/design/products/platforms/details/raptor-lake-s/13th-generation-core-processors-datasheet-volume-1-of-2/005/pci-express-support/
* https://gitlab.freedesktop.org/drm/amd/-/issues/2663
+ *
+ * AMD Phenom II X6 1090T has a similar issue
+ * https://gitlab.freedesktop.org/drm/amd/-/issues/4098
*/
static bool amdgpu_device_pcie_dynamic_switching_supported(struct amdgpu_device *adev)
{
@@ -1866,6 +1869,8 @@ static bool amdgpu_device_pcie_dynamic_switching_supported(struct amdgpu_device
if (c->x86_vendor == X86_VENDOR_INTEL)
return false;
+ if (c->x86_vendor == X86_VENDOR_AMD && !cpu_feature_enabled(X86_FEATURE_ZEN))
+ return false;
#endif
return true;
}
--
2.43.0