Fix the initialization and usage of capability values and mask.
SMU_CAPS_MASK indicates the mask value, and SMU_CAPS represents the
capability value.
Signed-off-by: Lijo Lazar
Fixes: 9bb53d2ce109 ("drm/amd/pm: Add capability flags for SMU v13.0.6")
---
.../drm/amd/pm/swsmu/smu13/smu_v13
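The value/mask distinction described in this patch can be illustrated with a minimal sketch. It assumes the capability value is an enum index and the mask is the single bit derived from it. All DEMO_* names and the demo_ctx structure are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability IDs; the driver's real list is not shown here. */
enum demo_cap_id {
	DEMO_CAP_DPM = 0,
	DEMO_CAP_RMA_MSG,
	DEMO_CAP_COUNT,
};

/* Capability value: the enum index itself. */
#define DEMO_CAPS(x)      (DEMO_CAP_##x)
/* Capability mask: the single bit derived from that value. */
#define DEMO_CAPS_MASK(x) (1ULL << DEMO_CAPS(x))

struct demo_ctx {
	uint64_t caps; /* bitmask of supported capabilities */
};

/* One-time initialization: OR masks (not values) into the caps word. */
static void demo_caps_init(struct demo_ctx *ctx, bool fw_has_rma)
{
	ctx->caps = DEMO_CAPS_MASK(DPM);
	if (fw_has_rma)
		ctx->caps |= DEMO_CAPS_MASK(RMA_MSG);
}

/* Query takes a capability value and converts it to a mask internally. */
static bool demo_cap_supported(const struct demo_ctx *ctx, enum demo_cap_id cap)
{
	assert(cap < DEMO_CAP_COUNT);
	return (ctx->caps & (1ULL << cap)) != 0;
}
```

Mixing the two up, for example OR-ing DEMO_CAPS(RMA_MSG) (the value 1) into the caps word, would set the bit of a different capability, which is the class of error the fix addresses.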
Add capability flags for SMU v13.0.6 variants. Initialize the flags
based on firmware support. As there are multiple IP versions maintained,
it is more manageable to initialize the caps flags once, based on
IP version and firmware feature support.
Signed-off-by: Lijo Lazar
---
drivers/gpu
RRMT could get dynamically enabled/disabled by PSP firmware. Read the
RRMT status from the register. For VFs, the register is not
accessible, hence assume that RRMT is always disabled for now.
Signed-off-by: Lijo Lazar
Reviewed-by: Sathishkumar S
---
drivers/gpu/drm/amd/amdgpu/amdgpu_
RRMT could get dynamically enabled/disabled by PSP firmware. Read the
RRMT status from the register. For VFs, the register is not
accessible, hence assume that RRMT is always disabled for now.
Signed-off-by: Lijo Lazar
Reviewed-by: Sathishkumar S
---
drivers/gpu/drm/amd/a
Add RRMT control register offset for VCN v4.0.3
Signed-off-by: Lijo Lazar
Reviewed-by: Sathishkumar S
---
drivers/gpu/drm/amd/include/asic_reg/vcn/vcn_4_0_3_offset.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/include/asic_reg/vcn
Context empty interrupt is enabled for SDMA 4.4.2. Add a handler for
context empty interrupt so that it is disposed of fast, and not
propagated to KFD layer.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h | 1 +
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 22
'add ip block' causes confusion if the blocks are disabled later with
ip_block_mask. Instead change to 'detected' and also add device context.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++--
1 file changed, 2 insertions(+), 2 deleti
Driver has different ways to fetch VBIOS. If one of the methods doesn't
find an authentic one, it will show misleading info messages even though
a subsequent method finds a valid VBIOS. Keep the message level at debug
and add device context.
Signed-off-by: Lijo Lazar
---
drivers/gpu/dr
VF device sets the RAS flag when mailbox data can't be read properly.
There is no conclusive way to tell if the real source is RAS error.
Therefore VF schedules a KFD based reset which doesn't set RAS source.
Skip checking RAS source for any VF scheduled recovery.
Signed-off-by:
-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/pm/amdgpu_dpm.c | 58 +
1 file changed, 35 insertions(+), 23 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
index 4d90e3f0bd17..6a9e26905edf 100644
--- a/drivers/gpu/drm/amd
Some boards use longer File Ids.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.h
index bc58dca18035
FRU info is expected to be non-NULL if FRU sys files are created.
Simplify the check.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
b/drivers/gpu
GFXOFF is not valid for these IP versions. Also, SDMA v4.4.2 is not in
GFX domain.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 4
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 2 --
2 files changed, 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu
As per power team, there is no need to impose a lower bound on arcturus
power limit. Any unreasonable limit set will result in frequent
throttling.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)
diff
Write pointer could be 32-bit or 64-bit. Use the correct size during
initialization.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
b/drivers/gpu/drm
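As an illustration of the write-pointer sizing issue, here is a minimal sketch that clears only as many bytes as the pointer actually occupies. The demo_queue structure and its wptr_is_64bit flag are hypothetical stand-ins, not the kernel queue's real fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical queue descriptor; wptr_mem is the write pointer's storage. */
struct demo_queue {
	void *wptr_mem;
	bool wptr_is_64bit;
};

/* Size of the write pointer for this queue: 8 bytes or 4 bytes. */
static size_t demo_wptr_size(const struct demo_queue *q)
{
	return q->wptr_is_64bit ? sizeof(uint64_t) : sizeof(uint32_t);
}

/* Clear exactly the bytes that belong to the write pointer. */
static void demo_wptr_init(struct demo_queue *q)
{
	memset(q->wptr_mem, 0, demo_wptr_size(q));
}
```

With a fixed 4-byte clear, the upper half of a 64-bit write pointer would keep whatever was in memory before.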
ed to look for a fatal error. Skip fatal error checking
in such cases.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/aldebaran.c| 2 +
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 15 -
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 55 ++-
drivers/gp
o identify post reset reinitialization
phase. This only provides a device level identification, IP/features may
choose to track their state independently also.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/aldebaran.c | 4
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
d
Some in_reset checks are in fact checking whether the state is
reinitialization after reset. Replace with reset_in_recovery calls to
identify that it's really checking for recovery stage after reset.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
driver
Reset sequence indicates that hardware already ran into a bad state.
Avoid sending unmap queue request to reset KCQ. This will also cover RAS
error scenarios which need a reset to recover, hence remove the check.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 10
newer code.
Signed-off-by: Lijo Lazar
Fixes: 6c10b5cc4eaa ("drm/amdgpu: Remove duplicate code in gfx_v8_0.c")
---
v2: Add same changes to map queue also (Le Ma)
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 13 -
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 63 +++--
code.
Signed-off-by: Lijo Lazar
Fixes: 6c10b5cc4eaa ("drm/amdgpu: Remove duplicate code in gfx_v8_0.c")
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 13 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 47 ++
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c |
For DPX mode, the number of memory partitions supported should be less
than or equal to 2.
Signed-off-by: Lijo Lazar
Fixes: 1589c82a1085 ("drm/amdgpu: Check memory ranges for valid xcp mode")
---
drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c | 2 +-
1 file changed, 1 insertion(+),
For RAS errors, source of error is known. Skip the core dump of IP
states.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index
Populate the compatible NPS modes also for providing partition
configuration details through sysfs.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.h| 1 +
drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c | 11 +++
2 files changed, 12 insertions(+)
diff --git a
Make amdgpu_gfx_sysfs_init/fini functions as common entry points for all
gfx related sysfs nodes.
Signed-off-by: Lijo Lazar
---
v2: Check cleaner shader capability only for creation of run_cleaner_shader
attribute.
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 36 -
drivers
Make amdgpu_gfx_sysfs_init/fini functions as common entry points for all
gfx related sysfs nodes.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 37 ++---
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 2 --
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 5
function.
Signed-off-by: Lijo Lazar
Reported-by: Hao Zhou
Fixes: 1b665567fd6d ("drm/amdgpu: Add reset on init handler for XGMI")
---
v2: Rename save function to a more appropriate amdgpu_vcn_save_vcpu_bo (Leo)
drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 6 ++
drivers/gpu/drm/
On a hive, NPS request is placed by the first one for all devices in the
hive. If the request fails, mark the mode as UNKNOWN so that subsequent
devices on unload don't request it. Also, fix the mutex double lock
issue in error condition, should have been mutex_unlock.
Signed-off-by: Lijo
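The two fixes described here (marking the mode UNKNOWN on failure, and releasing rather than re-taking the lock on the error path) can be sketched as below. This is a user-space illustration only: demo_lock is a toy lock that asserts on misuse instead of a kernel mutex, and every name in it is hypothetical.

```c
#include <assert.h>

/* Toy lock that traps on double acquire, standing in for a kernel mutex. */
struct demo_lock {
	int held;
};

static void demo_lock_acquire(struct demo_lock *l)
{
	assert(!l->held); /* a second acquire here would be the double-lock bug */
	l->held = 1;
}

static void demo_lock_release(struct demo_lock *l)
{
	assert(l->held);
	l->held = 0;
}

enum demo_nps_mode { DEMO_NPS_UNKNOWN = 0, DEMO_NPS1, DEMO_NPS4 };

struct demo_hive {
	struct demo_lock lock;
	enum demo_nps_mode requested; /* last requested mode for the hive */
};

/* Stubs standing in for the firmware request; 0 means success. */
static int demo_fw_ok(enum demo_nps_mode m)   { (void)m; return 0; }
static int demo_fw_fail(enum demo_nps_mode m) { (void)m; return -1; }

static int demo_place_nps_request(struct demo_hive *hive,
				  int (*fw_request)(enum demo_nps_mode))
{
	int ret = 0;

	demo_lock_acquire(&hive->lock);
	if (hive->requested == DEMO_NPS_UNKNOWN)
		goto out; /* nothing pending, or an earlier device already failed */

	ret = fw_request(hive->requested);
	if (ret)
		hive->requested = DEMO_NPS_UNKNOWN; /* later devices skip the request */
out:
	demo_lock_release(&hive->lock); /* error path unlocks, never locks again */
	return ret;
}
```

Routing both the success and the failure path through a single unlock label keeps the lock balanced regardless of which branch is taken.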
function.
Signed-off-by: Lijo Lazar
Reported-by: Hao Zhou
Fixes: 1b665567fd6d ("drm/amdgpu: Add reset on init handler for XGMI")
---
drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 6 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 26 ++-
drivers/gpu/drm/amd/amdgpu/am
Zero-initialize mqd backup memory, otherwise the check for
'already-backed-up' could go wrong.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
b/d
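Why zero-initialization matters for an 'already-backed-up' check can be shown with a small sketch. It assumes, purely for illustration, that a non-zero header word marks a valid backup; the demo_mqd layout and helper names are hypothetical and do not reflect the real MQD format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical descriptor layout, for illustration only. */
struct demo_mqd {
	uint32_t header;
	uint32_t data[7];
};

/* calloc returns zeroed memory, so header == 0 reliably means "no backup". */
static struct demo_mqd *demo_backup_alloc(void)
{
	return calloc(1, sizeof(struct demo_mqd));
}

/* Assumed convention: a non-zero header marks a valid backup. */
static bool demo_backup_valid(const struct demo_mqd *b)
{
	return b->header != 0;
}

/* Save the live descriptor only if no backup has been taken yet. */
static void demo_backup_save(struct demo_mqd *b, const struct demo_mqd *live)
{
	if (!demo_backup_valid(b))
		memcpy(b, live, sizeof(*b));
}
```

Had the buffer come from an allocator that does not clear memory, demo_backup_valid() could report a backup that was never taken and the first save would be skipped.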
In certain cases - ex: when a reset is required on initialization - XCP
manager won't have a valid partition mode. In such cases, use SPX as the
default selected mode for which partition configuration details are
populated.
Signed-off-by: Lijo Lazar
Reported-by: Hao Zhou
Fixes: c7de570
Add dynamic NPS switch support for GC 9.4.3 variants. Only GC v9.4.3 and
GC v9.4.4 currently support this. NPS switch is only supported if an SOC
supports multiple NPS modes.
Signed-off-by: Lijo Lazar
Signed-off-by: Rajneesh Bhardwaj
Reviewed-by: Feifei Xu
---
v2: Add NULL check for
When reset on initialization is requested, wait for the reset to finish.
In cases where module is loaded after boot, this makes sure all
initialization work is done after a successful return of modprobe.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 9 -
1
Avoid comparing TOS version on APUs. On APUs driver doesn't take care of
TOS load.
Fixes: 2edc5ecbf1a9 ("drm/amdgpu: Add interface for TOS reload cases")
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-
Add dynamic NPS switch support for GC 9.4.3 variants. Only GC v9.4.3 and
GC v9.4.4 currently support this. NPS switch is only supported if an SOC
supports multiple NPS modes.
Signed-off-by: Lijo Lazar
Signed-off-by: Rajneesh Bhardwaj
Reviewed-by: Feifei Xu
---
drivers/gpu/drm/amd/amdgpu
Add a callback to check if there is any condition detected by GMC block
for reset on init. One case is if a pending NPS change request is
detected. If reset is done because of NPS switch, refresh NPS info from
discovery table.
Signed-off-by: Lijo Lazar
---
v2:
Move NPS request check ahead of TOS
memory partition sysfs logic to be more
generic.
Signed-off-by: Lijo Lazar
Reviewed-by: Rajneesh Bhardwaj
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 114
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 6 ++
2 files changed, 104 insertions(+), 16 deletions(-)
diff -
If a user has requested NPS mode switch, place the request through PSP
during unload of the driver. For devices which are part of a hive, all
requests are placed together. If one of them fails, revert back to the
current NPS mode.
Signed-off-by: Lijo Lazar
Signed-off-by: Rajneesh Bhardwaj
Add a common interface in GMC to request NPS mode through PSP. Also add
a variable in hive and gmc control to track the last requested mode.
Signed-off-by: Rajneesh Bhardwaj
Signed-off-by: Lijo Lazar
Reviewed-by: Feifei Xu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 16
In certain use cases, NPS data needs to be refreshed again from
discovery table. Add API parameter to refresh NPS data from discovery
table.
Signed-off-by: Lijo Lazar
Reviewed-by: Rajneesh Bhardwaj
---
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 68 +++
drivers/gpu/drm/amd
Implement PSP ring command interface for memory partitioning on the fly
on the supported asics.
Signed-off-by: Rajneesh Bhardwaj
Reviewed-by: Feifei Xu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 25 +
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 1 +
drivers/gpu/drm/amd
eifei)
Lijo Lazar (7):
drm/amdgpu: Add option to refresh NPS data
drm/amdgpu: Add PSP interface for NPS switch
drm/amdgpu: Add gmc interface to request NPS mode
drm/amdgpu: Add sysfs interfaces for NPS mode
drm/amdgpu: Place NPS mode request on unload
drm/amdgpu: Check gmc requiremen
Enable sysfs node for current compute partition mode on VFs also.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 29 +++--
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 12 --
2 files changed, 31 insertions(+), 10 deletions(-)
diff --git a
Use the memory ranges published in discovery table to deduce NPS mode
of GC v9.4.3 VFs.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 12 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 2 +-
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 30
Add dynamic NPS switch support for GC 9.4.3 variants. Only GC v9.4.3 and
GC v9.4.4 currently support this. NPS switch is only supported if an SOC
supports multiple NPS modes.
Signed-off-by: Lijo Lazar
Signed-off-by: Rajneesh Bhardwaj
---
drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h | 1 +
drivers
Add a callback to check if there is any condition detected by GMC block
for reset on init. One case is if a pending NPS change request is
detected. If reset is done because of NPS switch, refresh NPS info from
discovery table.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu
memory partition sysfs logic to be more
generic.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 114
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 6 ++
2 files changed, 104 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/amd/
Add a common interface in GMC to request NPS mode through PSP. Also add
a variable in hive and gmc control to track the last requested mode.
Signed-off-by: Rajneesh Bhardwaj
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 16
drivers/gpu/drm/amd/amdgpu
Implement PSP ring command interface for memory partitioning on the fly
on the supported asics.
Signed-off-by: Rajneesh Bhardwaj
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 25 +
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 1 +
drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h | 1
In certain use cases, NPS data needs to be refreshed again from
discovery table. Add API parameter to refresh NPS data from discovery
table.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 68 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h | 2
ch is pending and initiates a mode-1
reset.
7) During resume after a reset, NPS ranges are read again from discovery table.
8) Driver detects the new NPS mode and makes a compatible compute partition mode
switch if required.
Lijo Lazar (7):
drm/amdgpu: Add option to refresh NPS data
drm/amdgpu: Ad
Fix instance mask calculation for VCN IP. There are cases where VCN
instance could be shared across partitions. Fix here so that other
blocks don't need to check for any shared instances based on partition
mode.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c
A reset on initialization will be needed if a PSP TOS different from the
one currently active on the system needs to be loaded. This is possible
only on SOCs which support a full device reset which results in unload
of active PSP TOS.
Signed-off-by: Lijo Lazar
Reviewed-by: Feifei Xu
Reviewed-by: Alex
Drop delayed reset work handler as it is no longer used.
Signed-off-by: Lijo Lazar
Reviewed-by: Feifei Xu
Reviewed-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 4 --
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 80 -
2 files changed, 84 deletions
Add a separate function to read badpage data during initialization.
Reading bad pages will need hardware access and cannot be done during
reset. Hence in cases where device needs a full reset during
init itself, attempting to read will cause a deadlock.
Signed-off-by: Lijo Lazar
Reviewed-by
Add interface to check if a different TOS needs to be loaded than the
one which is already active on the SOC. Presently the interface
is restricted to specific variants of PSPv13.0.
Signed-off-by: Lijo Lazar
Reviewed-by: Feifei Xu
Reviewed-by: Alex Deucher
---
drivers/gpu/drm/amd
Add XGMI reset on init support to aldebaran and SOCs with GC v9.4.3.
Signed-off-by: Lijo Lazar
Reviewed-by: Feifei Xu
Reviewed-by: Alex Deucher
---
v2:
Use renamed variable
drivers/gpu/drm/amd/amdgpu/aldebaran.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd
Use XGMI hive information to rely on resetting XGMI devices on
initialization rather than using mgpu structure. mgpu structure may have
other devices as well.
Signed-off-by: Lijo Lazar
Reviewed-by: Feifei Xu
---
v2:
Use consistent naming scheme for functions/variables (Alex Deucher
In some cases, device needs to be reset before first use. Add handlers
for doing device reset during driver init sequence.
Signed-off-by: Lijo Lazar
Reviewed-by: Feifei Xu
---
v2:
Use consistent naming scheme for functions/variables (Alex Deucher)
drivers/gpu/drm/amd/amdgpu/amdgpu.h
Move the reinitialization part after a reset to another function. No
functional changes.
Signed-off-by: Lijo Lazar
Reviewed-by: Feifei Xu
Acked-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h| 2 +
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 150 -
2
Drop pending_reset flag in gmc block. Instead use init level to
determine which type of init is preferred - in this case MINIMAL.
Signed-off-by: Lijo Lazar
---
v2:
Fix logical issue while replacing pending_reset flag in smuv11 (Feifei)
Use renamed init level id
Add init levels to define the level to which device needs to be
initialized.
Signed-off-by: Lijo Lazar
---
v2:
Add comments describing init levels
Drop unnecessary assignment
Rename AMDGPU_INIT_LEVEL_MINIMAL to AMDGPU_INIT_LEVEL_MINIMAL_XGMI
drivers/gpu/drm/amd/amdgpu
scenario where device is going to be reset.
The series adds an API interface to check if a PSP TOS reload is required.
v2:
Fix logical issue while replacing pending_reset flag with init level
Use consistent naming for functions/variables
Lijo Lazar (10):
drm/amdgpu: Add init
EXTERNAL_REG_INTERNAL_OFFSET/EXTERNAL_REG_WRITE_ADDR should be used in
pairs. If an external register shouldn't be written, both packets
shouldn't be sent.
Fixes: a78b48146972 ("drm/amdgpu: Skip PCTL0_MMHUB_DEEPSLEEP_IB write in
jpegv4.0.3 under SRIOV")
Signed-off-by: Lijo
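The "both packets or neither" rule can be sketched with a toy ring buffer. The packet opcodes, the demo_ring structure and the 'allowed' flag are hypothetical placeholders; only the pairing logic is the point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder opcodes standing in for the two paired packets. */
#define DEMO_PKT_INTERNAL_OFFSET 0x1u
#define DEMO_PKT_WRITE_ADDR      0x2u

/* Toy ring buffer, for illustration only. */
struct demo_ring {
	uint32_t buf[16];
	unsigned int count;
};

static void demo_ring_write(struct demo_ring *r, uint32_t v)
{
	assert(r->count < 16);
	r->buf[r->count++] = v;
}

/* Emit the offset packet and the write packet together, or neither. */
static void demo_emit_ext_reg_write(struct demo_ring *r, bool allowed,
				    uint32_t offset, uint32_t addr)
{
	if (!allowed)
		return; /* skip the whole pair, never just one half */

	demo_ring_write(r, DEMO_PKT_INTERNAL_OFFSET);
	demo_ring_write(r, offset);
	demo_ring_write(r, DEMO_PKT_WRITE_ADDR);
	demo_ring_write(r, addr);
}
```

Placing the condition around the pair as a whole avoids leaving one packet in the ring without its counterpart.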
A reset on initialization will be needed if a PSP TOS different from the
one currently active on the system needs to be loaded. This is possible
only on SOCs which support a full device reset which results in unload
of active PSP TOS.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/soc15.c
Add XGMI reset on init support to aldebaran and SOCs with GC v9.4.3.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/aldebaran.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/aldebaran.c
b/drivers/gpu/drm/amd/amdgpu/aldebaran.c
index b0f95a7649bf
Add interface to check if a different TOS needs to be loaded than the
one which is already active on the SOC. Presently the interface
is restricted to specific variants of PSPv13.0.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 13 +
drivers/gpu
Drop delayed reset work handler as it is no longer used.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 4 --
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 80 -
2 files changed, 84 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
b
In some cases, device needs to be reset before first use. Add handlers
for doing device reset during driver init sequence.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 148 ++
drivers/gpu/drm/amd
Use XGMI hive information to rely on resetting XGMI devices on
initialization rather than using mgpu structure. mgpu structure may have
other devices as well.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +--
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c| 6
Move the reinitialization part after a reset to another function. No
functional changes.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h| 2 +
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 150 -
2 files changed, 89 insertions(+), 63 deletions
Add a separate function to read badpage data during initialization.
Reading bad pages will need hardware access and cannot be done during
reset. Hence in cases where device needs a full reset during
init itself, attempting to read will cause a deadlock.
Signed-off-by: Lijo Lazar
---
drivers/gpu
Drop pending_reset flag in gmc block. Instead use init level to
determine which type of init is preferred - in this case MINIMAL.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 33 ---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 -
drivers
Add init levels to define the level to which device needs to be
initialized.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h| 14 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 54 ++
2 files changed, 68 insertions(+)
diff --git a/drivers/gpu
scenario where device is going to be reset.
The series adds an API interface to check if a PSP TOS reload is required.
Lijo Lazar (10):
drm/amdgpu: Add init levels
drm/amdgpu: Use init level for pending_reset flag
drm/amdgpu: Separate reinitialization after reset
drm/amdgpu: Add reset
On VFs and SOCs with GC 9.4.4, VCN RRMT is disabled.
Only local register offsets should be used on JPEG v4.0.3 as they cannot
handle remote access to other AIDs. Since only local offsets are used,
the special write to MCM_ADDR register is no longer needed.
Signed-off-by: Lijo Lazar
---
v2
Add p2s table support for a new revision of SMUv13.0.6.
Signed-off-by: Lijo Lazar
Reviewed-by: Hawking Zhang
Reviewed-by: Asad Kamal
---
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 7 ++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/pm/swsmu
Only local register offsets should be used on JPEG v4.0.3 as they cannot
handle remote access to other AIDs. Since only local offsets are used,
the special write to MCM_ADDR register is no longer needed.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c | 32
On EFI BIOSes, PCI ROM may be exported through EFI_PCI_IO_PROTOCOL and
expansion ROM BARs may not be enabled. Choose to read from EFI exported
ROM data before reading PCI Expansion ROM BAR.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 10 +-
1 file changed, 5
If there are multiple nodes per kfd device, add nodeid to location_id to
differentiate.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
b/drivers/gpu/drm/amd
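A minimal sketch of the idea, assuming the node index is simply added to a per-device base id when the device exposes more than one node. The helper and its parameters are hypothetical, and the actual patch may combine the values differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: 'base_location_id' is the per-device id and
 * 'node_id' the index of the node within that device.
 */
static uint32_t demo_location_id(uint32_t base_location_id,
				 unsigned int num_nodes, unsigned int node_id)
{
	assert(node_id < num_nodes);
	if (num_nodes > 1)
		return base_location_id + node_id; /* distinct id per node */
	return base_location_id; /* single node: id unchanged */
}
```

Single-node devices keep their existing id, so only multi-node devices see a change.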
Spurious events are seen; temporarily ignore the events altogether.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
b/drivers/gpu/drm/amd/pm/swsmu/smu13
For SOCs with GFX v9.4.3, a VF may have multiple compute partitions.
Fetch the partition information during init and initialize partition
nodes. There is no support to switch partition mode in VF mode, hence
disable the same.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
Convert some pr_* to some dev_* APIs to identify the device.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c | 3 +-
drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c | 21 ---
drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c | 8 ++-
.../gpu/drm/amd/amdkfd
Cache the PCI state before bus master is disabled. The saved state is
later used for other cases like restoring config space after mode-2
reset.
Signed-off-by: Lijo Lazar
Fixes: 5c03e5843e6b ("drm/amdgpu:add smu mode1/2 support for aldebaran")
---
drivers/gpu/drm/amd/amdgpu/amdgpu_de
If reg list is already loaded on PSP 13.0.2 SOCs, psp will give
TEE_ERR_CANCEL response on second time load. Avoid printing warn
message for it.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 25 +
drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h | 5
Skip scheduling coredump when gpu reset is intentionally triggered
through debugfs.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
KFD uses crc16 for gpu_id generation.
Fixes: 6dbc6469ab0b ("drm/amdkfd: Ensure gpu_id is unique")
Reported-by: kernel test robot
Closes:
https://lore.kernel.org/oe-kbuild-all/202405211405.tidtwibx-...@intel.com/
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/Kconfig | 1
On arcturus, allow changing xgmi plpd policy through
'pm_policy/xgmi_plpd' sysfs interface.
Signed-off-by: Lijo Lazar
Reviewed-by: Hawking Zhang
Reviewed-by: Asad Kamal
---
drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 7 ++--
.../gpu/drm/amd/pm/swsmu/smu11/arcturus_
Remove unused callback to set PLPD policy and its implementation from
arcturus, aldebaran and SMUv13.0.6 SOCs.
Signed-off-by: Lijo Lazar
Reviewed-by: Hawking Zhang
Reviewed-by: Asad Kamal
---
drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 6 ---
.../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
Replace the legacy interface with amdgpu_dpm_set_pm_policy to set XGMI
PLPD mode. Also, xgmi_plpd_policy sysfs node is not used by any client.
Remove that as well.
Signed-off-by: Lijo Lazar
Reviewed-by: Hawking Zhang
Reviewed-by: Asad Kamal
---
v2: No change
v3: Rebase to remove
Add documentation about the newly added pm_policy node in sysfs.
Signed-off-by: Lijo Lazar
---
v5: Update documentation to reflect pm_policy nodes and sub nodes for each
policy type
Documentation/gpu/amdgpu/thermal.rst | 6
drivers/gpu/drm/amd/pm/amdgpu_pm.c | 53
On aldebaran, allow changing xgmi plpd policy through
'pm_policy/xgmi_plpd' sysfs interface.
Signed-off-by: Lijo Lazar
Reviewed-by: Hawking Zhang
Reviewed-by: Asad Kamal
---
.../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c| 36 +++
1 file changed, 36 insertions(+)
di
Add support to select pstate policy in SOCs with SMUv13.0.6
Signed-off-by: Lijo Lazar
Reviewed-by: Hawking Zhang
Reviewed-by: Asad Kamal
---
v2,v3: No change
v4: Use macro for policy type name
.../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c| 2 +
.../drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c