Re: [PATCH v9 00/53] fix CONFIG_DRM_USE_DYNAMIC_DEBUG=y

2024-07-16 Thread jim . cromie
On Mon, Jul 15, 2024 at 4:05 AM Łukasz Bartosik  wrote:
>
> On Sat, Jul 13, 2024 at 11:45 PM  wrote:
> >
> > On Fri, Jul 12, 2024 at 9:44 AM Łukasz Bartosik  wrote:
> > >
> > > On Wed, Jul 3, 2024 at 12:14 AM  wrote:
> > > >
> > > > On Tue, Jul 2, 2024 at 4:01 PM Luis Chamberlain  
> > > > wrote:
> > > > >
> > > > > On Tue, Jul 02, 2024 at 03:56:50PM -0600, Jim Cromie wrote:
> > > > > > This fixes dynamic-debug support for DRM.debug, added via classmaps.
> > > > > > commit bb2ff6c27bc9 (drm: Disable dynamic debug as broken)
> > > > > >
> > > > > > CONFIG_DRM_USE_DYNAMIC_DEBUG=y was marked broken because 
> > > > > > drm.debug=val
> > > > > > was applied when drm.ko was modprobed; too early for the yet-to-load
> > > > > > drivers, which thus missed the enablement.  My testing with
> > > > > > /etc/modprobe.d/ entries and modprobes with dyndbg=$querycmd options
> > > > > > obscured this omission.
> > > > > >
> > > > > > The fix is to replace invocations of DECLARE_DYNDBG_CLASSMAP with
> > > > > > DYNDBG_CLASSMAP_DEFINE for core, and DYNDBG_CLASSMAP_USE for 
> > > > > > drivers.
> > > > > > The distinction allows dyndbg to also handle the users properly.
> > > > > >
> > > > > > DRM is the only current classmaps user, and is not really using it,
> > > > > > so if you think DRM could benefit from zero-cost-when-off debugs
> > > > > > based on static-keys, please test.
> > > > > >
> > > > > > HISTORY
> > > > > >
> > > > > > 9/4/22  - ee879be38bc8..ace7c4bbb240 committed - classmaps-v1 dyndbg 
> > > > > > parts
> > > > > > 9/11/22 - 0406faf25fb1..16deeb8e18ca committed - classmaps-v1 drm 
> > > > > > parts
> > > > > >
> > > > > > https://lore.kernel.org/lkml/y3xurogav4i7b...@kroah.com/
> > > > > > greg k-h says:
> > > > > > This should go through the drm tree now.  The rest probably should 
> > > > > > also
> > > > > > go that way and not through my tree as well.
> > > > >
> > > > > Can't this just be defined as a coccinelle smpl patch? Much easier
> > > > > to read than 53 patches?
> > > > >
> > > >
> > > > perhaps it could - I'm not sure that would be easier to review
> > > > than a file-scoped struct declaration or reference per driver
> > > >
> > > > Also, I did it hoping to solicit more Tested-by:s with drm.debug=0x1ff
> > > >
> > > > Jim
> > > >
> > >
> > > Jim,
> > >
> > > When testing different combinations of Y/M for TEST_DYNAMIC_DEBUG and
> > > TEST_DYNAMIC_DEBUG_SUBMOD in virtme-ng I spotted test failures:
> > >
> > > When the TEST_DYNAMIC_DEBUG=M and TEST_DYNAMIC_DEBUG_SUBMOD=M -
> > > BASIC_TESTS, COMMA_TERMINATOR_TESTS, TEST_PERCENT_SPLITTING,
> > > TEST_MOD_SUBMOD selftests passed
> > > When the TEST_DYNAMIC_DEBUG=Y and TEST_DYNAMIC_DEBUG_SUBMOD=M -
> > > BASIC_TESTS, COMMA_TERMINATOR_TESTS selftests passed, however
> > > TEST_PERCENT_SPLITTING selftest fails with ": ./dyndbg_selftest.sh:270
> > > check failed expected 1 on =pf, got 0"
> > > When the TEST_DYNAMIC_DEBUG=Y and TEST_DYNAMIC_DEBUG_SUBMOD=Y -
> > > BASIC_TESTS, COMMA_TERMINATOR_TESTS selftests passed, however
> > > TEST_PERCENT_SPLITTING selftest fails also with ":
> > > ./dyndbg_selftest.sh:270 check failed expected 1 on =pf, got 0"
> > >
> > > Have I missed something ?
> > >
> >
> > I am not seeing those 2 failures on those 2 configs.
> >
> > most of my recent testing has been on x86-defconfig + minimals,
> > built and run using/inside virtme-ng
> >
> > the last kernel I installed on this hw was June 16; I will repeat that,
> > and report soon if I see the failure outside the VM.
> >
> > I'll also send you my script, to maybe speed isolation of the differences.
> >
>
> Jim,
>
> I know why I saw these failures.
> I ran dyndbg_selftest.sh directly in the directory
> tools/testing/selftests/dynamic_debug/.

That's odd.
I mostly run it from the src-root,
also wherever the make selftest target is/works (I forgot).

I went into that subdir and ran it there;
I got no test differences / failures.

IIRC, the failure was on line 270, just after a modprobe.
Can you further isolate it?

> All works as expected when I run it from the top kernel directory.
> Here are the results:
>
> When the TEST_DYNAMIC_DEBUG=M and TEST_DYNAMIC_DEBUG_SUBMOD=M -
> BASIC_TESTS, COMMA_TERMINATOR_TESTS, TEST_PERCENT_SPLITTING,
> TEST_MOD_SUBMOD selftests passed
>
> When the TEST_DYNAMIC_DEBUG=Y and TEST_DYNAMIC_DEBUG_SUBMOD=M -
> BASIC_TESTS and COMMA_TERMINATOR_TESTS selftests passed,
> TEST_PERCENT_SPLITTING and TEST_MOD_SUBMOD selftests were
> skipped
>
> When the TEST_DYNAMIC_DEBUG=Y and TEST_DYNAMIC_DEBUG_SUBMOD=Y -
> BASIC_TESTS and COMMA_TERMINATOR_TESTS selftests passed,
> TEST_PERCENT_SPLITTING and TEST_MOD_SUBMOD selftests were
> skipped


Thank you for running these config-combo tests.

Are you doing these in a VM?
And since I'm asking: I've done these combos on virtme-ng builds,
and also installed & running on 2 x86 boxen.

Could you add DRM=m and a driver too,
and boot with drm.debug=0x1ff dynamic_debug.verbose=3?
The debug output should show all the class-work.
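
For reference, the core/driver split described in the cover letter above looks
roughly like the sketch below.  This is illustrative only: the macro names come
from the series, but the exact argument list (classmap type, base, class names)
is assumed from the pre-existing DECLARE_DYNDBG_CLASSMAP usage in drm_print.c,
not copied from the v9 patches.

/* drm core (e.g. drm_print.c): define the classmap once for drm.debug */
DYNDBG_CLASSMAP_DEFINE(drm_debug_classes, DD_CLASS_TYPE_DISJOINT_BITS, 0,
		       "DRM_UT_CORE",
		       "DRM_UT_DRIVER",
		       "DRM_UT_KMS",
		       "DRM_UT_ATOMIC");

/* each drm driver module (i915, amdgpu, ...): reference the core's classmap
 * so dyndbg can apply drm.debug=<val> to the driver's drm_dbg/pr_debug
 * callsites when that module loads, not just to drm.ko itself.
 */
DYNDBG_CLASSMAP_USE(drm_debug_classes);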

[PATCH 1/3] drm/amdgpu: Add empty HDP flush function to JPEG v4.0.3

2024-07-16 Thread Jane Jian
From: Lijo Lazar 

JPEG v4.0.3 doesn't support HDP flush when RRMT is enabled. Instead,
mmsch fw will do the flush.

This change is necessary only for JPEG v4.0.3; no backward compatibility
handling is needed.

Signed-off-by: Lijo Lazar 
Signed-off-by: Jane Jian 
---
 drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c 
b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c
index 04d8966423de..30a143ab592d 100644
--- a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c
@@ -621,6 +621,13 @@ static uint64_t jpeg_v4_0_3_dec_ring_get_wptr(struct 
amdgpu_ring *ring)
ring->pipe ? (0x40 * ring->pipe - 0xc80) : 0);
 }
 
+static void jpeg_v4_0_3_ring_emit_hdp_flush(struct amdgpu_ring *ring)
+{
+   /* JPEG engine access for HDP flush doesn't work when RRMT is enabled.
+* This is a workaround to avoid any HDP flush through JPEG ring.
+*/
+}
+
 /**
  * jpeg_v4_0_3_dec_ring_set_wptr - set write pointer
  *
@@ -1072,6 +1079,7 @@ static const struct amdgpu_ring_funcs 
jpeg_v4_0_3_dec_ring_vm_funcs = {
.emit_ib = jpeg_v4_0_3_dec_ring_emit_ib,
.emit_fence = jpeg_v4_0_3_dec_ring_emit_fence,
.emit_vm_flush = jpeg_v4_0_3_dec_ring_emit_vm_flush,
+   .emit_hdp_flush = jpeg_v4_0_3_ring_emit_hdp_flush,
.test_ring = amdgpu_jpeg_dec_ring_test_ring,
.test_ib = amdgpu_jpeg_dec_ring_test_ib,
.insert_nop = jpeg_v4_0_3_dec_ring_nop,
-- 
2.34.1



[PATCH 3/3] drm/amdgpu/vcn: Use offsets local to VCN/JPEG in VF

2024-07-16 Thread Jane Jian
For VCN/JPEG 4.0.3, use only the local addressing scheme.

- Mask bits higher than the AID0 range
- Remove the gmc v9 MMHUB vmid replacement, since the bit will be masked later
in the register write/wait

Signed-off-by: Jane Jian 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c|  5 ---
 drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c | 19 --
 drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c  | 46 ++--
 3 files changed, 60 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index b73136d390cc..2c7b4002ed72 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -844,11 +844,6 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device 
*adev, uint32_t vmid,
req = hub->vm_inv_eng0_req + hub->eng_distance * eng;
ack = hub->vm_inv_eng0_ack + hub->eng_distance * eng;
 
-   if (vmhub >= AMDGPU_MMHUB0(0))
-   inst = 0;
-   else
-   inst = vmhub;
-
/* This is necessary for SRIOV as well as for GFXOFF to function
 * properly under bare metal
 */
diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c 
b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c
index 30a143ab592d..ad524ddc9760 100644
--- a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c
@@ -32,6 +32,9 @@
 #include "vcn/vcn_4_0_3_sh_mask.h"
 #include "ivsrcid/vcn/irqsrcs_vcn_4_0.h"
 
+#define NORMALIZE_JPEG_REG_OFFSET(offset) \
+   (offset & 0x1ffff)
+
 enum jpeg_engin_status {
UVD_PGFSM_STATUS__UVDJ_PWR_ON  = 0,
UVD_PGFSM_STATUS__UVDJ_PWR_OFF = 2,
@@ -824,7 +827,13 @@ void jpeg_v4_0_3_dec_ring_emit_ib(struct amdgpu_ring *ring,
 void jpeg_v4_0_3_dec_ring_emit_reg_wait(struct amdgpu_ring *ring, uint32_t reg,
uint32_t val, uint32_t mask)
 {
-   uint32_t reg_offset = (reg << 2);
+   uint32_t reg_offset;
+
+   /* For VF, only local offsets should be used */
+   if (amdgpu_sriov_vf(ring->adev))
+   reg = NORMALIZE_JPEG_REG_OFFSET(reg);
+
+   reg_offset = (reg << 2);
 
amdgpu_ring_write(ring, 
PACKETJ(regUVD_JRBC_RB_COND_RD_TIMER_INTERNAL_OFFSET,
0, 0, PACKETJ_TYPE0));
@@ -865,7 +874,13 @@ void jpeg_v4_0_3_dec_ring_emit_vm_flush(struct amdgpu_ring 
*ring,
 
 void jpeg_v4_0_3_dec_ring_emit_wreg(struct amdgpu_ring *ring, uint32_t reg, 
uint32_t val)
 {
-   uint32_t reg_offset = (reg << 2);
+   uint32_t reg_offset;
+
+   /* For VF, only local offsets should be used */
+   if (amdgpu_sriov_vf(ring->adev))
+   reg = NORMALIZE_JPEG_REG_OFFSET(reg);
+
+   reg_offset = (reg << 2);
 
amdgpu_ring_write(ring, 
PACKETJ(regUVD_JRBC_EXTERNAL_REG_INTERNAL_OFFSET,
0, 0, PACKETJ_TYPE0));
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
index 101b120f6fbd..9bae95538b62 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
@@ -45,6 +45,9 @@
 #define VCN_VID_SOC_ADDRESS_2_0    0x1fb00
 #define VCN1_VID_SOC_ADDRESS_3_0   0x48300
 
+#define NORMALIZE_VCN_REG_OFFSET(offset) \
+   (offset & 0x1ffff)
+
 static int vcn_v4_0_3_start_sriov(struct amdgpu_device *adev);
 static void vcn_v4_0_3_set_unified_ring_funcs(struct amdgpu_device *adev);
 static void vcn_v4_0_3_set_irq_funcs(struct amdgpu_device *adev);
@@ -1375,6 +1378,43 @@ static uint64_t vcn_v4_0_3_unified_ring_get_wptr(struct 
amdgpu_ring *ring)
regUVD_RB_WPTR);
 }
 
+static void vcn_v4_0_3_enc_ring_emit_reg_wait(struct amdgpu_ring *ring, 
uint32_t reg,
+   uint32_t val, uint32_t mask)
+{
+   /* For VF, only local offsets should be used */
+   if (amdgpu_sriov_vf(ring->adev))
+   reg = NORMALIZE_VCN_REG_OFFSET(reg);
+
+   amdgpu_ring_write(ring, VCN_ENC_CMD_REG_WAIT);
+   amdgpu_ring_write(ring, reg << 2);
+   amdgpu_ring_write(ring, mask);
+   amdgpu_ring_write(ring, val);
+}
+
+static void vcn_v4_0_3_enc_ring_emit_wreg(struct amdgpu_ring *ring, uint32_t 
reg, uint32_t val)
+{
+   /* For VF, only local offsets should be used */
+   if (amdgpu_sriov_vf(ring->adev))
+   reg = NORMALIZE_VCN_REG_OFFSET(reg);
+
+   amdgpu_ring_write(ring, VCN_ENC_CMD_REG_WRITE);
+   amdgpu_ring_write(ring, reg << 2);
+   amdgpu_ring_write(ring, val);
+}
+
+static void vcn_v4_0_3_enc_ring_emit_vm_flush(struct amdgpu_ring *ring,
+   unsigned int vmid, uint64_t pd_addr)
+{
+   struct amdgpu_vmhub *hub = &ring->adev->vmhub[ring->vm_hub];
+
+   pd_addr = amdgpu_gmc_emit_flush_gpu_tlb(ring, vmid, pd_addr);
+
+   /* wait for reg writes */
+   vcn_v4_0_3_enc_ring_emit_reg_wait(ring, hub->ctx0_ptb_addr_lo32 +
+   vmid * hub->ctx_addr_distance,
+ 

[PATCH 2/3] drm/amdgpu: Add empty HDP flush function to VCN v4.0.3

2024-07-16 Thread Jane Jian
From: Lijo Lazar 

VCN 4.0.3 does not support HDP flush when RRMT is enabled. Instead, MMSCH
will do the HDP flush.

This change is necessary only for VCN v4.0.3; no backward compatibility
handling is needed.

Signed-off-by: Lijo Lazar 
Signed-off-by: Jane Jian 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
index f53054e39ebb..101b120f6fbd 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
@@ -1375,6 +1375,13 @@ static uint64_t vcn_v4_0_3_unified_ring_get_wptr(struct 
amdgpu_ring *ring)
regUVD_RB_WPTR);
 }
 
+static void vcn_v4_0_3_ring_emit_hdp_flush(struct amdgpu_ring *ring)
+{
+   /* VCN engine access for HDP flush doesn't work when RRMT is enabled.
+* This is a workaround to avoid any HDP flush through VCN ring.
+*/
+}
+
 /**
  * vcn_v4_0_3_unified_ring_set_wptr - set enc write pointer
  *
@@ -1415,6 +1422,7 @@ static const struct amdgpu_ring_funcs 
vcn_v4_0_3_unified_ring_vm_funcs = {
.emit_ib = vcn_v2_0_enc_ring_emit_ib,
.emit_fence = vcn_v2_0_enc_ring_emit_fence,
.emit_vm_flush = vcn_v2_0_enc_ring_emit_vm_flush,
+   .emit_hdp_flush = vcn_v4_0_3_ring_emit_hdp_flush,
.test_ring = amdgpu_vcn_enc_ring_test_ring,
.test_ib = amdgpu_vcn_unified_ring_test_ib,
.insert_nop = amdgpu_ring_insert_nop,
-- 
2.34.1



Re: [PATCH 3/3] drm/amdgpu/vcn: Use offsets local to VCN/JPEG in VF

2024-07-16 Thread Lazar, Lijo



On 7/16/2024 1:29 PM, Jane Jian wrote:
> For VCN/JPEG 4.0.3, use only the local addressing scheme.
> 
> - Mask bit higher than AID0 range
> - Remove gmc v9 mmhub vmid replacement, since the bit will be masked later in 
> register write/wait
> 
> Signed-off-by: Jane Jian 
> ---
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c|  5 ---
>  drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c | 19 --
>  drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c  | 46 ++--
>  3 files changed, 60 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index b73136d390cc..2c7b4002ed72 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -844,11 +844,6 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device 
> *adev, uint32_t vmid,
>   req = hub->vm_inv_eng0_req + hub->eng_distance * eng;
>   ack = hub->vm_inv_eng0_ack + hub->eng_distance * eng;
>  
> - if (vmhub >= AMDGPU_MMHUB0(0))
> - inst = 0;
> - else
> - inst = vmhub;
> -

This doesn't look correct. This is also used to identify the KIQ to be
used to perform the flush operation, and that goes through the master XCC
in the case of MMHUB.

Thanks,
Lijo

>   /* This is necessary for SRIOV as well as for GFXOFF to function
>* properly under bare metal
>*/
> diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c 
> b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c
> index 30a143ab592d..ad524ddc9760 100644
> --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c
> +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c
> @@ -32,6 +32,9 @@
>  #include "vcn/vcn_4_0_3_sh_mask.h"
>  #include "ivsrcid/vcn/irqsrcs_vcn_4_0.h"
>  
> +#define NORMALIZE_JPEG_REG_OFFSET(offset) \
> + (offset & 0x1ffff)
> +
>  enum jpeg_engin_status {
>   UVD_PGFSM_STATUS__UVDJ_PWR_ON  = 0,
>   UVD_PGFSM_STATUS__UVDJ_PWR_OFF = 2,
> @@ -824,7 +827,13 @@ void jpeg_v4_0_3_dec_ring_emit_ib(struct amdgpu_ring 
> *ring,
>  void jpeg_v4_0_3_dec_ring_emit_reg_wait(struct amdgpu_ring *ring, uint32_t 
> reg,
>   uint32_t val, uint32_t mask)
>  {
> - uint32_t reg_offset = (reg << 2);
> + uint32_t reg_offset;
> +
> + /* For VF, only local offsets should be used */
> + if (amdgpu_sriov_vf(ring->adev))
> + reg = NORMALIZE_JPEG_REG_OFFSET(reg);
> +
> + reg_offset = (reg << 2);
>  
>   amdgpu_ring_write(ring, 
> PACKETJ(regUVD_JRBC_RB_COND_RD_TIMER_INTERNAL_OFFSET,
>   0, 0, PACKETJ_TYPE0));
> @@ -865,7 +874,13 @@ void jpeg_v4_0_3_dec_ring_emit_vm_flush(struct 
> amdgpu_ring *ring,
>  
>  void jpeg_v4_0_3_dec_ring_emit_wreg(struct amdgpu_ring *ring, uint32_t reg, 
> uint32_t val)
>  {
> - uint32_t reg_offset = (reg << 2);
> + uint32_t reg_offset;
> +
> + /* For VF, only local offsets should be used */
> + if (amdgpu_sriov_vf(ring->adev))
> + reg = NORMALIZE_JPEG_REG_OFFSET(reg);
> +
> + reg_offset = (reg << 2);
>  
>   amdgpu_ring_write(ring, 
> PACKETJ(regUVD_JRBC_EXTERNAL_REG_INTERNAL_OFFSET,
>   0, 0, PACKETJ_TYPE0));
> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c 
> b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
> index 101b120f6fbd..9bae95538b62 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
> @@ -45,6 +45,9 @@
>  #define VCN_VID_SOC_ADDRESS_2_0  0x1fb00
>  #define VCN1_VID_SOC_ADDRESS_3_0 0x48300
>  
> +#define NORMALIZE_VCN_REG_OFFSET(offset) \
> + (offset & 0x1ffff)
> +
>  static int vcn_v4_0_3_start_sriov(struct amdgpu_device *adev);
>  static void vcn_v4_0_3_set_unified_ring_funcs(struct amdgpu_device *adev);
>  static void vcn_v4_0_3_set_irq_funcs(struct amdgpu_device *adev);
> @@ -1375,6 +1378,43 @@ static uint64_t 
> vcn_v4_0_3_unified_ring_get_wptr(struct amdgpu_ring *ring)
>   regUVD_RB_WPTR);
>  }
>  
> +static void vcn_v4_0_3_enc_ring_emit_reg_wait(struct amdgpu_ring *ring, 
> uint32_t reg,
> + uint32_t val, uint32_t mask)
> +{
> + /* For VF, only local offsets should be used */
> + if (amdgpu_sriov_vf(ring->adev))
> + reg = NORMALIZE_VCN_REG_OFFSET(reg);
> +
> + amdgpu_ring_write(ring, VCN_ENC_CMD_REG_WAIT);
> + amdgpu_ring_write(ring, reg << 2);
> + amdgpu_ring_write(ring, mask);
> + amdgpu_ring_write(ring, val);
> +}
> +
> +static void vcn_v4_0_3_enc_ring_emit_wreg(struct amdgpu_ring *ring, uint32_t 
> reg, uint32_t val)
> +{
> + /* For VF, only local offsets should be used */
> + if (amdgpu_sriov_vf(ring->adev))
> + reg = NORMALIZE_VCN_REG_OFFSET(reg);
> +
> + amdgpu_ring_write(ring, VCN_ENC_CMD_REG_WRITE);
> + amdgpu_ring_write(ring, reg << 2);
> + amdgpu_ring_write(ring, val);
> +}
> +
> +static void vcn_v4_0_3_enc_ring_emit_vm_flush(struct amdgpu_ring *ring,
> + 

[PATCH 3/3] drm/amdgpu/vcn: Use offsets local to VCN/JPEG in VF

2024-07-16 Thread Jane Jian
For VCN/JPEG 4.0.3, use only the local addressing scheme.

- Mask bit higher than AID0 range

v2:
  - keep the existing case where the MMHUB flush uses the master XCC

Signed-off-by: Jane Jian 
---
 drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c | 19 --
 drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c  | 46 ++--
 2 files changed, 60 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c 
b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c
index 30a143ab592d..ad524ddc9760 100644
--- a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c
@@ -32,6 +32,9 @@
 #include "vcn/vcn_4_0_3_sh_mask.h"
 #include "ivsrcid/vcn/irqsrcs_vcn_4_0.h"
 
+#define NORMALIZE_JPEG_REG_OFFSET(offset) \
+   (offset & 0x1ffff)
+
 enum jpeg_engin_status {
UVD_PGFSM_STATUS__UVDJ_PWR_ON  = 0,
UVD_PGFSM_STATUS__UVDJ_PWR_OFF = 2,
@@ -824,7 +827,13 @@ void jpeg_v4_0_3_dec_ring_emit_ib(struct amdgpu_ring *ring,
 void jpeg_v4_0_3_dec_ring_emit_reg_wait(struct amdgpu_ring *ring, uint32_t reg,
uint32_t val, uint32_t mask)
 {
-   uint32_t reg_offset = (reg << 2);
+   uint32_t reg_offset;
+
+   /* For VF, only local offsets should be used */
+   if (amdgpu_sriov_vf(ring->adev))
+   reg = NORMALIZE_JPEG_REG_OFFSET(reg);
+
+   reg_offset = (reg << 2);
 
amdgpu_ring_write(ring, 
PACKETJ(regUVD_JRBC_RB_COND_RD_TIMER_INTERNAL_OFFSET,
0, 0, PACKETJ_TYPE0));
@@ -865,7 +874,13 @@ void jpeg_v4_0_3_dec_ring_emit_vm_flush(struct amdgpu_ring 
*ring,
 
 void jpeg_v4_0_3_dec_ring_emit_wreg(struct amdgpu_ring *ring, uint32_t reg, 
uint32_t val)
 {
-   uint32_t reg_offset = (reg << 2);
+   uint32_t reg_offset;
+
+   /* For VF, only local offsets should be used */
+   if (amdgpu_sriov_vf(ring->adev))
+   reg = NORMALIZE_JPEG_REG_OFFSET(reg);
+
+   reg_offset = (reg << 2);
 
amdgpu_ring_write(ring, 
PACKETJ(regUVD_JRBC_EXTERNAL_REG_INTERNAL_OFFSET,
0, 0, PACKETJ_TYPE0));
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
index 101b120f6fbd..9bae95538b62 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
@@ -45,6 +45,9 @@
 #define VCN_VID_SOC_ADDRESS_2_0    0x1fb00
 #define VCN1_VID_SOC_ADDRESS_3_0   0x48300
 
+#define NORMALIZE_VCN_REG_OFFSET(offset) \
+   (offset & 0x1ffff)
+
 static int vcn_v4_0_3_start_sriov(struct amdgpu_device *adev);
 static void vcn_v4_0_3_set_unified_ring_funcs(struct amdgpu_device *adev);
 static void vcn_v4_0_3_set_irq_funcs(struct amdgpu_device *adev);
@@ -1375,6 +1378,43 @@ static uint64_t vcn_v4_0_3_unified_ring_get_wptr(struct 
amdgpu_ring *ring)
regUVD_RB_WPTR);
 }
 
+static void vcn_v4_0_3_enc_ring_emit_reg_wait(struct amdgpu_ring *ring, 
uint32_t reg,
+   uint32_t val, uint32_t mask)
+{
+   /* For VF, only local offsets should be used */
+   if (amdgpu_sriov_vf(ring->adev))
+   reg = NORMALIZE_VCN_REG_OFFSET(reg);
+
+   amdgpu_ring_write(ring, VCN_ENC_CMD_REG_WAIT);
+   amdgpu_ring_write(ring, reg << 2);
+   amdgpu_ring_write(ring, mask);
+   amdgpu_ring_write(ring, val);
+}
+
+static void vcn_v4_0_3_enc_ring_emit_wreg(struct amdgpu_ring *ring, uint32_t 
reg, uint32_t val)
+{
+   /* For VF, only local offsets should be used */
+   if (amdgpu_sriov_vf(ring->adev))
+   reg = NORMALIZE_VCN_REG_OFFSET(reg);
+
+   amdgpu_ring_write(ring, VCN_ENC_CMD_REG_WRITE);
+   amdgpu_ring_write(ring, reg << 2);
+   amdgpu_ring_write(ring, val);
+}
+
+static void vcn_v4_0_3_enc_ring_emit_vm_flush(struct amdgpu_ring *ring,
+   unsigned int vmid, uint64_t pd_addr)
+{
+   struct amdgpu_vmhub *hub = &ring->adev->vmhub[ring->vm_hub];
+
+   pd_addr = amdgpu_gmc_emit_flush_gpu_tlb(ring, vmid, pd_addr);
+
+   /* wait for reg writes */
+   vcn_v4_0_3_enc_ring_emit_reg_wait(ring, hub->ctx0_ptb_addr_lo32 +
+   vmid * hub->ctx_addr_distance,
+   lower_32_bits(pd_addr), 0xffffffff);
+}
+
 static void vcn_v4_0_3_ring_emit_hdp_flush(struct amdgpu_ring *ring)
 {
/* VCN engine access for HDP flush doesn't work when RRMT is enabled.
@@ -1421,7 +1461,7 @@ static const struct amdgpu_ring_funcs 
vcn_v4_0_3_unified_ring_vm_funcs = {
.emit_ib_size = 5, /* vcn_v2_0_enc_ring_emit_ib */
.emit_ib = vcn_v2_0_enc_ring_emit_ib,
.emit_fence = vcn_v2_0_enc_ring_emit_fence,
-   .emit_vm_flush = vcn_v2_0_enc_ring_emit_vm_flush,
+   .emit_vm_flush = vcn_v4_0_3_enc_ring_emit_vm_flush,
.emit_hdp_flush = vcn_v4_0_3_ring_emit_hdp_flush,
.test_ring = amdgpu_vcn_enc_ring_test_ring,
.test_ib = amdgpu_vcn_unified_ring_test_ib,
@@ -1430,8 +1470,8 @@ static c

[PATCH 2/2] drm/amdgpu: Add address alignment support to DCC buffers

2024-07-16 Thread Arunpravin Paneer Selvam
Add address alignment support to the DCC VRAM buffers.

v2:
  - adjust size based on the max_texture_channel_caches values
only for GFX12 DCC buffers.
  - used AMDGPU_GEM_CREATE_GFX12_DCC flag to apply change only
for DCC buffers.
  - round up the adjusted size of non-power-of-two DCC buffers to the
nearest power of two, as the buddy allocator does not support
non-power-of-two alignments. This applies only to the contiguous
DCC buffers.

v3:(Alex)
  - rewrite the max texture channel caches comparison code in an
algorithmic way to determine the alignment size.

v4:(Alex)
  - Move the logic from amdgpu_vram_mgr_dcc_alignment() to gmc_v12_0.c
and add a new gmc func callback for dcc alignment. If the callback
is non-NULL, call it to get the alignment, otherwise, use the default
(a sketch of the callback follows the diff below).

v5:(Alex)
  - Set the Alignment to a default value if the callback doesn't exist.
  - Add the callback to amdgpu_gmc_funcs.

Signed-off-by: Arunpravin Paneer Selvam 
Acked-by: Alex Deucher 
Acked-by: Christian König 
Reviewed-by: Frank Min 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h  |  3 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 36 ++--
 drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c   | 15 
 3 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index febca3130497..49dfcf112ac1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -156,6 +156,8 @@ struct amdgpu_gmc_funcs {
  uint64_t addr, uint64_t *flags);
/* get the amount of memory used by the vbios for pre-OS console */
unsigned int (*get_vbios_fb_size)(struct amdgpu_device *adev);
+   /* get the DCC buffer alignment */
+   u64 (*get_dcc_alignment)(struct amdgpu_device *adev);
 
enum amdgpu_memory_partition (*query_mem_partition_mode)(
struct amdgpu_device *adev);
@@ -363,6 +365,7 @@ struct amdgpu_gmc {
(adev)->gmc.gmc_funcs->override_vm_pte_flags\
((adev), (vm), (addr), (pte_flags))
 #define amdgpu_gmc_get_vbios_fb_size(adev) 
(adev)->gmc.gmc_funcs->get_vbios_fb_size((adev))
+#define amdgpu_gmc_get_dcc_alignment(adev) 
((adev)->gmc.gmc_funcs->get_dcc_alignment((adev)))
 
 /**
  * amdgpu_gmc_vram_full_visible - Check if full VRAM is visible through the BAR
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index f91cc149d06c..aa9dca12371c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -512,6 +512,16 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
vres->flags |= DRM_BUDDY_RANGE_ALLOCATION;
 
remaining_size = (u64)vres->base.size;
+   if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS &&
+   bo->flags & AMDGPU_GEM_CREATE_GFX12_DCC) {
+   u64 adjust_size;
+
+   if (adev->gmc.gmc_funcs->get_dcc_alignment) {
+   adjust_size = amdgpu_gmc_get_dcc_alignment(adev);
+   remaining_size = roundup_pow_of_two(remaining_size + 
adjust_size);
+   vres->flags |= DRM_BUDDY_TRIM_DISABLE;
+   }
+   }
 
mutex_lock(&mgr->lock);
while (remaining_size) {
@@ -521,8 +531,12 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
min_block_size = mgr->default_page_size;
 
size = remaining_size;
-   if ((size >= (u64)pages_per_block << PAGE_SHIFT) &&
-   !(size & (((u64)pages_per_block << PAGE_SHIFT) - 1)))
+
+   if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS &&
+   bo->flags & AMDGPU_GEM_CREATE_GFX12_DCC)
+   min_block_size = size;
+   else if ((size >= (u64)pages_per_block << PAGE_SHIFT) &&
+!(size & (((u64)pages_per_block << PAGE_SHIFT) - 1)))
min_block_size = (u64)pages_per_block << PAGE_SHIFT;
 
BUG_ON(min_block_size < mm->chunk_size);
@@ -553,6 +567,24 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
}
mutex_unlock(&mgr->lock);
 
+   if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS &&
+   bo->flags & AMDGPU_GEM_CREATE_GFX12_DCC) {
+   struct drm_buddy_block *dcc_block;
+   u64 dcc_start, alignment;
+
+   dcc_block = amdgpu_vram_mgr_first_block(&vres->blocks);
+   dcc_start = amdgpu_vram_mgr_block_start(dcc_block);
+
+   if (adev->gmc.gmc_funcs->get_dcc_alignment) {
+   alignment = amdgpu_gmc_get_dcc_alignment(adev);
+   /* Adjust the start address for DCC buffers only */
+   dcc_start = roundup(dcc_start, alignment);
+
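
The gmc_v12_0.c hunk that implements the new get_dcc_alignment callback is cut
off in this archive.  The following is only a sketch of the callback's shape,
assuming (per the v2/v4 changelog above) that the alignment is derived from
adev->gfx.config.max_texture_channel_caches; the actual formula and any
IP-version checks live in the omitted hunk.

/* Sketch only -- not the posted implementation. */
static u64 gmc_v12_0_get_dcc_alignment(struct amdgpu_device *adev)
{
	u32 max_tcc = adev->gfx.config.max_texture_channel_caches;
	u64 alignment;

	/* derive a power-of-two alignment from the channel-cache count */
	if (is_power_of_2(max_tcc))
		alignment = max_tcc;
	else
		alignment = roundup_pow_of_two(max_tcc);

	/* scale to a byte alignment; the multiplier is a placeholder here */
	return alignment * SZ_64K;
}

static const struct amdgpu_gmc_funcs gmc_v12_0_gmc_funcs = {
	/* ... existing callbacks ... */
	.get_dcc_alignment = gmc_v12_0_get_dcc_alignment,
};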

[PATCH 1/2] drm/buddy: Add start address support to trim function

2024-07-16 Thread Arunpravin Paneer Selvam
- Add a new start parameter to the trim function to specify the exact
  address from which to start the trimming. This helps in situations
  where drivers would like to do address alignment for specific
  requirements.

- Add a new flag DRM_BUDDY_TRIM_DISABLE. Drivers can use this
  flag to disable the allocator's trimming step, so that they can
  control trimming themselves based on the application requirements
  (a usage sketch follows this changelog).

v1:(Matthew)
  - check new_start alignment with min chunk_size
  - use range_overflows()
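
A rough usage sketch of the two additions together (illustrative driver-side
code, not taken from this series): over-allocate with trimming disabled, pick
an aligned start inside the returned block, then trim from that start.  It
assumes the contiguous allocation returned a single block, as
drm_buddy_block_trim() requires.

static int alloc_aligned(struct drm_buddy *mm, u64 requested_size,
			 u64 alignment, struct list_head *blocks)
{
	struct drm_buddy_block *block;
	u64 start;
	int err;

	/* over-allocate so an aligned start still leaves requested_size */
	err = drm_buddy_alloc_blocks(mm, 0, mm->size,
				     roundup_pow_of_two(requested_size + alignment),
				     mm->chunk_size, blocks,
				     DRM_BUDDY_CONTIGUOUS_ALLOCATION |
				     DRM_BUDDY_TRIM_DISABLE);
	if (err)
		return err;

	block = list_first_entry(blocks, struct drm_buddy_block, link);
	start = roundup(drm_buddy_block_offset(block), alignment);

	/* free everything outside [start, start + requested_size) */
	return drm_buddy_block_trim(mm, &start, requested_size, blocks);
}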

Signed-off-by: Arunpravin Paneer Selvam 
Acked-by: Alex Deucher 
Acked-by: Christian König 
---
 drivers/gpu/drm/drm_buddy.c  | 25 +++--
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c |  2 +-
 include/drm/drm_buddy.h  |  2 ++
 3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 6a8e45e9d0ec..103c185bb1c8 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -851,6 +851,7 @@ static int __alloc_contig_try_harder(struct drm_buddy *mm,
  * drm_buddy_block_trim - free unused pages
  *
  * @mm: DRM buddy manager
+ * @start: start address to begin the trimming.
  * @new_size: original size requested
  * @blocks: Input and output list of allocated blocks.
  * MUST contain single block as input to be trimmed.
@@ -866,11 +867,13 @@ static int __alloc_contig_try_harder(struct drm_buddy *mm,
  * 0 on success, error code on failure.
  */
 int drm_buddy_block_trim(struct drm_buddy *mm,
+u64 *start,
 u64 new_size,
 struct list_head *blocks)
 {
struct drm_buddy_block *parent;
struct drm_buddy_block *block;
+   u64 block_start, block_end;
LIST_HEAD(dfs);
u64 new_start;
int err;
@@ -882,6 +885,9 @@ int drm_buddy_block_trim(struct drm_buddy *mm,
 struct drm_buddy_block,
 link);
 
+   block_start = drm_buddy_block_offset(block);
+   block_end = block_start + drm_buddy_block_size(mm, block);
+
if (WARN_ON(!drm_buddy_block_is_allocated(block)))
return -EINVAL;
 
@@ -894,6 +900,20 @@ int drm_buddy_block_trim(struct drm_buddy *mm,
if (new_size == drm_buddy_block_size(mm, block))
return 0;
 
+   new_start = block_start;
+   if (start) {
+   new_start = *start;
+
+   if (new_start < block_start)
+   return -EINVAL;
+
+   if (!IS_ALIGNED(new_start, mm->chunk_size))
+   return -EINVAL;
+
+   if (range_overflows(new_start, new_size, block_end))
+   return -EINVAL;
+   }
+
list_del(&block->link);
mark_free(mm, block);
mm->avail += drm_buddy_block_size(mm, block);
@@ -904,7 +924,6 @@ int drm_buddy_block_trim(struct drm_buddy *mm,
parent = block->parent;
block->parent = NULL;
 
-   new_start = drm_buddy_block_offset(block);
list_add(&block->tmp_link, &dfs);
err =  __alloc_range(mm, &dfs, new_start, new_size, blocks, NULL);
if (err) {
@@ -1066,7 +1085,8 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
} while (1);
 
/* Trim the allocated block to the required size */
-   if (original_size != size) {
+   if (!(flags & DRM_BUDDY_TRIM_DISABLE) &&
+   original_size != size) {
struct list_head *trim_list;
LIST_HEAD(temp);
u64 trim_size;
@@ -1083,6 +1103,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
}
 
drm_buddy_block_trim(mm,
+NULL,
 trim_size,
 trim_list);
 
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c 
b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index fe3779fdba2c..423b261ea743 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -150,7 +150,7 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager 
*man,
} while (remaining_size);
 
if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
-   if (!drm_buddy_block_trim(mm, vres->base.size, &vres->blocks))
+   if (!drm_buddy_block_trim(mm, NULL, vres->base.size, 
&vres->blocks))
size = vres->base.size;
}
 
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
index 2a74fa9d0ce5..9689a7c5dd36 100644
--- a/include/drm/drm_buddy.h
+++ b/include/drm/drm_buddy.h
@@ -27,6 +27,7 @@
 #define DRM_BUDDY_CONTIGUOUS_ALLOCATION    BIT(2)
 #define DRM_BUDDY_CLEAR_ALLOCATION BIT(3)
 #define DRM_BUDDY_CLEARED  BIT(4)
+#define DRM_BUDDY_TRIM_DISABLE BIT(5)
 
 struct drm_buddy_block {
 #define DRM_BUDDY_HEADER_OFFSET GENMASK_ULL(63,

Re: [PATCH] drm/buddy: Add start address support to trim function

2024-07-16 Thread Paneer Selvam, Arunpravin

Hi Matthew,

On 7/10/2024 6:20 PM, Matthew Auld wrote:

On 10/07/2024 07:03, Paneer Selvam, Arunpravin wrote:

Thanks Alex.

Hi Matthew,
Any comments?


Do we not pass the required address alignment when allocating the 
pages in the first place?
If address alignment is really useful, we can add that in the 
drm_buddy_alloc_blocks() function.


Thanks,
Arun.




Thanks,
Arun.

On 7/9/2024 1:42 AM, Alex Deucher wrote:

On Thu, Jul 4, 2024 at 4:40 AM Arunpravin Paneer Selvam
 wrote:

- Add a new start parameter in trim function to specify exact
   address from where to start the trimming. This would help us
   in situations like if drivers would like to do address alignment
   for specific requirements.

- Add a new flag DRM_BUDDY_TRIM_DISABLE. Drivers can use this
   flag to disable the allocator trimming part. This patch enables
   the drivers control trimming and they can do it themselves
   based on the application requirements.

v1:(Matthew)
   - check new_start alignment with min chunk_size
   - use range_overflows()

Signed-off-by: Arunpravin Paneer Selvam 


Series is:
Acked-by: Alex Deucher 

I'd like to take this series through the amdgpu tree if there are no
objections as it's required for display buffers on some chips and I'd
like to make sure it lands in 6.11.

Thanks,

Alex


---
  drivers/gpu/drm/drm_buddy.c  | 25 +++--
  drivers/gpu/drm/xe/xe_ttm_vram_mgr.c |  2 +-
  include/drm/drm_buddy.h  |  2 ++
  3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 94f8c34fc293..8cebe1fa4e9d 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -851,6 +851,7 @@ static int __alloc_contig_try_harder(struct 
drm_buddy *mm,

   * drm_buddy_block_trim - free unused pages
   *
   * @mm: DRM buddy manager
+ * @start: start address to begin the trimming.
   * @new_size: original size requested
   * @blocks: Input and output list of allocated blocks.
   * MUST contain single block as input to be trimmed.
@@ -866,11 +867,13 @@ static int __alloc_contig_try_harder(struct 
drm_buddy *mm,

   * 0 on success, error code on failure.
   */
  int drm_buddy_block_trim(struct drm_buddy *mm,
+    u64 *start,
  u64 new_size,
  struct list_head *blocks)
  {
 struct drm_buddy_block *parent;
 struct drm_buddy_block *block;
+   u64 block_start, block_end;
 LIST_HEAD(dfs);
 u64 new_start;
 int err;
@@ -882,6 +885,9 @@ int drm_buddy_block_trim(struct drm_buddy *mm,
  struct drm_buddy_block,
  link);

+   block_start = drm_buddy_block_offset(block);
+   block_end = block_start + drm_buddy_block_size(mm, block);
+
 if (WARN_ON(!drm_buddy_block_is_allocated(block)))
 return -EINVAL;

@@ -894,6 +900,20 @@ int drm_buddy_block_trim(struct drm_buddy *mm,
 if (new_size == drm_buddy_block_size(mm, block))
 return 0;

+   new_start = block_start;
+   if (start) {
+   new_start = *start;
+
+   if (new_start < block_start)
+   return -EINVAL;
+
+   if (!IS_ALIGNED(new_start, mm->chunk_size))
+   return -EINVAL;
+
+   if (range_overflows(new_start, new_size, block_end))
+   return -EINVAL;
+   }
+
 list_del(&block->link);
 mark_free(mm, block);
 mm->avail += drm_buddy_block_size(mm, block);
@@ -904,7 +924,6 @@ int drm_buddy_block_trim(struct drm_buddy *mm,
 parent = block->parent;
 block->parent = NULL;

-   new_start = drm_buddy_block_offset(block);
 list_add(&block->tmp_link, &dfs);
 err =  __alloc_range(mm, &dfs, new_start, new_size, 
blocks, NULL);

 if (err) {
@@ -1066,7 +1085,8 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
 } while (1);

 /* Trim the allocated block to the required size */
-   if (original_size != size) {
+   if (!(flags & DRM_BUDDY_TRIM_DISABLE) &&
+   original_size != size) {
 struct list_head *trim_list;
 LIST_HEAD(temp);
 u64 trim_size;
@@ -1083,6 +1103,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
 }

 drm_buddy_block_trim(mm,
+    NULL,
  trim_size,
  trim_list);

diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c 
b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c

index fe3779fdba2c..423b261ea743 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -150,7 +150,7 @@ static int xe_ttm_vram_mgr_new(struct 
ttm_resource_manager *man,

 } while (remaining_size);

 if (place->flags & TTM_PL_FLAG_CONT

Re: [PATCH] drm/buddy: Add start address support to trim function

2024-07-16 Thread Matthew Auld

On 16/07/2024 10:50, Paneer Selvam, Arunpravin wrote:

Hi Matthew,

On 7/10/2024 6:20 PM, Matthew Auld wrote:

On 10/07/2024 07:03, Paneer Selvam, Arunpravin wrote:

Thanks Alex.

Hi Matthew,
Any comments?


Do we not pass the required address alignment when allocating the 
pages in the first place?
If address alignment is really useful, we can add that in the 
drm_buddy_alloc_blocks() function.


I mean don't we already pass the min page size, which should give us 
matching physical address alignment?




Thanks,
Arun.




Thanks,
Arun.

On 7/9/2024 1:42 AM, Alex Deucher wrote:

On Thu, Jul 4, 2024 at 4:40 AM Arunpravin Paneer Selvam
 wrote:

- Add a new start parameter in trim function to specify exact
   address from where to start the trimming. This would help us
   in situations like if drivers would like to do address alignment
   for specific requirements.

- Add a new flag DRM_BUDDY_TRIM_DISABLE. Drivers can use this
   flag to disable the allocator trimming part. This patch enables
   the drivers control trimming and they can do it themselves
   based on the application requirements.

v1:(Matthew)
   - check new_start alignment with min chunk_size
   - use range_overflows()

Signed-off-by: Arunpravin Paneer Selvam 


Series is:
Acked-by: Alex Deucher 

I'd like to take this series through the amdgpu tree if there are no
objections as it's required for display buffers on some chips and I'd
like to make sure it lands in 6.11.

Thanks,

Alex


---
  drivers/gpu/drm/drm_buddy.c  | 25 +++--
  drivers/gpu/drm/xe/xe_ttm_vram_mgr.c |  2 +-
  include/drm/drm_buddy.h  |  2 ++
  3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 94f8c34fc293..8cebe1fa4e9d 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -851,6 +851,7 @@ static int __alloc_contig_try_harder(struct 
drm_buddy *mm,

   * drm_buddy_block_trim - free unused pages
   *
   * @mm: DRM buddy manager
+ * @start: start address to begin the trimming.
   * @new_size: original size requested
   * @blocks: Input and output list of allocated blocks.
   * MUST contain single block as input to be trimmed.
@@ -866,11 +867,13 @@ static int __alloc_contig_try_harder(struct 
drm_buddy *mm,

   * 0 on success, error code on failure.
   */
  int drm_buddy_block_trim(struct drm_buddy *mm,
+    u64 *start,
  u64 new_size,
  struct list_head *blocks)
  {
 struct drm_buddy_block *parent;
 struct drm_buddy_block *block;
+   u64 block_start, block_end;
 LIST_HEAD(dfs);
 u64 new_start;
 int err;
@@ -882,6 +885,9 @@ int drm_buddy_block_trim(struct drm_buddy *mm,
  struct drm_buddy_block,
  link);

+   block_start = drm_buddy_block_offset(block);
+   block_end = block_start + drm_buddy_block_size(mm, block);
+
 if (WARN_ON(!drm_buddy_block_is_allocated(block)))
 return -EINVAL;

@@ -894,6 +900,20 @@ int drm_buddy_block_trim(struct drm_buddy *mm,
 if (new_size == drm_buddy_block_size(mm, block))
 return 0;

+   new_start = block_start;
+   if (start) {
+   new_start = *start;
+
+   if (new_start < block_start)
+   return -EINVAL;
+
+   if (!IS_ALIGNED(new_start, mm->chunk_size))
+   return -EINVAL;
+
+   if (range_overflows(new_start, new_size, block_end))
+   return -EINVAL;
+   }
+
 list_del(&block->link);
 mark_free(mm, block);
 mm->avail += drm_buddy_block_size(mm, block);
@@ -904,7 +924,6 @@ int drm_buddy_block_trim(struct drm_buddy *mm,
 parent = block->parent;
 block->parent = NULL;

-   new_start = drm_buddy_block_offset(block);
 list_add(&block->tmp_link, &dfs);
 err =  __alloc_range(mm, &dfs, new_start, new_size, 
blocks, NULL);

 if (err) {
@@ -1066,7 +1085,8 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
 } while (1);

 /* Trim the allocated block to the required size */
-   if (original_size != size) {
+   if (!(flags & DRM_BUDDY_TRIM_DISABLE) &&
+   original_size != size) {
 struct list_head *trim_list;
 LIST_HEAD(temp);
 u64 trim_size;
@@ -1083,6 +1103,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
 }

 drm_buddy_block_trim(mm,
+    NULL,
  trim_size,
  trim_list);

diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c 
b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c

index fe3779fdba2c..423b261ea743 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@

[PATCH v1 0/5] devcoredump for sdma v5.0 and sdma 6.0

2024-07-16 Thread Sunil Khatri
*** BLURB HERE ***

Sunil Khatri (5):
  drm/amdgpu: Add sdma_v6_0 ip dump for devcoredump
  drm/amdgpu: add print support for sdma_v_6_0 ip_dump
  drm/amdgpu: fix the extra space between two functions
  drm/amdgpu: Add sdma_v5_0 ip dump for devcoredump
  drm/amdgpu: add print support for sdma_v_5_0 ip_dump

 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 104 +++
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c |   1 +
 drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 112 +
 3 files changed, 217 insertions(+)

-- 
2.34.1



[PATCH v1 1/5] drm/amdgpu: Add sdma_v6_0 ip dump for devcoredump

2024-07-16 Thread Sunil Khatri
Add ip dump for sdma_v6_0 to devcoredump, covering all
instances of sdma.

Signed-off-by: Sunil Khatri 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 90 ++
 1 file changed, 90 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
index dab4c2db8c9d..102de209f120 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
@@ -57,6 +57,63 @@ MODULE_FIRMWARE("amdgpu/sdma_6_1_2.bin");
 #define SDMA0_HYP_DEC_REG_END 0x589a
 #define SDMA1_HYP_DEC_REG_OFFSET 0x20
 
+static const struct amdgpu_hwip_reg_entry sdma_reg_list_6_0[] = {
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_STATUS_REG),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_STATUS1_REG),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_STATUS2_REG),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_STATUS3_REG),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_STATUS4_REG),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_STATUS5_REG),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_STATUS6_REG),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_UCODE_CHECKSUM),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_RB_RPTR_FETCH_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_RB_RPTR_FETCH),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_UTCL1_RD_STATUS),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_UTCL1_WR_STATUS),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_UTCL1_RD_XNACK0),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_UTCL1_RD_XNACK1),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_UTCL1_WR_XNACK0),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_UTCL1_WR_XNACK1),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE0_RB_CNTL),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE0_RB_RPTR),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE0_RB_RPTR_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE0_RB_WPTR),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE0_RB_WPTR_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE0_IB_OFFSET),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE0_IB_BASE_LO),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE0_IB_BASE_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE0_IB_CNTL),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE0_IB_RPTR),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE0_IB_SUB_REMAIN),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE0_DUMMY_REG),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE_STATUS0),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE1_RB_CNTL),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE1_RB_RPTR),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE1_RB_RPTR_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE1_RB_WPTR),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE1_RB_WPTR_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE1_IB_OFFSET),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE1_IB_BASE_LO),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE1_IB_BASE_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE1_IB_RPTR),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE1_IB_SUB_REMAIN),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE1_DUMMY_REG),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE2_RB_CNTL),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE2_RB_RPTR),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE2_RB_RPTR_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE2_RB_WPTR),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE2_RB_WPTR_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE2_IB_OFFSET),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE2_IB_BASE_LO),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE2_IB_BASE_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE2_IB_RPTR),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE2_IB_SUB_REMAIN),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_QUEUE2_DUMMY_REG),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_INT_STATUS),
+   SOC15_REG_ENTRY_STR(GC, 0, regGRBM_STATUS2),
+   SOC15_REG_ENTRY_STR(GC, 0, regSDMA0_CHICKEN_BITS),
+};
+
 static void sdma_v6_0_set_ring_funcs(struct amdgpu_device *adev);
 static void sdma_v6_0_set_buffer_funcs(struct amdgpu_device *adev);
 static void sdma_v6_0_set_vm_pte_funcs(struct amdgpu_device *adev);
@@ -1239,6 +1296,8 @@ static int sdma_v6_0_sw_init(void *handle)
struct amdgpu_ring *ring;
int r, i;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+   uint32_t reg_count = ARRAY_SIZE(sdma_reg_list_6_0);
+   uint32_t *ptr;
 
/* SDMA trap event */
r = amdgpu_irq_add_id(adev, SOC21_IH_CLIENTID_GFX,
@@ -1274,6 +1333,13 @@ static int sdma_v6_0_sw_init(void *handle)
return -EINVAL;
}
 
+   /* Allocate memory for SDMA IP Dump buffer */
+   ptr = kcalloc(adev->sdma.num_instances * reg_count, sizeof(uint32_t), 
GFP_KERNEL);
+   if (ptr)
+   adev->sdma.ip_dump = ptr;
+   else
+   DRM_ERROR("Failed to allocate memory for SDMA IP Dump\n");
+
return r;
 }
 
@@ -1287,6 +1353,8 @@ static
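
The tail of this diff (the sdma_v6_0_dump_ip_state() callback and the sw_fini
kfree of the buffer) is truncated in this archive.  The dump side presumably
mirrors the existing sdma_v5_2 implementation; a minimal sketch under that
assumption:

static void sdma_v6_0_dump_ip_state(void *handle)
{
	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
	uint32_t reg_count = ARRAY_SIZE(sdma_reg_list_6_0);
	uint32_t instance_offset;
	int i, j;

	if (!adev->sdma.ip_dump)
		return;

	/* keep the block powered while reading the registers */
	amdgpu_gfx_off_ctrl(adev, false);
	for (i = 0; i < adev->sdma.num_instances; i++) {
		instance_offset = i * reg_count;
		for (j = 0; j < reg_count; j++)
			adev->sdma.ip_dump[instance_offset + j] =
				RREG32(sdma_v6_0_get_reg_offset(adev, i,
				       sdma_reg_list_6_0[j].reg_offset));
	}
	amdgpu_gfx_off_ctrl(adev, true);
}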

[PATCH v1 2/5] drm/amdgpu: add print support for sdma_v_6_0 ip_dump

2024-07-16 Thread Sunil Khatri
Add print support for ip dump for sdma_v_6_0 in
devcoredump.

Signed-off-by: Sunil Khatri 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
index 102de209f120..208a1fa9d4e7 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
@@ -1556,6 +1556,27 @@ static void sdma_v6_0_get_clockgating_state(void 
*handle, u64 *flags)
 {
 }
 
+static void sdma_v6_0_print_ip_state(void *handle, struct drm_printer *p)
+{
+   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+   int i, j;
+   uint32_t reg_count = ARRAY_SIZE(sdma_reg_list_6_0);
+   uint32_t instance_offset;
+
+   if (!adev->sdma.ip_dump)
+   return;
+
+   drm_printf(p, "num_instances:%d\n", adev->sdma.num_instances);
+   for (i = 0; i < adev->sdma.num_instances; i++) {
+   instance_offset = i * reg_count;
+   drm_printf(p, "\nInstance:%d\n", i);
+
+   for (j = 0; j < reg_count; j++)
+   drm_printf(p, "%-50s \t 0x%08x\n", 
sdma_reg_list_6_0[j].reg_name,
+  adev->sdma.ip_dump[instance_offset + j]);
+   }
+}
+
 static void sdma_v6_0_dump_ip_state(void *handle)
 {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
@@ -1595,6 +1616,7 @@ const struct amd_ip_funcs sdma_v6_0_ip_funcs = {
.set_powergating_state = sdma_v6_0_set_powergating_state,
.get_clockgating_state = sdma_v6_0_get_clockgating_state,
.dump_ip_state = sdma_v6_0_dump_ip_state,
+   .print_ip_state = sdma_v6_0_print_ip_state,
 };
 
 static const struct amdgpu_ring_funcs sdma_v6_0_ring_funcs = {
-- 
2.34.1



[PATCH v1 4/5] drm/amdgpu: Add sdma_v5_0 ip dump for devcoredump

2024-07-16 Thread Sunil Khatri
Add ip dump for sdma_v5_0 to devcoredump, covering all
instances of sdma.

Signed-off-by: Sunil Khatri 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 82 ++
 1 file changed, 82 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index b7d33d78bce0..cb324a90b310 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -59,6 +59,55 @@ MODULE_FIRMWARE("amdgpu/cyan_skillfish2_sdma1.bin");
 #define SDMA0_HYP_DEC_REG_END 0x5893
 #define SDMA1_HYP_DEC_REG_OFFSET 0x20
 
+static const struct amdgpu_hwip_reg_entry sdma_reg_list_5_0[] = {
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_STATUS_REG),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_STATUS1_REG),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_STATUS2_REG),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_STATUS3_REG),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_UCODE_CHECKSUM),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_RB_RPTR_FETCH_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_RB_RPTR_FETCH),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_UTCL1_RD_STATUS),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_UTCL1_WR_STATUS),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_UTCL1_RD_XNACK0),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_UTCL1_RD_XNACK1),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_UTCL1_WR_XNACK0),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_UTCL1_WR_XNACK1),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_GFX_RB_CNTL),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_GFX_RB_RPTR),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_GFX_RB_RPTR_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_GFX_RB_WPTR),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_GFX_RB_WPTR_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_GFX_IB_OFFSET),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_GFX_IB_BASE_LO),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_GFX_IB_BASE_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_GFX_IB_CNTL),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_GFX_IB_RPTR),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_GFX_IB_SUB_REMAIN),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_GFX_DUMMY_REG),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_PAGE_RB_CNTL),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_PAGE_RB_RPTR),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_PAGE_RB_RPTR_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_PAGE_RB_WPTR),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_PAGE_RB_WPTR_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_PAGE_IB_OFFSET),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_PAGE_IB_BASE_LO),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_PAGE_IB_BASE_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_PAGE_DUMMY_REG),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_RLC0_RB_CNTL),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_RLC0_RB_RPTR),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_RLC0_RB_RPTR_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_RLC0_RB_WPTR),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_RLC0_RB_WPTR_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_RLC0_IB_OFFSET),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_RLC0_IB_BASE_LO),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_RLC0_IB_BASE_HI),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_RLC0_DUMMY_REG),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_INT_STATUS),
+   SOC15_REG_ENTRY_STR(GC, 0, mmSDMA0_VM_CNTL),
+   SOC15_REG_ENTRY_STR(GC, 0, mmGRBM_STATUS2)
+};
+
 static void sdma_v5_0_set_ring_funcs(struct amdgpu_device *adev);
 static void sdma_v5_0_set_buffer_funcs(struct amdgpu_device *adev);
 static void sdma_v5_0_set_vm_pte_funcs(struct amdgpu_device *adev);
@@ -1341,6 +1390,8 @@ static int sdma_v5_0_sw_init(void *handle)
struct amdgpu_ring *ring;
int r, i;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+   uint32_t reg_count = ARRAY_SIZE(sdma_reg_list_5_0);
+   uint32_t *ptr;
 
/* SDMA trap event */
r = amdgpu_irq_add_id(adev, SOC15_IH_CLIENTID_SDMA0,
@@ -1378,6 +1429,13 @@ static int sdma_v5_0_sw_init(void *handle)
return r;
}
 
+   /* Allocate memory for SDMA IP Dump buffer */
+   ptr = kcalloc(adev->sdma.num_instances * reg_count, sizeof(uint32_t), 
GFP_KERNEL);
+   if (ptr)
+   adev->sdma.ip_dump = ptr;
+   else
+   DRM_ERROR("Failed to allocate memory for SDMA IP Dump\n");
+
return r;
 }
 
@@ -1391,6 +1449,8 @@ static int sdma_v5_0_sw_fini(void *handle)
 
amdgpu_sdma_destroy_inst_ctx(adev, false);
 
+   kfree(adev->sdma.ip_dump);
+
return 0;
 }
 
@@ -1718,6 +1778,27 @@ static void sdma_v5_0_get_clockgating_state(void 
*handle, u64 *flags)
*flags |= AMD_CG_SUPPORT_SDMA_LS;
 }
 
+static void sdma_v5_0_dump_ip_state(void *handle)
+{
+   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+   int i, j;
+   uint32_t instance_offset;
+   uint32_t reg_count = ARRAY_SIZE(sdma_reg_list_5_0);
+
+   if (!adev->sdma.ip_dump)
+   return;
+
+ 

[PATCH v1 5/5] drm/amdgpu: add print support for sdma_v_5_0 ip_dump

2024-07-16 Thread Sunil Khatri
Add support for ip dump for sdma_v_5_0 in devcoredump.

Signed-off-by: Sunil Khatri 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index cb324a90b310..d5f0dc132a47 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -1778,6 +1778,27 @@ static void sdma_v5_0_get_clockgating_state(void 
*handle, u64 *flags)
*flags |= AMD_CG_SUPPORT_SDMA_LS;
 }
 
+static void sdma_v5_0_print_ip_state(void *handle, struct drm_printer *p)
+{
+   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+   int i, j;
+   uint32_t reg_count = ARRAY_SIZE(sdma_reg_list_5_0);
+   uint32_t instance_offset;
+
+   if (!adev->sdma.ip_dump)
+   return;
+
+   drm_printf(p, "num_instances:%d\n", adev->sdma.num_instances);
+   for (i = 0; i < adev->sdma.num_instances; i++) {
+   instance_offset = i * reg_count;
+   drm_printf(p, "\nInstance:%d\n", i);
+
+   for (j = 0; j < reg_count; j++)
+   drm_printf(p, "%-50s \t 0x%08x\n", 
sdma_reg_list_5_0[j].reg_name,
+  adev->sdma.ip_dump[instance_offset + j]);
+   }
+}
+
 static void sdma_v5_0_dump_ip_state(void *handle)
 {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
@@ -1816,6 +1837,7 @@ const struct amd_ip_funcs sdma_v5_0_ip_funcs = {
.set_powergating_state = sdma_v5_0_set_powergating_state,
.get_clockgating_state = sdma_v5_0_get_clockgating_state,
.dump_ip_state = sdma_v5_0_dump_ip_state,
+   .print_ip_state = sdma_v5_0_print_ip_state,
 };
 
 static const struct amdgpu_ring_funcs sdma_v5_0_ring_funcs = {
-- 
2.34.1



[PATCH v1 3/5] drm/amdgpu: fix the extra space between two functions

2024-07-16 Thread Sunil Khatri
Fix the spacing between two functions by adding the missing blank line.

Signed-off-by: Sunil Khatri 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index 630b03f2ce3d..66bb85955fa4 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -1742,6 +1742,7 @@ static void sdma_v5_2_print_ip_state(void *handle, struct 
drm_printer *p)
   adev->sdma.ip_dump[instance_offset + j]);
}
 }
+
 static void sdma_v5_2_dump_ip_state(void *handle)
 {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-- 
2.34.1



Re: [PATCH v1 5/5] drm/amdgpu: add print support for sdma_v_5_0 ip_dump

2024-07-16 Thread Alex Deucher
Series is:
Reviewed-by: Alex Deucher 

On Tue, Jul 16, 2024 at 7:20 AM Sunil Khatri  wrote:
>
> Add support for ip dump for sdma_v_5_0 in devcoredump.
>
> Signed-off-by: Sunil Khatri 
> ---
>  drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 22 ++
>  1 file changed, 22 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c 
> b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> index cb324a90b310..d5f0dc132a47 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> @@ -1778,6 +1778,27 @@ static void sdma_v5_0_get_clockgating_state(void 
> *handle, u64 *flags)
> *flags |= AMD_CG_SUPPORT_SDMA_LS;
>  }
>
> +static void sdma_v5_0_print_ip_state(void *handle, struct drm_printer *p)
> +{
> +   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> +   int i, j;
> +   uint32_t reg_count = ARRAY_SIZE(sdma_reg_list_5_0);
> +   uint32_t instance_offset;
> +
> +   if (!adev->sdma.ip_dump)
> +   return;
> +
> +   drm_printf(p, "num_instances:%d\n", adev->sdma.num_instances);
> +   for (i = 0; i < adev->sdma.num_instances; i++) {
> +   instance_offset = i * reg_count;
> +   drm_printf(p, "\nInstance:%d\n", i);
> +
> +   for (j = 0; j < reg_count; j++)
> +   drm_printf(p, "%-50s \t 0x%08x\n", 
> sdma_reg_list_5_0[j].reg_name,
> +  adev->sdma.ip_dump[instance_offset + j]);
> +   }
> +}
> +
>  static void sdma_v5_0_dump_ip_state(void *handle)
>  {
> struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> @@ -1816,6 +1837,7 @@ const struct amd_ip_funcs sdma_v5_0_ip_funcs = {
> .set_powergating_state = sdma_v5_0_set_powergating_state,
> .get_clockgating_state = sdma_v5_0_get_clockgating_state,
> .dump_ip_state = sdma_v5_0_dump_ip_state,
> +   .print_ip_state = sdma_v5_0_print_ip_state,
>  };
>
>  static const struct amdgpu_ring_funcs sdma_v5_0_ring_funcs = {
> --
> 2.34.1
>


Re: [PATCH 6/6] Documentation/amdgpu: Fix duplicate declaration

2024-07-16 Thread Alex Deucher
Series is:
Acked-by: Alex Deucher 

On Mon, Jul 15, 2024 at 10:50 PM Rodrigo Siqueira
 wrote:
>
> Address the below kernel doc warning:
>
> Documentation/gpu/amdgpu/display/display-manager:134:
> drivers/gpu/drm/amd/display/dc/inc/hw/mpc.h:3: WARNING: Duplicate C
> declaration, also defined at gpu/amdgpu/display/dcn-blocks:101.
> Declaration is '.. c:struct:: mpcc_blnd_cfg'.
> Documentation/gpu/amdgpu/display/display-manager:146:
> drivers/gpu/drm/amd/display/dc/inc/hw/mpc.h:3: WARNING: Duplicate C
> declaration, also defined at gpu/amdgpu/display/dcn-blocks:3.
> Declaration is '.. c:enum:: mpcc_alpha_blend_mode'.
>
> To address the above warnings, this commit uses the 'no-identifiers'
> option in the dcn-blocks file to avoid duplicating the declarations
> already documented in the display-manager file. Finally, it replaces the
> deprecated ':functions:' option in favor of ':identifiers:'.
>
> Cc: Alex Deucher 
> Reported-by: Stephen Rothwell 
> Signed-off-by: Rodrigo Siqueira 
> ---
>  Documentation/gpu/amdgpu/display/dcn-blocks.rst  | 1 +
>  Documentation/gpu/amdgpu/display/display-manager.rst | 4 ++--
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/gpu/amdgpu/display/dcn-blocks.rst 
> b/Documentation/gpu/amdgpu/display/dcn-blocks.rst
> index f80df596ef5c..5e34366f6dbe 100644
> --- a/Documentation/gpu/amdgpu/display/dcn-blocks.rst
> +++ b/Documentation/gpu/amdgpu/display/dcn-blocks.rst
> @@ -34,6 +34,7 @@ MPC
>
>  .. kernel-doc:: drivers/gpu/drm/amd/display/dc/inc/hw/mpc.h
> :internal:
> +   :no-identifiers: mpcc_blnd_cfg mpcc_alpha_blend_mode
>
>  OPP
>  ---
> diff --git a/Documentation/gpu/amdgpu/display/display-manager.rst 
> b/Documentation/gpu/amdgpu/display/display-manager.rst
> index 67a811e6891f..b269ff3f7a54 100644
> --- a/Documentation/gpu/amdgpu/display/display-manager.rst
> +++ b/Documentation/gpu/amdgpu/display/display-manager.rst
> @@ -132,7 +132,7 @@ The DRM blend mode and its elements are then mapped by 
> AMDGPU display manager
>  (MPC), as follows:
>
>  .. kernel-doc:: drivers/gpu/drm/amd/display/dc/inc/hw/mpc.h
> -   :functions: mpcc_blnd_cfg
> +   :identifiers: mpcc_blnd_cfg
>
>  Therefore, the blending configuration for a single MPCC instance on the MPC
>  tree is defined by :c:type:`mpcc_blnd_cfg`, where
> @@ -144,7 +144,7 @@ alpha and plane alpha values. It sets one of the three 
> modes for
>  :c:type:`MPCC_ALPHA_BLND_MODE`, as described below.
>
>  .. kernel-doc:: drivers/gpu/drm/amd/display/dc/inc/hw/mpc.h
> -   :functions: mpcc_alpha_blend_mode
> +   :identifiers: mpcc_alpha_blend_mode
>
>  DM then maps the elements of `enum mpcc_alpha_blend_mode` to those in the DRM
>  blend formula, as follows:
> --
> 2.43.0
>


[PATCH AUTOSEL 6.9 11/22] drm/amd/display: Reset freesync config before update new state

2024-07-16 Thread Sasha Levin
From: Tom Chung 

[ Upstream commit 6b8487cdf9fc7bae707519ac5b5daeca18d1e85b ]

[Why]
Sometimes the new_crtc_state->vrr_infopacket does not sync up with the
current state, so update_freesync_state_on_stream() does not update
the state correctly.

[How]
Reset the freesync config before get_freesync_config_for_crtc() to
make sure we have the correct new_crtc_state for VRR.

Reviewed-by: Sun peng Li 
Signed-off-by: Jerry Zuo 
Signed-off-by: Tom Chung 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index f866a02f4f489..53a55270998cc 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -10028,6 +10028,7 @@ static int dm_update_crtc_state(struct 
amdgpu_display_manager *dm,
}
 
/* Update Freesync settings. */
+   reset_freesync_config_for_crtc(dm_new_crtc_state);
get_freesync_config_for_crtc(dm_new_crtc_state,
 dm_new_conn_state);
 
-- 
2.43.0



[PATCH AUTOSEL 6.9 12/22] drm/amd/display: Add refresh rate range check

2024-07-16 Thread Sasha Levin
From: Tom Chung 

[ Upstream commit 74ad26b36d303ac233eccadc5c3a8d7ee4709f31 ]

[Why]
We only enable VRR while the monitor's usable refresh rate range
is greater than 10 Hz, but we did not check the range in the
DRM_EDID_FEATURE_CONTINUOUS_FREQ case.

[How]
Add a refresh rate range check before setting the freesync_capable flag
in the DRM_EDID_FEATURE_CONTINUOUS_FREQ case.

Reviewed-by: Mario Limonciello 
Reviewed-by: Rodrigo Siqueira 
Signed-off-by: Jerry Zuo 
Signed-off-by: Tom Chung 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 53a55270998cc..6f43797e1c060 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -11290,9 +11290,11 @@ void amdgpu_dm_update_freesync_caps(struct 
drm_connector *connector,
if (is_dp_capable_without_timing_msa(adev->dm.dc,
 amdgpu_dm_connector)) {
if (edid->features & DRM_EDID_FEATURE_CONTINUOUS_FREQ) {
-   freesync_capable = true;
amdgpu_dm_connector->min_vfreq = 
connector->display_info.monitor_range.min_vfreq;
amdgpu_dm_connector->max_vfreq = 
connector->display_info.monitor_range.max_vfreq;
+   if (amdgpu_dm_connector->max_vfreq -
+   amdgpu_dm_connector->min_vfreq > 10)
+   freesync_capable = true;
} else {
edid_check_required = edid->version > 1 ||
  (edid->version == 1 &&
-- 
2.43.0



[PATCH AUTOSEL 6.9 13/22] drm/amd/display: Account for cursor prefetch BW in DML1 mode support

2024-07-16 Thread Sasha Levin
From: Alvin Lee 

[ Upstream commit 074b3a886713f69d98d30bb348b1e4cb3ce52b22 ]

[Description]
We need to take cursor prefetch BW into account in mode support,
or we may pass ModeQuery but fail an actual flip, which will cause
a hang. The flip may fail because cursor_pre_bw is populated during
mode programming (and mode programming is never called prior to
ModeQuery).

Reviewed-by: Chaitanya Dhere 
Reviewed-by: Nevenko Stupar 
Signed-off-by: Jerry Zuo 
Signed-off-by: Alvin Lee 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
index 6c84b0fa40f44..0782a34689a00 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
@@ -3364,6 +3364,9 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct 
display_mode_lib *mode_l

&mode_lib->vba.UrgentBurstFactorLumaPre[k],

&mode_lib->vba.UrgentBurstFactorChromaPre[k],

&mode_lib->vba.NotUrgentLatencyHidingPre[k]);
+
+   v->cursor_bw_pre[k] = 
mode_lib->vba.NumberOfCursors[k] * mode_lib->vba.CursorWidth[k][0] * 
mode_lib->vba.CursorBPP[k][0] /
+   8.0 / 
(mode_lib->vba.HTotal[k] / mode_lib->vba.PixelClock[k]) * 
v->VRatioPreY[i][j][k];
}
 
{
-- 
2.43.0



[PATCH AUTOSEL 6.9 14/22] drm/amd/display: Fix refresh rate range for some panel

2024-07-16 Thread Sasha Levin
From: Tom Chung 

[ Upstream commit 9ef1548aeaa8858e7aee2152bf95cc71cdcd6dff ]

[Why]
Some panels do not have the refresh rate range info in the base
EDID and only provide it in the DisplayID block.
This causes the max/min freesync refresh rate to be set to 0.

[How]
Try to parse the refresh rate range info from DisplayID if the
max/min refresh rate is 0.

Reviewed-by: Sun peng Li 
Signed-off-by: Jerry Zuo 
Signed-off-by: Tom Chung 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 48 +++
 1 file changed, 48 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 6f43797e1c060..fc47d68877654 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -11162,6 +11162,49 @@ static bool parse_edid_cea(struct amdgpu_dm_connector 
*aconnector,
return ret;
 }
 
+static void parse_edid_displayid_vrr(struct drm_connector *connector,
+   struct edid *edid)
+{
+   u8 *edid_ext = NULL;
+   int i;
+   int j = 0;
+   u16 min_vfreq;
+   u16 max_vfreq;
+
+   if (edid == NULL || edid->extensions == 0)
+   return;
+
+   /* Find DisplayID extension */
+   for (i = 0; i < edid->extensions; i++) {
+   edid_ext = (void *)(edid + (i + 1));
+   if (edid_ext[0] == DISPLAYID_EXT)
+   break;
+   }
+
+   if (edid_ext == NULL)
+   return;
+
+   while (j < EDID_LENGTH) {
+   /* Get dynamic video timing range from DisplayID if available */
+   if (EDID_LENGTH - j > 13 && edid_ext[j] == 0x25 &&
+   (edid_ext[j+1] & 0xFE) == 0 && (edid_ext[j+2] == 9)) {
+   min_vfreq = edid_ext[j+9];
+   if (edid_ext[j+1] & 7)
+   max_vfreq = edid_ext[j+10] + ((edid_ext[j+11] & 
3) << 8);
+   else
+   max_vfreq = edid_ext[j+10];
+
+   if (max_vfreq && min_vfreq) {
+   connector->display_info.monitor_range.max_vfreq 
= max_vfreq;
+   connector->display_info.monitor_range.min_vfreq 
= min_vfreq;
+
+   return;
+   }
+   }
+   j++;
+   }
+}
+
 static int parse_amd_vsdb(struct amdgpu_dm_connector *aconnector,
  struct edid *edid, struct amdgpu_hdmi_vsdb_info 
*vsdb_info)
 {
@@ -11283,6 +11326,11 @@ void amdgpu_dm_update_freesync_caps(struct 
drm_connector *connector,
if (!adev->dm.freesync_module)
goto update;
 
+   /* Some eDP panels only have the refresh rate range info in DisplayID */
+   if ((connector->display_info.monitor_range.min_vfreq == 0 ||
+connector->display_info.monitor_range.max_vfreq == 0))
+   parse_edid_displayid_vrr(connector, edid);
+
if (edid && (sink->sink_signal == SIGNAL_TYPE_DISPLAY_PORT ||
 sink->sink_signal == SIGNAL_TYPE_EDP)) {
bool edid_check_required = false;
-- 
2.43.0



[PATCH AUTOSEL 6.9 15/22] drm/amd/display: Update efficiency bandwidth for dcn351

2024-07-16 Thread Sasha Levin
From: Fangzhi Zuo 

[ Upstream commit 7ae37db29a8bc4d3d116a409308dd98fc3a0b1b3 ]

Fix 4k240 underflow on dcn351

Acked-by: Rodrigo Siqueira 
Signed-off-by: Fangzhi Zuo 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c 
b/drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c
index a20f28a5d2e7b..3af759dca6ebf 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_translation_helper.c
@@ -233,6 +233,7 @@ void dml2_init_socbb_params(struct dml2_context *dml2, 
const struct dc *in_dc, s
out->round_trip_ping_latency_dcfclk_cycles = 106;
out->smn_latency_us = 2;
out->dispclk_dppclk_vco_speed_mhz = 3600;
+   out->pct_ideal_dram_bw_after_urgent_pixel_only = 65.0;
break;
 
}
-- 
2.43.0



[PATCH AUTOSEL 6.9 17/22] drm/radeon: check bo_va->bo is non-NULL before using it

2024-07-16 Thread Sasha Levin
From: Pierre-Eric Pelloux-Prayer 

[ Upstream commit 6fb15dcbcf4f212930350eaee174bb60ed40a536 ]

The call to radeon_vm_clear_freed might clear bo_va->bo, so
we have to check it before dereferencing it.

Signed-off-by: Pierre-Eric Pelloux-Prayer 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/radeon/radeon_gem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index 3fec3acdaf284..27225d1fe8d2e 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -641,7 +641,7 @@ static void radeon_gem_va_update_vm(struct radeon_device 
*rdev,
if (r)
goto error_unlock;
 
-   if (bo_va->it.start)
+   if (bo_va->it.start && bo_va->bo)
r = radeon_vm_bo_update(rdev, bo_va, bo_va->bo->tbo.resource);
 
 error_unlock:
-- 
2.43.0



[PATCH AUTOSEL 6.9 16/22] drm/amd/display: Fix array-index-out-of-bounds in dml2/FCLKChangeSupport

2024-07-16 Thread Sasha Levin
From: Roman Li 

[ Upstream commit 0ad4b4a2f6357c45fbe444ead1a929a0b4017d03 ]

[Why]
Potential out-of-bounds access in dml2_calculate_rq_and_dlg_params(),
because the value of out_lowest_state_idx, used as an index into the
FCLKChangeSupport array, can be greater than 1.

[How]
Currently dml2 core specifies identical values for all FCLKChangeSupport
elements. Always use index 0 in the condition to avoid out of bounds access.

Acked-by: Rodrigo Siqueira 
Signed-off-by: Jerry Zuo 
Signed-off-by: Roman Li 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c 
b/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c
index b72ed3e78df05..bb4e812248aec 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c
@@ -294,7 +294,7 @@ void dml2_calculate_rq_and_dlg_params(const struct dc *dc, 
struct dc_state *cont
context->bw_ctx.bw.dcn.clk.dcfclk_deep_sleep_khz = (unsigned 
int)in_ctx->v20.dml_core_ctx.mp.DCFCLKDeepSleep * 1000;
context->bw_ctx.bw.dcn.clk.dppclk_khz = 0;
 
-   if 
(in_ctx->v20.dml_core_ctx.ms.support.FCLKChangeSupport[in_ctx->v20.scratch.mode_support_params.out_lowest_state_idx]
 == dml_fclock_change_unsupported)
+   if (in_ctx->v20.dml_core_ctx.ms.support.FCLKChangeSupport[0] == 
dml_fclock_change_unsupported)
context->bw_ctx.bw.dcn.clk.fclk_p_state_change_support = false;
else
context->bw_ctx.bw.dcn.clk.fclk_p_state_change_support = true;
-- 
2.43.0



[PATCH AUTOSEL 6.6 10/18] drm/amd/display: Reset freesync config before update new state

2024-07-16 Thread Sasha Levin
From: Tom Chung 

[ Upstream commit 6b8487cdf9fc7bae707519ac5b5daeca18d1e85b ]

[Why]
Sometimes the new_crtc_state->vrr_infopacket does not sync up with the
current state, so update_freesync_state_on_stream() does not update
the state correctly.

[How]
Reset the freesync config before get_freesync_config_for_crtc() to
make sure we have the correct new_crtc_state for VRR.

Reviewed-by: Sun peng Li 
Signed-off-by: Jerry Zuo 
Signed-off-by: Tom Chung 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 7ed6bb61fe0ad..a1acc8108586f 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -9517,6 +9517,7 @@ static int dm_update_crtc_state(struct 
amdgpu_display_manager *dm,
}
 
/* Update Freesync settings. */
+   reset_freesync_config_for_crtc(dm_new_crtc_state);
get_freesync_config_for_crtc(dm_new_crtc_state,
 dm_new_conn_state);
 
-- 
2.43.0



[PATCH AUTOSEL 6.6 11/18] drm/amd/display: Add refresh rate range check

2024-07-16 Thread Sasha Levin
From: Tom Chung 

[ Upstream commit 74ad26b36d303ac233eccadc5c3a8d7ee4709f31 ]

[Why]
We only enable VRR while the monitor's usable refresh rate range
is greater than 10 Hz, but we did not check the range in the
DRM_EDID_FEATURE_CONTINUOUS_FREQ case.

[How]
Add a refresh rate range check before setting the freesync_capable flag
in the DRM_EDID_FEATURE_CONTINUOUS_FREQ case.

Reviewed-by: Mario Limonciello 
Reviewed-by: Rodrigo Siqueira 
Signed-off-by: Jerry Zuo 
Signed-off-by: Tom Chung 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index a1acc8108586f..023fd3945e47a 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -10760,9 +10760,11 @@ void amdgpu_dm_update_freesync_caps(struct 
drm_connector *connector,
if (is_dp_capable_without_timing_msa(adev->dm.dc,
 amdgpu_dm_connector)) {
if (edid->features & DRM_EDID_FEATURE_CONTINUOUS_FREQ) {
-   freesync_capable = true;
amdgpu_dm_connector->min_vfreq = 
connector->display_info.monitor_range.min_vfreq;
amdgpu_dm_connector->max_vfreq = 
connector->display_info.monitor_range.max_vfreq;
+   if (amdgpu_dm_connector->max_vfreq -
+   amdgpu_dm_connector->min_vfreq > 10)
+   freesync_capable = true;
} else {
edid_check_required = edid->version > 1 ||
  (edid->version == 1 &&
-- 
2.43.0



[PATCH AUTOSEL 6.6 12/18] drm/amd/display: Account for cursor prefetch BW in DML1 mode support

2024-07-16 Thread Sasha Levin
From: Alvin Lee 

[ Upstream commit 074b3a886713f69d98d30bb348b1e4cb3ce52b22 ]

[Description]
We need to take cursor prefetch BW into account in mode support,
or we may pass ModeQuery but fail an actual flip, which will cause
a hang. The flip may fail because cursor_pre_bw is populated during
mode programming (and mode programming is never called prior to
ModeQuery).

Reviewed-by: Chaitanya Dhere 
Reviewed-by: Nevenko Stupar 
Signed-off-by: Jerry Zuo 
Signed-off-by: Alvin Lee 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
index 6c84b0fa40f44..0782a34689a00 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
@@ -3364,6 +3364,9 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct 
display_mode_lib *mode_l

&mode_lib->vba.UrgentBurstFactorLumaPre[k],

&mode_lib->vba.UrgentBurstFactorChromaPre[k],

&mode_lib->vba.NotUrgentLatencyHidingPre[k]);
+
+   v->cursor_bw_pre[k] = 
mode_lib->vba.NumberOfCursors[k] * mode_lib->vba.CursorWidth[k][0] * 
mode_lib->vba.CursorBPP[k][0] /
+   8.0 / 
(mode_lib->vba.HTotal[k] / mode_lib->vba.PixelClock[k]) * 
v->VRatioPreY[i][j][k];
}
 
{
-- 
2.43.0



[PATCH AUTOSEL 6.6 13/18] drm/amd/display: Fix refresh rate range for some panel

2024-07-16 Thread Sasha Levin
From: Tom Chung 

[ Upstream commit 9ef1548aeaa8858e7aee2152bf95cc71cdcd6dff ]

[Why]
Some panels do not have the refresh rate range info in the base
EDID and only provide it in the DisplayID block.
This causes the max/min freesync refresh rate to be set to 0.

[How]
Try to parse the refresh rate range info from DisplayID if the
max/min refresh rate is 0.

Reviewed-by: Sun peng Li 
Signed-off-by: Jerry Zuo 
Signed-off-by: Tom Chung 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 48 +++
 1 file changed, 48 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 023fd3945e47a..e7664a39bfceb 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -10631,6 +10631,49 @@ static bool parse_edid_cea(struct amdgpu_dm_connector 
*aconnector,
return ret;
 }
 
+static void parse_edid_displayid_vrr(struct drm_connector *connector,
+   struct edid *edid)
+{
+   u8 *edid_ext = NULL;
+   int i;
+   int j = 0;
+   u16 min_vfreq;
+   u16 max_vfreq;
+
+   if (edid == NULL || edid->extensions == 0)
+   return;
+
+   /* Find DisplayID extension */
+   for (i = 0; i < edid->extensions; i++) {
+   edid_ext = (void *)(edid + (i + 1));
+   if (edid_ext[0] == DISPLAYID_EXT)
+   break;
+   }
+
+   if (edid_ext == NULL)
+   return;
+
+   while (j < EDID_LENGTH) {
+   /* Get dynamic video timing range from DisplayID if available */
+   if (EDID_LENGTH - j > 13 && edid_ext[j] == 0x25 &&
+   (edid_ext[j+1] & 0xFE) == 0 && (edid_ext[j+2] == 9)) {
+   min_vfreq = edid_ext[j+9];
+   if (edid_ext[j+1] & 7)
+   max_vfreq = edid_ext[j+10] + ((edid_ext[j+11] & 
3) << 8);
+   else
+   max_vfreq = edid_ext[j+10];
+
+   if (max_vfreq && min_vfreq) {
+   connector->display_info.monitor_range.max_vfreq 
= max_vfreq;
+   connector->display_info.monitor_range.min_vfreq 
= min_vfreq;
+
+   return;
+   }
+   }
+   j++;
+   }
+}
+
 static int parse_amd_vsdb(struct amdgpu_dm_connector *aconnector,
  struct edid *edid, struct amdgpu_hdmi_vsdb_info 
*vsdb_info)
 {
@@ -10753,6 +10796,11 @@ void amdgpu_dm_update_freesync_caps(struct 
drm_connector *connector,
if (!adev->dm.freesync_module)
goto update;
 
+   /* Some eDP panels only have the refresh rate range info in DisplayID */
+   if ((connector->display_info.monitor_range.min_vfreq == 0 ||
+connector->display_info.monitor_range.max_vfreq == 0))
+   parse_edid_displayid_vrr(connector, edid);
+
if (edid && (sink->sink_signal == SIGNAL_TYPE_DISPLAY_PORT ||
 sink->sink_signal == SIGNAL_TYPE_EDP)) {
bool edid_check_required = false;
-- 
2.43.0



[PATCH AUTOSEL 6.6 14/18] drm/radeon: check bo_va->bo is non-NULL before using it

2024-07-16 Thread Sasha Levin
From: Pierre-Eric Pelloux-Prayer 

[ Upstream commit 6fb15dcbcf4f212930350eaee174bb60ed40a536 ]

The call to radeon_vm_clear_freed might clear bo_va->bo, so
we have to check it before dereferencing it.

Signed-off-by: Pierre-Eric Pelloux-Prayer 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/radeon/radeon_gem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index 3fec3acdaf284..27225d1fe8d2e 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -641,7 +641,7 @@ static void radeon_gem_va_update_vm(struct radeon_device 
*rdev,
if (r)
goto error_unlock;
 
-   if (bo_va->it.start)
+   if (bo_va->it.start && bo_va->bo)
r = radeon_vm_bo_update(rdev, bo_va, bo_va->bo->tbo.resource);
 
 error_unlock:
-- 
2.43.0



[PATCH AUTOSEL 6.1 09/15] drm/amd/display: Reset freesync config before update new state

2024-07-16 Thread Sasha Levin
From: Tom Chung 

[ Upstream commit 6b8487cdf9fc7bae707519ac5b5daeca18d1e85b ]

[Why]
Sometimes the new_crtc_state->vrr_infopacket does not sync up with the
current state, so update_freesync_state_on_stream() does not update
the state correctly.

[How]
Reset the freesync config before get_freesync_config_for_crtc() to
make sure we have the correct new_crtc_state for VRR.

Reviewed-by: Sun peng Li 
Signed-off-by: Jerry Zuo 
Signed-off-by: Tom Chung 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 31bae620aeffc..ebf53a9a9dc89 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -9278,6 +9278,7 @@ static int dm_update_crtc_state(struct 
amdgpu_display_manager *dm,
}
 
/* Update Freesync settings. */
+   reset_freesync_config_for_crtc(dm_new_crtc_state);
get_freesync_config_for_crtc(dm_new_crtc_state,
 dm_new_conn_state);
 
-- 
2.43.0



[PATCH AUTOSEL 6.1 10/15] drm/amd/display: Account for cursor prefetch BW in DML1 mode support

2024-07-16 Thread Sasha Levin
From: Alvin Lee 

[ Upstream commit 074b3a886713f69d98d30bb348b1e4cb3ce52b22 ]

[Description]
We need to take cursor prefetch BW into account in mode support,
or we may pass ModeQuery but fail an actual flip, which will cause
a hang. The flip may fail because cursor_pre_bw is populated during
mode programming (and mode programming is never called prior to
ModeQuery).

Reviewed-by: Chaitanya Dhere 
Reviewed-by: Nevenko Stupar 
Signed-off-by: Jerry Zuo 
Signed-off-by: Alvin Lee 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
index cc8c1a48c5c4d..76df036fb2f34 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
@@ -3338,6 +3338,9 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct 
display_mode_lib *mode_l

&mode_lib->vba.UrgentBurstFactorLumaPre[k],

&mode_lib->vba.UrgentBurstFactorChromaPre[k],

&mode_lib->vba.NotUrgentLatencyHidingPre[k]);
+
+   v->cursor_bw_pre[k] = 
mode_lib->vba.NumberOfCursors[k] * mode_lib->vba.CursorWidth[k][0] * 
mode_lib->vba.CursorBPP[k][0] /
+   8.0 / 
(mode_lib->vba.HTotal[k] / mode_lib->vba.PixelClock[k]) * 
v->VRatioPreY[i][j][k];
}
 
{
-- 
2.43.0



[PATCH AUTOSEL 6.1 11/15] drm/radeon: check bo_va->bo is non-NULL before using it

2024-07-16 Thread Sasha Levin
From: Pierre-Eric Pelloux-Prayer 

[ Upstream commit 6fb15dcbcf4f212930350eaee174bb60ed40a536 ]

The call to radeon_vm_clear_freed might clear bo_va->bo, so
we have to check it before dereferencing it.

Signed-off-by: Pierre-Eric Pelloux-Prayer 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/radeon/radeon_gem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index 75d79c3110389..3388a3d21d2c0 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -657,7 +657,7 @@ static void radeon_gem_va_update_vm(struct radeon_device 
*rdev,
if (r)
goto error_unlock;
 
-   if (bo_va->it.start)
+   if (bo_va->it.start && bo_va->bo)
r = radeon_vm_bo_update(rdev, bo_va, bo_va->bo->tbo.resource);
 
 error_unlock:
-- 
2.43.0



[PATCH AUTOSEL 5.15 6/9] drm/amd/display: Reset freesync config before update new state

2024-07-16 Thread Sasha Levin
From: Tom Chung 

[ Upstream commit 6b8487cdf9fc7bae707519ac5b5daeca18d1e85b ]

[Why]
Sometimes the new_crtc_state->vrr_infopacket does not sync up with the
current state, so update_freesync_state_on_stream() does not update
the state correctly.

[How]
Reset the freesync config before get_freesync_config_for_crtc() to
make sure we have the correct new_crtc_state for VRR.

Reviewed-by: Sun peng Li 
Signed-off-by: Jerry Zuo 
Signed-off-by: Tom Chung 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index b821abb56ac3b..cbf1a9a625068 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -10475,6 +10475,7 @@ static int dm_update_crtc_state(struct 
amdgpu_display_manager *dm,
}
 
/* Update Freesync settings. */
+   reset_freesync_config_for_crtc(dm_new_crtc_state);
get_freesync_config_for_crtc(dm_new_crtc_state,
 dm_new_conn_state);
 
-- 
2.43.0



[PATCH AUTOSEL 5.15 7/9] drm/radeon: check bo_va->bo is non-NULL before using it

2024-07-16 Thread Sasha Levin
From: Pierre-Eric Pelloux-Prayer 

[ Upstream commit 6fb15dcbcf4f212930350eaee174bb60ed40a536 ]

The call to radeon_vm_clear_freed might clear bo_va->bo, so
we have to check it before dereferencing it.

Signed-off-by: Pierre-Eric Pelloux-Prayer 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/radeon/radeon_gem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index 57218263ef3b1..277a313432b28 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -653,7 +653,7 @@ static void radeon_gem_va_update_vm(struct radeon_device 
*rdev,
if (r)
goto error_unlock;
 
-   if (bo_va->it.start)
+   if (bo_va->it.start && bo_va->bo)
r = radeon_vm_bo_update(rdev, bo_va, bo_va->bo->tbo.resource);
 
 error_unlock:
-- 
2.43.0



[PATCH AUTOSEL 5.10 6/7] drm/amd/display: Reset freesync config before update new state

2024-07-16 Thread Sasha Levin
From: Tom Chung 

[ Upstream commit 6b8487cdf9fc7bae707519ac5b5daeca18d1e85b ]

[Why]
Sometimes the new_crtc_state->vrr_infopacket does not sync up with the
current state, so update_freesync_state_on_stream() does not update
the state correctly.

[How]
Reset the freesync config before get_freesync_config_for_crtc() to
make sure we have the correct new_crtc_state for VRR.

Reviewed-by: Sun peng Li 
Signed-off-by: Jerry Zuo 
Signed-off-by: Tom Chung 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 29ef0ed44d5f4..c957ef1283f68 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -8385,6 +8385,7 @@ static int dm_update_crtc_state(struct 
amdgpu_display_manager *dm,
}
 
/* Update Freesync settings. */
+   reset_freesync_config_for_crtc(dm_new_crtc_state);
get_freesync_config_for_crtc(dm_new_crtc_state,
 dm_new_conn_state);
 
-- 
2.43.0



[PATCH AUTOSEL 5.4 6/7] drm/amd/display: Reset freesync config before update new state

2024-07-16 Thread Sasha Levin
From: Tom Chung 

[ Upstream commit 6b8487cdf9fc7bae707519ac5b5daeca18d1e85b ]

[Why]
Sometimes the new_crtc_state->vrr_infopacket does not sync up with the
current state, so update_freesync_state_on_stream() does not update
the state correctly.

[How]
Reset the freesync config before get_freesync_config_for_crtc() to
make sure we have the correct new_crtc_state for VRR.

Reviewed-by: Sun peng Li 
Signed-off-by: Jerry Zuo 
Signed-off-by: Tom Chung 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 3bfc4aa328c6f..d400f9f4ca7bd 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6878,6 +6878,7 @@ static int dm_update_crtc_state(struct 
amdgpu_display_manager *dm,
}
 
/* Update Freesync settings. */
+   reset_freesync_config_for_crtc(dm_new_crtc_state);
get_freesync_config_for_crtc(dm_new_crtc_state,
 dm_new_conn_state);
 
-- 
2.43.0



Re: [PATCH AUTOSEL 6.9 11/22] drm/amd/display: Reset freesync config before update new state

2024-07-16 Thread Hamza Mahfooz

Hi Sasha,

On 7/16/24 10:24, Sasha Levin wrote:

From: Tom Chung 

[ Upstream commit 6b8487cdf9fc7bae707519ac5b5daeca18d1e85b ]

[Why]
Sometimes the new_crtc_state->vrr_infopacket does not sync up with the
current state, so update_freesync_state_on_stream() does not update
the state correctly.

[How]
Reset the freesync config before get_freesync_config_for_crtc() to
make sure we have the correct new_crtc_state for VRR.


Please drop this patch from the stable queue entirely, since it has
already been reverted (as of commit dc1000bf463d ("Revert
"drm/amd/display: Reset freesync config before update new state"")).



Reviewed-by: Sun peng Li 
Signed-off-by: Jerry Zuo 
Signed-off-by: Tom Chung 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index f866a02f4f489..53a55270998cc 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -10028,6 +10028,7 @@ static int dm_update_crtc_state(struct 
amdgpu_display_manager *dm,
}
  
  	/* Update Freesync settings. */

+   reset_freesync_config_for_crtc(dm_new_crtc_state);
get_freesync_config_for_crtc(dm_new_crtc_state,
 dm_new_conn_state);
  

--
Hamza



[PATCH] drm/amd/display: Add null check for dm_state in create_validate_stream_for_sink

2024-07-16 Thread Srinivasan Shanmugam
This commit adds a null check for the dm_state variable in the
create_validate_stream_for_sink function. Previously, dm_state was being
checked for nullity at line 7194, but then it was being dereferenced
without any nullity check at line 7200. This could potentially lead to a
null pointer dereference error if dm_state is indeed null.

We now ensure that dm_state is not null before dereferencing it by
adding a null check for dm_state before the call to
create_stream_for_sink at line 7200. If dm_state is null, we log an
error message and return NULL immediately.

This fix prevents a null pointer dereference error.

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:7201 
create_validate_stream_for_sink()
error: we previously assumed 'dm_state' could be null (see line 7194)

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c
7185 struct dc_stream_state *
7186 create_validate_stream_for_sink(struct amdgpu_dm_connector *aconnector,
7187 const struct drm_display_mode 
*drm_mode,
7188 const struct dm_connector_state 
*dm_state,
7189 const struct dc_stream_state 
*old_stream)
7190 {
7191 struct drm_connector *connector = &aconnector->base;
7192 struct amdgpu_device *adev = drm_to_adev(connector->dev);
7193 struct dc_stream_state *stream;
7194 const struct drm_connector_state *drm_state = dm_state ? 
&dm_state->base : NULL;
   
 ^ This used check connector->state 
but then we changed it to dm_state instead

7195 int requested_bpc = drm_state ? drm_state->max_requested_bpc : 
8;
7196 enum dc_status dc_result = DC_OK;
7197
7198 do {
7199 stream = create_stream_for_sink(connector, drm_mode,
7200 dm_state, old_stream,
 

But dm_state is dereferenced on the next line without checking.  (Presumably 
the NULL check can be removed).

--> 7201 requested_bpc);
7202 if (stream == NULL) {
7203 DRM_ERROR("Failed to create stream for 
sink!\n");
7204 break;
7205 }
7206
7207 if (aconnector->base.connector_type == 
DRM_MODE_CONNECTOR_WRITEBACK)

Fixes: fa7041d9d2fc ("drm/amd/display: Fix ineffective setting of max bpc 
property")
Reported-by: Dan Carpenter 
Cc: Tom Chung 
Cc: Rodrigo Siqueira 
Cc: Roman Li 
Cc: Hersen Wu 
Cc: Alex Hung 
Cc: Aurabindo Pillai 
Cc: Harry Wentland 
Cc: Hamza Mahfooz 
Signed-off-by: Srinivasan Shanmugam 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index d1527c2e46a1..b7eaece455c8 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7195,6 +7195,11 @@ create_validate_stream_for_sink(struct 
amdgpu_dm_connector *aconnector,
int requested_bpc = drm_state ? drm_state->max_requested_bpc : 8;
enum dc_status dc_result = DC_OK;
 
+   if (!dm_state) {
+   DRM_ERROR("dm_state is NULL!\n");
+   return NULL;
+   }
+
do {
stream = create_stream_for_sink(connector, drm_mode,
dm_state, old_stream,
-- 
2.34.1



[PATCH v5 1/2] drm/buddy: Add start address support to trim function

2024-07-16 Thread Arunpravin Paneer Selvam
- Add a new start parameter to the trim function to specify the exact
  address from which to start trimming. This helps in situations where
  drivers want to do address alignment for specific requirements.

- Add a new flag, DRM_BUDDY_TRIM_DISABLE. Drivers can use this
  flag to disable the allocator's trimming step, giving them control
  over trimming so they can do it themselves based on application
  requirements (see the usage sketch after this list).
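
For illustration, a minimal usage sketch combining both additions (not part
of the patch; the helper name is made up and 'alignment' is assumed to be a
power of two, but the drm_buddy calls follow the interfaces changed here):

/* Sketch only: over-allocate with the allocator's trim disabled, then
 * trim back to 'size' ourselves from an aligned start address. */
static int example_alloc_aligned(struct drm_buddy *mm, u64 size,
				 u64 alignment, struct list_head *blocks)
{
	struct drm_buddy_block *block;
	u64 new_start;
	int err;

	err = drm_buddy_alloc_blocks(mm, 0, mm->size, size + alignment,
				     mm->chunk_size, blocks,
				     DRM_BUDDY_CONTIGUOUS_ALLOCATION |
				     DRM_BUDDY_TRIM_DISABLE);
	if (err)
		return err;

	block = list_first_entry(blocks, struct drm_buddy_block, link);
	new_start = ALIGN(drm_buddy_block_offset(block), alignment);

	/* New start parameter: trim to 'size' from the aligned address. */
	return drm_buddy_block_trim(mm, &new_start, size, blocks);
}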

v1:(Matthew)
  - check new_start alignment with min chunk_size
  - use range_overflows()

Signed-off-by: Arunpravin Paneer Selvam 
Acked-by: Alex Deucher 
Acked-by: Christian König 
---
 drivers/gpu/drm/drm_buddy.c  | 25 +++--
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c |  2 +-
 include/drm/drm_buddy.h  |  2 ++
 3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 6a8e45e9d0ec..103c185bb1c8 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -851,6 +851,7 @@ static int __alloc_contig_try_harder(struct drm_buddy *mm,
  * drm_buddy_block_trim - free unused pages
  *
  * @mm: DRM buddy manager
+ * @start: start address to begin the trimming.
  * @new_size: original size requested
  * @blocks: Input and output list of allocated blocks.
  * MUST contain single block as input to be trimmed.
@@ -866,11 +867,13 @@ static int __alloc_contig_try_harder(struct drm_buddy *mm,
  * 0 on success, error code on failure.
  */
 int drm_buddy_block_trim(struct drm_buddy *mm,
+u64 *start,
 u64 new_size,
 struct list_head *blocks)
 {
struct drm_buddy_block *parent;
struct drm_buddy_block *block;
+   u64 block_start, block_end;
LIST_HEAD(dfs);
u64 new_start;
int err;
@@ -882,6 +885,9 @@ int drm_buddy_block_trim(struct drm_buddy *mm,
 struct drm_buddy_block,
 link);
 
+   block_start = drm_buddy_block_offset(block);
+   block_end = block_start + drm_buddy_block_size(mm, block);
+
if (WARN_ON(!drm_buddy_block_is_allocated(block)))
return -EINVAL;
 
@@ -894,6 +900,20 @@ int drm_buddy_block_trim(struct drm_buddy *mm,
if (new_size == drm_buddy_block_size(mm, block))
return 0;
 
+   new_start = block_start;
+   if (start) {
+   new_start = *start;
+
+   if (new_start < block_start)
+   return -EINVAL;
+
+   if (!IS_ALIGNED(new_start, mm->chunk_size))
+   return -EINVAL;
+
+   if (range_overflows(new_start, new_size, block_end))
+   return -EINVAL;
+   }
+
list_del(&block->link);
mark_free(mm, block);
mm->avail += drm_buddy_block_size(mm, block);
@@ -904,7 +924,6 @@ int drm_buddy_block_trim(struct drm_buddy *mm,
parent = block->parent;
block->parent = NULL;
 
-   new_start = drm_buddy_block_offset(block);
list_add(&block->tmp_link, &dfs);
err =  __alloc_range(mm, &dfs, new_start, new_size, blocks, NULL);
if (err) {
@@ -1066,7 +1085,8 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
} while (1);
 
/* Trim the allocated block to the required size */
-   if (original_size != size) {
+   if (!(flags & DRM_BUDDY_TRIM_DISABLE) &&
+   original_size != size) {
struct list_head *trim_list;
LIST_HEAD(temp);
u64 trim_size;
@@ -1083,6 +1103,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
}
 
drm_buddy_block_trim(mm,
+NULL,
 trim_size,
 trim_list);
 
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c 
b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index fe3779fdba2c..423b261ea743 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -150,7 +150,7 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager 
*man,
} while (remaining_size);
 
if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
-   if (!drm_buddy_block_trim(mm, vres->base.size, &vres->blocks))
+   if (!drm_buddy_block_trim(mm, NULL, vres->base.size, 
&vres->blocks))
size = vres->base.size;
}
 
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
index 2a74fa9d0ce5..9689a7c5dd36 100644
--- a/include/drm/drm_buddy.h
+++ b/include/drm/drm_buddy.h
@@ -27,6 +27,7 @@
 #define DRM_BUDDY_CONTIGUOUS_ALLOCATIONBIT(2)
 #define DRM_BUDDY_CLEAR_ALLOCATION BIT(3)
 #define DRM_BUDDY_CLEARED  BIT(4)
+#define DRM_BUDDY_TRIM_DISABLE BIT(5)
 
 struct drm_buddy_block {
 #define DRM_BUDDY_HEADER_OFFSET GENMASK_ULL(63,

[PATCH v5 2/2] drm/amdgpu: Add address alignment support to DCC buffers

2024-07-16 Thread Arunpravin Paneer Selvam
Add address alignment support to the DCC VRAM buffers.

v2:
  - adjust size based on the max_texture_channel_caches values
only for GFX12 DCC buffers.
  - used AMDGPU_GEM_CREATE_GFX12_DCC flag to apply change only
for DCC buffers.
  - round up the adjusted size of non-power-of-two DCC buffers to the
nearest power of two, as the buddy allocator does not support
non-power-of-two alignments. This applies only to contiguous
DCC buffers.

v3:(Alex)
  - rewrite the max texture channel caches comparison code in an
algorithmic way to determine the alignment size.

v4:(Alex)
  - Move the logic from amdgpu_vram_mgr_dcc_alignment() to gmc_v12_0.c
and add a new gmc func callback for dcc alignment. If the callback
is non-NULL, call it to get the alignment, otherwise, use the default.

v5:(Alex)
  - Set the Alignment to a default value if the callback doesn't exist.
  - Add the callback to amdgpu_gmc_funcs.

v6:
  - Fix checkpatch error reported by Intel CI.

Signed-off-by: Arunpravin Paneer Selvam 
Acked-by: Alex Deucher 
Acked-by: Christian König 
Reviewed-by: Frank Min 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h  |  6 
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 36 ++--
 drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c   | 15 
 3 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index febca3130497..654d0548a3f8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -156,6 +156,8 @@ struct amdgpu_gmc_funcs {
  uint64_t addr, uint64_t *flags);
/* get the amount of memory used by the vbios for pre-OS console */
unsigned int (*get_vbios_fb_size)(struct amdgpu_device *adev);
+   /* get the DCC buffer alignment */
+   u64 (*get_dcc_alignment)(struct amdgpu_device *adev);
 
enum amdgpu_memory_partition (*query_mem_partition_mode)(
struct amdgpu_device *adev);
@@ -363,6 +365,10 @@ struct amdgpu_gmc {
(adev)->gmc.gmc_funcs->override_vm_pte_flags\
((adev), (vm), (addr), (pte_flags))
 #define amdgpu_gmc_get_vbios_fb_size(adev) 
(adev)->gmc.gmc_funcs->get_vbios_fb_size((adev))
+#define amdgpu_gmc_get_dcc_alignment(_adev) ({ \
+   typeof(_adev) (adev) = (_adev); \
+   ((adev)->gmc.gmc_funcs->get_dcc_alignment((adev))); \
+})
 
 /**
  * amdgpu_gmc_vram_full_visible - Check if full VRAM is visible through the BAR
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index f91cc149d06c..aa9dca12371c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -512,6 +512,16 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
vres->flags |= DRM_BUDDY_RANGE_ALLOCATION;
 
remaining_size = (u64)vres->base.size;
+   if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS &&
+   bo->flags & AMDGPU_GEM_CREATE_GFX12_DCC) {
+   u64 adjust_size;
+
+   if (adev->gmc.gmc_funcs->get_dcc_alignment) {
+   adjust_size = amdgpu_gmc_get_dcc_alignment(adev);
+   remaining_size = roundup_pow_of_two(remaining_size + 
adjust_size);
+   vres->flags |= DRM_BUDDY_TRIM_DISABLE;
+   }
+   }
 
mutex_lock(&mgr->lock);
while (remaining_size) {
@@ -521,8 +531,12 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
min_block_size = mgr->default_page_size;
 
size = remaining_size;
-   if ((size >= (u64)pages_per_block << PAGE_SHIFT) &&
-   !(size & (((u64)pages_per_block << PAGE_SHIFT) - 1)))
+
+   if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS &&
+   bo->flags & AMDGPU_GEM_CREATE_GFX12_DCC)
+   min_block_size = size;
+   else if ((size >= (u64)pages_per_block << PAGE_SHIFT) &&
+!(size & (((u64)pages_per_block << PAGE_SHIFT) - 1)))
min_block_size = (u64)pages_per_block << PAGE_SHIFT;
 
BUG_ON(min_block_size < mm->chunk_size);
@@ -553,6 +567,24 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
}
mutex_unlock(&mgr->lock);
 
+   if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS &&
+   bo->flags & AMDGPU_GEM_CREATE_GFX12_DCC) {
+   struct drm_buddy_block *dcc_block;
+   u64 dcc_start, alignment;
+
+   dcc_block = amdgpu_vram_mgr_first_block(&vres->blocks);
+   dcc_start = amdgpu_vram_mgr_block_start(dcc_block);
+
+   if (adev->gmc.gmc_funcs->get_dcc_alignment) {
+   alignment = amdgpu_gmc_get_dcc_alig

[PATCH] drm/amd/display: Add null check for set_output_gamma in dcn30_set_output_transfer_func

2024-07-16 Thread Srinivasan Shanmugam
This commit adds a null check for the set_output_gamma function pointer
in the dcn30_set_output_transfer_func function. Previously,
set_output_gamma was being checked for nullity at line 386, but then it
was being dereferenced without any nullity check at line 401. This
could potentially lead to a null pointer dereference error if
set_output_gamma is indeed null.

To fix this, we now ensure that set_output_gamma is not null before
dereferencing it. We do this by adding a nullity check for
set_output_gamma before the call to set_output_gamma at line 401. If
set_output_gamma is null, we log an error message and do not call the
function.

This fix prevents a potential null pointer dereference error.

drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:401 
dcn30_set_output_transfer_func()
error: we previously assumed 'mpc->funcs->set_output_gamma' could be null (see 
line 386)

drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c
373 bool dcn30_set_output_transfer_func(struct dc *dc,
374 struct pipe_ctx *pipe_ctx,
375 const struct dc_stream_state *stream)
376 {
377 int mpcc_id = pipe_ctx->plane_res.hubp->inst;
378 struct mpc *mpc = 
pipe_ctx->stream_res.opp->ctx->dc->res_pool->mpc;
379 const struct pwl_params *params = NULL;
380 bool ret = false;
381
382 /* program OGAM or 3DLUT only for the top pipe*/
383 if (pipe_ctx->top_pipe == NULL) {
384 /*program rmu shaper and 3dlut in MPC*/
385 ret = dcn30_set_mpc_shaper_3dlut(pipe_ctx, stream);
386 if (ret == false && mpc->funcs->set_output_gamma) {
 If 
this is NULL

387 if (stream->out_transfer_func.type == 
TF_TYPE_HWPWL)
388 params = &stream->out_transfer_func.pwl;
389 else if 
(pipe_ctx->stream->out_transfer_func.type ==
390 TF_TYPE_DISTRIBUTED_POINTS &&
391 
cm3_helper_translate_curve_to_hw_format(
392 &stream->out_transfer_func,
393 &mpc->blender_params, false))
394 params = &mpc->blender_params;
395  /* there are no ROM LUTs in OUTGAM */
396 if (stream->out_transfer_func.type == 
TF_TYPE_PREDEFINED)
397 BREAK_TO_DEBUGGER();
398 }
399 }
400
--> 401 mpc->funcs->set_output_gamma(mpc, mpcc_id, params);
 Then it will crash

402 return ret;
403 }

Fixes: d99f13878d6f ("drm/amd/display: Add DCN3 HWSEQ")
Reported-by: Dan Carpenter 
Cc: Tom Chung 
Cc: Rodrigo Siqueira 
Cc: Roman Li 
Cc: Hersen Wu 
Cc: Alex Hung 
Cc: Aurabindo Pillai 
Cc: Harry Wentland 
Cc: Hamza Mahfooz 
Signed-off-by: Srinivasan Shanmugam 
---
 drivers/gpu/drm/amd/display/dc/hwss/dcn30/dcn30_hwseq.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn30/dcn30_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/hwss/dcn30/dcn30_hwseq.c
index eaeeade31ed7..bd807eb79786 100644
--- a/drivers/gpu/drm/amd/display/dc/hwss/dcn30/dcn30_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn30/dcn30_hwseq.c
@@ -398,7 +398,11 @@ bool dcn30_set_output_transfer_func(struct dc *dc,
}
}
 
-   mpc->funcs->set_output_gamma(mpc, mpcc_id, params);
+   if (mpc->funcs->set_output_gamma)
+   mpc->funcs->set_output_gamma(mpc, mpcc_id, params);
+   else
+   DRM_ERROR("set_output_gamma function pointer is NULL.\n");
+
return ret;
 }
 
-- 
2.34.1



Re: [PATCH] drm/amd/display: Add null check for dm_state in create_validate_stream_for_sink

2024-07-16 Thread Hamza Mahfooz

On 7/16/24 11:08, Srinivasan Shanmugam wrote:

This commit adds a null check for the dm_state variable in the
create_validate_stream_for_sink function. Previously, dm_state was being
checked for nullity at line 7194, but then it was being dereferenced
without any nullity check at line 7200. This could potentially lead to a
null pointer dereference error if dm_state is indeed null.

We now ensure that dm_state is not null before dereferencing it by
adding a null check for dm_state before the call to
create_stream_for_sink at line 7200. If dm_state is null, we log an
error message and return NULL immediately.

This fix prevents a null pointer dereference error.

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:7201 
create_validate_stream_for_sink()
error: we previously assumed 'dm_state' could be null (see line 7194)

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c
 7185 struct dc_stream_state *
 7186 create_validate_stream_for_sink(struct amdgpu_dm_connector 
*aconnector,
 7187 const struct drm_display_mode 
*drm_mode,
 7188 const struct dm_connector_state 
*dm_state,
 7189 const struct dc_stream_state 
*old_stream)
 7190 {
 7191 struct drm_connector *connector = &aconnector->base;
 7192 struct amdgpu_device *adev = drm_to_adev(connector->dev);
 7193 struct dc_stream_state *stream;
 7194 const struct drm_connector_state *drm_state = dm_state ? 
&dm_state->base : NULL;

  ^ This used check 
connector->state but then we changed it to dm_state instead

 7195 int requested_bpc = drm_state ? drm_state->max_requested_bpc 
: 8;
 7196 enum dc_status dc_result = DC_OK;
 7197
 7198 do {
 7199 stream = create_stream_for_sink(connector, drm_mode,
 7200 dm_state, old_stream,
  

But dm_state is dereferenced on the next line without checking.  (Presumably 
the NULL check can be removed).

--> 7201 requested_bpc);
 7202 if (stream == NULL) {
 7203 DRM_ERROR("Failed to create stream for 
sink!\n");
 7204 break;
 7205 }
 7206
 7207 if (aconnector->base.connector_type == 
DRM_MODE_CONNECTOR_WRITEBACK)

Fixes: fa7041d9d2fc ("drm/amd/display: Fix ineffective setting of max bpc 
property")
Reported-by: Dan Carpenter 
Cc: Tom Chung 
Cc: Rodrigo Siqueira 
Cc: Roman Li 
Cc: Hersen Wu 
Cc: Alex Hung 
Cc: Aurabindo Pillai 
Cc: Harry Wentland 
Cc: Hamza Mahfooz 
Signed-off-by: Srinivasan Shanmugam 
---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index d1527c2e46a1..b7eaece455c8 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7195,6 +7195,11 @@ create_validate_stream_for_sink(struct 
amdgpu_dm_connector *aconnector,
int requested_bpc = drm_state ? drm_state->max_requested_bpc : 8;
enum dc_status dc_result = DC_OK;
  
+	if (!dm_state) {

+   DRM_ERROR("dm_state is NULL!\n");


Use drm_err() instead, DRM_ERROR() is deprecated.


+   return NULL;
+   }
+
do {
stream = create_stream_for_sink(connector, drm_mode,
dm_state, old_stream,
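
For reference, the check rewritten with drm_err() might look like this
(a sketch only; connector->dev is already in scope in
create_validate_stream_for_sink, per the context quoted above):

	if (!dm_state) {
		drm_err(connector->dev, "dm_state is NULL!\n");
		return NULL;
	}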

--
Hamza



[PATCH] drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell

2024-07-16 Thread Alex Deucher
We seem to have a case where SDMA will sometimes miss a doorbell
if GFX is entering the powergating state when the doorbell comes in.
To work around this, we can update the wptr via MMIO. However,
this is only safe because we disallow gfxoff in begin_use() for
SDMA 5.2 and then allow it again in end_use().

Enable this workaround while we are root causing the issue with
the HW team.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/3440
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index 7e475d9b554e..3c37e3cd3cbf 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -225,6 +225,14 @@ static void sdma_v5_2_ring_set_wptr(struct amdgpu_ring 
*ring)
DRM_DEBUG("calling WDOORBELL64(0x%08x, 0x%016llx)\n",
ring->doorbell_index, ring->wptr << 2);
WDOORBELL64(ring->doorbell_index, ring->wptr << 2);
+   /* SDMA seems to miss doorbells sometimes when powergating 
kicks in.
+* Updating the wptr directly will wake it. This is only safe 
because
+* we disallow gfxoff in begin_use() and then allow it again in 
end_use().
+*/
+   WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, 
mmSDMA0_GFX_RB_WPTR),
+  lower_32_bits(ring->wptr << 2));
+   WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, 
mmSDMA0_GFX_RB_WPTR_HI),
+  upper_32_bits(ring->wptr << 2));
} else {
DRM_DEBUG("Not using doorbell -- "
"mmSDMA%i_GFX_RB_WPTR == 0x%08x "
@@ -1707,6 +1715,10 @@ static void sdma_v5_2_ring_begin_use(struct amdgpu_ring 
*ring)
 * but it shouldn't hurt for other parts since
 * this GFXOFF will be disallowed anyway when SDMA is
 * active, this just makes it explicit.
+* sdma_v5_2_ring_set_wptr() takes advantage of this
+* to update the wptr because sometimes SDMA seems to miss
+* doorbells when entering PG.  If you remove this, update
+* sdma_v5_2_ring_set_wptr() as well!
 */
amdgpu_gfx_off_ctrl(adev, false);
 }
-- 
2.45.2



Re: DisplayPort: handling of HPD events / link training

2024-07-16 Thread Thomas Zimmermann

Hi

Am 27.02.24 um 23:40 schrieb Dmitry Baryshkov:

Hello,

We are currently looking at checking and/or possibly redesigning the
way the MSM DRM driver handles the HPD events and link training.

After a quick glance at the drivers implementing DP support, I noticed
the following main approaches:
- Perform link training at the atomic_enable time, don't report
failures (mtk, analogix, zynqmp, tegra, nouveau)
- Perform link training at the atomic_enable time, report errors using
link_status property (i915, mhdp8546)
- Perform link training on the plug event (msm, it8605).
- Perform link training from the DPMS handler, also calling it from
the enable callback (AMDGPU, radeon).

It looks like the majority wins and we should move HPD to
atomic_enable time. Is that assumption correct?


Did you ever receive an answer to this question? I'm currently investigating 
ast's DP code, which does link training as part of detecting the 
connector state (in detect_ctx). But most other drivers do this in 
atomic_enable. I wonder if ast should follow.


Best regards
Thomas



Also two related questions:
- Is there a plan to actually make use of the link_status property?
Intel presented it at FOSDEM 2018, but since that time it was not
picked up by other drivers.

- Is there any plan to create generic DP link training helpers? After
glancing through the DP drivers there is a lot of similar code in the
link training functions, with minor differences here and there. And
it's those minor differences that bug me. It means that drivers might
respond differently to similar devices. Or that there might be minor
bugs here and there.



--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)



Re: DisplayPort: handling of HPD events / link training

2024-07-16 Thread Dmitry Baryshkov
On Tue, 16 Jul 2024 at 18:58, Thomas Zimmermann  wrote:
>
> Hi
>
> Am 27.02.24 um 23:40 schrieb Dmitry Baryshkov:
> > Hello,
> >
> > We are currently looking at checking and/or possibly redesigning the
> > way the MSM DRM driver handles the HPD events and link training.
> >
> > After a quick glance at the drivers implementing DP support, I noticed
> > the following main approaches:
> > - Perform link training at the atomic_enable time, don't report
> > failures (mtk, analogix, zynqmp, tegra, nouveau)
> > - Perform link training at the atomic_enable time, report errors using
> > link_status property (i915, mhdp8546)
> > - Perform link training on the plug event (msm, it8605).
> > - Perform link training from the DPMS handler, also calling it from
> > the enable callback (AMDGPU, radeon).
> >
> > It looks like the majority wins and we should move HPD to
> > atomic_enable time. Is that assumption correct?
>
> Did you ever receive an answer to this question? I'm currently investigating
> ast's DP code, which does link training as part of detecting the
> connector state (in detect_ctx). But most other drivers do this in
> atomic_enable. I wonder if ast should follow.

Short answer: yes, the only proper place to do it is atomic_enable().

Long answer: I don't see a way to retrigger link training in ast_dp.c.
Without such a change you are just shifting things around. The
end result of moving link training to atomic_enable() is that each
enable can trigger link training, possibly lowering the link rate,
etc. If link training is just a status bit from the firmware that we
don't control, it doesn't make real sense to move it.
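
For illustration, a rough sketch of that majority pattern: train the link
when the encoder is enabled and defer failure reporting so user space can
retry the modeset. Everything prefixed my_dp_ (and the link_status_work
member) is hypothetical; only the drm_* calls are existing helpers.

static void my_dp_encoder_atomic_enable(struct drm_encoder *encoder,
					struct drm_atomic_state *state)
{
	struct my_dp *dp = to_my_dp(encoder);

	my_dp_power_up(dp);

	if (my_dp_link_train(dp) < 0) {
		drm_dbg_kms(encoder->dev, "DP link training failed\n");
		/* Report outside the commit, e.g. a worker that sets
		 * DRM_MODE_LINK_STATUS_BAD on the connector so user
		 * space can trigger another modeset. */
		schedule_work(&dp->link_status_work);
		return;
	}

	my_dp_start_video(dp);
}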

>
> Best regards
> Thomas
>
> >
> > Also two related questions:
> > - Is there a plan to actually make use of the link_status property?
> > Intel presented it at FOSDEM 2018, but since that time it was not
> > picked up by other drivers.
> >
> > - Is there any plan to create generic DP link training helpers? After
> > glancing through the DP drivers there is a lot of similar code in the
> > link training functions, with minor differences here and there. And
> > it's those minor differences that bug me. It means that drivers might
> > respond differently to similar devices. Or that there might be minor
> > bugs here and there.
> >
>
> --
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Frankenstrasse 146, 90461 Nuernberg, Germany
> GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
> HRB 36809 (AG Nuernberg)
>


-- 
With best wishes
Dmitry


Re: DisplayPort: handling of HPD events / link training

2024-07-16 Thread Thomas Zimmermann

Hi

Am 16.07.24 um 18:35 schrieb Dmitry Baryshkov:

On Tue, 16 Jul 2024 at 18:58, Thomas Zimmermann  wrote:

Hi

Am 27.02.24 um 23:40 schrieb Dmitry Baryshkov:

Hello,

We are currently looking at checking and/or possibly redesigning the
way the MSM DRM driver handles the HPD events and link training.

After a quick glance at the drivers implementing DP support, I noticed
following main approaches:
- Perform link training at the atomic_enable time, don't report
failures (mtk, analogix, zynqmp, tegra, nouveau)
- Perform link training at the atomic_enable time, report errors using
link_status property (i915, mhdp8546)
- Perform link training on the plug event (msm, it8605).
- Perform link training from the DPMS handler, also calling it from
the enable callback (AMDGPU, radeon).

It looks like the majority wins and we should move HPD to
atomic_enable time. Is that assumption correct?

Did you ever receive an answer to this question? I currently investigate
ast's DP code, which does link training as part of detecting the
connector state (in detect_ctx). But most other drivers do this in
atomic_enable. I wonder if ast should follow.

Short answer: yes, the only proper place to do it is atomic_enable().


Thanks.



Long answer: I don't see a way to retrigger link training in ast_dp.c.
Without such a change you are just shifting things around. The
end result of moving link training to atomic_enable() is that each
enable can trigger link training, possibly lowering the link rate,
etc. If link training is just a status bit from the firmware that we
don't control, it doesn't make much sense to move it.


I have to think about what to do. People tend to copy existing drivers, 
which alone might be a good argument for using atomic_enable. The link 
training is indeed just a flag that is set by the firmware. I think it's 
possible to re-trigger training by powering the port down and up again. 
atomic_enable could likely do that. The hardware is also somewhat buggy 
and not fully standard conformant.
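
(A sketch of that idea, with invented ast_dp_* helpers; only drm_warn() and
the encoder-helper prototype are existing DRM pieces.)

static void ast_dp_atomic_enable_sketch(struct drm_encoder *encoder,
					struct drm_atomic_state *state)
{
	struct drm_device *dev = encoder->dev;

	/* Power-cycle the port so the firmware restarts link training
	 * (ast_dp_set_power() and ast_dp_link_training_done() are
	 * hypothetical), then poll the firmware's status flag.
	 */
	ast_dp_set_power(dev, false);
	ast_dp_set_power(dev, true);

	if (!ast_dp_link_training_done(dev))
		drm_warn(dev, "DP link training did not complete\n");
}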


Best regards
Thomas




Best regards
Thomas


Also two related questions:
- Is there a plan to actually make use of the link_status property?
Intel presented it at FOSDEM 2018, but since that time it was not
picked up by other drivers.

- Is there any plan to create generic DP link training helpers? After
glancing through the DP drivers there is a lot of similar code in the
link training functions, with minor differences here and there. And
it's those minor differences that bug me. It means that drivers might
respond differently to similar devices. Or that there might be minor
bugs here and there.


--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)





--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)



Re: 6.10/bisected/regression - commits bc87d666c05 and 6d4279cb99ac cause appearing green flashing bar on top of screen on Radeon 6900XT and 120Hz

2024-07-16 Thread Alex Deucher
Does the attached partial revert fix it?

Alex

On Wed, Jul 10, 2024 at 3:03 AM Mikhail Gavrilov
 wrote:
>
> On Wed, Jul 10, 2024 at 12:01 PM Mikhail Gavrilov
>  wrote:
> >
> > On Tue, Jul 9, 2024 at 7:48 PM Rodrigo Siqueira Jordao
> >  wrote:
> > > Hi,
> > >
> > > I also tried it with 6900XT. I got the same results on my side.
> >
> > This is weird.
> >
> > > Anyway, I could not reproduce the issue with the below components. I may
> > > be missing something that will trigger this bug; in this sense, could
> > > you describe the following:
> > > - The display resolution and refresh rate.
> >
> > 3840x2160 and 120Hz
> > At 60Hz the issue is not reproduced.
> >
> > > - Are you able to reproduce this issue with DP and HDMI?
> >
> > My TV, an OLED LG C3, has only an HDMI 2.1 port.
> >
> > > - Could you provide the firmware information: sudo cat
> > > /sys/kernel/debug/dri/0/amdgpu_firmware_info
> >
> > > sudo cat /sys/kernel/debug/dri/0/amdgpu_firmware_info
> > [sudo] password for mikhail:
> > VCE feature version: 0, firmware version: 0x
> > UVD feature version: 0, firmware version: 0x
> > MC feature version: 0, firmware version: 0x
> > ME feature version: 38, firmware version: 0x000e
> > PFP feature version: 38, firmware version: 0x000e
> > CE feature version: 38, firmware version: 0x0003
> > RLC feature version: 1, firmware version: 0x001f
> > RLC SRLC feature version: 1, firmware version: 0x0001
> > RLC SRLG feature version: 1, firmware version: 0x0001
> > RLC SRLS feature version: 1, firmware version: 0x0001
> > RLCP feature version: 0, firmware version: 0x
> > RLCV feature version: 0, firmware version: 0x
> > MEC feature version: 38, firmware version: 0x0015
> > MEC2 feature version: 38, firmware version: 0x0015
> > IMU feature version: 0, firmware version: 0x
> > SOS feature version: 0, firmware version: 0x
> > ASD feature version: 553648344, firmware version: 0x21d8
> > TA XGMI feature version: 0x, firmware version: 0x
> > TA RAS feature version: 0x, firmware version: 0x
> > TA HDCP feature version: 0x, firmware version: 0x173f
> > TA DTM feature version: 0x, firmware version: 0x1216
> > TA RAP feature version: 0x, firmware version: 0x
> > TA SECUREDISPLAY feature version: 0x, firmware version: 0x
> > SMC feature version: 0, program: 0, firmware version: 0x00544fdf (84.79.223)
> > SDMA0 feature version: 52, firmware version: 0x0009
> > VCN feature version: 0, firmware version: 0x0311f002
> > DMCU feature version: 0, firmware version: 0x
> > DMCUB feature version: 0, firmware version: 0x05000f00
> > TOC feature version: 0, firmware version: 0x0007
> > MES_KIQ feature version: 0, firmware version: 0x
> > MES feature version: 0, firmware version: 0x
> > VPE feature version: 0, firmware version: 0x
> > VBIOS version: 102-RAPHAEL-008
> >
>
> I forgot to add output for discrete GPU:
>
> > sudo cat /sys/kernel/debug/dri/1/amdgpu_firmware_info
> [sudo] password for mikhail:
> VCE feature version: 0, firmware version: 0x
> UVD feature version: 0, firmware version: 0x
> MC feature version: 0, firmware version: 0x
> ME feature version: 44, firmware version: 0x0040
> PFP feature version: 44, firmware version: 0x0062
> CE feature version: 44, firmware version: 0x0025
> RLC feature version: 1, firmware version: 0x0060
> RLC SRLC feature version: 0, firmware version: 0x
> RLC SRLG feature version: 0, firmware version: 0x
> RLC SRLS feature version: 0, firmware version: 0x
> RLCP feature version: 0, firmware version: 0x
> RLCV feature version: 0, firmware version: 0x
> MEC feature version: 44, firmware version: 0x0076
> MEC2 feature version: 44, firmware version: 0x0076
> IMU feature version: 0, firmware version: 0x
> SOS feature version: 0, firmware version: 0x00210e64
> ASD feature version: 553648345, firmware version: 0x21d9
> TA XGMI feature version: 0x, firmware version: 0x200f
> TA RAS feature version: 0x, firmware version: 0x1b00013e
> TA HDCP feature version: 0x, firmware version: 0x173f
> TA DTM feature version: 0x, firmware version: 0x1216
> TA RAP feature version: 0x, firmware version: 0x0716
> TA SECUREDISPLAY feature version: 0x, firmware version: 0x
> SMC feature version: 0, program: 0, firmware version: 0x003a5a00 (58.90.0)
> SDMA0 feature version: 52, firmware version: 0x0053
> SDMA1 feature version: 52, firmware version: 0x0053
> SDMA2 feature version: 52, firmware version: 0x0053
> SDMA3 feature version: 52, firmware version: 0x0053
> VCN feature version: 0, firmware version: 0x0311f002
> DMCU feature version: 0, firmware version: 0x
> DMCUB feature version: 0, firmware version: 0x02020020
> TOC 

Re: DisplayPort: handling of HPD events / link training

2024-07-16 Thread Dmitry Baryshkov
On Tue, Jul 16, 2024 at 06:48:12PM GMT, Thomas Zimmermann wrote:
> Hi
> 
> Am 16.07.24 um 18:35 schrieb Dmitry Baryshkov:
> > On Tue, 16 Jul 2024 at 18:58, Thomas Zimmermann  wrote:
> > > Hi
> > > 
> > > Am 27.02.24 um 23:40 schrieb Dmitry Baryshkov:
> > > > Hello,
> > > > 
> > > > We are currently looking at checking and/or possibly redesigning the
> > > > way the MSM DRM driver handles the HPD events and link training.
> > > > 
> > > > After a quick glance at the drivers implementing DP support, I noticed
> > > > following main approaches:
> > > > - Perform link training at the atomic_enable time, don't report
> > > > failures (mtk, analogix, zynqmp, tegra, nouveau)
> > > > - Perform link training at the atomic_enable time, report errors using
> > > > link_status property (i915, mhdp8546)
> > > > - Perform link training on the plug event (msm, it8605).
> > > > - Perform link training from the DPMS handler, also calling it from
> > > > the enable callback (AMDGPU, radeon).
> > > > 
> > > > It looks like the majority wins and we should move HPD to
> > > > atomic_enable time. Is that assumption correct?
> > > Did you ever receive an answer to this question? I currently investigate
> > > ast's DP code, which does link training as part of detecting the
> > > connector state (in detect_ctx). But most other drivers do this in
> > > atomic_enable. I wonder if ast should follow.
> > Short answer: yes, the only proper place to do it is atomic_enable().
> 
> Thanks.
> 
> > 
> > Long answer: I don't see a way to retrigger link training in ast_dp.c.
> > Without such a change you are just shifting things around. The
> > end result of moving link training to atomic_enable() is that each
> > enable can trigger link training, possibly lowering the link rate,
> > etc. If link training is just a status bit from the firmware that we
> > don't control, it doesn't make much sense to move it.
> 
> I have to think about what to do. People tend to copy existing drivers,
> which alone might be a good argument for using atomic_enable. The link
> training is indeed just a flag that is set by the firmware. I think it's
> possible to re-trigger training by powering the port down and up again.
> atomic_enable could likely do that. The hardware is also somewhat buggy and
> not fully standard conformant.

It still looks like having an explicit comment ('check LT here because
it is handled by firmware') might be better.
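
(Roughly, i.e. something like the following comment next to the status-flag
check in detect_ctx; the wording is invented, not an actual ast comment.)

	/*
	 * Link training is done by the firmware; the driver only reads
	 * back its status flag here in detect_ctx(), so there is nothing
	 * to re-run from atomic_enable().
	 */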

-- 
With best wishes
Dmitry


[PATCH] drm/amd/display: fix corruption with high refresh rates on DCN 3.0

2024-07-16 Thread Alex Deucher
This reverts commit bc87d666c05a13e6d4ae1ddce41fc43d2567b9a2 and the
register changes from commit 6d4279cb99ac4f51d10409501d29969f687ac8dc.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3478
Cc: mikhail.v.gavri...@gmail.com
Cc: Rodrigo Siqueira 
Signed-off-by: Alex Deucher 
---
 .../drm/amd/display/dc/optc/dcn10/dcn10_optc.c| 15 +++
 .../drm/amd/display/dc/optc/dcn20/dcn20_optc.c| 10 ++
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c 
b/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c
index 4f82146d94b1..f00d27b7c6fe 100644
--- a/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c
+++ b/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c
@@ -950,19 +950,10 @@ void optc1_set_drr(
OTG_FORCE_LOCK_ON_EVENT, 0,
OTG_SET_V_TOTAL_MIN_MASK_EN, 0,
OTG_SET_V_TOTAL_MIN_MASK, 0);
-
-   // Setup manual flow control for EOF via TRIG_A
-   optc->funcs->setup_manual_trigger(optc);
-
-   } else {
-   REG_UPDATE_4(OTG_V_TOTAL_CONTROL,
-   OTG_SET_V_TOTAL_MIN_MASK, 0,
-   OTG_V_TOTAL_MIN_SEL, 0,
-   OTG_V_TOTAL_MAX_SEL, 0,
-   OTG_FORCE_LOCK_ON_EVENT, 0);
-
-   optc->funcs->set_vtotal_min_max(optc, 0, 0);
}
+
+   // Setup manual flow control for EOF via TRIG_A
+   optc->funcs->setup_manual_trigger(optc);
 }
 
 void optc1_set_vtotal_min_max(struct timing_generator *optc, int vtotal_min, 
int vtotal_max)
diff --git a/drivers/gpu/drm/amd/display/dc/optc/dcn20/dcn20_optc.c 
b/drivers/gpu/drm/amd/display/dc/optc/dcn20/dcn20_optc.c
index 43417cff2c9b..b4694985a40a 100644
--- a/drivers/gpu/drm/amd/display/dc/optc/dcn20/dcn20_optc.c
+++ b/drivers/gpu/drm/amd/display/dc/optc/dcn20/dcn20_optc.c
@@ -453,6 +453,16 @@ void optc2_setup_manual_trigger(struct timing_generator 
*optc)
 {
struct optc *optc1 = DCN10TG_FROM_TG(optc);
 
+   /* Set the min/max selectors unconditionally so that
+* DMCUB fw may change OTG timings when necessary
+* TODO: Remove the w/a after fixing the issue in DMCUB firmware
+*/
+   REG_UPDATE_4(OTG_V_TOTAL_CONTROL,
+OTG_V_TOTAL_MIN_SEL, 1,
+OTG_V_TOTAL_MAX_SEL, 1,
+OTG_FORCE_LOCK_ON_EVENT, 0,
+OTG_SET_V_TOTAL_MIN_MASK, (1 << 1)); /* TRIGA 
*/
+
REG_SET_8(OTG_TRIGA_CNTL, 0,
OTG_TRIGA_SOURCE_SELECT, 21,
OTG_TRIGA_SOURCE_PIPE_SELECT, optc->inst,
-- 
2.45.2



Re: [PATCH] drm/amd/display: fix corruption with high refresh rates on DCN 3.0

2024-07-16 Thread Rodrigo Siqueira Jordao




On 7/16/24 11:33 AM, Alex Deucher wrote:

This reverts commit bc87d666c05a13e6d4ae1ddce41fc43d2567b9a2 and the
register changes from commit 6d4279cb99ac4f51d10409501d29969f687ac8dc.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3478
Cc: mikhail.v.gavri...@gmail.com
Cc: Rodrigo Siqueira 
Signed-off-by: Alex Deucher 
---
  .../drm/amd/display/dc/optc/dcn10/dcn10_optc.c| 15 +++
  .../drm/amd/display/dc/optc/dcn20/dcn20_optc.c| 10 ++
  2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c 
b/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c
index 4f82146d94b1..f00d27b7c6fe 100644
--- a/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c
+++ b/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c
@@ -950,19 +950,10 @@ void optc1_set_drr(
OTG_FORCE_LOCK_ON_EVENT, 0,
OTG_SET_V_TOTAL_MIN_MASK_EN, 0,
OTG_SET_V_TOTAL_MIN_MASK, 0);
-
-   // Setup manual flow control for EOF via TRIG_A
-   optc->funcs->setup_manual_trigger(optc);
-
-   } else {
-   REG_UPDATE_4(OTG_V_TOTAL_CONTROL,
-   OTG_SET_V_TOTAL_MIN_MASK, 0,
-   OTG_V_TOTAL_MIN_SEL, 0,
-   OTG_V_TOTAL_MAX_SEL, 0,
-   OTG_FORCE_LOCK_ON_EVENT, 0);
-
-   optc->funcs->set_vtotal_min_max(optc, 0, 0);
}
+
+   // Setup manual flow control for EOF via TRIG_A
+   optc->funcs->setup_manual_trigger(optc);
  }
  
  void optc1_set_vtotal_min_max(struct timing_generator *optc, int vtotal_min, int vtotal_max)

diff --git a/drivers/gpu/drm/amd/display/dc/optc/dcn20/dcn20_optc.c 
b/drivers/gpu/drm/amd/display/dc/optc/dcn20/dcn20_optc.c
index 43417cff2c9b..b4694985a40a 100644
--- a/drivers/gpu/drm/amd/display/dc/optc/dcn20/dcn20_optc.c
+++ b/drivers/gpu/drm/amd/display/dc/optc/dcn20/dcn20_optc.c
@@ -453,6 +453,16 @@ void optc2_setup_manual_trigger(struct timing_generator 
*optc)
  {
struct optc *optc1 = DCN10TG_FROM_TG(optc);
  
+	/* Set the min/max selectors unconditionally so that

+* DMCUB fw may change OTG timings when necessary
+* TODO: Remove the w/a after fixing the issue in DMCUB firmware
+*/
+   REG_UPDATE_4(OTG_V_TOTAL_CONTROL,
+OTG_V_TOTAL_MIN_SEL, 1,
+OTG_V_TOTAL_MAX_SEL, 1,
+OTG_FORCE_LOCK_ON_EVENT, 0,
+OTG_SET_V_TOTAL_MIN_MASK, (1 << 1)); /* TRIGA 
*/
+
REG_SET_8(OTG_TRIGA_CNTL, 0,
OTG_TRIGA_SOURCE_SELECT, 21,
OTG_TRIGA_SOURCE_PIPE_SELECT, optc->inst,


(+Jay)

Reviewed-by: Rodrigo Siqueira 


Re: [PATCH] drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell

2024-07-16 Thread Friedrich Vock

On 16.07.24 17:54, Alex Deucher wrote:

We seem to have a case where SDMA will sometimes miss a doorbell
if GFX is entering the powergating state when the doorbell comes in.
To work around this, we can update the wptr via MMIO; however,
this is only safe because we disallow gfxoff in begin_use() for
SDMA 5.2 and then allow it again in end_use().

Enable this workaround while we are root causing the issue with
the HW team.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/3440
Signed-off-by: Alex Deucher 


Looks like it works for me.
Tested-by: Friedrich Vock 

Is there a particular reason you chose to still go with the doorbell
path plus updating the wptr via MMIO instead of setting
ring->use_doorbell to false? The workaround shipping in SteamOS does
that - if that has some adverse effects or something like that we should
probably stop :)

Thanks,
Friedrich


---
  drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 12 
  1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index 7e475d9b554e..3c37e3cd3cbf 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -225,6 +225,14 @@ static void sdma_v5_2_ring_set_wptr(struct amdgpu_ring 
*ring)
DRM_DEBUG("calling WDOORBELL64(0x%08x, 0x%016llx)\n",
ring->doorbell_index, ring->wptr << 2);
WDOORBELL64(ring->doorbell_index, ring->wptr << 2);
+   /* SDMA seems to miss doorbells sometimes when powergating 
kicks in.
+* Updating the wptr directly will wake it. This is only safe 
because
+* we disallow gfxoff in begin_use() and then allow it again in 
end_use().
+*/
+   WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, 
mmSDMA0_GFX_RB_WPTR),
+  lower_32_bits(ring->wptr << 2));
+   WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, 
mmSDMA0_GFX_RB_WPTR_HI),
+  upper_32_bits(ring->wptr << 2));
} else {
DRM_DEBUG("Not using doorbell -- "
"mmSDMA%i_GFX_RB_WPTR == 0x%08x "
@@ -1707,6 +1715,10 @@ static void sdma_v5_2_ring_begin_use(struct amdgpu_ring 
*ring)
 * but it shouldn't hurt for other parts since
 * this GFXOFF will be disallowed anyway when SDMA is
 * active, this just makes it explicit.
+* sdma_v5_2_ring_set_wptr() takes advantage of this
+* to update the wptr because sometimes SDMA seems to miss
+* doorbells when entering PG.  If you remove this, update
+* sdma_v5_2_ring_set_wptr() as well!
 */
amdgpu_gfx_off_ctrl(adev, false);
  }


Re: [PATCH] drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell

2024-07-16 Thread Alex Deucher
On Tue, Jul 16, 2024 at 2:30 PM Friedrich Vock  wrote:
>
> On 16.07.24 17:54, Alex Deucher wrote:
> > We seem to have a case where SDMA will sometimes miss a doorbell
> > if GFX is entering the powergating state when the doorbell comes in.
> > To workaround this, we can update the wptr via MMIO, however,
> > this is only safe because we disallow gfxoff in begin_ring() for
> > SDMA 5.2 and then allow it again in end_ring().
> >
> > Enable this workaround while we are root causing the issue with
> > the HW team.
> >
> > Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/3440
> > Signed-off-by: Alex Deucher 
>
> Looks like it works for me.
> Tested-by: Friedrich Vock 
>
> Is there a particular reason you chose to still go with the doorbell
> path plus updating the wptr via MMIO instead of setting
> ring->use_doorbell to false? The workaround shipping in SteamOS does
> that - if that has some adverse effects or something like that we should
> probably stop :)

Either way would work I think.  I just wanted to call out in the patch
that any access to SDMA or GFX MMIO needs to be done while gfxoff is
disallowed (via ring begin_use in this case); otherwise you will hang
if gfx is in the off state.  If you want to go with disabling the
doorbell, we should double check that there are not any other places
where we access MMIO registers directly in the !doorbell case.  I
don't think there are, but I didn't look too closely.

Alex
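
(A compressed sketch of that rule, inlined here rather than split across
begin_use()/end_use() as the patch actually does; amdgpu_gfx_off_ctrl() and
the register write are taken from the patch above.)

	amdgpu_gfx_off_ctrl(adev, false);	/* keep GFX/SDMA powered */

	WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, mmSDMA0_GFX_RB_WPTR),
	       lower_32_bits(ring->wptr << 2));

	amdgpu_gfx_off_ctrl(adev, true);	/* allow gfxoff again */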

>
> Thanks,
> Friedrich
>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 12 
> >   1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c 
> > b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> > index 7e475d9b554e..3c37e3cd3cbf 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> > @@ -225,6 +225,14 @@ static void sdma_v5_2_ring_set_wptr(struct amdgpu_ring 
> > *ring)
> >   DRM_DEBUG("calling WDOORBELL64(0x%08x, 0x%016llx)\n",
> >   ring->doorbell_index, ring->wptr << 2);
> >   WDOORBELL64(ring->doorbell_index, ring->wptr << 2);
> > + /* SDMA seems to miss doorbells sometimes when powergating 
> > kicks in.
> > +  * Updating the wptr directly will wake it. This is only safe 
> > because
> > +  * we disallow gfxoff in begin_use() and then allow it again 
> > in end_use().
> > +  */
> > + WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, 
> > mmSDMA0_GFX_RB_WPTR),
> > +lower_32_bits(ring->wptr << 2));
> > + WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, 
> > mmSDMA0_GFX_RB_WPTR_HI),
> > +upper_32_bits(ring->wptr << 2));
> >   } else {
> >   DRM_DEBUG("Not using doorbell -- "
> >   "mmSDMA%i_GFX_RB_WPTR == 0x%08x "
> > @@ -1707,6 +1715,10 @@ static void sdma_v5_2_ring_begin_use(struct 
> > amdgpu_ring *ring)
> >* but it shouldn't hurt for other parts since
> >* this GFXOFF will be disallowed anyway when SDMA is
> >* active, this just makes it explicit.
> > +  * sdma_v5_2_ring_set_wptr() takes advantage of this
> > +  * to update the wptr because sometimes SDMA seems to miss
> > +  * doorbells when entering PG.  If you remove this, update
> > +  * sdma_v5_2_ring_set_wptr() as well!
> >*/
> >   amdgpu_gfx_off_ctrl(adev, false);
> >   }


[PATCH v2] drm/amd/display: Add null check for dm_state in create_validate_stream_for_sink

2024-07-16 Thread Srinivasan Shanmugam
This commit adds a null check for the dm_state variable in the
create_validate_stream_for_sink function. Previously, dm_state was being
checked for nullity at line 7194, but then it was being dereferenced
without any nullity check at line 7200. This could potentially lead to a
null pointer dereference error if dm_state is indeed null.

We now ensure that dm_state is not null before dereferencing it. We do
this by adding a null check for dm_state before the call to
create_stream_for_sink at line 7200. If dm_state is null, we log an
error message and return NULL immediately.

This fix prevents a null pointer dereference error.

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:7201 
create_validate_stream_for_sink()
error: we previously assumed 'dm_state' could be null (see line 7194)

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c
7185 struct dc_stream_state *
7186 create_validate_stream_for_sink(struct amdgpu_dm_connector *aconnector,
7187 const struct drm_display_mode 
*drm_mode,
7188 const struct dm_connector_state 
*dm_state,
7189 const struct dc_stream_state 
*old_stream)
7190 {
7191 struct drm_connector *connector = &aconnector->base;
7192 struct amdgpu_device *adev = drm_to_adev(connector->dev);
7193 struct dc_stream_state *stream;
7194 const struct drm_connector_state *drm_state = dm_state ? 
&dm_state->base : NULL;
   
  ^ This used to check connector->state
but then we changed it to dm_state instead

7195 int requested_bpc = drm_state ? drm_state->max_requested_bpc : 
8;
7196 enum dc_status dc_result = DC_OK;
7197
7198 do {
7199 stream = create_stream_for_sink(connector, drm_mode,
7200 dm_state, old_stream,
 

But dm_state is dereferenced on the next line without checking.  (Presumably 
the NULL check can be removed).

--> 7201 requested_bpc);
7202 if (stream == NULL) {
7203 DRM_ERROR("Failed to create stream for 
sink!\n");
7204 break;
7205 }
7206
7207 if (aconnector->base.connector_type == 
DRM_MODE_CONNECTOR_WRITEBACK)

Fixes: fa7041d9d2fc ("drm/amd/display: Fix ineffective setting of max bpc 
property")
Reported-by: Dan Carpenter 
Cc: Tom Chung 
Cc: Rodrigo Siqueira 
Cc: Roman Li 
Cc: Hersen Wu 
Cc: Alex Hung 
Cc: Aurabindo Pillai 
Cc: Harry Wentland 
Cc: Hamza Mahfooz 
Signed-off-by: Srinivasan Shanmugam 
---
v2: s/DRM_ERROR/drm_err() (Hamza)
   
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index d1527c2e46a1..e7516a2dcb10 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7195,6 +7195,11 @@ create_validate_stream_for_sink(struct 
amdgpu_dm_connector *aconnector,
int requested_bpc = drm_state ? drm_state->max_requested_bpc : 8;
enum dc_status dc_result = DC_OK;
 
+   if (!dm_state) {
+   drm_err(&adev->ddev, "dm_state is NULL!\n");
+   return NULL;
+   }
+
do {
stream = create_stream_for_sink(connector, drm_mode,
dm_state, old_stream,
-- 
2.34.1



Re: [PATCH v5 2/2] drm/amdgpu: Add address alignment support to DCC buffers

2024-07-16 Thread Marek Olšák
AMDGPU_GEM_CREATE_GFX12_DCC is set on 90% of all memory allocations, and
almost all of them are not displayable. Shouldn't we use a different way to
indicate that we need a non-power-of-two alignment, such as looking at the
alignment field directly?

Marek
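
(A rough sketch of that alternative, keying off the requested alignment
instead of the flag; it assumes the alignment is still visible in the VRAM
manager as tbo->page_alignment, which is an assumption here, not a claim
about the current TTM/amdgpu structures.)

	u64 align = (u64)tbo->page_alignment << PAGE_SHIFT;

	if (align && !is_power_of_2(align)) {
		/* non-power-of-two alignment requested: round up the size
		 * and disable trimming, as the DCC path above does
		 */
	}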

On Tue, Jul 16, 2024, 11:45 Arunpravin Paneer Selvam <
arunpravin.paneersel...@amd.com> wrote:

> Add address alignment support to the DCC VRAM buffers.
>
> v2:
>   - adjust size based on the max_texture_channel_caches values
> only for GFX12 DCC buffers.
>   - used AMDGPU_GEM_CREATE_GFX12_DCC flag to apply change only
> for DCC buffers.
>   - roundup non power of two DCC buffer adjusted size to nearest
> power of two number as the buddy allocator does not support non
> power of two alignments. This applies only to the contiguous
> DCC buffers.
>
> v3:(Alex)
>   - rewrite the max texture channel caches comparison code in an
> algorithmic way to determine the alignment size.
>
> v4:(Alex)
>   - Move the logic from amdgpu_vram_mgr_dcc_alignment() to gmc_v12_0.c
> and add a new gmc func callback for dcc alignment. If the callback
> is non-NULL, call it to get the alignment, otherwise, use the default.
>
> v5:(Alex)
>   - Set the Alignment to a default value if the callback doesn't exist.
>   - Add the callback to amdgpu_gmc_funcs.
>
> v6:
>   - Fix checkpatch error reported by Intel CI.
>
> Signed-off-by: Arunpravin Paneer Selvam 
> Acked-by: Alex Deucher 
> Acked-by: Christian König 
> Reviewed-by: Frank Min 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h  |  6 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 36 ++--
>  drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c   | 15 
>  3 files changed, 55 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> index febca3130497..654d0548a3f8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> @@ -156,6 +156,8 @@ struct amdgpu_gmc_funcs {
>   uint64_t addr, uint64_t *flags);
> /* get the amount of memory used by the vbios for pre-OS console */
> unsigned int (*get_vbios_fb_size)(struct amdgpu_device *adev);
> +   /* get the DCC buffer alignment */
> +   u64 (*get_dcc_alignment)(struct amdgpu_device *adev);
>
> enum amdgpu_memory_partition (*query_mem_partition_mode)(
> struct amdgpu_device *adev);
> @@ -363,6 +365,10 @@ struct amdgpu_gmc {
> (adev)->gmc.gmc_funcs->override_vm_pte_flags\
> ((adev), (vm), (addr), (pte_flags))
>  #define amdgpu_gmc_get_vbios_fb_size(adev)
> (adev)->gmc.gmc_funcs->get_vbios_fb_size((adev))
> +#define amdgpu_gmc_get_dcc_alignment(_adev) ({ \
> +   typeof(_adev) (adev) = (_adev); \
> +   ((adev)->gmc.gmc_funcs->get_dcc_alignment((adev))); \
> +})
>
>  /**
>   * amdgpu_gmc_vram_full_visible - Check if full VRAM is visible through
> the BAR
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> index f91cc149d06c..aa9dca12371c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> @@ -512,6 +512,16 @@ static int amdgpu_vram_mgr_new(struct
> ttm_resource_manager *man,
> vres->flags |= DRM_BUDDY_RANGE_ALLOCATION;
>
> remaining_size = (u64)vres->base.size;
> +   if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS &&
> +   bo->flags & AMDGPU_GEM_CREATE_GFX12_DCC) {
> +   u64 adjust_size;
> +
> +   if (adev->gmc.gmc_funcs->get_dcc_alignment) {
> +   adjust_size = amdgpu_gmc_get_dcc_alignment(adev);
> +   remaining_size = roundup_pow_of_two(remaining_size
> + adjust_size);
> +   vres->flags |= DRM_BUDDY_TRIM_DISABLE;
> +   }
> +   }
>
> mutex_lock(&mgr->lock);
> while (remaining_size) {
> @@ -521,8 +531,12 @@ static int amdgpu_vram_mgr_new(struct
> ttm_resource_manager *man,
> min_block_size = mgr->default_page_size;
>
> size = remaining_size;
> -   if ((size >= (u64)pages_per_block << PAGE_SHIFT) &&
> -   !(size & (((u64)pages_per_block << PAGE_SHIFT) - 1)))
> +
> +   if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS &&
> +   bo->flags & AMDGPU_GEM_CREATE_GFX12_DCC)
> +   min_block_size = size;
> +   else if ((size >= (u64)pages_per_block << PAGE_SHIFT) &&
> +!(size & (((u64)pages_per_block << PAGE_SHIFT) -
> 1)))
> min_block_size = (u64)pages_per_block <<
> PAGE_SHIFT;
>
> BUG_ON(min_block_size < mm->chunk_size);
> @@ -553,6 +567,24 @@ static int amdgpu_vram_mgr_new(struct
> t

Re: 6.10/bisected/regression - commits bc87d666c05 and 6d4279cb99ac cause appearing green flashing bar on top of screen on Radeon 6900XT and 120Hz

2024-07-16 Thread Mikhail Gavrilov
On Tue, Jul 16, 2024 at 10:10 PM Alex Deucher  wrote:
>
> Does the attached partial revert fix it?
>
> Alex
>

Yes, thanks.

Tested-by: Mikhail Gavrilov 

-- 
Best Regards,
Mike Gavrilov.