Re: [PATCH 3/3] drm: Convert open yes/no strings to yesno()

2022-01-26 Thread Lucas De Marchi
On Wed, Jan 19, 2022 at 09:30:47PM +0200, Andy Shevchenko wrote: On Tue, Jan 18, 2022 at 11:24:50PM -0800, Lucas De Marchi wrote: linux/string_helpers.h provides a helper to return "yes"/"no" strings. Replace the open coded versions with yesno(). The places were identified with the following sem

[PATCH v2 01/11] lib/string_helpers: Consolidate string helpers implementation

2022-01-26 Thread Lucas De Marchi
There are a few implementations of string helpers in the tree, like yesno(), that just return "yes" or "no" depending on a boolean argument. Those are helpful to output strings to the user or log. In order to consolidate them, give all of them a str_ prefix to make it clear what they are about and
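
For readers unfamiliar with these helpers, a minimal userspace sketch of the idea follows. The four function names are the ones used later in this series (str_yes_no, str_on_off, str_enable_disable, str_enabled_disabled); the bodies are an assumption written for illustration and are not copied from the patch.

```c
#include <stdbool.h>

/*
 * Illustrative userspace sketch only: each helper maps a boolean to a
 * fixed pair of strings, so callers do not open-code `b ? "yes" : "no"`.
 */
static inline const char *str_yes_no(bool v)
{
	return v ? "yes" : "no";
}

static inline const char *str_on_off(bool v)
{
	return v ? "on" : "off";
}

static inline const char *str_enable_disable(bool v)
{
	return v ? "enable" : "disable";
}

static inline const char *str_enabled_disabled(bool v)
{
	return v ? "enabled" : "disabled";
}
```

A caller would then write something like printf("vsync: %s\n", str_on_off(vsync)) instead of repeating the ternary at every call site.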

[PATCH v2 00/11] lib/string_helpers: Add a few string helpers

2022-01-26 Thread Lucas De Marchi
Add some helpers under lib/string_helpers.h so they can be used throughout the kernel. When I started doing this there were 2 other previous attempts I know of, not counting the iterations each of them had: 1) https://lore.kernel.org/all/20191023131308.9420-1-jani.nik...@intel.com/ 2) https://lor

[PATCH v2 03/11] drm/i915: Use str_yes_no()

2022-01-26 Thread Lucas De Marchi
Remove the local yesno() implementation and adopt the str_yes_no() from linux/string_helpers.h. Signed-off-by: Lucas De Marchi Acked-by: Daniel Vetter Acked-by: Jani Nikula --- drivers/gpu/drm/i915/display/intel_display.c | 23 +++ .../drm/i915/display/intel_display_debugfs.c | 66 ++

[PATCH v2 02/11] drm/i915: Fix trailing semicolon

2022-01-26 Thread Lucas De Marchi
Remove the trailing semicolon, as correctly warned by checkpatch: -:1189: WARNING:TRAILING_SEMICOLON: macros should not use a trailing semicolon #1189: FILE: drivers/gpu/drm/i915/intel_device_info.c:119: +#define PRINT_FLAG(name) drm_printf(p, "%s: %s\n", #name, yesno(inf
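
As background on why checkpatch flags this, here is a small sketch of the usual failure mode. PRINT_FLAG_SKETCH and describe_flag are hypothetical names made up for this example (the real macro in intel_device_info.c is truncated above and is not reproduced here). If the macro body ended in a semicolon, the call site would expand to two statements and the `else` would no longer attach to the `if`.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for a flag-printing macro. Note: no trailing
 * semicolon in the definition; the caller supplies it.
 */
#define PRINT_FLAG_SKETCH(buf, len, name, val) \
	snprintf(buf, len, "%s: %s\n", name, (val) ? "yes" : "no")

/*
 * With a trailing semicolon inside the macro, the first branch would
 * expand to `snprintf(...);;` and the `else` below would fail to compile.
 */
static int describe_flag(char *buf, size_t len, int known, int value)
{
	if (known)
		PRINT_FLAG_SKETCH(buf, len, "flag", value);
	else
		snprintf(buf, len, "flag: n/a\n");
	return known;
}
```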

[PATCH v2 09/11] drm: Convert open-coded yes/no strings to yesno()

2022-01-26 Thread Lucas De Marchi
linux/string_helpers.h provides a helper to return "yes"/"no" strings. Replace the open coded versions with str_yes_no(). The places were identified with the following semantic patch: @@ expression b; @@ - b ? "yes" : "no" + str_yes_no(b) Then the includes

[PATCH v2 04/11] drm/i915: Use str_enable_disable()

2022-01-26 Thread Lucas De Marchi
Remove the local enabledisable() implementation and adopt the str_enable_disable() from linux/string_helpers.h. Signed-off-by: Lucas De Marchi Acked-by: Daniel Vetter Acked-by: Jani Nikula --- drivers/gpu/drm/i915/display/intel_ddi.c | 4 +++- drivers/gpu/drm/i915/display/intel_displ

[PATCH v2 06/11] drm/i915: Use str_on_off()

2022-01-26 Thread Lucas De Marchi
Remove the local onoff() implementation and adopt the str_on_off() from linux/string_helpers.h. Signed-off-by: Lucas De Marchi Acked-by: Daniel Vetter Acked-by: Jani Nikula --- drivers/gpu/drm/i915/display/g4x_dp.c | 6 -- drivers/gpu/drm/i915/display/intel_display.c | 7

[PATCH v2 08/11] drm/gem: Sort includes alphabetically

2022-01-26 Thread Lucas De Marchi
Sort includes alphabetically so it's easier to add/remove includes and know when that is needed. Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/drm_gem.c | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm

[PATCH v2 05/11] drm/i915: Use str_enabled_disabled()

2022-01-26 Thread Lucas De Marchi
Remove the local enableddisabled() implementation and adopt the str_enabled_disabled() from linux/string_helpers.h. Signed-off-by: Lucas De Marchi Acked-by: Daniel Vetter Acked-by: Jani Nikula --- drivers/gpu/drm/i915/display/intel_backlight.c | 3 ++- drivers/gpu/drm/i915/display/intel_dis

[PATCH v2 07/11] drm/amd/display: Use str_yes_no()

2022-01-26 Thread Lucas De Marchi
Remove the local yesno() implementation and adopt the str_yes_no() from linux/string_helpers.h. Signed-off-by: Lucas De Marchi --- .../drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 14 +- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amd

[PATCH v2 11/11] cxgb4: Use str_yes_no()

2022-01-26 Thread Lucas De Marchi
Remove the local yesno() implementation and adopt the str_yes_no() from linux/string_helpers.h. Signed-off-by: Lucas De Marchi --- .../ethernet/chelsio/cxgb4/cxgb4_debugfs.c| 249 ++ 1 file changed, 137 insertions(+), 112 deletions(-) diff --git a/drivers/net/ethernet/chelsi

[PATCH v2 10/11] tomoyo: Use str_yes_no()

2022-01-26 Thread Lucas De Marchi
Remove the local yesno() implementation and adopt the str_yes_no() from linux/string_helpers.h. Signed-off-by: Lucas De Marchi Reviewed-by: Sakari Ailus --- security/tomoyo/audit.c | 2 +- security/tomoyo/common.c | 19 +-- security/tomoyo/common.h | 1 - 3 files changed, 6 i

Re: [PATCH] drm/amdgpu: add safeguards for accessing mmhub CG registers

2022-01-26 Thread Christian König
On 26.01.22 at 08:53, Lang Yu wrote: We observed a GPU hang when querying mmhub CG status (i.e., cat amdgpu_pm_info) on cyan skillfish. Actually, cyan skillfish doesn't support any CG features. Only allow ASICs which support CG features to access the related registers. Will add similar safeguards f

Re: [Intel-gfx] [PATCH v2 09/11] drm: Convert open-coded yes/no strings to yesno()

2022-01-26 Thread Lucas De Marchi
On Wed, Jan 26, 2022 at 12:12:50PM +0200, Andy Shevchenko wrote: On Wed, Jan 26, 2022 at 11:39 AM Lucas De Marchi wrote: linux/string_helpers.h provides a helper to return "yes"/"no" strings. Replace the open coded versions with str_yes_no(). The places were oops, I replaced yesno() here but

[PATCH] drm/amdgpu: add umc_convert_error_address to simplify code

2022-01-26 Thread Tao Zhou
Make code reusable and more simple. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/umc_v6_7.c | 94 +-- drivers/gpu/drm/amd/amdgpu/umc_v8_7.c | 82 +-- 2 files changed, 61 insertions(+), 115 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu

Re: [PATCH] drm/amdgpu: add safeguards for accessing mmhub CG registers

2022-01-26 Thread Christian König
On 26.01.22 at 12:02, Lang Yu wrote: On 01/26/ , Christian König wrote: On 26.01.22 at 08:53, Lang Yu wrote: We observed a GPU hang when querying mmhub CG status (i.e., cat amdgpu_pm_info) on cyan skillfish. Actually, cyan skillfish doesn't support any CG features. Only allow ASICs which sup

[PATCH 1/2] drm/amdgpu: cleanup amdgpu_xgmi_sysfs_add_dev_info

2022-01-26 Thread Christian König
Don't initialize variables if it isn't absolutely necessary. Use amdgpu_xgmi_sysfs_rem_dev_info to cleanup when something goes wrong. Drop the explicit warnings since the sysfs core warns about things like duplicate files itself. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/am

[PATCH 2/2] drm/amdgpu: add sysfs files for XGMI segment size and physical node id

2022-01-26 Thread Christian König
umr needs that to correctly calculate the VRAM base address inside the MC address space. Only compile tested! Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 34 1 file changed, 34 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdg

Re: [PATCH] drm/amdgpu: add safeguards for accessing mmhub CG registers

2022-01-26 Thread Lazar, Lijo
[Public] Hi Lang, There are ASICs in which the driver doesn't enable CG, and then these flags will be false. However, the CG will be enabled by another component like VBIOS. The driver still reports the CG status even though the driver doesn't enable it. For those, this logic doesn't work. BTW, could you

Re: [RFC v3 01/12] drm/amdgpu: Introduce reset domain

2022-01-26 Thread Christian König
On 25.01.22 at 23:37, Andrey Grodzovsky wrote: Defined a reset_domain struct such that all the entities that go through reset together will be serialized one against another. Do it for both single device and XGMI hive cases. Signed-off-by: Andrey Grodzovsky Suggested-by: Daniel Vetter Suggest

Re: [RFC v3 08/12] drm/amdgpu: Rework reset domain to be refcounted.

2022-01-26 Thread Christian König
On 25.01.22 at 23:37, Andrey Grodzovsky wrote: The reset domain now contains the register access semaphore and so needs to be present as long as each device in a hive needs it, so it cannot be bound to the XGMI hive life cycle. Address this by making the reset domain refcounted and pointed to by each memb

RE: [PATCH] drm/amdgpu: add safeguards for accessing mmhub CG registers

2022-01-26 Thread Yu, Lang
[Public] Hi Lijo, For cyan skillfish, both adev->cg_flags and adev->pg_flags are zero. I just found "RREG32_SOC15(MMHUB, 0, mmMM_ATC_L2_MISC_CG);" in mmhub_v2_0_get_clockgating() caused a gpu hang(cat amdgpu_pm_info). I didn't check if it's some sort of PG which causes the issue. Regards, Lan

Re: [PATCH 1/2] drm/amdgpu: cleanup amdgpu_xgmi_sysfs_add_dev_info

2022-01-26 Thread Luben Tuikov
Yeah, that's cleaner. Reviewed-by: Luben Tuikov Regards, Luben On 2022-01-26 06:59, Christian König wrote: > Don't initialize variables if it isn't absolutely necessary. > > Use amdgpu_xgmi_sysfs_rem_dev_info to cleanup when something goes wrong. > > Drop the explicit warnings since the sysfs c

Re: [PATCH 2/2] drm/amdgpu: add sysfs files for XGMI segment size and physical node id

2022-01-26 Thread Luben Tuikov
This seems reasonable. Hope it works out for umr. Reviewed-by: Luben Tuikov Regards, Luben On 2022-01-26 06:59, Christian König wrote: > umr needs that to correctly calculate the VRAM base address > inside the MC address space. > > Only compile tested! > > Signed-off-by: Christian König > ---

Re: [PATCH 2/2] drm/amdgpu: add sysfs files for XGMI segment size and physical node id

2022-01-26 Thread StDenis, Tom
[AMD Official Use Only] Sadly I don't control any XGMI hosts to try it out. So if they pick it up in their builds I can but otherwise we'll have to wait. Tom From: Tuikov, Luben Sent: Wednesday, January 26, 2022 07:55 To: Christian König; StDenis, Tom

Re: [Intel-gfx] [PATCH v2 09/11] drm: Convert open-coded yes/no strings to yesno()

2022-01-26 Thread Andy Shevchenko
On Wed, Jan 26, 2022 at 02:43:45AM -0800, Lucas De Marchi wrote: > On Wed, Jan 26, 2022 at 12:12:50PM +0200, Andy Shevchenko wrote: > > On Wed, Jan 26, 2022 at 11:39 AM Lucas De Marchi > > wrote: ... > > > 411986 104906176 428652 68a6c drm.ko.old > > > 411986 104906176 428652

Re: [PATCH v2 00/11] lib/string_helpers: Add a few string helpers

2022-01-26 Thread Andy Shevchenko
On Wed, Jan 26, 2022 at 11:39 AM Lucas De Marchi wrote: > > Add some helpers under lib/string_helpers.h so they can be used > throughout the kernel. When I started doing this there were 2 other > previous attempts I know of, not counting the iterations each of them > had: > > 1) https://lore.kerne

Re: [PATCH v2 09/11] drm: Convert open-coded yes/no strings to yesno()

2022-01-26 Thread Andy Shevchenko
On Wed, Jan 26, 2022 at 11:39 AM Lucas De Marchi wrote: > > linux/string_helpers.h provides a helper to return "yes"/"no" strings. > Replace the open coded versions with str_yes_no(). The places were > identified with the following semantic patch: > > @@ > expression b; > @

[PATCH] drm/amdgpu: Wrong order for config and counter_id parameters

2022-01-26 Thread jinsdb
From: huangqu Wrong order for config and counter_id parameters was passed, when calling df_v3_6_pmc_set_deferred and df_v3_6_pmc_is_deferred functions. Signed-off-by: huangqu --- drivers/gpu/drm/amd/amdgpu/df_v3_6.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/dri

[PATCH RESEND] drm/amd/display: Force link_rate as LINK_RATE_RBR2 for 2018 15" Apple Retina panels

2022-01-26 Thread Aditya Garg
From: Aun-Ali Zaidi The eDP link rate reported by the DP_MAX_LINK_RATE dpcd register (0xa) is contradictory to the highest rate supported reported by EDID (0xc = LINK_RATE_RBR2). The effects of this compounded with commit '4a8ca46bae8a ("drm/amd/display: Default max bpc to 16 for eDP")' results

Re: [PATCH v2 07/11] drm/amd/display: Use str_yes_no()

2022-01-26 Thread Harry Wentland
On 2022-01-26 04:39, Lucas De Marchi wrote: > Remove the local yesno() implementation and adopt the str_yes_no() from > linux/string_helpers.h. > > Signed-off-by: Lucas De Marchi Reviewed-by: Harry Wentland Harry > --- > .../drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 14 +-

Re: [PATCH v2 02/11] drm/i915: Fix trailing semicolon

2022-01-26 Thread Jani Nikula
On Wed, 26 Jan 2022, Lucas De Marchi wrote: > Remove the trailing semicolon, as correctly warned by checkpatch: > > -:1189: WARNING:TRAILING_SEMICOLON: macros should not use a trailing > semicolon > #1189: FILE: drivers/gpu/drm/i915/intel_device_info.c:119: > +#define PRINT_FLAG

Re: [PATCH v2 08/11] drm/gem: Sort includes alphabetically

2022-01-26 Thread Jani Nikula
On Wed, 26 Jan 2022, Lucas De Marchi wrote: > Sort includes alphabetically so it's easier to add/remove includes and > know when that is needed. > > Signed-off-by: Lucas De Marchi Reviewed-by: Jani Nikula > --- > drivers/gpu/drm/drm_gem.c | 20 ++-- > 1 file changed, 10 insert

RE: [PATCH v5 4/4] drm/amd: don't reset dGPUs that don't go through system S3

2022-01-26 Thread Limonciello, Mario
[AMD Official Use Only] > -Original Message- > From: Limonciello, Mario > Sent: Tuesday, January 25, 2022 22:10 > To: amd-gfx@lists.freedesktop.org > Cc: Liang, Prike ; Limonciello, Mario > > Subject: [PATCH v5 4/4] drm/amd: don't reset dGPUs that don't go through > system S3 > > dGPUs

Re: [PATCH v2 09/11] drm: Convert open-coded yes/no strings to yesno()

2022-01-26 Thread Jani Nikula
On Wed, 26 Jan 2022, Lucas De Marchi wrote: > linux/string_helpers.h provides a helper to return "yes"/"no" strings. > Replace the open coded versions with str_yes_no(). The places were > identified with the following semantic patch: > > @@ > expression b; > @@ > > - b ? "y

Re: [PATCH v5 2/4] drm/amd: add support to check whether the system is set to s3

2022-01-26 Thread Lazar, Lijo
[Public] Returns true for dGPU always. Better to keep the whole check under something like this. if (pm_suspend_target_state != PM_SUSPEND_ON) Thanks, Lijo From: amd-gfx on behalf of Mario Limonciello Sent: Wednesday, January 26, 2022 9:39:42 AM To: amd-gfx@l

RE: [PATCH v5 2/4] drm/amd: add support to check whether the system is set to s3

2022-01-26 Thread Limonciello, Mario
[Public] That was intentional - shouldn't dGPU always be going through S3 path currently? From: Lazar, Lijo Sent: Wednesday, January 26, 2022 09:06 To: Limonciello, Mario ; amd-gfx@lists.freedesktop.org Cc: Liang, Prike ; Limonciello, Mario Subject: Re: [PATCH v5 2/4] drm/amd: add support to

Re: [PATCH] drm/amdgpu: Fix an error message in rmmod

2022-01-26 Thread Felix Kuehling
My question is, why is this problem only seen during module unload? Why aren't we seeing HWS hangs due to GFX_OFF all the time in normal operations? For example when the GPU is idle and a new KFD process is started, creating a new runlist. Are we just getting lucky because the process first has

Re: [PATCH v5 2/4] drm/amd: add support to check whether the system is set to s3

2022-01-26 Thread Lazar, Lijo
Talking from generic API perspective - S3 is considered active for dGPU only if it's going to non-S0 state. If called from anywhere else than suspend op, this should return false. Thanks, Lijo From: Limonciello, Mario Sent: Wednesday, January 26, 2022 8:37:28 PM

RE: [PATCH v5 2/4] drm/amd: add support to check whether the system is set to s3

2022-01-26 Thread Limonciello, Mario
[Public] Right -from an API perspective both amdgpu_acpi_is_s0ix_active and amdgpu_acpi_is_s3_active are only in suspend ops. But so coming back to the 4th patch (and the associated bug), what is supposed to happen with a dGPU on an Intel system that does s2i? For AMD APU w/ dGPU in the system

Re: [PATCH] display/amd: decrease message verbosity about watermarks table failure

2022-01-26 Thread Harry Wentland
On 2022-01-25 18:35, Mario Limonciello wrote: > A number of BIOS versions have a problem with the watermarks table not > being configured properly. This manifests as a very scary looking warning > during resume from s0i3. This should be harmless in most cases and is well > understood, so decre

Re: [PATCH v5 2/4] drm/amd: add support to check whether the system is set to s3

2022-01-26 Thread Lazar, Lijo
I remember Alex adding a patch for smart suspend such that it skips the suspend call if runtime pm suspended. In summary, the resume doesn't work with/without reset? Thanks, Lijo From: Limonciello, Mario Sent: Wednesday, January 26, 2022 8:47:05 PM To: Lazar, Li

Re: [PATCH v5 2/4] drm/amd: add support to check whether the system is set to s3

2022-01-26 Thread Alex Deucher
I don't think smart suspend works as expected. I asked Raphael about it several times, but he never got around to following up with me. I think that is probably the preferred way to go, but the tricky part is that the dGPUs have integrated bridges and audio and usb and all of that probably needs

Re: [RFC v3 01/12] drm/amdgpu: Introduce reset domain

2022-01-26 Thread Andrey Grodzovsky
On 2022-01-26 07:07, Christian König wrote: On 25.01.22 at 23:37, Andrey Grodzovsky wrote: Defined a reset_domain struct such that all the entities that go through reset together will be serialized one against another. Do it for both single device and XGMI hive cases. Signed-off-by: Andrey G

Re: [RFC v2 4/8] drm/amdgpu: Serialize non TDR gpu recovery with TDRs

2022-01-26 Thread Andrey Grodzovsky
JingWen - could you maybe give those patches a try on SRIOV XGMI system ? If you see issues maybe you could let me connect and debug. My SRIOV XGMI system which Shayun kindly arranged for me is not loading the driver with my drm-misc-next branch even without my patches. Andrey On 2022-01-17 1

RE: [PATCH v5 2/4] drm/amd: add support to check whether the system is set to s3

2022-01-26 Thread Limonciello, Mario
[AMD Official Use Only] The key here is that smart suspend seems to have a dependency on pm_suspend_via_firmware(). So if you have an APU doing S2I or Intel SOC doing S2I it will always return 0. Can we drop that dependency of pm_suspend_via_firmware for it perhaps? > I don't think smart suspen

Re: drm/amd/amdgpu: Add ip_discovery_text sysfs entry (v2)

2022-01-26 Thread Tom St Denis
Thanks, if we don't end up dropping this patchset I'll incorporate your suggestions into a v3. Tom On Wed, Jan 26, 2022 at 12:36 AM Limonciello, Mario < mario.limoncie...@amd.com> wrote: > A few suggestion ideas inline. > > On 1/25/2022 12:18, Tom St Denis wrote: > > Newer hardware has a discove

Re: [PATCH] drm/amdgpu: add safeguards for accessing mmhub CG registers

2022-01-26 Thread Deucher, Alexander
[Public] Should we set *flags = 0 before we return? Alex From: Yu, Lang Sent: Wednesday, January 26, 2022 2:53 AM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Lazar, Lijo ; Huang, Ray ; Yu, Lang Subject: [PATCH] drm/amdgpu: add safeguards for ac

Re: [PATCH v9 2/6] drm: improve drm_buddy_alloc function

2022-01-26 Thread Arunpravin
On 21/01/22 5:30 pm, Matthew Auld wrote: > On 19/01/2022 11:37, Arunpravin wrote: >> - Make drm_buddy_alloc a single function to handle >>range allocation and non-range allocation demands >> >> - Implemented a new function alloc_range() which allocates >>the requested power-of-two block

Re: [PATCH v9 4/6] drm: implement a method to free unused pages

2022-01-26 Thread Arunpravin
> -Original Message- > From: amd-gfx On Behalf Of Matthew > Auld > Sent: Thursday, January 20, 2022 11:05 PM > To: Paneer Selvam, Arunpravin ; > dri-de...@lists.freedesktop.org; intel-...@lists.freedesktop.org; > amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander ; tzimmerm...@s

[PATCH v10 1/5] drm: improve drm_buddy_alloc function

2022-01-26 Thread Arunpravin
- Make drm_buddy_alloc a single function to handle range allocation and non-range allocation demands - Implemented a new function alloc_range() which allocates the requested power-of-two block in compliance with range limitations - Moved order computation and memory alignment logic from i915 drive

[PATCH v10 2/5] drm: implement top-down allocation method

2022-01-26 Thread Arunpravin
Implemented a function which walks through the order list, compares the offsets and returns the block with the maximum offset. This method is unpredictable in obtaining the high range address blocks, which depends on allocation and deallocation. For instance, if the driver requests an address in a specific low range,

[PATCH v10 3/5] drm: implement a method to free unused pages

2022-01-26 Thread Arunpravin
On contiguous allocation, we round up the size to the *next* power of 2. Implement a function to free the unused pages after the newly allocated block. v2 (Matthew Auld): - replace function name 'drm_buddy_free_unused_pages' with drm_buddy_block_trim - replace input argument name 'actual_siz
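
The arithmetic behind this can be sketched in a few lines of plain C. This is only an illustration of the rounding and of how much space a trim step could hand back. next_pow2 and trimmable_bytes are hypothetical helper names, not functions from the patch, and the sketch assumes 0 < size <= 2^63 so the shift cannot overflow.

```c
#include <stdint.h>

/*
 * Round size up to the nearest power of two that is >= size.
 * Assumes 0 < size <= 2^63; a size that is already a power of two
 * is returned unchanged.
 */
static uint64_t next_pow2(uint64_t size)
{
	uint64_t p = 1;

	while (p < size)
		p <<= 1;
	return p;
}

/*
 * Bytes allocated beyond the request after rounding up, i.e. the
 * unused tail that a trim operation could return to the allocator.
 */
static uint64_t trimmable_bytes(uint64_t size)
{
	return next_pow2(size) - size;
}
```

For example, a contiguous request for five 4 KiB pages is rounded up to eight pages, leaving three pages (12288 bytes) that could be freed again.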

[PATCH v10 4/5] drm/amdgpu: move vram inline functions into a header

2022-01-26 Thread Arunpravin
Move shared vram inline functions and structs into a header file Signed-off-by: Arunpravin --- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h | 51 1 file changed, 51 insertions(+) create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h diff --git a/drivers/gpu/drm/a

[PATCH v10 5/5] drm/amdgpu: add drm buddy support to amdgpu

2022-01-26 Thread Arunpravin
- Remove drm_mm references and replace with drm buddy functionalities - Add res cursor support for drm buddy v2(Matthew Auld): - replace spinlock with mutex as we call kmem_cache_zalloc (..., GFP_KERNEL) in drm_buddy_alloc() function - lock drm_buddy_block_trim() function as it calls

[PATCH v6 1/4] drm/amd: avoid suspend on dGPUs w/ s2idle support when runtime PM enabled

2022-01-26 Thread Mario Limonciello
dGPUs connected to Intel systems configured for suspend to idle will not have the power rails cut at suspend and resetting the GPU may lead to problematic behaviors. Fixes: e25443d2765f4 ("drm/amdgpu: add a dev_pm_ops prepare callback (v2)") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/18

[PATCH v6 3/4] drm/amd: add support to check whether the system is set to s3

2022-01-26 Thread Mario Limonciello
This will be used to help make decisions on what to do in misconfigured systems. Signed-off-by: Mario Limonciello --- v5->v6: * Move in CONFIG_SUSPEND block drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 13 + 2 files changed, 15 inserti

[PATCH v6 4/4] drm/amd: Only run s3 or s0ix if system is configured properly

2022-01-26 Thread Mario Limonciello
This will cause misconfigured systems to not run the GPU suspend routines. * In APUs that are properly configured system will go into s2idle. * In APUs that are intended to be S3 but user selects s2idle the GPU will stay fully powered for the suspend. * In APUs that are intended to be s2idle and

[PATCH v6 2/4] drm/amd: Warn users about potential s0ix problems

2022-01-26 Thread Mario Limonciello
On some OEM setups users can configure the BIOS for S3 or S2idle. When configured to S3 users can still choose 's2idle' in the kernel by using `/sys/power/mem_sleep`. Before commit 6dc8265f9803 ("drm/amdgpu: always reset the asic in suspend (v2)"), the GPU would crash. Now when configured this wa

[PATCH 2/2] drm/amdgpu: add 1.3.1/2.4.0 athub CG support

2022-01-26 Thread Aaron Liu
This patch adds 1.3.1/2.4.0 athub clock gating support. Signed-off-by: Aaron Liu --- drivers/gpu/drm/amd/amdgpu/athub_v2_0.c | 1 + drivers/gpu/drm/amd/amdgpu/athub_v2_1.c | 1 + 2 files changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/athub_v2_0.c b/drivers/gpu/drm/amd/amdgpu/

[PATCH 1/2] drm/amdgpu: convert code name to ip version for athub

2022-01-26 Thread Aaron Liu
Use IP version rather than codename for athub. Signed-off-by: Aaron Liu --- drivers/gpu/drm/amd/amdgpu/athub_v1_0.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/athub_v1_0.c b/drivers/gpu/drm/amd/amdgpu/athub_v1_0.c index 3ea557864

Re: [PATCH 2/2] drm/amdgpu: add 1.3.1/2.4.0 athub CG support

2022-01-26 Thread Huang Rui
On Thu, Jan 27, 2022 at 09:48:06AM +0800, Liu, Aaron wrote: > This patch adds 1.3.1/2.4.0 athub clock gating support. > > Signed-off-by: Aaron Liu Series are Reviewed-by: Huang Rui > --- > drivers/gpu/drm/amd/amdgpu/athub_v2_0.c | 1 + > drivers/gpu/drm/amd/amdgpu/athub_v2_1.c | 1 + > 2 file

RE: [PATCH v3 3/3] amdgpu/pm: Linked emit_clock_levels to use cases amdgpu_get_pp_{dpm_clock,od_clk_voltage}

2022-01-26 Thread Quan, Evan
[AMD Official Use Only] Series is reviewed-by: Evan Quan > -Original Message- > From: Powell, Darren > Sent: Wednesday, January 26, 2022 12:55 PM > To: amd-gfx@lists.freedesktop.org > Cc: Powell, Darren > Subject: [PATCH v3 3/3] amdgpu/pm: Linked emit_clock_levels to use cases > amdgpu

Re: [PATCH 1/2] drm/amdgpu: convert code name to ip version for athub

2022-01-26 Thread Alex Deucher
Reviewed-by: Alex Deucher On Wed, Jan 26, 2022 at 8:49 PM Aaron Liu wrote: > > Use IP version rather than codename for athub. > > Signed-off-by: Aaron Liu > --- > drivers/gpu/drm/amd/amdgpu/athub_v1_0.c | 13 +++-- > 1 file changed, 7 insertions(+), 6 deletions(-) > > diff --git a/driv

Re: [PATCH 2/2] drm/amdgpu: add 1.3.1/2.4.0 athub CG support

2022-01-26 Thread Alex Deucher
Reviewed-by: Alex Deucher On Wed, Jan 26, 2022 at 9:07 PM Huang Rui wrote: > > On Thu, Jan 27, 2022 at 09:48:06AM +0800, Liu, Aaron wrote: > > This patch adds 1.3.1/2.4.0 athub clock gating support. > > > > Signed-off-by: Aaron Liu > > Series are Reviewed-by: Huang Rui > > > --- > > drivers/g

[PATCH v4 01/10] mm: add zone device coherent type memory support

2022-01-26 Thread Alex Sierra
Device memory that is cache coherent from device and CPU point of view. This is used on platforms that have an advanced system bus (like CAPI or CXL). Any page of a process can be migrated to such memory. However, no one should be allowed to pin such memory so that it can always be evicted. Signed

[PATCH v4 00/10] Add MEMORY_DEVICE_COHERENT for coherent device memory mapping

2022-01-26 Thread Alex Sierra
This patch series introduces MEMORY_DEVICE_COHERENT, a type of memory owned by a device that can be mapped into CPU page tables like MEMORY_DEVICE_GENERIC and can also be migrated like MEMORY_DEVICE_PRIVATE. Christoph, the suggestion to incorporate Ralph Campbell’s refcount cleanup patch into our

[PATCH v4 02/10] mm: add device coherent vma selection for memory migration

2022-01-26 Thread Alex Sierra
This case is used to migrate pages from device memory, back to system memory. Device coherent type memory is cache coherent from device and CPU point of view. Signed-off-by: Alex Sierra --- v2: condition added when migrations from device coherent pages. --- include/linux/migrate.h | 1 + mm/migr

[PATCH v4 06/10] lib: test_hmm add ioctl to get zone device type

2022-01-26 Thread Alex Sierra
new ioctl cmd added to query zone device type. This will be used once the test_hmm adds zone device coherent type. Signed-off-by: Alex Sierra --- lib/test_hmm.c | 23 +-- lib/test_hmm_uapi.h | 8 2 files changed, 29 insertions(+), 2 deletions(-) diff --git a/l

[PATCH v4 05/10] drm/amdkfd: coherent type as sys mem on migration to ram

2022-01-26 Thread Alex Sierra
On VRAM to RAM migration, coherent device type memory has similar access as system RAM from the CPU. This flag sets the source from the sender, which in the coherent type case should be set as MIGRATE_VMA_SELECT_DEVICE_COHERENT. Signed-off-by: Alex Sierra Reviewed-by: Felix Kuehling --- drivers/gp

[PATCH v4 03/10] mm/gup: fail get_user_pages for LONGTERM dev coherent type

2022-01-26 Thread Alex Sierra
Avoid long term pinning for Coherent device type pages. This could interfere with their own device memory manager. For now, we are just returning an error for PIN_LONGTERM Coherent device type pages. Eventually, these types of pages will get migrated to system memory, once the device migration pages su

[PATCH v4 07/10] lib: test_hmm add module param for zone device type

2022-01-26 Thread Alex Sierra
In order to configure device coherent in test_hmm, two module parameters should be passed, which correspond to the SP start address of each device (2) spm_addr_dev0 & spm_addr_dev1. If no parameters are passed, private device type is configured. Signed-off-by: Alex Sierra --- lib/test_hmm.c

[PATCH v4 09/10] tools: update hmm-test to support device coherent type

2022-01-26 Thread Alex Sierra
Test cases such as migrate_fault and migrate_multiple were modified to explicitly migrate from device to sys memory without the need of page faults when using device coherent type. Snapshot test case updated to read memory device type first and based on that, get the proper returned results migrat

[PATCH v4 08/10] lib: add support for device coherent type in test_hmm

2022-01-26 Thread Alex Sierra
Device Coherent type uses device memory that is coherently accessible by the CPU. This could be shown as SP (special purpose) memory range at the BIOS-e820 memory enumeration. If no SP memory is supported in the system, this could be faked by setting CONFIG_EFI_FAKE_MEMMAP. Currently, test_hmm only sup

[PATCH v4 10/10] tools: update test_hmm script to support SP config

2022-01-26 Thread Alex Sierra
Add two more parameters to set spm_addr_dev0 & spm_addr_dev1 addresses. These two parameters configure the start SP addresses for each device in test_hmm driver. Consequently, this configures zone device type as coherent. Signed-off-by: Alex Sierra Reviewed-by: Alistair Popple --- v2: Add more m

[PATCH v4 04/10] drm/amdkfd: add SPM support for SVM

2022-01-26 Thread Alex Sierra
When the CPU is connected through XGMI, it has coherent access to the VRAM resource. In this case that resource is taken from a table in the device gmc aperture base. This resource is used along with the device type, which could be DEVICE_PRIVATE or DEVICE_COHERENT, to create the device page map region. Sig

Re: [PATCH v3 09/10] tools: update hmm-test to support device coherent type

2022-01-26 Thread Sierra Guiza, Alejandro (Alex)
On 1/20/2022 12:14 AM, Alistair Popple wrote: On Tuesday, 11 January 2022 9:32:00 AM AEDT Alex Sierra wrote: Test cases such as migrate_fault and migrate_multiple were modified to explicitly migrate from device to sys memory without the need of page faults when using device coherent type. Sna

[RFC PATCH v5 0/3] Add support modifiers for drivers whose planes only support linear layout

2022-01-26 Thread Tomohito Esaki
Some drivers whose planes only support linear layout fb do not support format modifiers. These drivers should support modifiers, however the DRM core should handle this rather than open-coding in every driver. In this patch series, these drivers expose format modifiers based on the following sugge

[RFC PATCH v5 1/3] drm: introduce fb_modifiers_not_supported flag in mode_config

2022-01-26 Thread Tomohito Esaki
If only the linear modifier is advertised, the DRM core should handle this rather than open-coding it in every driver, since there are many drivers that support only linear. However, there are legacy drivers such as radeon that do not support modifiers but infer the actual layout of the underlying buffe

[RFC PATCH v5 2/3] drm: add support modifiers for drivers whose planes only support linear layout

2022-01-26 Thread Tomohito Esaki
The LINEAR modifier is advertised as default if a driver doesn't specify modifiers. Signed-off-by: Tomohito Esaki --- drivers/gpu/drm/drm_plane.c | 23 +-- include/drm/drm_plane.h | 3 +++ 2 files changed, 16 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/d

[RFC PATCH v5 3/3] drm: remove allow_fb_modifiers

2022-01-26 Thread Tomohito Esaki
The allow_fb_modifiers flag is unnecessary since it has been replaced with fb_modifiers_not_supported flag. Signed-off-by: Tomohito Esaki --- drivers/gpu/drm/selftests/test-drm_framebuffer.c | 1 - include/drm/drm_mode_config.h| 16 2 files changed, 17 delet

[PATCH] drm/amd/display: Fix unused variable warning

2022-01-26 Thread Tim Huang
[Why] The build fails with an unused variable 'dc' when '-Werror=unused-variable' is enabled and CONFIG_DRM_AMD_DC_DCN is not defined. Signed-off-by: Tim Huang --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/d

Re: [PATCH] drm/amdgpu: Fix an error message in rmmod

2022-01-26 Thread Yin, Tianci (Rico)
[AMD Official Use Only] The rmmod op has as prerequisites the multi-user target and a blacklisted amdgpu, which is an IGT requirement so that IGT can make itself DRM master to test KMS. igt-gpu-tools/build/tests/amdgpu/amd_module_load --run-subtest reload From my understanding, the KFD process belongs to the

Re: [PATCH] drm/amd/display: Fix unused variable warning

2022-01-26 Thread Alex Deucher
Reviewed-by: Alex Deucher On Wed, Jan 26, 2022 at 10:34 PM Tim Huang wrote: > > [Why] > It will build failed with unused variable 'dc' with > '-Werror=unused-variable'enabled when CONFIG_DRM_AMD_DC_DCN > is not defined. > > Signed-off-by: Tim Huang > --- > drivers/gpu/drm/amd/display/amdgpu_dm

RE: [PATCH] drm/amd/display: Fix unused variable warning

2022-01-26 Thread Liu, Aaron
[AMD Official Use Only] Reviewed-by: Aaron Liu -- Best Regards Aaron Liu > -Original Message- > From: Huang, Tim > Sent: Thursday, January 27, 2022 11:34 AM > To: amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander ; Huang, Ray > ; Liu, Aaron ; Huang, Tim > > Subject: [PATCH] drm/a

Re: [PATCH] drm/amd/display: Fix unused variable warning

2022-01-26 Thread Huang Rui
On Thu, Jan 27, 2022 at 11:33:50AM +0800, Huang, Tim wrote: > [Why] > It will build failed with unused variable 'dc' with > '-Werror=unused-variable'enabled when CONFIG_DRM_AMD_DC_DCN > is not defined. > > Signed-off-by: Tim Huang Reviewed-by: Huang Rui > --- > drivers/gpu/drm/amd/display/amd

[pull] amdgpu drm-fixes-5.17

2022-01-26 Thread Alex Deucher
Hi Dave, Daniel, Fixes for 5.17. The following changes since commit e783362eb54cd99b2cac8b3a9aeac942e6f6ac07: Linux 5.17-rc1 (2022-01-23 10:12:53 +0200) are available in the Git repository at: https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-fixes-5.17-2022-01-26 for you to fe

[PATCH] drm/amdgpu: move GTT allocation from gmc_sw_init to gmc_hw_init

2022-01-26 Thread Aaron Liu
The below patch causes a system hang for harvested ASICs. d015e9861e55 drm/amdgpu: improve debug VRAM access performance using sdma The root cause is that the GTT buffer should be allocated after GC SA harvest programming has completed. For harvested ASICs, the GC SA harvest process (see utcl2_harvest) is pr

RE: [PATCH] drm/amdgpu: move GTT allocation from gmc_sw_init to gmc_hw_init

2022-01-26 Thread Chen, Guchun
[Public] This will create sdma_access_bo only for ASIC with gmc v10? Original creation occurs in amdgpu_ttm_init, it's not limited to ASICs with gmc v10. Regards, Guchun -Original Message- From: amd-gfx On Behalf Of Aaron Liu Sent: Thursday, January 27, 2022 3:04 PM To: amd-gfx@lists.f

RE: [PATCH] drm/amdgpu: move GTT allocation from gmc_sw_init to gmc_hw_init

2022-01-26 Thread Liu, Aaron
[Public] Guchun, thanks for the reminder. I need to modify it again. -- Best Regards Aaron Liu > -Original Message- > From: Chen, Guchun > Sent: Thursday, January 27, 2022 3:10 PM > To: Liu, Aaron ; amd-gfx@lists.freedesktop.org > Cc: Ji, Ruili ; Kim, Jonathan ; > Kuehling, Felix ; Liu, Aaron

[PATCH V2] drm/amdgpu: move GTT allocation from gmc_sw_init to gmc_hw_init(V2)

2022-01-26 Thread Aaron Liu
The below patch causes a system hang for harvested ASICs. d015e9861e55 drm/amdgpu: improve debug VRAM access performance using sdma The root cause is that the GTT buffer should be allocated after GC SA harvest programming has completed. For harvested ASICs, the GC SA harvest process (see utcl2_harvest) is pr