Hi Oscar,
Alex already answered on your bug report that RV515 is pretty old hardware.
I have strong doubts that anybody has either the time or the hardware to look
into this.
The only thing you can do is to try to narrow down the bug and then we could
look at what exactly changed.
Regards,
Christia
Reviewed-and-tested-by: Evan Quan
> -----Original Message-----
> From: amd-gfx On Behalf Of Cui,
> Flora
> Sent: Wednesday, March 06, 2019 2:37 PM
> To: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org
> Cc: Cui, Flora
> Subject: [PATCH v2] tests/amdgpu: add deadlock test for sdma
The deadlock test for sdma will cause GPU recovery.
Disable the test for now until GPU reset recovery can survive at least
1000 test runs.
v2: add modprobe parameter
Change-Id: I9adac63c62db22107345eddb30e7d81a1bda838c
Signed-off-by: Flora Cui
---
tests/amdgpu/amdgpu_test.c| 4 ++
tests/a
The deadlock test for sdma will cause GPU recovery.
Disable the test for now until GPU reset recovery can survive at least
1000 test runs.
Change-Id: I9adac63c62db22107345eddb30e7d81a1bda838c
Signed-off-by: Flora Cui
---
tests/amdgpu/amdgpu_test.c| 4 ++
tests/amdgpu/deadlock_tests.c | 103
On 2019-03-05 6:20 a.m., Michel Dänzer wrote:
> From: Michel Dänzer
>
> The compiler pointed out that one if block unintentionally wasn't part
> of the loop:
>
> In file included from ./include/linux/kernfs.h:14,
> from ./include/linux/sysfs.h:16,
> from ./inclu
Hi
I filed a bug at https://bugzilla.kernel.org/show_bug.cgi?id=202599 but
nobody has answered. I think all the information you need is there.
You are listed in the MAINTAINERS file for RADEON and AMDGPU DRM DRIVERS.
I guess that all users with my video card will have the same issue, so it is
important to solv
From: Eric Huang
It is to collaborate with HSA_CAPABILITY in libhsakmt.
v2: squash in NULL pointer check
Signed-off-by: Eric Huang
Reviewed-by: Felix Kuehling
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 16
drivers/
From: Eric Huang
The RAS ECC event is combined with the GPU reset event, because
ECC interrupts are caused by uncorrectable errors that trigger a
GPU reset.
v2: Fix misleading-indentation warning
Signed-off-by: Eric Huang
Reviewed-by: Felix Kuehling
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deuche
From: xinhui pan
Register IH and enable RAS features on sdma.
Create sysfs and debugfs files for sdma.
Signed-off-by: xinhui pan
Signed-off-by: Feifei Xu
Signed-off-by: Eric Huang
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h | 4 +
drivers/gp
From: xinhui pan
gpu reset is not stable on vega20 A1.
Signed-off-by: xinhui pan
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
b/drivers/gpu/drm
From: xinhui pan
Add a query for userspace to check which RAS features
are enabled.
v2: squash in warning fix
Signed-off-by: xinhui pan
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 10
include/uapi/drm/amdgpu_drm.h |
From: xinhui pan
Mark vram pages with errors as bad and prevent the driver
from using them.
Signed-off-by: xinhui pan
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/
From: xinhui pan
add obj management.
add feature control.
add debugfs infrastructure.
add sysfs infrastructure.
add IH infrastructure.
add recovery infrastructure.
It is a framework. Other IPs need to call amdgpu_ras_xxx functions instead of
psp_ras_xxx functions.
v2: squash in warning fixes
Signe
From: xinhui pan
Currently, the debugfs control node can't parse bash-like commands.
Now add such support for any tester that uses scripts.
v2: squash in fixes for input validation
Signed-off-by: xinhui pan
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu
From: xinhui pan
Allow userspace to enable/disable RAS.
Signed-off-by: xinhui pan
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 122 ++--
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 9 ++
2 files changed, 121 insertions(+)
From: xinhui pan
Signed-off-by: xinhui pan
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ps
From: xinhui pan
Allow RAS feature enable/disable via boot parameter.
Signed-off-by: xinhui pan
Reviewed-by: Hawking Zhang
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 17
From: xinhui pan
Add AMDGPU_CTX_QUERY2_FLAGS_RAS_CE/UE, which indicate whether any error
happened between the previous query and this one.
Signed-off-by: xinhui pan
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 17 +
drivers/gpu/d
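The "any error since the previous query" semantics described above can be sketched in plain C as follows. This is only an illustration of the pattern, not the code from the patch: the struct, the function name, and the flag values (SKETCH_FLAG_RAS_CE/UE) are hypothetical, and the error counters are assumed to be supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch; not the values from amdgpu_drm.h. */
#define SKETCH_FLAG_RAS_CE (1u << 0) /* correctable errors seen */
#define SKETCH_FLAG_RAS_UE (1u << 1) /* uncorrectable errors seen */

/* Per-context snapshot of the error counters taken at the last query. */
struct sketch_ctx {
	uint64_t last_ce;
	uint64_t last_ue;
};

/*
 * Return flags for errors that occurred since the previous call, then
 * remember the current counters so the next call reports only new errors.
 */
uint32_t sketch_query_ras(struct sketch_ctx *ctx,
			  uint64_t ce_now, uint64_t ue_now)
{
	uint32_t flags = 0;

	assert(ctx != NULL);
	if (ce_now != ctx->last_ce)
		flags |= SKETCH_FLAG_RAS_CE;
	if (ue_now != ctx->last_ue)
		flags |= SKETCH_FLAG_RAS_UE;

	ctx->last_ce = ce_now;
	ctx->last_ue = ue_now;
	return flags;
}
```

Storing the snapshot per context means two contexts querying at different times each see their own "since last query" window.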
From: xinhui pan
Add trigger_error and cure_posion.
Acked-by: Hawking Zhang
Signed-off-by: xinhui pan
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/psp_v11_0.c | 50 ++
1 file changed, 50 insertions(+)
diff --git a/drivers/gpu/
This patch set adds initial RAS (Reliability, Availability, Serviceability)
support to amdgpu on supported boards. Features include SRAM and VRAM ECC,
bad page tracking, and error containment.
Eric Huang (2):
drm/amdkfd: add RAS capabilities in topology for Vega20 (v2)
drm/amdkfd: add RAS ECC
From: xinhui pan
Signed-off-by: xinhui pan
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/ta_ras_if.h | 108 +
1 file changed, 108 insertions(+)
create mode 100644 drivers/gpu/drm/amd/amdgpu/ta_ras_if.h
diff --git a/drivers/gpu/d
From: xinhui pan
Add ras fw loading, init, terminate.
Add ras cmd submit helper.
Add ras feature enable/disable common function.
v2: squash in unused variable warning fix
Signed-off-by: xinhui pan
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ps
From: Feifei Xu
Register ecc interrupts and ecc interrupt handler on gfx9.
Add ras support on gfx9
v2: squash in warning fix
Signed-off-by: Feifei Xu
Signed-off-by: xinhui pan
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 3 +
drivers
From: xinhui pan
Signed-off-by: xinhui pan
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 2 +
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 276
2 files changed, 278 insertions(+)
diff --git a/drivers/gpu/drm/amd/am
From: xinhui pan
Define the driver side interface for ras ta.
Acked-by: Hawking Zhang
Signed-off-by: xinhui pan
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 11 +++
1 file changed, 11 insertions(+)
diff --git a/drivers/gpu/drm/
From: xinhui pan
Output the ta fw, aka xgmi/ras, via debugfs.
Signed-off-by: xinhui pan
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 21 +
include/uapi/drm/amdgpu_drm.h | 1 +
2 files changed, 22 insertions
From: xinhui pan
Add the ras fw part; xgmi and ras fw are combined in the ta binary.
Reading the data from the info is not implemented yet.
v2: squash in "drm/amdgpu: fix NULL pointer when ta is missing"
Signed-off-by: xinhui pan
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
---
dr
For each device a file xgmi_device_id is created.
On the first device a subdirectory named xgmi_hive_info is created.
It contains a file named hive_id and symlinks named node 1-4 linking
to each device in the hive.
v2: Return error codes instead of '-1' and fix a few misspellings.
Signed-off-by: Andre
From: Leo Li
The following warning is seen during compile:
./include/linux/idr.h:212:2: warning: this ‘for’ clause does not
guard... [-Wmisleading-indentation]
for ((entry) = idr_get_next((idr), &(id)); \
^
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_events.c:1038:3: note: in
expansion of mac
In each /sys/class/drm/cardX/device/ device you will see the following
xgmi_device_id /* contains the device id within the hive */
xgmi_hive_info ->
../../../../:00:01.1/:02:00.0/:03:00.0/:04:00.0/xgmi_hive_info/
/* hive info folder */
inside xgmi_hive_info are back pointers to
On 2019-02-26 4:24 p.m., Sasha Levin wrote:
> Hi,
>
> [This is an automated email]
>
> This commit has been processed because it contains a -stable tag.
> The stable tag indicates that it's relevant for the following trees: all
>
> The bot has tested the following trees: v4.20.12, v4.19.25, v4.1
On Tue, Mar 5, 2019 at 1:16 PM Paul Menzel
wrote:
>
> Dear Linux folks,
>
>
> Using the MST display Dell UP3214Q (two panels) with an AMD system,
> the virtual monitor object is not created. GDM and Xfce consider both
> panels as separate screens (`xrandr --listmonitors`).
>
> [0.00] Linux
On 05.03.19 at 19:36, Grodzovsky, Andrey wrote:
> On 3/5/19 1:26 PM, Koenig, Christian wrote:
>> On 05.03.19 at 19:24, Grodzovsky, Andrey wrote:
>>> On 3/5/19 1:18 PM, Koenig, Christian wrote:
On 05.03.19 at 18:47, Andrey Grodzovsky wrote:
> For each device a file xgmi_device_id is crea
On 3/5/19 1:26 PM, Koenig, Christian wrote:
> On 05.03.19 at 19:24, Grodzovsky, Andrey wrote:
>> On 3/5/19 1:18 PM, Koenig, Christian wrote:
>>> On 05.03.19 at 18:47, Andrey Grodzovsky wrote:
For each device a file xgmi_device_id is created.
On the first device a subdirectory named xgmi
On 05.03.19 at 19:24, Grodzovsky, Andrey wrote:
> On 3/5/19 1:18 PM, Koenig, Christian wrote:
>> On 05.03.19 at 18:47, Andrey Grodzovsky wrote:
>>> For each device a file xgmi_device_id is created.
>>> On the first device a subdirectory named xgmi_hive_info is created.
>>> It contains a file nam
On 3/5/19 1:18 PM, Koenig, Christian wrote:
> On 05.03.19 at 18:47, Andrey Grodzovsky wrote:
>> For each device a file xgmi_device_id is created.
>> On the first device a subdirectory named xgmi_hive_info is created.
>> It contains a file named hive_id and symlinks named node 1-4 linking
>> to e
Couple of comments inline.
Since I don't have any XGMI gear it would probably help to see what the
final directory layout/contents look like so I can update umr to
automagically scan all of this.
Tom
On 2019-03-05 12:47 p.m., Andrey Grodzovsky wrote:
> For each device a file xgmi_device_id is
On 05.03.19 at 18:47, Andrey Grodzovsky wrote:
> For each device a file xgmi_device_id is created.
> On the first device a subdirectory named xgmi_hive_info is created.
> It contains a file named hive_id and symlinks named node 1-4 linking
> to each device in the hive.
>
> Signed-off-by: Andrey G
Dear Linux folks,
Using the MST display Dell UP3214Q (two panels) with an AMD system,
the virtual monitor object is not created. GDM and Xfce consider both
panels as separate screens (`xrandr --listmonitors`).
[0.00] Linux version 4.20.13.mx64.248
(r...@holidayincambodia.molgen.mpg.de)
A userptr may cross two VMAs if the forked child process (which does not call
exec after fork) mallocs a buffer, frees it, and then mallocs a larger
buffer: the kernel will create a new VMA adjacent to the old VMA that was
cloned from the parent process, so some pages of the userptr are in the first
VMA and the rest are in the
Userptr restore may race with a concurrent userptr invalidation after
hmm_vma_fault adds the range to the hmm->ranges list. In that case we need to
call hmm_vma_range_done to remove the range from the hmm->ranges list first,
and then reschedule the restore worker. Otherwise hmm_vma_fault will add
the same range to the list again; this will
These corner cases were found by kfdtest.KFDIPCTest.
Philip Yang (3):
drm/amdkfd: support concurrent userptr update for HMM
drm/amdgpu: support userptr cross VMAs case with HMM
drm/amdgpu: more descriptive message if HMM not enabled
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 28 +++-
With an old kernel config file, CONFIG_ZONE_DEVICE is not selected, so
CONFIG_HMM and CONFIG_HMM_MIRROR are not enabled, and the current driver
error message "Failed to register MMU notifier" is not clear. Inform the
user with a more descriptive message on how to fix the missing kernel
config option.
Bugzil
For each device a file xgmi_device_id is created.
On the first device a subdirectory named xgmi_hive_info is created.
It contains a file named hive_id and symlinks named node 1-4 linking
to each device in the hive.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c |
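For anyone wanting to script against these files, here is a minimal userspace sketch that builds the path of a per-device file. It assumes the files show up under /sys/class/drm/cardX/device/ as described elsewhere in this thread; the function name sketch_xgmi_path and its error convention are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "/sys/class/drm/card<N>/device/<file>" into buf.
 * Returns 0 on success, -1 if the name is empty or the result did not fit.
 */
int sketch_xgmi_path(char *buf, size_t len, unsigned int card,
		     const char *file)
{
	int n;

	assert(buf != NULL && file != NULL);
	if (strlen(file) == 0)
		return -1;

	n = snprintf(buf, len, "/sys/class/drm/card%u/device/%s", card, file);
	/* snprintf reports the length it wanted; >= len means truncation. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Calling it with card 0 and "xgmi_device_id" yields the path of the device id file for the first card; the same helper would work for "xgmi_hive_info/hive_id" on the device that owns the hive directory.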
On 2019-03-05 6:24 p.m., Nicholas Kazlauskas wrote:
> [Why]
> Can happen on ASICs with 6 planes, but this isn't a bug since we haven't
> written outside the array.
>
> [How]
> Use <= instead of <.
>
> Cc: Leo Li
> Cc: Michel Dänzer
> Reported-by: Michel Dänzer
> Signed-off-by: Nicholas Kazlaus
[Why]
Can happen on ASICs with 6 planes, but this isn't a bug since we haven't
written outside the array.
[How]
Use <= instead of <.
Cc: Leo Li
Cc: Michel Dänzer
Reported-by: Michel Dänzer
Signed-off-by: Nicholas Kazlauskas
---
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +-
1 file
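The reasoning behind "<= instead of <" can be sketched outside the driver like this. The names and the capacity of 6 are assumptions for illustration only (the patch text just mentions ASICs with 6 planes); the point is that a count equal to the array capacity means the array is full, not overrun.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_PLANES 6 /* hypothetical array capacity */

/* Too strict: rejects a completely full array although nothing overflowed. */
bool sketch_count_ok_strict(size_t count)
{
	return count < SKETCH_MAX_PLANES;
}

/* 'count' entries occupy indices 0..count-1, so count == capacity is valid. */
bool sketch_count_ok(size_t count)
{
	return count <= SKETCH_MAX_PLANES;
}

/* Fill up to 'wanted' entries and return how many were written. */
size_t sketch_fill(int planes[SKETCH_MAX_PLANES], size_t wanted)
{
	size_t count = 0;

	while (count < wanted && count < SKETCH_MAX_PLANES) {
		planes[count] = (int)count;
		count++;
	}
	/* A full array is fine, so the sanity check must allow equality. */
	assert(sketch_count_ok(count));
	return count;
}
```

With the strict variant the sanity check would fire for exactly 6 planes even though the last write went to index 5.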
On 3/5/19 12:09 PM, Michel Dänzer wrote:
> On 2019-02-25 11:46 p.m., Bhawanpreet Lakha wrote:
>> From: Nicholas Kazlauskas
>>
>> [Why]
>> Primary and underlay planes were previously exposed to DRM by using
>> max_planes and max_slave_planes.
>>
>> The value for max_planes was always pipe_count + h
On 2019-02-25 11:46 p.m., Bhawanpreet Lakha wrote:
> From: Nicholas Kazlauskas
>
> [Why]
> Primary and underlay planes were previously exposed to DRM by using
> max_planes and max_slave_planes.
>
> The value for max_planes was always pipe_count + has_underlay.
> If there was an underlay pipe, th
On 2019-03-05 9:14 a.m., Nicholas Kazlauskas wrote:
> [Why]
> New DRM versions manage locking for private objects for us, so this
> is no longer needed.
>
> This also prevents a WARN_ON from occurring when the private object is
> duplicated during the forced atomic commit that occurs from the HP
Signed-off-by: Alex Deucher
---
data/amdgpu.ids | 1 +
1 file changed, 1 insertion(+)
diff --git a/data/amdgpu.ids b/data/amdgpu.ids
index d24c7ee6..f61497e4 100644
--- a/data/amdgpu.ids
+++ b/data/amdgpu.ids
@@ -45,6 +45,7 @@
6665, 83, AMD Radeon (TM) R5 M320
6667, 0, AMD Radeon R5
The driver votes for a low-to-high pstate switch whenever there is an
outstanding XGMI mapping request, and for a high-to-low switch once all
outstanding XGMI mappings are terminated.
Change-Id: I499fb1c389077632fe9cfce4b6dc9a33deff6875
Signed-off-by: shaoyunl
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h
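The voting scheme described above amounts to reference counting the outstanding mappings. A minimal sketch, assuming a single counter per hive and no locking; all names here (sketch_hive, sketch_map, sketch_unmap) are hypothetical and not taken from the patch:

```c
#include <assert.h>
#include <stddef.h>

enum sketch_pstate { SKETCH_PSTATE_LOW, SKETCH_PSTATE_HIGH };

struct sketch_hive {
	unsigned int map_count;    /* outstanding XGMI mappings */
	enum sketch_pstate pstate; /* current vote */
};

/* A new mapping request: the first one votes for the high pstate. */
void sketch_map(struct sketch_hive *h)
{
	assert(h != NULL);
	if (h->map_count++ == 0)
		h->pstate = SKETCH_PSTATE_HIGH;
}

/* A mapping was torn down: the last one votes back to the low pstate. */
void sketch_unmap(struct sketch_hive *h)
{
	assert(h != NULL && h->map_count > 0);
	if (--h->map_count == 0)
		h->pstate = SKETCH_PSTATE_LOW;
}
```

Only the 0 -> 1 and 1 -> 0 transitions change the vote, so nested mappings do not cause repeated pstate switches. Real driver code would also need to serialize these updates.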
Reviewed-by: Alex Deucher
From: amd-gfx on behalf of Michel
Dänzer
Sent: Tuesday, March 5, 2019 6:20 AM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdkfd: Add curly braces around
idr_for_each_entry_continue loop
From: Michel Dänzer
The compiler p
On 05.03.19 at 16:49, Liu, Shaoyun wrote:
Adjust the vram base offset for XGMI mapping when updating the PT entry so
the address falls into the correct XGMI aperture for the peer device.
Change-Id: I78bdf244da699d2559481ef5afe9663b3e752236
Signed-off-by: shaoyunl
Reviewed-by: Christian König
---
d
Acked-by: Oak Zeng
Regards,
Oak
-----Original Message-----
From: amd-gfx On Behalf Of Christian
König
Sent: Monday, March 4, 2019 11:28 AM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH 2/2] drm/amdgpu: let amdgpu_vm_clear_bo figure out ats status v2
Instead of providing it from outside fi
Adjust the vram base offset for XGMI mapping when updating the PT entry so
the address falls into the correct XGMI aperture for the peer device.
Change-Id: I78bdf244da699d2559481ef5afe9663b3e752236
Signed-off-by: shaoyunl
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 +
1 file changed, 9 inse
Oh, indeed. Nobody noticed so far and the patch is already committed.
Christian.
On 05.03.19 at 16:25, Zeng, Oak wrote:
Is the "UMD" in the title a typo? From the comments in the code it is "UMC"
Regards,
Oak
-----Original Message-----
From: amd-gfx On Behalf Of Christian
König
Sent: Monda
Reviewed-by: Alex Deucher
From: amd-gfx on behalf of Tom St Denis
Sent: Tuesday, March 5, 2019 10:17 AM
To: amd-gfx mailing list
Subject: Re: [PATCH] drm/amd/amdgpu: Add ENGINE_CNTL register to vcn10 headers
Hi,
Alex can I get an RB on this :-)
Thanks,
Tom
O
Is the "UMD" in the title a typo? From the comments in the code it is "UMC"
Regards,
Oak
-----Original Message-----
From: amd-gfx On Behalf Of Christian
König
Sent: Monday, March 4, 2019 8:15 AM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdgpu: reroute VMC and UMD to IH ring 1
Pag
Hi,
Alex can I get an RB on this :-)
Thanks,
Tom
On Mon, Mar 4, 2019 at 10:59 AM StDenis, Tom wrote:
> Signed-off-by: Tom St Denis
> ---
> drivers/gpu/drm/amd/include/asic_reg/vcn/vcn_1_0_offset.h | 2 ++
> drivers/gpu/drm/amd/include/asic_reg/vcn/vcn_1_0_sh_mask.h | 5 +
> 2 files chan
[Why]
New DRM versions manage locking for private objects for us, so this
is no longer needed.
This also prevents a WARN_ON from occurring when the private object is
duplicated during the forced atomic commit that occurs from the HPD
handler.
The HPD handler calls drm_modeset_lock_all before the
Hello wentalou,
The patch 2c11ee6ae553: "drm/amdgpu: tighten gpu_recover in
mailbox_flr to avoid duplicate recover in sriov" from Jan 30, 2019,
leads to the following static checker warning:
drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c:270 xgpu_ai_mailbox_flr_work()
warn: impossible cond
From: Michel Dänzer
The compiler pointed out that one if block unintentionally wasn't part
of the loop:
In file included from ./include/linux/kernfs.h:14,
from ./include/linux/sysfs.h:16,
from ./include/linux/kobject.h:20,
from ./include/linux/d
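The class of bug the compiler is pointing at can be reproduced in a few lines. This sketch uses a made-up iteration macro (sketch_for_each) in place of idr_for_each_entry_continue, and is only meant to show why the braces matter: a macro that expands to a bare 'for' header guards just the single statement that follows it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical iteration macro that expands to a bare 'for' header. */
#define sketch_for_each(i, n) for ((i) = 0; (i) < (n); (i)++)

/* Without braces the loop guards only the first statement. */
void sketch_unbraced(size_t n, size_t *first, size_t *second)
{
	size_t i;

	assert(first != NULL && second != NULL);
	*first = *second = 0;
	sketch_for_each(i, n)
		(*first)++;
	(*second)++; /* runs once, after the loop, whatever its indentation */
}

/* With braces both statements belong to the loop body. */
void sketch_braced(size_t n, size_t *first, size_t *second)
{
	size_t i;

	assert(first != NULL && second != NULL);
	*first = *second = 0;
	sketch_for_each(i, n) {
		(*first)++;
		(*second)++;
	}
}
```

In the unbraced variant the second statement is written here at its true nesting level; indenting it to match the first statement would change nothing about the behaviour, which is what the misleading-indentation warning is about.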
On 05.03.19 at 00:35, Kuehling, Felix wrote:
One not so obvious change here: The fence on the page table after
clear_bo now waits for clearing both the page table and the shadow. That
may make clearing of page tables appear a bit slower. On the other hand,
if you're clearing a bunch of page tabl
On 05.03.19 at 03:37, 周磊 wrote:
I got this kernel Oops when running a ROCm kernel on an ARM64 machine
with a Vega64 card.
---
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
b/drivers/gpu/drm/a
On 05.03.19 at 00:27, Liu, Shaoyun wrote:
Adjust the vram base offset for XGMI mapping when updating the PT entry so
the address falls into the correct XGMI aperture for the peer device.
Change-Id: I78bdf244da699d2559481ef5afe9663b3e752236
Signed-off-by: shaoyunl
---
drivers/gpu/drm/amd/amdgpu/amdgpu_v