From: Jason Gunthorpe
Since amdgpu does not use the snapshot mode of hmm_range_fault(), a
successful return already proves that all entries in the pfns are
HMM_PFN_VALID, so there is no need to check the return result of
hmm_device_entry_to_page().
Signed-off-by: Jason Gunthorpe
---
drivers/gpu/dr
From: "Christian König"
Sent: 2020-04-21 22:53:47
To: "赵军奎"
Cc: Alex Deucher ,"David (ChunMing) Zhou"
,David Airlie ,Daniel Vetter
,Tom St Denis ,Ori Messinger
,Sam Ravnborg
,amd-gfx@lists.freedesktop.org,dri-de...@lists.freedesktop.org,linux-ker...@vger.kernel.org,opensource.ker...@vivo.com
Subject
dy Dunlap
Cc: Harry Wentland
Cc: Alex Deucher
Cc: Krzysztof Kozlowski
---
drivers/gpu/drm/amd/display/Kconfig |8 ++--
1 file changed, 2 insertions(+), 6 deletions(-)
--- linux-next-20200421.orig/drivers/gpu/drm/amd/display/Kconfig
+++ linux-next-20200421/drivers/gpu/drm/amd/displ
On Tue, Apr 21, 2020 at 09:21:46PM -0300, Jason Gunthorpe wrote:
> +void nouveau_hmm_convert_pfn(struct nouveau_drm *drm, struct hmm_range
> *range,
> + u64 *ioctl_addr)
> {
> unsigned long i, npages;
>
> + /*
> + * The ioctl_addr prepared here is pass
From: Jason Gunthorpe
This is just an alias for HMM_PFN_ERROR, nothing cares that the error was
because of a special page vs any other error case.
Signed-off-by: Jason Gunthorpe
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 -
drivers/gpu/drm/nouveau/nouveau_svm.c | 1 -
include/linux/hmm.
From: Jason Gunthorpe
The API is a bit complicated for the uses we actually have, and
discussions for simplifying have come up a number of times.
This small series removes the customizable pfn format and simplifies the
return code of hmm_range_fault().
All the drivers are adjusted to process in
Sure, this seems to be a lot more professional than my previous modification.
My original intention was to make the code easier to read, and I learned a lot
from submitting these patches. Thank you very much for all your guidance!
Regards,
Bernard
From: Felix Kuehling
Sent: 2020-04-22 10:27:16
To
On Tue, Apr 21, 2020 at 09:21:43PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe
>
> hmm_vma_walk->last is supposed to be updated after every write to the
> pfns, so that it can be returned by hmm_range_fault(). However, this is
> not done consistently. Fortunately nothing checks the retu
Make the code a bit more readable by using a common
error handling pattern.
With that done the patch is Reviewed-by: Christian König.
Signed-off-by: Bernard Zhao
Changes since V1:
*commit message improve
*code style refactoring
Changes since V2:
*code style adjust
Changes since V3:
*find the
On Tue, Apr 21, 2020 at 09:21:45PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe
>
> This is just an alias for HMM_PFN_ERROR, nothing cares that the error was
> because of a special page vs any other error case.
Looks good,
Reviewed-by: Christoph Hellwig
From: Jason Gunthorpe
Presumably the intent here was that hmm_range_fault() could put the data
into some HW specific format and thus avoid some work. However, nothing
actually does that, and it isn't clear how anything actually could do that
as hmm_range_fault() provides CPU addresses which must
From: Jason Gunthorpe
There is no reason for a user to select this or not directly - it should
be selected by drivers that are going to use the feature, similar to how
CONFIG_HMM_MIRROR works.
Currently all drivers provide a feature kconfig that will disable use of
DEVICE_PRIVATE in that driver,
On Tue, Apr 21, 2020 at 09:21:42PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe
>
> There is no reason for a user to select this or not directly - it should
> be selected by drivers that are going to use the feature, similar to how
> CONFIG_HMM_MIRROR works.
>
> Currently all drivers pr
From: Jason Gunthorpe
hmm_vma_walk->last is supposed to be updated after every write to the
pfns, so that it can be returned by hmm_range_fault(). However, this is
not done consistently. Fortunately nothing checks the return code of
hmm_range_fault() for anything other than error.
More important
[AMD Public Use]
Reviewed-by: Guchun Chen
Regards,
Guchun
-----Original Message-----
From: Dennis Li
Sent: Wednesday, April 22, 2020 1:57 PM
To: amd-gfx@lists.freedesktop.org; Deucher, Alexander
; Zhou1, Tao ; Zhang, Hawking
; Chen, Guchun
Cc: Li, Dennis
Subject: [PATCH v2] drm/amdgpu: se
If error query ready is set in amdgpu_ras_late_init, the error
query of some IP blocks is marked ready even though those blocks
aren't initialized yet.
v2: change the prefix of title to "drm/amdgpu" and remove
the unnecessary "{}".
Change-Id: I5087527261cb1b462afd82ad7592cf1ef73b15bd
Signed-off-by: Dennis Li
dif
[AMD Public Use]
if (adev->in_suspend || adev->in_gpu_reset) {
- amdgpu_ras_set_error_query_ready(adev, true);
return 0;
}
Please also remove the "{}". With that fixed, the patch is
Reviewed-by: Hawking Zhang
Regards,
Hawking
-----Original Message-----
[AMD Public Use]
Need to modify the prefix of the commit title to 'drm/amdgpu'.
With that fixed, the patch is: Reviewed-by: Guchun Chen
Regards,
Guchun
-----Original Message-----
From: Dennis Li
Sent: Wednesday, April 22, 2020 12:31 PM
To: amd-gfx@lists.freedesktop.org; Deucher, Alexander
; Zhou1, T
With the current KIQ register access method,
there is a race condition when using the KIQ to read
registers if multiple clients want to read at the same time,
as in the example below:
1. client-A starts to read REG-0 through KIQ
2. client-A polls the seqno-0
3. client-B starts to read REG-1 thro
If error query ready is set in amdgpu_ras_late_init, the error
query of some IP blocks is marked ready even though those blocks
aren't initialized yet.
Change-Id: I5087527261cb1b462afd82ad7592cf1ef73b15bd
Signed-off-by: Dennis Li
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
b/drivers/gpu/drm/amd/amdgp
Acked-by: Yintian Tao
-----Original Message-----
From: Christian König
Sent: 2020-04-21 22:23
To: Liu, Monk ; He, Jacob ; Tao, Yintian
; amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdgpu: change how we update mmRLC_SPM_MC_CNTL
In pp_one_vf mode avoid the extra overhead and read/write
Unfortunately, mode1 reset is also affected, as I confirmed on navi10.
That is why the original design from our previous discussions (switching to
mode1 reset when the audio suspend fails) was not taken.
Anyway, I sent out a V2 patch to limit this to baco and mode1 reset only.
Regards,
Evan
-Origina
By default, the autosuspend delay of the audio controller is 3s. If the
gpu reset is triggered within 3s (after the audio controller goes idle),
the audio controller may be unable to enter the suspended state. Then
the sudden gpu reset will cause some audio errors. The change
here is targeted to resolve this.
However i
On 2020-04-21 21:46, Bernard Zhao wrote:
Make the code a bit more readable by using a common
error handling pattern.
With that done the patch is Reviewed-by: Christian König.
Signed-off-by: Bernard Zhao
Thanks. The patch is
Reviewed-by: Felix Kuehling
I removed the history from the commi
Thanks again for the patch. I'm going to apply this with some minor
fixes. The headline should start with "drm/amdgpu:". I'll also change
the wording of the headline and commit message:
drm/amdgpu: shrink critical section in
amdgpu_amdkfd_gpuvm_free_memory_of_gpu
Reduce the mem->lock
Reviewed-by: Monk Liu
_
Monk Liu|GPU Virtualization Team |AMD
-----Original Message-----
From: Christian König
Sent: Tuesday, April 21, 2020 10:23 PM
To: Liu, Monk ; He, Jacob ; Tao, Yintian
; amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdgpu: chan
See comments inline.
On 2020-04-19 at 9:58 p.m., Mukul Joshi wrote:
> Track GPU memory usage on a per process basis and report it through
> sysfs.
>
> Signed-off-by: Mukul Joshi
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 12 ++
> drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 7
>
Patch 1 Acked-by: Andrey Grodzovsky
Patches 2-4 Reviewed-by: Andrey Grodzovsky
Andrey
On 4/21/20 1:23 AM, Evan Quan wrote:
Patch 1 and 2 are the necessary fixes for XGMI setup, since these
operations are also needed for the other devices in the same hive. That's
missing now.
Patch 3 and 4 are basic
On Tue, Apr 21, 2020 at 8:00 AM Evan Quan wrote:
>
> By default, the autosuspend delay of the audio controller is 3s. If the
> gpu reset is triggered within 3s (after the audio controller goes idle),
> the audio controller may be unable to enter the suspended state. Then
> the sudden gpu reset will cause some audio err
From: "Christian König"
Sent: 2020-04-21 21:02:27
To: "赵军奎"
Cc: Alex Deucher ,"David (ChunMing) Zhou"
,David Airlie ,Daniel Vetter
,Tom St Denis ,Ori Messinger
,Sam Ravnborg
,amd-gfx@lists.freedesktop.org,dri-de...@lists.freedesktop.org,linux-ker...@vger.kernel.org,opensource.ker...@vivo.com
Subject
On 21.04.20 at 15:39, 赵军奎 wrote:
From: "Christian König"
Sent: 2020-04-21 21:02:27
To: "赵军奎"
Cc: Alex Deucher ,"David (ChunMing) Zhou" ,David Airlie
,Daniel Vetter ,Tom St Denis ,Ori Messinger
,Sam Ravnborg
,amd-gfx@lists.freedesktop.org,dri-de...@lists.freedesktop.org,linux-ker...@vger.kernel
On 21.04.20 at 16:33, Christian König wrote:
On 20.04.20 at 03:50, Randy Dunlap wrote:
Fix a kernel-doc warning of missing struct field description:
../drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:92: warning: Function
parameter or member 'vm' not described in 'amdgpu_vm_eviction_lock'
Can't we j
On 20.04.20 at 03:50, Randy Dunlap wrote:
Fix a kernel-doc warning of missing struct field description:
../drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:92: warning: Function parameter or
member 'vm' not described in 'amdgpu_vm_eviction_lock'
Can't we just document the function parameter instead? Sh
In pp_one_vf mode avoid the extra overhead and read/write the
registers without the KIQ.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 13 ++---
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 10 --
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 13 ++---
On 2020-04-19 9:50 p.m., Randy Dunlap wrote:
> Fix a kernel-doc warning of missing struct field description:
>
> ../drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:331: warning: Function
> parameter or member 'hdcp_workqueue' not described in 'amdgpu_display_manager'
>
> Fixes: 52704fcaf74b ("d
On 2020-04-19 9:50 p.m., Randy Dunlap wrote:
> Fix a kernel-doc warning of missing struct field description:
>
> ../drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:92: warning: Function parameter or
> member 'vm' not described in 'amdgpu_vm_eviction_lock'
>
> Fixes: a269e44989f3 ("drm/amdgpu: Avoid reclai
> What are you talking about? The bits control what is used in the MC
> interface, there is no increment or anything here.
My bad, it is RLC_SPM_PERF_CNTR, not RLC_SPM_PERF_COUNTER; I thought of it as a
COUNTER
>>Agreed that sounds like a good idea to me as well no matter if we use RMW or
>>just
Hi Christian
Great. Then can you modify the patch according to Monk's suggestion?
We need this patch for one important project.
Best Regards
Yintian Tao
-----Original Message-----
From: Koenig, Christian
Sent: 2020-04-21 21:38
To: Liu, Monk ; Tao, Yintian ; He, Jacob
; amd-gfx@lists.freede
The problem is some fields are increased by hardware
What are you talking about? The bits control what is used in the MC
interface, there is no increment or anything here.
I think at least we should apply one change: we use NO_KIQ for SRIOV
pp_one_vf_mode case to access this SPM register t
The problem is that some fields are incremented by hardware and the RLC simply
reads their values, so we cannot set those fields together with the VMID
Christian, we should stop arguing on this small feature, there is no way to
have a worse solution compared with current logic
I think at least we should appl
From: "Christian König"
Date: 2020-04-21 16:06:03
To: 1587180037-113840-1-git-send-email-bern...@vivo.com,Felix Kuehling
,Alex Deucher ,"David
(ChunMing) Zhou" ,David Airlie ,Daniel
Vetter
,amd-gfx@lists.freedesktop.org,dri-de...@lists.freedesktop.org,linux-ker...@vger.kernel.org
Cc: opens
Maybe we could reduce the mutex_lock(&mem->lock)`s protected code area,
and no need to protect pr_debug.
Signed-off-by: Bernard Zhao
Changes since V1:
*commit message improve
Changes since V2:
*move comment along with the mutex_unlock
Changes since V3:
*lock protect the if check, there is some
From: "Christian König"
Date: 2020-04-21 19:22:49
To: Bernard Zhao ,Alex Deucher
,"David (ChunMing) Zhou" ,David
Airlie ,Daniel Vetter ,Tom St Denis
,Ori Messinger ,Sam Ravnborg
,amd-gfx@lists.freedesktop.org,dri-de...@lists.freedesktop.org,linux-ker...@vger.kernel.org
Cc: opensource.ker...
There is no need to if check again, maybe we could merge
into the above else branch.
Signed-off-by: Bernard Zhao
Changes since V1:
*commit message improve
*code style refactoring
Changes since V2:
*code style adjust
Changes since V3:
*find the best way to merge unnecessary if/else check branch
From: "Christian König"
Date: 2020-04-21 15:41:27
To: 1587181464-114215-1-git-send-email-bern...@vivo.com,Felix Kuehling
,Alex Deucher ,"David
(ChunMing) Zhou" ,David Airlie ,Daniel
Vetter
,amd-gfx@lists.freedesktop.org,dri-de...@lists.freedesktop.org,linux-ker...@vger.kernel.org
Cc: opens
>> But i have to say there are so many code not follow the kernel code-style in
>> amdgpu module.
>> And also the ./scripts/checkpatch.pl did not throw any warning or error.
>
> That is unfortunately true, yes. But we try to push new code through the
> usual code review and improve things as we g
>>> There is no need to if check again, maybe we could merge
>>> into the above else branch.
I find also this commit message still improvable (besides the mentioned
implementation details around coding style concerns).
How will corresponding review comments be taken better into account?
Regards,
> The "if(!encoder)" branch return the same value 0 of the success
> branch, maybe return -EINVAL is more better.
I suggest to improve the commit message.
* Are you still unsure about the next changes?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/proces
> But i have to say there are so many code not follow the kernel code-style in
> amdgpu module.
> And also the ./scripts/checkpatch.pl did not throw any warning or error.
Will such information become more interesting for further evolution
in the affected software areas?
Regards,
Markus
There is no need to if check again, maybe we could merge
into the above else branch.
Signed-off-by: Bernard Zhao
Changes since V1:
*commit message improve
*code style refactoring
Changes since V2:
*code style adjust
Link for V1:
*https://lore.kernel.org/patchwork/patch/1226587/
---
.../gpu/dr
When the VRAM manager and DRM MM init fails, there is no operation
to free the kzalloc'ed memory and remove the device file.
This will lead to a memory leak and cause stability issues.
Signed-off-by: Bernard Zhao
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 24
1 file changed, 19 insertions(+
From: Bernard Zhao
Date: 2020-04-21 10:07:50
To: Alex Deucher ,"Christian König"
,"David (ChunMing) Zhou" ,David
Airlie ,Daniel Vetter ,Lyude Paul
,Sam Ravnborg ,Bernard Zhao
,"José Roberto de Souza" ,Andrzej
Pietrasiewicz
,amd-gfx@lists.freedesktop.org,dri-de...@lists.freedesktop.org,lin
> There is no need to if check again,
Thanks for this information.
* Should the function name be mentioned in this commit message?
* Would you like to adjust the patch subject another bit?
> maybe we could merge into the above else branch.
I suggest to reconsider this wording.
Are you still u
On 21.04.20 at 14:22, Bernard Zhao wrote:
There is no need to if check again, maybe we could merge
into the above else branch.
Signed-off-by: Bernard Zhao
You could improve the subject line and commit message a bit, e.g.
something like:
[PATCH] drm/amdgpu: cleanup coding style in amdkfd a
On 21.04.20 at 14:09, 赵军奎 wrote:
From: "Christian König"
Date: 2020-04-21 19:22:49
To: Bernard Zhao ,Alex Deucher ,"David (ChunMing) Zhou"
,David Airlie ,Daniel Vetter ,Tom St Denis
,Ori Messinger ,Sam Ravnborg
,amd-gfx@lists.freedesktop.org,dri-de...@lists.freedesktop.org,linux-ker...@vger
By default, the autosuspend delay of the audio controller is 3s. If the
gpu reset is triggered within 3s (after the audio controller goes idle),
the audio controller may be unable to enter the suspended state. Then
the sudden gpu reset will cause some audio errors. The change
here is targeted to resolve this.
However i
Hi Monk,
at least on Vega that should be fine. If the RLC should use anything
else than 0 here we should update that together with the VMID.
Regards,
Christian.
On 21.04.20 at 11:54, Liu, Monk wrote:
Could only be that the firmware updates the bits to something non default, I'm
going to do
On 21.04.20 at 13:17, Bernard Zhao wrote:
When the VRAM manager and DRM MM init fails, there is no operation
to free the kzalloc'ed memory and remove the device file.
This will lead to a memory leak and cause stability issues.
NAK, failure to create sysfs nodes is not critical.
Christian.
Signed-off-by: Bernard
>>> Could only be that the firmware updates the bits to something non default,
>>> I'm going to double check that on a Vega10.
I think that will give a sure answer; otherwise why would we need those fields if we
always write 0 to them and the reader always expects to read 0 back from them?
Those fields are
On 21.04.20 at 11:45, Tao, Yintian wrote:
-----Original Message-----
From: Christian König
Sent: 2020-04-21 17:10
To: Liu, Monk ; Tao, Yintian ; He, Jacob
; amd-gfx@lists.freedesktop.org
Cc: Gu, Frans
Subject: [PATCH] drm/amdgpu: cleanup SPM VMID update
The RLC SPM configuration register co
Christian
Many fields do not look like they are going to hold a static value at all, e.g.:
RLC_SPM_PERF_CNTR 5 0x0 PERF_CNTR that is used by RLC for
memory transactions
With your change you always set the above field to 0, is that right? I really doubt
it
Beside: to make SRIOV VF less painful ple
On 21.04.20 at 10:44, 赵军奎 wrote:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 9dff792c9290..5424bd921a7b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
-----Original Message-----
From: Christian König
Sent: 2020-04-21 17:10
To: Liu, Monk ; Tao, Yintian ; He, Jacob
; amd-gfx@lists.freedesktop.org
Cc: Gu, Frans
Subject: [PATCH] drm/amdgpu: cleanup SPM VMID update
The RLC SPM configuration register contains the information how the memory
acce
The RLC SPM configuration register contains the information how the memory
access is made (VMID, MTYPE, etc) which should always be consistent.
So instead of a read modify write cycle of the VMID always update
the whole register.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu
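A rough illustration of the two update strategies under discussion: a read-modify-write of only the VMID field versus writing the whole register from known values. This is a user-space sketch, not the amdgpu code. The field layout, the fake_reg variable and all function names are assumptions made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout, for illustration only. */
#define SPM_VMID_SHIFT  0
#define SPM_VMID_MASK   (0xFu << SPM_VMID_SHIFT)
#define SPM_MTYPE_SHIFT 4
#define SPM_MTYPE_MASK  (0x3u << SPM_MTYPE_SHIFT)

static uint32_t fake_reg; /* stands in for the hardware register */

static uint32_t reg_read(void) { return fake_reg; }
static void reg_write(uint32_t v) { fake_reg = v; }

/* Read-modify-write: only the VMID field changes, every other
 * field keeps whatever value was in the register before. */
static void spm_update_vmid_rmw(uint32_t vmid)
{
    uint32_t v = reg_read();

    v &= ~SPM_VMID_MASK;
    v |= (vmid << SPM_VMID_SHIFT) & SPM_VMID_MASK;
    reg_write(v);
}

/* Full write: all fields are set from known values, so no
 * register read is needed and the result is always consistent. */
static void spm_update_vmid_full(uint32_t vmid, uint32_t mtype)
{
    assert(vmid <= 0xF && mtype <= 0x3);
    reg_write((vmid << SPM_VMID_SHIFT) | (mtype << SPM_MTYPE_SHIFT));
}
```

The full write saves one register read per update, which matters when every access is expensive, but it is only safe if no field in the register is changed behind the driver's back.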
The original idea is from Monk: only update the SPM VMID
the first time, which relieves the burden of frequent register
reads/writes under virtualization.
v2: set spm_vmid_updated to false when a job times out
Signed-off-by: Yintian Tao
---
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 3 +++
drivers/gpu/drm/amd/am
On 21.04.20 at 10:03, Bernard Zhao wrote:
There is no need to if check again, maybe we could merge
into the above else branch.
Signed-off-by: Bernard Zhao
Changes since V1:
*commit message improve
*code style refactoring
Changes since V2:
*code style adjust
Link for V1:
*https://nam11.safel
On 21.04.20 at 09:36, Bernard Zhao wrote:
Maybe we could reduce the mutex_lock(&mem->lock)`s protected code area,
and no need to protect pr_debug.
Well that change looks rather superfluous to me.
This is for freeing memory which by definition can only be done once and
so the should be exactl
On 21.04.20 at 04:41, Bernard Zhao wrote:
There is no need to if check again, maybe we could merge
into the above else branch.
Signed-off-by: Bernard Zhao
---
Changes since V1:
*commit message improve
*code style refactoring
Link for V1:
*
https://nam11.safelinks.protection.outlook.com/?url
On 21.04.20 at 04:41, YueHaibing wrote:
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c: In function amdgpu_job_submit:
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:148:26: warning: variable priority set
but not used [-Wunused-but-set-variable]
commit 33abcb1f5a17 ("drm/amdgpu: set compute queue priority a
Maybe we could reduce the mutex_lock(&mem->lock)`s protected code area,
and no need to protect pr_debug.
Signed-off-by: Bernard Zhao
Changes since V1:
*commit message improve
Link for V1:
*https://lore.kernel.org/patchwork/patch/1226588/
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 7
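The general idea of shrinking a critical section so that logging happens outside the lock could be sketched as follows. This is a user-space analogy using pthreads, not the amdgpu_amdkfd_gpuvm.c code. struct mem_obj, mem_obj_try_free() and the -EBUSY return value are assumptions chosen for the illustration.

```c
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical object guarded by its own lock. */
struct mem_obj {
    pthread_mutex_t lock;
    int mapped_count;
};

/* Returns 0 if the object may be freed, -EBUSY if it is still mapped. */
static int mem_obj_try_free(struct mem_obj *mem)
{
    int mapped;

    /* Only the read of the shared state is done under the lock. */
    pthread_mutex_lock(&mem->lock);
    mapped = mem->mapped_count;
    pthread_mutex_unlock(&mem->lock);

    if (mapped > 0) {
        /* The debug message uses the local copy, so it needs no lock. */
        fprintf(stderr, "object still mapped %d times\n", mapped);
        return -EBUSY;
    }
    return 0;
}
```

Whether such a change is worthwhile depends on how contended the lock is. For a path that runs once per object the gain is mostly readability.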
The "if(!encoder)" branch return the same value 0 of the success
branch, maybe return -EINVAL is more better.
Signed-off-by: Bernard Zhao
---
Changes since V1:
* commit message improve
---
drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c | 14 +++---
1 file changed, 7 insertions(+), 7 del
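As a small illustration of why a distinct error code helps the caller, the sketch below returns -EINVAL when the pointer is missing and 0 only on real success. It is not the amdgpu_connectors.c code. struct encoder and encoder_get_id() are hypothetical names used only for this example.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an encoder object. */
struct encoder {
    int id;
};

/* Returns 0 and stores the id on success, -EINVAL if there is no encoder. */
static int encoder_get_id(const struct encoder *encoder, int *out_id)
{
    if (!encoder)
        return -EINVAL; /* a missing encoder is no longer reported as success */

    *out_id = encoder->id;
    return 0;
}
```

With this shape the caller can tell "nothing to do because the encoder is missing" apart from "completed successfully", which is not possible when both paths return 0.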