Re: [PATCH 3/3] drm/amdkfd: report pcie bandwidth to the kfd

2021-07-17 Thread Felix Kuehling
On 2021-07-16 at 12:43 p.m., Jonathan Kim wrote: > Similar to xGMI reporting the min/max bandwidth between direct peers, PCIe > will report the min/max bandwidth to the KFD. > > v2: change to bandwidth > > Signed-off-by: Jonathan Kim > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 61 ++

Re: [PATCH 2/3] drm/amdkfd: report xgmi bandwidth between direct peers to the kfd

2021-07-17 Thread Felix Kuehling
On 2021-07-16 at 12:43 p.m., Jonathan Kim wrote: > Report the min/max bandwidth in megabytes to the kfd for direct > xgmi connections only. By "direct XGMI connections", you mean this doesn't work for links with more than one hop? Will that spew out DRM_ERROR messages for such links? Then it's pr

[PATCH v4 07/13] mm: add generic type support to migrate_vma helpers

2021-07-17 Thread Alex Sierra
Device generic type case added for the migrate_vma_pages and migrate_vma_check_page helpers. Both generic and private device types use the same conditions to decide whether to migrate pages from/to device memory. Signed-off-by: Alex Sierra --- mm/migrate.c | 20 +--- 1 file changed, 9 inse

[PATCH v4 05/13] drm/amdkfd: generic type as sys mem on migration to ram

2021-07-17 Thread Alex Sierra
On VRAM to RAM migration, generic device type memory has access similar to system RAM from the CPU. This flag sets the source from the sender, which in the generic type case should be set as SYSTEM. Signed-off-by: Alex Sierra Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_migrate.c

[PATCH v4 03/13] kernel: resource: lookup_resource as exported symbol

2021-07-17 Thread Alex Sierra
The AMD architecture for the Frontier supercomputer will have device memory which can be coherently accessed by the CPU. The system BIOS advertises this memory as SPM (special purpose memory) in the UEFI system address map. The AMDGPU driver needs to be able to look up this resource in order to cla

[PATCH v4 00/13] Support DEVICE_GENERIC memory in migrate_vma_*

2021-07-17 Thread Alex Sierra
v1: AMD is building a system architecture for the Frontier supercomputer with a coherent interconnect between CPUs and GPUs. This hardware architecture allows the CPUs to coherently access GPU device memory. We have hardware in our labs and we are working with our partner HPE on the BIOS, firmware

[PATCH v4 08/13] mm: call pgmap->ops->page_free for DEVICE_GENERIC pages

2021-07-17 Thread Alex Sierra
Add MEMORY_DEVICE_GENERIC case to free_zone_device_page callback. Device generic type memory case is now able to free its pages properly. Signed-off-by: Alex Sierra --- mm/memremap.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/mm/memremap.c b/mm/memremap.c index 614b

[PATCH v4 04/13] drm/amdkfd: add SPM support for SVM

2021-07-17 Thread Alex Sierra
When the CPU is connected through XGMI, it has coherent access to the VRAM resource. In this case that resource is taken from a table in the device gmc aperture base. This resource is used along with the device type, which could be DEVICE_PRIVATE or DEVICE_GENERIC, to create the device page map region. Sign

[PATCH v4 09/13] lib: test_hmm add ioctl to get zone device type

2021-07-17 Thread Alex Sierra
New ioctl cmd added to query the zone device type. This will be used once test_hmm adds the zone device generic type. Signed-off-by: Alex Sierra --- lib/test_hmm.c | 15 ++- lib/test_hmm_uapi.h | 7 +++ 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/lib/test_hmm

[PATCH v4 01/13] ext4/xfs: add page refcount helper

2021-07-17 Thread Alex Sierra
From: Ralph Campbell There are several places where ZONE_DEVICE struct pages assume a reference count == 1 means the page is idle and free. Instead of open coding this, add a helper function to hide this detail. v3: [AS]: rename dax_layout_is_idle_page func to dax_page_unused v4: [AS]: This ref
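The idea of hiding the "reference count == 1 means idle" check behind a helper can be sketched in userspace C as follows. This is only an illustration under stated assumptions: `struct mock_page`, `mock_page_ref_count()` and `mock_dax_page_unused()` are hypothetical stand-ins (the last one named after the `dax_page_unused` rename mentioned in the changelog), not the kernel's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct page; not the kernel type. */
struct mock_page {
    int refcount;
};

/* Hypothetical stand-in for a page reference count accessor. */
static int mock_page_ref_count(const struct mock_page *page)
{
    assert(page != NULL);
    return page->refcount;
}

/*
 * Sketch of the helper: callers ask "is this page unused?" instead of
 * open coding the comparison against 1 at every call site.
 */
static bool mock_dax_page_unused(const struct mock_page *page)
{
    return mock_page_ref_count(page) == 1;
}
```

A likely benefit of this design is that, if the reference count that marks an idle page ever changes, only the helper body needs updating rather than every caller.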

[PATCH v4 13/13] tools: update test_hmm script to support SP config

2021-07-17 Thread Alex Sierra
Add two more parameters to set spm_addr_dev0 & spm_addr_dev1 addresses. These two parameters configure the start SP addresses for each device in test_hmm driver. Consequently, this configures zone device type as generic. Signed-off-by: Alex Sierra --- tools/testing/selftests/vm/test_hmm.sh | 20

[PATCH v4 11/13] lib: add support for device generic type in test_hmm

2021-07-17 Thread Alex Sierra
Device Generic type uses device memory that is coherently accessible by the CPU. Usually, this is shown as an SP (special purpose) memory range in the BIOS-e820 memory enumeration. If no SP memory is supported in the system, this could be faked by setting CONFIG_EFI_FAKE_MEMMAP. Currently, test_hmm only s

[PATCH v4 02/13] mm: remove extra ZONE_DEVICE struct page refcount

2021-07-17 Thread Alex Sierra
From: Ralph Campbell ZONE_DEVICE struct pages have an extra reference count that complicates the code for put_page() and several places in the kernel that need to check the reference count to see that a page is not being used (gup, compaction, migration, etc.). Clean up the code so the reference

[PATCH v4 12/13] tools: update hmm-test to support device generic type

2021-07-17 Thread Alex Sierra
Test cases such as migrate_fault and migrate_multiple were modified to explicitly migrate from device to sys memory without the need for page faults when using the device generic type. Snapshot test case updated to read the memory device type first and, based on that, get the proper returned results migrate

[PATCH v4 06/13] include/linux/mm.h: helpers to check zone device generic type

2021-07-17 Thread Alex Sierra
Two helpers added. One checks if a zone device page is generic type. The other checks if the page is either private or generic type. Signed-off-by: Alex Sierra --- include/linux/mm.h | 8 1 file changed, 8 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index d8d79bb94be8..f5b247
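The two checks described above might look roughly like the following userspace sketch. Everything here is an assumption for illustration: the snippet does not show the helper names, so `mock_is_device_generic_page()` and `mock_is_device_page()` are hypothetical, and the structs and enum are simplified stand-ins rather than the definitions in include/linux/mm.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's memory type enum (values assumed). */
enum mock_memory_type {
    MOCK_MEMORY_DEVICE_PRIVATE = 1,
    MOCK_MEMORY_DEVICE_FS_DAX,
    MOCK_MEMORY_DEVICE_GENERIC,
};

struct mock_dev_pagemap {
    enum mock_memory_type type;
};

struct mock_page {
    bool zone_device;                /* stand-in for a ZONE_DEVICE check */
    struct mock_dev_pagemap *pgmap;  /* owning device page map, if any */
};

static bool mock_is_zone_device_page(const struct mock_page *page)
{
    assert(page != NULL);
    return page->zone_device && page->pgmap != NULL;
}

/* Hypothetical helper 1: true only for generic-type zone device pages. */
static bool mock_is_device_generic_page(const struct mock_page *page)
{
    return mock_is_zone_device_page(page) &&
           page->pgmap->type == MOCK_MEMORY_DEVICE_GENERIC;
}

/* Hypothetical helper 2: true for private OR generic zone device pages. */
static bool mock_is_device_page(const struct mock_page *page)
{
    return mock_is_zone_device_page(page) &&
           (page->pgmap->type == MOCK_MEMORY_DEVICE_PRIVATE ||
            page->pgmap->type == MOCK_MEMORY_DEVICE_GENERIC);
}
```

Checking the zone device condition first keeps the `pgmap` dereference safe for ordinary pages, which is why both helpers in this sketch share that guard.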

[PATCH v4 10/13] lib: test_hmm add module param for zone device type

2021-07-17 Thread Alex Sierra
In order to configure device generic in test_hmm, two module parameters should be passed, which correspond to the SP start address of each of the two devices: spm_addr_dev0 & spm_addr_dev1. If no parameters are passed, private device type is configured. Signed-off-by: Alex Sierra --- lib/test_hmm.c |
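The selection rule described above can be sketched as a small userspace function. This is a hedged illustration, not the test_hmm code: `mock_select_device_type()` and the enum are hypothetical, zero is assumed to mean "parameter not passed", and requiring both addresses for the generic type is an assumption, since the snippet does not say what happens when only one is given.

```c
#include <assert.h>

/* Hypothetical stand-in for the zone device type chosen by the module. */
enum mock_zone_device_type {
    MOCK_ZONE_DEVICE_PRIVATE,
    MOCK_ZONE_DEVICE_GENERIC,
};

/*
 * Assumption: an address of 0 means the module parameter was not passed.
 * Assumption: both SP start addresses are needed to configure generic.
 */
enum mock_zone_device_type
mock_select_device_type(unsigned long spm_addr_dev0,
                        unsigned long spm_addr_dev1)
{
    if (spm_addr_dev0 != 0 && spm_addr_dev1 != 0)
        return MOCK_ZONE_DEVICE_GENERIC;
    return MOCK_ZONE_DEVICE_PRIVATE;
}

/* Small usage example with made-up addresses. */
void mock_check_selection(void)
{
    assert(mock_select_device_type(0, 0) == MOCK_ZONE_DEVICE_PRIVATE);
    assert(mock_select_device_type(0x100000000UL, 0x140000000UL) ==
           MOCK_ZONE_DEVICE_GENERIC);
}
```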

Re: [PATCH v3 0/8] Support DEVICE_GENERIC memory in migrate_vma_*

2021-07-17 Thread Sierra Guiza, Alejandro (Alex)
On 7/16/2021 5:14 PM, Felix Kuehling wrote: On 2021-07-16 at 11:07 a.m., Theodore Y. Ts'o wrote: On Wed, Jun 23, 2021 at 05:49:55PM -0400, Felix Kuehling wrote: I can think of two ways to test the changes for MEMORY_DEVICE_GENERIC in this patch series in a way that is reproducible without spe