On 5/5/26 10:09, Alistair Popple wrote:

> Thanks for doing this work Mika. I've been meaning to take a look at this series
> for a while. I'm currently at LSFMM but will try and take a look this week or
> next as it sounds quite useful.
>
>  - Alistair

Thanks, Alistair, and no problem. I appreciate your insights whenever you have time.

--Mika

>
> On 2026-05-05 at 15:16 +1000, [email protected] wrote...
>> From: Mika Penttilä <[email protected]>
>>
>> Currently, the way device page faulting and migration work
>> is not optimal if you want to do both fault handling and
>> migration at once.
>>
>> Being able to migrate non-present pages (or pages mapped with incorrect
>> permissions, e.g. COW) to the GPU requires doing one of the
>> following sequences:
>>
>> 1. hmm_range_fault() - fault in non-present pages with correct permissions, etc.
>> 2. migrate_vma_*() - migrate the pages
>>
>> Or:
>>
>> 1. migrate_vma_*() - migrate present pages
>> 2. If non-present pages detected by migrate_vma_*():
>>    a) call hmm_range_fault() to fault pages in
>>    b) call migrate_vma_*() again to migrate now present pages
>>
>> The problem with the first sequence is that you always have to do two
>> page walks, even though most of the time the pages are present or zero-page
>> mappings, so the common case takes a performance hit.
>>
>> The second sequence is better for the common case, but far worse if
>> pages aren't present, because now you have to walk the page tables three
>> times (once to find the page is not present, once so hmm_range_fault()
>> can find a non-present page to fault in, and once again to set up the
>> migration). It is also tricky to code correctly. One page table walk
>> can cost over 1000 CPU cycles on x86-64, which is a significant hit.
>>
>> We should be able to walk the page table once, faulting
>> pages in as required and replacing them with migration entries if
>> requested.
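
The walk counts described above can be modelled in a few lines of userspace C. This is only an illustrative sketch, not kernel code: the function names and the one-bool-per-page representation are assumptions made for the example, and it counts page table walks instead of calling hmm_range_fault() or migrate_vma_*().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: present[i] is true if page i is already mapped. */
static bool any_missing(const bool *present, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!present[i])
			return true;
	return false;
}

/* First sequence: fault everything in, then migrate. Always two walks. */
unsigned int walks_fault_then_migrate(const bool *present, size_t n)
{
	(void)present;
	(void)n;
	return 2;
}

/*
 * Second sequence: migrate first. If a non-present page is found,
 * fault it in and migrate again, for three walks in total.
 */
unsigned int walks_migrate_then_fault(const bool *present, size_t n)
{
	return any_missing(present, n) ? 3 : 1;
}

/* Proposed behaviour: fault and set up migration in a single walk. */
unsigned int walks_combined(const bool *present, size_t n)
{
	(void)present;
	(void)n;
	return 1;
}
```

For a range in which at least one page is not present, the three functions return 2, 3 and 1. For a fully present range they return 2, 1 and 1.
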
>>
>> Add a new flag to the HMM APIs, HMM_PFN_REQ_MIGRATE,
>> which tells HMM to also prepare for migration during fault handling.
>> Also, for the migrate_vma_setup() call paths, a new flag, MIGRATE_VMA_FAULT,
>> is added to request fault handling as part of migration.
>>
>> One extra benefit of migrating with the hmm_range_fault() path
>> is that migrate_vma.vma gets populated, so there is no need to
>> retrieve it separately.
>>
>> Tested in an x86-64 VM with the HMM test device, passing the selftests.
>> For performance, the migrate throughput tests from the selftests
>> show similar numbers (within error margin) to the unmodified kernel.
>> Also tested rebased on the
>> "Remove device private pages from physical address space" series:
>> https://lore.kernel.org/linux-mm/[email protected]/
>> plus a small adjustment patch, with no problems.
>>
>> Changes v8-v9:
>>   - rebase on drm-tip
>>   - fixed UAF around migrate_vma_split_folio() usage
>>   - added missing pmd unlock
>>
>> Changes v7-v8:
>>   - rebase on 7.0
>>   - fixed subject in two patches
>>   - enhanced commit messages
>>   - squashed patch 6 into patch 4 to fix kernel test robot warning
>>   - readded dropped Cc block from cover letter
>>   - fixed white space
>>
>> Changes v6-v7:
>>   - rebase on 7.0.0-rc6
>>   - added documentation and comments
>>   - denote to be migrated zero page as HMM_PFN_MIGRATE alone
>>   - got rid of HMM_PFN_INOUT_FLAGS movement in patch 2
>>   - picked up Acked-by from David for patch 1
>>
>> Changes v5-v6:
>>   - rebase on 7.0.0-rc4
>>   - use range based TLB flushing while unmapping ptes
>>   - gate migration behind HMM_PFN_REQ_MIGRATE for fault and
>>     migrate paths
>>   - always infer migration flags from migrate->flags only
>>
>> Changes v4-v5:
>>   - rebase on 6.19
>>   - fixed David's email address
>>   - fixed link issue without CONFIG_TRANSPARENT_HUGEPAGE
>>   - refactored into smaller commits
>>   - added more comments to code
>>
>> Changes v3-v4:
>>   - rebase on 6.19-rc8
>>   - fixed issues found by kernel test robot with random configs
>>   - fixed typos
>>
>> Changes v2-v3:
>>   - rebase on 6.19-rc7
>>   - fixed issues found by kernel test robot
>>   - fixed smatch issues reported by Dan Carpenter <[email protected]>
>>   - fixes to lock handling (pmd/pte) on errors
>>   - added assertions for pmd/pte lock states
>>   - other issues discovered by Matthew, thanks!
>>
>> Changes v1-v2:
>>   - rebase on 6.19-rc6
>>   - fixed issues found by kernel test robot
>>   - fixed locking (pmd/ptl) to cover handle_ and prepare_ regions
>>     parts if migrating
>>   - other issues discovered by Matthew, thanks!
>>
>> Changes RFC-v1:
>>   - rebase on 6.19-rc5
>>   - adjust for the device THP
>>   - changes from feedback
>>
>> Revisions:
>>   - RFC: https://lore.kernel.org/linux-mm/[email protected]/
>>   - v1: https://lore.kernel.org/all/[email protected]/
>>   - v2: https://lore.kernel.org/all/[email protected]/
>>   - v3: https://lore.kernel.org/all/[email protected]/
>>   - v4: https://lore.kernel.org/all/[email protected]/
>>   - v5: https://lore.kernel.org/linux-mm/[email protected]/
>>   - v6: https://lore.kernel.org/linux-mm/[email protected]/
>>   - v7: https://lore.kernel.org/linux-mm/[email protected]/
>>   - v8: https://lore.kernel.org/linux-mm/[email protected]/
>>
>> Cc: David Hildenbrand <[email protected]>
>> Cc: Jason Gunthorpe <[email protected]>
>> Cc: Leon Romanovsky <[email protected]>
>> Cc: Alistair Popple <[email protected]>
>> Cc: Balbir Singh <[email protected]>
>> Cc: Zi Yan <[email protected]>
>> Cc: Matthew Brost <[email protected]>
>> Cc: Andrew Morton <[email protected]>
>> Cc: Lorenzo Stoakes <[email protected]>
>> Cc: "Liam R. Howlett" <[email protected]>
>> Cc: Vlastimil Babka <[email protected]>
>> Cc: Mike Rapoport <[email protected]>
>> Cc: Suren Baghdasaryan <[email protected]>
>> Cc: Michal Hocko <[email protected]>
>>
>> Mika Penttilä (5):
>>   mm/Kconfig: changes for migrate on fault for device pages
>>   mm: Add helper to convert HMM pfn to migrate pfn
>>   mm/hmm: do the plumbing for HMM to participate in migration
>>   mm: setup device page migration in HMM pagewalk
>>   lib/test_hmm: add a new testcase for the migrate on fault
>>
>>  include/linux/hmm.h                    |  19 +-
>>  include/linux/migrate.h                |  26 +-
>>  lib/test_hmm.c                         | 101 ++-
>>  lib/test_hmm_uapi.h                    |  19 +-
>>  mm/Kconfig                             |   2 +
>>  mm/hmm.c                               | 835 +++++++++++++++++++++++--
>>  mm/migrate_device.c                    | 583 +++--------------
>>  tools/testing/selftests/mm/hmm-tests.c |  54 ++
>>  8 files changed, 1066 insertions(+), 573 deletions(-)
>>
>> drm-tip
>> base-commit: 94d56a898a2db27f841b17f6966a81ba502fe63c
>> -- 
>> 2.50.0
>>
