[PATCH] mm/hmm: avoid bloating arch that do not make use of HMM

2017-08-17 Thread jglisse
From: Jérôme Glisse This moves all new code, including the new page migration helper, behind a kernel Kconfig option so that there is no code bloat for architectures or users that do not want to use HMM or any of its associated features. arm allyesconfig (without all the patchset, then with and this patch): te

[PATCH 0/2] Optimize mmu_notifier->invalidate_range callback

2017-10-16 Thread jglisse
From: Jérôme Glisse (Andrew, you already have v1 of patch 1 in your queue; patch 2 is new. I think you can drop patch 1 v1 in favor of v2; v2 is a bit more conservative and I fixed typos.) All this only affects users of the invalidate_range callback (at this time CAPI arch/powerpc/platforms/powernv/npu-dma.c

[PATCH 1/2] mm/mmu_notifier: avoid double notification when it is useless v2

2017-10-16 Thread jglisse
From: Jérôme Glisse This patch only affects users of mmu_notifier->invalidate_range callback which are device drivers related to ATS/PASID, CAPI, IOMMUv2, SVM ... and it is an optimization for those users. Everyone else is unaffected by it. When clearing a pte/pmd we are given a choice to notify

[PATCH 2/2] mm/mmu_notifier: avoid call to invalidate_range() in range_end()

2017-10-16 Thread jglisse
From: Jérôme Glisse This is an optimization patch that only affects mmu_notifier users which rely on the invalidate_range() callback. This patch avoids calling that callback twice in a row from inside __mmu_notifier_invalidate_range_end. Existing pattern (before this patch): mmu_notifier_inval

[PATCH 4/7] mm/hmm: properly handle migration pmd v2

2018-08-29 Thread jglisse
From: Jérôme Glisse Before this patch, a migration pmd entry (!pmd_present()) would have been treated as a bad entry (pmd_bad() returns true on a migration pmd entry). The outcome was that the device driver would believe that the range covered by the pmd was bad and would either SIGBUS or simply kill all

[PATCH 3/7] mm/rmap: map_pte() was not handling private ZONE_DEVICE page properly v2

2018-08-30 Thread jglisse
From: Ralph Campbell Private ZONE_DEVICE pages use a special pte entry and thus are not present. Properly handle this case in map_pte(); it is already handled in check_pte(), and the map_pte() part was most probably lost in some rebase. Without this patch the slow migration path cannot migrate back

[PATCH 2/7] mm/rmap: map_pte() was not handling private ZONE_DEVICE page properly

2018-08-24 Thread jglisse
From: Ralph Campbell Private ZONE_DEVICE pages use a special pte entry and thus are not present. Properly handle this case in map_pte(); it is already handled in check_pte(), and the map_pte() part was most probably lost in some rebase. Without this patch the slow migration path cannot migrate back

[PATCH 7/7] mm/hmm: proper support for blockable mmu_notifier

2018-08-24 Thread jglisse
From: Jérôme Glisse When mmu_notifier calls invalidate_range_start callback with blockable set to false we should not sleep. Properly propagate this to HMM users. Signed-off-by: Jérôme Glisse Cc: Michal Hocko Cc: Ralph Campbell Cc: John Hubbard Cc: Andrew Morton --- include/linux/hmm.h | 1

[PATCH 1/7] mm/hmm: fix utf8 ...

2018-08-24 Thread jglisse
From: Jérôme Glisse Somehow utf-8 must have been broken. Signed-off-by: Jérôme Glisse Cc: Andrew Morton --- include/linux/hmm.h | 2 +- mm/hmm.c| 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/include/linux/hmm.h b/include/linux/hmm.h index 4c92e3ba3e16..1ff4

[PATCH 6/7] mm/hmm: invalidate device page table at start of invalidation

2018-08-24 Thread jglisse
From: Jérôme Glisse Invalidate the device page table at the start of invalidation, and invalidate in-progress CPU page table snapshotting at both the start and end of any invalidation. This is helpful when the device needs to dirty a page because the device page table reports the page as dirty. Dirtying page must ha

[PATCH 3/7] mm/hmm: fix race between hmm_mirror_unregister() and mmu_notifier callback

2018-08-24 Thread jglisse
From: Ralph Campbell In hmm_mirror_unregister(), mm->hmm is set to NULL and then mmu_notifier_unregister_no_release() is called. That creates a small window where mmu_notifier can call mmu_notifier_ops with mm->hmm equal to NULL. Fix this by first unregistering mmu notifier callbacks and then set

[PATCH 4/7] mm/hmm: properly handle migration pmd

2018-08-24 Thread jglisse
From: Jérôme Glisse Before this patch, a migration pmd entry (!pmd_present()) would have been treated as a bad entry (pmd_bad() returns true on a migration pmd entry). The outcome was that the device driver would believe that the range covered by the pmd was bad and would either SIGBUS or simply kill all

[PATCH 5/7] mm/hmm: use a structure for update callback parameters

2018-08-24 Thread jglisse
From: Jérôme Glisse Use a structure to gather all the parameters for the update callback. This makes it easier to add new parameters by avoiding having to update every callback function signature. Signed-off-by: Jérôme Glisse Cc: Ralph Campbell Cc: John Hubbard Cc: Andrew Morton --- inclu
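
The idea in the entry above, passing one parameter structure to a callback rather than a growing argument list, can be sketched in plain user-space C. This is only an illustration: `struct update_params`, `update_cb`, `count_bytes_cb` and `dispatch_update` are hypothetical names invented for this sketch and are not the identifiers used by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical parameter bundle; field names are illustrative only. */
struct update_params {
    unsigned long start;   /* first address of the range (inclusive) */
    unsigned long end;     /* end of the range (exclusive) */
    bool blockable;        /* whether the callback is allowed to sleep */
};

/* The callback signature stays the same when fields are added above. */
typedef int (*update_cb)(void *ctx, const struct update_params *p);

/* Example callback: accumulates the number of bytes covered. */
static int count_bytes_cb(void *ctx, const struct update_params *p)
{
    unsigned long *total = ctx;

    assert(p->end >= p->start);
    *total += p->end - p->start;
    return 0;
}

/* Hypothetical dispatcher: calls each callback, stops on first error. */
static int dispatch_update(update_cb *cbs, size_t n, void *ctx,
                           const struct update_params *p)
{
    for (size_t i = 0; i < n; i++) {
        int ret = cbs[i](ctx, p);

        if (ret)
            return ret;
    }
    return 0;
}
```

Adding a new field to the structure later leaves every existing callback definition untouched, which is the maintenance benefit the snippet describes.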

[PATCH 0/7] HMM updates, improvements and fixes

2018-08-24 Thread jglisse
From: Jérôme Glisse A few fixes that only affect HMM users. Improve the synchronization callback so that we match what other mmu_notifier listeners do, and add proper support for the new blockable flag in the process. For curious folks here are branches to leverage HMM in various existing device dri

[PATCH] mm/hmm: fix uninitialized use of 'entry' in hmm_vma_walk_pmd()

2018-01-22 Thread jglisse
From: Ralph Campbell The variable 'entry' is used before being initialized in hmm_vma_walk_pmd() Signed-off-by: Ralph Campbell Signed-off-by: Jérôme Glisse --- mm/hmm.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/mm/hmm.c b/mm/hmm.c index ea19742a5d60..979211c7ccc8

[PATCH 1/2] mm/hmm: do not ignore specific pte fault flag in hmm_vma_fault()

2018-03-26 Thread jglisse
From: Ralph Campbell Save requested fault flags from caller supplied pfns array before overwriting it with the special none value. Without this we would not fault on all cases requested by caller, leading to caller calling us in a loop unless something else did change the CPU page table. Signed-

[PATCH 0/2] Small HMM fixes

2018-03-26 Thread jglisse
From: Jérôme Glisse Two small fixes on top of what I already sent. The first one fixes a real dumb mistake (mine). The second one fixes the fault logic to be consistent with respect to all combinations. Not cc-ing stable for lack of a current upstream user. Kudos to Ralph for catching those. Cc: Ralph Campbell C

[PATCH 2/2] mm/hmm: clarify fault logic for device private memory

2018-03-26 Thread jglisse
From: Ralph Campbell For device private memory, callers of hmm_vma_fault() want to be able to carefully control fault behavior. Update the logic to only fault on a device private entry if explicitly requested. Before this patch a read only device private CPU page table entry would fault if caller reques

[PATCH 00/15] hmm: fixes and documentations v3

2018-03-19 Thread jglisse
From: Jérôme Glisse Added a patch to fix the zombie mm_struct (missing call to mmu notifier unregister); this was lost in translation at some point. Included all typo fixes and comments received so far (and even more typo fixes). Added more comments. Updated individual patch versions to reflect changes. B

[PATCH 02/15] mm/hmm: fix header file if/else/endif maze v2

2018-03-19 Thread jglisse
From: Jérôme Glisse The #if/#else/#endif for IS_ENABLED(CONFIG_HMM) were wrong. Because of this, after multiple inclusions there were multiple definitions of both hmm_mm_init() and hmm_mm_destroy(), leading to a build failure if HMM was enabled (CONFIG_HMM set). Changed since v1: - Fix the maze when CO

[PATCH 04/15] mm/hmm: unregister mmu_notifier when last HMM client quit

2018-03-19 Thread jglisse
From: Jérôme Glisse This code was lost in translation at one point. This properly calls mmu_notifier_unregister_no_release() once the last user is gone. This fixes the zombie mm_struct, as without this patch we do not drop the refcount we have on it. Signed-off-by: Jérôme Glisse Cc: Evgeny Baskakov Cc

[PATCH 03/15] mm/hmm: HMM should have a callback before MM is destroyed v2

2018-03-19 Thread jglisse
From: Ralph Campbell The hmm_mirror_register() function registers a callback for when the CPU pagetable is modified. Normally, the device driver will call hmm_mirror_unregister() when the process using the device is finished. However, if the process exits uncleanly, the struct_mm can be destroyed

[PATCH 06/15] mm/hmm: use struct for hmm_vma_fault(), hmm_vma_get_pfns() parameters v2

2018-03-19 Thread jglisse
From: Jérôme Glisse Both hmm_vma_fault() and hmm_vma_get_pfns() were taking a hmm_range struct as parameter and were initializing that struct with others of their parameters. Have callers of those functions do this, as they are likely to do so already, and only pass this struct to both functions; this sho

[PATCH 07/15] mm/hmm: remove HMM_PFN_READ flag and ignore peculiar architecture v2

2018-03-19 Thread jglisse
From: Jérôme Glisse Only peculiar architectures allow write without read, thus assume that any valid pfn allows read. Note we do not care for write only because it does make sense with things like atomic compare and exchange or any other operations that allow you to get the memory value throug

[PATCH 09/15] mm/hmm: cleanup special vma handling (VM_SPECIAL)

2018-03-19 Thread jglisse
From: Jérôme Glisse Special vma (one with any of the VM_SPECIAL flags) cannot be accessed by a device because there is no consistent model across device drivers for those vma and their backing memory. This patch directly uses the hmm_range struct for the hmm_pfns_special() argument as it is always affecting

[PATCH 08/15] mm/hmm: use uint64_t for HMM pfn instead of defining hmm_pfn_t to ulong v2

2018-03-19 Thread jglisse
From: Jérôme Glisse All device drivers we care about use 64-bit page table entries. In order to match this and to avoid a useless define, convert all HMM pfns to directly use uint64_t. It is a first step on the road to allowing drivers to directly use the pfn values returned by HMM (saving memory and CPU cyc

[PATCH 10/15] mm/hmm: do not differentiate between empty entry or missing directory v2

2018-03-19 Thread jglisse
From: Jérôme Glisse There is no point in differentiating between a range for which there is not even a directory (and thus entries) and an empty entry (pte_none() or pmd_none() returns true). Simply drop the distinction, i.e. remove the HMM_PFN_EMPTY flag and merge the now duplicate hmm_vma_walk_hole() and hm

[PATCH 14/15] mm/hmm: change hmm_vma_fault() to allow write fault on page basis

2018-03-19 Thread jglisse
From: Jérôme Glisse This changes hmm_vma_fault() to not take a global write fault flag for a range but instead rely on the caller to populate the HMM pfns array with the proper fault flags, i.e. HMM_PFN_VALID if the driver wants a read fault for that address, or HMM_PFN_VALID and HMM_PFN_WRITE for a write fault. Moreover by set
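
The per-page request scheme described in the entry above (a valid flag for a read fault, valid plus write for a write fault) can be sketched as follows. The bit values and the `SKETCH_` names are assumptions made for illustration, not the definitions from include/linux/hmm.h, and `request_faults` is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit values only; the real flags live in include/linux/hmm.h. */
#define SKETCH_PFN_VALID (1ULL << 0)
#define SKETCH_PFN_WRITE (1ULL << 1)

/*
 * Hypothetical helper: fill the pfns array with one fault request per page
 * instead of passing a single write flag for the whole range.
 */
static void request_faults(uint64_t *pfns, size_t npages,
                           const bool *want_write)
{
    assert(pfns != NULL && want_write != NULL);

    for (size_t i = 0; i < npages; i++) {
        pfns[i] = SKETCH_PFN_VALID;          /* read fault requested */
        if (want_write[i])
            pfns[i] |= SKETCH_PFN_WRITE;     /* upgrade to write fault */
    }
}
```

A caller would fill the array this way before handing the range to the fault function, so pages that only need read access are not write-faulted.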

[PATCH 11/15] mm/hmm: rename HMM_PFN_DEVICE_UNADDRESSABLE to HMM_PFN_DEVICE_PRIVATE

2018-03-19 Thread jglisse
From: Jérôme Glisse Make naming consistent across code, DEVICE_PRIVATE is the name used outside HMM code so use that one. Signed-off-by: Jérôme Glisse Reviewed-by: John Hubbard Cc: Evgeny Baskakov Cc: Ralph Campbell Cc: Mark Hairgrove --- include/linux/hmm.h | 4 ++-- mm/hmm.c|

[PATCH 12/15] mm/hmm: move hmm_pfns_clear() closer to where it is use

2018-03-19 Thread jglisse
From: Jérôme Glisse Move hmm_pfns_clear() closer to where it is used to make it clear it is not used by page table walkers. Signed-off-by: Jérôme Glisse Reviewed-by: John Hubbard Cc: Evgeny Baskakov Cc: Ralph Campbell Cc: Mark Hairgrove --- mm/hmm.c | 16 1 file changed, 8 i

[PATCH 13/15] mm/hmm: factor out pte and pmd handling to simplify hmm_vma_walk_pmd()

2018-03-19 Thread jglisse
From: Jérôme Glisse No functional change, just create one function to handle pmd and one to handle pte (hmm_vma_handle_pmd() and hmm_vma_handle_pte()). Signed-off-by: Jérôme Glisse Cc: Evgeny Baskakov Cc: Ralph Campbell Cc: Mark Hairgrove Cc: John Hubbard --- mm/hmm.c | 174 +++

[PATCH 15/15] mm/hmm: use device driver encoding for HMM pfn v2

2018-03-19 Thread jglisse
From: Jérôme Glisse Users of hmm_vma_fault() and hmm_vma_get_pfns() provide a flags array and a pfn shift value, allowing them to define their own encoding for the HMM pfns that are filled into the pfns array of the hmm_range struct. With this, device drivers can get pfns that match their own private encodin
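
A driver-defined encoding built from a flags array and a pfn shift, as the entry above describes, might look like the sketch below. `struct sketch_range`, the `SKETCH_FLAG_*` indices and both helpers are hypothetical; only the general idea (driver-supplied flag bits plus a shift) comes from the snippet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical indices into the driver-supplied flags array. */
enum { SKETCH_FLAG_VALID, SKETCH_FLAG_WRITE, SKETCH_FLAG_MAX };

/* Hypothetical stand-in for the encoding-related fields of a range. */
struct sketch_range {
    const uint64_t *flags;  /* bit pattern the driver uses for each flag */
    uint8_t pfn_shift;      /* how far the pfn is shifted above the flags */
};

/* Build one entry in the driver's own format. */
static uint64_t encode_pfn(const struct sketch_range *r, uint64_t pfn,
                           bool writable)
{
    uint64_t entry;

    assert(r->pfn_shift < 64);
    entry = (pfn << r->pfn_shift) | r->flags[SKETCH_FLAG_VALID];
    if (writable)
        entry |= r->flags[SKETCH_FLAG_WRITE];
    return entry;
}

/* Recover the pfn from an encoded entry. */
static uint64_t decode_pfn(const struct sketch_range *r, uint64_t entry)
{
    return entry >> r->pfn_shift;
}
```

Because the flag bits and shift come from the driver, the resulting entries can be laid out to match the device's page table format, avoiding a conversion pass.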

[PATCH 05/15] mm/hmm: hmm_pfns_bad() was accessing wrong struct

2018-03-19 Thread jglisse
From: Jérôme Glisse The private field of the mm_walk struct points to an hmm_vma_walk struct and not to the hmm_range struct desired. Fix to get the proper struct pointer. Signed-off-by: Jérôme Glisse Cc: sta...@vger.kernel.org Cc: Evgeny Baskakov Cc: Ralph Campbell Cc: Mark Hairgrove Cc: John Hubbar

[PATCH 01/15] mm/hmm: documentation editorial update to HMM documentation

2018-03-19 Thread jglisse
From: Ralph Campbell This patch updates the documentation for HMM to fix minor typos and phrasing to be a bit more readable. Signed-off-by: Ralph Campbell Signed-off-by: Jérôme Glisse Cc: Stephen Bates Cc: Jason Gunthorpe Cc: Logan Gunthorpe Cc: Evgeny Baskakov Cc: Mark Hairgrove Cc: Joh

[PATCH 04/15] mm/hmm: unregister mmu_notifier when last HMM client quit v2

2018-03-21 Thread jglisse
From: Jérôme Glisse This code was lost in translation at one point. This properly calls mmu_notifier_unregister_no_release() once the last user is gone. This fixes the zombie mm_struct, as without this patch we do not drop the refcount we have on it. Changed since v1: - close race window between a las

[PATCH 03/15] mm/hmm: HMM should have a callback before MM is destroyed v3

2018-03-21 Thread jglisse
From: Ralph Campbell The hmm_mirror_register() function registers a callback for when the CPU pagetable is modified. Normally, the device driver will call hmm_mirror_unregister() when the process using the device is finished. However, if the process exits uncleanly, the struct_mm can be destroyed

[PATCH 04/15] mm/hmm: unregister mmu_notifier when last HMM client quit v3

2018-03-21 Thread jglisse
From: Jérôme Glisse This code was lost in translation at one point. This properly calls mmu_notifier_unregister_no_release() once the last user is gone. This fixes the zombie mm_struct, as without this patch we do not drop the refcount we have on it. Changed since v1: - close race window between a las

[PATCH 09/15] mm/hmm: cleanup special vma handling (VM_SPECIAL)

2018-03-22 Thread jglisse
From: Jérôme Glisse Special vma (one with any of the VM_SPECIAL flags) cannot be accessed by a device because there is no consistent model across device drivers for those vma and their backing memory. This patch directly uses the hmm_range struct for the hmm_pfns_special() argument as it is always affecting

[PATCH 12/15] mm/hmm: move hmm_pfns_clear() closer to where it is use

2018-03-22 Thread jglisse
From: Jérôme Glisse Move hmm_pfns_clear() closer to where it is used to make it clear it is not used by page table walkers. Signed-off-by: Jérôme Glisse Reviewed-by: John Hubbard Cc: Evgeny Baskakov Cc: Ralph Campbell Cc: Mark Hairgrove --- mm/hmm.c | 16 1 file changed, 8 i

[PATCH 10/15] mm/hmm: do not differentiate between empty entry or missing directory v3

2018-03-22 Thread jglisse
From: Jérôme Glisse There is no point in differentiating between a range for which there is not even a directory (and thus entries) and an empty entry (pte_none() or pmd_none() returns true). Simply drop the distinction, i.e. remove the HMM_PFN_EMPTY flag and merge the now duplicate hmm_vma_walk_hole() and hm

[PATCH 07/15] mm/hmm: remove HMM_PFN_READ flag and ignore peculiar architecture v2

2018-03-22 Thread jglisse
From: Jérôme Glisse Only peculiar architectures allow write without read, thus assume that any valid pfn allows read. Note we do not care for write only because it does make sense with things like atomic compare and exchange or any other operations that allow you to get the memory value throug

[PATCH 08/15] mm/hmm: use uint64_t for HMM pfn instead of defining hmm_pfn_t to ulong v2

2018-03-22 Thread jglisse
From: Jérôme Glisse All device drivers we care about use 64-bit page table entries. In order to match this and to avoid a useless define, convert all HMM pfns to directly use uint64_t. It is a first step on the road to allowing drivers to directly use the pfn values returned by HMM (saving memory and CPU cyc

[PATCH 11/15] mm/hmm: rename HMM_PFN_DEVICE_UNADDRESSABLE to HMM_PFN_DEVICE_PRIVATE

2018-03-22 Thread jglisse
From: Jérôme Glisse Make naming consistent across code, DEVICE_PRIVATE is the name used outside HMM code so use that one. Signed-off-by: Jérôme Glisse Reviewed-by: John Hubbard Cc: Evgeny Baskakov Cc: Ralph Campbell Cc: Mark Hairgrove --- include/linux/hmm.h | 4 ++-- mm/hmm.c|

[PATCH 14/15] mm/hmm: change hmm_vma_fault() to allow write fault on page basis

2018-03-22 Thread jglisse
From: Jérôme Glisse This changes hmm_vma_fault() to not take a global write fault flag for a range but instead rely on the caller to populate the HMM pfns array with the proper fault flags, i.e. HMM_PFN_VALID if the driver wants a read fault for that address, or HMM_PFN_VALID and HMM_PFN_WRITE for a write fault. Moreover by set

[PATCH 06/15] mm/hmm: use struct for hmm_vma_fault(), hmm_vma_get_pfns() parameters v2

2018-03-22 Thread jglisse
From: Jérôme Glisse Both hmm_vma_fault() and hmm_vma_get_pfns() were taking a hmm_range struct as parameter and were initializing that struct with others of their parameters. Have callers of those functions do this, as they are likely to do so already, and only pass this struct to both functions; this sho

[PATCH 15/15] mm/hmm: use device driver encoding for HMM pfn v2

2018-03-22 Thread jglisse
From: Jérôme Glisse Users of hmm_vma_fault() and hmm_vma_get_pfns() provide a flags array and a pfn shift value, allowing them to define their own encoding for the HMM pfns that are filled into the pfns array of the hmm_range struct. With this, device drivers can get pfns that match their own private encodin

[PATCH 13/15] mm/hmm: factor out pte and pmd handling to simplify hmm_vma_walk_pmd() v2

2018-03-22 Thread jglisse
From: Jérôme Glisse No functional change, just create one function to handle pmd and one to handle pte (hmm_vma_handle_pmd() and hmm_vma_handle_pte()). Changed since v1: - s/pfns/pfn for pte as in that case we are dealing with a single pfn Signed-off-by: Jérôme Glisse Reviewed-by: John Hubba

[PATCH 03/15] mm/hmm: HMM should have a callback before MM is destroyed v3

2018-03-22 Thread jglisse
From: Ralph Campbell The hmm_mirror_register() function registers a callback for when the CPU pagetable is modified. Normally, the device driver will call hmm_mirror_unregister() when the process using the device is finished. However, if the process exits uncleanly, the struct_mm can be destroyed

[PATCH 00/15] hmm: fixes and documentations v4

2018-03-22 Thread jglisse
From: Jérôme Glisse Fixes and improvements to HMM that only impact HMM users. Changes since the last posting: Allow the release callback to wait on any device driver workqueue which is processing faults without having to worry about deadlock. Some drivers do call migrate_vma() from their page fault workqueue w

[PATCH 02/15] mm/hmm: fix header file if/else/endif maze v2

2018-03-22 Thread jglisse
From: Jérôme Glisse The #if/#else/#endif for IS_ENABLED(CONFIG_HMM) were wrong. Because of this, after multiple inclusions there were multiple definitions of both hmm_mm_init() and hmm_mm_destroy(), leading to a build failure if HMM was enabled (CONFIG_HMM set). Changed since v1: - Fix the maze when CO

[PATCH 04/15] mm/hmm: unregister mmu_notifier when last HMM client quit v3

2018-03-22 Thread jglisse
From: Jérôme Glisse This code was lost in translation at one point. This properly calls mmu_notifier_unregister_no_release() once the last user is gone. This fixes the zombie mm_struct, as without this patch we do not drop the refcount we have on it. Changed since v1: - close race window between a las

[PATCH 05/15] mm/hmm: hmm_pfns_bad() was accessing wrong struct

2018-03-22 Thread jglisse
From: Jérôme Glisse The private field of the mm_walk struct points to an hmm_vma_walk struct and not to the hmm_range struct desired. Fix to get the proper struct pointer. Signed-off-by: Jérôme Glisse Cc: sta...@vger.kernel.org Cc: Evgeny Baskakov Cc: Ralph Campbell Cc: Mark Hairgrove Cc: John Hubbar

[PATCH 01/15] mm/hmm: documentation editorial update to HMM documentation

2018-03-22 Thread jglisse
From: Ralph Campbell This patch updates the documentation for HMM to fix minor typos and phrasing to be a bit more readable. Signed-off-by: Ralph Campbell Signed-off-by: Jérôme Glisse Cc: Stephen Bates Cc: Jason Gunthorpe Cc: Logan Gunthorpe Cc: Evgeny Baskakov Cc: Mark Hairgrove Cc: Joh

[RFC PATCH 0/3] mmu_notifier contextual information

2018-03-23 Thread jglisse
From: Jérôme Glisse This patchset contains the improvements to mmu_notifier I wish to discuss at the next LSF/MM. I am sending it now to give people time to look at them and think about them. git://people.freedesktop.org/~glisse/linux mmu-notifier-rfc https://cgit.freedesktop.org/~glisse/linux/log/?h=mmu

[RFC PATCH 1/3] mm/mmu_notifier: use struct for invalidate_range_start/end parameters

2018-03-23 Thread jglisse
From: Jérôme Glisse Using a struct for mmu_notifier_invalidate_range_start()|end() allows adding more parameters in the future without having to change every call site or every callback. There is no functional change with this patch. Signed-off-by: Jérôme Glisse Cc: David Rientjes Cc: Joerg R

[RFC PATCH 3/3] mm/mmu_notifier: keep track of ranges being invalidated

2018-03-23 Thread jglisse
From: Jérôme Glisse This keeps a list of all virtual address ranges being invalidated (i.e. inside a mmu_notifier_invalidate_range_start/end section). Also add a helper to check if a range is undergoing such invalidation. With this it is easy for a concurrent thread to ignore invalidation that do not
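
The bookkeeping described in the entry above, a list of ranges currently between start and end plus a helper that tests a range against it, can be sketched in user-space C. Everything here is hypothetical (`struct inv_range`, `inv_begin`, `inv_end`, `range_under_invalidation`), ranges are assumed half-open, and the locking a concurrent kernel implementation would need is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one range inside a start/end section. */
struct inv_range {
    unsigned long start;      /* inclusive */
    unsigned long end;        /* exclusive */
    struct inv_range *next;
};

/* Called at invalidation start: push the range onto the list. */
static void inv_begin(struct inv_range **head, struct inv_range *r)
{
    assert(r->start < r->end);
    r->next = *head;
    *head = r;
}

/* Called at invalidation end: unlink the range if it is on the list. */
static void inv_end(struct inv_range **head, struct inv_range *r)
{
    for (struct inv_range **pp = head; *pp; pp = &(*pp)->next) {
        if (*pp == r) {
            *pp = r->next;
            return;
        }
    }
}

/* Helper: does [start, end) overlap any range being invalidated? */
static bool range_under_invalidation(const struct inv_range *head,
                                     unsigned long start, unsigned long end)
{
    for (const struct inv_range *r = head; r; r = r->next) {
        if (r->start < end && start < r->end)
            return true;
    }
    return false;
}
```

A concurrent thread could call the helper for the range it cares about and skip work only when the answer is false; a real implementation would have to hold a lock or use another synchronization scheme around the list.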

[RFC PATCH 2/3] mm/mmu_notifier: provide context information about range invalidation

2018-03-23 Thread jglisse
From: Jérôme Glisse This patch just adds the information; it does not introduce any optimization, thus there is no functional change with this patch. The mmu_notifier callback for range invalidation happens for a number of reasons. Provide some context information to the callback to allow for optim

[PATCH] Documentation/vm/hmm.txt: typos and syntaxes fixes

2018-04-09 Thread jglisse
From: Jérôme Glisse This fixes typos and syntax, thanks to Randy Dunlap for pointing them out (they were all my fault). Signed-off-by: Jérôme Glisse Cc: Randy Dunlap Cc: Ralph Campbell Cc: Andrew Morton --- Documentation/vm/hmm.txt | 108 +++ 1 f

[PATCH v2 0/3] mmu notifier contextual informations

2018-12-04 Thread jglisse
From: Jérôme Glisse Changes since v1: - Fixed the case where mmu notifier is not enabled and avoid wasting memory and resources when that is the case. - Fixed bug in migrate code. - Use kernel doc format for describing kernel enum v1 cover letter: This patchset adds contextual information, wh

[PATCH v2 2/3] mm/mmu_notifier: use structure for invalidate_range_start/end calls v2

2018-12-04 Thread jglisse
From: Jérôme Glisse To avoid having to change many call sites every time we want to add a parameter, use a structure to group all parameters for the mmu_notifier invalidate_range_start/end calls. No functional changes with this patch. Changes since v1: - introduce mmu_notifier_range_init() as

[PATCH] dma-buf: fix debugfs versus rcu and fence dumping v2

2018-12-06 Thread jglisse
From: Jérôme Glisse The debugfs code takes references on fences without dropping them. Also the rcu sections are not well balanced. Fix all that ... Changed since v1: - moved fobj logic around to be rcu safe Signed-off-by: Jérôme Glisse Cc: Christian König Cc: Daniel Vetter Cc: Sumit Semwal Cc: l

[PATCH] dma-buf: balance refcount inbalance

2018-12-06 Thread jglisse
From: Jérôme Glisse The debugfs code takes references on fences without dropping them. Signed-off-by: Jérôme Glisse Cc: Christian König Cc: Daniel Vetter Cc: Sumit Semwal Cc: linux-me...@vger.kernel.org Cc: dri-de...@lists.freedesktop.org Cc: linaro-mm-...@lists.linaro.org Cc: Stéphane Marchesin Cc

[RFC PATCH 01/14] mm/hms: heterogeneous memory system (sysfs infrastructure)

2018-12-03 Thread jglisse
From: Jérôme Glisse Systems with complex memory topology need a more versatile memory topology description than just nodes, where a node is a collection of memory and CPUs. In a heterogeneous memory system we consider four types of object: - target: which is any kind of memory - initiator:

[RFC PATCH 00/14] Heterogeneous Memory System (HMS) and hbind()

2018-12-03 Thread jglisse
From: Jérôme Glisse Heterogeneous memory systems are becoming more and more the norm. In those systems there is not only the main system memory for each node, but also device memory and/or a memory hierarchy to consider. Device memory can come from a device like a GPU, FPGA, ... or from a memory only

[RFC PATCH 02/14] mm/hms: heterogenenous memory system (HMS) documentation

2018-12-03 Thread jglisse
From: Jérôme Glisse Add documentation on what HMS is and what it is for (see patch content). Signed-off-by: Jérôme Glisse Cc: Rafael J. Wysocki Cc: Ross Zwisler Cc: Dan Williams Cc: Dave Hansen Cc: Haggai Eran Cc: Balbir Singh Cc: Aneesh Kumar K.V Cc: Benjamin Herrenschmidt Cc: Felix Ku

[RFC PATCH 03/14] mm/hms: add target memory to heterogeneous memory system infrastructure

2018-12-03 Thread jglisse
From: Jérôme Glisse A target is some kind of memory; it can be regular main memory or some more specialized memory like a CPU's HBM (High Bandwidth Memory) or some device's memory. Some target memory might not be accessible by all initiators (anything that can trigger memory access). For instance s

[RFC PATCH 11/14] mm/hbind: add bind command to heterogeneous memory policy

2018-12-03 Thread jglisse
From: Jérôme Glisse This patch adds a bind command to the hbind() ioctl; this allows binding a range of virtual addresses to a given list of target memory. New memory allocated in the range will try to use memory from the target memory list. Note that this patch does not modify the existing page fault path and

[RFC PATCH 04/14] mm/hms: add initiator to heterogeneous memory system infrastructure

2018-12-03 Thread jglisse
From: Jérôme Glisse An initiator is anything that can initiate memory access, either a CPU or a device. Here CPUs and devices are treated as equals. See HMS Documentation/vm/hms.txt for further details. Signed-off-by: Jérôme Glisse Cc: Rafael J. Wysocki Cc: Ross Zwisler Cc: Dan Williams Cc:

[RFC PATCH 06/14] mm/hms: add bridge to heterogeneous memory system infrastructure

2018-12-03 Thread jglisse
From: Jérôme Glisse A bridge connects two links with each other and applies only to listed initiators. Together with links, this allows describing any kind of system topology, i.e. any kind of directed graph. Moreover with bridges the userspace can choose to use different bridges to load balance bandwidth us

[RFC PATCH 07/14] mm/hms: register main memory with heterogenenous memory system

2018-12-03 Thread jglisse
From: Jérôme Glisse Register main memory as target under HMS scheme. Memory is registered per node (one target device per node). We also create a default link to connect main memory and CPU that are in the same node. For details see Documentation/vm/hms.rst. This is done to allow application to

[RFC PATCH 10/14] mm/hbind: add heterogeneous memory policy tracking infrastructure

2018-12-03 Thread jglisse
From: Jérôme Glisse This patch adds infrastructure to track heterogeneous memory policy within the kernel. Policies are defined over a range of virtual addresses of a process and attached to the corresponding mm_struct. Users can reset to the default policy for a range of virtual addresses using hbind() default comm

[RFC PATCH 09/14] mm/hms: hbind() for heterogeneous memory system (aka mbind() for HMS)

2018-12-03 Thread jglisse
From: Jérôme Glisse With the advance of heterogeneous computing and the new kinds of memory topology that are now becoming more widespread (CPU HBM, persistent memory, ...), we no longer just have a flat memory topology inside a NUMA node. Instead there is a hierarchy of memory, for instance HBM fo

[RFC PATCH 12/14] mm/hbind: add migrate command to hbind() ioctl

2018-12-03 Thread jglisse
From: Jérôme Glisse This patch adds migrate commands to the hbind() ioctl; user space can use these commands to migrate a range of virtual addresses to a list of target memory. This does not change the policy for the range, it also ignores any of the existing policy range, it does not change the policy f

[RFC PATCH 08/14] mm/hms: register main CPUs with heterogenenous memory system

2018-12-03 Thread jglisse
From: Jérôme Glisse Register CPUs as initiator under HMS scheme. CPUs are registered per node (one initiator device per node per CPU). We also add the CPU to the node default link so it is connected to main memory for the node. For details see Documentation/vm/hms.rst. Signed-off-by: Jérôme Glis

[RFC PATCH 13/14] drm/nouveau: register GPU under heterogeneous memory system

2018-12-03 Thread jglisse
From: Jérôme Glisse This registers NVidia GPUs under the heterogeneous memory system so that one can use the GPU memory with new syscalls like hbind() for compute workloads. Signed-off-by: Jérôme Glisse --- drivers/gpu/drm/nouveau/Kbuild| 1 + drivers/gpu/drm/nouveau/nouveau_hms.c | 80 +

[RFC PATCH 05/14] mm/hms: add link to heterogeneous memory system infrastructure

2018-12-03 Thread jglisse
From: Jérôme Glisse A link connects initiators (CPUs or devices) and target memory with each other. It does not necessarily match one to one with a physical inter-connect, i.e. a given physical inter-connect may be presented as multiple links or multiple physical inter-connects can be presented as just o

[RFC PATCH 14/14] test/hms: tests for heterogeneous memory system

2018-12-03 Thread jglisse
From: Jérôme Glisse Set of tests for heterogeneous memory system (migration, binding, ...) Signed-off-by: Jérôme Glisse --- tools/testing/hms/Makefile| 17 ++ tools/testing/hms/hbind-create-device-file.sh | 11 + tools/testing/hms/test-hms-migrate.c | 77 ++

[PATCH v3 1/3] mm/mmu_notifier: use structure for invalidate_range_start/end callback v2

2018-12-13 Thread jglisse
From: Jérôme Glisse To avoid having to change many callback definitions every time we want to add a parameter, use a structure to group all parameters for the mmu_notifier invalidate_range_start/end callback. No functional changes with this patch. Changed since v1: - fix make htmldocs warning i

[PATCH v3 0/3] mmu notifier contextual informations

2018-12-13 Thread jglisse
From: Jérôme Glisse Changes since v2: - fix build warning with CONFIG_MMU_NOTIFIER=n - fix make htmldocs warning Changes since v1: - fix build with CONFIG_MMU_NOTIFIER=n - kernel docs Original cover letter: This patchset adds contextual information, why an invalidation is happening, to mmu

[PATCH v3 2/3] mm/mmu_notifier: use structure for invalidate_range_start/end calls v3

2018-12-13 Thread jglisse
From: Jérôme Glisse To avoid having to change many call sites every time we want to add a parameter, use a structure to group all parameters for the mmu_notifier invalidate_range_start/end calls. No functional changes with this patch. Changes since v2: - fix build warning in migrate.c when CON

[PATCH v3 3/3] mm/mmu_notifier: contextual information for event triggering invalidation v2

2018-12-13 Thread jglisse
From: Jérôme Glisse CPU page table updates can happen for many reasons, not only as a result of a syscall (munmap(), mprotect(), mremap(), madvise(), ...) but also as a result of kernel activities (memory compression, reclaim, migration, ...). Users of mmu notifier API track changes to the CPU p

[PATCH] mm/thp: fix call to mmu_notifier in set_pmd_migration_entry()

2018-10-12 Thread jglisse
From: Jérôme Glisse Inside set_pmd_migration_entry() we are holding page table locks and thus we cannot sleep, so we cannot call invalidate_range_start/end(). So remove the call to mmu_notifier_invalidate_range_start/end() and add a call to mmu_notifier_invalidate_range(). Note that we are already cal

[PATCH] mm/thp: fix call to mmu_notifier in set_pmd_migration_entry() v2

2018-10-12 Thread jglisse
From: Jérôme Glisse Inside set_pmd_migration_entry() we are holding page table locks and thus we cannot sleep, so we cannot call invalidate_range_start/end(). So remove the calls to mmu_notifier_invalidate_range_start/end() because they are called inside the function calling set_pmd_migration_entry() (

[PATCH 1/6] mm/hmm: fix utf8 ...

2018-10-19 Thread jglisse
From: Jérôme Glisse Somehow utf8 must have been broken. Signed-off-by: Jérôme Glisse Cc: Andrew Morton --- include/linux/hmm.h | 2 +- mm/hmm.c| 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/include/linux/hmm.h b/include/linux/hmm.h index 4c92e3ba3e16..1ff4

[PATCH 6/6] mm/hmm: invalidate device page table at start of invalidation

2018-10-19 Thread jglisse
From: Jérôme Glisse Invalidate the device page table at the start of invalidation, and invalidate in-progress CPU page table snapshotting at both the start and end of any invalidation. This is helpful when the device needs to dirty a page because the device page table reports the page as dirty. Dirtying page must ha

[PATCH 5/6] mm/hmm: use a structure for update callback parameters v2

2018-10-19 Thread jglisse
From: Jérôme Glisse Use a structure to gather all the parameters for the update callback. This makes it easier to add new parameters by avoiding having to update every callback function signature. The hmm_update structure is always associated with an mmu_notifier callback so we are not planning

[PATCH 2/6] mm/rmap: map_pte() was not handling private ZONE_DEVICE page properly v3

2018-10-19 Thread jglisse
From: Ralph Campbell Private ZONE_DEVICE pages use a special pte entry and thus are not present. Properly handle this case in map_pte(), it is already handled in check_pte(), the map_pte() part was lost in some rebase most probably. Without this patch the slow migration path can not migrate back

[PATCH 4/6] mm/hmm: properly handle migration pmd v3

2018-10-19 Thread jglisse
From: Jérôme Glisse Before this patch migration pmd entry (!pmd_present()) would have been treated as a bad entry (pmd_bad() returns true on migration pmd entry). The outcome was that device driver would believe that the range covered by the pmd was bad and would either SIGBUS or simply kill all

[PATCH 3/6] mm/hmm: fix race between hmm_mirror_unregister() and mmu_notifier callback

2018-10-19 Thread jglisse
From: Ralph Campbell In hmm_mirror_unregister(), mm->hmm is set to NULL and then mmu_notifier_unregister_no_release() is called. That creates a small window where mmu_notifier can call mmu_notifier_ops with mm->hmm equal to NULL. Fix this by first unregistering mmu notifier callbacks and then set

[PATCH 0/6] HMM updates, improvements and fixes v2

2018-10-19 Thread jglisse
From: Jérôme Glisse [Andrew, this is for 4.20; stable fixes are cc'ed to stable] A few fixes that only affect HMM users. Improve the synchronization callback so that we match what other mmu_notifier listeners do, and add proper support for the new blockable flag in the process. For curious folks here ar

[RFC PATCH 05/79] mm/swap: add an helper to get address_space from swap_entry_t

2018-04-04 Thread jglisse
From: Jérôme Glisse Each swap entry is associated with a file and thus an address_space. That address_space is used for reading/writing to swap storage. This patch adds a helper to get the address_space from swap_entry_t. Signed-off-by: Jérôme Glisse Cc: Michal Hocko Cc: Johannes Weiner Cc: Andr

[RFC PATCH 78/79] mm/ksm: rename PAGE_MAPPING_KSM to PAGE_MAPPING_RONLY

2018-04-04 Thread jglisse
From: Jérôme Glisse This just renames all KSM-specific helpers to generic page read-only names. No functional change. Signed-off-by: Jérôme Glisse Cc: Andrea Arcangeli --- fs/proc/page.c | 2 +- include/linux/page-flags.h | 30 +- mm/ksm.c

[RFC PATCH 77/79] mm/ksm: hide set_page_stable_node() and page_stable_node()

2018-04-04 Thread jglisse
From: Jérôme Glisse Hide these two functions as a preparatory step for generalizing ksm write protection to other users. Moreover, those two helpers cannot be used meaningfully outside ksm.c as the struct they deal with is defined inside ksm.c. Signed-off-by: Jérôme Glisse Cc: Andrea Arcangeli ---

[RFC PATCH 68/79] mm/vma_address: convert page's index lookup to be against specific mapping

2018-04-04 Thread jglisse
From: Jérôme Glisse Pass down the mapping ... Signed-off-by: Jérôme Glisse Cc: Andrew Morton Cc: Mel Gorman Cc: linux...@kvack.org Cc: Alexander Viro Cc: linux-fsde...@vger.kernel.org --- mm/internal.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/mm/internal.h b/mm

[RFC PATCH 74/79] mm/page_ronly: add config option for generic read only page framework.

2018-04-04 Thread jglisse
From: Jérôme Glisse It's really just a config option patch. Signed-off-by: Jérôme Glisse Cc: Andrea Arcangeli --- mm/Kconfig | 3 +++ 1 file changed, 3 insertions(+) diff --git a/mm/Kconfig b/mm/Kconfig index c782e8fb7235..aeffb6e8dd21 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -149,6 +149,

[RFC PATCH 76/79] mm/ksm: have ksm select PAGE_RONLY config.

2018-04-04 Thread jglisse
From: Jérôme Glisse Signed-off-by: Jérôme Glisse Cc: Andrea Arcangeli --- mm/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/Kconfig b/mm/Kconfig index aeffb6e8dd21..6994a1fdf847 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -308,6 +308,7 @@ config MMU_NOTIFIER config KSM

[RFC PATCH 73/79] mm: pass down struct address_space to set_page_dirty()

2018-04-04 Thread jglisse
From: Jérôme Glisse Pass down struct address_space to set_page_dirty() everywhere it is already available. <- @exists@ expression E; identifier F, M; @@ F(..., struct address_space * M, ...) { ... -set_page_dirty(NULL, E) +set_p

[RFC PATCH 75/79] mm/page_ronly: add page read only core structure and helpers.

2018-04-04 Thread jglisse
From: Jérôme Glisse Page read only is a generic framework for page write protection. It reuses the same mechanism as KSM by using the lower bit of the page->mapping field, and KSM is converted to use this generic framework. Signed-off-by: Jérôme Glisse Cc: Andrea Arcangeli --- include/linux/

[RFC PATCH 72/79] mm: add struct address_space to set_page_dirty_lock()

2018-04-04 Thread jglisse
From: Jérôme Glisse For the holy crusade to stop relying on struct page mapping field, add struct address_space to set_page_dirty_lock() arguments. <- @@ identifier I1; type T1; @@ int -set_page_dirty_lock(T1 I1) +set_page_dirty

[RFC PATCH 79/79] mm/ksm: set page->mapping to page_ronly struct instead of stable_node.

2018-04-04 Thread jglisse
From: Jérôme Glisse Set page->mapping to the page_ronly struct instead of stable_node struct. There is no functional change as page_ronly is just a field of stable_node. Signed-off-by: Jérôme Glisse Cc: Andrea Arcangeli --- mm/ksm.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(
