On Thu, 21 Nov 2024 at 19:11, Josh Poimboeuf wrote:
>
> On Thu, Nov 21, 2024 at 05:02:06PM -0800, Linus Torvalds wrote:
> > [ Time passes ]
> >
> > Ugh. I tried it. It looks like this:
> >
> > #define inlined_get_user(res, ptr) ({ \
> > __label__ fail2, fail1;
John Hubbard writes:
> On 11/21/24 5:40 PM, Alistair Popple wrote:
> >> Prior to freeing a block, file systems supporting FS DAX must check
> >> that the associated pages are both unmapped from user-space and not
> >> undergoing DMA or other access from e.g. get_user_pages(). This is
>> achieved by unm
On 11/21/24 5:40 PM, Alistair Popple wrote:
Longterm pinning of FS DAX pages should already be disallowed by
various pXX_devmap checks. However a future change will cause these
checks to be invalid for FS DAX pages so make
folio_is_longterm_pinnable() return false for FS DAX pages.
Signed-off-by
On Thu, Nov 21, 2024 at 05:02:06PM -0800, Linus Torvalds wrote:
> [ Time passes ]
>
> Ugh. I tried it. It looks like this:
>
> #define inlined_get_user(res, ptr) ({ \
> __label__ fail2, fail1; \
> __auto_type __up = (ptr);
On 11/21/24 5:40 PM, Alistair Popple wrote:
Prior to freeing a block, file systems supporting FS DAX must check
that the associated pages are both unmapped from user-space and not
undergoing DMA or other access from e.g. get_user_pages(). This is
achieved by unmapping the file range and scanning th
Main updates since v2:
- Rename the DAX specific dax_insert_XXX functions to vmf_insert_XXX
and have them pass the vmf struct.
- Separate out the device DAX changes.
- Restore the page share mapping counting and associated warnings.
- Rework truncate to require file-systems to have previ
Currently DAX folio/page reference counts are managed differently to
normal pages. To allow these to be managed the same as normal pages
introduce vmf_insert_folio_pud. This will map the entire PUD-sized folio
and take references as it would for a normally mapped page.
This is distinct from the cu
snd_soc_card_get_kcontrol() was updated to use
snd_ctl_find_id_mixer() in
commit 897cc72b0837 ("ASoC: soc-card: Use
snd_ctl_find_id_mixer() instead of open-coding"),
which fixes the iface to IFACE_MIXER.
If a driver needs to use snd_soc_card_get_kcontrol(),
the id.type needs to be IFACE_MIXE
On Thu, Nov 21, 2024 at 10:50 PM Shengjiu Wang wrote:
>
> snd_soc_card_get_kcontrol() was updated to use
> snd_ctl_find_id_mixer() in
> commit 897cc72b0837 ("ASoC: soc-card: Use
> snd_ctl_find_id_mixer() instead of open-coding"),
> which fixes the iface to IFACE_MIXER.
Should this have
File systems call dax_break_mapping() prior to reallocating file
system blocks to ensure the page is not undergoing any DMA or other
accesses. Generally this is needed when a file is truncated to ensure
that if a block is reallocated nothing is writing to it. However
filesystems currently don't cal
Prior to freeing a block, file systems supporting FS DAX must check
that the associated pages are both unmapped from user-space and not
undergoing DMA or other access from e.g. get_user_pages(). This is
achieved by unmapping the file range and scanning the FS DAX
page-cache to see if any pages within
snd_soc_card_get_kcontrol() was updated to use
snd_ctl_find_id_mixer() in
commit 897cc72b0837 ("ASoC: soc-card: Use
snd_ctl_find_id_mixer() instead of open-coding"),
which fixes the iface to IFACE_MIXER.
Signed-off-by: Shengjiu Wang
---
sound/soc/fsl/fsl_spdif.c | 2 +-
1 file changed
snd_soc_card_get_kcontrol() was updated to use
snd_ctl_find_id_mixer() in
commit 897cc72b0837 ("ASoC: soc-card: Use
snd_ctl_find_id_mixer() instead of open-coding"),
which fixes the iface to IFACE_MIXER.
Signed-off-by: Shengjiu Wang
---
sound/soc/fsl/fsl_xcvr.c | 2 +-
1 file changed,
The devmap PTE special bit was used to detect mappings of FS DAX
pages. This tracking was required to ensure the generic mm did not
manipulate the page reference counts as FS DAX implemented its own
reference counting scheme.
Now that FS DAX pages have their references counted the same way as
nor
Now that DAX and all other reference counts to ZONE_DEVICE pages are
managed normally there is no need for the special devmap PTE/PMD/PUD
page table bits. So drop all references to these, freeing up a
software defined page table bit on architectures supporting it.
Signed-off-by: Alistair Popple
A
DEVMAP PTEs are no longer required to support ZONE_DEVICE so remove
them.
Signed-off-by: Alistair Popple
Suggested-by: Chunyan Zhang
---
arch/riscv/Kconfig | 1 -
arch/riscv/include/asm/pgtable-64.h | 20
arch/riscv/include/asm/pgtable-bits.h | 1 -
a
Currently fs dax pages are considered free when the refcount drops to
one and their refcounts are not increased when mapped via PTEs or
decreased when unmapped. This requires special logic in mm paths to
detect that these pages should not be properly refcounted, and to
detect when the refcount drop
Device DAX pages are currently not reference counted when mapped,
instead relying on the devmap PTE bit to ensure mapping code will not
get/put references. This requires special handling in various page
table walkers, particularly GUP, to manage references on the
underlying pgmap to ensure the page
memcontrol currently ignores device dax and fs dax pages because these
pages are considered special. To maintain existing behaviour once
these pages are treated as normal pages and returned from
vm_normal_page() add a test to explicitly skip charging them.
Signed-off-by: Alistair Popple
---
mm/m
At present mlock skips ptes mapping ZONE_DEVICE pages. A future change
to remove pmd_devmap will allow pmd_trans_huge_lock() to return
ZONE_DEVICE folios so make sure we continue to skip those.
Signed-off-by: Alistair Popple
---
mm/mlock.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm
Longterm pinning of FS DAX pages should already be disallowed by
various pXX_devmap checks. However a future change will cause these
checks to be invalid for FS DAX pages so make
folio_is_longterm_pinnable() return false for FS DAX pages.
Signed-off-by: Alistair Popple
---
include/linux/mm.h | 4
The procfs mmu files such as smaps currently ignore device dax and fs
dax pages because these pages are considered special. To maintain
existing behaviour once these pages are treated as normal pages and
returned from vm_normal_page() add tests to explicitly skip them.
Signed-off-by: Alistair Popp
Add helpers to determine if a page or folio is a device dax or fs dax
page or folio.
Signed-off-by: Alistair Popple
---
include/linux/memremap.h | 22 ++
1 file changed, 22 insertions(+)
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 0256a42..f2a8d13
Currently DAX folio/page reference counts are managed differently to
normal pages. To allow these to be managed the same as normal pages
introduce vmf_insert_folio_pmd. This will map the entire PMD-sized folio
and take references as it would for a normally mapped page.
This is distinct from the cu
In preparation for using insert_page() for DAX, enhance
insert_page_into_pte_locked() to handle establishing writable
mappings. Recall that DAX returns VM_FAULT_NOPAGE after installing a
PTE which bypasses the typical set_pte_range() in finish_fault.
Signed-off-by: Alistair Popple
Suggested-by:
Currently to map a DAX page the DAX driver calls vmf_insert_pfn. This
creates a special devmap PTE entry for the pfn but does not take a
reference on the underlying struct page for the mapping. This is
because DAX page refcounts are treated specially, as indicated by the
presence of a devmap entry.
PAGE_MAPPING_DAX_SHARED is the same as PAGE_MAPPING_ANON. This isn't
currently a problem because FS DAX pages are treated
specially. However a future change will make FS DAX pages more like
normal pages, so folio_test_anon() must not return true for a FS DAX
page.
We could explicitly test for a FS
Zone device pages are used to represent various types of device memory
managed by device drivers. Currently compound zone device pages are
not supported. This is because MEMORY_DEVICE_FS_DAX pages are the only
user of higher order zone device pages and have their own page
reference counting.
A futu
PCI P2PDMA pages are not mapped with pXX_devmap PTEs therefore the
check in __gup_device_huge() is redundant. Remove it.
Signed-off-by: Alistair Popple
Reviewed-by: Jason Gunthorpe
Reviewed-by: Dan Williams
Acked-by: David Hildenbrand
---
mm/gup.c | 5 -
1 file changed, 5 deletions(-)
diff
The reference counts for ZONE_DEVICE private pages should be
initialised by the driver when the page is actually allocated by the
driver allocator, not when they are first created. This is currently
the case for MEMORY_DEVICE_PRIVATE and MEMORY_DEVICE_COHERENT pages
but not MEMORY_DEVICE_PCI_P2PDMA
Prior to any truncation operations file systems call
dax_break_mapping() to ensure pages in the range are not undergoing
DMA. Later DAX page-cache entries will be removed by
truncate_folio_batch_exceptionals() in the generic page-cache code.
However this makes it possible for folios to be removed
A FS DAX page is considered idle when its refcount drops to one. This
is currently open-coded in all file systems supporting FS DAX. Move
the idle detection to a common function to make future changes easier.
Signed-off-by: Alistair Popple
Reviewed-by: Jan Kara
Reviewed-by: Christoph Hellwig
Re
Several functions internal to FS DAX use the following pattern when
trying to obtain an unlocked entry:
xas_for_each(&xas, entry, end_idx) {
if (dax_is_locked(entry))
entry = get_unlocked_entry(&xas, 0);
This is problematic because get_unlocked_entry() will get the next
pr
dax_layout_busy_page_range() is used by file systems to scan the DAX
page-cache to unmap mapping pages from user-space and to determine if
any pages in the given range are busy, either due to ongoing DMA or
other get_user_pages() usage.
Currently it checks to see if the file mapping is mapped into us
FS DAX requires file systems to call into the DAX layout prior to
unlinking inodes to ensure there is no ongoing DMA or other remote
access to the direct mapped page. The fuse file system implements
fuse_dax_break_layouts() to do this which includes a comment
indicating that passing dmap_end == 0 l
On Thu, 21 Nov 2024 at 16:12, Josh Poimboeuf wrote:
>
> The asm looks good, but the C exploded a bit... why not just have an
> inline get_user()?
That was originally one of my goals for the "unsafe" ones - if done
right, they'd be the proper building blocks for a get_user(), and we'd
only really
On Thu, Nov 21, 2024 at 02:16:12PM -0800, Linus Torvalds wrote:
> mov %gs:0x0,%rax # current
> incl 0x1a9c(%rax) # current->pagefault_disable++
> movabs $0x123456789abcdef,%rcx # magic virtual address size
> cmp %rsi,%rcx
On Thu, 21 Nov 2024 at 13:40, Josh Poimboeuf wrote:
>
> The profile is showing futex_get_value_locked():
Ahh.
> That has several callers, so we can probably just use get_user() there?
Yeah, that's the simplest thing. That thing isn't even some inline
function, so the real cost is the call.
Tha
On Thu, Nov 7, 2024 at 2:38 PM Luis Chamberlain wrote:
>
> On Wed, Nov 06, 2024 at 02:19:38PM -0800, Matthew Maurer wrote:
> > >
> > > > If booted against an old kernel, it will
> > > > behave as though there is no modversions information.
> > >
> > > Huh? This I don't get. If you have the new lib
> On Fri, 15 Nov 2024 at 15:06, Josh Poimboeuf wrote:
> So I think the thing to do is
>
> (a) find out which __get_user() it is that matters so much for that load
>
> Do you have a profile somewhere?
>
> (b) convert them to use "unsafe_get_user()", with that whole
>
> if (can
Vaibhav Jain writes:
> Hi Ritesh,
>
> Thanks for looking into this patch. My responses on behalf of Narayana
> below:
>
> "Ritesh Harjani (IBM)" writes:
>
>> Narayana Murty N writes:
>>
>>> The PE Reset State "0" obtained from RTAS calls
>>> ibm_read_slot_reset_[state|state2] indicates that
>>>
Hello Michael,
> The Linux CHRP code only supports a handful of machines, all 32-bit, eg.
> IBM B50, bplan/Genesi Pegasos/Pegasos2, Total Impact briQ, and possibly
> some from Motorola? No Apple machines should be affected.
I have a Pegasos 2 and I planned on keeping it.
Have you asked among the
>> Pegasos2 users still exist, but admittedly they mainly use MorphOS and
>> AmigaOS4 on these machines.
>
> The Linux CHRP support is still present in v6.12, which will be an LTS
> for the next 2 years at least, so if there's folks who occasionally boot
> Linux they will still be able to do that f