Private bug reported:
Description:
Even with the ext4_break_layouts() support added by "ext4: handle layout
changes to pinned DAX mappings", we are still seeing occasional cases in the
unit test where we truncate a page that has an elevated reference count.
Investigate.
—
The root cause of this issue is that while the ei->i_mmap_sem provides
synchronization between ext4_break_layouts() and page faults, it doesn't
synchronize us with the direct I/O path. This exact same issue exists
in XFS AFAICT, with the synchronization tool there being the XFS_MMAPLOCK.
This allows the direct I/O path to do I/O and raise & lower page->_refcount
while we're executing a truncate/hole punch. This leads to us trying to free
a page with an elevated refcount.
Here's one instance of the race:
CPU 0                                   CPU 1
-----                                   -----
ext4_punch_hole()
  ext4_break_layouts() # all pages have refcount=1

                                        ext4_direct_IO()
                                          ... lots of layers ...
                                          follow_page_pte()
                                            get_page() # elevates refcount

  truncate_pagecache_range()
    ... a few layers ...
    dax_disassociate_entry() # sees elevated refcount, WARN_ON_ONCE()
A similar race occurs when the refcount is being dropped while we're running
ext4_break_layouts(), and this is the one that my test was actually hitting:
CPU 0                               CPU 1
-----                               -----
ext4_direct_IO()
  ... lots of layers ...
  follow_page_pte()
    get_page()
    # elevates refcount of page X
                                    ext4_punch_hole()
                                      ext4_break_layouts()
                                      # two pages, X & Y, have refcount == 2
                                        __wait_var_event() # called for page X

  __put_devmap_managed_page()
  # drops refcount of X to 1

                                      # __wait_var_event() checks X's refcount
                                      # in "if (condition)", and breaks. We
                                      # never actually called
                                      # ext4_wait_dax_page(), so 'retry' in
                                      # ext4_break_layouts() is still false.
                                      # Exit the do/while loop in
                                      # ext4_break_layouts(), never attempting
                                      # to wait on page Y, which still has an
                                      # elevated refcount of 2.

                                      truncate_pagecache_range()
                                        ... a few layers ...
                                        dax_disassociate_entry()
                                        # sees elevated refcount for Y,
                                        # WARN_ON_ONCE()
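
To make the early exit in the second race easier to follow, here is a
minimal single-threaded userspace sketch of the loop logic. It is not kernel
code: struct model_page, find_busy_page(), wait_for_idle() and
break_layouts_model() are hypothetical names invented for illustration, and
the "race" is injected by hand at the point where the other CPU's put would
land.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_page {
    int refcount; /* 1 == idle, >1 == pinned by direct I/O */
};

/* Return the first page that still looks busy, or NULL if all are idle. */
static struct model_page *find_busy_page(struct model_page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (pages[i].refcount > 1)
            return &pages[i];
    return NULL;
}

/*
 * Models a "check the condition first, then wait" primitive. The wait
 * action (which is what would set 'retry') only runs if the page is still
 * busy at the moment of the check.
 */
static void wait_for_idle(struct model_page *page, bool *retry)
{
    if (page->refcount == 1)
        return;         /* condition already true: action never runs */
    *retry = true;      /* stands in for the wait action */
    page->refcount = 1; /* pretend we slept until the pin was dropped */
}

/*
 * Models the do/while loop described in the report. If 'inject_race' is
 * true, the direct I/O side drops its pin on the page between the scan and
 * the condition check. Returns how many pages are still pinned on exit.
 */
static size_t break_layouts_model(struct model_page *pages, size_t n,
                                  bool inject_race)
{
    bool retry;
    size_t still_busy = 0;

    assert(pages != NULL && n > 0);

    do {
        struct model_page *page = find_busy_page(pages, n);

        retry = false;
        if (!page)
            break;
        if (inject_race)
            page->refcount = 1; /* the other CPU's put lands here */
        wait_for_idle(page, &retry);
    } while (retry);

    for (size_t i = 0; i < n; i++)
        if (pages[i].refcount > 1)
            still_busy++;
    return still_busy;
}
```

With two pages that both start at refcount 2, calling
break_layouts_model(pages, 2, true) leaves one page pinned (page Y in the
diagram), while break_layouts_model(pages, 2, false) leaves none, because
each wait sets 'retry' and forces another scan.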
Essentially the solution will most likely involve adding synchronization
between the direct I/O path and truncate/hole punch type operations, and
it'll need to happen for both ext4 and XFS, so the filesystem folks need
to be involved.
** Affects: linux (Ubuntu)
Importance: Undecided
Status: New
** Tags: intel-kernel-18.10
** Information type changed from Public to Private
--
https://bugs.launchpad.net/bugs/1787089
Title:
[AEP-bug] ext4: more rare direct I/O vs unmap failures