Problem Summary:
---------------

If an xattr directory inode and its xattr child inode are on the _same_
disposal list, and the xattr directory inode comes _before_ its xattr child
inode in that disposal list...

Then zfs_purgedir() of the xattr directory calls zfs_zget() for the xattr
child inode, and it loops forever -- it can only stop once the xattr child
inode is disposed/evicted, but that can only happen _after_ the current
list node, which is the one looping...

This is because zfs_zget() gets non-NULL from dmu_buf_get_user() (which can
only become NULL in the ZFS evict path, later in the disposal list), so it
goes to igrab(), but that returns NULL (because inode.i_state has I_FREEING
set); then it hits 'goto again;', which repeats the same steps.
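
The retry described above can be modeled in userspace. The following is
only a minimal sketch, not the ZFS code: the struct, the MODEL_I_FREEING
flag, the model_* function names and the max_retries cap are all
hypothetical and exist only so the sketch terminates. The real loop has no
such cap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not the kernel structures. */
#define MODEL_I_FREEING 0x1u

struct model_inode {
    unsigned int i_state;   /* MODEL_I_FREEING once isolated for disposal */
    bool has_dbuf_user;     /* stands in for dmu_buf_get_user() != NULL */
};

/* Like igrab(): refuses an inode that is already being freed. */
static struct model_inode *model_igrab(struct model_inode *ip)
{
    return (ip->i_state & MODEL_I_FREEING) ? NULL : ip;
}

/*
 * Models the retry in zfs_zget(). The real code has no retry limit;
 * max_retries is an assumption added so this sketch can return.
 * Returns the number of retries taken, or -1 if the cap was reached.
 */
static long model_zget(struct model_inode *ip, long max_retries)
{
    long retries = 0;

    assert(ip != NULL);
again:
    if (!ip->has_dbuf_user)
        return retries;     /* no cached znode: the igrab path is skipped */
    if (model_igrab(ip) != NULL)
        return retries;     /* reference taken, lookup succeeds */
    if (retries == max_retries)
        return -1;          /* the kernel would keep spinning here */
    retries++;
    goto again;
}
```

With i_state set to MODEL_I_FREEING and has_dbuf_user still true,
model_zget() hits the cap and returns -1; clearing has_dbuf_user (what the
later evict of the child would do) lets it return immediately.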

Function path:

shrink_slab
- do_shrink_slab
  - shrinker->scan_objects == super_cache_scan
    - prune_icache_sb
      - list_lru_shrink_walk
        (creates disposal list with xattr dir&child inodes)
        - inode_lru_isolate(inode)
          - inode->i_state |= I_FREEING
            (problem for igrab of xattr child inode, below)
      - dispose_list 
        - evict(xattr dir inode)
          - op->evict_inode == zpl_evict_inode
            - zfs_inactive
              - zfs_zinactive
                - zfs_rmnode
                  - zfs_purgedir
                    - zfs_zget (xattr child nodes)
                      - dmu_buf_get_user (non-NULL)
                      - igrab (NULL)
                      - goto again;
        ... thus never reaching ...
        - evict(xattr child inode)
          - op->evict_inode == zpl_evict_inode
            - zfs_inactive
              - zfs_zinactive
                - zfs_znode_dmu_fini
                  - sa_handle_destroy
                    - dmu_buf_remove_user
                      (not calling this yet is a problem for
                       dmu_buf_get_user, above)
                      (this would make it return NULL and not go into the
                       igrab call)
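
The ordering dependency in the function path above can be sketched as a
small userspace model. This is an illustration under stated assumptions,
not kernel code: the struct, the model_* names and the max_retries cap are
hypothetical, and eviction is reduced to "purge the child via zget, then
drop the dbuf user".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the call path; not kernel code. */
struct model_inode {
    bool freeing;               /* I_FREEING, set by the isolate step */
    bool has_dbuf_user;         /* cleared by the modeled dmu_buf_remove_user() */
    struct model_inode *child;  /* xattr child for an xattr dir, else NULL */
};

/* Modeled zfs_zget(): true if it returns, false if it hits the retry cap. */
static bool model_zget(const struct model_inode *ip, int max_retries)
{
    for (int i = 0; i <= max_retries; i++) {
        if (!ip->has_dbuf_user)
            return true;        /* dmu_buf_get_user() would return NULL */
        if (!ip->freeing)
            return true;        /* igrab() would succeed */
    }
    return false;               /* the real loop would never give up */
}

/* Modeled evict(): a dir first purges its child, then drops its dbuf user. */
static bool model_evict(struct model_inode *ip, int max_retries)
{
    if (ip->child != NULL && !model_zget(ip->child, max_retries))
        return false;           /* stuck in the modeled zfs_purgedir() */
    ip->has_dbuf_user = false;
    return true;
}

/* Modeled dispose_list(): returns how many inodes were evicted. */
static size_t model_dispose_list(struct model_inode **list, size_t n,
                                 int max_retries)
{
    size_t done = 0;

    assert(list != NULL);
    for (size_t i = 0; i < n; i++)
        list[i]->freeing = true;    /* isolate step marks every entry */
    for (size_t i = 0; i < n; i++) {
        if (!model_evict(list[i], max_retries))
            break;                  /* later entries are never reached */
        done++;
    }
    return done;
}
```

With the list ordered { dir, child } the model evicts nothing, because the
dir waits on a child that is only evicted later in the same list; with
{ child, dir } both entries are evicted.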

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1839521

Title:
  Xenial: ZFS deadlock in shrinker path with xattrs

Status in zfs-linux package in Ubuntu:
  Invalid
Status in zfs-linux source package in Xenial:
  In Progress
Status in zfs-linux source package in Bionic:
  Invalid
Status in zfs-linux source package in Disco:
  Invalid
Status in zfs-linux source package in Eoan:
  Invalid

Bug description:
  [Impact]

   * Xenial's ZFS can deadlock in the memory shrinker path
     after removing files with extended attributes (xattr).

   * Extended attributes are enabled by default, but are
     _not_ used by default, which reduces the likelihood.

   * It's very difficult/rare to reproduce this problem,
     due to the file/xattr/remove/shrinker/lru order/timing
     circumstances required (it took weeks for a reporting
     user), but a synthetic test-case has been found for tests.

  [Test Case]

   * A synthetic reproducer is available for this LP,
     with a few steps to touch/setfattr/rm/drop_caches
     plus a kernel module to massage the disposal list.

   * In the original ZFS module:
     the xattr dir inode is not purged immediately on
     file removal, but possibly purged _two_ shrinker
     invocations later.  This allows another thread,
     started before the file removal, to call zfs_zget()
     on the xattr child inode and iput() it, so it makes
     it onto the same disposal list as the xattr dir inode.

   * In the modified ZFS module:
     the xattr dir inode is purged immediately on file
     removal, rather than possibly later on a shrinker
     invocation, so the problem window above no longer exists.

  [Regression Potential]

   * Low. The patches are confined to extended attributes
     in ZFS, specifically node removal/purge, plus another
     change to how an xattr child inode tracks its xattr dir
     (parent) inode, so that it can be purged immediately
     on removal.

   * The ZFS test-suite has been run on original/modified
     zfs-dkms package/kernel modules, with no regressions.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1839521/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
