Public bug reported:

== SRU Justification ==

[Impact]
Oops during heavy NFS + FSCache + Cachefiles use:

kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/internal.h:321!
kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!

[Cause]
1) Two threads are trying to operate on a cookie and two objects.

2a) One thread tries to unmount the filesystem and, in the process, goes
over a huge list of objects, marking them dead and deleting them.
cookie->usage is also decremented in the following path:

nfs_fscache_release_super_cookie
 -> __fscache_relinquish_cookie
  -> __fscache_cookie_put
   -> BUG_ON(atomic_read(&cookie->usage) <= 0);

2b) A second thread tries to look up an object for reading data in the
following path:

fscache_alloc_object
1) cachefiles_alloc_object
    -> fscache_object_init
     -> assign cookie, but usage not bumped.
2) fscache_attach_object -> fails in cant_attach_object because the
   cookie's backing object or the cookie->parent object is going away
3) fscache_put_object
    -> cachefiles_put_object
     -> fscache_object_destroy
      -> fscache_cookie_put
       -> BUG_ON(atomic_read(&cookie->usage) <= 0);

[Fix]
Bump up the cookie usage in fscache_object_init, when the object is
first assigned a cookie. Do this atomically, so that the cookie is
assigned and its refcount bumped only if the refcount is not zero.
Remove the cookie assignment from fscache_attach_object.

[Testcase]
A user has run ~100 hours of NFS stress tests and not seen this bug recur.

[Regression Potential]
- Limited to fscache/cachefiles.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

** Description changed:

  == SRU Justification ==

  [Impact]
  Oops during heavy NFS + FSCache + Cachefiles use:

- kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/internal.h:321!
- kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!
+ kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/internal.h:321!
+ kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!

  [Cause]
- 1)Two threads are trying to do operate on a cookie and two objects.
- 2a)One thread tries to unmount the filesystem and in process goes over
- a huge list of objects marking them dead and deleting the objects.
- cookie->usage is also decremented in
- nfs_fscache_release_super_cookie
- -> __fscache_relinquish_cookie
- ->__fscache_cookie_put
- ->BUG_ON(atomic_read(&cookie->usage) <= 0);
+ 1)Two threads are trying to do operate on a cookie and two objects.
+ 2a)One thread tries to unmount the filesystem and in process goes over
+ a huge list of objects marking them dead and deleting the objects.
+ cookie->usage is also decremented in following path
+ nfs_fscache_release_super_cookie
+ -> __fscache_relinquish_cookie
+ ->__fscache_cookie_put
+ ->BUG_ON(atomic_read(&cookie->usage) <= 0);

- 2b)second thread tries to lookup an object for reading data in fscache_alloc_object
- 1) cachefiles_alloc_object-> fscache_object_init -> assign cookie, but usage not bumped.
- 2) fscache_attach_object -> fails in cant_attach_object because the cookie's backing object
- or cookie's->parent object are going away
- 3)fscache_put_object
- -> cachefiles_put_object
- ->fscache_object_destroy
- ->fscache_cookie_put
- ->BUG_ON(atomic_read(&cookie->usage) <= 0);
+ 2b)second thread tries to lookup an object for reading data in
+ following path
+
+ fscache_alloc_object
+ 1) cachefiles_alloc_object
+ -> fscache_object_init
+ -> assign cookie, but usage not bumped.
+ 2) fscache_attach_object -> fails in cant_attach_object because the
+ cookie's backing object or cookie's->parent object are going away
+ 3)fscache_put_object
+ -> cachefiles_put_object
+ ->fscache_object_destroy
+ ->fscache_cookie_put
+ ->BUG_ON(atomic_read(&cookie->usage) <= 0);

  [Fix]
- Bump up the cookie usage in fscache_object_init,
- when it is first being assigned a cookie atomically such that the cookie
- is added and bumped up if its refcount is not zero.
- remove the assignment in the attach_object.
+ Bump up the cookie usage in fscache_object_init,
+ when it is first being assigned a cookie atomically such that the cookie
+ is added and bumped up if its refcount is not zero.
+ remove the assignment in the attach_object.

  [Testcase]
  A user has run ~100 hours of NFS stress tests and not seen this bug recur.

  [Regression Potential]
- - Limited to fscache/cachefiles.
+ - Limited to fscache/cachefiles.

--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1776277

Title:
  fscache cookie refcount updated incorrectly during fscache object
  allocation

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1776277/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
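
For illustration only, the idea described under [Fix] (take a reference
at the moment the cookie is assigned, and only if the refcount is still
non-zero) can be sketched in userspace C11. This is not the kernel
patch: struct cookie, struct object, cookie_get_unless_zero, cookie_put,
object_init and object_destroy are hypothetical names invented for this
sketch, and the compare-and-swap loop only approximates the semantics of
the kernel's atomic_inc_not_zero().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real fscache types. */
struct cookie {
    atomic_int usage;
};

struct object {
    struct cookie *cookie;
};

/* Take a reference only while the count is still non-zero. */
static bool cookie_get_unless_zero(struct cookie *c)
{
    int old = atomic_load(&c->usage);

    while (old > 0) {
        /* On failure, 'old' is refreshed with the current value; retry. */
        if (atomic_compare_exchange_weak(&c->usage, &old, old + 1))
            return true;
    }
    return false; /* cookie is already going away */
}

/* Drop a reference; returns true if this was the last one. */
static bool cookie_put(struct cookie *c)
{
    int old = atomic_fetch_sub(&c->usage, 1);

    assert(old > 0); /* plays the role of the BUG_ON in the report */
    return old == 1;
}

/* Assign the cookie at init time, always paired with a reference. */
static bool object_init(struct object *obj, struct cookie *c)
{
    obj->cookie = NULL;
    if (!cookie_get_unless_zero(c))
        return false;
    obj->cookie = c;
    return true;
}

/* Release the reference the object holds, if it ever got one. */
static bool object_destroy(struct object *obj)
{
    bool last = false;

    if (obj->cookie) {
        last = cookie_put(obj->cookie);
        obj->cookie = NULL;
    }
    return last;
}
```

With this pairing, an object that fails to attach and is then destroyed
only drops a reference it actually took, so the put path cannot drive
the count to zero or below on behalf of an object that never held one.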