** Summary changed: - Oops when Kerberos credentials are invalid + SUNRPC: Use after free when GSSD credentials are invalid causes oops
** Description changed: - There's a bug in Linux Kernel 5.0 which is triggered by invalid - credentials when the NFS clients is trying to aquire them via GSSD. This - affects NFS-Shares that are protected by krb5* security. They become - unusable until the system is re-booted. The problem is quite severe on - terminal servers with multiple users - some of them not caring about - refreshing their kerberos tickets. + BugLink: https://bugs.launchpad.net/bugs/1842037 - A fix is available here: + [Impact] - <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/net/sunrpc/clnt.c?id=7987b694ade8cc465ce10fb3dceaa614f13ceaf3> + There is a use after free which normally causes a null pointer + dereference when NFS clients send invalid credentials via GSSD to a NFS + server which has shares protected by kerberos krb5* security. + + The call trace is below: + + BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 + Call Trace: + rpc_check_timeout+0x22/0xe0 [sunrpc] + call_decode+0x12c/0x190 [sunrpc] + __rpc_execute+0x7a/0x340 [sunrpc] + rpc_execute+0xa3/0xb0 [sunrpc] + rpc_run_task+0xf7/0x140 [sunrpc] + nfs4_proc_get_lease_time+0xf3/0x130 [nfsv4] + nfs41_setup_state_renewal+0x3d/0x90 [nfsv4] + ? nfs4_realloc_slot_table+0x5b/0x130 [nfsv4] + ? nfs4_setup_session_slot_tables+0x77/0xc0 [nfsv4] + nfs41_finish_session_reset+0x26/0x30 [nfsv4] + nfs41_init_clientid+0x44/0x70 [nfsv4] + nfs4_establish_lease+0x61/0xa0 [nfsv4] + ? nfs4_handle_reclaim_lease_error+0x108/0x130 [nfsv4] + nfs4_state_manager+0x1b1/0x750 [nfsv4] + ? kernel_sigaction+0x43/0xe0 + nfs4_run_state_manager+0x24/0x40 [nfsv4] + kthread+0x120/0x140 + ? nfs4_state_manager+0x750/0x750 [nfsv4] + ? __kthread_parkme+0x70/0x70 + ret_from_fork+0x35/0x40 + + This then causes outages on a heavily trafficked NFS server and no one can access their shares until the server is rebooted. + + [Fix] + + The fix comes in the following two commits: + + commit cea57789e4081870ac3498fbefabbbd0d0fd8434 + Author: Trond Myklebust <trond.mykleb...@hammerspace.com> + Date: Sat Mar 9 16:06:47 2019 -0500 + Subject: SUNRPC: Clean up + + commit 7987b694ade8cc465ce10fb3dceaa614f13ceaf3 + Author: Trond Myklebust <trond.mykleb...@hammerspace.com> + Date: Wed May 29 12:49:52 2019 -0400 + Subject: SUNRPC: Fix a use after free when a server rejects the RPCSEC_GSS credential + + There is a small subtlety to be addressed here. + 7987b694ade8cc465ce10fb3dceaa614f13ceaf3 is marked as "Fixes" cea57789e4081870ac3498fbefabbbd0d0fd8434, and cea57789e4081870ac3498fbefabbbd0d0fd8434 is not present in the disco kernel. The code path which triggers the use after free can still be reached without cea57789e4081870ac3498fbefabbbd0d0fd8434 being present, and applying both commits resolves the problem, while still maintaining as little backporting as necessary. + + Please note, both commits have been backported. + + cea57789e4081870ac3498fbefabbbd0d0fd8434 required a small change to the + final comment, as well as some minor context adjustments in the final + hunk. + + 7987b694ade8cc465ce10fb3dceaa614f13ceaf3 required medium to heavy + backporting. The upstream patch uses a switch statement based on error + codes, while the disco kernel uses a goto and label type architecture. + The commits in the middle were numerous and completely unrelated, so I + backported the commit to use goto and labels. Please review this + backport a little more closely than you normally do, but it has been + tested, and I believe the code to be sound. + + cea57789e4081870ac3498fbefabbbd0d0fd8434 landed in 5.1 upstream. + 7987b694ade8cc465ce10fb3dceaa614f13ceaf3 landed in 5.2 upstream. + + 7987b694ade8cc465ce10fb3dceaa614f13ceaf3 was also selected for upstream + -stable, in 5.1.9, and was omitted from disco stable updates process, + probably because of the patch not cleanly applying and requiring the + backport and infrastructure provided in + cea57789e4081870ac3498fbefabbbd0d0fd8434. + + [Testcase] + + This is difficult to reproduce on test systems, and has instead been + verified on a production NFS v4.1 system in a customer environment. This + server is heavily trafficked and has a large number of different NFS + clients connected to it. + + I have built a test kernel that contains the above patch, and also + patches for Bug 1828978. It is available here: + + https://launchpad.net/~mruffell/+archive/ubuntu/sf241068-test + + Note that the above kernel is for bionic HWE, and not explicitly disco. + + Discussion about the patch validation can be found at the bottom of Bug + 1842037. + + On unpatched kernels, expect to see the symptoms mentioned in Impact, + and on patched systems, everything working as intended. + + [Regression Potential] + + The changes are limited to users of sunrpc and the change itself is + limited to cases where the RPCSEC_GSS credential is rejected. Under blue + skies scenarios, the code should only be triggered when misbehaving + clients do not keep their authentication tickets up to date. + + In case of regression, misbehaving clients may cause outages on services + which use sunrpc. In which case, the server administrator would have to + revert to an earlier kernel in order to restore those services, as you + cannot stop misbehaving clients from connecting. + + Since the fix was selected for upstream stable, it has been vetted and + widely accepted by the community. The backport I performed should have + no functional difference at all. ** Changed in: linux (Ubuntu) Status: Incomplete => Fix Released -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1842037 Title: SUNRPC: Use after free when GSSD credentials are invalid causes oops Status in linux package in Ubuntu: Fix Released Status in linux source package in Disco: In Progress Bug description: BugLink: https://bugs.launchpad.net/bugs/1842037 [Impact] There is a use after free which normally causes a null pointer dereference when NFS clients send invalid credentials via GSSD to a NFS server which has shares protected by kerberos krb5* security. The call trace is below: BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 Call Trace: rpc_check_timeout+0x22/0xe0 [sunrpc] call_decode+0x12c/0x190 [sunrpc] __rpc_execute+0x7a/0x340 [sunrpc] rpc_execute+0xa3/0xb0 [sunrpc] rpc_run_task+0xf7/0x140 [sunrpc] nfs4_proc_get_lease_time+0xf3/0x130 [nfsv4] nfs41_setup_state_renewal+0x3d/0x90 [nfsv4] ? nfs4_realloc_slot_table+0x5b/0x130 [nfsv4] ? nfs4_setup_session_slot_tables+0x77/0xc0 [nfsv4] nfs41_finish_session_reset+0x26/0x30 [nfsv4] nfs41_init_clientid+0x44/0x70 [nfsv4] nfs4_establish_lease+0x61/0xa0 [nfsv4] ? nfs4_handle_reclaim_lease_error+0x108/0x130 [nfsv4] nfs4_state_manager+0x1b1/0x750 [nfsv4] ? kernel_sigaction+0x43/0xe0 nfs4_run_state_manager+0x24/0x40 [nfsv4] kthread+0x120/0x140 ? nfs4_state_manager+0x750/0x750 [nfsv4] ? __kthread_parkme+0x70/0x70 ret_from_fork+0x35/0x40 This then causes outages on a heavily trafficked NFS server and no one can access their shares until the server is rebooted. [Fix] The fix comes in the following two commits: commit cea57789e4081870ac3498fbefabbbd0d0fd8434 Author: Trond Myklebust <trond.mykleb...@hammerspace.com> Date: Sat Mar 9 16:06:47 2019 -0500 Subject: SUNRPC: Clean up commit 7987b694ade8cc465ce10fb3dceaa614f13ceaf3 Author: Trond Myklebust <trond.mykleb...@hammerspace.com> Date: Wed May 29 12:49:52 2019 -0400 Subject: SUNRPC: Fix a use after free when a server rejects the RPCSEC_GSS credential There is a small subtlety to be addressed here. 7987b694ade8cc465ce10fb3dceaa614f13ceaf3 is marked as "Fixes" cea57789e4081870ac3498fbefabbbd0d0fd8434, and cea57789e4081870ac3498fbefabbbd0d0fd8434 is not present in the disco kernel. The code path which triggers the use after free can still be reached without cea57789e4081870ac3498fbefabbbd0d0fd8434 being present, and applying both commits resolves the problem, while still maintaining as little backporting as necessary. Please note, both commits have been backported. cea57789e4081870ac3498fbefabbbd0d0fd8434 required a small change to the final comment, as well as some minor context adjustments in the final hunk. 7987b694ade8cc465ce10fb3dceaa614f13ceaf3 required medium to heavy backporting. The upstream patch uses a switch statement based on error codes, while the disco kernel uses a goto and label type architecture. The commits in the middle were numerous and completely unrelated, so I backported the commit to use goto and labels. Please review this backport a little more closely than you normally do, but it has been tested, and I believe the code to be sound. cea57789e4081870ac3498fbefabbbd0d0fd8434 landed in 5.1 upstream. 7987b694ade8cc465ce10fb3dceaa614f13ceaf3 landed in 5.2 upstream. 7987b694ade8cc465ce10fb3dceaa614f13ceaf3 was also selected for upstream -stable, in 5.1.9, and was omitted from disco stable updates process, probably because of the patch not cleanly applying and requiring the backport and infrastructure provided in cea57789e4081870ac3498fbefabbbd0d0fd8434. [Testcase] This is difficult to reproduce on test systems, and has instead been verified on a production NFS v4.1 system in a customer environment. This server is heavily trafficked and has a large number of different NFS clients connected to it. I have built a test kernel that contains the above patch, and also patches for Bug 1828978. It is available here: https://launchpad.net/~mruffell/+archive/ubuntu/sf241068-test Note that the above kernel is for bionic HWE, and not explicitly disco. Discussion about the patch validation can be found at the bottom of Bug 1842037. On unpatched kernels, expect to see the symptoms mentioned in Impact, and on patched systems, everything working as intended. [Regression Potential] The changes are limited to users of sunrpc and the change itself is limited to cases where the RPCSEC_GSS credential is rejected. Under blue skies scenarios, the code should only be triggered when misbehaving clients do not keep their authentication tickets up to date. In case of regression, misbehaving clients may cause outages on services which use sunrpc. In which case, the server administrator would have to revert to an earlier kernel in order to restore those services, as you cannot stop misbehaving clients from connecting. Since the fix was selected for upstream stable, it has been vetted and widely accepted by the community. The backport I performed should have no functional difference at all. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1842037/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp