On 07.04.2025 04:39, Jarkko Sakkinen wrote:
> From: Jarkko Sakkinen <jarkko.sakki...@opinsys.com>
>
> Add an isolated list of unreferenced keys to be queued for deletion, and
> try to pin the keys in the garbage collector before processing anything.
> Skip unpinnable keys.
>
> Use this list for blocking the reaping process during the teardown:
>
> 1. First off, the keys added to `keys_graveyard` are snapshotted, and the
>     list is flushed. This the very last step in `key_put()`.
> 2. `key_put()` reaches zero. This will mark key as busy for the garbage
>     collector.
> 3. `key_garbage_collector()` will try to increase refcount, which won't go
>     above zero. Whenever this happens, the key will be skipped.
>
> Cc: sta...@vger.kernel.org # v6.1+
> Signed-off-by: Jarkko Sakkinen <jarkko.sakki...@opinsys.com>

This patch landed in today's linux-next as commit b0d023797e3e ("keys: 
Add a list for unreferenced keys"). In my tests I found that it triggers 
the following lockdep issue:

================================
WARNING: inconsistent lock state
6.15.0-rc1-next-20250407 #15630 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
ksoftirqd/3/32 [HC0[0]:SC1[1]:HE1:SE0] takes:
c13fdd68 (key_serial_lock){+.?.}-{2:2}, at: key_put+0x74/0x128
{SOFTIRQ-ON-W} state was registered at:
   lock_acquire+0x134/0x384
   _raw_spin_lock+0x38/0x48
   key_alloc+0x2fc/0x4d8
   keyring_alloc+0x40/0x90
   system_trusted_keyring_init+0x50/0x7c
   do_one_initcall+0x68/0x314
   kernel_init_freeable+0x1c0/0x224
   kernel_init+0x1c/0x12c
   ret_from_fork+0x14/0x28
irq event stamp: 234
hardirqs last  enabled at (234): [<c0cb7060>] 
_raw_spin_unlock_irqrestore+0x5c/0x60
hardirqs last disabled at (233): [<c0cb6dd0>] 
_raw_spin_lock_irqsave+0x64/0x68
softirqs last  enabled at (42): [<c013bcd8>] handle_softirqs+0x328/0x520
softirqs last disabled at (47): [<c013bf10>] run_ksoftirqd+0x40/0x68

other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(key_serial_lock);
   <Interrupt>
     lock(key_serial_lock);

  *** DEADLOCK ***

1 lock held by ksoftirqd/3/32:
  #0: c137c040 (rcu_callback){....}-{0:0}, at: rcu_core+0x4d0/0x14a4

stack backtrace:
CPU: 3 UID: 0 PID: 32 Comm: ksoftirqd/3 Not tainted 
6.15.0-rc1-next-20250407 #15630 PREEMPT
Hardware name: Samsung Exynos (Flattened Device Tree)
Call trace:
  unwind_backtrace from show_stack+0x10/0x14
  show_stack from dump_stack_lvl+0x68/0x88
  dump_stack_lvl from print_usage_bug.part.0+0x24c/0x270
  print_usage_bug.part.0 from mark_lock.part.0+0xc20/0x129c
  mark_lock.part.0 from __lock_acquire+0xafc/0x2970
  __lock_acquire from lock_acquire+0x134/0x384
  lock_acquire from _raw_spin_lock+0x38/0x48
  _raw_spin_lock from key_put+0x74/0x128
  key_put from put_cred_rcu+0x20/0xd0
  put_cred_rcu from rcu_core+0x478/0x14a4
  rcu_core from handle_softirqs+0x130/0x520
  handle_softirqs from run_ksoftirqd+0x40/0x68
  run_ksoftirqd from smpboot_thread_fn+0x17c/0x330
  smpboot_thread_fn from kthread+0x138/0x25c
  kthread from ret_from_fork+0x14/0x28
Exception stack(0xf090dfb0 to 0xf090dff8)
...

To fix this issue I had to change all calls around key_serial_lock and 
key_graveyard_lock spinlocks with the irqsave/irqrestore variants (in 
security/keys/key.c and security/keys/gc.c), but I'm not sure if this is 
desired solution.

> ---
> v7:
> - Fixed multiple definitions (from rebasing, sorry).
> v6:
> - Rebase went wrong in v5.
> v5:
> - Rebased on top of v6.15-rc
> - Updated commit message to explain how spin lock and refcount
>    isolate the time window in key_put().
> v4:
> - Pin the key while processing key type teardown. Skip dead keys.
> - Revert key_gc_graveyard back key_gc_unused_keys.
> - Rewrote the commit message.
> - "unsigned long flags" declaration somehow did make to the previous
>    patch (sorry).
> v3:
> - Using spin_lock() fails since key_put() is executed inside IRQs.
>    Using spin_lock_irqsave() would neither work given the lock is
>    acquired for /proc/keys. Therefore, separate the lock for
>    graveyard and key_graveyard before reaping key_serial_tree.
> v2:
> - Rename key_gc_unused_keys as key_gc_graveyard, and re-document the
>    function.
> ---
>   include/linux/key.h      |  7 ++-----
>   security/keys/gc.c       | 40 ++++++++++++++++++++++++----------------
>   security/keys/internal.h |  5 +++++
>   security/keys/key.c      |  7 +++++--
>   4 files changed, 36 insertions(+), 23 deletions(-)
>
> diff --git a/include/linux/key.h b/include/linux/key.h
> index ba05de8579ec..c50659184bdf 100644
> --- a/include/linux/key.h
> +++ b/include/linux/key.h
> @@ -195,10 +195,8 @@ enum key_state {
>   struct key {
>       refcount_t              usage;          /* number of references */
>       key_serial_t            serial;         /* key serial number */
> -     union {
> -             struct list_head graveyard_link;
> -             struct rb_node  serial_node;
> -     };
> +     struct list_head        graveyard_link; /* key->usage == 0 */
> +     struct rb_node          serial_node;
>   #ifdef CONFIG_KEY_NOTIFICATIONS
>       struct watch_list       *watchers;      /* Entities watching this key 
> for changes */
>   #endif
> @@ -236,7 +234,6 @@ struct key {
>   #define KEY_FLAG_ROOT_CAN_INVAL     7       /* set if key can be 
> invalidated by root without permission */
>   #define KEY_FLAG_KEEP               8       /* set if key should not be 
> removed */
>   #define KEY_FLAG_UID_KEYRING        9       /* set if key is a user or user 
> session keyring */
> -#define KEY_FLAG_FINAL_PUT   10      /* set if final put has happened on key 
> */
>   
>       /* the key type and key description string
>        * - the desc is used to match a key against search criteria
> diff --git a/security/keys/gc.c b/security/keys/gc.c
> index f27223ea4578..0a3beb68633c 100644
> --- a/security/keys/gc.c
> +++ b/security/keys/gc.c
> @@ -189,6 +189,7 @@ static void key_garbage_collector(struct work_struct 
> *work)
>       struct rb_node *cursor;
>       struct key *key;
>       time64_t new_timer, limit, expiry;
> +     unsigned long flags;
>   
>       kenter("[%lx,%x]", key_gc_flags, gc_state);
>   
> @@ -206,21 +207,35 @@ static void key_garbage_collector(struct work_struct 
> *work)
>   
>       new_timer = TIME64_MAX;
>   
> +     spin_lock_irqsave(&key_graveyard_lock, flags);
> +     list_splice_init(&key_graveyard, &graveyard);
> +     spin_unlock_irqrestore(&key_graveyard_lock, flags);
> +
> +     list_for_each_entry(key, &graveyard, graveyard_link) {
> +             spin_lock(&key_serial_lock);
> +             kdebug("unrefd key %d", key->serial);
> +             rb_erase(&key->serial_node, &key_serial_tree);
> +             spin_unlock(&key_serial_lock);
> +     }
> +
>       /* As only this function is permitted to remove things from the key
>        * serial tree, if cursor is non-NULL then it will always point to a
>        * valid node in the tree - even if lock got dropped.
>        */
>       spin_lock(&key_serial_lock);
> +     key = NULL;
>       cursor = rb_first(&key_serial_tree);
>   
>   continue_scanning:
> +     key_put(key);
>       while (cursor) {
>               key = rb_entry(cursor, struct key, serial_node);
>               cursor = rb_next(cursor);
> -
> -             if (test_bit(KEY_FLAG_FINAL_PUT, &key->flags)) {
> -                     smp_mb(); /* Clobber key->user after FINAL_PUT seen. */
> -                     goto found_unreferenced_key;
> +             /* key_get(), unless zero: */
> +             if (!refcount_inc_not_zero(&key->usage)) {
> +                     key = NULL;
> +                     gc_state |= KEY_GC_REAP_AGAIN;
> +                     goto skip_dead_key;
>               }
>   
>               if (unlikely(gc_state & KEY_GC_REAPING_DEAD_1)) {
> @@ -274,6 +289,7 @@ static void key_garbage_collector(struct work_struct 
> *work)
>               spin_lock(&key_serial_lock);
>               goto continue_scanning;
>       }
> +     key_put(key);
>   
>       /* We've completed the pass.  Set the timer if we need to and queue a
>        * new cycle if necessary.  We keep executing cycles until we find one
> @@ -286,6 +302,10 @@ static void key_garbage_collector(struct work_struct 
> *work)
>               key_schedule_gc(new_timer);
>       }
>   
> +     spin_lock(&key_graveyard_lock);
> +     list_splice_init(&key_graveyard, &graveyard);
> +     spin_unlock(&key_graveyard_lock);
> +
>       if (unlikely(gc_state & KEY_GC_REAPING_DEAD_2) ||
>           !list_empty(&graveyard)) {
>               /* Make sure that all pending keyring payload destructions are
> @@ -328,18 +348,6 @@ static void key_garbage_collector(struct work_struct 
> *work)
>       kleave(" [end %x]", gc_state);
>       return;
>   
> -     /* We found an unreferenced key - once we've removed it from the tree,
> -      * we can safely drop the lock.
> -      */
> -found_unreferenced_key:
> -     kdebug("unrefd key %d", key->serial);
> -     rb_erase(&key->serial_node, &key_serial_tree);
> -     spin_unlock(&key_serial_lock);
> -
> -     list_add_tail(&key->graveyard_link, &graveyard);
> -     gc_state |= KEY_GC_REAP_AGAIN;
> -     goto maybe_resched;
> -
>       /* We found a restricted keyring and need to update the restriction if
>        * it is associated with the dead key type.
>        */
> diff --git a/security/keys/internal.h b/security/keys/internal.h
> index 2cffa6dc8255..4e3d9b322390 100644
> --- a/security/keys/internal.h
> +++ b/security/keys/internal.h
> @@ -63,9 +63,14 @@ struct key_user {
>       int                     qnbytes;        /* number of bytes allocated to 
> this user */
>   };
>   
> +extern struct list_head key_graveyard;
> +extern spinlock_t key_graveyard_lock;
> +
>   extern struct rb_root       key_user_tree;
>   extern spinlock_t   key_user_lock;
>   extern struct key_user      root_key_user;
> +extern struct list_head      key_graveyard;
> +extern spinlock_t    key_graveyard_lock;
>   
>   extern struct key_user *key_user_lookup(kuid_t uid);
>   extern void key_user_put(struct key_user *user);
> diff --git a/security/keys/key.c b/security/keys/key.c
> index 7198cd2ac3a3..7511f2017b6b 100644
> --- a/security/keys/key.c
> +++ b/security/keys/key.c
> @@ -22,6 +22,8 @@ DEFINE_SPINLOCK(key_serial_lock);
>   
>   struct rb_root      key_user_tree; /* tree of quota records indexed by UID 
> */
>   DEFINE_SPINLOCK(key_user_lock);
> +LIST_HEAD(key_graveyard);
> +DEFINE_SPINLOCK(key_graveyard_lock);
>   
>   unsigned int key_quota_root_maxkeys = 1000000;      /* root's key count 
> quota */
>   unsigned int key_quota_root_maxbytes = 25000000; /* root's key space quota 
> */
> @@ -658,8 +660,9 @@ void key_put(struct key *key)
>                               key->user->qnbytes -= key->quotalen;
>                               spin_unlock_irqrestore(&key->user->lock, flags);
>                       }
> -                     smp_mb(); /* key->user before FINAL_PUT set. */
> -                     set_bit(KEY_FLAG_FINAL_PUT, &key->flags);
> +                     spin_lock_irqsave(&key_graveyard_lock, flags);
> +                     list_add_tail(&key->graveyard_link, &key_graveyard);
> +                     spin_unlock_irqrestore(&key_graveyard_lock, flags);
>                       schedule_work(&key_gc_work);
>               }
>       }

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


Reply via email to