Hi Linus and folks,

I've been developing a tool that detects possible deadlocks by tracking
waits and events rather than lock acquisition order, so that it can
cover all synchronization mechanisms. It is based on the v5.17-rc1 tag.

https://github.com/lgebyungchulpark/linux-dept/commits/dept1.14_on_v5.17-rc1

Benefits:

        0. Works with all lock primitives.
        1. Works with wait_for_completion()/complete().
        2. Works with 'wait' on PG_locked.
        3. Works with 'wait' on PG_writeback.
        4. Works with swait/wakeup.
        5. Works with waitqueue.
        6. Multiple reports are allowed.
        7. Deduplication control on multiple reports.
        8. Withstands false positives thanks to 6.
        9. Easy to tag any wait/event.

Future work:

        0. To make it more stable.
        1. To separate Dept from Lockdep.
        2. To improve performance in terms of time and space.
        3. To use Dept as a dependency engine for Lockdep.
        4. To add any missing tags of wait/event in the kernel.
        5. To deduplicate stack traces.

How to interpret reports:

        1. In each context, E (the event) cannot be triggered because of
           the W (wait) in that context that cannot be woken.
        2. The stack traces that help locate the problematic code are in
           each context's detail.
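
To make the W/E reading concrete, here is a minimal userspace sketch of
the idea, not Dept code. It assumes a toy model in which each class (a
lock, a completion, ...) is a small integer, and an edge "E -> W" means
"event E cannot be triggered until wait W has been woken". All toy_*
names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_CLASSES 8

/*
 * Toy dependency graph (illustration only, not the Dept data structure).
 * dep[e][w] == true means: event e is stuck behind wait w.
 */
struct toy_graph {
	bool dep[TOY_MAX_CLASSES][TOY_MAX_CLASSES];
};

static void toy_init(struct toy_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: can 'to' be reached from 'from' along edges? */
static bool toy_reachable(const struct toy_graph *g, int from, int to,
			  bool *seen)
{
	if (from == to)
		return true;

	seen[from] = true;
	for (int n = 0; n < TOY_MAX_CLASSES; n++) {
		if (g->dep[from][n] && !seen[n] &&
		    toy_reachable(g, n, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "event ev is stuck behind wait w".
 * Returns true if the new edge closes a cycle, i.e. a possible deadlock.
 */
static bool toy_add_dep(struct toy_graph *g, int ev, int w)
{
	bool seen[TOY_MAX_CLASSES] = { false };

	assert(ev >= 0 && ev < TOY_MAX_CLASSES);
	assert(w >= 0 && w < TOY_MAX_CLASSES);

	g->dep[ev][w] = true;
	return toy_reachable(g, w, ev, seen);
}
```

For example, let class 0 be mutex A and class 1 be completion C. If one
context holds A while waiting for C, releasing A is stuck behind the
wait on C, so toy_add_dep(&g, 0, 1) is recorded and returns false. If
another context must acquire A before it can complete C, then
toy_add_dep(&g, 1, 0) closes the cycle and returns true. Tracking only
lock acquisition order would not see this, because C is not a lock.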

Thanks,
Byungchul

---

Changes from v3:

        1. Dept shouldn't create dependencies between different depths
           of a class that were indicated by *_lock_nested(). Dept
           normally doesn't, but it did once another lock class came
           in. Fixed. (feedback from Hyeonggon)
        2. Dept treated a wait as a real wait once the task reached
           __schedule(), even if a wake-up source had already set the
           task to TASK_RUNNING. Fixed so that Dept no longer treats
           this case as a real wait. (feedback from Jan Kara)
        3. Stop tracking dependencies with a map once the event
           associated with the map has been handled. Dept starts
           working with the map again on the next sleep.
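
Item 2 above can be pictured with the following toy model. It is a
userspace sketch under simplifying assumptions, not the actual change
to __schedule(): the toy_* types, the two-value state, and the
real_waits counter are all hypothetical and only model the decision
"count the wait as real only if the task is still not runnable when it
reaches the scheduler".

```c
#include <stdbool.h>

/* Toy stand-ins for task states; only "runnable or not" matters here. */
enum toy_state { TOY_RUNNING, TOY_SLEEPING };

struct toy_task {
	enum toy_state state;
	bool wait_staged;	/* a wait was announced before scheduling */
	int real_waits;		/* waits the tracker treated as real */
};

/* The task announces a wait and marks itself as about to sleep. */
static void toy_prepare_wait(struct toy_task *t)
{
	t->state = TOY_SLEEPING;
	t->wait_staged = true;
}

/* A wake-up source makes the task runnable again. */
static void toy_wake(struct toy_task *t)
{
	t->state = TOY_RUNNING;
}

/*
 * Toy scheduler entry: commit the staged wait only if the task is
 * still not runnable. A task that was woken in advance does not sleep,
 * so its staged wait is dropped instead of being counted.
 */
static void toy_schedule(struct toy_task *t)
{
	if (t->wait_staged && t->state != TOY_RUNNING)
		t->real_waits++;

	t->wait_staged = false;
	t->state = TOY_RUNNING;	/* model: the task eventually resumes */
}
```

In this model, toy_prepare_wait() followed by toy_wake() and then
toy_schedule() leaves real_waits unchanged, while toy_prepare_wait()
followed directly by toy_schedule() increments it.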

Changes from v2:

        1. Disable Dept on bit_wait_table[] in sched/wait_bit.c, which
           was reporting a lot of false positives; that was my fault.
           Wait/event for bit_wait_table[] should have been tagged in a
           higher layer to work properly, which is left as future work.
           (feedback from Jan Kara)
        2. Disable Dept on crypto_larval's completion to prevent a false
           positive.

Changes from v1:

        1. Fix coding style and typos. (feedback from Steven)
        2. Distinguish each work context from another in workqueue.
        3. Skip checking lock acquisition with nest_lock, which is about
           correct lock usage that should be checked by Lockdep.

Changes from RFC:

        1. Add the wait tag at __schedule() rather than at
           prepare_to_wait().
           (feedback from Linus and Matthew)
        2. Use the try version for the lockdep_acquire_cpus_lock()
           annotation.
        3. Distinguish each syscall context from another.

Byungchul Park (24):
  llist: Move llist_{head,node} definition to types.h
  dept: Implement Dept(Dependency Tracker)
  dept: Embed Dept data in Lockdep
  dept: Add a API for skipping dependency check temporarily
  dept: Apply Dept to spinlock
  dept: Apply Dept to mutex families
  dept: Apply Dept to rwlock
  dept: Apply Dept to wait_for_completion()/complete()
  dept: Apply Dept to seqlock
  dept: Apply Dept to rwsem
  dept: Add proc knobs to show stats and dependency graph
  dept: Introduce split map concept and new APIs for them
  dept: Apply Dept to wait/event of PG_{locked,writeback}
  dept: Apply SDT to swait
  dept: Apply SDT to wait(waitqueue)
  locking/lockdep, cpu/hotplus: Use a weaker annotation in AP thread
  dept: Distinguish each syscall context from another
  dept: Distinguish each work from another
  dept: Disable Dept within the wait_bit layer by default
  dept: Add nocheck version of init_completion()
  dept: Disable Dept on struct crypto_larval's completion for now
  dept: Don't create dependencies between different depths in any case
  dept: Let it work with real sleeps in __schedule()
  dept: Disable Dept on that map once it's been handled until next turn

 crypto/api.c                       |    7 +-
 include/linux/completion.h         |   50 +-
 include/linux/dept.h               |  535 +++++++
 include/linux/dept_page.h          |   78 ++
 include/linux/dept_sdt.h           |   62 +
 include/linux/hardirq.h            |    3 +
 include/linux/irqflags.h           |   33 +-
 include/linux/llist.h              |    8 -
 include/linux/lockdep.h            |  158 ++-
 include/linux/lockdep_types.h      |    3 +
 include/linux/mutex.h              |   33 +
 include/linux/page-flags.h         |   45 +-
 include/linux/pagemap.h            |    7 +-
 include/linux/percpu-rwsem.h       |   10 +-
 include/linux/rtmutex.h            |    7 +
 include/linux/rwlock.h             |   52 +
 include/linux/rwlock_api_smp.h     |    8 +-
 include/linux/rwlock_types.h       |    7 +
 include/linux/rwsem.h              |   33 +
 include/linux/sched.h              |    7 +
 include/linux/seqlock.h            |   59 +-
 include/linux/spinlock.h           |   26 +
 include/linux/spinlock_types_raw.h |   13 +
 include/linux/swait.h              |    4 +
 include/linux/types.h              |    8 +
 include/linux/wait.h               |    6 +-
 init/init_task.c                   |    2 +
 init/main.c                        |    4 +
 kernel/Makefile                    |    1 +
 kernel/cpu.c                       |    2 +-
 kernel/dependency/Makefile         |    4 +
 kernel/dependency/dept.c           | 2716 ++++++++++++++++++++++++++++++++++++
 kernel/dependency/dept_hash.h      |   10 +
 kernel/dependency/dept_internal.h  |   26 +
 kernel/dependency/dept_object.h    |   13 +
 kernel/dependency/dept_proc.c      |   92 ++
 kernel/entry/common.c              |    3 +
 kernel/exit.c                      |    1 +
 kernel/fork.c                      |    2 +
 kernel/locking/lockdep.c           |   12 +-
 kernel/module.c                    |    2 +
 kernel/sched/completion.c          |   12 +-
 kernel/sched/core.c                |    8 +
 kernel/sched/swait.c               |   10 +
 kernel/sched/wait.c                |   16 +
 kernel/sched/wait_bit.c            |    5 +-
 kernel/softirq.c                   |    6 +-
 kernel/trace/trace_preemptirq.c    |   19 +-
 kernel/workqueue.c                 |    3 +
 lib/Kconfig.debug                  |   21 +
 mm/filemap.c                       |   68 +
 mm/page_ext.c                      |    5 +
 52 files changed, 4266 insertions(+), 59 deletions(-)
 create mode 100644 include/linux/dept.h
 create mode 100644 include/linux/dept_page.h
 create mode 100644 include/linux/dept_sdt.h
 create mode 100644 kernel/dependency/Makefile
 create mode 100644 kernel/dependency/dept.c
 create mode 100644 kernel/dependency/dept_hash.h
 create mode 100644 kernel/dependency/dept_internal.h
 create mode 100644 kernel/dependency/dept_object.h
 create mode 100644 kernel/dependency/dept_proc.c

-- 
1.9.1
