While looking at the back catalog of bugs, I started to use AI
to analyze Bugzilla ID 662.
POSIX mutexes are by default private to the process creating them.
Several places in DPDK use pthread_mutex to protect resources in
shared memory that are accessed by multiple processes (primary and
secondary). These mutexes must be initialized with the
PTHREAD_PROCESS_SHARED attribute for correct synchronization.
Without this attribute, synchronization between processes is
undefined behavior according to POSIX, and may silently fail
on some implementations.
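As an illustration of the required initialization, here is a minimal
sketch using only standard POSIX calls. The helper name
shared_mutex_init() is hypothetical and is not taken from the patches
in this series; each patch may structure its fix differently.

```c
#include <pthread.h>

/*
 * Hypothetical helper (name is an assumption, not from this series):
 * initialize a mutex that lives in memory shared between processes.
 * Returns 0 on success or a positive error number, following the
 * pthread_mutex_init() convention.
 */
static int
shared_mutex_init(pthread_mutex_t *mutex)
{
	pthread_mutexattr_t attr;
	int ret;

	ret = pthread_mutexattr_init(&attr);
	if (ret != 0)
		return ret;

	/* The POSIX default is PTHREAD_PROCESS_PRIVATE. */
	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
	if (ret == 0)
		ret = pthread_mutex_init(mutex, &attr);

	/* The attribute object is not needed after the mutex is created. */
	pthread_mutexattr_destroy(&attr);
	return ret;
}
```

The attribute only has an effect if the pthread_mutex_t itself is placed
in memory that all participating processes map, and only one process
(normally the primary) should perform the initialization.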
This problem was originally reported against the failsafe driver,
but the issue exists in multiple locations. Every place calling
pthread_mutex_init() in DPDK was suspected.
The following table summarizes the analysis:
| Component                                  | Mutex(es)                | In Shared Mem | Multi-process | Needs Fix  |
|--------------------------------------------|--------------------------|---------------|---------------|------------|
| lib/ethdev/ethdev_driver.c                 | flow_ops_mutex           | Yes           | Yes           | YES        |
| drivers/net/failsafe/failsafe.c            | hotplug_mutex            | Yes           | Yes           | YES        |
| drivers/net/atlantic/atl_ethdev.c          | mbox_mutex               | Yes           | Yes           | YES        |
| drivers/net/axgbe/axgbe_ethdev.c           | xpcs_mutex, i2c_mutex,   | Yes           | Yes           | YES        |
|                                            | an_mutex, phy_mutex      |               |               |            |
| drivers/net/bnxt/bnxt_ethdev.c             | flow_lock, def_cp_lock,  | Yes           | Yes           | YES        |
|                                            | health_check_lock,       |               |               |            |
|                                            | err_recovery_lock,       |               |               |            |
|                                            | vfr_start_lock           |               |               |            |
| drivers/net/bnxt/bnxt_txq.c                | txq_lock                 | Yes           | Yes           | YES        |
| drivers/net/bnxt/tf_ulp/bnxt_ulp.c         | bnxt_ulp_mutex           | Yes           | Yes           | YES        |
| drivers/net/bnxt/tf_ulp/bnxt_ulp_tf.c      | flow_db_lock             | Yes           | Yes           | YES        |
| drivers/net/bnxt/tf_ulp/bnxt_ulp_tfc.c     | flow_db_lock             | Yes           | Yes           | YES        |
| drivers/net/hinic/base/hinic_compat.h      | hinic_mutex_init wrapper | Yes           | Yes           | YES        |
| lib/vhost/socket.c                         | conn_mutex, mutex        | No (heap)     | No            | OK         |
| lib/vhost/fd_man.c                         | fd_mutex                 | No (heap)     | No            | OK         |
| drivers/common/cnxk/roc_bphy_cgx.c         | lock                     | TBD           | Unlikely      | TBD        |
| drivers/common/cnxk/roc_dev.c              | sync.mutex               | No            | No            | OK         |
| drivers/net/cnxk/cnxk_rep.c                | repte_msg_proc.mutex     | No            | No            | OK         |
| drivers/net/intel/iavf/iavf_vchnl.c        | event_handler.lock       | No (static)   | No            | OK         |
| drivers/net/intel/ixgbe/base/ixgbe_osdep.h | ixgbe_lock macro         | Yes           | Likely        | Maintainer |
This is sent as an RFC since I obviously don't have the hardware to
retest all these devices. Some of the bugs are in the base code, but
a bug is a bug and should be fixed.
Note: There may be additional drivers with similar issues that
were not analyzed in this series. A grep for pthread_mutex_init
in the source tree shows many potential locations that should
be audited.
Stephen Hemminger (6):
  ethdev: fix flow_ops_mutex for multi-process
  net/failsafe: fix hotplug_mutex for multi-process
  net/atlantic: fix mbox_mutex for multi-process
  net/axgbe: fix mutexes for multi-process
  net/bnxt: fix mutexes for multi-process
  net/hinic: fix mutexes for multi-process

 drivers/net/atlantic/atl_ethdev.c      | 14 +++++++++++++-
 drivers/net/axgbe/axgbe_ethdev.c       | 19 +++++++++++++++----
 drivers/net/bnxt/bnxt_ethdev.c         | 11 ++++++-----
 drivers/net/bnxt/bnxt_txq.c            |  3 ++-
 drivers/net/bnxt/bnxt_util.c           | 13 +++++++++++++
 drivers/net/bnxt/bnxt_util.h           |  2 ++
 drivers/net/bnxt/tf_ulp/bnxt_ulp.c     |  2 +-
 drivers/net/bnxt/tf_ulp/bnxt_ulp_tf.c  |  2 +-
 drivers/net/bnxt/tf_ulp/bnxt_ulp_tfc.c |  2 +-
 drivers/net/failsafe/failsafe.c        | 15 ++++++++++++---
 drivers/net/hinic/base/hinic_compat.h  | 13 ++++++++++++-
 lib/ethdev/ethdev_driver.c             | 18 +++++++++++++++++-
 12 files changed, 95 insertions(+), 19 deletions(-)
--
2.51.0