On Wed, Oct 04, 2023 at 11:03:27PM +0300, Vitaliy Makkoveev wrote: > On Wed, Oct 04, 2023 at 09:13:59PM +0200, Alexander Bluhm wrote: > > On Wed, Oct 04, 2023 at 08:42:48PM +0200, Kirill Miazine wrote: > > > > If it happns again, could you send an 'ps axlww | grep ifconifg' > > > > output? Then we see the wait channel where it hangs in the kernel. > > > > > > > > $ ps axlww > > > > UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TT TIME > > > > COMMAND > > > > > > Here it happened again: > > > > > > 0 75339 23922 0 10 0 360 296 wg_ifq D+U p0 0:00.00 > > > ifconfig wg1 destroy > > > > wg_peer_destroy() > > ... > > NET_LOCK(); > > while (!ifq_empty(&sc->sc_if.if_snd)) { > > NET_UNLOCK(); > > tsleep_nsec(sc, PWAIT, "wg_ifq", 1000); > > NET_LOCK(); > > } > > NET_UNLOCK(); > > > > This net lock dance looks fishy. And the sleep has a timeout of 1 > > milli second. But that is may be per packet. So if you have a > > long queue or the queue refills somehow, it will take forever. > > > > I think the difference in the usage is constant traffic that keeps > > the send queue full. The timeout hides the problem when there are > > only a few packets. > > > > This should ensure wg_qstart() will not dereference dying `peer'. Looks > crappy and potentially could block forever, but should work. However > netlock it unnecessary here. netlocked wg_output() could fill `if_snd' > while netlock released before tsleep(), so it serializes nothing but > stops packets processing. > > Kirill, does this diff help?
I doubt that it changes much. When netlock is not taken, the queue can still be filled with packets. Removing this ugly netlock makes sense anyway. But without any synchronisation just reading a variable feels wrong. Can we add a read once like for mq_len in sys/mbuf.h? And the ifq_set_maxlen() also looks very unsafe. For mbuf queues I added a mutex, interface queues should do the same. ok? bluhm Index: net/ifq.c =================================================================== RCS file: /data/mirror/openbsd/cvs/src/sys/net/ifq.c,v retrieving revision 1.50 diff -u -p -r1.50 ifq.c --- net/ifq.c 30 Jul 2023 05:39:52 -0000 1.50 +++ net/ifq.c 4 Oct 2023 21:04:20 -0000 @@ -529,6 +529,14 @@ ifq_hdatalen(struct ifqueue *ifq) return (len); } +void +ifq_set_maxlen(struct ifqueue *ifq, unsigned int maxlen) +{ + mtx_enter(&ifq->ifq_mtx); + ifq->ifq_maxlen = maxlen; + mtx_leave(&ifq->ifq_mtx); +} + unsigned int ifq_purge(struct ifqueue *ifq) { Index: net/ifq.h =================================================================== RCS file: /data/mirror/openbsd/cvs/src/sys/net/ifq.h,v retrieving revision 1.38 diff -u -p -r1.38 ifq.h --- net/ifq.h 30 Jul 2023 05:39:52 -0000 1.38 +++ net/ifq.h 4 Oct 2023 21:09:04 -0000 @@ -435,6 +435,7 @@ void ifq_deq_commit(struct ifqueue *, void ifq_deq_rollback(struct ifqueue *, struct mbuf *); struct mbuf *ifq_dequeue(struct ifqueue *); int ifq_hdatalen(struct ifqueue *); +void ifq_set_maxlen(struct ifqueue *, unsigned int); void ifq_mfreem(struct ifqueue *, struct mbuf *); void ifq_mfreeml(struct ifqueue *, struct mbuf_list *); unsigned int ifq_purge(struct ifqueue *); @@ -448,9 +449,8 @@ int ifq_deq_sleep(struct ifqueue *, st const char *, volatile unsigned int *, volatile unsigned int *); -#define ifq_len(_ifq) ((_ifq)->ifq_len) -#define ifq_empty(_ifq) (ifq_len(_ifq) == 0) -#define ifq_set_maxlen(_ifq, _l) ((_ifq)->ifq_maxlen = (_l)) +#define ifq_len(_ifq) READ_ONCE((_ifq)->ifq_len) +#define ifq_empty(_ifq) (ifq_len(_ifq) == 0) static inline int ifq_is_priq(struct ifqueue *ifq) @@ -490,8 +490,8 @@ int ifiq_input(struct ifiqueue *, stru int ifiq_enqueue(struct ifiqueue *, struct mbuf *); void ifiq_add_data(struct ifiqueue *, struct if_data *); -#define ifiq_len(_ifiq) ml_len(&(_ifiq)->ifiq_ml) -#define ifiq_empty(_ifiq) ml_empty(&(_ifiq)->ifiq_ml) +#define ifiq_len(_ifiq) READ_ONCE(ml_len(&(_ifiq)->ifiq_ml)) +#define ifiq_empty(_ifiq) (ifiq_len(_ifiq) == 0) #endif /* _KERNEL */