Hello,
On Sat, Dec 09, 2023 at 03:10:02AM +0300, Vitaliy Makkoveev wrote: > On Sat, Dec 09, 2023 at 12:28:10AM +0100, Alexander Bluhm wrote: > > On Sat, Dec 09, 2023 at 02:07:06AM +0300, Vitaliy Makkoveev wrote: > > > > > SLIST_ENTRY(pflow_softc) sc_next; > > > > > > > > This list is protected by net lock. Can you add an [N] here? > > > > > > > > > > This is not true. The netlock is not taken while export_pflow() called > > > from pf_purge_states(). I privately shared the diff to fix this, but not > > > committed it yet. I will update it and share after committing this diff. > > > > Why not hold the shared net lock while in pf? > > > > Because the export_flow() should be called only if the PFSTATE_PFLOW > flag is set? The netlock should be taken before PF_LOCK, so there is no > reason to always take it in this path only to make optional > export_pflow() call: > > pf_purge_expired_states(const unsigned int limit, const unsigned int collect) > { > /* skip */ > > rw_enter_write(&pf_state_list.pfs_rwl); > PF_LOCK(); > PF_STATE_ENTER_WRITE(); > SLIST_FOREACH(st, &gcl, gc_list) { > if (st->timeout != PFTM_UNLINKED) > pf_remove_state(st); > > pf_free_state(st); > } > > pf_remove_state(struct pf_state *st) > { > PF_ASSERT_LOCKED(); > > /* skip */ > > RBT_REMOVE(pf_state_tree_id, &tree_id, st); > #if NPFLOW > 0 > if (st->state_flags & PFSTATE_PFLOW) > export_pflow(st); > > However, I'm not the pf hacker, I's better to ask sashan@ and dlg@. > dlg@ and I are basically trying to remove all NET_LOCK() operations from pf(4), because we don't want pf(4) to be playing with global NET_LOCK(). all callers to pf(4) should either obtain NET_LOCK() in case they need it. pf(4) should not care about NET_LOCK() at all. That's the ideal situation where we are heading to. I took a closer look at export_pflow() in current. It seems to me the function assumes caller holds a shared NET_LOCK() at least, just to protect consistency of `pflowif_list`. There should be an ASSERT() for this and pf(4) indeed should make sure to call export_pflow() with all locks it needs obtained in right order. Because the ASSERT() was missing the bugs got discovered the hard way. I certainly don't mind to add NET_LOCK() into pf_purge_expired_states() to keep export_pflow() happy. on the other hand if there is a way to implement pflowif_list as lock-less (or move it ouf of NET_LOCK() scope), then this is a preferred way forward. thanks and regards sashan