https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=283903
--- Comment #25 from Guillaume Outters <guillaume-free...@outters.eu> ---
(In reply to Bjoern A. Zeeb from comment #16)

With your DTrace script I got a flagrant difference between "before" and "after" the tipping point.

--------------------------------
-- The paths to allocation and freeing

There are 2 paths to linuxkpi_alloc_skb, let's call them A as in Alloc:
- A1 lkpi_80211_txq_task (explicitly Tx)
- A2 lkpi_napi_task > rtw_pci_napi_poll (which would be the Rx path?)

And 3 paths to linuxkpi_kfree_skb (D for Dealloc):
- D1 linux_work_fn > rtw_c2h_work > rtw_fw_c2h_cmd_handle > rtw_tx_report_handle > linuxkpi_ieee80211_tx_status
  Note that there is a "branch" at rtw_c2h_work, with:
  D1.1 rtw_c2h_work+0x62 leading to the full path above, at the tail of which linuxkpi_kfree_skb is called (by linuxkpi_ieee80211_tx_status)
  D1.2 rtw_c2h_work+0x6a does a direct call to linuxkpi_kfree_skb
- D2 lkpi_napi_task > rtw_pci_napi_poll > linuxkpi_ieee80211_rx (explicitly Rx)
- D3 [softclock_thread > softclock_call_cc] > rtw_tx_report_purge_timer
  Contrary to all of the above, which start with [taskqueue_thread_loop > taskqueue_run_locked], this one's name suggests it is called periodically.

--------------------------------
-- My tests

I did a first test just after the reboot (I don't remember if it was mostly Rx or Tx), with interesting results (see (2) below for the full result):
- 283 allocs through A1 got freed by D1.1
- 1144 allocs through A2, of which:
  - 861 freed by D2
  - 283 freed by D1.2
Note how the 283 matches: there's an interesting mix of allocations handled by the "other side"'s dealloc (and due to this mix I couldn't say for sure that 1 is Tx and 2 is Rx).

Now, after some time running, with vmstat -m starting to show an increase in skb memory consumption, I had totally different paths, be it in Tx or Rx (see the details in ):
- during the transfer I got 17870 A2, some of which got freed by a D2.
  But I had some A1 too, with: A2 = A1 + D2
  This is **really surprising**, as I expected a balanced A1 + A2 = D2 (the sum of allocations = the sum of deallocations).
  The increase in vmstat was exactly 2 * A1 * 4 KB.
  This explains it very well: to achieve balance we should have freed as many as we had allocated: (A1 + A2) - D2 = 0, with an expected D2 = A1 + A2. However, here D2 = A2 - A1, thus our balance is A1 + A2 - D2 = A1 + A2 - (A2 - A1) = 2 * A1.
- when waiting after the transfer, apart from some negligible new allocations in this mode (A2 = A1 + D2), we see a **new process** (D3, with a rtw_tx_report_purge_timer on softclock_thread) running to free a bit of memory; although it seems to be dedicated to garbage collecting, **it doesn't keep up with the pace**.

--------------------------------
-- My god!

It looks like SOME SKBUFFERS GET **ALLOCATED** INSTEAD OF **FREED**.

--------------------------------
Notes

--- (1) Test procedure

vms() { vmstat -m | egrep -e '(Use|lkpi|mbuf)' ; }
dt() { sudo dtrace -s rtw88-skb.d | egrep -v 'kernel.0xffff|fork_exit|taskqueue_thread_loop|softclock_thread' ; }
for p in "rx b:/tmp/1 /tmp/" "tx /tmp/1 b:/tmp/" # /tmp/1 contains the first 10000000 bytes of a gzip file.
do
    set -- $p
    {
        dt $1.during &
        vms
        echo "# scp ($1) of a 10 MB file"
        time scp $2 $3 2>&1
        sudo killall dtrace
        vms
        dt $1.after &
        echo "# Sleep 20"
        sleep 20
        sudo killall dtrace
        vms
    } | tee rtw88-skb.results.pourri.$1
done

--- (2) Before

  linuxkpi_alloc_skb
      kernel`linuxkpi_dev_alloc_skb+0xd
      kernel`lkpi_80211_txq_task+0x1ec
      kernel`taskqueue_run_locked+0x182
      kernel`taskqueue_thread_loop+0xc2
    283

  linuxkpi_kfree_skb
      kernel`linuxkpi_ieee80211_tx_status_ext+0x163
      kernel`linuxkpi_ieee80211_tx_status+0x45
      if_rtw88.ko`rtw_tx_report_handle+0x136
      if_rtw88.ko`rtw_fw_c2h_cmd_handle+0x15a
      if_rtw88.ko`rtw_c2h_work+0x62
      kernel`linux_work_fn+0xe4
      kernel`taskqueue_run_locked+0x182
      kernel`taskqueue_thread_loop+0xc2
    283

  linuxkpi_alloc_skb
      kernel`linuxkpi_dev_alloc_skb+0xd
      if_rtw88.ko`rtw_pci_napi_poll+0x254
      kernel`lkpi_napi_task+0xf
      kernel`taskqueue_run_locked+0x182
      kernel`taskqueue_thread_loop+0xc2
    1144

  linuxkpi_kfree_skb
      if_rtw88.ko`rtw_c2h_work+0x6a
      kernel`linux_work_fn+0xe4
      kernel`taskqueue_run_locked+0x182
      kernel`taskqueue_thread_loop+0xc2
    283

  linuxkpi_kfree_skb
      kernel`linuxkpi_ieee80211_rx+0x5a3
      if_rtw88.ko`rtw_pci_napi_poll+0x31b
      kernel`lkpi_napi_task+0xf
      kernel`taskqueue_run_locked+0x182
      kernel`taskqueue_thread_loop+0xc2
    861
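
--- Balance check (sketch)

To double-check the bookkeeping in "My tests", the alloc/free balance can be recomputed with a few lines of shell. This is only a rough sketch: skb_balance is a hypothetical helper (it is not part of the test procedure in note (1)), and the A1 value used for the leaking case is a made-up example, since only the A2 count (17870) is quoted above. The 4 KB per skb comes from the observed vmstat increase.

```shell
#!/bin/sh
# Hypothetical helper: net number of skbs still outstanding.
# Arguments: allocation counts, then "--", then free counts.
skb_balance() {
    total=0
    sign=1
    for n in "$@"; do
        if [ "$n" = "--" ]; then
            sign=-1          # everything after "--" is a free count
            continue
        fi
        total=$((total + sign * n))
    done
    echo "$total"
}

# First test (just after reboot), counts taken from note (2):
# allocated A1=283, A2=1144; freed D1.1=283, D1.2=283, D2=861.
skb_balance 283 1144 -- 283 283 861     # prints 0: balanced

# Leaking regime: D2 = A2 - A1, so A1 + A2 - D2 = 2 * A1.
a1=100                                  # assumption: example value only
a2=17870                                # A2 count seen during the transfer
d2=$((a2 - a1))
leaked=$(skb_balance "$a1" "$a2" -- "$d2")
echo "$leaked skbs outstanding"         # prints 200 skbs outstanding
echo "$((leaked * 4)) KB"               # prints 800 KB (2 * A1 * 4 KB)
```

With the counts of the first test the balance is 0; in the leaking regime it comes out as 2 * A1 whatever the value of A2, which is the growth seen in vmstat -m.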