On 04.09.2023 11:45, Mark Millard wrote:
On Sep 4, 2023, at 06:09, Alexander Motin <m...@freebsd.org> wrote:
per_txg_dirty_frees_percent is directly related to the delete delays we see 
here.  You are forcing ZFS to commit transactions at every 5% of the dirty ARC 
limit, which is 5% of 10% of memory size.  I haven't looked at that code 
recently, but I guess setting it too low can make ZFS commit transactions too 
often, increasing write inflation for the underlying storage.  I would suggest 
you restore the default and try again.
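
To make the "5% of 10% of memory size" arithmetic concrete, here is a minimal sketch. It takes the 10%-of-memory figure from the message above at face value and ignores any caps or other tunables that may apply on a real system. The function name and the 32 GiB memory size are hypothetical, chosen only for illustration.

```python
def frees_commit_threshold(mem_bytes, dirty_pct=10, frees_pct=5):
    """Return (dirty_limit, threshold) in bytes.

    dirty_limit: dirty data limit, assumed to be dirty_pct% of memory.
    threshold:   frees_pct% of that limit, i.e. the amount of frees
                 after which a transaction commit would be forced.
    Both percentages are assumptions taken from the message, not
    values read from a live system.
    """
    dirty_limit = mem_bytes * dirty_pct // 100
    threshold = dirty_limit * frees_pct // 100
    return dirty_limit, threshold


GiB = 1024 ** 3

# Compare the setting discussed here (5) with the stated default (30)
# on a hypothetical 32 GiB machine.
for pct in (5, 30):
    dirty, thr = frees_commit_threshold(32 * GiB, frees_pct=pct)
    print(f"frees_pct={pct}: dirty limit {dirty / GiB:.2f} GiB, "
          f"threshold {thr / GiB:.3f} GiB")
```

Under these assumptions the threshold is 0.5% of memory with a setting of 5 and 3% of memory with a setting of 30, which is why the lower setting would commit roughly six times as often for the same volume of frees.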

While this machine is different, the original problem was worse than
the issue here: the load average was less than 1 for most of the
parallel bulk build when 30 was used. The fraction of time waiting
was much longer than with 5. If I understand right, a value that is
either too high or too low for a given type of context can lead to
increased elapsed time, and getting it set near optimal is a
non-obvious exploration.

IIRC this limit was modified several times since it was originally implemented. Maybe it could benefit from another look, if the default of 30% is not good. It would be good if generic ZFS issues like this were reported to OpenZFS upstream to be visible to a wider public. Unfortunately I have several other projects I must work on, so if it is not a regression I can't promise I'll take it on right now; anybody else is welcome to.

An overall point for the goal of my activity is: what makes a
good test context for checking if ZFS is again safe to use?
Maybe other tradeoffs make, say, 4 hardware threads more
reasonable than 32.

Thank you for your testing. The best test is one that nobody else runs. It also correlates with the topic of "safe to use", which also depends on what it is used for. :)

--
Alexander Motin
