On 01/12/2021 19:13, Paul Vixie wrote:
Miroslav Lachman wrote on 2021-12-01 08:52:
On 01/12/2021 17:17, John Doherty via freebsd-virtualization wrote:
...
I am sorry for hijacking this thread but your information is very
interesting. I was playing with VMs in VirtualBox and Bhyve and
compared performance with increasing vCPU count. The more cores a VM
got, the slower even a simple single-threaded task became, such as
loading PF rules from /etc/pf.conf. It was tested on FreeBSD 11.4 and
12.2, and I tested both the ULE and 4BSD schedulers. Maybe it was
somewhat HW related, but VMs with more than 2 vCPUs were always
significantly slower. VMs with 6+ vCPUs were almost unusable (loading
the PF ruleset took about 8 seconds instead of a fraction of a second
on a single-vCPU VM).
...
loading a PF ruleset requires a fair bit of locking and unlocking of
kernel data structures for each system call, per rule. while pfctl is
single threaded, the acquisition process of those kernel locks probably
requires a memory buffer flush to guarantee atomicity, and the lock's
domain may overlap with other non-PF kernel activities that different
hypervisors virtualize differently.
this makes loading a PF ruleset a poor benchmark for hypervisors, unless
that activity is so common that the unusable slowness is interfering
with other work. it could be debugged or optimized in that case, but how
often do you really need to add a PF ruleset?
I don't treat pfctl as a benchmark. The whole case started with bad
webserver performance (Apache + PHP), so I added 2 more vCPUs and the
problem got even worse: performance dropped further. It was so slow
that on reboot I thought it had frozen while loading PF rules. That's
why I use it as an example: it was the most visible symptom. Loading
the rules was not the only problem. I can live with 10 seconds of
loading PF rules, but not with the bad webserver performance.
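For anyone who wants to repeat the comparison, here is a minimal sketch
of how the load time could be measured on guests with different vCPU
counts. It is not the script used at the time; time_command is a
hypothetical helper name, and it assumes Python 3 is installed in the
guest.

```python
import subprocess
import time


def time_command(cmd, runs=5):
    """Run cmd `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command exits non-zero
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations
```

Calling time_command(["pfctl", "-f", "/etc/pf.conf"]) as root on
otherwise identical guests with 1, 2, 4 and 6 vCPUs, and comparing the
median of each list, should show whether the slowdown grows with the
vCPU count.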
This is an old case. It ended with the client migrating to another
service provider that uses jails instead of bhyve, with no performance
problems on a similar HW setup.
Miroslav Lachman