On 2025-03-28 10:38, Thomas Lamprecht wrote:
> On 28.03.25 at 09:28, Lukas Wagner wrote:
>> We of course can cache the FQDN, but realistically speaking, this is only
>> called once per notification being sent, thus any real-world performance
>> impact is absolutely tiny.
>
> Not so sure about that in general, e.g. sending out notifications could
> correlate with an overloaded system, and for really overloaded systems
> things that are normally cheap suddenly ain't. E.g., in low-memory
> situations even doing a plain fork+exec of a tiny binary can hang for
> a long time then; socket operations like our helper does are definitely
> less problematic (I think, as I did not evaluate it [0] and that's something
> one can less easily experience directly compared to the former, where even
> starting a new basic dash shell on such an overloaded system can take minutes).

Yes, my 'impact is tiny' was mostly based on already using
PVE::Tools::get_fqdn. For the old fork+exec version you are absolutely right,
this could take very long on a system at its limits.

>
> And as the notification system now also handles things like HA events, it's
> definitely part of the more critical systems which _can_ justify some extra
> scrutiny. That said, switching to the get_fqdn method indeed makes this quite
> cheap to get [0], so I'm fine with not doing any caching here for now, but
> let's not underestimate the impact of such things too much, especially for
> anything in critical chains that can be important in critical (load) situations
> (as a general strategy for all, as I'm really not thinking about you here, and
> it certainly is a balance).

That makes a lot of sense. Thanks for the detailed explanation, highly
appreciated. Actually, I will add caching right away; it's a pretty trivial
change anyway.

>
> [0]: FWIW, I just did a quick evaluation of querying the fqdn 100 000 times
> with the socket variant and the hostname one. This was done on a very healthy
> system though; I'd expect that the fork+exec one degrades a lot worse with
> higher CPU/memory pressure. Test and result:
>
> # perl -wE 'use PVE::Tools; use Time::HiRes qw(gettimeofday tv_interval); my
> $t0 = [gettimeofday]; for(my $i = 0; $i <= 100_000; $i++) { my $fqdn =
> PVE::Tools::get_fqdn("nina"); } my $elapsed = tv_interval ( $t0,
> [gettimeofday]); say "elapsed (s): ". $elapsed;'
> elapsed (s): 0.436712
>
> Same with 1 million runs gets me 4.368217 s, so seems to scale quite linearly.
>
> # perl -wE 'use Time::HiRes qw(gettimeofday tv_interval); my $t0 =
> [gettimeofday]; for(my $i = 0; $i <= 100_000; $i++) { my $fqdn = `hostname
> -f`; } my $elapsed = tv_interval ( $t0, [gettimeofday]); say "elapsed: ".
> $elapsed;'
> elapsed (s): 82.484177
>
> Same with 1 million runs gets me 577.653117 s, so not fully linear, but
> in any case about 188x and 132x slower, respectively.
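
For reference, the caching discussed above could look roughly like the
following. This is only a minimal sketch and not the actual patch: lookup_fqdn
is a hypothetical stand-in for the real lookup (PVE::Tools::get_fqdn in this
thread) that returns a placeholder value, and cached_fqdn, lookup_count and the
%fqdn_cache hash are names made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real lookup (PVE::Tools::get_fqdn in the
# thread). It counts its calls so the effect of the cache is visible.
my $lookups = 0;

sub lookup_fqdn {
    my ($nodename) = @_;
    $lookups++;
    return "$nodename.example.com"; # placeholder, not a real resolution
}

# Per-process cache, keyed by node name (assumed name, illustration only).
my %fqdn_cache;

sub cached_fqdn {
    my ($nodename) = @_;
    # '//=' only runs the lookup if no defined value is cached yet.
    $fqdn_cache{$nodename} //= lookup_fqdn($nodename);
    return $fqdn_cache{$nodename};
}

sub lookup_count { return $lookups; }

# 1000 requests for the same node trigger a single underlying lookup.
cached_fqdn("nina") for 1 .. 1000;
print "lookups performed: ", lookup_count(), "\n";
```

A cache like this never invalidates, so a changed hostname would only be
picked up after the process restarts, and a lookup that returns undef is
retried on the next call.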

Good to know, thanks for backing that up with some concrete data! :)

--
- Lukas

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel