Re: [pve-devel] [PATCH manager 2/3] pvescheduler: reworking child pid tracking
On 18.11.21 14:28, Dominik Csapak wrote: previously, systemd timers were responsible for running replication jobs. those timers would not restart if the previous one is still running. though trying again while it is running does no harm really, it spams the log with errors about not being able to acquire the correct lock to fix this, we rework the handling of child processes such that we only start one per loop if there is currently none running. for that, introduce the types of forks we do and allow one child process per type (for now, we have 'jobs' and 'replication' as types) Signed-off-by: Dominik Csapak --- PVE/Service/pvescheduler.pm | 42 - 1 file changed, 27 insertions(+), 15 deletions(-) diff --git a/PVE/Service/pvescheduler.pm b/PVE/Service/pvescheduler.pm index d4f73702..466cc599 100755 --- a/PVE/Service/pvescheduler.pm +++ b/PVE/Service/pvescheduler.pm @@ -17,12 +17,16 @@ my $cmdline = [$0, @ARGV]; my %daemon_options = (stop_wait_time => 180, max_workers => 0); my $daemon = __PACKAGE__->new('pvescheduler', $cmdline, %daemon_options); +my @types = qw(replication jobs); + my $finish_jobs = sub { my ($self) = @_; -foreach my $cpid (keys %{$self->{jobs}}) { - my $waitpid = waitpid($cpid, WNOHANG); - if (defined($waitpid) && ($waitpid == $cpid)) { - delete ($self->{jobs}->{$cpid}); +for my $type (@types) { + if (my $cpid = $self->{jobs}->{$type}) { + my $waitpid = waitpid($cpid, WNOHANG); + if (defined($waitpid) && ($waitpid == $cpid)) { + $self->{jobs}->{$type} = undef; + } } } }; @@ -41,7 +45,11 @@ sub run { }; my $fork = sub { - my ($sub) = @_; + my ($type, $sub) = @_; + + # don't fork again if the previous iteration still runs + return if defined($self->{jobs}->{$type}); + my $child = fork(); if (!defined($child)) { die "fork failed: $!\n"; @@ -56,16 +64,16 @@ sub run { POSIX::_exit(0); } - $jobs->{$child} = 1; + $jobs->{$type} = $child; }; my $run_jobs = sub { - $fork->(sub { + $fork->('replication', sub { PVE::API2::Replication::run_jobs(undef, sub {}, 0, 1); }); - $fork->(sub { + $fork->('jobs', sub { PVE::Jobs::run_jobs(); }); }; @@ -92,14 +100,16 @@ sub run { } } -# jobs have a lock timeout of 60s, wait a bit more for graceful termination +# replication jobs have a lock timeout of 60s, wait a bit more for graceful termination my $timeout = 0; -while (keys %$jobs > 0 && $timeout < 75) { - kill 'TERM', keys %$jobs; - $timeout += sleep(5); +for my $type (@types) { + while (defined($jobs->{$type}) && $timeout < 75) { + kill 'TERM', $jobs->{$type}; + $timeout += sleep(5); + } + # ensure the rest gets stopped + kill 'KILL', $jobs->{$type} if defined($jobs->{$type}); } -# ensure the rest gets stopped -kill 'KILL', keys %$jobs if (keys %$jobs > 0); } Nit: this changes the behavior a bit, as it can happen that the timeout is "used up" for one job type and all following ones only get the KILL singal. Because of the code below, each child still gets at least one TERM, so not a big deal. sub shutdown { @@ -107,7 +117,9 @@ sub shutdown { syslog('info', 'got shutdown request, signal running jobs to stop'); -kill 'TERM', keys %{$self->{jobs}}; +for my $type (@types) { + kill 'TERM', $self->{jobs}->{$type} if $self->{jobs}->{$type}; +} $self->{shutdown_request} = 1; } ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH qemu-server 1/2] migrate: improve start STDIN-parameter parsing
only do the compat fallback if no explicit spice ticket was given, and warn on unknown parameters on STDIN. Signed-off-by: Fabian Grünbichler --- PVE/API2/Qemu.pm | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm index 6830009..8b834bb 100644 --- a/PVE/API2/Qemu.pm +++ b/PVE/API2/Qemu.pm @@ -2328,9 +2328,11 @@ __PACKAGE__->register_method({ $nbd_protocol_version = $1; } elsif ($line =~ m/^replicated_volume: (.*)$/) { $replicated_volumes->{$1} = 1; - } else { + } elsif (!$spice_ticket) { # fallback for old source node $spice_ticket = $line; + } else { + warn "unknown 'start' parameter on STDIN: '$line'\n"; } } } -- 2.30.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH qemu-server 2/2] migrate: send updated tpmstate volid to target node
and update the in-memory config for starting the target VM accordingly. this possibly breaks migration new -> old iff - spice is not used (else the explicit ticket wins because it comes later) - a local TPM state volume is used - that local TPM state volume has a different volume id on the target node (switched storage, volname already taken, ..) because the target node will then mis-interpret the tpmstate0 line as spice ticket and set it accordingly. if the old tpm state volume ID does not exist on the target node, migration will fail. if it exists by chance, it might work albeit with a wrong spice ticket (new because of this patch) and tpm state volume (pre-existing breakage). Signed-off-by: Fabian Grünbichler --- Notes: https://forum.proxmox.com/threads/error-live-migrate-a-windows-vm-with-tpm.99906/#post-431345 IMHO this is okay since new->old is always best effort and might fail for $reasons. we could extend this to check the broadcasted version on the source node once pve-manager requires a new enough qemu-server, and die early there to avoid running into the compat issue.. the original compat logic was unfortunate - an "old" source node would only ever send an unprefixed spice ticket as first line on STDIN, if the compat logic had included a check for that this would be a bit easier.. PVE/API2/Qemu.pm | 4 PVE/QemuMigrate.pm | 8 PVE/QemuServer.pm | 10 +- 3 files changed, 21 insertions(+), 1 deletion(-) diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm index 8b834bb..6992f6f 100644 --- a/PVE/API2/Qemu.pm +++ b/PVE/API2/Qemu.pm @@ -2319,6 +2319,7 @@ __PACKAGE__->register_method({ my $spice_ticket; my $nbd_protocol_version = 0; my $replicated_volumes = {}; + my $tpmstate_vol; if ($stateuri && ($stateuri eq 'tcp' || $stateuri eq 'unix') && $migratedfrom && ($rpcenv->{type} eq 'cli')) { while (defined(my $line = )) { chomp $line; @@ -2328,6 +2329,8 @@ __PACKAGE__->register_method({ $nbd_protocol_version = $1; } elsif ($line =~ m/^replicated_volume: (.*)$/) { $replicated_volumes->{$1} = 1; + } elsif ($line =~ m/^tpmstate0: (.*)$/) { + $tpmstate_vol = $1; } elsif (!$spice_ticket) { # fallback for old source node $spice_ticket = $line; @@ -2369,6 +2372,7 @@ __PACKAGE__->register_method({ storagemap => $storagemap, nbd_proto_version => $nbd_protocol_version, replicated_volumes => $replicated_volumes, + tpmstate_vol => $tpmstate_vol, }; my $params = { diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm index ae3eaf1..c9bc39d 100644 --- a/PVE/QemuMigrate.pm +++ b/PVE/QemuMigrate.pm @@ -870,6 +870,14 @@ sub phase2 { # version > 0 for unix socket support my $nbd_protocol_version = 1; my $input = "nbd_protocol_version: $nbd_protocol_version\n"; + +if ($conf->{tpmstate0}) { + my $tpmdrive = PVE::QemuServer::parse_drive('tpmstate0', $conf->{tpmstate0}); + my $tpmvol = $tpmdrive->{file}; + $input .= "tpmstate0: $self->{volume_map}->{$tpmvol}" + if $self->{volume_map}->{$tpmvol} && $tpmvol ne $self->{volume_map}->{$tpmvol}; +} + $input .= "spice_ticket: $spice_ticket\n" if $spice_ticket; my @online_replicated_volumes = $self->filter_local_volumes('online', 1); diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm index 76d45a2..45b704d 100644 --- a/PVE/QemuServer.pm +++ b/PVE/QemuServer.pm @@ -5370,7 +5370,8 @@ sub vm_start { # network => CIDR of migration network # type => secure/insecure - tunnel over encrypted connection or plain-text # nbd_proto_version => int, 0 for TCP, 1 for UNIX -# replicated_volumes = which volids should be re-used with bitmaps for nbd migration +# replicated_volumes => which volids should be re-used with bitmaps for nbd migration +# tpmstate_vol => new volid of tpmstate0, not yet contained in config sub vm_start_nolock { my ($storecfg, $vmid, $conf, $params, $migrate_opts) = @_; @@ -5395,6 +5396,13 @@ sub vm_start_nolock { # this way we can reuse the old ISO with the correct config PVE::QemuServer::Cloudinit::generate_cloudinitconfig($conf, $vmid) if !$migratedfrom; +# override TPM state vol if migrated, conf is out of date still +if (my $tpmvol = $migrate_opts->{tpmstate_vol}) { +my $parsed = parse_drive("tpmstate0", $conf->{tpmstate0}); +$parsed->{file} = $tpmvol; +$conf->{tpmstate0} = print_drive($parsed); +} + my $defaults = load_defaults(); # set environment variable useful inside network script -- 2.30.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/
[pve-devel] applied-series: [PATCH qemu-server 1/2] migrate: improve start STDIN-parameter parsing
On 22.11.21 11:30, Fabian Grünbichler wrote: > only do the compat fallback if no explicit spice ticket was given, and > warn on unknown parameters on STDIN. > > Signed-off-by: Fabian Grünbichler > --- > PVE/API2/Qemu.pm | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > applied, thanks! Adapted the commit message of 2/2 (included the link and tried to also spell out what was wrong). ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] applied-series: [PATCH manager 1/3] pvescheduler: catch errors in forked childs
On 18.11.21 14:28, Dominik Csapak wrote: > if '$sub' dies, the error handler of PVE::Daemon triggers, which > initiates a shutdown of the child, resulting in confusing error logs > (e.g. 'got shutdown request, signal running jobs to stop') > > instead, run it under 'eval' and print the error to the sylog instead > > Signed-off-by: Dominik Csapak > --- > PVE/Service/pvescheduler.pm | 7 ++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > applied series, thanks! I reworked the worker termination on stop though and also fixed a issue with a possible artificial delay of the reload command (it was also aligned to the minute boundary). ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH widget-toolkit 3/3] ui: logpanel: work around browser 'setTimeout' quirks
for some reason, some browsers omit the scroll event if the 'scrollTo' was done before layouting in some cases, so manually trigger our 'onScroll' handler to set the new start parameter alwyas after scrolling Signed-off-by: Dominik Csapak --- src/panel/LogView.js | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/panel/LogView.js b/src/panel/LogView.js index 4a273a1..be3850e 100644 --- a/src/panel/LogView.js +++ b/src/panel/LogView.js @@ -67,7 +67,7 @@ Ext.define('Proxmox.panel.LogView', { if (view.scrollToEnd && scrollPos <= 5) { // we use setTimeout to work around scroll handling on touchscreens - setTimeout(function() { view.scrollTo(0, Infinity); }, 10); + setTimeout(function() { view.scrollTo(0, Infinity); me.onScroll(); }, 10); } }, -- 2.30.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH widget-toolkit 0/3] improve logpanel behaviour for fast tasks
improves the handling of fast moving tasks * no glitching * catching up to the end of the task log * workaround browser quirks Dominik Csapak (3): ui: logpanel: fix glitching on fast task logs ui: logpanel: catch up to very fast task logs with api calls ui: logpanel: work around browser 'setTimeout' quirks src/panel/LogView.js | 16 1 file changed, 12 insertions(+), 4 deletions(-) -- 2.30.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH widget-toolkit 2/3] ui: logpanel: catch up to very fast task logs with api calls
if we have a task log that produces more lines than we ask for (start+limit), set the start to the (current) total-limit. otherwise, keep the logic to set the start so that our current line is about in the middle of the requested data (to have a buffer in both directions) Signed-off-by: Dominik Csapak --- src/panel/LogView.js | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/src/panel/LogView.js b/src/panel/LogView.js index bbf38ee..4a273a1 100644 --- a/src/panel/LogView.js +++ b/src/panel/LogView.js @@ -132,14 +132,20 @@ Ext.define('Proxmox.panel.LogView', { let line = view.getScrollY()/lineHeight; let start = viewModel.get('params.start'); let limit = viewModel.get('params.limit'); + let total = viewModel.get('data.total'); let viewLines = view.getHeight()/lineHeight; let viewStart = Math.max(parseInt(line - 1 - view.viewBuffer, 10), 0); let viewEnd = parseInt(line + viewLines + 1 + view.viewBuffer, 10); if (viewStart < start || viewEnd > start+limit) { - viewModel.set('params.start', - Math.max(parseInt(line - (limit / 2) + 10, 10), 0)); + if (view.scrollToEnd && total) { + viewModel.set('params.start', + Math.max(parseInt(total - limit, 10), 0)); + } else { + viewModel.set('params.start', + Math.max(parseInt(line - (limit / 2) + 10, 10), 0)); + } view.loadTask.delay(200); } }, -- 2.30.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH widget-toolkit 1/3] ui: logpanel: fix glitching on fast task logs
when we follow (scrollToEnd === true) a task log that produces many lines per second, we often get a 'total' back that exceeds our start+limit. in that case we do not want to pad the bottom of the task log with empty lines, we only want that for finished task logs to preserve the scroll bar position+size. Signed-off-by: Dominik Csapak --- src/panel/LogView.js | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/panel/LogView.js b/src/panel/LogView.js index 1772737..bbf38ee 100644 --- a/src/panel/LogView.js +++ b/src/panel/LogView.js @@ -97,7 +97,9 @@ Ext.define('Proxmox.panel.LogView', { lines[line.n - 1] = Ext.htmlEncode(line.t); }); - lines.length = total; + if (!view.scrollToEnd) { + lines.length = total; + } me.updateView(lines.join(''), first - 1, total); me.running = false; if (me.requested) { -- 2.30.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel