On Wed, Nov 15, 2023 at 09:55:22AM +0100, Fabian Grünbichler wrote:
> On November 14, 2023 3:02 pm, Fiona Ebner wrote:
> > Currently, volume activation, PCI reservation and resetting systemd
> > scope happen in between and the 5 second expiretime used for port
> > reservation might not be enough.
> > 
> > Still not ideal, because entering systemd scope and maybe starting
> > swtpm still happen after reservation before the QEMU binary can be
> > invoked and actually use the port, but the reservation needs to happen
> > outside of the fork, because the result is used there too.
> > 
> > Signed-off-by: Fiona Ebner <f.eb...@proxmox.com>
> 
> Acked-by: Fabian Grünbichler <f.gruenbich...@proxmox.com>
> 
> we could move the whole statefile handling further down, but then some
> additional side-effects need to be taken care of/refactored, this seems
> like a minimal-invasive version for the uncommon (insecure) case.
> 
> > ---
> >  PVE/QemuServer.pm | 20 ++++++++++++++------
> >  1 file changed, 14 insertions(+), 6 deletions(-)
> > 
> > diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> > index c465fb6f..aeaea8eb 100644
> > --- a/PVE/QemuServer.pm
> > +++ b/PVE/QemuServer.pm
> > @@ -5697,6 +5697,9 @@ sub vm_start_nolock {
> >     return $migration_ip;
> >      };
> >  
> > +    # helper to move port reservation and usage closer together to avoid expiry (bug #4501)
> > +    my $append_tcp_migration_cmdline;
> > +
> >      if ($statefile) {
> >     if ($statefile eq 'tcp') {
> >         my $migrate = $res->{migrate} = { proto => 'tcp' };
> > @@ -5717,12 +5720,13 @@ sub vm_start_nolock {
> >             $migrate->{addr} = "[$migrate->{addr}]" if Net::IP::ip_is_ipv6($migrate->{addr});
> >         }
> >  
> > -       my $pfamily = PVE::Tools::get_host_address_family($nodename);
> > -       $migrate->{port} = PVE::Tools::next_migrate_port($pfamily);
> > -       $migrate->{uri} = "tcp:$migrate->{addr}:$migrate->{port}";
> > -       push @$cmd, '-incoming', $migrate->{uri};
> > -       push @$cmd, '-S';
> > -
> 
> nit: I'd maybe add another comment here, maybe something like
> 
> # delay migration port reservation to prevent expiry before binding

What about adding an option to `next_migrate_port()` to actually return
the open socket to keep the reservation?

Also, did we consider passing the file descriptor through to qemu via
`-incoming fd:$number`?
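
To make the two questions above concrete, here is a rough, untested-against-PVE
sketch of the idea. `reserve_migrate_socket()` is a hypothetical helper, not an
existing `PVE::Tools` function, and it lets the kernel pick the port instead of
using the migration port range. Whether QEMU's `fd:` transport is happy with a
listening socket (rather than an already connected one) is an open question
here, not something the sketch shows.

```perl
use strict;
use warnings;
use IO::Socket::IP;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);

# Hypothetical helper (not part of PVE::Tools): bind a listening socket and
# return it, so the port stays reserved for as long as the handle is open.
sub reserve_migrate_socket {
    my ($addr) = @_;

    my $sock = IO::Socket::IP->new(
        LocalHost => $addr,
        LocalPort => 0,    # assumption: kernel-chosen port, for illustration only
        Listen    => 1,
        ReuseAddr => 1,
        Proto     => 'tcp',
    ) or die "failed to bind migration socket: $@\n";

    # Perl marks descriptors above $^F as close-on-exec; clear the flag so
    # the descriptor would survive the exec of the child process.
    my $flags = fcntl($sock, F_GETFD, 0);
    die "fcntl F_GETFD failed: $!\n" if !defined($flags);
    fcntl($sock, F_SETFD, $flags & ~FD_CLOEXEC)
        or die "fcntl F_SETFD failed: $!\n";

    return $sock;
}

my $sock = reserve_migrate_socket('127.0.0.1');

# Build the command line fragment the way the patch does, but with the
# descriptor number instead of a tcp: URI.
my $cmd = [];
push @$cmd, '-incoming', 'fd:' . fileno($sock);
push @$cmd, '-S';

print join(' ', @$cmd), " (reserved port ", $sock->sockport(), ")\n";
```

The caller would have to keep `$sock` in scope until the child has been
started, since closing the handle drops the reservation again.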

> 
> > +       $append_tcp_migration_cmdline = sub {
> > +           my $pfamily = PVE::Tools::get_host_address_family($nodename);
> > +           $migrate->{port} = PVE::Tools::next_migrate_port($pfamily);
> > +           $migrate->{uri} = "tcp:$migrate->{addr}:$migrate->{port}";
> > +           push @$cmd, '-incoming', $migrate->{uri};
> > +           push @$cmd, '-S';
> > +       };
> >     } elsif ($statefile eq 'unix') {
> >         # should be default for secure migrations as a ssh TCP forward
> >         # tunnel is not deterministic reliable ready and fails regurarly
> > @@ -5840,6 +5844,10 @@ sub vm_start_nolock {
> >      $systemd_properties{timeout} = 10 if $statefile; # setting up the scope shoul be quick
> >  
> >      my $run_qemu = sub {
> > +   # sets the port+uri for $res->{migrate} which is printed below and part of the result, so
> > +   # needs to happen outside of the fork.
> > +   $append_tcp_migration_cmdline->() if $append_tcp_migration_cmdline;
> > +
> >     PVE::Tools::run_fork sub {
> >         PVE::Systemd::enter_systemd_scope($vmid, "Proxmox VE VM $vmid", %systemd_properties);
> >  
> > -- 
> > 2.39.2


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
