[pve-devel] applied-series: [PATCH v2 storage 1/3] zfs: list: only

2022-12-21 Thread Fabian Grünbichler
thanks!

some possible room for further improvements:
- zfs_list_zvol could add `-d1` to only list $scfg->{pool} and direct children
  (then we don't need to filter out any indirect descendants, just the "pool"
  itself..)
- $list in zfs_list_zvol could be initialized as `{}`, then we don't need an
  explicit fallback in list_images
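roughly, both suggestions together could look like this in zfs_list_zvol
(untested sketch, the loop body stays as in the patch below):

    sub zfs_list_zvol {
        my ($class, $scfg) = @_;

        # '-d1' limits the listing to $scfg->{pool} and its direct children,
        # so indirect descendants no longer need to be filtered out below
        my $text = $class->zfs_request(
            $scfg,
            10,
            'list',
            '-o', 'name,volsize,origin,type,refquota',
            '-t', 'volume,filesystem',
            '-d1',
            '-Hrp',
            $scfg->{pool},
        );
        my $zvols = zfs_parse_zvol_list($text);
        return undef if !$zvols;

        # empty hash ref, so list_images() no longer needs an explicit fallback
        my $list = {};

        # ... assemble $list from @$zvols as in the patch below, now only
        # skipping the pool dataset itself ...

        return $list;
    }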

On December 20, 2022 2:16 pm, Fiona Ebner wrote:
> The plugin for remote ZFS storages currently also uses the same
> list_images() as the plugin for local ZFS storages. There is only
> one cache which does not remember the target host where the
> information originated.
> 
> This is problematic for rescan in qemu-server:
> 1. Via list_images() and zfs_list_zvol(), ZFSPlugin.pm's zfs_request()
>is executed for a remote ZFS.
> 2. $cache->{zfs} is set to the result of scanning the images there.
> 3. Another list_images() for a local ZFS storage happens and uses the
>cache with the wrong information.
> 
> The only two operations using the cache currently are
> 1. Disk rescan in qemu-server which is affected by the issue. It is
>done as part of (or as a) long-running operation.
> 2. prepare_local_job for pvesr, but the non-local ZFS plugin doesn't
>support replication, so it cannot trigger there. The cache only
>helps if there is a replicated guest with volumes on different
>ZFS storages, but otherwise it will be less efficient than no
>cache or querying every storage by itself.
> 
> Fix the issue by making the cache $storeid-specific, which effectively
> makes the cache superfluous, because there is no caller using the same
> $storeid multiple times. As argued above, this should not really hurt
> the callers described above much and actually improves performance for
> all other callers.
> 
> Signed-off-by: Fiona Ebner 
> ---
> 
> Changes from v1:
> * Always only list images for $scfg->{pool} (and drop patch that
>   would only do so for the remote ZFS plugin).
> * This makes the cache useless, so add a patch removing it.
> * Also add a patch for cleanup.
> 
> See here for previous discussion:
> https://lists.proxmox.com/pipermail/pve-devel/2022-November/054485.html
> 
>  PVE/Storage/ZFSPoolPlugin.pm | 25 +++--
>  1 file changed, 19 insertions(+), 6 deletions(-)
> 
> diff --git a/PVE/Storage/ZFSPoolPlugin.pm b/PVE/Storage/ZFSPoolPlugin.pm
> index e264fde..0f16e7d 100644
> --- a/PVE/Storage/ZFSPoolPlugin.pm
> +++ b/PVE/Storage/ZFSPoolPlugin.pm
> @@ -254,11 +254,12 @@ sub free_image {
>  sub list_images {
>  my ($class, $storeid, $scfg, $vmid, $vollist, $cache) = @_;
>  
> -$cache->{zfs} = $class->zfs_list_zvol($scfg) if !$cache->{zfs};
> -my $zfspool = $scfg->{pool};
> +$cache->{zfs}->{$storeid} = $class->zfs_list_zvol($scfg)
> + if !$cache->{zfs}->{$storeid};
> +
>  my $res = [];
>  
> -if (my $dat = $cache->{zfs}->{$zfspool}) {
> +if (my $dat = $cache->{zfs}->{$storeid}) {
>  
>   foreach my $image (keys %$dat) {
>  
> @@ -375,20 +376,32 @@ sub zfs_delete_zvol {
>  sub zfs_list_zvol {
>  my ($class, $scfg) = @_;
>  
> -my $text = $class->zfs_request($scfg, 10, 'list', '-o', 'name,volsize,origin,type,refquota', '-t', 'volume,filesystem', '-Hrp');
> +my $text = $class->zfs_request(
> + $scfg,
> + 10,
> + 'list',
> + '-o',
> + 'name,volsize,origin,type,refquota',
> + '-t',
> + 'volume,filesystem',
> + '-Hrp',
> + $scfg->{pool},
> +);
>  my $zvols = zfs_parse_zvol_list($text);
>  return undef if !$zvols;
>  
>  my $list = ();
>  foreach my $zvol (@$zvols) {
> - my $pool = $zvol->{pool};
> + # The "pool" in $scfg is not the same as ZFS pool, so it's necessary to filter here.
> + next if $scfg->{pool} ne $zvol->{pool};
> +
>   my $name = $zvol->{name};
>   my $parent = $zvol->{origin};
>   if($zvol->{origin} && $zvol->{origin} =~ m/^$scfg->{pool}\/(\S+)$/){
>   $parent = $1;
>   }
>  
> - $list->{$pool}->{$name} = {
> + $list->{$name} = {
>   name => $name,
>   size => $zvol->{size},
>   parent => $parent,
> -- 
> 2.30.2
> 
> 
> 



Re: [pve-devel] [PATCH pve-guest-common 1/1] partially fix #2530: snapshots: add pre/post/failed-snapshot hooks

2022-12-21 Thread Fabian Grünbichler
this is v2, right? ;)

On December 12, 2022 2:43 pm, Stefan Hanreich wrote:
> This commit adds hooks to the snapshotting process, which can be used
> to run additional setup scripts to prepare the VM for snapshotting.
> 
> Examples for use cases include:
> * forcing processes to flush their writes
> * blocking processes from writing
> * altering the configuration of the VM to make snapshotting possible
> 
> The prepare step has been split into two parts, so the configuration
> can be locked a bit earlier during the snapshotting process. Doing it
> this way ensures that the configuration is already locked during the
> pre-snapshot hook. Because of this split, the VM config gets written
> in two stages now, rather than one.
> 
> In case of failure during the preparation step - after the lock is
> written - error handling has been added so the lock gets released
> properly. The failed-snapshot hook runs when the snapshot fails, if
> the pre-snapshot hook ran already. This enables users to revert any
> changes done during the pre-snapshot hookscript.

see below
 
> The preparation step assumes that the hook does not convert the
> current VM into a template, which is why the basic checks are not
> re-run after the pre-snapshot hook. The storage check runs after the
> pre-snapshot hook, because the hook might get used to setup the
> storage for snapshotting. If the hook would run after the storage
> checks, this becomes impossible.
> 
> cfs_update() gets called after every invocation of a hookscript, since
> it is impossible to know which changes get made by the hookscript.
> Doing this ensures that we see the updated state of the CFS after the
> hookscript got invoked.
> 
> Signed-off-by: Stefan Hanreich 
> ---
>  src/PVE/AbstractConfig.pm | 49 ++-
>  1 file changed, 43 insertions(+), 6 deletions(-)
> 
> diff --git a/src/PVE/AbstractConfig.pm b/src/PVE/AbstractConfig.pm
> index a0c0bc6..3bff600 100644
> --- a/src/PVE/AbstractConfig.pm
> +++ b/src/PVE/AbstractConfig.pm
> @@ -710,8 +710,7 @@ sub __snapshot_prepare {
>  
>  my $snap;
>  
> -my $updatefn =  sub {
> -
> +my $run_checks = sub {
>   my $conf = $class->load_config($vmid);
>  
>   die "you can't take a snapshot if it's a template\n"
> @@ -721,15 +720,21 @@ sub __snapshot_prepare {
>  
>   $conf->{lock} = 'snapshot';
>  
> - my $snapshots = $conf->{snapshots};
> -
>   die "snapshot name '$snapname' already used\n"
> - if defined($snapshots->{$snapname});
> + if defined($conf->{snapshots}->{$snapname});
> +
> + $class->write_config($vmid, $conf);
> +};
>  
> +my $updatefn = sub {
> + my $conf = $class->load_config($vmid);
>   my $storecfg = PVE::Storage::config();
> +
>   die "snapshot feature is not available\n"
>   if !$class->has_feature('snapshot', $conf, $storecfg, undef, undef, $snapname eq 'vzdump');
>  
> + my $snapshots = $conf->{snapshots};
> +
>   for my $snap (sort keys %$snapshots) {
>   my $parent_name = $snapshots->{$snap}->{parent} // '';
>   if ($snapname eq $parent_name) {
> @@ -753,7 +758,32 @@ sub __snapshot_prepare {
>   $class->write_config($vmid, $conf);
>  };
>  
> -$class->lock_config($vmid, $updatefn);
> +$class->lock_config($vmid, $run_checks);
> +
> +eval {
> + my $conf = $class->load_config($vmid);
> + PVE::GuestHelpers::exec_hookscript($conf, $vmid, "pre-snapshot", 1);
> +};
> +my $err = $@;
> +
> +PVE::Cluster::cfs_update();
> +
> +if ($err) {
> + $class->remove_lock($vmid, 'snapshot');
> + die $err;
> +}
> +

I wonder if we don't also want to call the 'failed-snapshot' phase when just the
pre-snapshot invocation failed? might be possible to combine the error handling
then, although I am not sure it makes it more readable if combined..
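something like the following could combine both error paths (rough sketch only,
not tested):

    eval {
        my $conf = $class->load_config($vmid);
        PVE::GuestHelpers::exec_hookscript($conf, $vmid, "pre-snapshot", 1);
    };
    my $err = $@;

    PVE::Cluster::cfs_update();

    if ($err) {
        # also give the hookscript a chance to clean up after a failed pre-snapshot
        my $conf = $class->load_config($vmid);
        eval { PVE::GuestHelpers::exec_hookscript($conf, $vmid, "failed-snapshot"); };
        warn "failed-snapshot hook failed - $@" if $@;
        PVE::Cluster::cfs_update();

        $class->remove_lock($vmid, 'snapshot');
        die $err;
    }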

> +   
> +if (my $err = $@) {
> + my $conf = $class->load_config($vmid);
> + PVE::GuestHelpers::exec_hookscript($conf, $vmid, "failed-snapshot");

this exec_hookscript needs to be inside an eval {}, with warn in case it fails..

also, this call here happens when preparing for making the snapshot, after
possibly saving the VM state, but before taking the volume snapshots..

> + PVE::Cluster::cfs_update();
> +
> + $class->remove_lock($vmid, 'snapshot');
> + die $err;
> +}
>  
>  return $snap;
>  }
> @@ -837,11 +867,18 @@ sub snapshot_create {
>  
>  if ($err) {
>   warn "snapshot create failed: starting cleanup\n";
> +
> + PVE::GuestHelpers::exec_hookscript($conf, $vmid, "failed-snapshot");

eval + warn as well

this call here happens when the volume snapshots might or might not have been
created already (depending on what exactly the error cause is).

> + PVE::Cluster::cfs_update();
> +
>   eval { $class->snapshot_delete($vmid, $snapname, 1, $drivehash); };
>   warn "$@" if $@;
>   die "$err\n";
>  }
>  
> +PVE::GuestHelpers::exec_

Re: [pve-devel] [PATCH qemu-server v3 3/5] await and kill lingering KVM thread when VM start reaches timeout

2022-12-21 Thread Fabian Grünbichler
On December 16, 2022 2:36 pm, Daniel Tschlatscher wrote:
> In some cases the VM API start method would return before the detached
> KVM process would have exited. This is especially problematic with HA,
> because the HA manager would think the VM started successfully, later
> see that it exited and start it again in an endless loop.
> 
> Moreover, another case exists when resuming a hibernated VM. In this
> case, the qemu thread will attempt to load the whole vmstate into
> memory before exiting.
> Depending on vmstate size, disk read speed, and similar factors this
> can take quite a while though and it is not possible to start the VM
> normally during this time.
> 
> To get around this, this patch intercepts the error, checks whether a
> corresponding KVM thread is still running, and waits for/kills it
> before continuing.
> 
> Signed-off-by: Daniel Tschlatscher 
> ---
> 
> Changes from v2:
> * Rebased to current master
> * Changed warn to use 'log_warn' instead
> * Reworded log message when waiting for lingering qemu process
> 
>  PVE/QemuServer.pm | 40 +---
>  1 file changed, 33 insertions(+), 7 deletions(-)
> 
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 2adbe3a..f63dc3f 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -5884,15 +5884,41 @@ sub vm_start_nolock {
>   $tpmpid = start_swtpm($storecfg, $vmid, $tpm, $migratedfrom);
>   }
>  
> - my $exitcode = run_command($cmd, %run_params);
> - if ($exitcode) {
> - if ($tpmpid) {
> - warn "stopping swtpm instance (pid $tpmpid) due to QEMU startup error\n";
> - kill 'TERM', $tpmpid;
> + eval {
> + my $exitcode = run_command($cmd, %run_params);
> +
> + if ($exitcode) {
> + if ($tpmpid) {
> + log_warn "stopping swtpm instance (pid $tpmpid) due to QEMU startup error\n";

this warn -> log_warn change kind of slipped in, it's not really part of this
patch?

> + kill 'TERM', $tpmpid;
> + }
> + die "QEMU exited with code $exitcode\n";
>   }
> - die "QEMU exited with code $exitcode\n";
> + };
> +
> + if (my $err = $@) {
> + my $pid = PVE::QemuServer::Helpers::vm_running_locally($vmid);
> +
> + if ($pid ne "") {

can be combined:
if (my $pid = ...) {

}

(empty string evaluates to false in perl ;))

> + my $count = 0;
> + my $timeout = 300;
> +
> + print "Waiting $timeout seconds for detached qemu process $pid to exit\n";
> + while (($count < $timeout) &&
> + PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
> + $count++;
> + sleep(1);
> + }
> +

either here

> + if ($count >= $timeout) {
> + log_warn "Reached timeout. Terminating now with SIGKILL\n";

or here, recheck that VM is still running and still has the same PID, and log
accordingly instead of KILLing if not..

the same is also true in _do_vm_stop
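putting both suggestions together, the error branch could look roughly like
this (sketch, not tested; assumes vm_running_locally() returns the PID or a
false value, as used elsewhere in this patch):

    if (my $err = $@) {
        if (my $pid = PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
            my $count = 0;
            my $timeout = 300;

            print "Waiting $timeout seconds for detached qemu process $pid to exit\n";
            while ($count < $timeout && PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
                $count++;
                sleep(1);
            }

            if ($count >= $timeout) {
                # re-check that it is still the same process before sending SIGKILL
                my $current_pid = PVE::QemuServer::Helpers::vm_running_locally($vmid);
                if ($current_pid && $current_pid == $pid) {
                    log_warn "Reached timeout. Terminating now with SIGKILL\n";
                    kill(9, $pid);
                } else {
                    print "detached qemu process $pid already exited\n";
                }
            }
        }

        die $err;
    }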

> + kill(9, $pid);
> + }
> + }
> +
> + die $err;
>   }
> - };
> + }
>  };
>  
>  if ($conf->{hugepages}) {
> -- 
> 2.30.2
> 
> 
> 



Re: [pve-devel] [PATCH pve-guest-common 1/1] partially fix #2530: snapshots: add pre/post/failed-snapshot hooks

2022-12-21 Thread Stefan Hanreich



On 12/21/22 11:44, Fabian Grünbichler wrote:

this is v2, right? ;)


Oh no - for some reason it's only in the cover letter..



On December 12, 2022 2:43 pm, Stefan Hanreich wrote:

This commit adds hooks to the snapshotting process, which can be used
to run additional setup scripts to prepare the VM for snapshotting.

Examples for use cases include:
* forcing processes to flush their writes
* blocking processes from writing
* altering the configuration of the VM to make snapshotting possible

The prepare step has been split into two parts, so the configuration
can be locked a bit earlier during the snapshotting process. Doing it
this way ensures that the configuration is already locked during the
pre-snapshot hook. Because of this split, the VM config gets written
in two stages now, rather than one.

In case of failure during the preparation step - after the lock is
written - error handling has been added so the lock gets released
properly. The failed-snapshot hook runs when the snapshot fails, if
the pre-snapshot hook ran already. This enables users to revert any
changes done during the pre-snapshot hookscript.


see below
  

The preparation step assumes that the hook does not convert the
current VM into a template, which is why the basic checks are not
re-run after the pre-snapshot hook. The storage check runs after the
pre-snapshot hook, because the hook might get used to setup the
storage for snapshotting. If the hook would run after the storage
checks, this becomes impossible.

cfs_update() gets called after every invocation of a hookscript, since
it is impossible to know which changes get made by the hookscript.
Doing this ensures that we see the updated state of the CFS after the
hookscript got invoked.

Signed-off-by: Stefan Hanreich 
---
  src/PVE/AbstractConfig.pm | 49 ++-
  1 file changed, 43 insertions(+), 6 deletions(-)

diff --git a/src/PVE/AbstractConfig.pm b/src/PVE/AbstractConfig.pm
index a0c0bc6..3bff600 100644
--- a/src/PVE/AbstractConfig.pm
+++ b/src/PVE/AbstractConfig.pm
@@ -710,8 +710,7 @@ sub __snapshot_prepare {
  
  my $snap;
  
-my $updatefn =  sub {

-
+my $run_checks = sub {
my $conf = $class->load_config($vmid);
  
  	die "you can't take a snapshot if it's a template\n"

@@ -721,15 +720,21 @@ sub __snapshot_prepare {
  
  	$conf->{lock} = 'snapshot';
  
-	my $snapshots = $conf->{snapshots};

-
die "snapshot name '$snapname' already used\n"
-   if defined($snapshots->{$snapname});
+   if defined($conf->{snapshots}->{$snapname});
+
+   $class->write_config($vmid, $conf);
+};
  
+my $updatefn = sub {

+   my $conf = $class->load_config($vmid);
my $storecfg = PVE::Storage::config();
+
die "snapshot feature is not available\n"
if !$class->has_feature('snapshot', $conf, $storecfg, undef, undef, 
$snapname eq 'vzdump');
  
+	my $snapshots = $conf->{snapshots};

+
for my $snap (sort keys %$snapshots) {
my $parent_name = $snapshots->{$snap}->{parent} // '';
if ($snapname eq $parent_name) {
@@ -753,7 +758,32 @@ sub __snapshot_prepare {
$class->write_config($vmid, $conf);
  };
  
-$class->lock_config($vmid, $updatefn);

+$class->lock_config($vmid, $run_checks);
+
+eval {
+   my $conf = $class->load_config($vmid);
+   PVE::GuestHelpers::exec_hookscript($conf, $vmid, "pre-snapshot", 1);
+};
+my $err = $@;
+
+PVE::Cluster::cfs_update();
+
+if ($err) {
+   $class->remove_lock($vmid, 'snapshot');
+   die $err;
+}
+


I wonder if we don't also want to call the 'failed-snapshot' phase when just the
pre-snapshot invocation failed? might be possible to combine the error handling
then, although I am not sure it makes it more readable if combined..



I thought about it, but I thought that if the user die's in his perl 
script he should be able to run any cleanup code before that. This 
doesn't consider any problems in the hookscript unforeseen by the user 
though, so I think your approach is better, since it is easier to use. 
This places less burden on the author of the hookscript. Might make the 
code a bit more convoluted though (depending on how we want to handle 
errors in failed-snapshot), but the upsides are way better imo.


One thing that would be easier with making the user do his cleanup in 
pre-snapshot would be that the pre-snapshot hook knows exactly what 
failed in pre-snapshot, so cleanup-code could use that information to 
skip certain steps. But again, it assumes that pre-snapshot will 
properly handle any possible error, which might be a bit much to assume.
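for illustration, a hypothetical hookscript handling the new phases could look
like this (the <vmid> <phase> argument layout is assumed to match the existing
guest hookscript example and is not defined by this patch):

    #!/usr/bin/perl

    use strict;
    use warnings;

    my ($vmid, $phase) = @ARGV;

    if ($phase eq 'pre-snapshot') {
        # e.g. ask the guest to flush and block writes
        print "preparing VM $vmid for a snapshot\n";
    } elsif ($phase eq 'failed-snapshot') {
        # revert whatever pre-snapshot changed
        print "snapshot of VM $vmid failed, reverting preparation\n";
    } elsif ($phase eq 'post-snapshot') {
        print "snapshot of VM $vmid finished, unblocking writes\n";
    }

    exit(0);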



+
+if (my $err = $@) {
+   my $conf = $class->load_config($vmid);
+   PVE::GuestHelpers::exec_hookscript($conf, $vmid, "failed-snapshot");


this exec_hookscript needs to be inside an eval {}, with warn in case it fails..


Isn't this already handled by the exec_hookscript function, since I am 

Re: [pve-devel] [PATCH pve-guest-common 1/1] partially fix #2530: snapshots: add pre/post/failed-snapshot hooks

2022-12-21 Thread Fabian Grünbichler
On December 21, 2022 12:26 pm, Stefan Hanreich wrote:
> 
> 
> On 12/21/22 11:44, Fabian Grünbichler wrote:
>> this is v2, right? ;)
> 
> Oh no - for some reason it's only in the cover letter..
> 
>> 
>> On December 12, 2022 2:43 pm, Stefan Hanreich wrote:
>>> This commit adds hooks to the snapshotting process, which can be used
>>> to run additional setup scripts to prepare the VM for snapshotting.
>>>
>>> Examples for use cases include:
>>> * forcing processes to flush their writes
>>> * blocking processes from writing
>>> * altering the configuration of the VM to make snapshotting possible
>>>
>>> The prepare step has been split into two parts, so the configuration
>>> can be locked a bit earlier during the snapshotting process. Doing it
>>> this way ensures that the configuration is already locked during the
>>> pre-snapshot hook. Because of this split, the VM config gets written
>>> in two stages now, rather than one.
>>>
>>> In case of failure during the preparation step - after the lock is
>>> written - error handling has been added so the lock gets released
>>> properly. The failed-snapshot hook runs when the snapshot fails, if
>>> the pre-snapshot hook ran already. This enables users to revert any
>>> changes done during the pre-snapshot hookscript.
>> 
>> see below
>>   
>>> The preparation step assumes that the hook does not convert the
>>> current VM into a template, which is why the basic checks are not
>>> re-run after the pre-snapshot hook. The storage check runs after the
>>> pre-snapshot hook, because the hook might get used to setup the
>>> storage for snapshotting. If the hook would run after the storage
>>> checks, this becomes impossible.
>>>
>>> cfs_update() gets called after every invocation of a hookscript, since
>>> it is impossible to know which changes get made by the hookscript.
>>> Doing this ensures that we see the updated state of the CFS after the
>>> hookscript got invoked.
>>>
>>> Signed-off-by: Stefan Hanreich 
>>> ---
>>>   src/PVE/AbstractConfig.pm | 49 ++-
>>>   1 file changed, 43 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/src/PVE/AbstractConfig.pm b/src/PVE/AbstractConfig.pm
>>> index a0c0bc6..3bff600 100644
>>> --- a/src/PVE/AbstractConfig.pm
>>> +++ b/src/PVE/AbstractConfig.pm
>>> @@ -710,8 +710,7 @@ sub __snapshot_prepare {
>>>   
>>>   my $snap;
>>>   
>>> -my $updatefn =  sub {
>>> -
>>> +my $run_checks = sub {
>>> my $conf = $class->load_config($vmid);
>>>   
>>> die "you can't take a snapshot if it's a template\n"
>>> @@ -721,15 +720,21 @@ sub __snapshot_prepare {
>>>   
>>> $conf->{lock} = 'snapshot';
>>>   
>>> -   my $snapshots = $conf->{snapshots};
>>> -
>>> die "snapshot name '$snapname' already used\n"
>>> -   if defined($snapshots->{$snapname});
>>> +   if defined($conf->{snapshots}->{$snapname});
>>> +
>>> +   $class->write_config($vmid, $conf);
>>> +};
>>>   
>>> +my $updatefn = sub {
>>> +   my $conf = $class->load_config($vmid);
>>> my $storecfg = PVE::Storage::config();
>>> +
>>> die "snapshot feature is not available\n"
>>> if !$class->has_feature('snapshot', $conf, $storecfg, undef, undef, 
>>> $snapname eq 'vzdump');
>>>   
>>> +   my $snapshots = $conf->{snapshots};
>>> +
>>> for my $snap (sort keys %$snapshots) {
>>> my $parent_name = $snapshots->{$snap}->{parent} // '';
>>> if ($snapname eq $parent_name) {
>>> @@ -753,7 +758,32 @@ sub __snapshot_prepare {
>>> $class->write_config($vmid, $conf);
>>>   };
>>>   
>>> -$class->lock_config($vmid, $updatefn);
>>> +$class->lock_config($vmid, $run_checks);
>>> +
>>> +eval {
>>> +   my $conf = $class->load_config($vmid);
>>> +   PVE::GuestHelpers::exec_hookscript($conf, $vmid, "pre-snapshot", 1);
>>> +};
>>> +my $err = $@;
>>> +
>>> +PVE::Cluster::cfs_update();
>>> +
>>> +if ($err) {
>>> +   $class->remove_lock($vmid, 'snapshot');
>>> +   die $err;
>>> +}
>>> +
>> 
>> I wonder if we don't also want to call the 'failed-snapshot' phase when just 
>> the
>> pre-snapshot invocation failed? might be possible to combine the error 
>> handling
>> then, although I am not sure it makes it more readable if combined..
>> 
> 
> I thought about it, but I thought that if the user die's in his perl 
> script he should be able to run any cleanup code before that. This 
> doesn't consider any problems in the hookscript unforeseen by the user 
> though, so I think your approach is better, since it is easier to use. 
> This places less burden on the author of the hookscript. Might make the 
> code a bit more convoluted though (depending on how we want to handle 
> errors in failed-snapshot), but the upsides are way better imo.
> 
> One thing that would be easier with making the user do his cleanup in 
> pre-snapshot would be that the pre-snapshot hook knows exactly what 
> failed in pre-snapshot, so cleanup-code could use that information to 
> skip certain ste

Re: [pve-devel] [PATCH pve-guest-common 1/1] partially fix #2530: snapshots: add pre/post/failed-snapshot hooks

2022-12-21 Thread Stefan Hanreich



On 12/21/22 13:41, Fabian Grünbichler wrote:

On December 21, 2022 12:26 pm, Stefan Hanreich wrote:



On 12/21/22 11:44, Fabian Grünbichler wrote:

this is v2, right? ;)


Oh no - for some reason it's only in the cover letter..



On December 12, 2022 2:43 pm, Stefan Hanreich wrote:

This commit adds hooks to the snapshotting process, which can be used
to run additional setup scripts to prepare the VM for snapshotting.

Examples for use cases include:
* forcing processes to flush their writes
* blocking processes from writing
* altering the configuration of the VM to make snapshotting possible

The prepare step has been split into two parts, so the configuration
can be locked a bit earlier during the snapshotting process. Doing it
this way ensures that the configuration is already locked during the
pre-snapshot hook. Because of this split, the VM config gets written
in two stages now, rather than one.

In case of failure during the preparation step - after the lock is
written - error handling has been added so the lock gets released
properly. The failed-snapshot hook runs when the snapshot fails, if
the pre-snapshot hook ran already. This enables users to revert any
changes done during the pre-snapshot hookscript.


see below
   

The preparation step assumes that the hook does not convert the
current VM into a template, which is why the basic checks are not
re-run after the pre-snapshot hook. The storage check runs after the
pre-snapshot hook, because the hook might get used to setup the
storage for snapshotting. If the hook would run after the storage
checks, this becomes impossible.

cfs_update() gets called after every invocation of a hookscript, since
it is impossible to know which changes get made by the hookscript.
Doing this ensures that we see the updated state of the CFS after the
hookscript got invoked.

Signed-off-by: Stefan Hanreich 
---
   src/PVE/AbstractConfig.pm | 49 ++-
   1 file changed, 43 insertions(+), 6 deletions(-)

diff --git a/src/PVE/AbstractConfig.pm b/src/PVE/AbstractConfig.pm
index a0c0bc6..3bff600 100644
--- a/src/PVE/AbstractConfig.pm
+++ b/src/PVE/AbstractConfig.pm
@@ -710,8 +710,7 @@ sub __snapshot_prepare {
   
   my $snap;
   
-my $updatefn =  sub {

-
+my $run_checks = sub {
my $conf = $class->load_config($vmid);
   
   	die "you can't take a snapshot if it's a template\n"

@@ -721,15 +720,21 @@ sub __snapshot_prepare {
   
   	$conf->{lock} = 'snapshot';
   
-	my $snapshots = $conf->{snapshots};

-
die "snapshot name '$snapname' already used\n"
-   if defined($snapshots->{$snapname});
+   if defined($conf->{snapshots}->{$snapname});
+
+   $class->write_config($vmid, $conf);
+};
   
+my $updatefn = sub {

+   my $conf = $class->load_config($vmid);
my $storecfg = PVE::Storage::config();
+
die "snapshot feature is not available\n"
if !$class->has_feature('snapshot', $conf, $storecfg, undef, undef, 
$snapname eq 'vzdump');
   
+	my $snapshots = $conf->{snapshots};

+
for my $snap (sort keys %$snapshots) {
my $parent_name = $snapshots->{$snap}->{parent} // '';
if ($snapname eq $parent_name) {
@@ -753,7 +758,32 @@ sub __snapshot_prepare {
$class->write_config($vmid, $conf);
   };
   
-$class->lock_config($vmid, $updatefn);

+$class->lock_config($vmid, $run_checks);
+
+eval {
+   my $conf = $class->load_config($vmid);
+   PVE::GuestHelpers::exec_hookscript($conf, $vmid, "pre-snapshot", 1);
+};
+my $err = $@;
+
+PVE::Cluster::cfs_update();
+
+if ($err) {
+   $class->remove_lock($vmid, 'snapshot');
+   die $err;
+}
+


I wonder if we don't also want to call the 'failed-snapshot' phase when just the
pre-snapshot invocation failed? might be possible to combine the error handling
then, although I am not sure it makes it more readable if combined..



I thought about it, but I thought that if the user die's in his perl
script he should be able to run any cleanup code before that. This
doesn't consider any problems in the hookscript unforeseen by the user
though, so I think your approach is better, since it is easier to use.
This places less burden on the author of the hookscript. Might make the
code a bit more convoluted though (depending on how we want to handle
errors in failed-snapshot), but the upsides are way better imo.

One thing that would be easier with making the user do his cleanup in
pre-snapshot would be that the pre-snapshot hook knows exactly what
failed in pre-snapshot, so cleanup-code could use that information to
skip certain steps. But again, it assumes that pre-snapshot will
properly handle any possible error, which might be a bit much to assume.


yes, there is always the question of whether the hookscript does (proper) error
handling.. but if it does, an additional call to failed-snapshot shouldn't hurt
since that should be covered as well (in 

[pve-devel] applied-series: Re: [PATCH qemu 1/3] d/rules: enable slirp again

2022-12-21 Thread Thomas Lamprecht
On 20/12/2022 09:19, Fiona Ebner wrote:
> Commit d03e1b3 ("update submodule and patches to 7.2.0") argued that
> slirp is not explicitly supported in PVE, but that is not true. In
> qemu-server, user networking is supported (via CLI/API) when no bridge
> is set on a virtual NIC. So slirp needs to stay to keep such NICs
> working.
> 
> Signed-off-by: Fiona Ebner 
> ---
>  debian/rules | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
>

applied, thanks!





[pve-devel] [PATCH manager 0/3] Ceph API fixes/updates

2022-12-21 Thread Aaron Lauterer
A few updates on the Ceph related API. Mainly:
- fixing outdated indexes
- adding/updating descriptions and return value definitions

This series could collide with a few other Ceph patches that are still
outstanding, so I expect rebases will be needed here and there.

Aaron Lauterer (3):
  ceph api: fix descriptions
  api: ceph: update index list
  api: ceph: update return value definitions

 PVE/API2/Ceph.pm | 30 +++---
 PVE/API2/Ceph/MON.pm | 14 +++---
 PVE/API2/Ceph/OSD.pm | 13 +
 PVE/API2/Cluster/Ceph.pm | 23 ++-
 4 files changed, 65 insertions(+), 15 deletions(-)

-- 
2.30.2






[pve-devel] [PATCH manager 1/3] ceph api: fix descriptions

2022-12-21 Thread Aaron Lauterer
Signed-off-by: Aaron Lauterer 
---
 PVE/API2/Ceph.pm | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/PVE/API2/Ceph.pm b/PVE/API2/Ceph.pm
index f3442408..0490f4a2 100644
--- a/PVE/API2/Ceph.pm
+++ b/PVE/API2/Ceph.pm
@@ -114,7 +114,7 @@ __PACKAGE__->register_method ({
 permissions => {
check => ['perm', '/', [ 'Sys.Audit', 'Datastore.Audit' ], any => 1],
 },
-description => "Get Ceph configuration.",
+description => "Get the Ceph configuration file.",
 parameters => {
additionalProperties => 0,
properties => {
@@ -141,7 +141,7 @@ __PACKAGE__->register_method ({
 permissions => {
check => ['perm', '/', [ 'Sys.Audit', 'Datastore.Audit' ], any => 1],
 },
-description => "Get Ceph configuration database.",
+description => "Get the Ceph configuration database.",
 parameters => {
additionalProperties => 0,
properties => {
-- 
2.30.2






[pve-devel] [PATCH manager 2/3] api: ceph: update index list

2022-12-21 Thread Aaron Lauterer
some items were missing, and placing them in alphabetical order will make
the list easier to maintain in the future

Signed-off-by: Aaron Lauterer 
---
 PVE/API2/Ceph.pm | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/PVE/API2/Ceph.pm b/PVE/API2/Ceph.pm
index 0490f4a2..55220324 100644
--- a/PVE/API2/Ceph.pm
+++ b/PVE/API2/Ceph.pm
@@ -87,20 +87,23 @@ __PACKAGE__->register_method ({
my ($param) = @_;
 
my $result = [
+   { name => 'cmd-safety' },
+   { name => 'config' },
+   { name => 'configdb' },
+   { name => 'crush' },
+   { name => 'fs' },
{ name => 'init' },
+   { name => 'log' },
+   { name => 'mds' },
+   { name => 'mgr' },
{ name => 'mon' },
{ name => 'osd' },
{ name => 'pools' },
-   { name => 'fs' },
-   { name => 'mds' },
-   { name => 'stop' },
-   { name => 'start' },
{ name => 'restart' },
-   { name => 'status' },
-   { name => 'crush' },
-   { name => 'config' },
-   { name => 'log' },
{ name => 'rules' },
+   { name => 'start' },
+   { name => 'status' },
+   { name => 'stop' },
];
 
return $result;
-- 
2.30.2






[pve-devel] [PATCH manager 3/3] api: ceph: update return value definitions

2022-12-21 Thread Aaron Lauterer
to have either a more accurate description or at least some description.
For objects returning a lot of data, for example individual Ceph
services, a full description has been omitted as I think that this would
be a bit much.

Signed-off-by: Aaron Lauterer 
---
 PVE/API2/Ceph.pm |  7 ++-
 PVE/API2/Ceph/MON.pm | 14 +++---
 PVE/API2/Ceph/OSD.pm | 13 +
 PVE/API2/Cluster/Ceph.pm | 23 ++-
 4 files changed, 52 insertions(+), 5 deletions(-)

diff --git a/PVE/API2/Ceph.pm b/PVE/API2/Ceph.pm
index 55220324..cc8720b2 100644
--- a/PVE/API2/Ceph.pm
+++ b/PVE/API2/Ceph.pm
@@ -622,7 +622,12 @@ __PACKAGE__->register_method ({
type => 'array',
items => {
type => "object",
-   properties => {},
+   properties => {
+   name => {
+   description => "Name of the CRUSH rule.",
+   type => "string",
+   }
+   },
},
links => [ { rel => 'child', href => "{name}" } ],
 },
diff --git a/PVE/API2/Ceph/MON.pm b/PVE/API2/Ceph/MON.pm
index 5771bb46..b452045c 100644
--- a/PVE/API2/Ceph/MON.pm
+++ b/PVE/API2/Ceph/MON.pm
@@ -212,9 +212,17 @@ __PACKAGE__->register_method ({
items => {
type => "object",
properties => {
-   name => { type => 'string' },
-   addr => { type => 'string', optional => 1 },
-   host => { type => 'string', optional => 1 },
+   name=> { type => 'string' },
+   addr=> { type => 'string', optional => 1 },
+   host=> { type => 'boolean', optional => 1 },
+   direxists   => { type => 'string', optional => 1 },
+   quorum  => { type => 'boolean', optional => 1 },
+   host=> { type => 'string', optional => 1 },
+   rank=> { type => 'integer', optional => 1 },
+   service => { type => 'integer', optional => 1 },
+   state   => { type => 'string', optional => 1 },
+   ceph_version=> { type => 'string', optional => 1 },
+   ceph_version_short  => { type => 'string', optional => 1 },
},
},
links => [ { rel => 'child', href => "{name}" } ],
diff --git a/PVE/API2/Ceph/OSD.pm b/PVE/API2/Ceph/OSD.pm
index 93433b3a..ce1281d9 100644
--- a/PVE/API2/Ceph/OSD.pm
+++ b/PVE/API2/Ceph/OSD.pm
@@ -88,6 +88,19 @@ __PACKAGE__->register_method ({
 # fixme: return a list instead of extjs tree format ?
 returns => {
type => "object",
+   items => {
+   type => "object",
+   properties => {
+   flags => { type => "string" },
+   root => {
+   type => "object",
+   description => "extjs formatted grid tree",
+   },
+   },
+   },
+
+
+   }
 },
 code => sub {
my ($param) = @_;
diff --git a/PVE/API2/Cluster/Ceph.pm b/PVE/API2/Cluster/Ceph.pm
index 7f825003..49f84b24 100644
--- a/PVE/API2/Cluster/Ceph.pm
+++ b/PVE/API2/Cluster/Ceph.pm
@@ -68,7 +68,20 @@ __PACKAGE__->register_method ({
},
},
 },
-returns => { type => 'object' },
+returns => {
+   type => 'object'
+   description => "Items for each type of service containing objects for each instance.",
+   items => {
+   type => "object",
+   properties => {
+   mds => { type => "object" },
+   mgr => { type => "object" },
+   mon => { type => "object" },
+   node => { type => "object" },
+   osd => { type => "object" },
+   }
+   },
+},
 code => sub {
my ($param) = @_;
 
@@ -181,6 +194,14 @@ __PACKAGE__->register_method ({
description => "Flag name.",
type => 'string', enum => $possible_flags_list,
},
+   description => {
+   description => "Flag description.",
+   type => 'string',
+   },
+   value => {
+   description => "Flag value.",
+   type => 'boolean',
+   },
},
},
links => [ { rel => 'child', href => "{name}" } ],
-- 
2.30.2






[pve-devel] applied-series: [PATCH v2 docs 1/2] qm: IO thread: be more precise about how QEMU handles IO

2022-12-21 Thread Thomas Lamprecht
On 20/12/2022 12:24, Fiona Ebner wrote:
> Reported in the community forum[0].
> 
> The setting can already help with a single disk. Without the option,
> there is not one IO thread as the old wording suggested, but IO is
> handled in the main event loop or vCPU threads (see the kvm man page).
> 
> [0]: https://forum.proxmox.com/threads/118390/post-518532
> 
> Signed-off-by: Fiona Ebner 
> ---
> 
> Changes from v1:
> * Rebase on current master.
> 
>  qm.adoc | 11 +--
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 
>

applied, thanks!





[pve-devel] applied: [PATCH manager 1/3] ceph api: fix descriptions

2022-12-21 Thread Thomas Lamprecht
On 21/12/2022 14:30, Aaron Lauterer wrote:
> Signed-off-by: Aaron Lauterer 
> ---
>  PVE/API2/Ceph.pm | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
>

applied, thanks!





[pve-devel] applied: [PATCH manager 2/3] api: ceph: update index list

2022-12-21 Thread Thomas Lamprecht
On 21/12/2022 14:30, Aaron Lauterer wrote:
> some items were missing and placing them in alphabetical order will make
> it easier in the future
> 
> Signed-off-by: Aaron Lauterer 
> ---
>  PVE/API2/Ceph.pm | 19 +++
>  1 file changed, 11 insertions(+), 8 deletions(-)
> 
>

applied, amended the commit to note which items missed, thanks!





Re: [pve-devel] [PATCH manager 3/3] api: ceph: update return value definitions

2022-12-21 Thread Thomas Lamprecht
On 21/12/2022 14:30, Aaron Lauterer wrote:
> to have either a more accurate description or at least some description.
> For objects returning a lot of data, for example individual Ceph
> services, a full description has been omitted as I think that this would
> be a bit much.
> 
> Signed-off-by: Aaron Lauterer 
> ---
>  PVE/API2/Ceph.pm |  7 ++-
>  PVE/API2/Ceph/MON.pm | 14 +++---
>  PVE/API2/Ceph/OSD.pm | 13 +
>  PVE/API2/Cluster/Ceph.pm | 23 ++-
>  4 files changed, 52 insertions(+), 5 deletions(-)
> 
> diff --git a/PVE/API2/Ceph.pm b/PVE/API2/Ceph.pm
> index 55220324..cc8720b2 100644
> --- a/PVE/API2/Ceph.pm
> +++ b/PVE/API2/Ceph.pm
> @@ -622,7 +622,12 @@ __PACKAGE__->register_method ({
>   type => 'array',
>   items => {
>   type => "object",
> - properties => {},
> + properties => {
> + name => {
> + description => "Name of the CRUSH rule.",
> + type => "string",
> + }
> + },
>   },
>   links => [ { rel => 'child', href => "{name}" } ],
>  },
> diff --git a/PVE/API2/Ceph/MON.pm b/PVE/API2/Ceph/MON.pm
> index 5771bb46..b452045c 100644
> --- a/PVE/API2/Ceph/MON.pm
> +++ b/PVE/API2/Ceph/MON.pm
> @@ -212,9 +212,17 @@ __PACKAGE__->register_method ({
>   items => {
>   type => "object",
>   properties => {
> - name => { type => 'string' },
> - addr => { type => 'string', optional => 1 },
> - host => { type => 'string', optional => 1 },
> + name=> { type => 'string' },
> + addr=> { type => 'string', optional => 1 },
> + host=> { type => 'boolean', optional => 1 },
> + direxists   => { type => 'string', optional => 1 },
> + quorum  => { type => 'boolean', optional => 1 },
> + host=> { type => 'string', optional => 1 },
> + rank=> { type => 'integer', optional => 1 },
> + service => { type => 'integer', optional => 1 },
> + state   => { type => 'string', optional => 1 },
> + ceph_version=> { type => 'string', optional => 1 },
> + ceph_version_short  => { type => 'string', optional => 1 },

please avoid that code formatting style! It looks very out of place in PVE
source and would require touching unrelated lines if a longer variant gets
added or the longest one gets removed, which is just a nuisance and messes
with git history without any real benefit.
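i.e. the plain, unaligned style used in the rest of the code base (sketch):

    name => { type => 'string' },
    addr => { type => 'string', optional => 1 },
    quorum => { type => 'boolean', optional => 1 },
    ceph_version_short => { type => 'string', optional => 1 },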

>   },
>   },
>   links => [ { rel => 'child', href => "{name}" } ],
> diff --git a/PVE/API2/Ceph/OSD.pm b/PVE/API2/Ceph/OSD.pm
> index 93433b3a..ce1281d9 100644
> --- a/PVE/API2/Ceph/OSD.pm
> +++ b/PVE/API2/Ceph/OSD.pm
> @@ -88,6 +88,19 @@ __PACKAGE__->register_method ({
>  # fixme: return a list instead of extjs tree format ?
>  returns => {
>   type => "object",
> + items => {
> + type => "object",
> + properties => {
> + flags => { type => "string" },
> + root => {
> + type => "object",
> + description => "extjs formatted grid tree",

It's ExtJS, but I'm actually not sure why that is relevant here; the ExtJS
tree store format is a very simple and relatively common serialization format
for a tree.

> + },
> + },
> + },
> +
> +
> + }
>  },
>  code => sub {
>   my ($param) = @_;
> diff --git a/PVE/API2/Cluster/Ceph.pm b/PVE/API2/Cluster/Ceph.pm
> index 7f825003..49f84b24 100644
> --- a/PVE/API2/Cluster/Ceph.pm
> +++ b/PVE/API2/Cluster/Ceph.pm
> @@ -68,7 +68,20 @@ __PACKAGE__->register_method ({
>   },
>   },
>  },
> -returns => { type => 'object' },
> +returns => {
> + type => 'object'
> + description => "Items for each type of service containing objects for each instance.",
> + items => {
> + type => "object",
> + properties => {
> + mds => { type => "object" },
> + mgr => { type => "object" },
> + mon => { type => "object" },
> + node => { type => "object" },
> + osd => { type => "object" },

A description (e.g., it might not be clear that "node" holds the Ceph version on
that node), optionality, and maybe (some common/most-useful) sub-properties would
be great to have too.
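something along these lines, for example (the "node" description follows from
the remark above, the rest is only placeholder wording):

    mon => {
        type => 'object',
        optional => 1,
        description => "Metadata about the Ceph monitor instances.",
    },
    node => {
        type => 'object',
        optional => 1,
        description => "Ceph version installed on each cluster node.",
    },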

> + }
> + },
> +},
>  code => sub {
>   my ($param) = @_;
>  
> @@ -181,6 +194,14 @@ __PACKAGE__->register_method ({
>   description => "Flag name.",
>   type => 'string', enum => $possible_flags_list,
>   },
> + description => {
> + description => "Flag description.",
> + type => 'string',
> + },
> + value => {
> +  

Re: [pve-devel] [PATCH storage] fix #4390: rbd: snapshot delete: avoid early return to fix handling TPM drive

2022-12-21 Thread Fabian Grünbichler
On December 9, 2022 11:30 am, Fiona Ebner wrote:
> The only caller where $running can even be truthy is QemuServer.pm's
> qemu_volume_snapshot_delete(). But there, a check if snapshots should
> be done with QEMU is already made and the storage function is only
> called if snapshots should not be done with QEMU (like for TPM drives
> which are not attached directly). So rely on the caller and do not
> return early to fix removing snapshots in such cases.
> 
> Even if a stray call ends up here (can happen already by changing the
> krbd setting while a VM is running to flip the above-mentioned check
> and the early return check removed by this patch), it might not even
> be problematic. At least a quick test worked fine:
> 1. take snapshot via a monitor command in QEMU
> 2. remove snapshot via the storage layer
> 3. create a new file in the VM
> 4. take a snapshot with the same name via monitor command in QEMU
> 5. roll back to the snapshot
> 6. check that the file in the VM is as expected
> Using the storage layer to take the snapshots and QEMU to remove the
> snapshot also worked doing the same test. Even if it were problematic,
> the check in qemu-server should rather be improved then.
> 
> (Trying to issue a snapshot mon command for a krbd-mapped image fails
> with an error on the other hand, but that is also not too bad and not
> relevant to the storage code. Again, it rather would be necessary to
> improve the check in qemu-server).
> 
> The fact that the pve-container code didn't even pass $running is the
> reason removing snapshots worked for containers on a storage with krbd
> disabled (the pve-container code calls map_volume() explicitly, so
> containers can work regardless of the krbd setting in the storage
> configuration; see commit 841fba6 ("call map_volume before using
> volumes.") in pve-container).
> 
> For volume_snapshot(), the early return with $running was already
> removed (or rather the relevant logic moved to QemuServer.pm) in 2015
> by commit f5640e7 ("remove running from Storage and check it in
> QemuServer"), even before krbd support was added in RBDPlugin.pm.

given the last two paragraphs, and the following:
- some plugins don't even have $running as parameter in the signature list
- qemu-server is not calling this code for librbd, qcow2 or qed files
- that (guarded) call site in qemu-server is the only one that actually sets
  $running to anything other than undef or 0 (AFAICT)
- only the RBDPlugin (this patch) and the DirPlugin and its descendants (which
  only supports qcow2 and qed for snapshotting, which are only supported by
  qemu-server, and handled directly by Qemu for running VMs) are handling
  $running at all

we should probably mark this parameter for dropping with 8.0 (both in the plugin
interface and in PVE::Storage itself)?

external plugins already cannot hook into the "use qemu for snapshotting" logic
(which could of course be changed, although I am not sure if there's much
benefit), and only leaving this parameter in for snapshot deletion doesn't bring
any benefit. I kinda doubt that any external plugins are using this parameter,
but technically removing it earlier could be considered a breaking change..
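dropping it would just mean removing the trailing parameter from the relevant
method signatures, e.g. (hypothetical, a possible 8.0 variant):

    sub volume_snapshot_delete {
        my ($class, $scfg, $storeid, $volname, $snap) = @_;

        # ... unchanged ...
    }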

other than this, consider this

Reviewed-by: Fabian Grünbichler 

> 
> Signed-off-by: Fiona Ebner 
> ---
>  PVE/Storage/RBDPlugin.pm | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/PVE/Storage/RBDPlugin.pm b/PVE/Storage/RBDPlugin.pm
> index dc6e79d..9047504 100644
> --- a/PVE/Storage/RBDPlugin.pm
> +++ b/PVE/Storage/RBDPlugin.pm
> @@ -767,8 +767,6 @@ sub volume_snapshot_rollback {
>  sub volume_snapshot_delete {
>  my ($class, $scfg, $storeid, $volname, $snap, $running) = @_;
>  
> -return 1 if $running && !$scfg->{krbd}; # FIXME: 
> -
>  $class->deactivate_volume($storeid, $scfg, $volname, $snap, {});
>  
>  my ($vtype, $name, $vmid) = $class->parse_volname($volname);
> -- 
> 2.30.2
> 
> 
> 


Re: [pve-devel] [PATCH v2 qemu-server 1/1] Do not start VM twice when rollbacking with --start

2022-12-21 Thread Fabian Grünbichler
On November 21, 2022 2:13 pm, Stefan Hanreich wrote:
> When rollbacking to the snapshot of a VM that includes RAM, the VM
> gets started by the rollback task anyway, so no additional start task is
> needed. Previously, when running rollback with the --start parameter
> and the VM snapshot includes RAM, a start task was created. That task
> failed because the VM had already been started by the rollback task.
> 
> Additionally documented this behaviour in the description of the --start
> parameter
> 
> Signed-off-by: Stefan Hanreich 
> ---
> 
> Changes v1 -> v2:
> Do not parse config for checking type of snapshot but rather directly check
> whether VM is running or not via check_running()
> 
>  PVE/API2/Qemu.pm | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index 6bdcce2..691202d 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -5064,7 +5064,8 @@ __PACKAGE__->register_method({
>   snapname => get_standard_option('pve-snapshot-name'),
>   start => {
>   type => 'boolean',
> - description => "Whether the VM should get started after rolling back successfully",
> + description => "Whether the VM should get started after rolling back successfully."
> + . " A VM will always be started when rollbacking a snapshot with RAM included, regardless of this parameter.",

this is worded a bit weird (I don't think that "rollbacking" is a word ;)), how 
about:

. "(Note: VMs will be automatically started if the snapshot includes RAM.)",

>   optional => 1,
>   default => 0,
>   },
> @@ -5091,7 +5092,7 @@ __PACKAGE__->register_method({
>   PVE::Cluster::log_msg('info', $authuser, "rollback snapshot VM 
> $vmid: $snapname");
>   PVE::QemuConfig->snapshot_rollback($vmid, $snapname);
>  
> - if ($param->{start}) {
> + if ($param->{start} && !PVE::QemuServer::check_running($vmid)) {

unless I am missing something, this should use

 PVE::QemuServer::Helpers::vm_running_locally($vmid)

we are holding the guest migration lock for the whole rollback worker, and
snapshot_rollback loads the config, so we know it is on the current node at this
point and just checking whether a matching qemu process is running after the
rollback is enough.

>   PVE::API2::Qemu->vm_start({ vmid => $vmid, node => $node });
>   }
>   };
> -- 
> 2.30.2
> 
> 



Re: [pve-devel] [PATCH v2 qemu-server 1/1] Do not start VM twice when rollbacking with --start

2022-12-21 Thread Stefan Hanreich



On 12/21/22 16:02, Fabian Grünbichler wrote:

On November 21, 2022 2:13 pm, Stefan Hanreich wrote:

When rollbacking to the snapshot of a VM that includes RAM, the VM
gets started by the rollback task anyway, so no additional start task is
needed. Previously, when running rollback with the --start parameter
and the VM snapshot includes RAM, a start task was created. That task
failed because the VM had already been started by the rollback task.

Additionally documented this behaviour in the description of the --start
parameter

Signed-off-by: Stefan Hanreich 
---

Changes v1 -> v2:
Do not parse config for checking type of snapshot but rather directly check
whether VM is running or not via check_running()

  PVE/API2/Qemu.pm | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 6bdcce2..691202d 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -5064,7 +5064,8 @@ __PACKAGE__->register_method({
snapname => get_standard_option('pve-snapshot-name'),
start => {
type => 'boolean',
-   description => "Whether the VM should get started after rolling back 
successfully",
+   description => "Whether the VM should get started after rolling back 
successfully."
+   . " A VM will always be started when rollbacking a snapshot with 
RAM included, regardless of this parameter.",


this is worded a bit weird (I don't think that "rollbacking" is a word ;)), how 
about:

. "(Note: VMs will be automatically started if the snapshot includes RAM.)",



sounds way better! will also improve the commit message ;)


optional => 1,
default => 0,
},
@@ -5091,7 +5092,7 @@ __PACKAGE__->register_method({
PVE::Cluster::log_msg('info', $authuser, "rollback snapshot VM $vmid: 
$snapname");
PVE::QemuConfig->snapshot_rollback($vmid, $snapname);
  
-	if ($param->{start}) {

+   if ($param->{start} && !PVE::QemuServer::check_running($vmid)) {


unless I am missing something, this should use

  PVE::QemuServer::Helpers::vm_running_locally($vmid)

we are holding the guest migration lock for the whole rollback worker, and
snapshot_rollback loads the config, so we know it is on the current node at this
point and just checking whether a matching qemu process is running after the
rollback is enough.


will do




PVE::API2::Qemu->vm_start({ vmid => $vmid, node => $node });
}
};
--
2.30.2




[pve-devel] applied: [PATCH V7 manager 0/2] fix #2822: add iscsi, lvm, lvmthin & zfs

2022-12-21 Thread Dominik Csapak

applied, thanks





[pve-devel] [PATCH qemu-server v3] rollback: Only create start task with --start if VM is not running

2022-12-21 Thread Stefan Hanreich
When rolling back to the snapshot of a VM that includes RAM, the VM
gets started by the rollback task anyway, so no additional start task
is needed. Previously, when rolling back with the start parameter and
the VM snapshot included RAM, a start task was created. That task
failed because the VM had already been started by the rollback task.

Additionally documented this behaviour in the description of the start
parameter

Signed-off-by: Stefan Hanreich 
---
Changes v2 -> v3:
Use vm_running_locally() instead of check_running()
Improved description

Changes v1 -> v2:
Do not parse config for checking type of snapshot but rather directly check
whether VM is running or not via check_running()

 PVE/API2/Qemu.pm | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index e2a420f..c87602d 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -5066,7 +5066,8 @@ __PACKAGE__->register_method({
snapname => get_standard_option('pve-snapshot-name'),
start => {
type => 'boolean',
-   description => "Whether the VM should get started after rolling back successfully",
+   description => "Whether the VM should get started after rolling back successfully."
+   . " (Note: VMs will be automatically started if the snapshot includes RAM.)",
optional => 1,
default => 0,
},
@@ -5093,7 +5094,7 @@ __PACKAGE__->register_method({
PVE::Cluster::log_msg('info', $authuser, "rollback snapshot VM 
$vmid: $snapname");
PVE::QemuConfig->snapshot_rollback($vmid, $snapname);
 
-   if ($param->{start}) {
+   if ($param->{start} && !PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
PVE::API2::Qemu->vm_start({ vmid => $vmid, node => $node });
}
};
-- 
2.30.2

