commit description missing here as well..

I haven't tested this (or the first patches doing the blockdev conversion) yet, 
but I see a few bigger design/architecture issues left (besides FIXMEs for 
missing pieces that previously worked ;)):

- we should probably move the decision whether a snapshot is done on the 
storage layer or by qemu into the control of the storage plugin, especially 
since we are currently cleaning that API up to allow easier implementation of 
external plugins
- if we do that, we can also make "uses external qcow2 snapshots" a property of 
the storage plugin+config to replace hard-coded checks for the snapext property 
or lvm+qcow2
- there are a few operations here that should not call directly into the 
storage plugin code or do equivalent actions, but should rather get a proper 
interface in that storage plugin API

the first one is the renaming of a blockdev while it is used, which is 
currently done like this:
-- "link" snapshot path to make it available under old and new name
-- handle blockdev additions/reopening/backing-file updates/deletions on the 
qemu layer
-- remove old snapshot path link
-- if LVM, rename actual volume (for non-LVM, linking followed by unlinking the 
source is effectively a rename already)

I wonder whether that couldn't be made more straightforward by doing
-- rename snapshot volume/image (qemu must already have the old name open 
anyway and should be able to continue using it)
-- do blockdev additions/reopening/backing-file updates/deletions on the qemu 
layer

or is there an issue/check in qemu somewhere that prevents this approach? if 
not, we could just introduce a "volume_snapshot_rename" or extend rename_volume 
with a snapshot parameter..
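
to illustrate why I'd expect rename-first to work for file-based storages: on POSIX file systems an open handle follows the inode, not the name. a minimal stand-alone sketch in plain Perl (no PVE or qemu code involved, the file names are made up, and it says nothing about LVM or about additional checks qemu itself might do):

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $old = "$dir/vm-100-disk-0.qcow2";          # made-up "current" image name
my $new = "$dir/snap-foo-vm-100-disk-0.qcow2"; # made-up snapshot image name

# stand-in for qemu holding the image open under its old name
open(my $fh, '>', $old) or die "open failed: $!";
$fh->autoflush(1);
print $fh "before rename\n";

# stand-in for the storage layer renaming the volume
rename($old, $new) or die "rename failed: $!";

# the already-open handle keeps working, the write lands in the renamed file
print $fh "after rename\n";
close($fh);

open(my $check, '<', $new) or die "open failed: $!";
my @lines = <$check>;
close($check);

my $old_gone = (-e $old) ? 0 : 1;
print "old name gone: ", ($old_gone ? "yes" : "no"), "\n";
print "lines in renamed file: ", scalar(@lines), "\n";
```

whether qemu's node/filename bookkeeping is equally happy with this is exactly the open question above.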

the second thing that happens is deleting a snapshot volume/path without 
deleting the whole snapshot.. that one we could easily support in 
volume_snapshot_delete, either by extending the $running parameter (e.g., 
passing "2") or by adding a new one, to signify that all the housekeeping was 
already done and just the actual snapshot volume should be deleted. this 
shouldn't be an issue provided all such calls are guarded by first checking 
that we are using external snapshots..
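
a rough sketch of what I mean by the extra $running value, using a mock package instead of the real plugin code. everything here is hypothetical: MockPlugin, the logged steps and the meaning of "2" are only placeholders for illustration, not the existing PVE::Storage interface:

```perl
use strict;
use warnings;

# hypothetical stand-in for a storage plugin, NOT the real PVE::Storage API
package MockPlugin;

sub new { return bless { log => [] }, shift }

# $running: undef/0 = VM stopped, 1 = VM running,
#           2 = (assumed new value) caller already did all the housekeeping
sub volume_snapshot_delete {
    my ($self, $volname, $snap, $running) = @_;

    if (defined($running) && $running == 2) {
        # only free the now-unused snapshot volume/file
        push @{$self->{log}}, "free snapshot volume of '$snap' only";
        return;
    }

    # placeholder steps for the regular, full deletion path
    push @{$self->{log}}, "merge data of '$snap' as needed";
    push @{$self->{log}}, "fix up backing chain";
    push @{$self->{log}}, "free snapshot volume of '$snap'";
    return;
}

package main;

my $plugin = MockPlugin->new();
$plugin->volume_snapshot_delete('vm-100-disk-0.qcow2', 'snap1', 2);
print "$_\n" for @{$plugin->{log}};
```

a separate, explicitly named parameter would probably read better than a magic "2", but that is a detail.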

> On 11.03.2025 11:29 CET, Alexandre Derumier via pve-devel 
> <pve-devel@lists.proxmox.com> wrote:
> Signed-off-by: Alexandre Derumier <alexandre.derum...@groupe-cyllene.com>
> ---
>  PVE/QemuConfig.pm       |   4 +-
>  PVE/QemuServer.pm       | 226 +++++++++++++++++++++++++++++++++++++---
>  PVE/QemuServer/Drive.pm |   4 +
>  3 files changed, 220 insertions(+), 14 deletions(-)
> 
> diff --git a/PVE/QemuConfig.pm b/PVE/QemuConfig.pm
> index b60cc398..2b3acb15 100644
> --- a/PVE/QemuConfig.pm
> +++ b/PVE/QemuConfig.pm
> @@ -377,7 +377,7 @@ sub __snapshot_create_vol_snapshot {
>  
>      print "snapshotting '$device' ($drive->{file})\n";
>  
> -    PVE::QemuServer::qemu_volume_snapshot($vmid, $device, $storecfg, $volid, 
> $snapname);
> +    PVE::QemuServer::qemu_volume_snapshot($vmid, $device, $storecfg, $drive, 
> $snapname);
>  }
>  
>  sub __snapshot_delete_remove_drive {
> @@ -414,7 +414,7 @@ sub __snapshot_delete_vol_snapshot {
>      my $storecfg = PVE::Storage::config();
>      my $volid = $drive->{file};
>  
> -    PVE::QemuServer::qemu_volume_snapshot_delete($vmid, $storecfg, $volid, 
> $snapname);
> +    PVE::QemuServer::qemu_volume_snapshot_delete($vmid, $storecfg, $drive, 
> $snapname);
>  
>      push @$unused, $volid;
>  }
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 60481acc..6ce3e9c6 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -4449,20 +4449,200 @@ sub qemu_block_resize {
>  }
>  
>  sub qemu_volume_snapshot {
> -    my ($vmid, $deviceid, $storecfg, $volid, $snap) = @_;
> +    my ($vmid, $deviceid, $storecfg, $drive, $snap) = @_;
>  
> +    my $volid = $drive->{file};
>      my $running = check_running($vmid);
> -
> -    if ($running && do_snapshots_with_qemu($storecfg, $volid, $deviceid)) {
> -     mon_cmd($vmid, 'blockdev-snapshot-internal-sync', device => $deviceid, 
> name => $snap);
> +    my $do_snapshots_with_qemu = do_snapshots_with_qemu($storecfg, $volid, 
> $deviceid) if $running;

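forbidden syntax: a `my` declaration combined with a statement modifier 
(`my $x = ... if $cond;`) has undefined behavior in Perl (see perlsyn). 
declare the variable unconditionally and assign conditionally instead. the same 
pattern appears again in qemu_volume_snapshot_delete further down.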
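
for reference, a stand-alone sketch of the usual replacement for the `my ... if ...;` construct (perlsyn documents its behavior as undefined). the two subs are stubs for illustration only, not the real PVE::QemuServer functions, and snapshot_mode is a made-up wrapper:

```perl
use strict;
use warnings;

# stubs for illustration only
sub check_running { my ($vmid) = @_; return $vmid == 100 ? 1 : 0; }
sub do_snapshots_with_qemu { return 2; }

sub snapshot_mode {
    my ($vmid) = @_;
    my $running = check_running($vmid);

    # declare unconditionally, assign conditionally
    my $do_snapshots_with_qemu = $running ? do_snapshots_with_qemu() : undef;

    return $do_snapshots_with_qemu;
}

print defined(snapshot_mode(100)) ? "qemu\n" : "storage\n";
print defined(snapshot_mode(101)) ? "qemu\n" : "storage\n";
```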

> +    if ($do_snapshots_with_qemu) {
> +     if($do_snapshots_with_qemu == 2) {
> +         my $snapshots = PVE::Storage::volume_snapshot_info($storecfg, 
> $volid);
> +         my $parent_snap = $snapshots->{'current'}->{parent};
> +         my $size = PVE::Storage::volume_size_info($storecfg, $volid, 5);
> +         blockdev_rename($storecfg, $vmid, $deviceid, $drive, 'current', 
> $snap, $parent_snap);
> +         blockdev_external_snapshot($storecfg, $vmid, $deviceid, $drive, 
> $snap, $size);
> +     } else {
> +         mon_cmd($vmid, 'blockdev-snapshot-internal-sync', device => 
> $deviceid, name => $snap);
> +     }
>      } else {
>       PVE::Storage::volume_snapshot($storecfg, $volid, $snap);
>      }
>  }
>  
> +sub blockdev_external_snapshot {
> +    my ($storecfg, $vmid, $deviceid, $drive, $snap, $size) = @_;
> +
> +    my $volid = $drive->{file};
> +
> +    #be sure to add drive in write mode
> +    delete($drive->{ro});

why?

> +
> +    my $new_file_blockdev = generate_file_blockdev($storecfg, $drive);
> +    my $new_fmt_blockdev = generate_format_blockdev($storecfg, $drive, 
> $new_file_blockdev);
> +
> +    my $snap_file_blockdev = generate_file_blockdev($storecfg, $drive, 
> $snap);
> +    my $snap_fmt_blockdev = generate_format_blockdev($storecfg, $drive, 
> $snap_file_blockdev, $snap);
> +
> +    #preallocate add a new current file with reference to backing-file
> +    my ($storeid, $volname) = PVE::Storage::parse_volume_id($volid);
> +    my $name = (PVE::Storage::parse_volname($storecfg, $volid))[1];
> +    PVE::Storage::vdisk_alloc($storecfg, $storeid, $vmid, 'qcow2', $name, 
> $size/1024, $snap_file_blockdev->{filename});

if we instead extend volume_snapshot similarly to what I describe up top 
(adding a parameter that renaming was already done), we don't need to extend 
vdisk_alloc's interface like this.. or maybe we could even combine 
blockdev_rename and blockdev_external_snapshot, to just call 
PVE::Storage::volume_snapshot to do rename+alloc, and then do the blockdev 
dance? in any case, this here would be the *only* external caller of 
vdisk_alloc with a backing file, so I don't think this is the right interface..

> +
> +    #backing need to be forced to undef in blockdev, to avoid reopen of 
> backing-file on blockdev-add
> +    $new_fmt_blockdev->{backing} = undef;
> +
> +    PVE::QemuServer::Monitor::mon_cmd($vmid, 'blockdev-add', 
> %$new_fmt_blockdev);
> +
> +    mon_cmd($vmid, 'blockdev-snapshot', node => 
> $snap_fmt_blockdev->{'node-name'}, overlay => 
> $new_fmt_blockdev->{'node-name'});
> +}
> +
> +sub blockdev_delete {
> +    my ($storecfg, $vmid, $drive, $file_blockdev, $fmt_blockdev) = @_;
> +
> +    #add eval as reopen is auto removing the old nodename automatically only 
> if it was created at vm start in command line argument
> +    eval { mon_cmd($vmid, 'blockdev-del', 'node-name' => 
> $file_blockdev->{'node-name'}) };
> +    eval { mon_cmd($vmid, 'blockdev-del', 'node-name' => 
> $fmt_blockdev->{'node-name'}) };
> +
> +    #delete the file (don't use vdisk_free as we don't want to delete all 
> snapshot chain)
> +    print"delete old $file_blockdev->{filename}\n";
> +
> +    my $storage_name = PVE::Storage::parse_volume_id($drive->{file});
> +    my $scfg = $storecfg->{ids}->{$storage_name};
> +    if ($scfg->{type} eq 'lvm') {
> +     PVE::Storage::LVMPlugin::lvremove($file_blockdev->{filename});
> +    } else {
> +     unlink($file_blockdev->{filename});
> +    }

this really needs to be handled in the storage layer

> +}
> +
> +sub blockdev_rename {
> +    my ($storecfg, $vmid, $deviceid, $drive, $src_snap, $target_snap, 
> $parent_snap) = @_;
> +
> +    print "rename $src_snap to $target_snap\n";
> +
> +    my $volid = $drive->{file};
> +
> +    my $src_file_blockdev = generate_file_blockdev($storecfg, $drive, 
> $src_snap);
> +    my $src_fmt_blockdev = generate_format_blockdev($storecfg, $drive, 
> $src_file_blockdev, $src_snap);
> +    my $target_file_blockdev = generate_file_blockdev($storecfg, $drive, 
> $target_snap);
> +    my $target_fmt_blockdev = generate_format_blockdev($storecfg, $drive, 
> $target_file_blockdev, $target_snap);
> +
> +    #create a hardlink
> +    link($src_file_blockdev->{filename}, $target_file_blockdev->{filename});

this really needs to be handled in the storage layer

> +
> +    if($target_snap eq 'current' || $src_snap eq 'current') {
> +     #rename from|to current
> +
> +     #add backing to target
> +     if ($parent_snap) {
> +         my $parent_fmt_nodename = encode_nodename('fmt', $volid, 
> $parent_snap);
> +         $target_fmt_blockdev->{backing} = $parent_fmt_nodename;
> +     }
> +     PVE::QemuServer::Monitor::mon_cmd($vmid, 'blockdev-add', 
> %$target_fmt_blockdev);
> +
> +     #reopen the current throttlefilter nodename with the target fmt nodename
> +     my $drive_blockdev = generate_drive_blockdev($storecfg, $vmid, $drive);
> +     delete $drive_blockdev->{file};
> +     $drive_blockdev->{file} = $target_fmt_blockdev->{'node-name'};

these two lines can be a single line, the delete is redundant since the 
assignment overwrites the key anyway

> +     PVE::QemuServer::Monitor::mon_cmd($vmid, 'blockdev-reopen', options => 
> [$drive_blockdev]);
> +    } else {
> +     #intermediate snapshot
> +     PVE::QemuServer::Monitor::mon_cmd($vmid, 'blockdev-add', 
> %$target_fmt_blockdev);
> +
> +     #reopen the parent node with the new target fmt backing node
> +     my $parent_file_blockdev = generate_file_blockdev($storecfg, $drive, 
> $parent_snap);
> +     my $parent_fmt_blockdev = generate_format_blockdev($storecfg, $drive, 
> $parent_file_blockdev, $parent_snap);
> +     $parent_fmt_blockdev->{backing} = $target_fmt_blockdev->{'node-name'};
> +     PVE::QemuServer::Monitor::mon_cmd($vmid, 'blockdev-reopen', options => 
> [$parent_fmt_blockdev]);
> +
> +     #change backing-file in qcow2 metadatas
> +     PVE::QemuServer::Monitor::mon_cmd($vmid, 'change-backing-file', device 
> => $deviceid, 'image-node-name' => $parent_fmt_blockdev->{'node-name'}, 
> 'backing-file' => $target_file_blockdev->{filename});
> +    }
> +
> +    # delete old file|fmt nodes
> +    # add eval as reopen is auto removing the old nodename automatically 
> only if it was created at vm start in command line argument

ugh..

> +    eval { PVE::QemuServer::Monitor::mon_cmd($vmid, 'blockdev-del', 
> 'node-name' => $src_file_blockdev->{'node-name'})};
> +    eval { PVE::QemuServer::Monitor::mon_cmd($vmid, 'blockdev-del', 
> 'node-name' => $src_fmt_blockdev->{'node-name'})};
> +
> +    unlink($src_file_blockdev->{filename});

same as above

> +
> +    #rename underlay
> +    my $storage_name = PVE::Storage::parse_volume_id($volid);
> +    my $scfg = $storecfg->{ids}->{$storage_name};
> +    return if $scfg->{type} ne 'lvm';
> +
> +    print "rename underlay lvm volume $src_file_blockdev->{filename} to 
> $target_file_blockdev->{filename}\n";
> +    PVE::Storage::LVMPlugin::lvrename(undef, $src_file_blockdev->{filename}, 
> $target_file_blockdev->{filename});

absolute no-go, this needs to be handled in the storage layer

> +}
> +
> +sub blockdev_commit {
> +    my ($storecfg, $vmid, $deviceid, $drive, $src_snap, $target_snap) = @_;
> +
> +    my $volid = $drive->{file};
> +
> +    print "block-commit $src_snap to base:$target_snap\n";
> +    $src_snap = undef if $src_snap && $src_snap eq 'current';
> +
> +    my $target_file_blockdev = generate_file_blockdev($storecfg, $drive, 
> $target_snap);
> +    my $target_fmt_blockdev = generate_format_blockdev($storecfg, $drive, 
> $target_file_blockdev, $target_snap);
> +
> +    my $src_file_blockdev = generate_file_blockdev($storecfg, $drive, 
> $src_snap);
> +    my $src_fmt_blockdev = generate_format_blockdev($storecfg, $drive, 
> $src_file_blockdev, $src_snap);
> +
> +    my $job_id = "commit-$deviceid";
> +    my $jobs = {};
> +    my $opts = { 'job-id' => $job_id, device => $deviceid };
> +
> +    my $complete = undef;
> +    if ($src_snap) {
> +     $complete = 'auto';
> +     $opts->{'top-node'} = $src_fmt_blockdev->{'node-name'};
> +     $opts->{'base-node'} = $target_fmt_blockdev->{'node-name'};
> +    } else {
> +     $complete = 'complete';
> +     $opts->{'base-node'} = $target_fmt_blockdev->{'node-name'};
> +     $opts->{replaces} = $src_fmt_blockdev->{'node-name'};
> +    }
> +
> +    mon_cmd($vmid, "block-commit", %$opts);
> +    $jobs->{$job_id} = {};
> +    qemu_drive_mirror_monitor ($vmid, undef, $jobs, $complete, 0, 'commit');
> +
> +    blockdev_delete($storecfg, $vmid, $drive, $src_file_blockdev, 
> $src_fmt_blockdev);
> +}
> +
> +sub blockdev_stream {
> +    my ($storecfg, $vmid, $deviceid, $drive, $snap, $parent_snap, 
> $target_snap) = @_;
> +
> +    my $volid = $drive->{file};
> +    $target_snap = undef if $target_snap eq 'current';
> +
> +    my $parent_file_blockdev = generate_file_blockdev($storecfg, $drive, 
> $parent_snap);
> +    my $parent_fmt_blockdev = generate_format_blockdev($storecfg, $drive, 
> $parent_file_blockdev, $parent_snap);
> +
> +    my $target_file_blockdev = generate_file_blockdev($storecfg, $drive, 
> $target_snap);
> +    my $target_fmt_blockdev = generate_format_blockdev($storecfg, $drive, 
> $target_file_blockdev, $target_snap);
> +
> +    my $snap_file_blockdev = generate_file_blockdev($storecfg, $drive, 
> $snap);
> +    my $snap_fmt_blockdev = generate_format_blockdev($storecfg, $drive, 
> $snap_file_blockdev, $snap);
> +
> +    my $job_id = "stream-$deviceid";
> +    my $jobs = {};
> +    my $options = { 'job-id' => $job_id, device => 
> $target_fmt_blockdev->{'node-name'} };
> +    $options->{'base-node'} = $parent_fmt_blockdev->{'node-name'};
> +    $options->{'backing-file'} = $parent_file_blockdev->{filename};
> +
> +    mon_cmd($vmid, 'block-stream', %$options);
> +    $jobs->{$job_id} = {};
> +    qemu_drive_mirror_monitor($vmid, undef, $jobs, 'auto', 0, 'stream');
> +
> +    blockdev_delete($storecfg, $vmid, $drive, $snap_file_blockdev, 
> $snap_fmt_blockdev);
> +}
> +
>  sub qemu_volume_snapshot_delete {
> -    my ($vmid, $storecfg, $volid, $snap) = @_;
> +    my ($vmid, $storecfg, $drive, $snap) = @_;
>  
> +    my $volid = $drive->{file};
>      my $running = check_running($vmid);
>      my $attached_deviceid;
>  
> @@ -4474,13 +4654,35 @@ sub qemu_volume_snapshot_delete {
>       });
>      }
>  
> -    if ($attached_deviceid && do_snapshots_with_qemu($storecfg, $volid, 
> $attached_deviceid)) {
> -     mon_cmd(
> -         $vmid,
> -         'blockdev-snapshot-delete-internal-sync',
> -         device => $attached_deviceid,
> -         name => $snap,
> -     );
> +    my $do_snapshots_with_qemu = do_snapshots_with_qemu($storecfg, $volid, 
> $attached_deviceid) if $running;
> +    if ($attached_deviceid && $do_snapshots_with_qemu) {
> +
> +     if ($do_snapshots_with_qemu == 2) {
> +
> +         my $path = PVE::Storage::path($storecfg, $volid);
> +         my $snapshots = PVE::Storage::volume_snapshot_info($storecfg, 
> $volid);
> +         my $parentsnap = $snapshots->{$snap}->{parent};
> +         my $childsnap = $snapshots->{$snap}->{child};
> +
> +         # if we delete the first snasphot, we commit because the first 
> snapshot original base image, it should be big.
> +            # improve-me: if firstsnap > child : commit, if firstsnap < 
> child do a stream.
> +         if(!$parentsnap) {
> +             print"delete first snapshot $snap\n";
> +             blockdev_commit($storecfg, $vmid, $attached_deviceid, $drive, 
> $childsnap, $snap);
> +             blockdev_rename($storecfg, $vmid, $attached_deviceid, $drive, 
> $snap, $childsnap, $snapshots->{$childsnap}->{child});
> +         } else {
> +             #intermediate snapshot, we always stream the snapshot to child 
> snapshot
> +             print"stream intermediate snapshot $snap to $childsnap\n";
> +             blockdev_stream($storecfg, $vmid, $attached_deviceid, $drive, 
> $snap, $parentsnap, $childsnap);
> +         }
> +     } else {
> +         mon_cmd(
> +             $vmid,
> +             'blockdev-snapshot-delete-internal-sync',
> +             device => $attached_deviceid,
> +             name => $snap,
> +         );
> +     }
>      } else {
>       PVE::Storage::volume_snapshot_delete(
>           $storecfg, $volid, $snap, $attached_deviceid ? 1 : undef);
> diff --git a/PVE/QemuServer/Drive.pm b/PVE/QemuServer/Drive.pm
> index 51513546..7ba401bd 100644
> --- a/PVE/QemuServer/Drive.pm
> +++ b/PVE/QemuServer/Drive.pm
> @@ -1117,6 +1117,8 @@ sub print_drive_throttle_group {
>  sub generate_file_blockdev {
>      my ($storecfg, $drive, $snap, $nodename) = @_;
>  
> +    $snap = undef if $snap && $snap eq 'current';
> +
>      my $volid = $drive->{file};
>      my $blockdev = {};
>  
> @@ -1260,6 +1262,8 @@ sub do_snapshots_with_qemu {
>  sub generate_format_blockdev {
>      my ($storecfg, $drive, $file, $snap, $nodename) = @_;
>  
> +    $snap = undef if $snap && $snap eq 'current';
> +
>      my $volid = $drive->{file};
>      die "format_blockdev can't be used for nbd" if $volid =~ /^nbd:/;
>  
> -- 
> 2.39.5

