--- Begin Message ---
Hi

On 08/10/2024 13:50, Max Carrara wrote:
On Fri Oct 4, 2024 at 3:54 PM CEST, Ivaylo Markov wrote:
Greetings,

I am the maintainer of StorPool’s external storage plugin for PVE[0]
which integrates our storage solution as a backend for VM disks. Our
software has the ability to create atomic (crash-consistent) snapshots
of a group of storage volumes. I’d like to use this feature in our
plugin so that customers can perform whole VM snapshots, but that does
not seem possible currently - the snapshot creation method is called
individually for every disk.
Hello!

This does sound quite interesting.

I was directed here to discuss this proposal and my implementation idea
after an initial post in Bugzilla[1]. The goal is to give storage
plugins the option to perform atomic crash-consistent snapshots of the
virtual disks associated with a virtual machine where the backend
supports it  (e.g. Ceph, StorPool, and ZFS) without affecting those
without such a feature.
Since you mentioned that the backends without such a feature won't be
affected, will the disks of the storage types that *do* support it still
be addressable individually? As in: Would it be possible to run both
group snapshots and individual snapshots on a VM's disks? (I'm assuming
that they will, but still wanted to ask.)
No reason to remove the individual snapshot capability, my proposal is concerned entirely with whole-VM snapshots (perhaps I should've stated that explicitly).
I would add a `can_snapshot_volume_group` method to the base
`PVE::Storage::Plugin` class, which would accept an array of the VM’s
disks, and return a binary result whether an atomic snapshot is
possible. The default implementation would return 0, but plugins with
support can override it based on backend capabilities. For example, ZFS
supports atomic snapshot of volume groups, but requires all volumes to
be in the same pool.
The actual snapshot can be performed by a `snapshot_volume_group
method`, which is not expected to be called unless the driver supports
this operation.
For both of these methods you'll have to keep in mind that the
`...::Plugin` class is designed to only handle a single volume at once.

I'm not against implementing methods that support addressing multiple
volumes (disks) at once, though, but the `PVE::Storage` API must handle
this gracefully. See more down below.

In `PVE::AbstractConfig::snapshot_create` these two methods can be used
to check and perform the atomic snapshots where possible, otherwise it
would keep the current behavior of creating a snapshot for each disk
separately. If the VM has drives with completely different storage
backends (e.g. ZFS and LVM for whatever reason), the driver check can be
skipped, and the current behavior used again.
This sounds good to me, though one thing that should be noted here is
that everything should go through the `PVE::Storage` API, *not*
`PVE::Storage::Plugin`.

This means that there need to be API subroutines for the `PVE::Storage`
module that are able to handle all the cases you described above. These
subs then work with `PVE::Storage::Plugin::can_snapshot_volume_group`
and `...::snapshot_volume_group`.

In general, this looks something like this:
   * `PVE::Storage::Plugin` defines interface for concrete storage plugin
     implementations
   * `PVE::Storage` isn't aware of the concrete implementations and
     strictly uses only the `...::Plugin` interface
   * `PVE::AbstractConfig` and others only use the API methods provided
     by `PVE::Storage`, allowing us to abstract over many different types
     of storages

Since you already got your own storage plugin, I assume you're aware of
all of this, but I wanted to mention it regardless to avoid any
misunderstandings.

Sooo, all of this means that this will also require subroutines in
`PVE::Storage` that expose this kind of functionality (checking if group
snapshots are possible, performing group snapshots, ...).

Overall, you'd have to handle the following  cases in `PVE::Storage`,
from what I can think of at the moment:
   * Whether there is more than one underlying storage type
     (if there is, don't support group snapshots)
   * Whether the underlying storage type supports group snapshots *in
     general*
   * Whether a group snapshot may actually be performed on the passed
     constellation of volumes (disks)
   * ...

You'll probably need to introduce more than the two methods in
`PVE::Storage::Plugin` you mentioned above in order to really support
group snapshots for each different storage type and handle all the
(edge) cases.
Noted, thank you for the clarification. I will have to take another look at this part of the code, but I think I get the gist of it.

Though, I really like the idea overall; I don't see why we shouldn't
add this.
Great to hear.

If you haven't already, you should have a look at this regarding
contributions: 
https://pve.proxmox.com/wiki/Developer_Documentation#Software_License_and_Copyright

Will have to run this by the company before I get started, but I don't see it being an issue at all.



[0] https://github.com/storpool/pve-storpool
[1] https://bugzilla.proxmox.com/show_bug.cgi?id=5752

Best regards,

--
Ivaylo Markov
Quality & Automation Engineer
StorPool Storage
https://www.storpool.com



--- End Message ---
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

Reply via email to