[pve-devel] issues with Virtio-SCSI device on Proxmox...
Hello,

I work for VSI (VMS Software Inc), which is porting the OpenVMS operating system to x86. At this point we run successfully on various hypervisors, but have some issues on the KVM running on Proxmox.

The OpenVMS VM works just fine with SATA disks and it also works with, for example, the virtio-network device, but trying to use virtio-scsi hangs the driver. I have debugged this and I can successfully configure the port/controller and send the I/O request to the device. It then gets processed by the device, which posts the results and sets the interrupt bit in the ISR register, but it never asserts the interrupt, hence the driver never gets notified and the I/O hangs.

I have tried both “virtio-scsi-pci” and “virtio-scsi-single”, but no luck. The emulated virtio-scsi device is a legacy device. But then again, the virtio-network device is also a legacy device and there we are getting interrupts. One thing which bothers me is the fact that the “legacy interrupt disable” bit is set in the PCI config space of the virtio-scsi device (i.e. bit 10 at offset 4).

Questions:
* is there a way to configure a modern virtio-scsi device (i.e. disable_legacy=on)?
* why is the legacy interrupt bit set in the PCI config space?
* are there any working drivers for virtio-scsi on this KVM using a Q35 machine, i.e. any other OS?

Any thoughts why these interrupts are not getting delivered on the PCIe bus?

thanks
/cmos

___
Christian Moser
Mobile: +358-40-5022105
Email: c...@maklee.com
URL: www.maklee.com

___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
Re: [pve-devel] Interest in a file manager interface within pve-manager?
On 7/12/24 00:27, Blythe, Nathan F. - US via pve-devel wrote:
> Hello,

Hi, sorry for the late answer. First, thanks for a well-written proposal/question. This is what we'd like to see when third parties are interested in making features/changes/etc. I'll try to answer the question/proposal to the best of my knowledge, but maybe some of my colleagues have additional things to say or want to correct me (this could take a bit though, due to the summer holiday season).

> We have a need for a non-network based (i.e. hypercall / out-of-band) file browser for containers and VMs within the pve-manager web UI. I'd like to be able to select a VM or container, click "More", and choose "File manager" and then browse the VM or container's file systems and upload or download files from my local system, through the web-browser.
>
> For VMs (qemu), the qemu-guest-agent provides the ability to read/write guest files, and the pve API exposes it (with strict file size limits) as nodes->{node}->qemu->{vmid}->agent->file-*. Presumably, a web UI file manager would use that API endpoint, plus maybe an enhancement for directory listing and chunked upload/download (to support arbitrary size files).
>
> For containers (LXC), the pct application can read/write files by (I think) mounting the container root file system on the host and touching it directly. But there's no corresponding API endpoint. If there were, it could be used by a future web UI file manager.
>
> So to implement this feature, I think there are 5 steps:
> 1. Enhance the qemu-guest-agent and qm to support reading directory structure and creating directories.
> 2. Enhance the pve API to expose these new features like file-read and file-write.
> 3. Enhance the pve API to support chunked file read and write, for large files. Maybe also requiring an enhancement to the qemu-guest-agent?
> 4. Enhance the pve API to support chunked file read/write, whole file read/write, and directory structure/creating directories (through direct filesystem interaction on the host).
> 5. Write a JS front-end file manager, invokable from the Other menu, which uses these new/enhanced API endpoints to implement a general purpose out-of-band file manager.
>
> Is there interest from the proxmox developers for something like this? If we started working on some of these items, would there be interest in accepting patches? Does my general approach look sound?

After a bit of internal discussion, the general tone is that we probably don't want to expose the guest content/state by default this much now; in the longer term it could be OK, but we see the following issues:

* For now QEMU allows those guest calls by default, and there is at the moment not a really good way for configuring what's allowed and what not.
* Our privileges and ACLs are not really designed for this currently.
* Handling permissions (and other filesystem/guest "quirks") will be very hard (e.g. Windows permissions are very different from Linux/Unix ones).

As breaking the host/guest barrier can have bad implications for security/data integrity/etc., those problems would have to be solved before one could/should start with such a feature.

There is currently some work done on the QEMU side for limiting/configuring QGA calls:
https://lore.kernel.org/qemu-devel/20240604153242.251334-1-berra...@redhat.com/

When that's integrated, one could think about how our ACLs and privileges could be extended to allow such things. Only after those are solved does implementing this feature make sense.
What could be done in the meantime is probably to work with upstream QEMU to add a directory listing to the QGA and maybe some proper permission handling. This is relatively independent from our side; since we don't actually ship the guest agent ourselves, this has to be solved upstream anyway.

So all in all, such a feature could work, but only in the longer term; this is nothing for the short/mid term.

Hope this answers your questions. If not, just write again :)

> (We also investigated using SPICE's folder-sharing capabilities and the spice-proxy provided by the host, but that doesn't work with the Windows virt-viewer client, only the Linux client. It looks like there was some interest in the past in enhancing xterm.js to support xmodem file transfer, but it didn't go far.)
>
> Regards,
> Nathan Blythe
> CACI

Best regards
Dominik

___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
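[Editor's illustration] The existing agent file API mentioned above builds on the QEMU guest agent commands guest-file-open/guest-file-read/guest-file-close. The Perl sketch below shows roughly how those commands can be driven; it assumes, as elsewhere in qemu-server, that mon_cmd() routes 'guest-*' commands to the guest agent socket, and the helper name read_guest_file is hypothetical, not part of pve-manager:

use strict;
use warnings;
use MIME::Base64 qw(decode_base64);
use PVE::QemuServer::Monitor qw(mon_cmd);

sub read_guest_file {
    my ($vmid, $path) = @_;

    # Open the file inside the guest; the agent returns an integer handle.
    my $handle = mon_cmd($vmid, 'guest-file-open', path => $path, mode => 'r');

    my $content = '';
    my $eof = 0;
    while (!$eof) {
        # The agent returns base64-encoded chunks; the PVE API additionally
        # enforces strict size limits on top of this.
        my $res = mon_cmd($vmid, 'guest-file-read', handle => $handle, count => 1024 * 1024);
        $content .= decode_base64($res->{'buf-b64'});
        $eof = $res->{eof};
    }
    mon_cmd($vmid, 'guest-file-close', handle => $handle);

    return $content;
}

A directory-listing counterpart to these commands is exactly what is missing today, which is why the answer above points to upstream QGA work first.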
[pve-devel] [PATCH v2 qemu-server] remote migration: fix online migration via API clients
As reported in the community forum [0], when a remote migration request comes in via an API client, the -T flag for Perl is set, so an insecure dependency in a call like unlink() in forward_unix_socket() will fail with: > failed to write forwarding command - Insecure dependency in unlink while > running with -T switch To fix it, untaint the problematic socket addresses coming from the remote side. Require that all sockets are below '/run/qemu-server/' and end with '.migrate'. This allows extensions in the future while still being quite strict. [0]: https://forum.proxmox.com/threads/123048/post-691958 Signed-off-by: Fiona Ebner --- Changes in v2: * rule out '../' in path PVE/QemuMigrate.pm | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm index e71face4..d31589e7 100644 --- a/PVE/QemuMigrate.pm +++ b/PVE/QemuMigrate.pm @@ -1095,7 +1095,10 @@ sub phase2 { die "only UNIX sockets are supported for remote migration\n" if $tunnel_info->{proto} ne 'unix'; - my $remote_socket = $tunnel_info->{addr}; + # untaint + my ($remote_socket) = + $tunnel_info->{addr} =~ m|^(/run/qemu-server/(?:(?!\.\./).)+\.migrate)$|; + die "unexpected socket address '$tunnel_info->{addr}'\n" if !$remote_socket; my $local_socket = $remote_socket; $local_socket =~ s/$remote_vmid/$vmid/g; $tunnel_info->{addr} = $local_socket; @@ -1104,6 +1107,9 @@ sub phase2 { PVE::Tunnel::forward_unix_socket($self->{tunnel}, $local_socket, $remote_socket); foreach my $remote_socket (@{$tunnel_info->{unix_sockets}}) { + # untaint + ($remote_socket) = $remote_socket =~ m|^(/run/qemu-server/(?:(?!\.\./).)+\.migrate)$| + or die "unexpected socket address '$remote_socket'\n"; my $local_socket = $remote_socket; $local_socket =~ s/$remote_vmid/$vmid/g; next if $self->{tunnel}->{forwarded}->{$local_socket}; -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
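[Editor's illustration] For readers less familiar with Perl taint mode, here is a small standalone sketch (not part of the patch) showing how the capture in the patched regex both untaints the value and rejects paths containing '../':

#!/usr/bin/perl -T
use strict;
use warnings;

sub untaint_socket_path {
    my ($addr) = @_;
    # Capturing into a new variable is what clears Perl's taint flag; the
    # (?:(?!\.\./).)+ part refuses any '../' sequence inside the path.
    my ($socket) = $addr =~ m|^(/run/qemu-server/(?:(?!\.\./).)+\.migrate)$|;
    die "unexpected socket address '$addr'\n" if !defined($socket);
    return $socket;
}

print untaint_socket_path('/run/qemu-server/123.migrate'), "\n";          # accepted
print untaint_socket_path('/run/qemu-server/123_nbd.migrate'), "\n";      # accepted
eval { untaint_socket_path('/run/qemu-server/../../etc/passwd.migrate') };
print $@;                                                                  # rejected
eval { untaint_socket_path('/tmp/evil.migrate') };
print $@;                                                                  # rejected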
Re: [pve-devel] Interest in a file manager interface within pve-manager?
sorry for the noise, it seems i forgot to put you in CC, so now again with that... On 8/13/24 10:35, Dominik Csapak wrote: On 7/12/24 00:27, Blythe, Nathan F. - US via pve-devel wrote: Hello, Hi, sorry for the late answer. First thanks for a well written proposal/question. This is what we'd like to see if third-parties are interested in making features/changes/etc. I try to answer the question/proposal to the best of my knowledge, but maybe some of my colleagues have additional things to say or want to correct me (this could take a bit though, due to summer holiday season) We have a need for a non-network based (i.e. hypercall / out-of-band) file browser for containers and VMs within the pve-manager web UI. I'd like to be able to select a VM or container, click "More", and choose "File manager" and then browse the VM or container's file systems and upload or download files from my local system, through the web-browser. For VMs (qemu), the qemu-guest-agent provides the ability to read/write guest files, and the pve API exposes it (with strict file size limits) as nodes->{node}->qemu->{vmid}->agent->file-*. Presumably, a web UI file manager would use that API endpoint, plus maybe an enhancement for directory listing and chunked upload/download (to support arbitrary size files). For containers (LXC), the pct application can read/write files by (I think) mounting the container root file system on the host and touching it directly. But there's no corresponding API endpoint. If there were, it could be used by a future web UI file manager. So to implement this feature, I think there are 5 steps: 1. Enhance the qemu-guest-agent and qm to support reading directory structure and creating directories. 2. Enhance the pve API to expose these new features like file-read and file-write. 3. Enhance the pve API to support chunked file read and write, for large files. Maybe also requiring an enhancement to the qemu-guest-agent? 4. Enhance the pve API to support chunked file read/write, whole file read/write, and directory structure/creating directories (through direct filesystem interaction on the host). 5. Write a JS front-end file manager, invokable from the Other menu, which uses these new/ enhanced API endpoints to implement a general purpose out-of-band file manager. Is there interest from the proxmox developers for something like this? If we started working on some of these items, would there be interest in accepting patches? Does my general approach look sound? After a bit of internal discussion, the general tone is that we probably don't want to expose the guest content/state by default this much now, but in the longer term it could be OK, but we see the following issues: * For now QEMU allows those guest calls by default, and there is at the moment not a really good way for configuring what's allowed and what not. * Our privileges and ACLs are not really designed for this currently. * Handling permissions (and other filesystem/guest "quirks") will be very hard (e.g. windows permissions are very different from linux/unix ones) As breaking the host/guest barrier can have bad implications on security/data integrity/etc. those problems would have to be solved before one could/should start with such a feature. There is currently some work done on the QEMU side for limiting/configuring QGA calls: https://lore.kernel.org/qemu-devel/20240604153242.251334-1-berra...@redhat.com/ When that's integrated, one could think about how our ACLs, privileges could be extended to allow such things. 
Only after those are solved, implementing this feature make sense. What could be done in the meantime probably is to work with upstream QEMU to add a directory listing to the qga and maybe some proper permission handling? This is relatively independent from our side, since we don't actually ship the guest agent ourselves, so this has to be solved upstream anyway. So all in all, such a feature could work, but only in the longer term, this is nothing for the short/midterm. Hope this answers your questions. If not, just write again :) (We also investigated using SPICE's folder-sharing capabilities and the spice-proxy provided by the host, but that doesn't work with the Windows virt-viewer client, only the Linux client. It looks like there was some interest in the past in enhancing xterm.js to support xmodem file transfer, but it didn't go far.) Regards, Nathan Blythe CACI Best regards Dominik ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
Re: [pve-devel] [PATCH qemu-server] remote migration: fix online migration via API clients
On 09.08.24 at 12:17, Fiona Ebner wrote:
> diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
> index e71face4..7a7183e0 100644
> --- a/PVE/QemuMigrate.pm
> +++ b/PVE/QemuMigrate.pm
> @@ -1095,7 +1095,8 @@ sub phase2 {
> die "only UNIX sockets are supported for remote migration\n"
> if $tunnel_info->{proto} ne 'unix';
>
> - my $remote_socket = $tunnel_info->{addr};
> + my ($remote_socket) = $tunnel_info->{addr} =~
> m!^(/run/qemu-server/.*\.migrate)$! #untaint

I sent a v2 to rule out "../" in the regex:
https://lists.proxmox.com/pipermail/pve-devel/2024-August/065075.html

___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
Re: [pve-devel] issues with Virtio-SCSI device on Proxmox...
Hi,

On 12.08.24 at 12:40, Christian Moser wrote:
> Hello,
>
> I work for VSI (VMS Software Inc) which is porting the OpenVMS operating
> system to x86. At this point we run successfully on various
> hypervisors, but have some issues on the KVM running on Proxmox.
>
> The OpenVMS VM works just fine with SATA disks and it also works with for
> example virtio-network device etc., but trying to use
> virtio-scsi hangs the driver. I have debugged this and I can successfully
> configure the port/controller, send the IO request to the
> device. It then gets processed by the device, which posts the results and
> sets the interrupt bit in the ISR register, but it never
> asserts the interrupt hence the driver never gets notified and the I/O hangs.
>
> I have tried both “virtio-scsi-pci” and “virtio-scsi-single”, but no luck.
> The emulated virtio-scsi device is a legacy device. But
> then again, the virtio-network device is also a legacy device and here we are
> getting interrupts. One thing which bothers me is the
> fact that the “legacy interrupt disable” bit is set in the PCI config space
> of the virtio-scsi device (i.e. bit 10 at offset 4)
>
> Questions:
> * is there a way to configure a modern virtio-scsi device (i.e.
> disable_legacy=on) ?

I've already answered this question when you asked in a mail addressed directly to me:

On 12.08.24 at 11:58, Fiona Ebner wrote:
> Hi,
>
> It seems that you either need to attach the "virtio-scsi-pci" device to
> a PCIe bus or explicitly set the "disable_legacy=on" option for the
> device, neither of which Proxmox VE currently does or allows
> configuring. The only way right now seems to be to attach the disk
> yourself via custom arguments (the 'args' in the Proxmox VE VM
> configuration), but then the disk will be invisible to Proxmox VE
> operations which look at specific disk keys in the configuration!
>
> Feel free to open a feature request on our bug tracker to make this
> configurable: https://bugzilla.proxmox.com/
>
> P.S. Please write to the developer list rather than individual
> developers for such questions in the future:
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>
> Best Regards,
> Fiona

> * why is the legacy interrupt bit set in the PCI config space ?

Most likely because the virtio-scsi-pci device is configured without the "disable_legacy=on" option. If not explicitly set, the option will be "disable_legacy=auto", and when not attached to PCIe (which is the case for Proxmox VE currently), legacy mode will be enabled.

> * Are there any working drivers for virtio-scsi on this KVM using Q35
> machine? i.e. any other OS

The virtio drivers for Windows and the ones in Linux work just fine with our configuration.

> Any thoughts why these interrupts are not getting delivered on the PCIe bus?

We do not configure the virtio-scsi-pci device on a PCIe bus currently, see my initial answer.

Best Regards,
Fiona

___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH docs] pve-network: correct language errors
Signed-off-by: Alexander Zeidler --- pve-network.adoc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/pve-network.adoc b/pve-network.adoc index acdcf39..434430d 100644 --- a/pve-network.adoc +++ b/pve-network.adoc @@ -447,7 +447,7 @@ slaves in the single logical bonded interface such that different network-peers use different MAC addresses for their network packet traffic. -If your switch support the LACP (IEEE 802.3ad) protocol then we recommend using +If your switch supports the LACP (IEEE 802.3ad) protocol then we recommend using the corresponding bonding mode (802.3ad). Otherwise you should generally use the active-backup mode. @@ -490,7 +490,7 @@ iface vmbr0 inet static [thumbnail="default-network-setup-bond.svg"] -Another possibility it to use the bond directly as bridge port. +Another possibility is to use the bond directly as bridge port. This can be used to make the guest network fault-tolerant. .Example: Use a bond as bridge port -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH docs] local-zfs: correct language errors
Signed-off-by: Alexander Zeidler --- local-zfs.adoc | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/local-zfs.adoc b/local-zfs.adoc index 130f9d6..7ddf50a 100644 --- a/local-zfs.adoc +++ b/local-zfs.adoc @@ -442,7 +442,7 @@ recommend to: # zpool create -f -o ashift=12 log -In above example a single `` and a single `` is used, but you +In the example above, a single `` and a single `` is used, but you can also combine this with other RAID variants, as described in the xref:sysadmin_zfs_create_new_zpool_raid0[Create a new pool with RAID] section. @@ -465,19 +465,19 @@ protection that you want to use for improving the overall performance of your pool. As the maximum size of a log device should be about half the size of the -installed physical memory, it means that the ZIL will mostly likely only take up +installed physical memory, it means that the ZIL will most likely only take up a relatively small part of the SSD, the remaining space can be used as cache. First you have to create two GPT partitions on the SSD with `parted` or `gdisk`. -Then you're ready to add them to an pool: +Then you're ready to add them to a pool: .Add both, a separate log device and a second-level cache, to an existing pool # zpool add -f log cache -Just replay ``, `` and `` with the pool name +Just replace ``, `` and `` with the pool name and the two `/dev/disk/by-id/` paths to the partitions. You can also add ZIL and cache separately. @@ -527,8 +527,8 @@ process of the new disk has progressed. # proxmox-boot-tool init [grub] -NOTE: `ESP` stands for EFI System Partition, which is setup as partition #2 on -bootable disks setup by the {pve} installer since version 5.4. For details, see +NOTE: `ESP` stands for EFI System Partition, which is set up as partition #2 on +bootable disks when using the {pve} installer since version 5.4. For details, see xref:sysboot_proxmox_boot_setup[Setting up a new partition for use as synced ESP]. NOTE: Make sure to pass 'grub' as mode to `proxmox-boot-tool init` if @@ -541,7 +541,7 @@ especially if Secure Boot is enabled! # grub-install NOTE: Plain GRUB is only used on systems installed with {pve} 6.3 or earlier, -which have not been manually migrated to using `proxmox-boot-tool` yet. +which have not been manually migrated to use `proxmox-boot-tool` yet. Configure E-Mail Notification -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
Re: [pve-devel] [PATCH docs] pve-network: correct language errors
On 2024-08-13 12:23, Alexander Zeidler wrote:
> Signed-off-by: Alexander Zeidler
> ---
> pve-network.adoc | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/pve-network.adoc b/pve-network.adoc
> index acdcf39..434430d 100644
> --- a/pve-network.adoc
> +++ b/pve-network.adoc
> @@ -447,7 +447,7 @@ slaves in the single logical bonded interface such that
> different
> network-peers use different MAC addresses for their network packet
> traffic.
>
> -If your switch support the LACP (IEEE 802.3ad) protocol then we recommend
> using
> +If your switch supports the LACP (IEEE 802.3ad) protocol then we recommend
> using

There should be a comma here: If ..., then ...

> the corresponding bonding mode (802.3ad). Otherwise you should generally use
> the
> active-backup mode.
>
> @@ -490,7 +490,7 @@ iface vmbr0 inet static
>
>
> [thumbnail="default-network-setup-bond.svg"]
> -Another possibility it to use the bond directly as bridge port.
> +Another possibility is to use the bond directly as bridge port.
> This can be used to make the guest network fault-tolerant.
>
> .Example: Use a bond as bridge port

Apart from this, this looks good to me:

Reviewed-by: Lukas Wagner

--
- Lukas

___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH qemu v2 02/25] PVE backup: fixup error handling for fleecing
The drained section needs to be terminated before breaking out of the loop in the error scenarios. Otherwise, guest IO on the drive would become stuck. If the job is created successfully, then the job completion callback will clean up the snapshot access block nodes. In case failure happened before the job is created, there was no cleanup for the snapshot access block nodes yet. Add it. Signed-off-by: Fiona Ebner --- No changes in v2. pve-backup.c | 38 +- 1 file changed, 25 insertions(+), 13 deletions(-) diff --git a/pve-backup.c b/pve-backup.c index 4e730aa3da..c4178758b3 100644 --- a/pve-backup.c +++ b/pve-backup.c @@ -357,22 +357,23 @@ static void coroutine_fn pvebackup_co_complete_stream(void *opaque) qemu_co_mutex_unlock(&backup_state.backup_mutex); } +static void cleanup_snapshot_access(PVEBackupDevInfo *di) +{ +if (di->fleecing.snapshot_access) { +bdrv_unref(di->fleecing.snapshot_access); +di->fleecing.snapshot_access = NULL; +} +if (di->fleecing.cbw) { +bdrv_cbw_drop(di->fleecing.cbw); +di->fleecing.cbw = NULL; +} +} + static void pvebackup_complete_cb(void *opaque, int ret) { PVEBackupDevInfo *di = opaque; di->completed_ret = ret; -/* - * Handle block-graph specific cleanup (for fleecing) outside of the coroutine, because the work - * won't be done as a coroutine anyways: - * - For snapshot_access, allows doing bdrv_unref() directly. Doing it via bdrv_co_unref() would - * just spawn a BH calling bdrv_unref(). - * - For cbw, draining would need to spawn a BH. - */ -if (di->fleecing.snapshot_access) { -bdrv_unref(di->fleecing.snapshot_access); -di->fleecing.snapshot_access = NULL; -} if (di->fleecing.cbw) { /* * With fleecing, failure for cbw does not fail the guest write, but only sets the snapshot @@ -383,10 +384,17 @@ static void pvebackup_complete_cb(void *opaque, int ret) if (di->completed_ret == -EACCES && snapshot_error) { di->completed_ret = snapshot_error; } -bdrv_cbw_drop(di->fleecing.cbw); -di->fleecing.cbw = NULL; } +/* + * Handle block-graph specific cleanup (for fleecing) outside of the coroutine, because the work + * won't be done as a coroutine anyways: + * - For snapshot_access, allows doing bdrv_unref() directly. Doing it via bdrv_co_unref() would + * just spawn a BH calling bdrv_unref(). + * - For cbw, draining would need to spawn a BH. + */ +cleanup_snapshot_access(di); + /* * Needs to happen outside of coroutine, because it takes the graph write lock. */ @@ -587,6 +595,7 @@ static void create_backup_jobs_bh(void *opaque) { if (!di->fleecing.cbw) { error_setg(errp, "appending cbw node for fleecing failed: %s", local_err ? error_get_pretty(local_err) : "unknown error"); +bdrv_drained_end(di->bs); break; } @@ -599,6 +608,8 @@ static void create_backup_jobs_bh(void *opaque) { if (!di->fleecing.snapshot_access) { error_setg(errp, "setting up snapshot access for fleecing failed: %s", local_err ? error_get_pretty(local_err) : "unknown error"); +cleanup_snapshot_access(di); +bdrv_drained_end(di->bs); break; } source_bs = di->fleecing.snapshot_access; @@ -637,6 +648,7 @@ static void create_backup_jobs_bh(void *opaque) { } if (!job || local_err) { +cleanup_snapshot_access(di); error_setg(errp, "backup_job_create failed: %s", local_err ? error_get_pretty(local_err) : "null"); break; -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH qemu v2 03/25] PVE backup: factor out setting up snapshot access for fleecing
Avoids some line bloat in the create_backup_jobs_bh() function and is in preparation for setting up the snapshot access independently of fleecing, in particular that will be useful for providing access to the snapshot via NBD. Signed-off-by: Fiona Ebner --- No changes in v2. pve-backup.c | 95 1 file changed, 58 insertions(+), 37 deletions(-) diff --git a/pve-backup.c b/pve-backup.c index c4178758b3..051ebffe48 100644 --- a/pve-backup.c +++ b/pve-backup.c @@ -525,6 +525,62 @@ static int coroutine_fn pvebackup_co_add_config( goto out; } +/* + * Setup a snapshot-access block node for a device with associated fleecing image. + */ +static int setup_snapshot_access(PVEBackupDevInfo *di, Error **errp) +{ +Error *local_err = NULL; + +if (!di->fleecing.bs) { +error_setg(errp, "no associated fleecing image"); +return -1; +} + +QDict *cbw_opts = qdict_new(); +qdict_put_str(cbw_opts, "driver", "copy-before-write"); +qdict_put_str(cbw_opts, "file", bdrv_get_node_name(di->bs)); +qdict_put_str(cbw_opts, "target", bdrv_get_node_name(di->fleecing.bs)); + +if (di->bitmap) { +/* + * Only guest writes to parts relevant for the backup need to be intercepted with + * old data being copied to the fleecing image. + */ +qdict_put_str(cbw_opts, "bitmap.node", bdrv_get_node_name(di->bs)); +qdict_put_str(cbw_opts, "bitmap.name", bdrv_dirty_bitmap_name(di->bitmap)); +} +/* + * Fleecing storage is supposed to be fast and it's better to break backup than guest + * writes. Certain guest drivers like VirtIO-win have 60 seconds timeout by default, so + * abort a bit before that. + */ +qdict_put_str(cbw_opts, "on-cbw-error", "break-snapshot"); +qdict_put_int(cbw_opts, "cbw-timeout", 45); + +di->fleecing.cbw = bdrv_insert_node(di->bs, cbw_opts, BDRV_O_RDWR, &local_err); + +if (!di->fleecing.cbw) { +error_setg(errp, "appending cbw node for fleecing failed: %s", + local_err ? error_get_pretty(local_err) : "unknown error"); +return -1; +} + +QDict *snapshot_access_opts = qdict_new(); +qdict_put_str(snapshot_access_opts, "driver", "snapshot-access"); +qdict_put_str(snapshot_access_opts, "file", bdrv_get_node_name(di->fleecing.cbw)); + +di->fleecing.snapshot_access = +bdrv_open(NULL, NULL, snapshot_access_opts, BDRV_O_RDWR | BDRV_O_UNMAP, &local_err); +if (!di->fleecing.snapshot_access) { +error_setg(errp, "setting up snapshot access for fleecing failed: %s", + local_err ? error_get_pretty(local_err) : "unknown error"); +return -1; +} + +return 0; +} + /* * backup_job_create can *not* be run from a coroutine, so this can't either. * The caller is responsible that backup_mutex is held nonetheless. @@ -569,49 +625,14 @@ static void create_backup_jobs_bh(void *opaque) { const char *job_id = bdrv_get_device_name(di->bs); bdrv_graph_co_rdunlock(); if (di->fleecing.bs) { -QDict *cbw_opts = qdict_new(); -qdict_put_str(cbw_opts, "driver", "copy-before-write"); -qdict_put_str(cbw_opts, "file", bdrv_get_node_name(di->bs)); -qdict_put_str(cbw_opts, "target", bdrv_get_node_name(di->fleecing.bs)); - -if (di->bitmap) { -/* - * Only guest writes to parts relevant for the backup need to be intercepted with - * old data being copied to the fleecing image. - */ -qdict_put_str(cbw_opts, "bitmap.node", bdrv_get_node_name(di->bs)); -qdict_put_str(cbw_opts, "bitmap.name", bdrv_dirty_bitmap_name(di->bitmap)); -} -/* - * Fleecing storage is supposed to be fast and it's better to break backup than guest - * writes. Certain guest drivers like VirtIO-win have 60 seconds timeout by default, so - * abort a bit before that. 
- */ -qdict_put_str(cbw_opts, "on-cbw-error", "break-snapshot"); -qdict_put_int(cbw_opts, "cbw-timeout", 45); - -di->fleecing.cbw = bdrv_insert_node(di->bs, cbw_opts, BDRV_O_RDWR, &local_err); - -if (!di->fleecing.cbw) { -error_setg(errp, "appending cbw node for fleecing failed: %s", - local_err ? error_get_pretty(local_err) : "unknown error"); -bdrv_drained_end(di->bs); -break; -} - -QDict *snapshot_access_opts = qdict_new(); -qdict_put_str(snapshot_access_opts, "driver", "snapshot-access"); -qdict_put_str(snapshot_access_opts, "file", bdrv_get_node_name(di->fleecing.cbw)); - -di->fleecing.snapshot_access = -bdrv_open(NULL, NULL, snapshot_access_opts, BDRV_O_RDWR | BDRV_O_UNMAP, &lo
[pve-devel] [RFC storage v2 11/25] extract backup config: delegate to backup provider if there is one
Signed-off-by: Fiona Ebner --- Changes in v2: * Adapt to method rename. src/PVE/Storage.pm | 10 ++ 1 file changed, 10 insertions(+) diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm index aea57ab..8993ba7 100755 --- a/src/PVE/Storage.pm +++ b/src/PVE/Storage.pm @@ -1726,6 +1726,16 @@ sub extract_vzdump_config { storage_check_enabled($cfg, $storeid); return PVE::Storage::PBSPlugin->extract_vzdump_config($scfg, $volname, $storeid); } + + my $plugin = PVE::Storage::Plugin->lookup($scfg->{type}); + my $log_function = sub { + my ($log_level, $message) = @_; + my $prefix = $log_level eq 'err' ? 'ERROR' : uc($log_level); + print "$prefix: $message\n"; + }; + if (my $backup_provider = $plugin->new_backup_provider($scfg, $storeid, $log_function)) { + return $backup_provider->restore_get_guest_config($volname, $storeid); + } } my $archive = abs_filesystem_path($cfg, $volid); -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
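[Editor's illustration] The patch above delegates config extraction to the provider's restore_get_guest_config($volname, $storeid). A hypothetical provider-side sketch is shown below; the on-disk layout ($path/<archive dir>/guest.conf) and the use of parse_volname() are made up for illustration and do not match the directory example plugin from this series exactly:

use strict;
use warnings;

use PVE::Tools qw(file_get_contents);

sub restore_get_guest_config {
    my ($self, $volname, $storeid) = @_;

    # The associated storage plugin knows how to translate the volume name of
    # the backup archive into a directory name (assumption for this sketch).
    my (undef, $archive_dir) = $self->{'storage-plugin'}->parse_volname($volname);
    my $config_path = "$self->{scfg}->{path}/$archive_dir/guest.conf";

    die "no guest configuration found for backup '$volname'\n" if !-e $config_path;

    # The caller (extract_vzdump_config) expects the raw configuration contents.
    return file_get_contents($config_path);
}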
[pve-devel] [PATCH qemu v2 04/25] PVE backup: save device name in device info structure
The device name needs to be queried while holding the graph read lock and since it doesn't change during the whole operation, just get it once during setup and avoid the need to query it again in different places. Also in preparation to use it more often in error messages and for the upcoming external backup access API. Signed-off-by: Fiona Ebner --- No changes in v2. pve-backup.c | 29 +++-- 1 file changed, 15 insertions(+), 14 deletions(-) diff --git a/pve-backup.c b/pve-backup.c index 051ebffe48..33c23e53c2 100644 --- a/pve-backup.c +++ b/pve-backup.c @@ -94,6 +94,7 @@ typedef struct PVEBackupDevInfo { size_t size; uint64_t block_size; uint8_t dev_id; +char* device_name; int completed_ret; // INT_MAX if not completed BdrvDirtyBitmap *bitmap; BlockDriverState *target; @@ -327,6 +328,8 @@ static void coroutine_fn pvebackup_co_complete_stream(void *opaque) } di->bs = NULL; +g_free(di->device_name); +di->device_name = NULL; assert(di->target == NULL); @@ -621,9 +624,6 @@ static void create_backup_jobs_bh(void *opaque) { BlockDriverState *source_bs = di->bs; bool discard_source = false; -bdrv_graph_co_rdlock(); -const char *job_id = bdrv_get_device_name(di->bs); -bdrv_graph_co_rdunlock(); if (di->fleecing.bs) { if (setup_snapshot_access(di, &local_err) < 0) { error_setg(errp, "setting up snapshot access for fleecing failed: %s", @@ -654,7 +654,7 @@ static void create_backup_jobs_bh(void *opaque) { } BlockJob *job = backup_job_create( -job_id, source_bs, di->target, backup_state.speed, sync_mode, di->bitmap, +di->device_name, source_bs, di->target, backup_state.speed, sync_mode, di->bitmap, bitmap_mode, false, discard_source, NULL, &perf, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT, JOB_DEFAULT, pvebackup_complete_cb, di, backup_state.txn, &local_err); @@ -751,6 +751,7 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info( } PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1); di->bs = bs; +di->device_name = g_strdup(bdrv_get_device_name(bs)); if (fleecing && device_uses_fleecing(*d)) { g_autofree gchar *fleecing_devid = g_strconcat(*d, "-fleecing", NULL); @@ -789,6 +790,7 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info( PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1); di->bs = bs; +di->device_name = g_strdup(bdrv_get_device_name(bs)); di_list = g_list_append(di_list, di); } } @@ -956,9 +958,6 @@ UuidInfo coroutine_fn *qmp_backup( di->block_size = dump_cb_block_size; -bdrv_graph_co_rdlock(); -const char *devname = bdrv_get_device_name(di->bs); -bdrv_graph_co_rdunlock(); PBSBitmapAction action = PBS_BITMAP_ACTION_NOT_USED; size_t dirty = di->size; @@ -973,7 +972,8 @@ UuidInfo coroutine_fn *qmp_backup( } action = PBS_BITMAP_ACTION_NEW; } else { -expect_only_dirty = proxmox_backup_check_incremental(pbs, devname, di->size) != 0; +expect_only_dirty = +proxmox_backup_check_incremental(pbs, di->device_name, di->size) != 0; } if (expect_only_dirty) { @@ -997,7 +997,8 @@ UuidInfo coroutine_fn *qmp_backup( } } -int dev_id = proxmox_backup_co_register_image(pbs, devname, di->size, expect_only_dirty, errp); +int dev_id = proxmox_backup_co_register_image(pbs, di->device_name, di->size, + expect_only_dirty, errp); if (dev_id < 0) { goto err_mutex; } @@ -1009,7 +1010,7 @@ UuidInfo coroutine_fn *qmp_backup( di->dev_id = dev_id; PBSBitmapInfo *info = g_malloc(sizeof(*info)); -info->drive = g_strdup(devname); +info->drive = g_strdup(di->device_name); info->action = action; info->size = di->size; info->dirty = dirty; @@ -1034,10 +1035,7 @@ UuidInfo coroutine_fn *qmp_backup( goto 
err_mutex; } -bdrv_graph_co_rdlock(); -const char *devname = bdrv_get_device_name(di->bs); -bdrv_graph_co_rdunlock(); -di->dev_id = vma_writer_register_stream(vmaw, devname, di->size); +di->dev_id = vma_writer_register_stream(vmaw, di->device_name, di->size); if (di->dev_id <= 0) { error_set(errp, ERROR_CLASS_GENERIC_ERROR, "register_stream failed"); @@ -1148,6 +1146,9 @@ err:
[pve-devel] [PATCH qemu-server v2 14/25] move nbd_stop helper to QMPHelpers module
Like this nbd_stop() can be called from a module that cannot include QemuServer.pm. Signed-off-by: Fiona Ebner --- No changes in v2. PVE/API2/Qemu.pm | 3 ++- PVE/CLI/qm.pm| 3 ++- PVE/QemuServer.pm| 6 -- PVE/QemuServer/QMPHelpers.pm | 6 ++ 4 files changed, 10 insertions(+), 8 deletions(-) diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm index d25a79fe..212cfc1b 100644 --- a/PVE/API2/Qemu.pm +++ b/PVE/API2/Qemu.pm @@ -35,6 +35,7 @@ use PVE::QemuServer::Monitor qw(mon_cmd); use PVE::QemuServer::Machine; use PVE::QemuServer::Memory qw(get_current_memory); use PVE::QemuServer::PCI; +use PVE::QemuServer::QMPHelpers; use PVE::QemuServer::USB; use PVE::QemuMigrate; use PVE::RPCEnvironment; @@ -5910,7 +5911,7 @@ __PACKAGE__->register_method({ return; }, 'nbdstop' => sub { - PVE::QemuServer::nbd_stop($state->{vmid}); + PVE::QemuServer::QMPHelpers::nbd_stop($state->{vmid}); return; }, 'resume' => sub { diff --git a/PVE/CLI/qm.pm b/PVE/CLI/qm.pm index d3dbf7b4..8349997e 100755 --- a/PVE/CLI/qm.pm +++ b/PVE/CLI/qm.pm @@ -35,6 +35,7 @@ use PVE::QemuServer::Agent qw(agent_available); use PVE::QemuServer::ImportDisk; use PVE::QemuServer::Monitor qw(mon_cmd); use PVE::QemuServer::OVF; +use PVE::QemuServer::QMPHelpers; use PVE::QemuServer; use PVE::CLIHandler; @@ -385,7 +386,7 @@ __PACKAGE__->register_method ({ my $vmid = $param->{vmid}; - eval { PVE::QemuServer::nbd_stop($vmid) }; + eval { PVE::QemuServer::QMPHelpers::nbd_stop($vmid) }; warn $@ if $@; return; diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm index b26da505..e5ff5efb 100644 --- a/PVE/QemuServer.pm +++ b/PVE/QemuServer.pm @@ -8548,12 +8548,6 @@ sub generate_smbios1_uuid { return "uuid=".generate_uuid(); } -sub nbd_stop { -my ($vmid) = @_; - -mon_cmd($vmid, 'nbd-server-stop', timeout => 25); -} - sub create_reboot_request { my ($vmid) = @_; open(my $fh, '>', "/run/qemu-server/$vmid.reboot") diff --git a/PVE/QemuServer/QMPHelpers.pm b/PVE/QemuServer/QMPHelpers.pm index 0269ea46..826938de 100644 --- a/PVE/QemuServer/QMPHelpers.pm +++ b/PVE/QemuServer/QMPHelpers.pm @@ -15,6 +15,12 @@ qemu_objectadd qemu_objectdel ); +sub nbd_stop { +my ($vmid) = @_; + +mon_cmd($vmid, 'nbd-server-stop', timeout => 25); +} + sub qemu_deviceadd { my ($vmid, $devicefull) = @_; -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH qemu-server v2 17/25] backup: keep track of block-node size instead of volume size
For fleecing, the size needs to match exactly with what QEMU sees. In particular, EFI disks might be attached with a 'size=' option, meaning that size can be different from the volume's size. Commit 36377acf ("backup: disk info: also keep track of size") introduced size tracking and it was only used for fleecing since then, so replace the existing 'size' key in the device info hash and replace it with an explicit 'block-node-size' for clarity. Should also help with the following issue reported in the community forum: https://forum.proxmox.com/threads/152202 Fixes: 36377acf ("backup: disk info: also keep track of size") Signed-off-by: Fiona Ebner --- New in v2. PVE/VZDump/QemuServer.pm | 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm index c46e607c..98685127 100644 --- a/PVE/VZDump/QemuServer.pm +++ b/PVE/VZDump/QemuServer.pm @@ -106,6 +106,9 @@ sub prepare { PVE::Storage::activate_volumes($self->{storecfg}, $vollist); +my $block_info = mon_cmd($vmid, "query-block"); +$block_info = { map { $_->{device} => $_ } $block_info->@* }; + foreach my $ds (sort keys %$drivehash) { my $drive = $drivehash->{$ds}; @@ -133,11 +136,22 @@ sub prepare { die "cannot determine size and format of volume '$volid' - $@\n" if $@; } + # The size for fleecing images needs to be exactly the same size as QEMU sees. E.g. EFI disk + # can be attached with a smaller size then the underyling image on the storage. + my $block_node_size = + eval { $block_info->{"drive-$ds"}->{inserted}->{image}->{'virtual-size'}; }; + if (!$block_node_size) { + # TPM state is not attached yet and will be attached with same size, so don't warn then. + $self->loginfo("could not determine block node size of drive '$ds' - using fallback") + if $ds !~ m/^tpmstate\d+/; + $block_node_size = $size; + } + my $diskinfo = { path => $path, volid => $volid, storeid => $storeid, - size => $size, + 'block-node-size' => $block_node_size, format => $format, virtdev => $ds, qmdevice => "drive-$ds", @@ -551,7 +565,7 @@ my sub allocate_fleecing_images { my $name = "vm-$vmid-fleece-$n"; $name .= ".$format" if $scfg->{path}; - my $size = PVE::Tools::convert_size($di->{size}, 'b' => 'kb'); + my $size = PVE::Tools::convert_size($di->{'block-node-size'}, 'b' => 'kb'); $di->{'fleece-volid'} = PVE::Storage::vdisk_alloc( $self->{storecfg}, $fleecing_storeid, $vmid, $format, $name, $size); @@ -600,7 +614,7 @@ my sub attach_fleecing_images { my $drive = "file=$path,if=none,id=$devid,format=$format,discard=unmap"; # Specify size explicitly, to make it work if storage backend rounded up size for # fleecing image when allocating. - $drive .= ",size=$di->{size}" if $format eq 'raw'; + $drive .= ",size=$di->{'block-node-size'}" if $format eq 'raw'; $drive =~ s/\\//g; my $ret = PVE::QemuServer::Monitor::hmp_cmd($vmid, "drive_add auto \"$drive\"", 60); die "attaching fleecing image $volid failed - $ret\n" if $ret !~ m/OK/s; -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [RFC container v2 22/25] backup: implement backup for external providers
The filesystem structure is made available as a directory in a consistent manner (with details depending on the vzdump backup mode) just like for regular backup via tar. The backup provider needs to back up the guest and firewall configuration and then the filesystem structure, honoring the ID maps (for unprivileged containers) as well as file exclusions and the bandwidth limit. Signed-off-by: Fiona Ebner --- Changes in v2: * Adapt to API changes. src/PVE/VZDump/LXC.pm | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/src/PVE/VZDump/LXC.pm b/src/PVE/VZDump/LXC.pm index 67d13db..0fc2a94 100644 --- a/src/PVE/VZDump/LXC.pm +++ b/src/PVE/VZDump/LXC.pm @@ -373,7 +373,27 @@ sub archive { my $userns_cmd = $task->{userns_cmd}; my $findexcl = $self->{vzdump}->{findexcl}; -if ($self->{vzdump}->{opts}->{pbs}) { +if (my $backup_provider = $self->{vzdump}->{'backup-provider'}) { + $self->loginfo("starting external backup via " . $backup_provider->provider_name()); + + my ($mechanism) = $backup_provider->backup_get_mechanism($vmid, 'lxc'); + die "mechanism '$mechanism' requested by backup provider is not supported for containers\n" + if $mechanism ne 'directory'; + + my $config_file = "$tmpdir/etc/vzdump/pct.conf"; + my $firewall_file = "$tmpdir/etc/vzdump/pct.fw"; + + + my $conf = PVE::LXC::Config->load_config($vmid); + my ($id_map, undef, undef) = PVE::LXC::parse_id_maps($conf); + my $info = { + directory => $snapdir, + sources => [@sources], + }; + $info->{'firewall-config'} = $firewall_file if -e $firewall_file; + $info->{'bandwidth-limit'} = $opts->{bwlimit} * 1024 if $opts->{bwlimit}; + $backup_provider->backup_container($vmid, $config_file, $id_map, $findexcl, $info); +} elsif ($self->{vzdump}->{opts}->{pbs}) { my $param = []; push @$param, "pct.conf:$tmpdir/etc/vzdump/pct.conf"; -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
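[Editor's illustration] To make the commit message above more concrete, here is a rough, hedged sketch of what a provider-side backup_container() might do with the arguments it is handed (the target layout, the per-VMID 'backup-time' state and the tar-based handling of the exclude patterns are assumptions; the real example plugin in this series differs in details and additionally honors the ID map and bandwidth limit):

use strict;
use warnings;

use File::Copy qw(copy);
use File::Path qw(make_path);
use PVE::Tools qw(run_command);

sub backup_container {
    my ($self, $vmid, $config_file, $id_map, $exclude_patterns, $info) = @_;

    # Hypothetical target layout: one directory per backup under the storage path.
    my $target_dir = "$self->{scfg}->{path}/$vmid/lxc-$self->{$vmid}->{'backup-time'}";
    make_path($target_dir);

    # Save the guest (and, if present, firewall) configuration first.
    copy($config_file, "$target_dir/guest.conf") or die "copying config failed - $!\n";
    if ($info->{'firewall-config'}) {
        copy($info->{'firewall-config'}, "$target_dir/firewall.conf")
            or die "copying firewall config failed - $!\n";
    }

    # Archive the consistent filesystem snapshot prepared by the LXC vzdump
    # plugin ($info->{directory}); passing vzdump's exclude patterns straight
    # to tar is only an approximation.
    my $cmd = ['tar', 'cpf', "$target_dir/filesystem.tar", '--numeric-owner', '--xattrs'];
    push $cmd->@*, map { "--exclude=$_" } $exclude_patterns->@*;
    push $cmd->@*, '-C', $info->{directory}, '.';

    # A full implementation would also honor $id_map (unprivileged containers)
    # and $info->{'bandwidth-limit'} (bytes per second).
    run_command($cmd);

    return;
}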
[pve-devel] [RFC qemu/storage/qemu-server/container/manager v2 00/25] backup provider API
Changes in v2:
* Add 'block-device' backup mechanism for VMs. The NBD export is mounted by Proxmox VE and only the block device path (as well as a callback to get the next dirty range for bitmaps) is passed to the backup provider.
* Add POC example for Borg - note that I tested with borg 1.2.4 in Debian and only tested with a local repository, not SSH yet.
* Merge hook API into a single function for backup and for jobs.
* Add restore_vm_init() and restore_vm_cleanup() for better flexibility, to allow preparing the whole restore. The question is whether restore_vm_volume_init() and restore_vm_volume_cleanup() should be dropped (but certain providers might prefer using only those)? Having both is more flexible, but makes the API longer of course.
* Switch to backup_vm() (was per-volume backup_vm_volume() before) and backup_container(), passing along the configuration files, rather than having dedicated methods for the configuration files, giving the backup provider more flexibility.
* Some renames in API methods/params to improve clarity.
* Pass the backup time to the backup 'start' hook and use that in the directory example rather than the job start time.
* Use POD for the base plugin documentation and flesh out the documentation.
* Use the 'BackupProvider::Plugin::' namespace.
* Various smaller improvements in the directory provider example.

==

A backup provider needs to implement a storage plugin as well as a backup provider plugin. The storage plugin is for integration in Proxmox VE's front-end, so users can manage the backups via UI/API/CLI. The backup provider plugin is for interfacing with the backup provider's backend to integrate backup and restore with that backend into Proxmox VE.

This is an initial draft of an API and the required changes to the backup stack in Proxmox VE to make it work. Depending on feedback from other developers and interested parties, it can still change substantially.

==

The backup provider API is split into two parts, both of which again need different implementations for VM and LXC guests:

1. Backup API

There are two hook callback functions, namely:

1. job_hook() is called during the start/end/abort phases of the whole backup job.
2. backup_hook() is called during the start/end/abort phases of the backup of an individual guest.

The backup_get_mechanism() method is used to decide on the backup mechanism. Currently, 'block-device' or 'nbd' for VMs, and 'directory' for containers is possible. The method also lets the plugin indicate whether to use a bitmap for incremental VM backup or not. It is enough to implement one mechanism for VMs and one mechanism for containers.

Next, there are methods for backing up the guest's configuration and data, backup_vm() for VM backup and backup_container() for container backup.

Finally, there are some helpers, e.g. for getting the provider name or the volume ID for the backup target, as well as for handling the backup log.

1.1 Backup Mechanisms

VM: Access to the data on the VM's disk from the time the backup started is made available via a so-called "snapshot access". This is either the full image, or, in case a bitmap is used, the dirty parts of the image since the last time the bitmap was used for a successful backup. Reading outside of the dirty parts will result in an error. After backing up each part of the disk, it should be discarded in the export to avoid unnecessary space usage on the Proxmox VE side (there is an associated fleecing image).

VM mechanism 'block-device': The snapshot access is exposed as a block device. If used, a bitmap is passed along.
VM mechanism 'nbd': The snapshot access and, if used, the bitmap are exported via NBD.

Container mechanism 'directory': A copy or snapshot of the container's filesystem state is made available as a directory.

2. Restore API

The restore_get_mechanism() method is used to decide on the restore mechanism. Currently, 'qemu-img' for VMs, and 'directory' or 'tar' for containers are possible. It is enough to implement one mechanism for VMs and one mechanism for containers.

Next, there are methods for extracting the guest and firewall configuration, and the implementations of the restore mechanism via a pair of methods: an init method for making the data available to Proxmox VE, and a cleanup method that is called after the restore. For VMs, a restore_vm_get_device_info() helper is also required, to get the disks included in the backup and their sizes.

2.1. Restore Mechanisms

VM mechanism 'qemu-img': The backup provider gives a path to the disk image that will be restored. The path needs to be something 'qemu-img' can deal with, e.g. it can also be an NBD URI or similar.

Container mechanism 'directory': The backup provider gives the path to a directory with the full filesystem structure of the container.

Container mechanism 'tar': The backup provider gives the path to a (potentially compressed) tar archive with the full filesystem structure of the container.

See the PVE::BackupProvider::Plugin module for the full API documentation.
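[Editor's illustration] A hedged, do-nothing skeleton of a provider plugin wiring up the callbacks named in this cover letter is shown below. The exact parameter list of backup_vm() and the meaning of the second return value of backup_get_mechanism() are assumptions here; the authoritative definitions live in PVE::BackupProvider::Plugin::Base and the directory example later in this series:

package PVE::BackupProvider::Plugin::MinimalExample;

use strict;
use warnings;

use base qw(PVE::BackupProvider::Plugin::Base);

sub new {
    my ($class, $storage_plugin, $scfg, $storeid, $log_function) = @_;

    return bless {
        scfg => $scfg,
        storeid => $storeid,
        'storage-plugin' => $storage_plugin,
        'log-function' => $log_function,
    }, $class;
}

sub provider_name { return 'minimal example'; }

# Name of the backup archive as referenced on the target storage.
sub backup_get_archive_name {
    my ($self, $vmid, $vmtype, $backup_time) = @_;
    return "${vmtype}-${vmid}-${backup_time}";
}

# Called at start/end/abort of the whole backup job ...
sub job_hook {
    my ($self, $phase, $info) = @_;
    $self->{'log-function'}->('info', "job hook: $phase");
}

# ... and at start/end/abort of each individual guest backup.
sub backup_hook {
    my ($self, $phase, $vmid, $vmtype, $info) = @_;
    $self->{'log-function'}->('info', "backup hook for $vmid: $phase");
}

# 'block-device'/'nbd' for VMs, 'directory' for containers; the second value is
# assumed to be the bitmap ID for incremental VM backup (undef = no bitmap).
sub backup_get_mechanism {
    my ($self, $vmid, $vmtype) = @_;
    return $vmtype eq 'qemu' ? ('block-device', undef) : ('directory', undef);
}

sub backup_vm {
    my ($self, $vmid, $guest_config, $volumes, $info) = @_;
    die "not implemented\n";
}

sub backup_container {
    my ($self, $vmid, $guest_config, $id_map, $exclude_patterns, $info) = @_;
    die "not implemented\n";
}

1;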
[pve-devel] [RFC qemu-server v2 21/25] backup: implement restore for external providers
First, the provider is asked about what restore mechanism to use. Currently, only 'qemu-img' is possible. Then the configuration files are restored, the provider gives information about volumes contained in the backup and finally the volumes are restored via 'qemu-img convert'. The code for the restore_external_archive() function was copied and adapted from the restore_proxmox_backup_archive() function. Together with restore_vma_archive() it seems sensible to extract the common parts and use a dedicated module for restore code. The parse_restore_archive() helper was renamed, because it's not just parsing. Signed-off-by: Fiona Ebner --- Changes in v2: * Adapt to API changes. PVE/API2/Qemu.pm | 29 +- PVE/QemuServer.pm | 139 ++ 2 files changed, 165 insertions(+), 3 deletions(-) diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm index 212cfc1b..8917905e 100644 --- a/PVE/API2/Qemu.pm +++ b/PVE/API2/Qemu.pm @@ -845,7 +845,7 @@ __PACKAGE__->register_method({ return $res; }}); -my $parse_restore_archive = sub { +my $classify_restore_archive = sub { my ($storecfg, $archive) = @_; my ($archive_storeid, $archive_volname) = PVE::Storage::parse_volume_id($archive, 1); @@ -859,6 +859,21 @@ my $parse_restore_archive = sub { $res->{type} = 'pbs'; return $res; } + my $log_function = sub { + my ($log_level, $message) = @_; + my $prefix = $log_level eq 'err' ? 'ERROR' : uc($log_level); + print "$prefix: $message\n"; + }; + my $backup_provider = PVE::Storage::new_backup_provider( + $storecfg, + $archive_storeid, + $log_function, + ); + if ($backup_provider) { + $res->{type} = 'external'; + $res->{'backup-provider'} = $backup_provider; + return $res; + } } my $path = PVE::Storage::abs_filesystem_path($storecfg, $archive); $res->{type} = 'file'; @@ -1011,7 +1026,7 @@ __PACKAGE__->register_method({ 'backup', ); - $archive = $parse_restore_archive->($storecfg, $archive); + $archive = $classify_restore_archive->($storecfg, $archive); } } @@ -1069,7 +1084,15 @@ __PACKAGE__->register_method({ PVE::QemuServer::check_restore_permissions($rpcenv, $authuser, $merged); } } - if ($archive->{type} eq 'file' || $archive->{type} eq 'pipe') { + if (my $backup_provider = $archive->{'backup-provider'}) { + PVE::QemuServer::restore_external_archive( + $backup_provider, + $archive->{volid}, + $vmid, + $authuser, + $restore_options, + ); + } elsif ($archive->{type} eq 'file' || $archive->{type} eq 'pipe') { die "live-restore is only compatible with backup images from a Proxmox Backup Server\n" if $live_restore; PVE::QemuServer::restore_file_archive($archive->{path} // '-', $vmid, $authuser, $restore_options); diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm index 37f56f69..6cd21b7d 100644 --- a/PVE/QemuServer.pm +++ b/PVE/QemuServer.pm @@ -7245,6 +7245,145 @@ sub restore_proxmox_backup_archive { } } +sub restore_external_archive { +my ($backup_provider, $archive, $vmid, $user, $options) = @_; + +die "live restore from backup provider is not implemented\n" if $options->{live}; + +my $storecfg = PVE::Storage::config(); + +my ($storeid, $volname) = PVE::Storage::parse_volume_id($archive); +my $scfg = PVE::Storage::storage_config($storecfg, $storeid); + +my $tmpdir = "/var/tmp/vzdumptmp$$"; +rmtree $tmpdir; +mkpath $tmpdir; + +my $conffile = PVE::QemuConfig->config_file($vmid); +# disable interrupts (always do cleanups) +local $SIG{INT} = + local $SIG{TERM} = + local $SIG{QUIT} = + local $SIG{HUP} = sub { print STDERR "got interrupt - ignored\n"; }; + +# Note: $oldconf is undef if VM does not exists +my $cfs_path = 
PVE::QemuConfig->cfs_config_path($vmid); +my $oldconf = PVE::Cluster::cfs_read_file($cfs_path); +my $new_conf_raw = ''; + +my $rpcenv = PVE::RPCEnvironment::get(); +my $devinfo = {}; # info about drives included in backup +my $virtdev_hash = {}; # info about allocated drives + +eval { + # enable interrupts + local $SIG{INT} = + local $SIG{TERM} = + local $SIG{QUIT} = + local $SIG{HUP} = + local $SIG{PIPE} = sub { die "interrupted by signal\n"; }; + + my $cfgfn = "$tmpdir/qemu-server.conf"; + my $firewall_config_fn = "$tmpdir/fw.conf"; + + my $cmd = "restore"; + + my ($mechanism, $vmtype) = + $backup
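[Editor's illustration] The essence of the 'qemu-img' restore mechanism described above is a convert from the provider-supplied source (a file path, NBD URI, ...) onto the freshly allocated target volume. The following hedged Perl sketch is not the code from this patch (which also handles formats, allocation and cleanup) but shows the kind of invocation involved, assuming a raw target volume:

use strict;
use warnings;

use PVE::Storage;
use PVE::Tools qw(run_command);

sub restore_volume_with_qemu_img {
    my ($storecfg, $source, $target_volid) = @_;

    # Path of the already allocated target volume on the Proxmox VE storage.
    my $target_path = PVE::Storage::path($storecfg, $target_volid);

    # -p: progress output, -n: don't create the target (it already exists with
    # the right size), -O raw: assumed output format for this sketch.
    run_command(['qemu-img', 'convert', '-p', '-n', '-O', 'raw', $source, $target_path]);
}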
[pve-devel] [RFC manager v2 25/25] backup: implement backup for external providers
Hooks from the backup provider are called during start/end/abort for both job and backup. And it is necessary to adapt some log messages and special case some things like is already done for PBS, e.g. log file handling. Signed-off-by: Fiona Ebner --- Changes in v2: * Adapt to API changes. PVE/VZDump.pm | 62 - test/vzdump_new_test.pl | 3 ++ 2 files changed, 58 insertions(+), 7 deletions(-) diff --git a/PVE/VZDump.pm b/PVE/VZDump.pm index f1a6b220..2b3e7b1c 100644 --- a/PVE/VZDump.pm +++ b/PVE/VZDump.pm @@ -206,7 +206,15 @@ sub storage_info { $info->{'prune-backups'} = PVE::JSONSchema::parse_property_string('prune-backups', $scfg->{'prune-backups'}) if defined($scfg->{'prune-backups'}); -if ($type eq 'pbs') { +my $backup_provider = PVE::Storage::new_backup_provider( + $cfg, + $storage, + sub { debugmsg($_[0], $_[1]); }, +); + +if ($backup_provider) { + $info->{'backup-provider'} = $backup_provider +} elsif ($type eq 'pbs') { $info->{pbs} = 1; } else { $info->{dumpdir} = PVE::Storage::get_backup_dir($cfg, $storage); @@ -706,6 +714,7 @@ sub new { $opts->{scfg} = $info->{scfg}; $opts->{pbs} = $info->{pbs}; $opts->{'prune-backups'} //= $info->{'prune-backups'}; + $self->{'backup-provider'} = $info->{'backup-provider'} if $info->{'backup-provider'}; } } elsif ($opts->{dumpdir}) { $add_error->("dumpdir '$opts->{dumpdir}' does not exist") @@ -990,7 +999,7 @@ sub exec_backup_task { } } - if (!$self->{opts}->{pbs}) { + if (!$self->{opts}->{pbs} && !$self->{'backup-provider'}) { $task->{logfile} = "$opts->{dumpdir}/$basename.log"; } @@ -1000,7 +1009,11 @@ sub exec_backup_task { $ext .= ".${comp_ext}"; } - if ($self->{opts}->{pbs}) { + if ($self->{'backup-provider'}) { + die "unable to pipe backup to stdout\n" if $opts->{stdout}; + $task->{target} = $self->{'backup-provider'}->backup_get_archive_name( + $vmid, $vmtype, $task->{backup_time}); + } elsif ($self->{opts}->{pbs}) { die "unable to pipe backup to stdout\n" if $opts->{stdout}; $task->{target} = $pbs_snapshot_name; } else { @@ -1018,7 +1031,7 @@ sub exec_backup_task { my $pid = $$; if ($opts->{tmpdir}) { $task->{tmpdir} = "$opts->{tmpdir}/vzdumptmp${pid}_$vmid/"; - } elsif ($self->{opts}->{pbs}) { + } elsif ($self->{opts}->{pbs} || $self->{'backup-provider'}) { $task->{tmpdir} = "/var/tmp/vzdumptmp${pid}_$vmid"; } else { # dumpdir is posix? then use it as temporary dir @@ -1090,6 +1103,10 @@ sub exec_backup_task { if ($mode eq 'stop') { $plugin->prepare ($task, $vmid, $mode); + if ($self->{'backup-provider'}) { + $self->{'backup-provider'}->backup_hook( + 'start', $vmid, $vmtype, { 'start-time' => $task->{backup_time} }); + } $self->run_hook_script ('backup-start', $task, $logfd); if ($running) { @@ -1104,6 +1121,10 @@ sub exec_backup_task { } elsif ($mode eq 'suspend') { $plugin->prepare ($task, $vmid, $mode); + if ($self->{'backup-provider'}) { + $self->{'backup-provider'}->backup_hook( + 'start', $vmid, $vmtype, { 'start-time' => $task->{backup_time} }); + } $self->run_hook_script ('backup-start', $task, $logfd); if ($vmtype eq 'lxc') { @@ -1130,6 +1151,10 @@ sub exec_backup_task { } } elsif ($mode eq 'snapshot') { + if ($self->{'backup-provider'}) { + $self->{'backup-provider'}->backup_hook( + 'start', $vmid, $vmtype, { 'start-time' => $task->{backup_time} }); + } $self->run_hook_script ('backup-start', $task, $logfd); my $snapshot_count = $task->{snapshot_count} || 0; @@ -1172,11 +1197,13 @@ sub exec_backup_task { return; } - my $archive_txt = $self->{opts}->{pbs} ? 
'Proxmox Backup Server' : 'vzdump'; + my $archive_txt = 'vzdump'; + $archive_txt = 'Proxmox Backup Server' if $self->{opts}->{pbs}; + $archive_txt = $self->{'backup-provider'}->provider_name() if $self->{'backup-provider'}; debugmsg('info', "creating $archive_txt archive '$task->{target}'", $logfd); $plugin->archive($task, $vmid, $task->{tmptar}, $comp); - if ($self->{opts}->{pbs}) { + if ($self->{'backup-provider'} || $self->{opts}->{pbs}) { # size is added to task struct in guest vzdump plugins } else { rename ($task->{tmptar}, $task->{target}) || @@ -1190,7 +1217,8 @@ sub exec_backup_task { # Mark as protected before pruning. if (my
[pve-devel] [POC storage v2 12/25] add backup provider example
The example uses a simple directory structure to save the backups, grouped by guest ID. VM backups are saved as configuration files and qcow2 images, with backing files when doing incremental backups. Container backups are saved as configuration files and a tar file or squashfs image (added to test the 'directory' restore mechanism). Whether to use incremental VM backups and which backup mechanisms to use can be configured in the storage configuration. The 'nbdinfo' binary from the 'libnbd-bin' package is required for backup mechanism 'nbd' for VM backups, the 'mksquashfs' binary from the 'squashfs-tools' package is required for backup mechanism 'squashfs' for containers. Signed-off-by: Fiona Ebner --- Changes in v2: * Adapt to API changes. * Add log function helpers. * Use make_path and remove_tree instead of deprecated variants. * Add support for 'block-device' backup mechanism for VMs. * Use backup time for guest rather than the job start time in the archive name. .../BackupProvider/Plugin/DirectoryExample.pm | 694 ++ src/PVE/BackupProvider/Plugin/Makefile| 2 +- .../Custom/BackupProviderDirExamplePlugin.pm | 306 src/PVE/Storage/Custom/Makefile | 5 + src/PVE/Storage/Makefile | 1 + 5 files changed, 1007 insertions(+), 1 deletion(-) create mode 100644 src/PVE/BackupProvider/Plugin/DirectoryExample.pm create mode 100644 src/PVE/Storage/Custom/BackupProviderDirExamplePlugin.pm create mode 100644 src/PVE/Storage/Custom/Makefile diff --git a/src/PVE/BackupProvider/Plugin/DirectoryExample.pm b/src/PVE/BackupProvider/Plugin/DirectoryExample.pm new file mode 100644 index 000..b01ad02 --- /dev/null +++ b/src/PVE/BackupProvider/Plugin/DirectoryExample.pm @@ -0,0 +1,694 @@ +package PVE::BackupProvider::Plugin::DirectoryExample; + +use strict; +use warnings; + +use Fcntl qw(SEEK_SET); +use File::Path qw(make_path remove_tree); +use IO::File; +use IPC::Open3; + +use PVE::Storage::Plugin; +use PVE::Tools qw(file_get_contents file_read_firstline file_set_contents run_command); + +use base qw(PVE::BackupProvider::Plugin::Base); + +use constant { +BLKDISCARD => 0x1277, # see linux/fs.h +}; + +# Private helpers + +my sub log_info { +my ($self, $message) = @_; + +$self->{'log-function'}->('info', $message); +} + +my sub log_warning { +my ($self, $message) = @_; + +$self->{'log-function'}->('warn', $message); +} + +my sub log_error { +my ($self, $message) = @_; + +$self->{'log-function'}->('err', $message); +} + +# Try to use the same bitmap ID as last time for incremental backup if the storage is configured for +# incremental VM backup. Need to start fresh if there is no previous ID or the associated backup +# doesn't exist. +my sub get_bitmap_id { +my ($self, $vmid, $vmtype) = @_; + +return if $self->{'storage-plugin'}->get_vm_backup_mode($self->{scfg}) ne 'incremental'; + +my $previous_info_dir = "$self->{scfg}->{path}/$vmid/"; + +my $previous_info_file = "$previous_info_dir/previous-info"; +my $info = file_read_firstline($previous_info_file) // ''; +$self->{$vmid}->{'old-previous-info'} = $info; +my ($bitmap_id, $previous_backup_id) = $info =~ m/^(\d+)\s+(\d+)$/; +my $previous_backup_dir = + $previous_backup_id ? 
"$self->{scfg}->{path}/$vmid/$vmtype-$previous_backup_id" : undef; + +if ($bitmap_id && -d $previous_backup_dir) { + $self->{$vmid}->{'previous-backup-dir'} = $previous_backup_dir; +} else { + # need to start fresh if there is no previous ID or the associated backup doesn't exist + $bitmap_id = $self->{$vmid}->{'backup-time'}; +} + +$self->{$vmid}->{'bitmap-id'} = $bitmap_id; +make_path($previous_info_dir); +die "unable to create directory $previous_info_dir\n" if !-d $previous_info_dir; +file_set_contents($previous_info_file, "$bitmap_id $self->{$vmid}->{'backup-time'}"); + +return $bitmap_id; +} + +# Backup Provider API + +sub new { +my ($class, $storage_plugin, $scfg, $storeid, $log_function) = @_; + +my $self = bless { + scfg => $scfg, + storeid => $storeid, + 'storage-plugin' => $storage_plugin, + 'log-function' => $log_function, +}, $class; + +return $self; +} + +sub provider_name { +my ($self) = @_; + +return 'dir provider example'; +} + +# Hooks + +my sub job_start { +my ($self, $start_time) = @_; + +log_info($self, "job start hook called"); + +run_command(["modprobe", "nbd"]); + +log_info($self, "backup provider initialized successfully for new job $start_time"); +} + +sub job_hook { +my ($self, $phase, $info) = @_; + +if ($phase eq 'start') { + job_start($self, $info->{'start-time'}); +} elsif ($phase eq 'end') { + log_info($self, "job end hook called"); +} elsif ($phase eq 'abort') { + log_info($self, "job abort hook called with error - $info->{error}"); +} + +# ignore unknown p
[pve-devel] [RFC storage v2 10/25] plugin: introduce new_backup_provider() method
The new_backup_provider() method can be used by storage plugins for external backup providers. If the method returns a provider, Proxmox VE will use callbacks to that provider for backups and restore instead of using its usual backup/restore mechanisms. API age and version are both bumped. The backup provider API is split into two parts, both of which again need different implementations for VM and LXC guests:

1. Backup API

There are two hook callback functions, namely:
1. job_hook() is called during the start/end/abort phases of the whole backup job.
2. backup_hook() is called during the start/end/abort phases of the backup of an individual guest.

The backup_get_mechanism() method is used to decide on the backup mechanism. Currently, 'block-device' or 'nbd' for VMs, and 'directory' for containers are possible. The method also lets the plugin indicate whether to use a bitmap for incremental VM backup or not. It is enough to implement one mechanism for VMs and one mechanism for containers. Next, there are methods for backing up the guest's configuration and data, backup_vm() for VM backup and backup_container() for container backup. Finally, there are some helpers, e.g. for getting the provider name or the volume ID for the backup target, as well as for handling the backup log.

1.1 Backup Mechanisms

VM: Access to the data on the VM's disk from the time the backup started is made available via a so-called "snapshot access". This is either the full image, or, in case a bitmap is used, the dirty parts of the image since the last time the bitmap was used for a successful backup. Reading outside of the dirty parts will result in an error. After backing up each part of the disk, it should be discarded in the export to avoid unnecessary space usage on the Proxmox VE side (there is an associated fleecing image).

VM mechanism 'block-device': The snapshot access is exposed as a block device. If used, a bitmap is passed along.

VM mechanism 'nbd': The snapshot access and, if used, the bitmap are exported via NBD.

Container mechanism 'directory': A copy or snapshot of the container's filesystem state is made available as a directory.

2. Restore API

The restore_get_mechanism() method is used to decide on the restore mechanism. Currently, 'qemu-img' for VMs, and 'directory' or 'tar' for containers are possible. It is enough to implement one mechanism for VMs and one mechanism for containers. Next, there are methods for extracting the guest and firewall configuration, and the restore mechanism is implemented via a pair of methods: an init method for making the data available to Proxmox VE, and a cleanup method that is called after restore. For VMs, a restore_vm_get_device_info() helper is also required, to get the disks included in the backup and their sizes.

2.1. Restore Mechanisms

VM mechanism 'qemu-img': The backup provider gives a path to the disk image that will be restored. The path needs to be something 'qemu-img' can deal with, e.g. it can also be an NBD URI or similar.

Container mechanism 'directory': The backup provider gives the path to a directory with the full filesystem structure of the container.

Container mechanism 'tar': The backup provider gives the path to a (potentially compressed) tar archive with the full filesystem structure of the container.

See the PVE::BackupProvider::Plugin module for the full API documentation. Signed-off-by: Fiona Ebner --- Changes in v2: * Merge hook API into a single function for backup and for jobs. 
* Add restore_vm_init() and restore_vm_cleanup() for better flexibility to allow preparing the whole restore. Question is if restore_vm_volume_init() and restore_vm_volume_cleanup() should be dropped (but certain providers might prefer using only those)? Having both is more flexible, but makes the API longer of course. * Switch to backup_vm() (was per-volume backup_vm_volume() before) and backup_container(), passing along the configuration files, rather than having dedicated methods for the configuration files, for giving the backup provider more flexibility. * Pass backup time to backup 'start' hook and use that in the directory example rather than the job start time. * Use POD for base plugin documentation and flesh out documentation. * Use 'BackupProvider::Plugin::' namespace. * Rename $exclude_paths to $exclude_patterns and clarify what should be supported. * Rename backup_get_target() method to backup_get_archive_name() for clarity. * Rename extract_guest_config() to restore_get_guest_config() for better consistency with other (restore) API methods. src/PVE/BackupProvider/Makefile|3 + src/PVE/BackupProvider/Plugin/Base.pm | 1149 src/PVE/BackupProvider/Plugin/Makefile |5 + src/PVE/Makefile |1 + src/PVE/Storage.pm | 12 +- src/PVE/Storage/Plugin.pm | 15 + 6 files changed, 1183 insertions(+), 2 deletions(-) create mode 100644
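To make the shape of the API more tangible, here is a rough provider skeleton using the method names from the description above; the exact parameter lists are documented in the Plugin/Base.pm POD, so treat this as an illustrative sketch only:

package PVE::BackupProvider::Plugin::SkeletonExample; # hypothetical name

use strict;
use warnings;

use base qw(PVE::BackupProvider::Plugin::Base);

sub new {
    my ($class, $storage_plugin, $scfg, $storeid, $log_function) = @_;

    return bless {
        scfg => $scfg,
        storeid => $storeid,
        'storage-plugin' => $storage_plugin,
        'log-function' => $log_function,
    }, $class;
}

sub provider_name { return 'skeleton example'; }

# called during start/end/abort of the whole job and of each guest backup
sub job_hook { my ($self, $phase, $info) = @_; return; }
sub backup_hook { my ($self, $phase, $vmid, $vmtype, $info) = @_; return; }

# pick one mechanism per guest type; the second value indicates bitmap use
# for incremental VM backup (undef meaning none, as in the example plugins)
sub backup_get_mechanism {
    my ($self, $vmid, $vmtype) = @_;

    return ('block-device', undef) if $vmtype eq 'qemu';
    return ('directory', undef) if $vmtype eq 'lxc';
    die "unsupported guest type '$vmtype'\n";
}

# the actual data transfer, receiving configuration files and volumes at once
sub backup_vm { die "left out in this sketch\n"; }
sub backup_container { die "left out in this sketch\n"; }

1;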
[pve-devel] [PATCH qemu v2 01/25] block/reqlist: allow adding overlapping requests
Allow overlapping request by removing the assert that made it impossible. There are only two callers: 1. block_copy_task_create() It already asserts the very same condition before calling reqlist_init_req(). 2. cbw_snapshot_read_lock() There is no need to have read requests be non-overlapping in copy-before-write when used for snapshot-access. In fact, there was no protection against two callers of cbw_snapshot_read_lock() calling reqlist_init_req() with overlapping ranges and this could lead to an assertion failure [1]. In particular, with the reproducer script below [0], two cbw_co_snapshot_block_status() callers could race, with the second calling reqlist_init_req() before the first one finishes and removes its conflicting request. [0]: > #!/bin/bash -e > dd if=/dev/urandom of=/tmp/disk.raw bs=1M count=1024 > ./qemu-img create /tmp/fleecing.raw -f raw 1G > ( > ./qemu-system-x86_64 --qmp stdio \ > --blockdev raw,node-name=node0,file.driver=file,file.filename=/tmp/disk.raw \ > --blockdev > raw,node-name=node1,file.driver=file,file.filename=/tmp/fleecing.raw \ > < {"execute": "qmp_capabilities"} > {"execute": "blockdev-add", "arguments": { "driver": "copy-before-write", > "file": "node0", "target": "node1", "node-name": "node3" } } > {"execute": "blockdev-add", "arguments": { "driver": "snapshot-access", > "file": "node3", "node-name": "snap0" } } > {"execute": "nbd-server-start", "arguments": {"addr": { "type": "unix", > "data": { "path": "/tmp/nbd.socket" } } } } > {"execute": "block-export-add", "arguments": {"id": "exp0", "node-name": > "snap0", "type": "nbd", "name": "exp0"}} > EOF > ) & > sleep 5 > while true; do > ./qemu-nbd -d /dev/nbd0 > ./qemu-nbd -c /dev/nbd0 nbd:unix:/tmp/nbd.socket:exportname=exp0 -f raw -r > nbdinfo --map 'nbd+unix:///exp0?socket=/tmp/nbd.socket' > done [1]: > #5 0x71e5f0088eb2 in __GI___assert_fail (...) at ./assert/assert.c:101 > #6 0x615285438017 in reqlist_init_req (...) at ../block/reqlist.c:23 > #7 0x6152853e2d98 in cbw_snapshot_read_lock (...) at > ../block/copy-before-write.c:237 > #8 0x6152853e3068 in cbw_co_snapshot_block_status (...) at > ../block/copy-before-write.c:304 > #9 0x6152853f4d22 in bdrv_co_snapshot_block_status (...) at > ../block/io.c:3726 > #10 0x61528543a63e in snapshot_access_co_block_status (...) at > ../block/snapshot-access.c:48 > #11 0x6152853f1a0a in bdrv_co_do_block_status (...) at ../block/io.c:2474 > #12 0x6152853f2016 in bdrv_co_common_block_status_above (...) at > ../block/io.c:2652 > #13 0x6152853f22cf in bdrv_co_block_status_above (...) at > ../block/io.c:2732 > #14 0x6152853d9a86 in blk_co_block_status_above (...) at > ../block/block-backend.c:1473 > #15 0x61528538da6c in blockstatus_to_extents (...) at ../nbd/server.c:2374 > #16 0x61528538deb1 in nbd_co_send_block_status (...) at > ../nbd/server.c:2481 > #17 0x61528538f424 in nbd_handle_request (...) at ../nbd/server.c:2978 > #18 0x61528538f906 in nbd_trip (...) at ../nbd/server.c:3121 > #19 0x6152855a7caf in coroutine_trampoline (...) at > ../util/coroutine-ucontext.c:175 Cc: qemu-sta...@nongnu.org Suggested-by: Vladimir Sementsov-Ogievskiy Signed-off-by: Fiona Ebner Reviewed-by: Vladimir Sementsov-Ogievskiy --- No changes in v2. 
block/copy-before-write.c | 3 ++- block/reqlist.c | 2 -- 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/block/copy-before-write.c b/block/copy-before-write.c index 50cc4c7aae..a5bb4d14f6 100644 --- a/block/copy-before-write.c +++ b/block/copy-before-write.c @@ -67,7 +67,8 @@ typedef struct BDRVCopyBeforeWriteState { /* * @frozen_read_reqs: current read requests for fleecing user in bs->file - * node. These areas must not be rewritten by guest. + * node. These areas must not be rewritten by guest. There can be multiple + * overlapping read requests. */ BlockReqList frozen_read_reqs; diff --git a/block/reqlist.c b/block/reqlist.c index 08cb57cfa4..098e807378 100644 --- a/block/reqlist.c +++ b/block/reqlist.c @@ -20,8 +20,6 @@ void reqlist_init_req(BlockReqList *reqs, BlockReq *req, int64_t offset, int64_t bytes) { -assert(!reqlist_find_conflict(reqs, offset, bytes)); - *req = (BlockReq) { .offset = offset, .bytes = bytes, -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH qemu v2 05/25] PVE backup: include device name in error when setting up snapshot access fails
Signed-off-by: Fiona Ebner --- No changes in v2. pve-backup.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/pve-backup.c b/pve-backup.c index 33c23e53c2..d931746453 100644 --- a/pve-backup.c +++ b/pve-backup.c @@ -626,7 +626,8 @@ static void create_backup_jobs_bh(void *opaque) { bool discard_source = false; if (di->fleecing.bs) { if (setup_snapshot_access(di, &local_err) < 0) { -error_setg(errp, "setting up snapshot access for fleecing failed: %s", +error_setg(errp, "%s - setting up snapshot access for fleecing failed: %s", + di->device_name, local_err ? error_get_pretty(local_err) : "unknown error"); cleanup_snapshot_access(di); bdrv_drained_end(di->bs); -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [RFC qemu v2 06/25] PVE backup: add target ID in backup state
In preparation for allowing multiple backup providers. Each backup target can then have its own dirty bitmap and there can be additional checks that the current backup state is actually associated to the expected target. Signed-off-by: Fiona Ebner --- No changes in v2. pve-backup.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/pve-backup.c b/pve-backup.c index d931746453..e8031bb89c 100644 --- a/pve-backup.c +++ b/pve-backup.c @@ -70,6 +70,7 @@ static struct PVEBackupState { JobTxn *txn; CoMutex backup_mutex; CoMutex dump_callback_mutex; +char *target_id; } backup_state; static void pvebackup_init(void) @@ -848,7 +849,7 @@ UuidInfo coroutine_fn *qmp_backup( if (backup_state.di_list) { error_set(errp, ERROR_CLASS_GENERIC_ERROR, - "previous backup not finished"); + "previous backup by provider '%s' not finished", backup_state.target_id); qemu_co_mutex_unlock(&backup_state.backup_mutex); return NULL; } @@ -1100,6 +1101,11 @@ UuidInfo coroutine_fn *qmp_backup( backup_state.vmaw = vmaw; backup_state.pbs = pbs; +if (backup_state.target_id) { +g_free(backup_state.target_id); +} +backup_state.target_id = g_strdup("Proxmox"); + backup_state.di_list = di_list; uuid_info = g_malloc0(sizeof(*uuid_info)); -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [RFC qemu v2 07/25] PVE backup: get device info: allow caller to specify filter for which devices use fleecing
For providing snapshot-access to external backup providers, EFI and TPM also need an associated fleecing image. The new caller will thus need a different filter. Signed-off-by: Fiona Ebner --- No changes in v2. pve-backup.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/pve-backup.c b/pve-backup.c index e8031bb89c..d0593fc581 100644 --- a/pve-backup.c +++ b/pve-backup.c @@ -717,7 +717,7 @@ static void create_backup_jobs_bh(void *opaque) { /* * EFI disk and TPM state are small and it's just not worth setting up fleecing for them. */ -static bool device_uses_fleecing(const char *device_id) +static bool fleecing_no_efi_tpm(const char *device_id) { return strncmp(device_id, "drive-efidisk", 13) && strncmp(device_id, "drive-tpmstate", 14); } @@ -729,7 +729,7 @@ static bool device_uses_fleecing(const char *device_id) */ static GList coroutine_fn GRAPH_RDLOCK *get_device_info( const char *devlist, -bool fleecing, +bool (*device_uses_fleecing)(const char*), Error **errp) { gchar **devs = NULL; @@ -755,7 +755,7 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info( di->bs = bs; di->device_name = g_strdup(bdrv_get_device_name(bs)); -if (fleecing && device_uses_fleecing(*d)) { +if (device_uses_fleecing && device_uses_fleecing(*d)) { g_autofree gchar *fleecing_devid = g_strconcat(*d, "-fleecing", NULL); BlockBackend *fleecing_blk = blk_by_name(fleecing_devid); if (!fleecing_blk) { @@ -858,7 +858,8 @@ UuidInfo coroutine_fn *qmp_backup( format = has_format ? format : BACKUP_FORMAT_VMA; bdrv_graph_co_rdlock(); -di_list = get_device_info(devlist, has_fleecing && fleecing, &local_err); +di_list = get_device_info(devlist, (has_fleecing && fleecing) ? fleecing_no_efi_tpm : NULL, + &local_err); bdrv_graph_co_rdunlock(); if (local_err) { error_propagate(errp, local_err); -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH qemu-server v2 20/25] restore: die early when there is no size for a device
Makes it a clean error for buggy (external) backup providers where the size might not be set at all. Signed-off-by: Fiona Ebner --- No changes in v2. PVE/QemuServer.pm | 1 + 1 file changed, 1 insertion(+) diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm index e5ff5efb..37f56f69 100644 --- a/PVE/QemuServer.pm +++ b/PVE/QemuServer.pm @@ -6755,6 +6755,7 @@ my $restore_allocate_devices = sub { my $map = {}; foreach my $virtdev (sort keys %$virtdev_hash) { my $d = $virtdev_hash->{$virtdev}; + die "got no size for '$virtdev'\n" if !defined($d->{size}); my $alloc_size = int(($d->{size} + 1024 - 1)/1024); my $storeid = $d->{storeid}; my $scfg = PVE::Storage::storage_config($storecfg, $storeid); -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH qemu-server v2 16/25] backup: cleanup: check if VM is running before issuing QMP commands
When the VM is only started for backup, the VM will be stopped at that point again. While the detach helpers do not warn about errors currently, that might change in the future. This is also in preparation for other cleanup QMP helpers that are more verbose about failure. Signed-off-by: Fiona Ebner --- No changes in v2. PVE/VZDump/QemuServer.pm | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm index b2ced154..c46e607c 100644 --- a/PVE/VZDump/QemuServer.pm +++ b/PVE/VZDump/QemuServer.pm @@ -1118,13 +1118,14 @@ sub snapshot { sub cleanup { my ($self, $task, $vmid) = @_; -$detach_tpmstate_drive->($task, $vmid); - -if ($task->{'use-fleecing'}) { - detach_fleecing_images($task->{disks}, $vmid); - cleanup_fleecing_images($self, $task->{disks}); +# If VM was started only for backup, it is already stopped now. +if (PVE::QemuServer::Helpers::vm_running_locally($vmid)) { + $detach_tpmstate_drive->($task, $vmid); + detach_fleecing_images($task->{disks}, $vmid) if $task->{'use-fleecing'}; } +cleanup_fleecing_images($self, $task->{disks}) if $task->{'use-fleecing'}; + if ($self->{qmeventd_fh}) { close($self->{qmeventd_fh}); } -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH qemu-server v2 15/25] backup: move cleanup of fleecing images to cleanup method
TPM drives are already detached there and it's better to group these things together. Signed-off-by: Fiona Ebner --- No changes in v2. PVE/VZDump/QemuServer.pm | 25 + 1 file changed, 9 insertions(+), 16 deletions(-) diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm index 012c9210..b2ced154 100644 --- a/PVE/VZDump/QemuServer.pm +++ b/PVE/VZDump/QemuServer.pm @@ -690,7 +690,6 @@ sub archive_pbs { # get list early so we die on unkown drive types before doing anything my $devlist = _get_task_devlist($task); -my $use_fleecing; $self->enforce_vm_running_for_backup($vmid); $self->{qmeventd_fh} = PVE::QemuServer::register_qmeventd_handle($vmid); @@ -721,7 +720,7 @@ sub archive_pbs { my $is_template = PVE::QemuConfig->is_template($self->{vmlist}->{$vmid}); - $use_fleecing = check_and_prepare_fleecing( + $task->{'use-fleecing'} = check_and_prepare_fleecing( $self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support); my $fs_frozen = $self->qga_fs_freeze($task, $vmid); @@ -735,7 +734,7 @@ sub archive_pbs { devlist => $devlist, 'config-file' => $conffile, }; - $params->{fleecing} = JSON::true if $use_fleecing; + $params->{fleecing} = JSON::true if $task->{'use-fleecing'}; if (defined(my $ns = $scfg->{namespace})) { $params->{'backup-ns'} = $ns; @@ -784,11 +783,6 @@ sub archive_pbs { } $self->restore_vm_power_state($vmid); -if ($use_fleecing) { - detach_fleecing_images($task->{disks}, $vmid); - cleanup_fleecing_images($self, $task->{disks}); -} - die $err if $err; } @@ -891,7 +885,6 @@ sub archive_vma { } my $devlist = _get_task_devlist($task); -my $use_fleecing; $self->enforce_vm_running_for_backup($vmid); $self->{qmeventd_fh} = PVE::QemuServer::register_qmeventd_handle($vmid); @@ -911,7 +904,7 @@ sub archive_vma { $attach_tpmstate_drive->($self, $task, $vmid); - $use_fleecing = check_and_prepare_fleecing( + $task->{'use-fleecing'} = check_and_prepare_fleecing( $self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support); my $outfh; @@ -942,7 +935,7 @@ sub archive_vma { devlist => $devlist }; $params->{'firewall-file'} = $firewall if -e $firewall; - $params->{fleecing} = JSON::true if $use_fleecing; + $params->{fleecing} = JSON::true if $task->{'use-fleecing'}; add_backup_performance_options($params, $opts->{performance}, $qemu_support); $qmpclient->queue_cmd($vmid, $backup_cb, 'backup', %$params); @@ -984,11 +977,6 @@ sub archive_vma { $self->restore_vm_power_state($vmid); -if ($use_fleecing) { - detach_fleecing_images($task->{disks}, $vmid); - cleanup_fleecing_images($self, $task->{disks}); -} - if ($err) { if ($cpid) { kill(9, $cpid); @@ -1132,6 +1120,11 @@ sub cleanup { $detach_tpmstate_drive->($task, $vmid); +if ($task->{'use-fleecing'}) { + detach_fleecing_images($task->{disks}, $vmid); + cleanup_fleecing_images($self, $task->{disks}); +} + if ($self->{qmeventd_fh}) { close($self->{qmeventd_fh}); } -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH manager v2 24/25] ui: backup: also check for backup subtype to classify archive
In anticipation of future storage plugins that might not have PBS-specific formats or adhere to the vzdump naming scheme for backups. Signed-off-by: Fiona Ebner --- No changes in v2. www/manager6/Utils.js | 10 ++ www/manager6/grid/BackupView.js| 4 ++-- www/manager6/storage/BackupView.js | 4 ++-- 3 files changed, 10 insertions(+), 8 deletions(-) diff --git a/www/manager6/Utils.js b/www/manager6/Utils.js index db86fa9a..a8e4e8ee 100644 --- a/www/manager6/Utils.js +++ b/www/manager6/Utils.js @@ -693,12 +693,14 @@ Ext.define('PVE.Utils', { 'snippets': gettext('Snippets'), }, -volume_is_qemu_backup: function(volid, format) { - return format === 'pbs-vm' || volid.match(':backup/vzdump-qemu-'); +volume_is_qemu_backup: function(volume) { + return volume.format === 'pbs-vm' || volume.volid.match(':backup/vzdump-qemu-') || + volume.subtype === 'qemu'; }, -volume_is_lxc_backup: function(volid, format) { - return format === 'pbs-ct' || volid.match(':backup/vzdump-(lxc|openvz)-'); +volume_is_lxc_backup: function(volume) { + return volume.format === 'pbs-ct' || volume.volid.match(':backup/vzdump-(lxc|openvz)-') || + volume.subtype === 'lxc'; }, authSchema: { diff --git a/www/manager6/grid/BackupView.js b/www/manager6/grid/BackupView.js index e71d1c88..ef3649c6 100644 --- a/www/manager6/grid/BackupView.js +++ b/www/manager6/grid/BackupView.js @@ -29,11 +29,11 @@ Ext.define('PVE.grid.BackupView', { var vmtypeFilter; if (vmtype === 'lxc' || vmtype === 'openvz') { vmtypeFilter = function(item) { - return PVE.Utils.volume_is_lxc_backup(item.data.volid, item.data.format); + return PVE.Utils.volume_is_lxc_backup(item.data); }; } else if (vmtype === 'qemu') { vmtypeFilter = function(item) { - return PVE.Utils.volume_is_qemu_backup(item.data.volid, item.data.format); + return PVE.Utils.volume_is_qemu_backup(item.data); }; } else { throw "unsupported VM type '" + vmtype + "'"; diff --git a/www/manager6/storage/BackupView.js b/www/manager6/storage/BackupView.js index 878e1c8f..ad6e6a01 100644 --- a/www/manager6/storage/BackupView.js +++ b/www/manager6/storage/BackupView.js @@ -84,9 +84,9 @@ Ext.define('PVE.storage.BackupView', { disabled: true, handler: function(b, e, rec) { let vmtype; - if (PVE.Utils.volume_is_qemu_backup(rec.data.volid, rec.data.format)) { + if (PVE.Utils.volume_is_qemu_backup(rec.data)) { vmtype = 'qemu'; - } else if (PVE.Utils.volume_is_lxc_backup(rec.data.volid, rec.data.format)) { + } else if (PVE.Utils.volume_is_lxc_backup(rec.data)) { vmtype = 'lxc'; } else { return; -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [RFC qemu v2 08/25] PVE backup: implement backup access setup and teardown API for external providers
For external backup providers, the state of the VM's disk images at the time the backup is started is preserved via a snapshot-access block node. Old data is moved to the fleecing image when new guest writes come in. The snapshot-access block node, as well as the associated bitmap in case of incremental backup, will be exported via NBD to the external provider. The NBD export will be done by the management layer, the missing functionality is setting up and tearing down the snapshot-access block nodes, which this patch adds. It is necessary to also set up fleecing for EFI and TPM disks, so that old data can be moved out of the way when a new guest write comes in. There can only be one regular backup or one active backup access at a time, because both require replacing the original block node of the drive. Thus the backup state is re-used, and checks are added to prohibit regular backup while snapshot access is active and vice versa. The block nodes added by the backup-access-setup QMP call are not tracked anywhere else (there is no job they are associated to like for regular backup). This requires adding a callback for teardown when QEMU exits, i.e. in qemu_cleanup(). Otherwise, there will be an assertion failure that the block graph is not empty when QEMU exits before the backup-access-teardown QMP command is called. The code for the qmp_backup_access_setup() was based on the existing qmp_backup() routine. The return value for the setup QMP command contains information about the snapshot-access block nodes that can be used by the management layer to set up the NBD exports. Signed-off-by: Fiona Ebner --- Changes in v2: * Also return the size of the block devices in the setup call. pve-backup.c | 273 +++ pve-backup.h | 16 +++ qapi/block-core.json | 45 +++ system/runstate.c| 6 + 4 files changed, 340 insertions(+) create mode 100644 pve-backup.h diff --git a/pve-backup.c b/pve-backup.c index d0593fc581..d3370d6744 100644 --- a/pve-backup.c +++ b/pve-backup.c @@ -1,4 +1,5 @@ #include "proxmox-backup-client.h" +#include "pve-backup.h" #include "vma.h" #include "qemu/osdep.h" @@ -585,6 +586,37 @@ static int setup_snapshot_access(PVEBackupDevInfo *di, Error **errp) return 0; } +static void setup_all_snapshot_access_bh(void *opaque) +{ +assert(!qemu_in_coroutine()); + +CoCtxData *data = (CoCtxData*)opaque; +Error **errp = (Error**)data->data; + +Error *local_err = NULL; + +GList *l = backup_state.di_list; +while (l) { +PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; +l = g_list_next(l); + +bdrv_drained_begin(di->bs); + +if (setup_snapshot_access(di, &local_err) < 0) { +cleanup_snapshot_access(di); +bdrv_drained_end(di->bs); +error_setg(errp, "%s - setting up snapshot access failed: %s", di->device_name, + local_err ? error_get_pretty(local_err) : "unknown error"); +break; +} + +bdrv_drained_end(di->bs); +} + +/* return */ +aio_co_enter(data->ctx, data->co); +} + /* * backup_job_create can *not* be run from a coroutine, so this can't either. * The caller is responsible that backup_mutex is held nonetheless. @@ -722,6 +754,11 @@ static bool fleecing_no_efi_tpm(const char *device_id) return strncmp(device_id, "drive-efidisk", 13) && strncmp(device_id, "drive-tpmstate", 14); } +static bool fleecing_all(const char *device_id) +{ +return true; +} + /* * Returns a list of device infos, which needs to be freed by the caller. 
In * case of an error, errp will be set, but the returned value might still be a @@ -810,6 +847,242 @@ err: return di_list; } +BackupAccessInfoList *coroutine_fn qmp_backup_access_setup( +const char *target_id, +const char *devlist, +Error **errp) +{ +assert(qemu_in_coroutine()); + +qemu_co_mutex_lock(&backup_state.backup_mutex); + +Error *local_err = NULL; +GList *di_list = NULL; +GList *l; + +if (backup_state.di_list) { +error_set(errp, ERROR_CLASS_GENERIC_ERROR, + "previous backup by provider '%s' not finished", backup_state.target_id); +qemu_co_mutex_unlock(&backup_state.backup_mutex); +return NULL; +} + +bdrv_graph_co_rdlock(); +di_list = get_device_info(devlist, fleecing_all, &local_err); +bdrv_graph_co_rdunlock(); +if (local_err) { +error_propagate(errp, local_err); +goto err; +} +assert(di_list); + +size_t total = 0; + +l = di_list; +while (l) { +PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; +l = g_list_next(l); + +ssize_t size = bdrv_getlength(di->bs); +if (size < 0) { +error_setg_errno(errp, -size, "bdrv_getlength failed"); +goto err; +} +di->size = size; +total += size; + +di->comple
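For illustration, the management layer could drive these commands roughly as follows; the 'target-id' and devlist argument names are inferred from the teardown call and the C signature in this series, while the VM ID, device list and target name are made up:

use JSON;
use PVE::QemuServer::Monitor qw(mon_cmd);

my $vmid = 100;              # hypothetical VM ID
my $devlist = 'drive-scsi0'; # hypothetical device list

# set up snapshot-access block nodes (and fleecing) for all listed drives;
# the returned info (node names, sizes) is then used to create NBD exports
my $access_info = mon_cmd($vmid, 'backup-access-setup',
    'target-id' => 'external-example',
    devlist => $devlist,
);

# ... set up NBD exports and let the external provider read the data ...

# tear everything down again, telling QEMU whether the backup succeeded
mon_cmd($vmid, 'backup-access-teardown',
    'target-id' => 'external-example',
    timeout => 60,
    success => JSON::true,
);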
[pve-devel] [RFC qemu-server v2 18/25] backup: allow adding fleecing images also for EFI and TPM
For the external backup API, it will be necessary to add a fleecing image even for small disks like EFI and TPM, because there is no other place the old data could be copied to when a new guest write comes in. Signed-off-by: Fiona Ebner --- No changes in v2. PVE/VZDump/QemuServer.pm | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm index 98685127..4ad4a154 100644 --- a/PVE/VZDump/QemuServer.pm +++ b/PVE/VZDump/QemuServer.pm @@ -548,7 +548,7 @@ my sub cleanup_fleecing_images { } my sub allocate_fleecing_images { -my ($self, $disks, $vmid, $fleecing_storeid, $format) = @_; +my ($self, $disks, $vmid, $fleecing_storeid, $format, $all_images) = @_; die "internal error - no fleecing storage specified\n" if !$fleecing_storeid; @@ -559,7 +559,8 @@ my sub allocate_fleecing_images { my $n = 0; # counter for fleecing image names for my $di ($disks->@*) { - next if $di->{virtdev} =~ m/^(?:tpmstate|efidisk)\d$/; # too small to be worth it + # EFI/TPM are usually too small to be worth it, but it's required for external providers + next if !$all_images && $di->{virtdev} =~ m/^(?:tpmstate|efidisk)\d$/; if ($di->{type} eq 'block' || $di->{type} eq 'file') { my $scfg = PVE::Storage::storage_config($self->{storecfg}, $fleecing_storeid); my $name = "vm-$vmid-fleece-$n"; @@ -623,7 +624,7 @@ my sub attach_fleecing_images { } my sub check_and_prepare_fleecing { -my ($self, $vmid, $fleecing_opts, $disks, $is_template, $qemu_support) = @_; +my ($self, $vmid, $fleecing_opts, $disks, $is_template, $qemu_support, $all_images) = @_; # Even if the VM was started specifically for fleecing, it's possible that the VM is resumed and # then starts doing IO. For VMs that are not resumed the fleecing images will just stay empty, @@ -644,7 +645,8 @@ my sub check_and_prepare_fleecing { $self->{storecfg}, $fleecing_opts->{storage}); my $format = scalar(grep { $_ eq 'qcow2' } $valid_formats->@*) ? 'qcow2' : 'raw'; - allocate_fleecing_images($self, $disks, $vmid, $fleecing_opts->{storage}, $format); + allocate_fleecing_images( + $self, $disks, $vmid, $fleecing_opts->{storage}, $format, $all_images); attach_fleecing_images($self, $disks, $vmid, $format); } @@ -735,7 +737,7 @@ sub archive_pbs { my $is_template = PVE::QemuConfig->is_template($self->{vmlist}->{$vmid}); $task->{'use-fleecing'} = check_and_prepare_fleecing( - $self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support); + $self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support, 0); my $fs_frozen = $self->qga_fs_freeze($task, $vmid); @@ -919,7 +921,7 @@ sub archive_vma { $attach_tpmstate_drive->($self, $task, $vmid); $task->{'use-fleecing'} = check_and_prepare_fleecing( - $self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support); + $self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support, 0); my $outfh; if ($opts->{stdout}) { -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [RFC qemu v2 09/25] PVE backup: implement bitmap support for external backup access
There can be one dirty bitmap for each backup target ID (which are tracked in the backup_access_bitmaps hash table). The QMP user can specify the ID of the bitmap it likes to use. This ID is then compared to the current one for the given target. If they match, the bitmap is re-used (should it still exist on the drive, otherwise re-created). If there is a mismatch, the old bitmap is removed and a new one is created. The return value of the QMP command includes information about what bitmap action was taken. Similar to what the query-backup QMP command returns for regular backup. It also includes the bitmap name and associated block node, so the management layer can then set up an NBD export with the bitmap. While the backup access is active, a background bitmap is also required. This is necessary to implement bitmap handling according to the original reference [0]. In particular: - in the error case, new writes since the backup access was set up are in the background bitmap. Because of failure, the previously tracked writes from the backup access bitmap are still required too. Thus, the bitmap is merged with the background bitmap to get all new writes since the last backup. - in the success case, continue tracking for the next incremental backup in the backup access bitmap. New writes since the backup access was set up are in the background bitmap. Because the backup was successfully, clear the backup access bitmap and merge back the background bitmap to get only the new writes. Since QEMU cannot know if the backup was successful or not (except if failure already happens during the setup QMP command), the management layer needs to tell it via the teardown QMP command. The bitmap action is also recorded in the device info now. [0]: https://lore.kernel.org/qemu-devel/b68833dd-8864-4d72-7c61-c134a9835...@ya.ru/ Signed-off-by: Fiona Ebner --- No changes in v2. pve-backup.c | 175 ++- pve-backup.h | 2 +- qapi/block-core.json | 22 +- system/runstate.c| 2 +- 4 files changed, 193 insertions(+), 8 deletions(-) diff --git a/pve-backup.c b/pve-backup.c index d3370d6744..5f8dd396d5 100644 --- a/pve-backup.c +++ b/pve-backup.c @@ -15,6 +15,7 @@ #include "qapi/qmp/qdict.h" #include "qapi/qmp/qerror.h" #include "qemu/cutils.h" +#include "qemu/error-report.h" #if defined(CONFIG_MALLOC_TRIM) #include @@ -41,6 +42,7 @@ */ const char *PBS_BITMAP_NAME = "pbs-incremental-dirty-bitmap"; +const char *BACKGROUND_BITMAP_NAME = "backup-access-background-bitmap"; static struct PVEBackupState { struct { @@ -72,6 +74,7 @@ static struct PVEBackupState { CoMutex backup_mutex; CoMutex dump_callback_mutex; char *target_id; +GHashTable *backup_access_bitmaps; // key=target_id, value=bitmap_name } backup_state; static void pvebackup_init(void) @@ -99,6 +102,8 @@ typedef struct PVEBackupDevInfo { char* device_name; int completed_ret; // INT_MAX if not completed BdrvDirtyBitmap *bitmap; +BdrvDirtyBitmap *background_bitmap; // used for external backup access +PBSBitmapAction bitmap_action; BlockDriverState *target; BlockJob *job; } PVEBackupDevInfo; @@ -362,6 +367,67 @@ static void coroutine_fn pvebackup_co_complete_stream(void *opaque) qemu_co_mutex_unlock(&backup_state.backup_mutex); } +/* + * New writes since the backup access was set up are in the background bitmap. Because of failure, + * the previously tracked writes in di->bitmap are still required too. Thus, merge with the + * background bitmap to get all new writes since the last backup. 
+ */ +static void handle_backup_access_bitmaps_in_error_case(PVEBackupDevInfo *di) +{ +Error *local_err = NULL; + +if (di->bs && di->background_bitmap) { +bdrv_drained_begin(di->bs); +if (di->bitmap) { +bdrv_enable_dirty_bitmap(di->bitmap); +if (!bdrv_merge_dirty_bitmap(di->bitmap, di->background_bitmap, NULL, &local_err)) { +warn_report("backup access: %s - could not merge bitmaps in error path - %s", +di->device_name, +local_err ? error_get_pretty(local_err) : "unknown error"); +/* + * Could not merge, drop original bitmap too. + */ +bdrv_release_dirty_bitmap(di->bitmap); +} +} else { +warn_report("backup access: %s - expected bitmap not present", di->device_name); +} +bdrv_release_dirty_bitmap(di->background_bitmap); +bdrv_drained_end(di->bs); +} +} + +/* + * Continue tracking for next incremental backup in di->bitmap. New writes since the backup access + * was set up are in the background bitmap. Because the backup was successful, clear di->bitmap and + * merge back the background bitmap to get only the new writes. + */ +static void handle_backup_access_bitmaps_afte
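Continuing the management-layer sketch from patch 08/25 above, the returned bitmap and node names could then be used for an NBD export along these lines; note that the field names of the returned entries are placeholders, only nbd-server-start and block-export-add are standard QMP commands:

# start an NBD server on a unix socket (hypothetical path)
mon_cmd($vmid, 'nbd-server-start',
    addr => { type => 'unix', data => { path => "/run/qemu-server/${vmid}_backup.socket" } },
);

for my $dev ($access_info->@*) {
    # 'device', 'node-name' and 'bitmap-name' are assumed field names here
    my $params = {
        id => "backup-access-$dev->{device}",
        type => 'nbd',
        'node-name' => $dev->{'node-name'}, # the snapshot-access node
        name => $dev->{device},             # NBD export name
    };
    # expose the dirty bitmap so the provider can query the dirty regions
    $params->{bitmaps} = [$dev->{'bitmap-name'}] if $dev->{'bitmap-name'};

    mon_cmd($vmid, 'block-export-add', $params->%*);
}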
[pve-devel] [POC storage v2 13/25] Borg plugin
Archive names start with the guest type and ID and then the same timestamp format as PBS. Container archives have the following structure: guest.config firewall.config filesystem/ # containing the whole filesystem structure VM archives have the following structure guest.config firewall.config volumes/ # containing a raw file for each device A bindmount (respectively symlinks) are used to achieve this structure, because Borg doesn't seem to support renaming on-the-fly. (Prefix stripping via the "slashdot hack" would have helped slightly, but is only in Borg >= 1.4 https://github.com/borgbackup/borg/actions/runs/7967940995) NOTE: running via SSH was not yet tested. Bandwidth limit is not yet honored and the task size is not calculated yet. Discard for VM backups would also be nice to have, but it's not entirely clear how (parsing progress and discarding according to that is one idea). There is no dirty bitmap support, not sure if that is feasible to add. Signed-off-by: Fiona Ebner --- New in v2. src/PVE/BackupProvider/Plugin/Borg.pm | 373 ++ src/PVE/BackupProvider/Plugin/Makefile | 2 +- src/PVE/Storage.pm | 2 + src/PVE/Storage/BorgBackupPlugin.pm| 506 + src/PVE/Storage/Makefile | 1 + 5 files changed, 883 insertions(+), 1 deletion(-) create mode 100644 src/PVE/BackupProvider/Plugin/Borg.pm create mode 100644 src/PVE/Storage/BorgBackupPlugin.pm diff --git a/src/PVE/BackupProvider/Plugin/Borg.pm b/src/PVE/BackupProvider/Plugin/Borg.pm new file mode 100644 index 000..b65321c --- /dev/null +++ b/src/PVE/BackupProvider/Plugin/Borg.pm @@ -0,0 +1,373 @@ +package PVE::BackupProvider::Plugin::Borg; + +use strict; +use warnings; + +use File::chdir; +use File::Basename qw(basename); +use File::Path qw(make_path remove_tree); +use POSIX qw(strftime); + +use PVE::Tools; + +# ($vmtype, $vmid, $time_string) +our $ARCHIVE_RE_3 = qr!^pve-(lxc|qemu)-([0-9]+)-([0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z)$!; + +sub archive_name { +my ($vmtype, $vmid, $backup_time) = @_; + +return "pve-${vmtype}-${vmid}-" . 
strftime("%FT%TZ", gmtime($backup_time)); +} + +# remove_tree can be very verbose by default, do explicit error handling and limit to one message +my sub _remove_tree { +my ($path) = @_; + +remove_tree($path, { error => \my $err }); +if ($err && @$err) { # empty array if no error + for my $diag (@$err) { + my ($file, $message) = %$diag; + die "cannot remove_tree '$path': $message\n" if $file eq ''; + die "cannot remove_tree '$path': unlinking $file failed - $message\n"; + } +} +} + +my sub prepare_run_dir { +my ($archive, $operation) = @_; + +my $run_dir = "/run/pve-storage-borg-plugin/${archive}.${operation}"; +_remove_tree($run_dir); +make_path($run_dir); +die "unable to create directory $run_dir\n" if !-d $run_dir; + +return $run_dir; +} + +my sub log_info { +my ($self, $message) = @_; + +$self->{'log-function'}->('info', $message); +} + +my sub log_warning { +my ($self, $message) = @_; + +$self->{'log-function'}->('warn', $message); +} + +my sub log_error { +my ($self, $message) = @_; + +$self->{'log-function'}->('err', $message); +} + +my sub file_contents_from_archive { +my ($self, $archive, $file) = @_; + +my $run_dir = prepare_run_dir($archive, "file-contents"); + +my $raw; + +eval { + local $CWD = $run_dir; + + $self->{'storage-plugin'}->borg_cmd_extract( + $self->{scfg}, + $self->{storeid}, + $archive, + [$file], + ); + + $raw = PVE::Tools::file_get_contents("${run_dir}/${file}"); +}; +my $err = $@; +eval { _remove_tree($run_dir); }; +log_warning($self, $@) if $@; +die $err if $err; + +return $raw; +} + +# Plugin implementation + +sub new { +my ($class, $storage_plugin, $scfg, $storeid, $log_function) = @_; + +my $self = bless { + scfg => $scfg, + storeid => $storeid, + 'storage-plugin' => $storage_plugin, + 'log-function' => $log_function, +}, $class; + +return $self; +} + +sub provider_name { +my ($self) = @_; + +return "Borg"; +} + +sub job_hook { +my ($self, $phase, $info) = @_; + +if ($phase eq 'start') { + $self->{'job-id'} = $info->{'start-time'}; +} + +return; +} + +sub backup_hook { +my ($self, $phase, $vmid, $vmtype, $info) = @_; + +if ($phase eq 'start') { + $self->{$vmid}->{'task-size'} = 0; +} + +return; +} + +sub backup_get_mechanism { +my ($self, $vmid, $vmtype) = @_; + +return ('block-device', undef) if $vmtype eq 'qemu'; +return ('directory', undef) if $vmtype eq 'lxc'; + +die "unsupported VM type '$vmtype'\n"; +} + +sub backup_get_archive_name { +my ($self, $vmid, $vmtype, $backup_time) = @_; + +return $self->{$vmid}->{archive} = archive_name($vmtype, $
[pve-devel] [RFC qemu-server v2 19/25] backup: implement backup for external providers
The state of the VM's disk images at the time the backup is started is preserved via a snapshot-access block node. Old data is moved to the fleecing image when new guest writes come in. The snapshot-access block node, as well as the associated bitmap in case of incremental backup, will be made available to the external provider. They are exported via NBD and for 'nbd' mechanism, the NBD socket path is passed to the provider, while for 'block-device' mechanism, the NBD export is made accessible as a regular block device first and the bitmap information is made available via a $next_dirty_region->() function. For 'block-device', the 'nbdinfo' binary is required. The provider can indicate that it wants to do an incremental backup by returning the bitmap ID that was used for a previous backup and it will then be told if the bitmap was newly created (either first backup or old bitmap was invalid) or if the bitmap can be reused. The provider then reads the parts of the NBD or block device it needs, either the full disk for full backup, or the dirty parts according to the bitmap for incremental backup. The bitmap has to be respected, reads to other parts of the image will return an error. After backing up each part of the disk, it should be discarded in the export to avoid unnecessary space usage in the fleecing image (requires the storage underlying the fleecing image to support discard too). Signed-off-by: Fiona Ebner --- Changes in v2: * Support 'block-device' mechanism, by exposing the NBD export as a block device via qemu-nbd. * Adapt to API changes, i.e. pass all volumes as well as configuration files to the provider at once. PVE/VZDump/QemuServer.pm | 308 ++- 1 file changed, 307 insertions(+), 1 deletion(-) diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm index 4ad4a154..9daebbc2 100644 --- a/PVE/VZDump/QemuServer.pm +++ b/PVE/VZDump/QemuServer.pm @@ -20,7 +20,7 @@ use PVE::QMPClient; use PVE::Storage::Plugin; use PVE::Storage::PBSPlugin; use PVE::Storage; -use PVE::Tools; +use PVE::Tools qw(run_command); use PVE::VZDump; use PVE::Format qw(render_duration render_bytes); @@ -291,6 +291,8 @@ sub archive { if ($self->{vzdump}->{opts}->{pbs}) { $self->archive_pbs($task, $vmid); +} elsif ($self->{vzdump}->{'backup-provider'}) { + $self->archive_external($task, $vmid); } else { $self->archive_vma($task, $vmid, $filename, $comp); } @@ -1136,6 +1138,23 @@ sub cleanup { # If VM was started only for backup, it is already stopped now. if (PVE::QemuServer::Helpers::vm_running_locally($vmid)) { + if ($task->{cleanup}->{'nbd-stop'}) { + eval { PVE::QemuServer::QMPHelpers::nbd_stop($vmid); }; + $self->logerr($@) if $@; + } + + if (my $info = $task->{cleanup}->{'backup-access-teardown'}) { + my $params = { + 'target-id' => $info->{'target-id'}, + timeout => 60, + success => $info->{success} ? 
JSON::true : JSON::false, + }; + + $self->loginfo("tearing down backup-access"); + eval { mon_cmd($vmid, "backup-access-teardown", $params->%*) }; + $self->logerr($@) if $@; + } + $detach_tpmstate_drive->($task, $vmid); detach_fleecing_images($task->{disks}, $vmid) if $task->{'use-fleecing'}; } @@ -1147,4 +1166,291 @@ sub cleanup { } } +my sub block_device_backup_cleanup { +my ($self, $paths, $cpids) = @_; + +for my $path ($paths->@*) { + eval { run_command(["qemu-nbd", "-d", $path ]); }; + $self->log('warn', "unable to disconnect NBD backup source '$path' - $@") if $@; +} + +my $waited; +my $wait_limit = 5; +for ($waited = 0; $waited < $wait_limit && scalar(keys $cpids->%*); $waited++) { + while ((my $cpid = waitpid(-1, POSIX::WNOHANG)) > 0) { + delete($cpids->{$cpid}); + } + if ($waited == 0) { + kill 15, $_ for keys $cpids->%*; + } + sleep 1; +} +if ($waited == $wait_limit && scalar(keys $cpids->%*)) { + kill 9, $_ for keys $cpids->%*; + sleep 1; + while ((my $cpid = waitpid(-1, POSIX::WNOHANG)) > 0) { + delete($cpids->{$cpid}); + } + $self->log('warn', "unable to collect nbdinfo child process '$_'") for keys $cpids->%*; +} +} + +my sub block_device_backup_prepare { +my ($self, $devicename, $size, $nbd_path, $bitmap_name, $count) = @_; + +my $nbd_info_uri = "nbd+unix:///${devicename}?socket=${nbd_path}"; +my $qemu_nbd_uri = "nbd:unix:${nbd_path}:exportname=${devicename}"; + +my $cpid; +my $error_fh; +my $next_dirty_region; + +# If there is no dirty bitmap, it can be treated as if there's a full dirty one. The output of +# nbdinfo is a list of tuples with offset, length, type, description. The first bit of 'type' is +# set when the bitmap is dirty, see QEMU's docs/int
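To illustrate the provider side of the 'block-device' mechanism described above, here is a rough sketch of reading only the dirty regions; how exactly the device path and the $next_dirty_region->() function are handed to backup_vm() is defined by the provider API, so the parameter plumbing is assumed:

use strict;
use warnings;

use Fcntl qw(O_RDONLY O_WRONLY O_CREAT SEEK_SET);

sub copy_dirty_regions {
    my ($device_path, $next_dirty_region, $target_path) = @_;

    sysopen(my $in_fh, $device_path, O_RDONLY)
        or die "unable to open '$device_path' - $!\n";
    sysopen(my $out_fh, $target_path, O_WRONLY | O_CREAT)
        or die "unable to open '$target_path' - $!\n";

    # With no bitmap, every region is dirty (full backup); with a bitmap only
    # the recorded regions are returned, reads elsewhere would fail.
    while (my ($offset, $length) = $next_dirty_region->()) {
        sysseek($in_fh, $offset, SEEK_SET) // die "seek failed - $!\n";
        sysread($in_fh, my $data, $length) // die "read failed - $!\n";
        sysseek($out_fh, $offset, SEEK_SET) // die "seek failed - $!\n";
        syswrite($out_fh, $data) // die "write failed - $!\n";
        # a real implementation would handle short reads/writes and could
        # discard the region in the export afterwards to free fleecing space
    }

    close($out_fh) or die "closing '$target_path' failed - $!\n";
    close($in_fh);
}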
[pve-devel] [RFC container v2 23/25] backup: implement restore for external providers
First, the provider is asked about what restore mechanism to use. Currently, 'directory' and 'tar' are possible, for restoring either from a directory containing the full filesystem structure (for which rsync is used) or a potentially compressed tar file containing the same. The new functions are copied and adapted from the existing ones for PBS or tar and it might be worth to factor out the common parts. Signed-off-by: Fiona Ebner --- Changes in v2: * Adapt to API changes. src/PVE/LXC/Create.pm | 141 ++ 1 file changed, 141 insertions(+) diff --git a/src/PVE/LXC/Create.pm b/src/PVE/LXC/Create.pm index 117103c..9d1c337 100644 --- a/src/PVE/LXC/Create.pm +++ b/src/PVE/LXC/Create.pm @@ -25,6 +25,24 @@ sub restore_archive { if ($scfg->{type} eq 'pbs') { return restore_proxmox_backup_archive($storage_cfg, $archive, $rootdir, $conf, $no_unpack_error, $bwlimit); } + my $log_function = sub { + my ($log_level, $message) = @_; + my $prefix = $log_level eq 'err' ? 'ERROR' : uc($log_level); + print "$prefix: $message\n"; + }; + my $backup_provider = + PVE::Storage::new_backup_provider($storage_cfg, $storeid, $log_function); + if ($backup_provider) { + return restore_external_archive( + $backup_provider, + $storeid, + $volname, + $rootdir, + $conf, + $no_unpack_error, + $bwlimit, + ); + } } $archive = PVE::Storage::abs_filesystem_path($storage_cfg, $archive) if $archive ne '-'; @@ -118,6 +136,55 @@ sub restore_tar_archive { die $err if $err && !$no_unpack_error; } +sub restore_external_archive { +my ($backup_provider, $storeid, $volname, $rootdir, $conf, $no_unpack_error, $bwlimit) = @_; + +my ($mechanism, $vmtype) = $backup_provider->restore_get_mechanism($volname, $storeid); +die "cannot restore non-LXC guest of type '$vmtype'\n" if $vmtype ne 'lxc'; + +my $info = $backup_provider->restore_container_init($volname, $storeid, {}); +eval { + if ($mechanism eq 'tar') { + my $tar_path = $info->{'tar-path'} + or die "did not get path to tar file from backup provider\n"; + die "not a regular file '$tar_path'" if !-f $tar_path; + restore_tar_archive($tar_path, $rootdir, $conf, $no_unpack_error, $bwlimit); + } elsif ($mechanism eq 'directory') { + my $directory = $info->{'archive-directory'} + or die "did not get path to archive directory from backup provider\n"; + die "not a directory '$directory'" if !-d $directory; + + my $rsync = ['rsync', '--stats', '-h', '-X', '-A', '--numeric-ids', '-aH', '--delete', + '--no-whole-file', '--sparse', '--one-file-system', '--relative']; + push $rsync->@*, '--bwlimit', $bwlimit if $bwlimit; + push $rsync->@*, "${directory}/./", $rootdir; + + my $transferred = ''; + my $outfunc = sub { + return if $_[0] !~ /^Total transferred file size: (.+)$/; + $transferred = $1; + }; + my $errfunc = sub { log_warn($_[0]); }; + + my $starttime = time(); + PVE::Tools::run_command($rsync, outfunc => $outfunc, errfunc => $errfunc); + my $delay = time () - $starttime; + + print "sync finished - transferred ${transferred} in ${delay}s\n"; + } else { + die "mechanism '$mechanism' requested by backup provider is not supported for LXCs\n"; + } +}; +my $err = $@; +eval { $backup_provider->restore_container_cleanup($volname, $storeid, {}); }; +if (my $cleanup_err = $@) { + die $cleanup_err if !$err; + warn $cleanup_err; +} +die $err if $err; + +} + sub recover_config { my ($storage_cfg, $volid, $vmid) = @_; @@ -126,6 +193,8 @@ sub recover_config { my $scfg = PVE::Storage::storage_check_enabled($storage_cfg, $storeid); if ($scfg->{type} eq 'pbs') { return 
recover_config_from_proxmox_backup($storage_cfg, $volid, $vmid); + } elsif (PVE::Storage::new_backup_provider($storage_cfg, $storeid, sub {})) { + return recover_config_from_external_backup($storage_cfg, $volid, $vmid); } } @@ -200,6 +269,26 @@ sub recover_config_from_tar { return wantarray ? ($conf, $mp_param) : $conf; } +sub recover_config_from_external_backup { +my ($storage_cfg, $volid, $vmid) = @_; + +$vmid //= 0; + +my $raw = PVE::Storage::extract_vzdump_config($storage_cfg, $volid); + +my $conf = PVE::LXC::Config::parse_pct_config("/lxc/${vmid}.conf" , $raw); + +delete $conf->{snapshots}; + +my $mp_param = {}; +PVE::LXC::Config->foreach_volume($conf, sub { + my ($ms, $mountpoint) = @_; + $mp_param->{$ms} = $conf->{$ms}; +}); + +
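For completeness, a sketch of the provider-side counterpart that the restore code above expects, shown for the 'tar' mechanism; the storage layout (a plain tar file addressed by $volname) is only an assumption for illustration:

# pick 'tar' for containers; the restore code above then only needs 'tar-path'
sub restore_get_mechanism {
    my ($self, $volname, $storeid) = @_;

    return ('tar', 'lxc'); # (mechanism, guest type)
}

sub restore_container_init {
    my ($self, $volname, $storeid, $info) = @_;

    my $tar_path = "$self->{scfg}->{path}/$volname"; # assumed layout
    die "backup archive '$tar_path' not found\n" if !-f $tar_path;

    return { 'tar-path' => $tar_path };
}

sub restore_container_cleanup {
    my ($self, $volname, $storeid, $info) = @_;

    return; # nothing to clean up for a plain file
}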
[pve-devel] applied: [PATCH docs] pve-network: correct language errors
Am 13.08.24 um 12:23 schrieb Alexander Zeidler: > Signed-off-by: Alexander Zeidler > --- > pve-network.adoc | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/pve-network.adoc b/pve-network.adoc > index acdcf39..434430d 100644 > --- a/pve-network.adoc > +++ b/pve-network.adoc > @@ -447,7 +447,7 @@ slaves in the single logical bonded interface such that > different > network-peers use different MAC addresses for their network packet > traffic. > > -If your switch support the LACP (IEEE 802.3ad) protocol then we recommend > using > +If your switch supports the LACP (IEEE 802.3ad) protocol then we recommend > using > the corresponding bonding mode (802.3ad). Otherwise you should generally use > the > active-backup mode. > > @@ -490,7 +490,7 @@ iface vmbr0 inet static > > > [thumbnail="default-network-setup-bond.svg"] > -Another possibility it to use the bond directly as bridge port. > +Another possibility is to use the bond directly as bridge port. > This can be used to make the guest network fault-tolerant. > > .Example: Use a bond as bridge port applied, thanks! While adding the missing comma Lukas noted and his R-b. ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] applied: [PATCH docs] local-zfs: correct language errors
Am 13.08.24 um 12:23 schrieb Alexander Zeidler: > Signed-off-by: Alexander Zeidler > --- > local-zfs.adoc | 14 +++--- > 1 file changed, 7 insertions(+), 7 deletions(-) > applied, thanks! Squashed in a change to wrap lines that would've been longer than 80 characters. ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
Re: [pve-devel] applied: [PATCH docs] pve-network: correct language errors
Am 13.08.24 um 15:57 schrieb Fiona Ebner: > Am 13.08.24 um 12:23 schrieb Alexander Zeidler: >> Signed-off-by: Alexander Zeidler >> --- >> pve-network.adoc | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/pve-network.adoc b/pve-network.adoc >> index acdcf39..434430d 100644 >> --- a/pve-network.adoc >> +++ b/pve-network.adoc >> @@ -447,7 +447,7 @@ slaves in the single logical bonded interface such that >> different >> network-peers use different MAC addresses for their network packet >> traffic. >> >> -If your switch support the LACP (IEEE 802.3ad) protocol then we recommend >> using >> +If your switch supports the LACP (IEEE 802.3ad) protocol then we recommend >> using >> the corresponding bonding mode (802.3ad). Otherwise you should generally >> use the >> active-backup mode. >> >> @@ -490,7 +490,7 @@ iface vmbr0 inet static >> >> >> [thumbnail="default-network-setup-bond.svg"] >> -Another possibility it to use the bond directly as bridge port. >> +Another possibility is to use the bond directly as bridge port. >> This can be used to make the guest network fault-tolerant. >> >> .Example: Use a bond as bridge port > > applied, thanks! While adding the missing comma Lukas noted and his R-b. I also added a follow-up using "as the bridge port" instead of "as bridge port" while at it. ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
Re: [pve-devel] [RFC firewall/proxmox{-ve-rs, -firewall, -perl-rs} 00/21] autogenerate ipsets for sdn objects
On Wed Jun 26, 2024 at 2:15 PM CEST, Stefan Hanreich wrote: > This patch series adds support for autogenerating ipsets for SDN objects. It > autogenerates ipsets for every VNet as follows: > > * ipset containing all IP ranges of the VNet > * ipset containing all gateways of the VNet > * ipset containing all IP ranges of the subnet - except gateways > * ipset containing all dhcp ranges of the vnet Gave the entire RFC a spin - apart from a few minor version bumps and one small rejected hunk, everything applied and built just fine. Encountered no other issues in that regard. My full review can be found below. Review: RFC: autogenerate ipsets for sdn objects Building As I also mention in some patches inline, a couple version bumps were necessary to get everything to build correctly. Those are as follows: - proxmox-ve-rs - proxmox-sys = "0.6.2" - serde_with = "3.8.1" - proxmox-firewall - proxmox-sys = "0.6.2" Additionally, patch 21 contained one rejected hunk: diff a/pve-rs/Cargo.toml b/pve-rs/Cargo.toml(rejected hunks) @@ -43,3 +43,4 @@ proxmox-subscription = "0.4" proxmox-sys = "0.5" proxmox-tfa = { version = "4.0.4", features = ["api"] } proxmox-time = "2" +proxmox-ve-config = { version = "0.1.0" } This got rejected because `proxmox-sys` was already bumped to "0.6" in `pve-rs`. Simply adding the dependency manually allowed me to build `pve-rs`. Testing --- You saw a lot of this off-list already (thanks again for the help btw), but want to mention it here as well, so it doesn't get lost. Setup * - Installed all packages on my development VM on which I then performed my tests. - No issues were encountered during installation. - Installed `dnsmasq` on the VM. - Disabled `dnsmasq` permanently via `systemctl disable --now dnsmasq`. Simple Zone & VNet ** - Added a new simple zone in Datacenter > SDN > Zones. - Enabled automatic DHCP on the zone. - Added a new VNet named `vnet0` and assigned it to the new simple zone. - Subnet: 172.16.100.0/24 - Gateway: 172.16.100.1 - DHCP Range: 172.16.100.100 - 172.16.100.200 - SNAT: enabled Cluster Firewall - Edited cluster firewall rules. - Immediately noticed that the new ipsets appeared in the UI and were able to be selected. - Configured a basic firewall rule as example. - Type: out - Action: REJECT - Macro: Web - Source: +dc/vnet0-all - Log: info - While this does not block web traffic for VMs, this was nevertheless useful to check whether the ipsets and `iptables` FW config were generated correctly. - Relevant output of `iptables-save`: # iptables-save | grep -E '\-\-dport (80|443) ' -A PVEFW-HOST-OUT -p tcp -m set --match-set PVEFW-0-vnet0-all-v4 src -m tcp --dport 80 -m limit --limit 1/sec -j NFLOG --nflog-prefix ":0:6:PVEFW-HOST-OUT: REJECT: " -A PVEFW-HOST-OUT -p tcp -m set --match-set PVEFW-0-vnet0-all-v4 src -m tcp --dport 80 -j PVEFW-reject -A PVEFW-HOST-OUT -p tcp -m set --match-set PVEFW-0-vnet0-all-v4 src -m tcp --dport 443 -m limit --limit 1/sec -j NFLOG --nflog-prefix ":0:6:PVEFW-HOST-OUT: REJECT: " -A PVEFW-HOST-OUT -p tcp -m set --match-set PVEFW-0-vnet0-all-v4 src -m tcp --dport 443 -j PVEFW-reject - The set passed via `--match-set` also appears in `ipset -L`: # ipset -L | grep -A 8 'PVEFW-0-vnet0-all-v4' Name: PVEFW-0-vnet0-all-v4 Type: hash:net Revision: 7 Header: family inet hashsize 64 maxelem 64 bucketsize 12 initval 0xa4c09bc0 Size in memory: 504 References: 12 Number of entries: 1 Members: 172.16.100.0/24 - Very nice. 
- All of the other ipsets for the VNet also appear in `ipset -L` as expected (-all, -dhcp, -gateway for v4 and v6 each) - When removing `+dc/vnet0-all` from the firewall rule and leaving the source empty, outgoing web traffic was blocked, as expected. - Keeping it there does *not* block outgoing web traffic, as expected. Host Firewall * - The cluster firewall rule above was deactivated. - Otherwise, the exact same steps as above were performed, just on the host firewall. - The results are exactly the same, as expected. VM / CT ipsets ** - All containers and VMs correctly got their own ipset (checked with `ipset -L`). - Assigning a VM to the VNet makes it show up in IPAM and also updates its corresponding ipset correctly. - Adding the same firewall rule as above to a VM blocks the VM's outgoing web traffic, as expected. - Changing the source to the VM's ipset, in this case `+guest-ipam-102`, also blocks the VM's outgoing web traffic, as expected. - Output of `iptables-save | grep -E '\-\-dport (80|443) '` on the node: # iptables-save | grep -E '\-\-dport (80|443) '
Re: [pve-devel] [PATCH proxmox-ve-rs 01/21] debian: add files for packaging
On Wed Jun 26, 2024 at 2:15 PM CEST, Stefan Hanreich wrote: > Since we now have a standalone repository for Proxmox VE related > crates, add the required files for packaging the crates contained in > this repository. > > Signed-off-by: Stefan Hanreich > --- > .cargo/config.toml | 5 ++ > .gitignore | 8 +++ > Cargo.toml | 17 +++ > Makefile | 69 ++ > build.sh | 35 + > bump.sh| 44 > proxmox-ve-config/Cargo.toml | 16 +++--- > proxmox-ve-config/debian/changelog | 5 ++ > proxmox-ve-config/debian/control | 43 > proxmox-ve-config/debian/copyright | 19 +++ > proxmox-ve-config/debian/debcargo.toml | 4 ++ > 11 files changed, 255 insertions(+), 10 deletions(-) > create mode 100644 .cargo/config.toml > create mode 100644 .gitignore > create mode 100644 Cargo.toml > create mode 100644 Makefile > create mode 100755 build.sh > create mode 100755 bump.sh > create mode 100644 proxmox-ve-config/debian/changelog > create mode 100644 proxmox-ve-config/debian/control > create mode 100644 proxmox-ve-config/debian/copyright > create mode 100644 proxmox-ve-config/debian/debcargo.toml > > diff --git a/.cargo/config.toml b/.cargo/config.toml > new file mode 100644 > index 000..3b5b6e4 > --- /dev/null > +++ b/.cargo/config.toml > @@ -0,0 +1,5 @@ > +[source] > +[source.debian-packages] > +directory = "/usr/share/cargo/registry" > +[source.crates-io] > +replace-with = "debian-packages" > diff --git a/.gitignore b/.gitignore > new file mode 100644 > index 000..d72b68b > --- /dev/null > +++ b/.gitignore > @@ -0,0 +1,8 @@ > +/target > +/*/target > +Cargo.lock > +**/*.rs.bk > +/*.buildinfo > +/*.changes > +/build > +/*-deb > diff --git a/Cargo.toml b/Cargo.toml > new file mode 100644 > index 000..ab23d89 > --- /dev/null > +++ b/Cargo.toml > @@ -0,0 +1,17 @@ > +[workspace] > +members = [ > +"proxmox-ve-config", > +] > +exclude = [ > +"build", > +] > +resolver = "2" > + > +[workspace.package] > +authors = ["Proxmox Support Team "] > +edition = "2021" > +license = "AGPL-3" > +homepage = "https://proxmox.com"; > +exclude = [ "debian" ] > +rust-version = "1.70" > + > diff --git a/Makefile b/Makefile > new file mode 100644 > index 000..0da9b74 > --- /dev/null > +++ b/Makefile > @@ -0,0 +1,69 @@ > +# Shortcut for common operations: > + > +CRATES != echo proxmox-*/Cargo.toml | sed -e 's|/Cargo.toml||g' > + > +# By default we just run checks: > +.PHONY: all > +all: check > + > +.PHONY: deb > +deb: $(foreach c,$(CRATES), $c-deb) > + echo $(foreach c,$(CRATES), $c-deb) > + lintian build/*.deb > + > +.PHONY: dsc > +dsc: $(foreach c,$(CRATES), $c-dsc) > + echo $(foreach c,$(CRATES), $c-dsc) > + lintian build/*.dsc > + > +.PHONY: autopkgtest > +autopkgtest: $(foreach c,$(CRATES), $c-autopkgtest) > + > +.PHONY: dinstall > +dinstall: > + $(MAKE) clean > + $(MAKE) deb > + sudo -k dpkg -i build/librust-*.deb > + > +%-deb: > + ./build.sh $* > + touch $@ > + > +%-dsc: > + BUILDCMD='dpkg-buildpackage -S -us -uc -d' ./build.sh $* > + touch $@ > + > +%-autopkgtest: > + autopkgtest build/$* build/*.deb -- null > + touch $@ > + > +.PHONY: check > +check: > + cargo test > + > +# Prints a diff between the current code and the one rustfmt would produce > +.PHONY: fmt > +fmt: > + cargo +nightly fmt -- --check > + > +# Doc without dependencies > +.PHONY: doc > +doc: > + cargo doc --no-deps > + > +.PHONY: clean > +clean: > + cargo clean > + rm -rf build/ > + rm -f -- *-deb *-dsc *-autopkgtest *.build *.buildinfo *.changes > + > +.PHONY: update > +update: > + cargo update > + > +%-upload: %-deb > + cd build; \ > + dcmd --deb rust-$*_*.changes \ > + | grep 
-v '.changes$$' \ > + | tar -cf "$@.tar" -T-; \ > + cat "$@.tar" | ssh -X repo...@repo.proxmox.com upload --product > devel --dist bookworm > diff --git a/build.sh b/build.sh > new file mode 100755 > index 000..39a8302 > --- /dev/null > +++ b/build.sh > @@ -0,0 +1,35 @@ > +#!/bin/sh > + > +set -eux > + > +export CARGO=/usr/bin/cargo > +export RUSTC=/usr/bin/rustc > + > +CRATE=$1 > +BUILDCMD=${BUILDCMD:-"dpkg-buildpackage -b -uc -us"} > + > +mkdir -p build > +echo system >build/rust-toolchain > +rm -rf "build/${CRATE}" > + > +CONTROL="$PWD/${CRATE}/debian/control" > + > +if [ -e "$CONTROL" ]; then > +# check but only warn, debcargo fails anyway if crates are missing > +dpkg-checkbuilddeps $PWD/${CRATE}/debian/control || true > +# rm -f "$PWD/${CRATE}/debian/control" > +fi > + > +debcargo package \ > +--config "$PWD/${CRATE}/debian/debcargo.toml" \ > +--changelog-ready \ > +--no-overlay-write-back \ > +--directory "$PWD/build/${CRATE}" \ > +"${CRATE}" \ > +
Re: [pve-devel] [PATCH proxmox-ve-rs 02/21] firewall: add ip range types
On Wed Jun 26, 2024 at 2:15 PM CEST, Stefan Hanreich wrote: > Currently we are using tuples to represent IP ranges which is > suboptimal. Validation logic and invariant checking needs to happen at > every site using the IP range rather than having a unified struct for > enforcing those invariants. That's something I completely support; as you know I'm a fan of representing state / invariants / etc. via types ;) > > Signed-off-by: Stefan Hanreich > --- > .../src/firewall/types/address.rs | 230 +- > 1 file changed, 228 insertions(+), 2 deletions(-) > > diff --git a/proxmox-ve-config/src/firewall/types/address.rs > b/proxmox-ve-config/src/firewall/types/address.rs > index e48ac1b..ddf4652 100644 > --- a/proxmox-ve-config/src/firewall/types/address.rs > +++ b/proxmox-ve-config/src/firewall/types/address.rs > @@ -1,9 +1,9 @@ > -use std::fmt; > +use std::fmt::{self, Display}; > use std::net::{IpAddr, Ipv4Addr, Ipv6Addr}; > use std::ops::Deref; > > use anyhow::{bail, format_err, Error}; > -use serde_with::DeserializeFromStr; > +use serde_with::{DeserializeFromStr, SerializeDisplay}; > > #[derive(Clone, Copy, Debug, Eq, PartialEq)] > pub enum Family { > @@ -239,6 +239,202 @@ impl> From for Ipv6Cidr { > } > } > > +#[derive(Clone, Copy, Debug, PartialOrd, Ord, PartialEq, Eq, Hash)] > +pub enum IpRangeError { > +MismatchedFamilies, > +StartGreaterThanEnd, > +InvalidFormat, > +} > + > +impl std::error::Error for IpRangeError {} > + > +impl Display for IpRangeError { > +fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { > +f.write_str(match self { > +IpRangeError::MismatchedFamilies => "mismatched ip address > families", > +IpRangeError::StartGreaterThanEnd => "start is greater than end", > +IpRangeError::InvalidFormat => "invalid ip range format", > +}) > +} > +} > + > +/// represents a range of IPv4 or IPv6 addresses Small thing: I'd prefer "Represents a range of IPv4 or IPv6 addresses." instead. Mainly because most docstrings are written that way, and you do it in later patches in a couple places as well anyways. Just gonna mention this here once as you do this a couple times in this series in order to avoid unnecessary noise. IMO it's a really minor thing, but since you told me off-list that you're in the process of neatly documenting everything, I thought I'd mention it here. > +/// > +/// For more information see [`AddressRange`] > +#[derive(Clone, Copy, Debug, PartialEq, Eq, SerializeDisplay, > DeserializeFromStr)] > +pub enum IpRange { > +V4(AddressRange), > +V6(AddressRange), > +} > + > +impl IpRange { > +/// returns the family of the IpRange > +pub fn family(&self) -> Family { > +match self { > +IpRange::V4(_) => Family::V4, > +IpRange::V6(_) => Family::V6, > +} > +} > + > +/// creates a new [`IpRange`] from two [`IpAddr`] > +/// > +/// # Errors > +/// > +/// This function will return an error if start and end IP address are > not from the same family.
> +pub fn new(start: impl Into, end: impl Into) -> > Result { > +match (start.into(), end.into()) { > +(IpAddr::V4(start), IpAddr::V4(end)) => Self::new_v4(start, end), > +(IpAddr::V6(start), IpAddr::V6(end)) => Self::new_v6(start, end), > +_ => Err(IpRangeError::MismatchedFamilies), > +} > +} > + > +/// construct a new Ipv4 Range > +pub fn new_v4( > +start: impl Into, > +end: impl Into, > +) -> Result { > +Ok(IpRange::V4(AddressRange::new_v4(start, end)?)) > +} > + > +pub fn new_v6( > +start: impl Into, > +end: impl Into, > +) -> Result { > +Ok(IpRange::V6(AddressRange::new_v6(start, end)?)) > +} > +} > + > +impl std::str::FromStr for IpRange { > +type Err = IpRangeError; > + > +fn from_str(s: &str) -> Result { > +if let Ok(range) = s.parse() { > +return Ok(IpRange::V4(range)); > +} > + > +if let Ok(range) = s.parse() { > +return Ok(IpRange::V6(range)); > +} > + > +Err(IpRangeError::InvalidFormat) > +} > +} > + > +impl fmt::Display for IpRange { > +fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { > +match self { > +IpRange::V4(range) => range.fmt(f), > +IpRange::V6(range) => range.fmt(f), > +} > +} > +} > + > +/// represents a range of IP addresses from start to end > +/// > +/// This type is for encapsulation purposes for the [`IpRange`] enum and > should be instantiated via > +/// that enum. > +/// > +/// # Invariants > +/// > +/// * start and end have the same IP address family > +/// * start is lesser than or equal to end > +/// > +/// # Textual representation > +/// > +/// Two IP addresses separated by a hyphen, e.g.: `127.0.0.1-127.0.0.255` > +#[derive(Clone, Copy, Debug,
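Just to sketch how this looks from a call site's perspective (untested, written only against the constructors and trait impls visible in this patch, so take it with a grain of salt):

    use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};
    // (plus the appropriate `use` of IpRange, IpRangeError and Family from this crate)

    // the family invariant is checked in exactly one place ...
    let mixed = IpRange::new(
        IpAddr::V4(Ipv4Addr::new(10, 0, 0, 1)),
        IpAddr::V6(Ipv6Addr::LOCALHOST),
    );
    assert_eq!(mixed, Err(IpRangeError::MismatchedFamilies));

    // ... and parsing the textual representation goes through the same validation
    let range: IpRange = "127.0.0.1-127.0.0.255".parse().expect("valid range");
    assert_eq!(range.family(), Family::V4);

Having that in one spot instead of at every call site is exactly the kind of cleanup I like to see.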
Re: [pve-devel] [PATCH proxmox-ve-rs 05/21] iprange: add methods for converting an ip range to cidrs
On Wed Jun 26, 2024 at 2:15 PM CEST, Stefan Hanreich wrote: > This is mainly used in proxmox-perl-rs, so the generated ipsets can be > used in pve-firewall where only CIDRs are supported. > > Signed-off-by: Stefan Hanreich > --- > .../src/firewall/types/address.rs | 818 ++ > 1 file changed, 818 insertions(+) > > diff --git a/proxmox-ve-config/src/firewall/types/address.rs > b/proxmox-ve-config/src/firewall/types/address.rs > index 8db3942..3238601 100644 > --- a/proxmox-ve-config/src/firewall/types/address.rs > +++ b/proxmox-ve-config/src/firewall/types/address.rs > @@ -303,6 +303,17 @@ impl IpRange { > ) -> Result { > Ok(IpRange::V6(AddressRange::new_v6(start, end)?)) > } > + > +/// converts an IpRange into the minimal amount of CIDRs > +/// > +/// see the concrete implementations of [`AddressRange`] or > [`AddressRange`] > +/// respectively > +pub fn to_cidrs(&self) -> Vec { > +match self { > +IpRange::V4(range) => > range.to_cidrs().into_iter().map(Cidr::from).collect(), > +IpRange::V6(range) => > range.to_cidrs().into_iter().map(Cidr::from).collect(), > +} > +} > } > > impl std::str::FromStr for IpRange { > @@ -362,6 +373,71 @@ impl AddressRange { > > Ok(Self { start, end }) > } > + > +/// returns the minimum amount of CIDRs that exactly represent the range > +/// > +/// The idea behind this algorithm is as follows: > +/// > +/// Start iterating with current = start of the IP range > +/// > +/// Find two netmasks > +/// * The largest CIDR that the current IP can be the first of > +/// * The largest CIDR that *only* contains IPs from current - end > +/// > +/// Add the smaller of the two CIDRs to our result and current to the > first IP that is in > +/// the range but not in the CIDR we just added. Proceed until we > reached the end of the IP > +/// range. Would maybe prefer some more inline formatting / minor rewording regarding the algorithm's steps above, simply for readability's sake (e.g. when rendering the docs). Sort of like: 1. Start iteration: Set `current` to `start` of the IP range 2. Find two netmasks: - The largest CIDR that the `current` IP can be the first of - The largest CIDR that *only* contains IPs from `current` to `end` 3. Add the smaller of the two CIDRs to our result and `current` to the first IP that is in the range but *not* in the CIDR we just added. Proceed until we reached the end of the IP range. Again, just a small thing, but thought I'd mention it.
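To convince myself that the steps above produce the minimal decomposition, I also walked through a small example; leaving it here in case it is useful for others reading this patch. This is only a rough, untested sketch based on the constructors from patch 02 - the expected output in the comment is what I get when applying the algorithm by hand, not something I verified against the actual implementation:

    use std::net::Ipv4Addr;
    // (plus the appropriate `use` of IpRange from this crate)

    // e.g. the DHCP range from my test setup: 172.16.100.100 - 172.16.100.200 (101 addresses)
    let range = IpRange::new_v4(
        Ipv4Addr::new(172, 16, 100, 100),
        Ipv4Addr::new(172, 16, 100, 200),
    )
    .expect("valid range");

    // Applying the algorithm by hand, this should yield the minimal set of 6 CIDRs:
    //   172.16.100.100/30, 172.16.100.104/29, 172.16.100.112/28,
    //   172.16.100.128/26, 172.16.100.192/29, 172.16.100.200/32
    for cidr in range.to_cidrs() {
        println!("{cidr}");
    }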
> +/// > +pub fn to_cidrs(&self) -> Vec { > +let mut cidrs = Vec::new(); > + > +let mut current = u32::from_be_bytes(self.start.octets()); > +let end = u32::from_be_bytes(self.end.octets()); > + > +if current == end { > +// valid Ipv4 since netmask is 32 > +cidrs.push(Ipv4Cidr::new(current, 32).unwrap()); > +return cidrs; > +} > + > +// special case this, since this is the only possibility of overflow > +// when calculating delta_min_mask - makes everything a lot easier > +if current == u32::MIN && end == u32::MAX { > +// valid Ipv4 since it is `0.0.0.0/0` > +cidrs.push(Ipv4Cidr::new(current, 0).unwrap()); > +return cidrs; > +} > + > +while current <= end { > +// netmask of largest CIDR that current IP can be the first of > +// cast is safe, because trailing zeroes can at most be 32 > +let current_max_mask = IPV4_LENGTH - (current.trailing_zeros() > as u8); > + > +// netmask of largest CIDR that *only* contains IPs of the > remaining range > +// is at most 32 due to unwrap_or returning 32 and ilog2 being > at most 31 > +let delta_min_mask = ((end - current) + 1) // safe due to > special case above > +.checked_ilog2() // should never occur due to special case, > but for good measure > +.map(|mask| IPV4_LENGTH - mask as u8) > +.unwrap_or(IPV4_LENGTH); > + > +// at most 32, due to current/delta being at most 32 > +let netmask = u8::max(current_max_mask, delta_min_mask); > + > +// netmask is at most 32, therefore safe to unwrap > +cidrs.push(Ipv4Cidr::new(current, netmask).unwrap()); > + > +let delta = 2u32.saturating_pow((IPV4_LENGTH - netmask).into()); > + > +if let Some(result) = current.checked_add(delta) { > +current = result > +} else { > +// we reached the end of IP address space > +break; > +} > +} > + > +cidrs > +} > } > > impl AddressRange { > @@ -377,6 +453,61 @@ impl AddressRange { > > Ok(Self { start, end }) > } > + > +/// returns the minimum amount
Re: [pve-devel] [PATCH proxmox-ve-rs 10/21] sdn: add ipam module
On Wed Jun 26, 2024 at 2:15 PM CEST, Stefan Hanreich wrote: > This module includes structs for representing the JSON schema from the > PVE ipam. Those can be used to parse the current IPAM state. > > We also include a general Ipam struct, and provide a method for > converting the PVE IPAM to the general struct. The idea behind this > is that we have multiple IPAM plugins in PVE and will likely add > support for importing them in the future. With the split, we can have > our dedicated structs for representing the different data formats from > the different IPAM plugins and then convert them into a common > representation that can then be used internally, decoupling the > concrete plugin from the code using the IPAM configuration. Big fan of this - as I had already mentioned, I find it always nice to have different types for such things. IMO it would be neat if those types were logically grouped though, e.g. the types for PVE could live in a separate `mod` inside the file. Or, if you want, you can also convert `ipam.rs` to a `ipam/mod.rs` and add further files depending on the types of different representations there. Either solution would make it harder for these types to become "intermingled" in the future, IMO. So this kind of grouping would simply serve as a "decent barrier" between the separate representations. Perhaps the module(s) could then also use a little bit of developer documentation or something that describes how and why the types are organized that way. It's probably best to just add `mod`s for now, as those can be split up into files later anyway. (Just don't `use super::*;` :P ) > > Enforcing the invariants the way we currently do adds a bit of runtime > complexity when building the object, but we get the upside of never > being able to construct an invalid struct. For the amount of entries > the ipam usually has, this should be fine. Should it turn out to be > not performant enough we could always add a HashSet for looking up > values and speeding up the validation. For now, I wanted to avoid the > additional complexity. 
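To illustrate the kind of grouping I mean, here's a very rough sketch - module layout only, all type and field names below are made up and not a suggestion for the actual contents:

    pub mod ipam {
        /// Plugin-specific representations get their own submodule ...
        pub mod pve {
            /// Mirrors the JSON schema written by the PVE IPAM plugin (fields are placeholders).
            #[derive(Debug)]
            pub struct IpamJsonData {
                pub zones: Vec<String>,
            }
        }

        /// ... while the common, plugin-independent representation stays at the top level.
        #[derive(Debug, Default)]
        pub struct Ipam {
            pub entries: Vec<String>,
        }

        impl From<pve::IpamJsonData> for Ipam {
            fn from(json: pve::IpamJsonData) -> Self {
                Ipam { entries: json.zones }
            }
        }
    }

That keeps the conversion boundary between the plugin format and the common representation obvious, even once more IPAM plugins are imported later on.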
> > Signed-off-by: Stefan Hanreich > --- > .../src/firewall/types/address.rs | 8 + > proxmox-ve-config/src/guest/vm.rs | 4 + > proxmox-ve-config/src/sdn/ipam.rs | 330 ++ > proxmox-ve-config/src/sdn/mod.rs | 2 + > 4 files changed, 344 insertions(+) > create mode 100644 proxmox-ve-config/src/sdn/ipam.rs > > diff --git a/proxmox-ve-config/src/firewall/types/address.rs > b/proxmox-ve-config/src/firewall/types/address.rs > index a0b82c5..3ad1a7a 100644 > --- a/proxmox-ve-config/src/firewall/types/address.rs > +++ b/proxmox-ve-config/src/firewall/types/address.rs > @@ -61,6 +61,14 @@ impl Cidr { > pub fn is_ipv6(&self) -> bool { > matches!(self, Cidr::Ipv6(_)) > } > + > +pub fn contains_address(&self, ip: &IpAddr) -> bool { > +match (self, ip) { > +(Cidr::Ipv4(cidr), IpAddr::V4(ip)) => cidr.contains_address(ip), > +(Cidr::Ipv6(cidr), IpAddr::V6(ip)) => cidr.contains_address(ip), > +_ => false, > +} > +} > } > > impl fmt::Display for Cidr { > diff --git a/proxmox-ve-config/src/guest/vm.rs > b/proxmox-ve-config/src/guest/vm.rs > index a7ea9bb..6a706c7 100644 > --- a/proxmox-ve-config/src/guest/vm.rs > +++ b/proxmox-ve-config/src/guest/vm.rs > @@ -17,6 +17,10 @@ static LOCAL_PART: [u8; 8] = [0xFE, 0x80, 0x00, 0x00, > 0x00, 0x00, 0x00, 0x00]; > static EUI64_MIDDLE_PART: [u8; 2] = [0xFF, 0xFE]; > > impl MacAddress { > +pub fn new(address: [u8; 6]) -> Self { > +Self(address) > +} > + > /// generates a link local IPv6-address according to RFC 4291 (Appendix > A) > pub fn eui64_link_local_address(&self) -> Ipv6Addr { > let head = &self.0[..3]; > diff --git a/proxmox-ve-config/src/sdn/ipam.rs > b/proxmox-ve-config/src/sdn/ipam.rs > new file mode 100644 > index 000..682bbe7 > --- /dev/null > +++ b/proxmox-ve-config/src/sdn/ipam.rs > @@ -0,0 +1,330 @@ > +use std::{ > +collections::{BTreeMap, HashMap}, > +error::Error, > +fmt::Display, > +net::IpAddr, > +}; > + > +use serde::Deserialize; > + > +use crate::{ > +firewall::types::Cidr, > +guest::{types::Vmid, vm::MacAddress}, > +sdn::{SdnNameError, SubnetName, ZoneName}, > +}; > + > +/// struct for deserializing a gateway entry in PVE IPAM > +/// > +/// They are automatically generated by the PVE SDN module when creating a > new subnet. > +#[derive(Clone, Debug, Deserialize, PartialEq, Eq, Hash, PartialOrd, Ord)] > +pub struct IpamJsonDataGateway { > +#[serde(rename = "gateway")] > +_gateway: u8, > +} > + > +/// struct for deserializing a guest entry in PVE IPAM > +/// > +/// They are automatically created when adding a guest to a VNet that has a > Subnet with DHCP > +/// configured. > +#[derive(Clone, Debug, Deserialize, PartialEq, Eq, Hash, PartialOrd, Ord)] > +pub struct IpamJsonDataVm { > +
Re: [pve-devel] issues with Virtio-SCSI devicde on Proxmox...
--- Begin Message --- Hi, Currently there is no way to add custom options for the virtio devices command line from the vm config, so it should be patched to add support for openvms os and add special tuning. For example: https://git.proxmox.com/?p=qemu-server.git;a=blob_plain;f=PVE/QemuServer.pm;hb=HEAD

sub print_netdevice_full {
    ...
    if ($conf->{ostype} && $conf->{ostype} eq 'openvms') {
        $tmpstr .= ",disable_legacy=on";
    }
    ...
}

>>I have tried both “virtio-scsi-pci” and “virtio-scsi-single”, but no >>luck. The emulated virtio-scsi device is a legacy device >> >> >>Questions: >>* is there a way to configure a modern virtio-scsi devcie (i.e. >>disable_legacy=on) ? >> Do you mean that currently on proxmox, the default is legacy mode and not modern? I was pretty sure that qemu was defaulting to modern, and that you need to use ",disable-legacy=off,disable-modern=true" to switch to legacy. (and AFAIK, the proxmox qemu version doesn't have any patch forcing legacy) but maybe I'm wrong, maybe it's only enabled by default with pci-express (using the q35 machine model)? https://github.com/qemu/qemu/blob/9eb51530c12ae645b91e308d16196c68563ea883/docs/pcie.txt "Virtio devices plugged into PCI Express ports are PCI Express devices and have "1.0" behavior by default without IO support. In both cases disable-legacy and disable-modern properties can be used to override the behaviour." in qemu code I see: https://github.com/qemu/qemu/blob/9eb51530c12ae645b91e308d16196c68563ea883/hw/core/machine.c#L258 GlobalProperty hw_compat_2_6[] = { { "virtio-mmio", "format_transport_address", "off" }, /* Optional because not all virtio-pci devices support legacy mode */ { "virtio-pci", "disable-modern", "on", .optional = true }, { "virtio-pci", "disable-legacy", "off", .optional = true }, }; const size_t hw_compat_2_6_len = G_N_ELEMENTS(hw_compat_2_6); so I'm really not sure what the real default is. Do you have the qemu command line of another kvm hypervisor (ovirt, rhev, ...) where it's working correctly? Maybe they are enabling the modern bits manually, when a virtio device supports it? Original message From: Christian Moser Reply-to: Proxmox VE development discussion To: pve-devel@lists.proxmox.com Cc: Christian Moser Subject: [pve-devel] issues with Virtio-SCSI devicde on Proxmox... Date: 12/08/2024 12:40:10 Hello, I work for VSI (VMS Software Inc) which is porting the OpenVMS operating system to x86. At this point we successfully on various hypervisors, but have some issues on the KVM running on Proxmox. The OpenVMS VM works just fine with SATA disks and it also works with for example virtio-network device etc., but trying to use virtio-scsi hangs the driver. I have debugged this and I can successfully configure the port/controller, send the IO request to the device. It then gets processed by the device, which posts the results and sets the interrupt bit in the ISR register, but it never asserts the interrupt hence the driver never gets notified and the I/O hangs. I have tried both “virtio-scsi-pci” and “virtio-scsi-single”, but no luck. The emulated virtio-scsi device is a legacy device. But then again, the virtio-network device is also a legacy device and here we are getting interrupts. One thing which bothers me is the fact that the “legacy interrupt disable” bit is set in the PCI config space of the virtio-scsi device (i.e. bit 10 at offset 4) Questions: * is there a way to configure a modern virtio-scsi devcie (i.e. disable_legacy=on) ? * why is the legacy interrupt bit set in the PCI config space ?
* Are there any working driver for virtio-scsi on this KVM using Q35 machine? i.e. any other OS Any thoughts why these interrupts are not getting delivered on the PCIE bus? thanks /cmos ___ Christian Moser Mobile: +358-40-5022105 Email: c...@maklee.com URL: www.maklee.com ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel --- End Message --- ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
Re: [pve-devel] [PATCH pve-firewall 19/21] add support for loading sdn firewall configuration
On Wed Jun 26, 2024 at 2:15 PM CEST, Stefan Hanreich wrote: > Signed-off-by: Stefan Hanreich > --- > src/PVE/Firewall.pm | 43 +-- > 1 file changed, 41 insertions(+), 2 deletions(-) > > diff --git a/src/PVE/Firewall.pm b/src/PVE/Firewall.pm > index 09544ba..95325a0 100644 > --- a/src/PVE/Firewall.pm > +++ b/src/PVE/Firewall.pm > @@ -25,6 +25,7 @@ use PVE::Tools qw($IPV4RE $IPV6RE); > use PVE::Tools qw(run_command lock_file dir_glob_foreach); > > use PVE::Firewall::Helpers; > +use PVE::RS::Firewall::SDN; > > my $pvefw_conf_dir = "/etc/pve/firewall"; > my $clusterfw_conf_filename = "$pvefw_conf_dir/cluster.fw"; > @@ -3644,7 +3645,7 @@ sub lock_clusterfw_conf { > } > > sub load_clusterfw_conf { > -my ($filename) = @_; > +my ($filename, $load_sdn_config) = @_; Small thing: I would suggest using a prototype here and also accept a hash reference OR a hash as the last parameter, so that the call signature is a little more readable. E.g. right now it's: load_clusterfw_conf(undef, 1) VS: load_clusterfw_conf(undef, { load_sdn_config => 1 }) Or: load_clusterfw_conf(undef, load_sdn_config => 1) I know we're gonna phase this whole thing out eventually, but little things like this help a lot in the long run, IMO. It makes it a little clearer what the subroutine does at call sites. I'm not sure if these subroutines are used elsewhere (didn't really bother to check, sorry), so perhaps you could pass `$filename` via the hash as well, as an optional parameter. Then it's immediately clear what *everything* stands for, because a sole `undef` "hides" what's actually passed to the subroutine. > > $filename = $clusterfw_conf_filename if !defined($filename); > my $empty_conf = { > @@ -3657,12 +3658,50 @@ sub load_clusterfw_conf { > ipset_comments => {}, > }; > > +if ($load_sdn_config) { > + my $sdn_conf = load_sdn_conf(); > + $empty_conf = { %$empty_conf, %$sdn_conf }; > +} > + > my $cluster_conf = generic_fw_config_parser($filename, $empty_conf, > $empty_conf, 'cluster'); > $set_global_log_ratelimit->($cluster_conf->{options}); > > return $cluster_conf; > } > > +sub load_sdn_conf { > +my $rpcenv = PVE::RPCEnvironment::get(); > +my $authuser = $rpcenv->get_user(); > + > +my $guests = PVE::Cluster::get_vmlist(); > +my $allowed_vms = []; > +foreach my $vmid (sort keys %{$guests->{ids}}) { > + next if !$rpcenv->check($authuser, "/vms/$vmid", [ 'VM.Audit' ], 1); > + push @$allowed_vms, $vmid; > +} > + > +my $vnets = PVE::Network::SDN::Vnets::config(1); > +my $privs = [ 'SDN.Audit', 'SDN.Allocate' ]; > +my $allowed_vnets = []; > +foreach my $vnet (sort keys %{$vnets->{ids}}) { > + my $zone = $vnets->{ids}->{$vnet}->{zone}; > + next if !$rpcenv->check_any($authuser, "/sdn/zones/$zone/$vnet", > $privs, 1); > + push @$allowed_vnets, $vnet; > +} > + > +my $sdn_config = { > + ipset => {} , > + ipset_comments => {}, > +}; > + > +eval { > + $sdn_config = PVE::RS::Firewall::SDN::config($allowed_vnets, > $allowed_vms); > +}; > +warn $@ if $@; > + > +return $sdn_config; > +} > + > sub save_clusterfw_conf { > my ($cluster_conf) = @_; > > @@ -4731,7 +4770,7 @@ sub init { > sub update { > my $code = sub { > > - my $cluster_conf = load_clusterfw_conf(); > + my $cluster_conf = load_clusterfw_conf(undef, 1); > my $hostfw_conf = load_hostfw_conf($cluster_conf); > > if (!is_enabled_and_not_nftables($cluster_conf, $hostfw_conf)) { ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
Re: [pve-devel] [PATCH proxmox-perl-rs 21/21] add PVE::RS::Firewall::SDN module
On Wed Jun 26, 2024 at 2:15 PM CEST, Stefan Hanreich wrote: > Used for obtaining the IPSets that get autogenerated by the nftables > firewall. The returned configuration has the same format as the > pve-firewall uses internally, making it compatible with the existing > pve-firewall code. > > Signed-off-by: Stefan Hanreich > --- > pve-rs/Cargo.toml | 1 + > pve-rs/Makefile| 1 + > pve-rs/src/firewall/mod.rs | 1 + > pve-rs/src/firewall/sdn.rs | 130 + > pve-rs/src/lib.rs | 1 + > 5 files changed, 134 insertions(+) > create mode 100644 pve-rs/src/firewall/mod.rs > create mode 100644 pve-rs/src/firewall/sdn.rs > > diff --git a/pve-rs/Cargo.toml b/pve-rs/Cargo.toml > index e40588d..f612b3a 100644 > --- a/pve-rs/Cargo.toml > +++ b/pve-rs/Cargo.toml > @@ -43,3 +43,4 @@ proxmox-subscription = "0.4" > proxmox-sys = "0.5" > proxmox-tfa = { version = "4.0.4", features = ["api"] } > proxmox-time = "2" > +proxmox-ve-config = { version = "0.1.0" } This hunk doesn't apply anymore because proxmox-sys was bumped. Manually adding proxmox-ve-config as dependency works just fine though, so this needs just a little rebase. > diff --git a/pve-rs/Makefile b/pve-rs/Makefile > index c6b4e08..d01da69 100644 > --- a/pve-rs/Makefile > +++ b/pve-rs/Makefile > @@ -28,6 +28,7 @@ PERLMOD_GENPACKAGE := /usr/lib/perlmod/genpackage.pl \ > > PERLMOD_PACKAGES := \ > PVE::RS::APT::Repositories \ > + PVE::RS::Firewall::SDN \ > PVE::RS::OpenId \ > PVE::RS::ResourceScheduling::Static \ > PVE::RS::TFA > diff --git a/pve-rs/src/firewall/mod.rs b/pve-rs/src/firewall/mod.rs > new file mode 100644 > index 000..8bd18a8 > --- /dev/null > +++ b/pve-rs/src/firewall/mod.rs > @@ -0,0 +1 @@ > +pub mod sdn; > diff --git a/pve-rs/src/firewall/sdn.rs b/pve-rs/src/firewall/sdn.rs > new file mode 100644 > index 000..55f3e93 > --- /dev/null > +++ b/pve-rs/src/firewall/sdn.rs > @@ -0,0 +1,130 @@ > +#[perlmod::package(name = "PVE::RS::Firewall::SDN", lib = "pve_rs")] > +mod export { > +use std::collections::HashMap; > +use std::{fs, io}; > + > +use anyhow::{bail, Context, Error}; > +use serde::Serialize; > + > +use proxmox_ve_config::{ > +common::Allowlist, > +firewall::types::ipset::{IpsetAddress, IpsetEntry}, > +firewall::types::Ipset, > +guest::types::Vmid, > +sdn::{ > +config::{RunningConfig, SdnConfig}, > +ipam::{Ipam, IpamJson}, > +SdnNameError, VnetName, SdnNameError isn't used here.
> +}, > +}; > + > +#[derive(Clone, Debug, Default, Serialize)] > +pub struct LegacyIpsetEntry { > +nomatch: bool, > +cidr: String, > +comment: Option, > +} > + > +impl LegacyIpsetEntry { > +pub fn from_ipset_entry(entry: &IpsetEntry) -> Vec > { > +let mut entries = Vec::new(); > + > +match &entry.address { > +IpsetAddress::Alias(name) => { > +entries.push(Self { > +nomatch: entry.nomatch, > +cidr: name.to_string(), > +comment: entry.comment.clone(), > +}); > +} > +IpsetAddress::Cidr(cidr) => { > +entries.push(Self { > +nomatch: entry.nomatch, > +cidr: cidr.to_string(), > +comment: entry.comment.clone(), > +}); > +} > +IpsetAddress::Range(range) => { > +entries.extend(range.to_cidrs().into_iter().map(|cidr| > Self { > +nomatch: entry.nomatch, > +cidr: cidr.to_string(), > +comment: entry.comment.clone(), > +})) > +} > +}; > + > +entries > +} > +} > + > +#[derive(Clone, Debug, Default, Serialize)] > +pub struct SdnFirewallConfig { > +ipset: HashMap>, > +ipset_comments: HashMap, > +} > + > +impl SdnFirewallConfig { > +pub fn new() -> Self { > +Default::default() > +} > + > +pub fn extend_ipsets(&mut self, ipsets: impl IntoIterator Ipset>) { > +for ipset in ipsets { > +let entries = ipset > +.iter() > +.flat_map(LegacyIpsetEntry::from_ipset_entry) > +.collect(); > + > +self.ipset.insert(ipset.name().name().to_string(), entries); > + > +if let Some(comment) = &ipset.comment { > +self.ipset_comments > +.insert(ipset.name().name().to_string(), > comment.to_string()); > +} > +} > +} > +} > + > +const SDN_RUNNING_CONFIG: &str = "/etc/pve/sdn/.runn
[pve-devel] [PATCH installer v2 4/5] fix #5250: tui: expose new btrfs `compress` option
Signed-off-by: Christoph Heiss --- Changes v1 -> v2: * rebased on master proxmox-tui-installer/src/views/bootdisk.rs | 37 + 1 file changed, 24 insertions(+), 13 deletions(-) diff --git a/proxmox-tui-installer/src/views/bootdisk.rs b/proxmox-tui-installer/src/views/bootdisk.rs index 8db33dd..625374e 100644 --- a/proxmox-tui-installer/src/views/bootdisk.rs +++ b/proxmox-tui-installer/src/views/bootdisk.rs @@ -19,8 +19,9 @@ use proxmox_installer_common::{ check_zfs_raid_config, }, options::{ -AdvancedBootdiskOptions, BootdiskOptions, BtrfsBootdiskOptions, BtrfsCompressOption, Disk, -FsType, LvmBootdiskOptions, ZfsBootdiskOptions, ZFS_CHECKSUM_OPTIONS, ZFS_COMPRESS_OPTIONS, +AdvancedBootdiskOptions, BootdiskOptions, BtrfsBootdiskOptions, Disk, FsType, +LvmBootdiskOptions, ZfsBootdiskOptions, BTRFS_COMPRESS_OPTIONS, ZFS_CHECKSUM_OPTIONS, +ZFS_COMPRESS_OPTIONS, }, setup::{BootType, ProductConfig, ProxmoxProduct, RuntimeInfo}, }; @@ -573,12 +574,23 @@ struct BtrfsBootdiskOptionsView { impl BtrfsBootdiskOptionsView { fn new(runinfo: &RuntimeInfo, options: &BtrfsBootdiskOptions) -> Self { -let view = MultiDiskOptionsView::new( -&runinfo.disks, -&options.selected_disks, -FormView::new().child("hdsize", DiskSizeEditView::new().content(options.disk_size)), -) -.top_panel(TextView::new("Btrfs integration is a technology preview!").center()); +let inner = FormView::new() +.child( +"compress", +SelectView::new() +.popup() +.with_all(BTRFS_COMPRESS_OPTIONS.iter().map(|o| (o.to_string(), *o))) +.selected( +BTRFS_COMPRESS_OPTIONS +.iter() +.position(|o| *o == options.compress) +.unwrap_or_default(), +), +) +.child("hdsize", DiskSizeEditView::new().content(options.disk_size)); + +let view = MultiDiskOptionsView::new(&runinfo.disks, &options.selected_disks, inner) +.top_panel(TextView::new("Btrfs integration is a technology preview!").center()); Self { view } } @@ -592,17 +604,16 @@ impl BtrfsBootdiskOptionsView { fn get_values(&mut self) -> Option<(Vec, BtrfsBootdiskOptions)> { let (disks, selected_disks) = self.view.get_disks_and_selection()?; -let disk_size = self -.view -.get_options_view()? -.get_value::(0)?; +let view = self.view.get_options_view()?; +let compress = view.get_value::, _>(0)?; +let disk_size = view.get_value::(1)?; Some(( disks, BtrfsBootdiskOptions { disk_size, selected_disks, -compress: BtrfsCompressOption::default(), +compress, }, )) } -- 2.45.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH installer v2 1/5] fix #5250: install: config: add new `btrfs_opts` with `compress` config option
Signed-off-by: Christoph Heiss --- Changes v1 -> v2: * no changes Proxmox/Install/Config.pm | 15 ++ proxmox-auto-installer/src/utils.rs | 1 + proxmox-installer-common/src/options.rs | 31 + proxmox-installer-common/src/setup.rs | 21 -- proxmox-tui-installer/src/setup.rs | 2 ++ proxmox-tui-installer/src/views/bootdisk.rs | 5 ++-- 6 files changed, 71 insertions(+), 4 deletions(-) diff --git a/Proxmox/Install/Config.pm b/Proxmox/Install/Config.pm index ae70093..7e67f69 100644 --- a/Proxmox/Install/Config.pm +++ b/Proxmox/Install/Config.pm @@ -79,6 +79,9 @@ my sub init_cfg { copies => 1, arc_max => Proxmox::Install::RunEnv::default_zfs_arc_max(), # in MiB }, + btrfs_opts => { + compress => 'off', + }, # TODO: single disk selection config target_hd => undef, disk_selection => {}, @@ -173,6 +176,18 @@ sub get_zfs_opt { return defined($k) ? $zfs_opts->{$k} : $zfs_opts; } +sub set_btrfs_opt { +my ($k, $v) = @_; +my $opts = get('btrfs_opts'); +croak "unknown btrfs opts key '$k'" if !exists($opts->{$k}); +$opts->{$k} = $v; +} +sub get_btrfs_opt { +my ($k) = @_; +my $opts = get('btrfs_opts'); +return defined($k) ? $opts->{$k} : $opts; +} + sub set_target_hd { set_key('target_hd', $_[0]); } sub get_target_hd { return get('target_hd'); } diff --git a/proxmox-auto-installer/src/utils.rs b/proxmox-auto-installer/src/utils.rs index 45ad222..ae7dbbd 100644 --- a/proxmox-auto-installer/src/utils.rs +++ b/proxmox-auto-installer/src/utils.rs @@ -340,6 +340,7 @@ pub fn parse_answer( minfree: None, maxvz: None, zfs_opts: None, +btrfs_opts: None, target_hd: None, disk_selection: BTreeMap::new(), existing_storage_auto_rename: 1, diff --git a/proxmox-installer-common/src/options.rs b/proxmox-installer-common/src/options.rs index 9375ded..d99e26d 100644 --- a/proxmox-installer-common/src/options.rs +++ b/proxmox-installer-common/src/options.rs @@ -102,10 +102,40 @@ impl LvmBootdiskOptions { } } +/// See the accompanying mount option in btrfs(5). 
+#[derive(Copy, Clone, Debug, Default, Deserialize, Eq, PartialEq)] +#[serde(rename_all(deserialize = "lowercase"))] +pub enum BtrfsCompressOption { +On, +#[default] +Off, +Zlib, +Lzo, +Zstd, +} + +impl fmt::Display for BtrfsCompressOption { +fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { +write!(f, "{}", format!("{self:?}").to_lowercase()) +} +} + +impl From<&BtrfsCompressOption> for String { +fn from(value: &BtrfsCompressOption) -> Self { +value.to_string() +} +} + +pub const BTRFS_COMPRESS_OPTIONS: &[BtrfsCompressOption] = { +use BtrfsCompressOption::*; +&[On, Off, Zlib, Lzo, Zstd] +}; + #[derive(Clone, Debug)] pub struct BtrfsBootdiskOptions { pub disk_size: f64, pub selected_disks: Vec, +pub compress: BtrfsCompressOption, } impl BtrfsBootdiskOptions { @@ -115,6 +145,7 @@ impl BtrfsBootdiskOptions { Self { disk_size: disk.size, selected_disks: (0..disks.len()).collect(), +compress: BtrfsCompressOption::default(), } } } diff --git a/proxmox-installer-common/src/setup.rs b/proxmox-installer-common/src/setup.rs index e4e609b..d7f1c47 100644 --- a/proxmox-installer-common/src/setup.rs +++ b/proxmox-installer-common/src/setup.rs @@ -13,8 +13,8 @@ use serde::{de, Deserialize, Deserializer, Serialize, Serializer}; use crate::{ options::{ -BtrfsRaidLevel, Disk, FsType, ZfsBootdiskOptions, ZfsChecksumOption, ZfsCompressOption, -ZfsRaidLevel, +BtrfsBootdiskOptions, BtrfsCompressOption, BtrfsRaidLevel, Disk, FsType, +ZfsBootdiskOptions, ZfsChecksumOption, ZfsCompressOption, ZfsRaidLevel, }, utils::CidrAddress, }; @@ -221,6 +221,20 @@ impl From for InstallZfsOption { } } +#[derive(Debug, Deserialize, Serialize)] +pub struct InstallBtrfsOption { +#[serde(serialize_with = "serialize_as_display")] +pub compress: BtrfsCompressOption, +} + +impl From for InstallBtrfsOption { +fn from(opts: BtrfsBootdiskOptions) -> Self { +InstallBtrfsOption { +compress: opts.compress, +} +} +} + pub fn read_json Deserialize<'de>, P: AsRef>(path: P) -> Result { let file = File::open(path).map_err(|err| err.to_string())?; let reader = BufReader::new(file); @@ -475,6 +489,9 @@ pub struct InstallConfig { #[serde(skip_serializing_if = "Option::is_none")] pub zfs_opts: Option, +#[serde(skip_serializing_if = "Option::is_none")] +pub btrfs_opts: Option, + #[serde( serialize_with = "serialize_disk_opt", skip_serializing_if = "Option::is_none", diff --git a/proxmox-tui-installer/src/setup.rs b/proxmox-tui-installer/src/setup.rs index e
[pve-devel] [PATCH installer v2 2/5] fix #5250: install: write btrfs `compress` option to fstab
`compress` instead of `compress-force` is used, as the latter can have unintended (performance) implications, as the name implies. That would be neither expected by users nor should such a decision be made without the user explicitly opting for it. Others do the same, e.g. the installer for RedHat/Fedora systems (aka. Anaconda) opts for `compress` too. Signed-off-by: Christoph Heiss --- Changes v1 -> v2: * no changes * moved some mount setup code here from next patch Proxmox/Install.pm | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/Proxmox/Install.pm b/Proxmox/Install.pm index fa2702c..5c64c3d 100644 --- a/Proxmox/Install.pm +++ b/Proxmox/Install.pm @@ -1065,7 +1065,16 @@ sub extract_data { die "unable to detect FS UUID" if !defined($fsuuid); - $fstab .= "UUID=$fsuuid / btrfs defaults 0 1\n"; + my $btrfs_opts = Proxmox::Install::Config::get_btrfs_opt(); + + my $mountopts = 'defaults'; + if ($btrfs_opts->{compress} eq 'on') { + $mountopts .= ',compress'; + } elsif ($btrfs_opts->{compress} ne 'off') { + $mountopts .= ",compress=$btrfs_opts->{compress}"; + } + + $fstab .= "UUID=$fsuuid / btrfs $mountopts 0 1\n"; } else { my $root_mountopt = $fssetup->{$filesys}->{root_mountopt} || 'defaults'; $fstab .= "$rootdev / $filesys ${root_mountopt} 0 1\n"; -- 2.45.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH installer v2 3/5] fix #5250: proxinstall: expose new btrfs `compress` option
Signed-off-by: Christoph Heiss --- Changes v1 -> v2: * no changes * moved some mount option setup code to previous patch proxinstall | 15 +++ 1 file changed, 15 insertions(+) diff --git a/proxinstall b/proxinstall index 12f3eaa..c7776f0 100755 --- a/proxinstall +++ b/proxinstall @@ -1160,6 +1160,21 @@ my $create_raid_advanced_grid = sub { my $create_btrfs_raid_advanced_grid = sub { my ($hdsize_btn) = @_; my $labeled_widgets = []; + +my $combo_compress = Gtk3::ComboBoxText->new(); +$combo_compress->set_tooltip_text("btrfs compression algorithm for boot volume"); +my $comp_opts = ["on", "off", "zlib", "lzo", "zstd"]; +foreach my $opt (@$comp_opts) { + $combo_compress->append($opt, $opt); +} +my $compress = Proxmox::Install::Config::get_btrfs_opt('compress') // 'off'; +$combo_compress->set_active_id($compress); +$combo_compress->signal_connect (changed => sub { + my $w = shift; + Proxmox::Install::Config::set_btrfs_opt('compress', $w->get_active_text()); +}); +push @$labeled_widgets, ['compress', $combo_compress]; + push @$labeled_widgets, ['hdsize', $hdsize_btn, 'GB']; return $create_label_widget_grid->($labeled_widgets);; }; -- 2.45.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH installer v2 5/5] fix #5250: auto-installer: expose new btrfs `compress` option
Signed-off-by: Christoph Heiss --- Changes v1 -> v2: * squashed in separate tests patch proxmox-auto-installer/src/answer.rs | 6 - proxmox-auto-installer/src/utils.rs | 6 - .../tests/resources/parse_answer/btrfs.json | 24 +++ .../tests/resources/parse_answer/btrfs.toml | 17 + 4 files changed, 51 insertions(+), 2 deletions(-) create mode 100644 proxmox-auto-installer/tests/resources/parse_answer/btrfs.json create mode 100644 proxmox-auto-installer/tests/resources/parse_answer/btrfs.toml diff --git a/proxmox-auto-installer/src/answer.rs b/proxmox-auto-installer/src/answer.rs index d691da1..fd0ed0d 100644 --- a/proxmox-auto-installer/src/answer.rs +++ b/proxmox-auto-installer/src/answer.rs @@ -1,6 +1,9 @@ use clap::ValueEnum; use proxmox_installer_common::{ -options::{BtrfsRaidLevel, FsType, ZfsChecksumOption, ZfsCompressOption, ZfsRaidLevel}, +options::{ +BtrfsCompressOption, BtrfsRaidLevel, FsType, ZfsChecksumOption, ZfsCompressOption, +ZfsRaidLevel, +}, utils::{CidrAddress, Fqdn}, }; use serde::{Deserialize, Serialize}; @@ -270,6 +273,7 @@ pub struct LvmOptions { pub struct BtrfsOptions { pub hdsize: Option, pub raid: Option, +pub compress: Option, } #[derive(Clone, Deserialize, Serialize, Debug, PartialEq)] diff --git a/proxmox-auto-installer/src/utils.rs b/proxmox-auto-installer/src/utils.rs index ae7dbbd..9e86053 100644 --- a/proxmox-auto-installer/src/utils.rs +++ b/proxmox-auto-installer/src/utils.rs @@ -11,7 +11,8 @@ use crate::{ use proxmox_installer_common::{ options::{FsType, NetworkOptions, ZfsChecksumOption, ZfsCompressOption}, setup::{ -InstallConfig, InstallRootPassword, InstallZfsOption, LocaleInfo, RuntimeInfo, SetupInfo, +InstallBtrfsOption, InstallConfig, InstallRootPassword, InstallZfsOption, LocaleInfo, +RuntimeInfo, SetupInfo, }, }; use serde::{Deserialize, Serialize}; @@ -394,6 +395,9 @@ pub fn parse_answer( config.hdsize = btrfs .hdsize .unwrap_or(runtime_info.disks[first_selected_disk].size); +config.btrfs_opts = Some(InstallBtrfsOption { +compress: btrfs.compress.unwrap_or_default(), +}) } } Ok(config) diff --git a/proxmox-auto-installer/tests/resources/parse_answer/btrfs.json b/proxmox-auto-installer/tests/resources/parse_answer/btrfs.json new file mode 100644 index 000..0330a38 --- /dev/null +++ b/proxmox-auto-installer/tests/resources/parse_answer/btrfs.json @@ -0,0 +1,24 @@ +{ + "autoreboot": 1, + "cidr": "192.168.1.114/24", + "country": "at", + "dns": "192.168.1.254", + "domain": "testinstall", + "disk_selection": { +"6": "6", +"7": "7" + }, + "filesys": "btrfs (RAID1)", + "gateway": "192.168.1.1", + "hdsize": 80.0, + "existing_storage_auto_rename": 1, + "hostname": "pveauto", + "keymap": "de", + "mailto": "mail@no.invalid", + "mngmt_nic": "eno1", + "root_password": { "plain": "123456" }, + "timezone": "Europe/Vienna", + "btrfs_opts": { +"compress": "zlib" + } +} diff --git a/proxmox-auto-installer/tests/resources/parse_answer/btrfs.toml b/proxmox-auto-installer/tests/resources/parse_answer/btrfs.toml new file mode 100644 index 000..8fcd27d --- /dev/null +++ b/proxmox-auto-installer/tests/resources/parse_answer/btrfs.toml @@ -0,0 +1,17 @@ +[global] +keyboard = "de" +country = "at" +fqdn = "pveauto.testinstall" +mailto = "mail@no.invalid" +timezone = "Europe/Vienna" +root_password = "123456" + +[network] +source = "from-dhcp" + +[disk-setup] +filesystem = "btrfs" +btrfs.raid = "raid1" +btrfs.compress = "zlib" +btrfs.hdsize = 80 +disk_list = ["sda", "sdb"] -- 2.45.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com 
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH installer v2 0/5] fix #5250: add btrfs `compress` mount option support
Fixes #5250 [0]. Pretty much as it says on the tin, this adds the `compress` mount option for Btrfs much in the same way as the ZFS equivalent. W.r.t to the discussion in #5250 - `compress` is used. For a detailed explanation, see patch #2. [0] https://bugzilla.proxmox.com/show_bug.cgi?id=5250 History === v1: https://lists.proxmox.com/pipermail/pve-devel/2024-May/063845.html Notable changes v1 -> v2: * rebased on latest master * squashed auto-installer tests into feature patch Diffstat Christoph Heiss (5): fix #5250: install: config: add new `btrfs_opts` with `compress` config option fix #5250: install: write btrfs `compress` option to fstab fix #5250: proxinstall: expose new btrfs `compress` option fix #5250: tui: expose new btrfs `compress` option fix #5250: auto-installer: expose new btrfs `compress` option Proxmox/Install.pm| 11 +- Proxmox/Install/Config.pm | 15 proxinstall | 15 proxmox-auto-installer/src/answer.rs | 6 +++- proxmox-auto-installer/src/utils.rs | 7 +++- .../tests/resources/parse_answer/btrfs.json | 24 + .../tests/resources/parse_answer/btrfs.toml | 17 ++ proxmox-installer-common/src/options.rs | 31 + proxmox-installer-common/src/setup.rs | 21 ++-- proxmox-tui-installer/src/setup.rs| 2 ++ proxmox-tui-installer/src/views/bootdisk.rs | 34 +-- 11 files changed, 167 insertions(+), 16 deletions(-) create mode 100644 proxmox-auto-installer/tests/resources/parse_answer/btrfs.json create mode 100644 proxmox-auto-installer/tests/resources/parse_answer/btrfs.toml -- 2.44.0 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
Re: [pve-devel] issues with Virtio-SCSI devicde on Proxmox...
--- Begin Message --- Hi, I didn't see the response of Fiona but indeed: https://lists.gnu.org/archive/html/qemu-devel/2021-09/msg01567.html "virtio devices can be exposed in upto three ways - Legacy - follows virtio 0.9 specification. always uses PCI ID range 0x1000-0x103F - Transitional - follows virtio 0.9 specification by default, but can auto-negotiate with guest for 1.0 spec. Always uses PCI ID range 0x1000-0x103F - Modern - follows virtio 1.0 specification. always uses PCI ID range 0x1040-0x107F With QEMU, historically devices placed on a PCI bus will always default to being in transitional mode, while devices placed on a PCI-E bus will always default to being in modern mode. " Original message From: Fiona Ebner Reply-to: Proxmox VE development discussion To: Proxmox VE development discussion , Christian Moser Subject: Re: [pve-devel] issues with Virtio-SCSI devicde on Proxmox... Date: 13/08/2024 10:55:47 Hi, On 12.08.24 at 12:40, Christian Moser wrote: > Hello, > > I work for VSI (VMS Software Inc) which is porting the OpenVMS > operating system to x86. At this point we successfully on various > hypervisors, but have some issues on the KVM running on Proxmox. > > The OpenVMS VM works just fine with SATA disks and it also works with > for example virtio-network device etc., but trying to use > virtio-scsi hangs the driver. I have debugged this and I can > successfully configure the port/controller, send the IO request to > the > device. It then gets processed by the device, which posts the results > and sets the interrupt bit in the ISR register, but it never > asserts the interrupt hence the driver never gets notified and the > I/O hangs. > > I have tried both “virtio-scsi-pci” and “virtio-scsi-single”, but no > luck. The emulated virtio-scsi device is a legacy device. But > then again, the virtio-network device is also a legacy device and > here we are getting interrupts. One thing which bothers me is the > fact that the “legacy interrupt disable” bit is set in the PCI config > space of the virtio-scsi device (i.e. bit 10 at offset 4) > > Questions: > * is there a way to configure a modern virtio-scsi devcie (i.e. > disable_legacy=on) ? I've already answered this question when you asked in a mail addressed directly to me: On 12.08.24 at 11:58, Fiona Ebner wrote: > Hi, > > It seems that you either need to attach the "virtio-scsi-pci" device > to > a pcie bus or explicitly set the "disable_legacy=on" option for the > device, neither of which Proxmox VE currently does or allows > configuring. The only way right now seems to be to attach the disk > yourself via custom arguments (the 'args' in the Proxmox VE VM > configuration), but then the disk will be invisible to Proxmox VE > operations which look at specific disk keys in the configuration! > > Feel free to open a feature request on our bug tracker to make this > configurable: > https://bugzilla.proxmox.com/ > > P.S.
Please write to the developer list rather than individual > developers for such questions in the future: > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel > > Best Regards, > Fiona > * why is the legacy interrupt bit set in the PCI config space ? Most likely because the virtio-scsi-pci is configured without the "disable_legacy=on" option. If not explicitly set, the option will be "disable_legacy=auto" and when not attached to PCIe (which is the case for Proxmox VE currently), then legacy mode will be enabled. > * Are there any working driver for virtio-scsi on this KVM using Q35 > machine? i.e. any other OS The virtio drivers for Windows and the ones in Linux work just fine with our configuration. > Any thoughts why these interrupts are not getting delivered on the > PCIE bus? We do not configure the virtio-scsi-pci on a PCIe bus currently, see my initial answer. Best Regards, Fiona ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel