[pve-devel] issues with Virtio-SCSI device on Proxmox...

2024-08-13 Thread Christian Moser
Hello,

I work for VSI (VMS Software Inc), which is porting the OpenVMS operating system 
to x86. At this point we run successfully on various
hypervisors, but have some issues on the KVM running on Proxmox.

The OpenVMS VM works just fine with SATA disks, and it also works with, for 
example, the virtio-network device, but trying to use
virtio-scsi hangs the driver. I have debugged this and I can successfully 
configure the port/controller and send the I/O request to the
device. It then gets processed by the device, which posts the results and sets 
the interrupt bit in the ISR register, but it never
asserts the interrupt, hence the driver never gets notified and the I/O hangs.

I have tried both “virtio-scsi-pci” and “virtio-scsi-single”, but no luck. The 
emulated virtio-scsi device is a legacy device. But
then again, the virtio-network device is also a legacy device and here we are 
getting interrupts. One thing that bothers me is the
fact that the “legacy interrupt disable” bit is set in the PCI config space of 
the virtio-scsi device (i.e. bit 10 at offset 4).
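
For reference, this is roughly how that bit can be checked from the Proxmox
host via sysfs (a minimal sketch; the PCI address below is only a placeholder
for the virtio-scsi device):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Fcntl qw(SEEK_SET);

    # Placeholder PCI address - replace with the actual virtio-scsi device.
    my $dev = '0000:00:05.0';

    open(my $fh, '<:raw', "/sys/bus/pci/devices/$dev/config")
        or die "cannot open PCI config space: $!\n";
    # The command register is the 16-bit little-endian word at offset 4.
    seek($fh, 4, SEEK_SET) or die "seek failed: $!\n";
    read($fh, my $word, 2) == 2 or die "short read\n";
    # Bit 10 of the command register is the INTx (legacy interrupt) disable bit.
    my $command = unpack('v', $word);
    printf "INTx disable is %s\n", ($command & (1 << 10)) ? 'set' : 'clear';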

Questions:
*   Is there a way to configure a modern virtio-scsi device (i.e. 
disable_legacy=on)?
*   Why is the legacy interrupt bit set in the PCI config space?
*   Are there any working drivers for virtio-scsi on this KVM using a Q35 
machine, i.e. any other OS?

Any thoughts on why these interrupts are not getting delivered on the PCIe bus?

thanks

/cmos

___
Christian Moser
Mobile:+358-40-5022105  
Email:  c...@maklee.com
URL:   www.maklee.com

___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] Interest in a file manager interface within pve-manager?

2024-08-13 Thread Dominik Csapak

On 7/12/24 00:27, Blythe, Nathan F. - US via pve-devel wrote:


Hello,



Hi, sorry for the late answer.

First, thanks for a well-written proposal/question. This is what we'd like to 
see when third parties
are interested in contributing features/changes/etc.

I'll try to answer the question/proposal to the best of my knowledge, but maybe 
some of my colleagues
have additional things to say or want to correct me
(this could take a bit though, due to the summer holiday season).


We have a need for a non-network based (i.e. hypercall / out-of-band) file browser for containers 
and VMs within the pve-manager web UI. I'd like to be able to select a VM or container, click 
"More", and choose "File manager" and then browse the VM or container's file 
systems and upload or download files from my local system, through the web-browser.

For VMs (qemu), the qemu-guest-agent provides the ability to read/write guest files, and the 
pve API exposes it (with strict file size limits) as 
nodes->{node}->qemu->{vmid}->agent->file-*. Presumably, a web UI file manager 
would use that API endpoint, plus maybe an enhancement for directory listing and chunked 
upload/download (to support arbitrary size files).

For containers (LXC), the pct application can read/write files by (I think) 
mounting the container root file system on the host and touching it directly. 
But there's no corresponding API endpoint. If there were, it could be used by a 
future web UI file manager.

So to implement this feature, I think there are 5 steps:

   1.  Enhance the qemu-guest-agent and qm to support reading directory 
structure and creating directories.
   2.  Enhance the pve API to expose these new features like file-read and 
file-write.
   3.  Enhance the pve API to support chunked file read and write, for large 
files. Maybe also requiring an enhancement to the qemu-guest-agent?
   4.  Enhance the pve API to support chunked file read/write, whole file 
read/write, and directory structure/creating directories (through direct 
filesystem interaction on the host).
   5.  Write a JS front-end file manager, invokable from the Other menu, which 
uses these new/enhanced API endpoints to implement a general purpose 
out-of-band file manager.

Is there interest from the proxmox developers for something like this? If we 
started working on some of these items, would there be interest in accepting 
patches? Does my general approach look sound?


After a bit of internal discussion, the general tone is that we probably don't 
want to expose this much of the
guest content/state by default for now. In the longer term it could 
be OK, but we see the
following issues:

* For now, QEMU allows those guest calls by default, and there is at the moment 
no really good
  way of configuring what's allowed and what's not.
* Our privileges and ACLs are not really designed for this currently.
* Handling permissions (and other filesystem/guest "quirks") will be very hard
  (e.g. Windows permissions are very different from Linux/Unix ones).

As breaking the host/guest barrier can have bad implications for security, data 
integrity, etc.,
those problems would have to be solved before one could/should start with such 
a feature.

There is currently some work being done on the QEMU side for limiting/configuring QGA 
calls:
https://lore.kernel.org/qemu-devel/20240604153242.251334-1-berra...@redhat.com/

When that's integrated, one could think about how our ACLs and privileges could be 
extended to allow
such things.

Only after those are solved does implementing this feature make sense.

What could probably be done in the meantime is to work with upstream QEMU to add
directory listing to the QGA and maybe some proper permission handling.
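
For context, the agent calls that exist today already allow basic chunked file
reads; a rough sketch (assuming the standard QGA commands guest-file-open,
guest-file-read and guest-file-close, driven via PVE's mon_cmd helper - the
parameter handling is illustrative, not a finished API):

    use strict;
    use warnings;
    use MIME::Base64 qw(decode_base64);
    use PVE::QemuServer::Monitor qw(mon_cmd);

    # Read a guest file in chunks via the guest agent (sketch, no error handling).
    sub read_guest_file_chunked {
        my ($vmid, $path, $chunk_size) = @_;
        $chunk_size //= 1024 * 1024;

        my $handle = mon_cmd($vmid, 'guest-file-open', path => $path, mode => 'r');
        my $content = '';
        while (1) {
            my $read = mon_cmd($vmid, 'guest-file-read', handle => $handle, count => $chunk_size);
            $content .= decode_base64($read->{'buf-b64'}) if $read->{count};
            last if $read->{eof};
        }
        mon_cmd($vmid, 'guest-file-close', handle => $handle);
        return $content;
    }

A directory-listing command and proper permission handling would be the missing
pieces on top of this.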

This is relatively independent from our side, since we don't actually ship
the guest agent ourselves, so this has to be solved upstream anyway.

So all in all, such a feature could work, but only in the longer term; this is 
nothing
for the short/mid-term.

Hope this answers your questions. If not, just write again :)



(We also investigated using SPICE's folder-sharing capabilities and the 
spice-proxy provided by the host, but that doesn't work with the Windows 
virt-viewer client, only the Linux client. It looks like there was some 
interest in the past in enhancing xterm.js to support xmodem file transfer, but 
it didn't go far.)

Regards,
Nathan Blythe
CACI



Best regards
Dominik


___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [PATCH v2 qemu-server] remote migration: fix online migration via API clients

2024-08-13 Thread Fiona Ebner
As reported in the community forum [0], when a remote migration
request comes in via an API client, the -T flag for Perl is set, so an
insecure dependency in a call like unlink() in forward_unix_socket()
will fail with:

> failed to write forwarding command - Insecure dependency in unlink while 
> running with -T switch

To fix it, untaint the problematic socket addresses coming from the
remote side. Require that all sockets are below '/run/qemu-server/'
and end with '.migrate'. This allows extensions in the future while
still being quite strict.
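
To illustrate the mechanism: under 'perl -T', tainted input only becomes usable
for unsafe operations after passing through a regex capture, so a strict
pattern doubles as validation and untainting (simplified, not the exact patch
code):

    # $addr is tainted, e.g. received from the remote side of the tunnel
    my ($socket) = $addr =~ m|^(/run/qemu-server/(?:(?!\.\./).)+\.migrate)$|
        or die "unexpected socket address '$addr'\n";
    # $socket is untainted here and safe to pass to unlink() and friends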

[0]: https://forum.proxmox.com/threads/123048/post-691958

Signed-off-by: Fiona Ebner 
---

Changes in v2:
* rule out '../' in path

 PVE/QemuMigrate.pm | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index e71face4..d31589e7 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -1095,7 +1095,10 @@ sub phase2 {
die "only UNIX sockets are supported for remote migration\n"
if $tunnel_info->{proto} ne 'unix';
 
-   my $remote_socket = $tunnel_info->{addr};
+   # untaint
+   my ($remote_socket) =
+   $tunnel_info->{addr} =~ 
m|^(/run/qemu-server/(?:(?!\.\./).)+\.migrate)$|;
+   die "unexpected socket address '$tunnel_info->{addr}'\n" if 
!$remote_socket;
my $local_socket = $remote_socket;
$local_socket =~ s/$remote_vmid/$vmid/g;
$tunnel_info->{addr} = $local_socket;
@@ -1104,6 +1107,9 @@ sub phase2 {
PVE::Tunnel::forward_unix_socket($self->{tunnel}, $local_socket, 
$remote_socket);
 
foreach my $remote_socket (@{$tunnel_info->{unix_sockets}}) {
+   # untaint
+   ($remote_socket) = $remote_socket =~ 
m|^(/run/qemu-server/(?:(?!\.\./).)+\.migrate)$|
+   or die "unexpected socket address '$remote_socket'\n";
my $local_socket = $remote_socket;
$local_socket =~ s/$remote_vmid/$vmid/g;
next if $self->{tunnel}->{forwarded}->{$local_socket};
-- 
2.39.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



Re: [pve-devel] Interest in a file manager interface within pve-manager?

2024-08-13 Thread Dominik Csapak

Sorry for the noise, it seems I forgot to put you in CC, so now again with 
that...

On 8/13/24 10:35, Dominik Csapak wrote:

On 7/12/24 00:27, Blythe, Nathan F. - US via pve-devel wrote:


Hello,



Hi, sorry for the late answer.

First, thanks for a well-written proposal/question. This is what we'd like to 
see when third parties
are interested in contributing features/changes/etc.

I'll try to answer the question/proposal to the best of my knowledge, but maybe 
some of my colleagues
have additional things to say or want to correct me
(this could take a bit though, due to the summer holiday season).

We have a need for a non-network based (i.e. hypercall / out-of-band) file browser for containers 
and VMs within the pve-manager web UI. I'd like to be able to select a VM or container, click 
"More", and choose "File manager" and then browse the VM or container's file systems and upload or 
download files from my local system, through the web-browser.


For VMs (qemu), the qemu-guest-agent provides the ability to read/write guest files, and the pve 
API exposes it (with strict file size limits) as nodes->{node}->qemu->{vmid}->agent->file-*. 
Presumably, a web UI file manager would use that API endpoint, plus maybe an enhancement for 
directory listing and chunked upload/download (to support arbitrary size files).


For containers (LXC), the pct application can read/write files by (I think) mounting the container 
root file system on the host and touching it directly. But there's no corresponding API endpoint. 
If there were, it could be used by a future web UI file manager.


So to implement this feature, I think there are 5 steps:

   1.  Enhance the qemu-guest-agent and qm to support reading directory structure and creating 
directories.

   2.  Enhance the pve API to expose these new features like file-read and 
file-write.
   3.  Enhance the pve API to support chunked file read and write, for large files. Maybe also 
requiring an enhancement to the qemu-guest-agent?
   4.  Enhance the pve API to support chunked file read/write, whole file read/write, and 
directory structure/creating directories (through direct filesystem interaction on the host).
   5.  Write a JS front-end file manager, invokable from the Other menu, which uses these new/ 
enhanced API endpoints to implement a general purpose out-of-band file manager.


Is there interest from the proxmox developers for something like this? If we started working on 
some of these items, would there be interest in accepting patches? Does my general approach look 
sound?


After a bit of internal discussion, the general tone is that we probably don't 
want to expose this much of the
guest content/state by default for now. In the longer term it could 
be OK, but we see the
following issues:

* For now, QEMU allows those guest calls by default, and there is at the moment 
no really good
   way of configuring what's allowed and what's not.
* Our privileges and ACLs are not really designed for this currently.
* Handling permissions (and other filesystem/guest "quirks") will be very hard
   (e.g. Windows permissions are very different from Linux/Unix ones).

As breaking the host/guest barrier can have bad implications for security, data 
integrity, etc.,
those problems would have to be solved before one could/should start with such 
a feature.

There is currently some work being done on the QEMU side for limiting/configuring QGA 
calls:
https://lore.kernel.org/qemu-devel/20240604153242.251334-1-berra...@redhat.com/

When that's integrated, one could think about how our ACLs and privileges could be 
extended to allow
such things.

Only after those are solved does implementing this feature make sense.

What could probably be done in the meantime is to work with upstream QEMU to add
directory listing to the QGA and maybe some proper permission handling.

This is relatively independent from our side, since we don't actually ship
the guest agent ourselves, so this has to be solved upstream anyway.

So all in all, such a feature could work, but only in the longer term; this is 
nothing
for the short/mid-term.

Hope this answers your questions. If not, just write again :)



(We also investigated using SPICE's folder-sharing capabilities and the spice-proxy provided by 
the host, but that doesn't work with the Windows virt-viewer client, only the Linux client. It 
looks like there was some interest in the past in enhancing xterm.js to support xmodem file 
transfer, but it didn't go far.)


Regards,
Nathan Blythe
CACI



Best regards
Dominik


___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel






___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH qemu-server] remote migration: fix online migration via API clients

2024-08-13 Thread Fiona Ebner
Am 09.08.24 um 12:17 schrieb Fiona Ebner:
> diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
> index e71face4..7a7183e0 100644
> --- a/PVE/QemuMigrate.pm
> +++ b/PVE/QemuMigrate.pm
> @@ -1095,7 +1095,8 @@ sub phase2 {
>   die "only UNIX sockets are supported for remote migration\n"
>   if $tunnel_info->{proto} ne 'unix';
>  
> - my $remote_socket = $tunnel_info->{addr};
> + my ($remote_socket) = $tunnel_info->{addr} =~ 
> m!^(/run/qemu-server/.*\.migrate)$! #untaint

I sent a v2 to rule out "../" in the regex:
https://lists.proxmox.com/pipermail/pve-devel/2024-August/065075.html


___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



Re: [pve-devel] issues with Virtio-SCSI device on Proxmox...

2024-08-13 Thread Fiona Ebner
Hi,

Am 12.08.24 um 12:40 schrieb Christian Moser:
> Hello,
> 
> I work for VSI (VMS Software Inc), which is porting the OpenVMS operating 
> system to x86. At this point we run successfully on various
> hypervisors, but have some issues on the KVM running on Proxmox.
> 
> The OpenVMS VM works just fine with SATA disks, and it also works with, for 
> example, the virtio-network device, but trying to use
> virtio-scsi hangs the driver. I have debugged this and I can successfully 
> configure the port/controller and send the I/O request to the
> device. It then gets processed by the device, which posts the results and 
> sets the interrupt bit in the ISR register, but it never
> asserts the interrupt, hence the driver never gets notified and the I/O hangs.
> 
> I have tried both “virtio-scsi-pci” and “virtio-scsi-single”, but no luck. 
> The emulated virtio-scsi device is a legacy device. But
> then again, the virtio-network device is also a legacy device and here we are 
> getting interrupts. One thing which bothers me is the
> fact that the “legacy interrupt disable” bit is set in the PCI config space 
> of the virtio-scsi device (i.e. bit 10 at offset 4)
> 
> Questions:
> * is there a way to configure a modern virtio-scsi device (i.e. 
> disable_legacy=on)?

I've already answered this question when you asked in a mail addressed
directly to me:

Am 12.08.24 um 11:58 schrieb Fiona Ebner:
> Hi,
> 
> It seems that you either need to attach the "virtio-scsi-pci" device to
> a pcie bus or explicitly set the "disable_legacy=on" option for the
> device, neither of which Proxmox VE currently does or allows
> configuring. The only way right now seems to be to attach the disk
> yourself via custom arguments (the 'args' in the Proxmox VE VM
> configuration), but then the disk will be invisible to Proxmox VE
> operations which look at specific disk keys in the configuration!
> 
> Feel free to open a feature request on our bug tracker to make this
> configurable: https://bugzilla.proxmox.com/
> 
> P.S. Please write to the developer list rather than individual
> developers for such questions in the future:
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> Best Regards,
> Fiona

> * why is the legacy interrupt bit set in the PCI config space ?

Most likely because the virtio-scsi-pci is configured without the
"disable_legacy=on" option. If not explicitly set, the option will be
"disable_legacy=auto" and when not attached to PCIe (which is the case
for Proxmox VE currently), then legacy mode will be enabled.

> * Are there any working driver for virtio-scsi on this KVM using Q35 
> machine? i.e. any other OS

The virtio drivers for Windows and the ones in Linux work just fine with
our configuration.


> Any thoughts why these interrupts are not getting delivered on the PCIE bus?

We do not configure the virtio-scsi-pci on a PCIe bus currently, see my
initial answer.
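
For completeness, a rough sketch of the 'args' workaround mentioned above
(device and drive names are made-up examples, QEMU spells the property
'disable-legacy', and such a disk is invisible to Proxmox VE's own disk
handling):

    args: -drive file=/path/to/disk.raw,if=none,id=extdrive0,format=raw -device virtio-scsi-pci,id=extscsi0,disable-legacy=on -device scsi-hd,drive=extdrive0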

Best Regards,
Fiona


___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


[pve-devel] [PATCH docs] pve-network: correct language errors

2024-08-13 Thread Alexander Zeidler
Signed-off-by: Alexander Zeidler 
---
 pve-network.adoc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/pve-network.adoc b/pve-network.adoc
index acdcf39..434430d 100644
--- a/pve-network.adoc
+++ b/pve-network.adoc
@@ -447,7 +447,7 @@ slaves in the single logical bonded interface such that 
different
 network-peers use different MAC addresses for their network packet
 traffic.
 
-If your switch support the LACP (IEEE 802.3ad) protocol then we recommend using
+If your switch supports the LACP (IEEE 802.3ad) protocol then we recommend 
using
 the corresponding bonding mode (802.3ad). Otherwise you should generally use 
the
 active-backup mode.
 
@@ -490,7 +490,7 @@ iface vmbr0 inet static
 
 
 [thumbnail="default-network-setup-bond.svg"]
-Another possibility it to use the bond directly as bridge port.
+Another possibility is to use the bond directly as bridge port.
 This can be used to make the guest network fault-tolerant.
 
 .Example: Use a bond as bridge port
-- 
2.39.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [PATCH docs] local-zfs: correct language errors

2024-08-13 Thread Alexander Zeidler
Signed-off-by: Alexander Zeidler 
---
 local-zfs.adoc | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/local-zfs.adoc b/local-zfs.adoc
index 130f9d6..7ddf50a 100644
--- a/local-zfs.adoc
+++ b/local-zfs.adoc
@@ -442,7 +442,7 @@ recommend to:
 # zpool create -f -o ashift=12   log 
 
 
-In above example a single `` and a single `` is used, but 
you
+In the example above, a single `` and a single `` is used, 
but you
 can also combine this with other RAID variants, as described in the
 xref:sysadmin_zfs_create_new_zpool_raid0[Create a new pool with RAID] section.
 
@@ -465,19 +465,19 @@ protection that you want to use for improving the overall 
performance of your
 pool.
 
 As the maximum size of a log device should be about half the size of the
-installed physical memory, it means that the ZIL will mostly likely only take 
up
+installed physical memory, it means that the ZIL will most likely only take up
 a relatively small part of the SSD, the remaining space can be used as cache.
 
 First you have to create two GPT partitions on the SSD with `parted` or 
`gdisk`.
 
-Then you're ready to add them to an pool:
+Then you're ready to add them to a pool:
 
 .Add both, a separate log device and a second-level cache, to an existing pool
 
 # zpool add -f  log  cache 
 
 
-Just replay ``, `` and `` with the pool name
+Just replace ``, `` and `` with the pool name
 and the two `/dev/disk/by-id/` paths to the partitions.
 
 You can also add ZIL and cache separately.
@@ -527,8 +527,8 @@ process of the new disk has progressed.
 # proxmox-boot-tool init  [grub]
 
 
-NOTE: `ESP` stands for EFI System Partition, which is setup as partition #2 on
-bootable disks setup by the {pve} installer since version 5.4. For details, see
+NOTE: `ESP` stands for EFI System Partition, which is set up as partition #2 on
+bootable disks when using the {pve} installer since version 5.4. For details, 
see
 xref:sysboot_proxmox_boot_setup[Setting up a new partition for use as synced 
ESP].
 
 NOTE: Make sure to pass 'grub' as mode to `proxmox-boot-tool init` if
@@ -541,7 +541,7 @@ especially if Secure Boot is enabled!
 # grub-install 
 
 NOTE: Plain GRUB is only used on systems installed with {pve} 6.3 or earlier,
-which have not been manually migrated to using `proxmox-boot-tool` yet.
+which have not been manually migrated to use `proxmox-boot-tool` yet.
 
 
 Configure E-Mail Notification
-- 
2.39.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



Re: [pve-devel] [PATCH docs] pve-network: correct language errors

2024-08-13 Thread Lukas Wagner



On  2024-08-13 12:23, Alexander Zeidler wrote:
> Signed-off-by: Alexander Zeidler 
> ---
>  pve-network.adoc | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/pve-network.adoc b/pve-network.adoc
> index acdcf39..434430d 100644
> --- a/pve-network.adoc
> +++ b/pve-network.adoc
> @@ -447,7 +447,7 @@ slaves in the single logical bonded interface such that 
> different
>  network-peers use different MAC addresses for their network packet
>  traffic.
>  
> -If your switch support the LACP (IEEE 802.3ad) protocol then we recommend 
> using
> +If your switch supports the LACP (IEEE 802.3ad) protocol then we recommend 
> using
   ^
There should be a comma here: If ..., then ...

>  the corresponding bonding mode (802.3ad). Otherwise you should generally use 
> the
>  active-backup mode.
>  
> @@ -490,7 +490,7 @@ iface vmbr0 inet static
>  
>  
>  [thumbnail="default-network-setup-bond.svg"]
> -Another possibility it to use the bond directly as bridge port.
> +Another possibility is to use the bond directly as bridge port.
>  This can be used to make the guest network fault-tolerant.
>  
>  .Example: Use a bond as bridge port

Apart from this, this looks good to me:

Reviewed-by: Lukas Wagner 

-- 
- Lukas


___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [PATCH qemu v2 02/25] PVE backup: fixup error handling for fleecing

2024-08-13 Thread Fiona Ebner
The drained section needs to be terminated before breaking out of the
loop in the error scenarios. Otherwise, guest IO on the drive would
become stuck.

If the job is created successfully, then the job completion callback
will clean up the snapshot access block nodes. In case failure
happened before the job is created, there was no cleanup for the
snapshot access block nodes yet. Add it.

Signed-off-by: Fiona Ebner 
---

No changes in v2.

 pve-backup.c | 38 +-
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/pve-backup.c b/pve-backup.c
index 4e730aa3da..c4178758b3 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -357,22 +357,23 @@ static void coroutine_fn 
pvebackup_co_complete_stream(void *opaque)
 qemu_co_mutex_unlock(&backup_state.backup_mutex);
 }
 
+static void cleanup_snapshot_access(PVEBackupDevInfo *di)
+{
+if (di->fleecing.snapshot_access) {
+bdrv_unref(di->fleecing.snapshot_access);
+di->fleecing.snapshot_access = NULL;
+}
+if (di->fleecing.cbw) {
+bdrv_cbw_drop(di->fleecing.cbw);
+di->fleecing.cbw = NULL;
+}
+}
+
 static void pvebackup_complete_cb(void *opaque, int ret)
 {
 PVEBackupDevInfo *di = opaque;
 di->completed_ret = ret;
 
-/*
- * Handle block-graph specific cleanup (for fleecing) outside of the 
coroutine, because the work
- * won't be done as a coroutine anyways:
- * - For snapshot_access, allows doing bdrv_unref() directly. Doing it via 
bdrv_co_unref() would
- *   just spawn a BH calling bdrv_unref().
- * - For cbw, draining would need to spawn a BH.
- */
-if (di->fleecing.snapshot_access) {
-bdrv_unref(di->fleecing.snapshot_access);
-di->fleecing.snapshot_access = NULL;
-}
 if (di->fleecing.cbw) {
 /*
  * With fleecing, failure for cbw does not fail the guest write, but 
only sets the snapshot
@@ -383,10 +384,17 @@ static void pvebackup_complete_cb(void *opaque, int ret)
 if (di->completed_ret == -EACCES && snapshot_error) {
 di->completed_ret = snapshot_error;
 }
-bdrv_cbw_drop(di->fleecing.cbw);
-di->fleecing.cbw = NULL;
 }
 
+/*
+ * Handle block-graph specific cleanup (for fleecing) outside of the 
coroutine, because the work
+ * won't be done as a coroutine anyways:
+ * - For snapshot_access, allows doing bdrv_unref() directly. Doing it via 
bdrv_co_unref() would
+ *   just spawn a BH calling bdrv_unref().
+ * - For cbw, draining would need to spawn a BH.
+ */
+cleanup_snapshot_access(di);
+
 /*
  * Needs to happen outside of coroutine, because it takes the graph write 
lock.
  */
@@ -587,6 +595,7 @@ static void create_backup_jobs_bh(void *opaque) {
 if (!di->fleecing.cbw) {
 error_setg(errp, "appending cbw node for fleecing failed: %s",
local_err ? error_get_pretty(local_err) : "unknown 
error");
+bdrv_drained_end(di->bs);
 break;
 }
 
@@ -599,6 +608,8 @@ static void create_backup_jobs_bh(void *opaque) {
 if (!di->fleecing.snapshot_access) {
 error_setg(errp, "setting up snapshot access for fleecing 
failed: %s",
local_err ? error_get_pretty(local_err) : "unknown 
error");
+cleanup_snapshot_access(di);
+bdrv_drained_end(di->bs);
 break;
 }
 source_bs = di->fleecing.snapshot_access;
@@ -637,6 +648,7 @@ static void create_backup_jobs_bh(void *opaque) {
 }
 
 if (!job || local_err) {
+cleanup_snapshot_access(di);
 error_setg(errp, "backup_job_create failed: %s",
local_err ? error_get_pretty(local_err) : "null");
 break;
-- 
2.39.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [PATCH qemu v2 03/25] PVE backup: factor out setting up snapshot access for fleecing

2024-08-13 Thread Fiona Ebner
Avoids some line bloat in the create_backup_jobs_bh() function and is
in preparation for setting up the snapshot access independently of
fleecing, in particular that will be useful for providing access to
the snapshot via NBD.

Signed-off-by: Fiona Ebner 
---

No changes in v2.

 pve-backup.c | 95 
 1 file changed, 58 insertions(+), 37 deletions(-)

diff --git a/pve-backup.c b/pve-backup.c
index c4178758b3..051ebffe48 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -525,6 +525,62 @@ static int coroutine_fn pvebackup_co_add_config(
 goto out;
 }
 
+/*
+ * Setup a snapshot-access block node for a device with associated fleecing 
image.
+ */
+static int setup_snapshot_access(PVEBackupDevInfo *di, Error **errp)
+{
+Error *local_err = NULL;
+
+if (!di->fleecing.bs) {
+error_setg(errp, "no associated fleecing image");
+return -1;
+}
+
+QDict *cbw_opts = qdict_new();
+qdict_put_str(cbw_opts, "driver", "copy-before-write");
+qdict_put_str(cbw_opts, "file", bdrv_get_node_name(di->bs));
+qdict_put_str(cbw_opts, "target", bdrv_get_node_name(di->fleecing.bs));
+
+if (di->bitmap) {
+/*
+ * Only guest writes to parts relevant for the backup need to be 
intercepted with
+ * old data being copied to the fleecing image.
+ */
+qdict_put_str(cbw_opts, "bitmap.node", bdrv_get_node_name(di->bs));
+qdict_put_str(cbw_opts, "bitmap.name", 
bdrv_dirty_bitmap_name(di->bitmap));
+}
+/*
+ * Fleecing storage is supposed to be fast and it's better to break backup 
than guest
+ * writes. Certain guest drivers like VirtIO-win have 60 seconds timeout 
by default, so
+ * abort a bit before that.
+ */
+qdict_put_str(cbw_opts, "on-cbw-error", "break-snapshot");
+qdict_put_int(cbw_opts, "cbw-timeout", 45);
+
+di->fleecing.cbw = bdrv_insert_node(di->bs, cbw_opts, BDRV_O_RDWR, 
&local_err);
+
+if (!di->fleecing.cbw) {
+error_setg(errp, "appending cbw node for fleecing failed: %s",
+   local_err ? error_get_pretty(local_err) : "unknown error");
+return -1;
+}
+
+QDict *snapshot_access_opts = qdict_new();
+qdict_put_str(snapshot_access_opts, "driver", "snapshot-access");
+qdict_put_str(snapshot_access_opts, "file", 
bdrv_get_node_name(di->fleecing.cbw));
+
+di->fleecing.snapshot_access =
+bdrv_open(NULL, NULL, snapshot_access_opts, BDRV_O_RDWR | 
BDRV_O_UNMAP, &local_err);
+if (!di->fleecing.snapshot_access) {
+error_setg(errp, "setting up snapshot access for fleecing failed: %s",
+   local_err ? error_get_pretty(local_err) : "unknown error");
+return -1;
+}
+
+return 0;
+}
+
 /*
  * backup_job_create can *not* be run from a coroutine, so this can't either.
  * The caller is responsible that backup_mutex is held nonetheless.
@@ -569,49 +625,14 @@ static void create_backup_jobs_bh(void *opaque) {
 const char *job_id = bdrv_get_device_name(di->bs);
 bdrv_graph_co_rdunlock();
 if (di->fleecing.bs) {
-QDict *cbw_opts = qdict_new();
-qdict_put_str(cbw_opts, "driver", "copy-before-write");
-qdict_put_str(cbw_opts, "file", bdrv_get_node_name(di->bs));
-qdict_put_str(cbw_opts, "target", 
bdrv_get_node_name(di->fleecing.bs));
-
-if (di->bitmap) {
-/*
- * Only guest writes to parts relevant for the backup need to 
be intercepted with
- * old data being copied to the fleecing image.
- */
-qdict_put_str(cbw_opts, "bitmap.node", 
bdrv_get_node_name(di->bs));
-qdict_put_str(cbw_opts, "bitmap.name", 
bdrv_dirty_bitmap_name(di->bitmap));
-}
-/*
- * Fleecing storage is supposed to be fast and it's better to 
break backup than guest
- * writes. Certain guest drivers like VirtIO-win have 60 seconds 
timeout by default, so
- * abort a bit before that.
- */
-qdict_put_str(cbw_opts, "on-cbw-error", "break-snapshot");
-qdict_put_int(cbw_opts, "cbw-timeout", 45);
-
-di->fleecing.cbw = bdrv_insert_node(di->bs, cbw_opts, BDRV_O_RDWR, 
&local_err);
-
-if (!di->fleecing.cbw) {
-error_setg(errp, "appending cbw node for fleecing failed: %s",
-   local_err ? error_get_pretty(local_err) : "unknown 
error");
-bdrv_drained_end(di->bs);
-break;
-}
-
-QDict *snapshot_access_opts = qdict_new();
-qdict_put_str(snapshot_access_opts, "driver", "snapshot-access");
-qdict_put_str(snapshot_access_opts, "file", 
bdrv_get_node_name(di->fleecing.cbw));
-
-di->fleecing.snapshot_access =
-bdrv_open(NULL, NULL, snapshot_access_opts, BDRV_O_RDWR | 
BDRV_O_UNMAP, &lo

[pve-devel] [RFC storage v2 11/25] extract backup config: delegate to backup provider if there is one

2024-08-13 Thread Fiona Ebner
Signed-off-by: Fiona Ebner 
---

Changes in v2:
* Adapt to method rename.

 src/PVE/Storage.pm | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm
index aea57ab..8993ba7 100755
--- a/src/PVE/Storage.pm
+++ b/src/PVE/Storage.pm
@@ -1726,6 +1726,16 @@ sub extract_vzdump_config {
storage_check_enabled($cfg, $storeid);
return PVE::Storage::PBSPlugin->extract_vzdump_config($scfg, 
$volname, $storeid);
}
+
+   my $plugin = PVE::Storage::Plugin->lookup($scfg->{type});
+   my $log_function = sub {
+   my ($log_level, $message) = @_;
+   my $prefix = $log_level eq 'err' ? 'ERROR' : uc($log_level);
+   print "$prefix: $message\n";
+   };
+   if (my $backup_provider = $plugin->new_backup_provider($scfg, $storeid, 
$log_function)) {
+   return $backup_provider->restore_get_guest_config($volname, 
$storeid);
+   }
 }
 
 my $archive = abs_filesystem_path($cfg, $volid);
-- 
2.39.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [PATCH qemu v2 04/25] PVE backup: save device name in device info structure

2024-08-13 Thread Fiona Ebner
The device name needs to be queried while holding the graph read lock
and since it doesn't change during the whole operation, just get it
once during setup and avoid the need to query it again in different
places.

Also in preparation to use it more often in error messages and for the
upcoming external backup access API.

Signed-off-by: Fiona Ebner 
---

No changes in v2.

 pve-backup.c | 29 +++--
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/pve-backup.c b/pve-backup.c
index 051ebffe48..33c23e53c2 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -94,6 +94,7 @@ typedef struct PVEBackupDevInfo {
 size_t size;
 uint64_t block_size;
 uint8_t dev_id;
+char* device_name;
 int completed_ret; // INT_MAX if not completed
 BdrvDirtyBitmap *bitmap;
 BlockDriverState *target;
@@ -327,6 +328,8 @@ static void coroutine_fn pvebackup_co_complete_stream(void 
*opaque)
 }
 
 di->bs = NULL;
+g_free(di->device_name);
+di->device_name = NULL;
 
 assert(di->target == NULL);
 
@@ -621,9 +624,6 @@ static void create_backup_jobs_bh(void *opaque) {
 
 BlockDriverState *source_bs = di->bs;
 bool discard_source = false;
-bdrv_graph_co_rdlock();
-const char *job_id = bdrv_get_device_name(di->bs);
-bdrv_graph_co_rdunlock();
 if (di->fleecing.bs) {
 if (setup_snapshot_access(di, &local_err) < 0) {
 error_setg(errp, "setting up snapshot access for fleecing 
failed: %s",
@@ -654,7 +654,7 @@ static void create_backup_jobs_bh(void *opaque) {
 }
 
 BlockJob *job = backup_job_create(
-job_id, source_bs, di->target, backup_state.speed, sync_mode, 
di->bitmap,
+di->device_name, source_bs, di->target, backup_state.speed, 
sync_mode, di->bitmap,
 bitmap_mode, false, discard_source, NULL, &perf, 
BLOCKDEV_ON_ERROR_REPORT,
 BLOCKDEV_ON_ERROR_REPORT, JOB_DEFAULT, pvebackup_complete_cb, di, 
backup_state.txn,
 &local_err);
@@ -751,6 +751,7 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
 }
 PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
 di->bs = bs;
+di->device_name = g_strdup(bdrv_get_device_name(bs));
 
 if (fleecing && device_uses_fleecing(*d)) {
 g_autofree gchar *fleecing_devid = g_strconcat(*d, 
"-fleecing", NULL);
@@ -789,6 +790,7 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
 
 PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
 di->bs = bs;
+di->device_name = g_strdup(bdrv_get_device_name(bs));
 di_list = g_list_append(di_list, di);
 }
 }
@@ -956,9 +958,6 @@ UuidInfo coroutine_fn *qmp_backup(
 
 di->block_size = dump_cb_block_size;
 
-bdrv_graph_co_rdlock();
-const char *devname = bdrv_get_device_name(di->bs);
-bdrv_graph_co_rdunlock();
 PBSBitmapAction action = PBS_BITMAP_ACTION_NOT_USED;
 size_t dirty = di->size;
 
@@ -973,7 +972,8 @@ UuidInfo coroutine_fn *qmp_backup(
 }
 action = PBS_BITMAP_ACTION_NEW;
 } else {
-expect_only_dirty = proxmox_backup_check_incremental(pbs, 
devname, di->size) != 0;
+expect_only_dirty =
+proxmox_backup_check_incremental(pbs, di->device_name, 
di->size) != 0;
 }
 
 if (expect_only_dirty) {
@@ -997,7 +997,8 @@ UuidInfo coroutine_fn *qmp_backup(
 }
 }
 
-int dev_id = proxmox_backup_co_register_image(pbs, devname, 
di->size, expect_only_dirty, errp);
+int dev_id = proxmox_backup_co_register_image(pbs, 
di->device_name, di->size,
+  expect_only_dirty, 
errp);
 if (dev_id < 0) {
 goto err_mutex;
 }
@@ -1009,7 +1010,7 @@ UuidInfo coroutine_fn *qmp_backup(
 di->dev_id = dev_id;
 
 PBSBitmapInfo *info = g_malloc(sizeof(*info));
-info->drive = g_strdup(devname);
+info->drive = g_strdup(di->device_name);
 info->action = action;
 info->size = di->size;
 info->dirty = dirty;
@@ -1034,10 +1035,7 @@ UuidInfo coroutine_fn *qmp_backup(
 goto err_mutex;
 }
 
-bdrv_graph_co_rdlock();
-const char *devname = bdrv_get_device_name(di->bs);
-bdrv_graph_co_rdunlock();
-di->dev_id = vma_writer_register_stream(vmaw, devname, di->size);
+di->dev_id = vma_writer_register_stream(vmaw, di->device_name, 
di->size);
 if (di->dev_id <= 0) {
 error_set(errp, ERROR_CLASS_GENERIC_ERROR,
   "register_stream failed");
@@ -1148,6 +1146,9 @@ err:
 

[pve-devel] [PATCH qemu-server v2 14/25] move nbd_stop helper to QMPHelpers module

2024-08-13 Thread Fiona Ebner
Like this nbd_stop() can be called from a module that cannot include
QemuServer.pm.

Signed-off-by: Fiona Ebner 
---

No changes in v2.

 PVE/API2/Qemu.pm | 3 ++-
 PVE/CLI/qm.pm| 3 ++-
 PVE/QemuServer.pm| 6 --
 PVE/QemuServer/QMPHelpers.pm | 6 ++
 4 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index d25a79fe..212cfc1b 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -35,6 +35,7 @@ use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::Machine;
 use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::QemuServer::PCI;
+use PVE::QemuServer::QMPHelpers;
 use PVE::QemuServer::USB;
 use PVE::QemuMigrate;
 use PVE::RPCEnvironment;
@@ -5910,7 +5911,7 @@ __PACKAGE__->register_method({
return;
},
'nbdstop' => sub {
-   PVE::QemuServer::nbd_stop($state->{vmid});
+   PVE::QemuServer::QMPHelpers::nbd_stop($state->{vmid});
return;
},
'resume' => sub {
diff --git a/PVE/CLI/qm.pm b/PVE/CLI/qm.pm
index d3dbf7b4..8349997e 100755
--- a/PVE/CLI/qm.pm
+++ b/PVE/CLI/qm.pm
@@ -35,6 +35,7 @@ use PVE::QemuServer::Agent qw(agent_available);
 use PVE::QemuServer::ImportDisk;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::OVF;
+use PVE::QemuServer::QMPHelpers;
 use PVE::QemuServer;
 
 use PVE::CLIHandler;
@@ -385,7 +386,7 @@ __PACKAGE__->register_method ({
 
my $vmid = $param->{vmid};
 
-   eval { PVE::QemuServer::nbd_stop($vmid) };
+   eval { PVE::QemuServer::QMPHelpers::nbd_stop($vmid) };
warn $@ if $@;
 
return;
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index b26da505..e5ff5efb 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -8548,12 +8548,6 @@ sub generate_smbios1_uuid {
 return "uuid=".generate_uuid();
 }
 
-sub nbd_stop {
-my ($vmid) = @_;
-
-mon_cmd($vmid, 'nbd-server-stop', timeout => 25);
-}
-
 sub create_reboot_request {
 my ($vmid) = @_;
 open(my $fh, '>', "/run/qemu-server/$vmid.reboot")
diff --git a/PVE/QemuServer/QMPHelpers.pm b/PVE/QemuServer/QMPHelpers.pm
index 0269ea46..826938de 100644
--- a/PVE/QemuServer/QMPHelpers.pm
+++ b/PVE/QemuServer/QMPHelpers.pm
@@ -15,6 +15,12 @@ qemu_objectadd
 qemu_objectdel
 );
 
+sub nbd_stop {
+my ($vmid) = @_;
+
+mon_cmd($vmid, 'nbd-server-stop', timeout => 25);
+}
+
 sub qemu_deviceadd {
 my ($vmid, $devicefull) = @_;
 
-- 
2.39.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [PATCH qemu-server v2 17/25] backup: keep track of block-node size instead of volume size

2024-08-13 Thread Fiona Ebner
For fleecing, the size needs to match exactly with what QEMU sees. In
particular, EFI disks might be attached with a 'size=' option, meaning
that size can be different from the volume's size. Commit 36377acf
("backup: disk info: also keep track of size") introduced size
tracking and it was only used for fleecing since then, so remove the
existing 'size' key in the device info hash and replace it with an
explicit 'block-node-size' for clarity.

Should also help with the following issue reported in the community
forum:
https://forum.proxmox.com/threads/152202

Fixes: 36377acf ("backup: disk info: also keep track of size")
Signed-off-by: Fiona Ebner 
---

New in v2.

 PVE/VZDump/QemuServer.pm | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
index c46e607c..98685127 100644
--- a/PVE/VZDump/QemuServer.pm
+++ b/PVE/VZDump/QemuServer.pm
@@ -106,6 +106,9 @@ sub prepare {
 
 PVE::Storage::activate_volumes($self->{storecfg}, $vollist);
 
+my $block_info = mon_cmd($vmid, "query-block");
+$block_info = { map { $_->{device} => $_ } $block_info->@* };
+
 foreach my $ds (sort keys %$drivehash) {
my $drive = $drivehash->{$ds};
 
@@ -133,11 +136,22 @@ sub prepare {
die "cannot determine size and format of volume '$volid' - $@\n" if 
$@;
}
 
+   # The size for fleecing images needs to be exactly the same size as 
QEMU sees. E.g. EFI disk
+   # can be attached with a smaller size then the underyling image on the 
storage.
+   my $block_node_size =
+   eval { 
$block_info->{"drive-$ds"}->{inserted}->{image}->{'virtual-size'}; };
+   if (!$block_node_size) {
+   # TPM state is not attached yet and will be attached with same 
size, so don't warn then.
+   $self->loginfo("could not determine block node size of drive '$ds' 
- using fallback")
+   if $ds !~ m/^tpmstate\d+/;
+   $block_node_size = $size;
+   }
+
my $diskinfo = {
path => $path,
volid => $volid,
storeid => $storeid,
-   size => $size,
+   'block-node-size' => $block_node_size,
format => $format,
virtdev => $ds,
qmdevice => "drive-$ds",
@@ -551,7 +565,7 @@ my sub allocate_fleecing_images {
my $name = "vm-$vmid-fleece-$n";
$name .= ".$format" if $scfg->{path};
 
-   my $size = PVE::Tools::convert_size($di->{size}, 'b' => 'kb');
+   my $size = PVE::Tools::convert_size($di->{'block-node-size'}, 
'b' => 'kb');
 
$di->{'fleece-volid'} = PVE::Storage::vdisk_alloc(
$self->{storecfg}, $fleecing_storeid, $vmid, $format, 
$name, $size);
@@ -600,7 +614,7 @@ my sub attach_fleecing_images {
my $drive = 
"file=$path,if=none,id=$devid,format=$format,discard=unmap";
# Specify size explicitly, to make it work if storage backend 
rounded up size for
# fleecing image when allocating.
-   $drive .= ",size=$di->{size}" if $format eq 'raw';
+   $drive .= ",size=$di->{'block-node-size'}" if $format eq 'raw';
$drive =~ s/\\//g;
my $ret = PVE::QemuServer::Monitor::hmp_cmd($vmid, "drive_add auto 
\"$drive\"", 60);
die "attaching fleecing image $volid failed - $ret\n" if $ret !~ 
m/OK/s;
-- 
2.39.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [RFC container v2 22/25] backup: implement backup for external providers

2024-08-13 Thread Fiona Ebner
The filesystem structure is made available as a directory in a
consistent manner (with details depending on the vzdump backup mode)
just like for regular backup via tar.

The backup provider needs to back up the guest and firewall
configuration and then the filesystem structure, honoring the ID maps
(for unprivileged containers) as well as file exclusions and the
bandwidth limit.

Signed-off-by: Fiona Ebner 
---

Changes in v2:
* Adapt to API changes.

 src/PVE/VZDump/LXC.pm | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/src/PVE/VZDump/LXC.pm b/src/PVE/VZDump/LXC.pm
index 67d13db..0fc2a94 100644
--- a/src/PVE/VZDump/LXC.pm
+++ b/src/PVE/VZDump/LXC.pm
@@ -373,7 +373,27 @@ sub archive {
 my $userns_cmd = $task->{userns_cmd};
 my $findexcl = $self->{vzdump}->{findexcl};
 
-if ($self->{vzdump}->{opts}->{pbs}) {
+if (my $backup_provider = $self->{vzdump}->{'backup-provider'}) {
+   $self->loginfo("starting external backup via " . 
$backup_provider->provider_name());
+
+   my ($mechanism) = $backup_provider->backup_get_mechanism($vmid, 'lxc');
+   die "mechanism '$mechanism' requested by backup provider is not 
supported for containers\n"
+   if $mechanism ne 'directory';
+
+   my $config_file = "$tmpdir/etc/vzdump/pct.conf";
+   my $firewall_file = "$tmpdir/etc/vzdump/pct.fw";
+
+
+   my $conf = PVE::LXC::Config->load_config($vmid);
+   my ($id_map, undef, undef) = PVE::LXC::parse_id_maps($conf);
+   my $info = {
+   directory => $snapdir,
+   sources => [@sources],
+   };
+   $info->{'firewall-config'} = $firewall_file if -e $firewall_file;
+   $info->{'bandwidth-limit'} = $opts->{bwlimit} * 1024 if 
$opts->{bwlimit};
+   $backup_provider->backup_container($vmid, $config_file, $id_map, 
$findexcl, $info);
+} elsif ($self->{vzdump}->{opts}->{pbs}) {
 
my $param = [];
push @$param, "pct.conf:$tmpdir/etc/vzdump/pct.conf";
-- 
2.39.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [RFC qemu/storage/qemu-server/container/manager v2 00/25] backup provider API

2024-08-13 Thread Fiona Ebner
Changes in v2:
* Add 'block-device' backup mechanism for VMs. The NBD export is
  mounted by Proxmox VE and only the block device path (as well as a
  callback to get the next dirty range for bitmaps) is passed to the
  backup provider.
* Add POC example for Borg - note that I tested with borg 1.2.4 in
  Debian and only tested with a local repository, not SSH yet.
* Merge hook API into a single function for backup and for jobs.
* Add restore_vm_init() and restore_vm_cleanup() for better
  flexibility to allow preparing the whole restore. The question is
  whether restore_vm_volume_init() and restore_vm_volume_cleanup() should
  be dropped (but certain providers might prefer using only those).
  Having both is more flexible, but makes the API longer of course.
* Switch to backup_vm() (was per-volume backup_vm_volume() before) and
  backup_container(), passing along the configuration files, rather
  than having dedicated methods for the configuration files, for
  giving the backup provider more flexibility.
* Some renames in API methods/params to improve clarity.
* Pass backup time to backup 'start' hook and use that in the
  directory example rather than the job start time.
* Use POD for base plugin documentation and flesh out documentation.
* Use 'BackupProvider::Plugin::' namespace.
* Various smaller improvements in the directory provider example.

==

A backup provider needs to implement a storage plugin as well as a
backup provider plugin. The storage plugin is for integration in
Proxmox VE's front-end, so users can manage the backups via
UI/API/CLI. The backup provider plugin is for interfacing with the
backup provider's backend to integrate backup and restore with that
backend into Proxmox VE.

This is an initial draft of an API and required changes to the backup
stack in Proxmox VE to make it work. Depending on feedback from other
developers and interested parties, it can still substantially change.

==

The backup provider API is split into two parts, both of which again
need different implementations for VM and LXC guests:

1. Backup API

There are two hook callback functions, namely:
1. job_hook() is called during the start/end/abort phases of the
   whole backup job.
2. backup_hook() is called during the start/end/abort phases of the
   backup of an individual guest.

The backup_get_mechanism() method is used to decide on the backup
mechanism. Currently, 'block-device' or 'nbd' for VMs, and 'directory'
for containers are possible. The method also lets the plugin indicate
whether to use a bitmap for incremental VM backup or not. It is enough
to implement one mechanism for VMs and one mechanism for containers.

Next, there are methods for backing up the guest's configuration and
data, backup_vm() for VM backup and backup_container() for container
backup.

Finally, some helpers like getting the provider name or volume ID for
the backup target, as well as for handling the backup log.

1.1 Backup Mechanisms

VM:

Access to the data on the VM's disk from the time the backup started
is made available via a so-called "snapshot access". This is either
the full image, or in case a bitmap is used, the dirty parts of the
image since the last time the bitmap was used for a successful backup.
Reading outside of the dirty parts will result in an error. After
backing up each part of the disk, it should be discarded in the export
to avoid unnecessary space usage on the Proxmox VE side (there is an
associated fleecing image).

VM mechanism 'block-device':

The snapshot access is exposed as a block device. If used, a bitmap is
passed along.

VM mechanism 'nbd':

The snapshot access and, if used, bitmap are exported via NBD.

Container mechanism 'directory':

A copy or snapshot of the container's filesystem state is made
available as a directory.

2. Restore API

The restore_get_mechanism() method is used to decide on the restore
mechanism. Currently, 'qemu-img' for VMs, and 'directory' or 'tar' for
containers are possible. It is enough to implement one mechanism for
VMs and one mechanism for containers.

Next, there are methods for extracting the guest and firewall configuration,
and the restore mechanisms are implemented via a pair of methods: an
init method for making the data available to Proxmox VE, and a cleanup
method that is called after restore.

For VMs, there also is a restore_vm_get_device_info() helper required,
to get the disks included in the backup and their sizes.

2.1. Restore Mechanisms

VM mechanism 'qemu-img':

The backup provider gives a path to the disk image that will be
restored. The path needs to be something 'qemu-img' can deal with,
e.g. can also be an NBD URI or similar.

Container mechanism 'directory':

The backup provider gives the path to a directory with the full
filesystem structure of the container.

Container mechanism 'tar':

The backup provider gives the path to a (potentially compressed) tar
archive with the full filesystem structure of the container.

See the PVE::BackupProvi

[pve-devel] [RFC qemu-server v2 21/25] backup: implement restore for external providers

2024-08-13 Thread Fiona Ebner
First, the provider is asked about what restore mechanism to use.
Currently, only 'qemu-img' is possible. Then the configuration files
are restored, the provider gives information about volumes contained
in the backup and finally the volumes are restored via
'qemu-img convert'.
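
(Conceptually, for each volume reported by the provider this boils down to an
invocation along the lines of the following; source and target paths are
made-up examples:)

    qemu-img convert -p -O raw /path/given/by/provider/vm-disk-0.qcow2 /dev/zvol/rpool/data/vm-100-disk-0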

The code for the restore_external_archive() function was copied and
adapted from the restore_proxmox_backup_archive() function. Together
with restore_vma_archive() it seems sensible to extract the common
parts and use a dedicated module for restore code.

The parse_restore_archive() helper was renamed, because it's not just
parsing.

Signed-off-by: Fiona Ebner 
---

Changes in v2:
* Adapt to API changes.

 PVE/API2/Qemu.pm  |  29 +-
 PVE/QemuServer.pm | 139 ++
 2 files changed, 165 insertions(+), 3 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 212cfc1b..8917905e 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -845,7 +845,7 @@ __PACKAGE__->register_method({
return $res;
 }});
 
-my $parse_restore_archive = sub {
+my $classify_restore_archive = sub {
 my ($storecfg, $archive) = @_;
 
 my ($archive_storeid, $archive_volname) = 
PVE::Storage::parse_volume_id($archive, 1);
@@ -859,6 +859,21 @@ my $parse_restore_archive = sub {
$res->{type} = 'pbs';
return $res;
}
+   my $log_function = sub {
+   my ($log_level, $message) = @_;
+   my $prefix = $log_level eq 'err' ? 'ERROR' : uc($log_level);
+   print "$prefix: $message\n";
+   };
+   my $backup_provider = PVE::Storage::new_backup_provider(
+   $storecfg,
+   $archive_storeid,
+   $log_function,
+   );
+   if ($backup_provider) {
+   $res->{type} = 'external';
+   $res->{'backup-provider'} = $backup_provider;
+   return $res;
+   }
 }
 my $path = PVE::Storage::abs_filesystem_path($storecfg, $archive);
 $res->{type} = 'file';
@@ -1011,7 +1026,7 @@ __PACKAGE__->register_method({
'backup',
);
 
-   $archive = $parse_restore_archive->($storecfg, $archive);
+   $archive = $classify_restore_archive->($storecfg, $archive);
}
}
 
@@ -1069,7 +1084,15 @@ __PACKAGE__->register_method({
PVE::QemuServer::check_restore_permissions($rpcenv, 
$authuser, $merged);
}
}
-   if ($archive->{type} eq 'file' || $archive->{type} eq 'pipe') {
+   if (my $backup_provider = $archive->{'backup-provider'}) {
+   PVE::QemuServer::restore_external_archive(
+   $backup_provider,
+   $archive->{volid},
+   $vmid,
+   $authuser,
+   $restore_options,
+   );
+   } elsif ($archive->{type} eq 'file' || $archive->{type} eq 
'pipe') {
die "live-restore is only compatible with backup images 
from a Proxmox Backup Server\n"
if $live_restore;
PVE::QemuServer::restore_file_archive($archive->{path} // 
'-', $vmid, $authuser, $restore_options);
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 37f56f69..6cd21b7d 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -7245,6 +7245,145 @@ sub restore_proxmox_backup_archive {
 }
 }
 
+sub restore_external_archive {
+my ($backup_provider, $archive, $vmid, $user, $options) = @_;
+
+die "live restore from backup provider is not implemented\n" if 
$options->{live};
+
+my $storecfg = PVE::Storage::config();
+
+my ($storeid, $volname) = PVE::Storage::parse_volume_id($archive);
+my $scfg = PVE::Storage::storage_config($storecfg, $storeid);
+
+my $tmpdir = "/var/tmp/vzdumptmp$$";
+rmtree $tmpdir;
+mkpath $tmpdir;
+
+my $conffile = PVE::QemuConfig->config_file($vmid);
+# disable interrupts (always do cleanups)
+local $SIG{INT} =
+   local $SIG{TERM} =
+   local $SIG{QUIT} =
+   local $SIG{HUP} = sub { print STDERR "got interrupt - ignored\n"; };
+
+# Note: $oldconf is undef if VM does not exists
+my $cfs_path = PVE::QemuConfig->cfs_config_path($vmid);
+my $oldconf = PVE::Cluster::cfs_read_file($cfs_path);
+my $new_conf_raw = '';
+
+my $rpcenv = PVE::RPCEnvironment::get();
+my $devinfo = {}; # info about drives included in backup
+my $virtdev_hash = {}; # info about allocated drives
+
+eval {
+   # enable interrupts
+   local $SIG{INT} =
+   local $SIG{TERM} =
+   local $SIG{QUIT} =
+   local $SIG{HUP} =
+   local $SIG{PIPE} = sub { die "interrupted by signal\n"; };
+
+   my $cfgfn = "$tmpdir/qemu-server.conf";
+   my $firewall_config_fn = "$tmpdir/fw.conf";
+
+   my $cmd = "restore";
+
+   my ($mechanism, $vmtype) =
+   $backup

[pve-devel] [RFC manager v2 25/25] backup: implement backup for external providers

2024-08-13 Thread Fiona Ebner
Hooks from the backup provider are called during start/end/abort for
both job and backup. It is also necessary to adapt some log messages
and to special-case some things, as is already done for PBS, e.g. log
file handling.

Signed-off-by: Fiona Ebner 
---

Changes in v2:
* Adapt to API changes.

 PVE/VZDump.pm   | 62 -
 test/vzdump_new_test.pl |  3 ++
 2 files changed, 58 insertions(+), 7 deletions(-)

diff --git a/PVE/VZDump.pm b/PVE/VZDump.pm
index f1a6b220..2b3e7b1c 100644
--- a/PVE/VZDump.pm
+++ b/PVE/VZDump.pm
@@ -206,7 +206,15 @@ sub storage_info {
 $info->{'prune-backups'} = 
PVE::JSONSchema::parse_property_string('prune-backups', 
$scfg->{'prune-backups'})
if defined($scfg->{'prune-backups'});
 
-if ($type eq 'pbs') {
+my $backup_provider = PVE::Storage::new_backup_provider(
+   $cfg,
+   $storage,
+   sub { debugmsg($_[0], $_[1]); },
+);
+
+if ($backup_provider) {
+   $info->{'backup-provider'} = $backup_provider
+} elsif ($type eq 'pbs') {
$info->{pbs} = 1;
 } else {
$info->{dumpdir} = PVE::Storage::get_backup_dir($cfg, $storage);
@@ -706,6 +714,7 @@ sub new {
$opts->{scfg} = $info->{scfg};
$opts->{pbs} = $info->{pbs};
$opts->{'prune-backups'} //= $info->{'prune-backups'};
+   $self->{'backup-provider'} = $info->{'backup-provider'} if 
$info->{'backup-provider'};
}
 } elsif ($opts->{dumpdir}) {
$add_error->("dumpdir '$opts->{dumpdir}' does not exist")
@@ -990,7 +999,7 @@ sub exec_backup_task {
}
}
 
-   if (!$self->{opts}->{pbs}) {
+   if (!$self->{opts}->{pbs} && !$self->{'backup-provider'}) {
$task->{logfile} = "$opts->{dumpdir}/$basename.log";
}
 
@@ -1000,7 +1009,11 @@ sub exec_backup_task {
$ext .= ".${comp_ext}";
}
 
-   if ($self->{opts}->{pbs}) {
+   if ($self->{'backup-provider'}) {
+   die "unable to pipe backup to stdout\n" if $opts->{stdout};
+   $task->{target} = 
$self->{'backup-provider'}->backup_get_archive_name(
+   $vmid, $vmtype, $task->{backup_time});
+   } elsif ($self->{opts}->{pbs}) {
die "unable to pipe backup to stdout\n" if $opts->{stdout};
$task->{target} = $pbs_snapshot_name;
} else {
@@ -1018,7 +1031,7 @@ sub exec_backup_task {
my $pid = $$;
if ($opts->{tmpdir}) {
$task->{tmpdir} = "$opts->{tmpdir}/vzdumptmp${pid}_$vmid/";
-   } elsif ($self->{opts}->{pbs}) {
+   } elsif ($self->{opts}->{pbs} || $self->{'backup-provider'}) {
$task->{tmpdir} = "/var/tmp/vzdumptmp${pid}_$vmid";
} else {
# dumpdir is posix? then use it as temporary dir
@@ -1090,6 +1103,10 @@ sub exec_backup_task {
if ($mode eq 'stop') {
$plugin->prepare ($task, $vmid, $mode);
 
+   if ($self->{'backup-provider'}) {
+   $self->{'backup-provider'}->backup_hook(
+   'start', $vmid, $vmtype, { 'start-time' => 
$task->{backup_time} });
+   }
$self->run_hook_script ('backup-start', $task, $logfd);
 
if ($running) {
@@ -1104,6 +1121,10 @@ sub exec_backup_task {
} elsif ($mode eq 'suspend') {
$plugin->prepare ($task, $vmid, $mode);
 
+   if ($self->{'backup-provider'}) {
+   $self->{'backup-provider'}->backup_hook(
+   'start', $vmid, $vmtype, { 'start-time' => 
$task->{backup_time} });
+   }
$self->run_hook_script ('backup-start', $task, $logfd);
 
if ($vmtype eq 'lxc') {
@@ -1130,6 +1151,10 @@ sub exec_backup_task {
}
 
} elsif ($mode eq 'snapshot') {
+   if ($self->{'backup-provider'}) {
+   $self->{'backup-provider'}->backup_hook(
+   'start', $vmid, $vmtype, { 'start-time' => 
$task->{backup_time} });
+   }
$self->run_hook_script ('backup-start', $task, $logfd);
 
my $snapshot_count = $task->{snapshot_count} || 0;
@@ -1172,11 +1197,13 @@ sub exec_backup_task {
return;
}
 
-   my $archive_txt = $self->{opts}->{pbs} ? 'Proxmox Backup Server' : 
'vzdump';
+   my $archive_txt = 'vzdump';
+   $archive_txt = 'Proxmox Backup Server' if $self->{opts}->{pbs};
+   $archive_txt = $self->{'backup-provider'}->provider_name() if 
$self->{'backup-provider'};
debugmsg('info', "creating $archive_txt archive '$task->{target}'", 
$logfd);
$plugin->archive($task, $vmid, $task->{tmptar}, $comp);
 
-   if ($self->{opts}->{pbs}) {
+   if ($self->{'backup-provider'} || $self->{opts}->{pbs}) {
# size is added to task struct in guest vzdump plugins
} else {
rename ($task->{tmptar}, $task->{target}) ||
@@ -1190,7 +1217,8 @@ sub exec_backup_task {
 
# Mark as protected before pruning.
if (my 

[pve-devel] [POC storage v2 12/25] add backup provider example

2024-08-13 Thread Fiona Ebner
The example uses a simple directory structure to save the backups,
grouped by guest ID. VM backups are saved as configuration files and
qcow2 images, with backing files when doing incremental backups.
Container backups are saved as configuration files and a tar file or
squashfs image (added to test the 'directory' restore mechanism).

Whether to use incremental VM backups and which backup mechanisms to
use can be configured in the storage configuration.

The 'nbdinfo' binary from the 'libnbd-bin' package is required for
the 'nbd' backup mechanism for VM backups, and the 'mksquashfs' binary
from the 'squashfs-tools' package is required for the 'squashfs'
backup mechanism for containers.

Signed-off-by: Fiona Ebner 
---

Changes in v2:
* Adapt to API changes.
* Add log function helpers.
* Use make_path and remove_tree instead of deprecated variants.
* Add support for 'block-device' backup mechanism for VMs.
* Use backup time for guest rather than the job start time in the
  archive name.

 .../BackupProvider/Plugin/DirectoryExample.pm | 694 ++
 src/PVE/BackupProvider/Plugin/Makefile|   2 +-
 .../Custom/BackupProviderDirExamplePlugin.pm  | 306 
 src/PVE/Storage/Custom/Makefile   |   5 +
 src/PVE/Storage/Makefile  |   1 +
 5 files changed, 1007 insertions(+), 1 deletion(-)
 create mode 100644 src/PVE/BackupProvider/Plugin/DirectoryExample.pm
 create mode 100644 src/PVE/Storage/Custom/BackupProviderDirExamplePlugin.pm
 create mode 100644 src/PVE/Storage/Custom/Makefile

diff --git a/src/PVE/BackupProvider/Plugin/DirectoryExample.pm 
b/src/PVE/BackupProvider/Plugin/DirectoryExample.pm
new file mode 100644
index 000..b01ad02
--- /dev/null
+++ b/src/PVE/BackupProvider/Plugin/DirectoryExample.pm
@@ -0,0 +1,694 @@
+package PVE::BackupProvider::Plugin::DirectoryExample;
+
+use strict;
+use warnings;
+
+use Fcntl qw(SEEK_SET);
+use File::Path qw(make_path remove_tree);
+use IO::File;
+use IPC::Open3;
+
+use PVE::Storage::Plugin;
+use PVE::Tools qw(file_get_contents file_read_firstline file_set_contents 
run_command);
+
+use base qw(PVE::BackupProvider::Plugin::Base);
+
+use constant {
+BLKDISCARD => 0x1277, # see linux/fs.h
+};
+
+# Private helpers
+
+my sub log_info {
+my ($self, $message) = @_;
+
+$self->{'log-function'}->('info', $message);
+}
+
+my sub log_warning {
+my ($self, $message) = @_;
+
+$self->{'log-function'}->('warn', $message);
+}
+
+my sub log_error {
+my ($self, $message) = @_;
+
+$self->{'log-function'}->('err', $message);
+}
+
+# Try to use the same bitmap ID as last time for incremental backup if the 
storage is configured for
+# incremental VM backup. Need to start fresh if there is no previous ID or the 
associated backup
+# doesn't exist.
+my sub get_bitmap_id {
+my ($self, $vmid, $vmtype) = @_;
+
+return if $self->{'storage-plugin'}->get_vm_backup_mode($self->{scfg}) ne 
'incremental';
+
+my $previous_info_dir = "$self->{scfg}->{path}/$vmid/";
+
+my $previous_info_file = "$previous_info_dir/previous-info";
+my $info = file_read_firstline($previous_info_file) // '';
+$self->{$vmid}->{'old-previous-info'} = $info;
+my ($bitmap_id, $previous_backup_id) = $info =~ m/^(\d+)\s+(\d+)$/;
+my $previous_backup_dir =
+   $previous_backup_id ? 
"$self->{scfg}->{path}/$vmid/$vmtype-$previous_backup_id" : undef;
+
+if ($bitmap_id && -d $previous_backup_dir) {
+   $self->{$vmid}->{'previous-backup-dir'} = $previous_backup_dir;
+} else {
+   # need to start fresh if there is no previous ID or the associated 
backup doesn't exist
+   $bitmap_id = $self->{$vmid}->{'backup-time'};
+}
+
+$self->{$vmid}->{'bitmap-id'} = $bitmap_id;
+make_path($previous_info_dir);
+die "unable to create directory $previous_info_dir\n" if !-d 
$previous_info_dir;
+file_set_contents($previous_info_file, "$bitmap_id 
$self->{$vmid}->{'backup-time'}");
+
+return $bitmap_id;
+}
+
+# Backup Provider API
+
+sub new {
+my ($class, $storage_plugin, $scfg, $storeid, $log_function) = @_;
+
+my $self = bless {
+   scfg => $scfg,
+   storeid => $storeid,
+   'storage-plugin' => $storage_plugin,
+   'log-function' => $log_function,
+}, $class;
+
+return $self;
+}
+
+sub provider_name {
+my ($self) = @_;
+
+return 'dir provider example';
+}
+
+# Hooks
+
+my sub job_start {
+my ($self, $start_time) = @_;
+
+log_info($self, "job start hook called");
+
+run_command(["modprobe", "nbd"]);
+
+log_info($self, "backup provider initialized successfully for new job 
$start_time");
+}
+
+sub job_hook {
+my ($self, $phase, $info) = @_;
+
+if ($phase eq 'start') {
+   job_start($self, $info->{'start-time'});
+} elsif ($phase eq 'end') {
+   log_info($self, "job end hook called");
+} elsif ($phase eq 'abort') {
+   log_info($self, "job abort hook called with error - $info->{error}");
+}
+
+# ignore unknown p

[pve-devel] [RFC storage v2 10/25] plugin: introduce new_backup_provider() method

2024-08-13 Thread Fiona Ebner
The new_backup_provider() method can be used by storage plugins for
external backup providers. If the method returns a provider, Proxmox
VE will use callbacks to that provider for backups and restore instead
of using its usual backup/restore mechanisms.

API age and version are both bumped.

The backup provider API is split into two parts, both of which again
need different implementations for VM and LXC guests:

1. Backup API

There are two hook callback functions, namely:
1. job_hook() is called during the start/end/abort phases of the
   whole backup job.
2. backup_hook() is called during the start/end/abort phases of the
   backup of an individual guest.

The backup_get_mechanism() method is used to decide on the backup
mechanism. Currently, 'block-device' or 'nbd' for VMs, and 'directory'
for containers are possible. The method also lets the plugin indicate
whether to use a bitmap for incremental VM backup or not. It is enough
to implement one mechanism for VMs and one mechanism for containers.

Next, there are methods for backing up the guest's configuration and
data, backup_vm() for VM backup and backup_container() for container
backup.

Finally, there are some helpers, e.g. for getting the provider name or
the volume ID for the backup target, as well as for handling the
backup log.

1.1 Backup Mechanisms

VM:

Access to the data on the VM's disk from the time the backup started
is made available via a so-called "snapshot access". This is either
the full image, or in case a bitmap is used, the dirty parts of the
image since the last time the bitmap was used for a successful backup.
Reading outside of the dirty parts will result in an error. After
backing up each part of the disk, it should be discarded in the export
to avoid unnecessary space usage on the Proxmox VE side (there is an
associated fleecing image).

VM mechanism 'block-device':

The snapshot access is exposed as a block device. If used, a bitmap is
passed along.

VM mechanism 'nbd':

The snapshot access and, if used, bitmap are exported via NBD.

Container mechanism 'directory':

A copy or snapshot of the container's filesystem state is made
available as a directory.

2. Restore API

The restore_get_mechanism() method is used to decide on the restore
mechanism. Currently, 'qemu-img' for VMs, and 'directory' or 'tar' for
containers are possible. It is enough to implement one mechanism for
VMs and one mechanism for containers.

Next, there are methods for extracting the guest and firewall
configuration, and the restore mechanism is implemented via a pair of
methods: an init method for making the data available to Proxmox VE,
and a cleanup method that is called after restore.

For VMs, a restore_vm_get_device_info() helper is also required, to
get the disks included in the backup and their sizes.

2.1. Restore Mechanisms

VM mechanism 'qemu-img':

The backup provider gives a path to the disk image that will be
restored. The path needs to be something 'qemu-img' can deal with,
e.g. can also be an NBD URI or similar.

Container mechanism 'directory':

The backup provider gives the path to a directory with the full
filesystem structure of the container.

Container mechanism 'tar':

The backup provider gives the path to a (potentially compressed) tar
archive with the full filesystem structure of the container.

See the PVE::BackupProvider::Plugin module for the full API
documentation.

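For orientation, here is a rough, hypothetical skeleton of such a
plugin. It only uses method names and signatures that also appear in
the example plugins later in this series; backup_vm(),
backup_container() and the restore methods are omitted because their
exact signatures are not shown here, and the bodies are mere
placeholders rather than the actual implementation:

package PVE::BackupProvider::Plugin::SkeletonExample;

use strict;
use warnings;

use base qw(PVE::BackupProvider::Plugin::Base);

sub new {
    my ($class, $storage_plugin, $scfg, $storeid, $log_function) = @_;

    return bless {
        scfg => $scfg,
        storeid => $storeid,
        'storage-plugin' => $storage_plugin,
        'log-function' => $log_function,
    }, $class;
}

sub provider_name {
    my ($self) = @_;

    return 'skeleton example';
}

# Called with phase 'start', 'end' or 'abort' for the whole backup job.
sub job_hook {
    my ($self, $phase, $info) = @_;

    $self->{'log-function'}->('info', "job hook called - phase '$phase'");
    return;
}

# Called with phase 'start', 'end' or 'abort' for each individual guest backup.
sub backup_hook {
    my ($self, $phase, $vmid, $vmtype, $info) = @_;

    $self->{'log-function'}->('info', "backup hook for guest $vmid - phase '$phase'");
    return;
}

# Decide on the backup mechanism; returning undef as the second value means
# no bitmap, i.e. no incremental VM backup.
sub backup_get_mechanism {
    my ($self, $vmid, $vmtype) = @_;

    return ('block-device', undef) if $vmtype eq 'qemu';
    return ('directory', undef) if $vmtype eq 'lxc';
    die "unsupported guest type '$vmtype'\n";
}

sub backup_get_archive_name {
    my ($self, $vmid, $vmtype, $backup_time) = @_;

    return "${vmtype}-${vmid}-${backup_time}";
}
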
Signed-off-by: Fiona Ebner 
---

Changes in v2:
* Merge hook API into a single function for backup and for jobs.
* Add restore_vm_init() and restore_vm_cleanup() for better
  flexibility to allow preparing the whole restore. Question is
  if restore_vm_volume_init() and restore_vm_volume_cleanup() should
  be dropped (but certain providers might prefer using only those)?
  Having both is more flexible, but makes the API longer of course.
* Switch to backup_vm() (was per-volume backup_vm_volume() before) and
  backup_container(), passing along the configuration files, rather
  than having dedicated methods for the configuration files, for
  giving the backup provider more flexibility.
* Pass backup time to backup 'start' hook and use that in the
  directory example rather than the job start time.
* Use POD for base plugin documentation and flesh out documentation.
* Use 'BackupProvider::Plugin::' namespace.
* Rename $exclude_paths to $exclude_patterns and clarify what should
  be supported.
* Rename backup_get_target() method to backup_get_archive_name() for
  clarity.
* Rename extract_guest_config() to restore_get_guest_config() for
  better consistency with other (restore) API methods.

 src/PVE/BackupProvider/Makefile|3 +
 src/PVE/BackupProvider/Plugin/Base.pm  | 1149 
 src/PVE/BackupProvider/Plugin/Makefile |5 +
 src/PVE/Makefile   |1 +
 src/PVE/Storage.pm |   12 +-
 src/PVE/Storage/Plugin.pm  |   15 +
 6 files changed, 1183 insertions(+), 2 deletions(-)
 create mode 100644 

[pve-devel] [PATCH qemu v2 01/25] block/reqlist: allow adding overlapping requests

2024-08-13 Thread Fiona Ebner
Allow overlapping requests by removing the assert that made it
impossible. There are only two callers:

1. block_copy_task_create()

It already asserts the very same condition before calling
reqlist_init_req().

2. cbw_snapshot_read_lock()

There is no need to have read requests be non-overlapping in
copy-before-write when used for snapshot-access. In fact, there was no
protection against two callers of cbw_snapshot_read_lock() calling
reqlist_init_req() with overlapping ranges and this could lead to an
assertion failure [1].

In particular, with the reproducer script below [0], two
cbw_co_snapshot_block_status() callers could race, with the second
calling reqlist_init_req() before the first one finishes and removes
its conflicting request.

[0]:

> #!/bin/bash -e
> dd if=/dev/urandom of=/tmp/disk.raw bs=1M count=1024
> ./qemu-img create /tmp/fleecing.raw -f raw 1G
> (
> ./qemu-system-x86_64 --qmp stdio \
> --blockdev raw,node-name=node0,file.driver=file,file.filename=/tmp/disk.raw \
> --blockdev 
> raw,node-name=node1,file.driver=file,file.filename=/tmp/fleecing.raw \
> <<EOF
> {"execute": "qmp_capabilities"}
> {"execute": "blockdev-add", "arguments": { "driver": "copy-before-write", 
> "file": "node0", "target": "node1", "node-name": "node3" } }
> {"execute": "blockdev-add", "arguments": { "driver": "snapshot-access", 
> "file": "node3", "node-name": "snap0" } }
> {"execute": "nbd-server-start", "arguments": {"addr": { "type": "unix", 
> "data": { "path": "/tmp/nbd.socket" } } } }
> {"execute": "block-export-add", "arguments": {"id": "exp0", "node-name": 
> "snap0", "type": "nbd", "name": "exp0"}}
> EOF
> ) &
> sleep 5
> while true; do
> ./qemu-nbd -d /dev/nbd0
> ./qemu-nbd -c /dev/nbd0 nbd:unix:/tmp/nbd.socket:exportname=exp0 -f raw -r
> nbdinfo --map 'nbd+unix:///exp0?socket=/tmp/nbd.socket'
> done

[1]:

> #5  0x71e5f0088eb2 in __GI___assert_fail (...) at ./assert/assert.c:101
> #6  0x615285438017 in reqlist_init_req (...) at ../block/reqlist.c:23
> #7  0x6152853e2d98 in cbw_snapshot_read_lock (...) at 
> ../block/copy-before-write.c:237
> #8  0x6152853e3068 in cbw_co_snapshot_block_status (...) at 
> ../block/copy-before-write.c:304
> #9  0x6152853f4d22 in bdrv_co_snapshot_block_status (...) at 
> ../block/io.c:3726
> #10 0x61528543a63e in snapshot_access_co_block_status (...) at 
> ../block/snapshot-access.c:48
> #11 0x6152853f1a0a in bdrv_co_do_block_status (...) at ../block/io.c:2474
> #12 0x6152853f2016 in bdrv_co_common_block_status_above (...) at 
> ../block/io.c:2652
> #13 0x6152853f22cf in bdrv_co_block_status_above (...) at 
> ../block/io.c:2732
> #14 0x6152853d9a86 in blk_co_block_status_above (...) at 
> ../block/block-backend.c:1473
> #15 0x61528538da6c in blockstatus_to_extents (...) at ../nbd/server.c:2374
> #16 0x61528538deb1 in nbd_co_send_block_status (...) at 
> ../nbd/server.c:2481
> #17 0x61528538f424 in nbd_handle_request (...) at ../nbd/server.c:2978
> #18 0x61528538f906 in nbd_trip (...) at ../nbd/server.c:3121
> #19 0x6152855a7caf in coroutine_trampoline (...) at 
> ../util/coroutine-ucontext.c:175

Cc: qemu-sta...@nongnu.org
Suggested-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Fiona Ebner 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---

No changes in v2.

 block/copy-before-write.c | 3 ++-
 block/reqlist.c   | 2 --
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 50cc4c7aae..a5bb4d14f6 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -67,7 +67,8 @@ typedef struct BDRVCopyBeforeWriteState {
 
 /*
  * @frozen_read_reqs: current read requests for fleecing user in bs->file
- * node. These areas must not be rewritten by guest.
+ * node. These areas must not be rewritten by guest. There can be multiple
+ * overlapping read requests.
  */
 BlockReqList frozen_read_reqs;
 
diff --git a/block/reqlist.c b/block/reqlist.c
index 08cb57cfa4..098e807378 100644
--- a/block/reqlist.c
+++ b/block/reqlist.c
@@ -20,8 +20,6 @@
 void reqlist_init_req(BlockReqList *reqs, BlockReq *req, int64_t offset,
   int64_t bytes)
 {
-assert(!reqlist_find_conflict(reqs, offset, bytes));
-
 *req = (BlockReq) {
 .offset = offset,
 .bytes = bytes,
-- 
2.39.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [PATCH qemu v2 05/25] PVE backup: include device name in error when setting up snapshot access fails

2024-08-13 Thread Fiona Ebner
Signed-off-by: Fiona Ebner 
---

No changes in v2.

 pve-backup.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/pve-backup.c b/pve-backup.c
index 33c23e53c2..d931746453 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -626,7 +626,8 @@ static void create_backup_jobs_bh(void *opaque) {
 bool discard_source = false;
 if (di->fleecing.bs) {
 if (setup_snapshot_access(di, &local_err) < 0) {
-error_setg(errp, "setting up snapshot access for fleecing 
failed: %s",
+error_setg(errp, "%s - setting up snapshot access for fleecing 
failed: %s",
+   di->device_name,
local_err ? error_get_pretty(local_err) : "unknown 
error");
 cleanup_snapshot_access(di);
 bdrv_drained_end(di->bs);
-- 
2.39.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [RFC qemu v2 06/25] PVE backup: add target ID in backup state

2024-08-13 Thread Fiona Ebner
In preparation for allowing multiple backup providers. Each backup
target can then have its own dirty bitmap and there can be additional
checks that the current backup state is actually associated to the
expected target.

Signed-off-by: Fiona Ebner 
---

No changes in v2.

 pve-backup.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/pve-backup.c b/pve-backup.c
index d931746453..e8031bb89c 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -70,6 +70,7 @@ static struct PVEBackupState {
 JobTxn *txn;
 CoMutex backup_mutex;
 CoMutex dump_callback_mutex;
+char *target_id;
 } backup_state;
 
 static void pvebackup_init(void)
@@ -848,7 +849,7 @@ UuidInfo coroutine_fn *qmp_backup(
 
 if (backup_state.di_list) {
 error_set(errp, ERROR_CLASS_GENERIC_ERROR,
-  "previous backup not finished");
+  "previous backup by provider '%s' not finished", 
backup_state.target_id);
 qemu_co_mutex_unlock(&backup_state.backup_mutex);
 return NULL;
 }
@@ -1100,6 +1101,11 @@ UuidInfo coroutine_fn *qmp_backup(
 backup_state.vmaw = vmaw;
 backup_state.pbs = pbs;
 
+if (backup_state.target_id) {
+g_free(backup_state.target_id);
+}
+backup_state.target_id = g_strdup("Proxmox");
+
 backup_state.di_list = di_list;
 
 uuid_info = g_malloc0(sizeof(*uuid_info));
-- 
2.39.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [RFC qemu v2 07/25] PVE backup: get device info: allow caller to specify filter for which devices use fleecing

2024-08-13 Thread Fiona Ebner
For providing snapshot-access to external backup providers, EFI and
TPM also need an associated fleecing image. The new caller will thus
need a different filter.

Signed-off-by: Fiona Ebner 
---

No changes in v2.

 pve-backup.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/pve-backup.c b/pve-backup.c
index e8031bb89c..d0593fc581 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -717,7 +717,7 @@ static void create_backup_jobs_bh(void *opaque) {
 /*
  * EFI disk and TPM state are small and it's just not worth setting up 
fleecing for them.
  */
-static bool device_uses_fleecing(const char *device_id)
+static bool fleecing_no_efi_tpm(const char *device_id)
 {
 return strncmp(device_id, "drive-efidisk", 13) && strncmp(device_id, 
"drive-tpmstate", 14);
 }
@@ -729,7 +729,7 @@ static bool device_uses_fleecing(const char *device_id)
  */
 static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
 const char *devlist,
-bool fleecing,
+bool (*device_uses_fleecing)(const char*),
 Error **errp)
 {
 gchar **devs = NULL;
@@ -755,7 +755,7 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
 di->bs = bs;
 di->device_name = g_strdup(bdrv_get_device_name(bs));
 
-if (fleecing && device_uses_fleecing(*d)) {
+if (device_uses_fleecing && device_uses_fleecing(*d)) {
 g_autofree gchar *fleecing_devid = g_strconcat(*d, 
"-fleecing", NULL);
 BlockBackend *fleecing_blk = blk_by_name(fleecing_devid);
 if (!fleecing_blk) {
@@ -858,7 +858,8 @@ UuidInfo coroutine_fn *qmp_backup(
 format = has_format ? format : BACKUP_FORMAT_VMA;
 
 bdrv_graph_co_rdlock();
-di_list = get_device_info(devlist, has_fleecing && fleecing, &local_err);
+di_list = get_device_info(devlist, (has_fleecing && fleecing) ? 
fleecing_no_efi_tpm : NULL,
+  &local_err);
 bdrv_graph_co_rdunlock();
 if (local_err) {
 error_propagate(errp, local_err);
-- 
2.39.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [PATCH qemu-server v2 20/25] restore: die early when there is no size for a device

2024-08-13 Thread Fiona Ebner
Makes it a clean error for buggy (external) backup providers where the
size might not be set at all.

Signed-off-by: Fiona Ebner 
---

No changes in v2.

 PVE/QemuServer.pm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index e5ff5efb..37f56f69 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -6755,6 +6755,7 @@ my $restore_allocate_devices = sub {
 my $map = {};
 foreach my $virtdev (sort keys %$virtdev_hash) {
my $d = $virtdev_hash->{$virtdev};
+   die "got no size for '$virtdev'\n" if !defined($d->{size});
my $alloc_size = int(($d->{size} + 1024 - 1)/1024);
my $storeid = $d->{storeid};
my $scfg = PVE::Storage::storage_config($storecfg, $storeid);
-- 
2.39.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [PATCH qemu-server v2 16/25] backup: cleanup: check if VM is running before issuing QMP commands

2024-08-13 Thread Fiona Ebner
When the VM is only started for backup, the VM will be stopped at that
point again. While the detach helpers do not warn about errors
currently, that might change in the future. This is also in
preparation for other cleanup QMP helpers that are more verbose about
failure.

Signed-off-by: Fiona Ebner 
---

No changes in v2.

 PVE/VZDump/QemuServer.pm | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
index b2ced154..c46e607c 100644
--- a/PVE/VZDump/QemuServer.pm
+++ b/PVE/VZDump/QemuServer.pm
@@ -1118,13 +1118,14 @@ sub snapshot {
 sub cleanup {
 my ($self, $task, $vmid) = @_;
 
-$detach_tpmstate_drive->($task, $vmid);
-
-if ($task->{'use-fleecing'}) {
-   detach_fleecing_images($task->{disks}, $vmid);
-   cleanup_fleecing_images($self, $task->{disks});
+# If VM was started only for backup, it is already stopped now.
+if (PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
+   $detach_tpmstate_drive->($task, $vmid);
+   detach_fleecing_images($task->{disks}, $vmid) if 
$task->{'use-fleecing'};
 }
 
+cleanup_fleecing_images($self, $task->{disks}) if $task->{'use-fleecing'};
+
 if ($self->{qmeventd_fh}) {
close($self->{qmeventd_fh});
 }
-- 
2.39.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [PATCH qemu-server v2 15/25] backup: move cleanup of fleecing images to cleanup method

2024-08-13 Thread Fiona Ebner
TPM drives are already detached there and it's better to group
these things together.

Signed-off-by: Fiona Ebner 
---

No changes in v2.

 PVE/VZDump/QemuServer.pm | 25 +
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
index 012c9210..b2ced154 100644
--- a/PVE/VZDump/QemuServer.pm
+++ b/PVE/VZDump/QemuServer.pm
@@ -690,7 +690,6 @@ sub archive_pbs {
 
 # get list early so we die on unkown drive types before doing anything
 my $devlist = _get_task_devlist($task);
-my $use_fleecing;
 
 $self->enforce_vm_running_for_backup($vmid);
 $self->{qmeventd_fh} = PVE::QemuServer::register_qmeventd_handle($vmid);
@@ -721,7 +720,7 @@ sub archive_pbs {
 
my $is_template = 
PVE::QemuConfig->is_template($self->{vmlist}->{$vmid});
 
-   $use_fleecing = check_and_prepare_fleecing(
+   $task->{'use-fleecing'} = check_and_prepare_fleecing(
$self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, 
$qemu_support);
 
my $fs_frozen = $self->qga_fs_freeze($task, $vmid);
@@ -735,7 +734,7 @@ sub archive_pbs {
devlist => $devlist,
'config-file' => $conffile,
};
-   $params->{fleecing} = JSON::true if $use_fleecing;
+   $params->{fleecing} = JSON::true if $task->{'use-fleecing'};
 
if (defined(my $ns = $scfg->{namespace})) {
$params->{'backup-ns'} = $ns;
@@ -784,11 +783,6 @@ sub archive_pbs {
 }
 $self->restore_vm_power_state($vmid);
 
-if ($use_fleecing) {
-   detach_fleecing_images($task->{disks}, $vmid);
-   cleanup_fleecing_images($self, $task->{disks});
-}
-
 die $err if $err;
 }
 
@@ -891,7 +885,6 @@ sub archive_vma {
 }
 
 my $devlist = _get_task_devlist($task);
-my $use_fleecing;
 
 $self->enforce_vm_running_for_backup($vmid);
 $self->{qmeventd_fh} = PVE::QemuServer::register_qmeventd_handle($vmid);
@@ -911,7 +904,7 @@ sub archive_vma {
 
$attach_tpmstate_drive->($self, $task, $vmid);
 
-   $use_fleecing = check_and_prepare_fleecing(
+   $task->{'use-fleecing'} = check_and_prepare_fleecing(
$self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, 
$qemu_support);
 
my $outfh;
@@ -942,7 +935,7 @@ sub archive_vma {
devlist => $devlist
};
$params->{'firewall-file'} = $firewall if -e $firewall;
-   $params->{fleecing} = JSON::true if $use_fleecing;
+   $params->{fleecing} = JSON::true if $task->{'use-fleecing'};
add_backup_performance_options($params, $opts->{performance}, 
$qemu_support);
 
$qmpclient->queue_cmd($vmid, $backup_cb, 'backup', %$params);
@@ -984,11 +977,6 @@ sub archive_vma {
 
 $self->restore_vm_power_state($vmid);
 
-if ($use_fleecing) {
-   detach_fleecing_images($task->{disks}, $vmid);
-   cleanup_fleecing_images($self, $task->{disks});
-}
-
 if ($err) {
if ($cpid) {
kill(9, $cpid);
@@ -1132,6 +1120,11 @@ sub cleanup {
 
 $detach_tpmstate_drive->($task, $vmid);
 
+if ($task->{'use-fleecing'}) {
+   detach_fleecing_images($task->{disks}, $vmid);
+   cleanup_fleecing_images($self, $task->{disks});
+}
+
 if ($self->{qmeventd_fh}) {
close($self->{qmeventd_fh});
 }
-- 
2.39.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [PATCH manager v2 24/25] ui: backup: also check for backup subtype to classify archive

2024-08-13 Thread Fiona Ebner
In anticipation of future storage plugins that might not have
PBS-specific formats or adhere to the vzdump naming scheme for
backups.

Signed-off-by: Fiona Ebner 
---

No changes in v2.

 www/manager6/Utils.js  | 10 ++
 www/manager6/grid/BackupView.js|  4 ++--
 www/manager6/storage/BackupView.js |  4 ++--
 3 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/www/manager6/Utils.js b/www/manager6/Utils.js
index db86fa9a..a8e4e8ee 100644
--- a/www/manager6/Utils.js
+++ b/www/manager6/Utils.js
@@ -693,12 +693,14 @@ Ext.define('PVE.Utils', {
'snippets': gettext('Snippets'),
 },
 
-volume_is_qemu_backup: function(volid, format) {
-   return format === 'pbs-vm' || volid.match(':backup/vzdump-qemu-');
+volume_is_qemu_backup: function(volume) {
+   return volume.format === 'pbs-vm' || 
volume.volid.match(':backup/vzdump-qemu-') ||
+   volume.subtype === 'qemu';
 },
 
-volume_is_lxc_backup: function(volid, format) {
-   return format === 'pbs-ct' || 
volid.match(':backup/vzdump-(lxc|openvz)-');
+volume_is_lxc_backup: function(volume) {
+   return volume.format === 'pbs-ct' || 
volume.volid.match(':backup/vzdump-(lxc|openvz)-') ||
+   volume.subtype === 'lxc';
 },
 
 authSchema: {
diff --git a/www/manager6/grid/BackupView.js b/www/manager6/grid/BackupView.js
index e71d1c88..ef3649c6 100644
--- a/www/manager6/grid/BackupView.js
+++ b/www/manager6/grid/BackupView.js
@@ -29,11 +29,11 @@ Ext.define('PVE.grid.BackupView', {
var vmtypeFilter;
if (vmtype === 'lxc' || vmtype === 'openvz') {
vmtypeFilter = function(item) {
-   return PVE.Utils.volume_is_lxc_backup(item.data.volid, 
item.data.format);
+   return PVE.Utils.volume_is_lxc_backup(item.data);
};
} else if (vmtype === 'qemu') {
vmtypeFilter = function(item) {
-   return PVE.Utils.volume_is_qemu_backup(item.data.volid, 
item.data.format);
+   return PVE.Utils.volume_is_qemu_backup(item.data);
};
} else {
throw "unsupported VM type '" + vmtype + "'";
diff --git a/www/manager6/storage/BackupView.js 
b/www/manager6/storage/BackupView.js
index 878e1c8f..ad6e6a01 100644
--- a/www/manager6/storage/BackupView.js
+++ b/www/manager6/storage/BackupView.js
@@ -84,9 +84,9 @@ Ext.define('PVE.storage.BackupView', {
disabled: true,
handler: function(b, e, rec) {
let vmtype;
-   if (PVE.Utils.volume_is_qemu_backup(rec.data.volid, 
rec.data.format)) {
+   if (PVE.Utils.volume_is_qemu_backup(rec.data)) {
vmtype = 'qemu';
-   } else if (PVE.Utils.volume_is_lxc_backup(rec.data.volid, 
rec.data.format)) {
+   } else if (PVE.Utils.volume_is_lxc_backup(rec.data)) {
vmtype = 'lxc';
} else {
return;
-- 
2.39.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [RFC qemu v2 08/25] PVE backup: implement backup access setup and teardown API for external providers

2024-08-13 Thread Fiona Ebner
For external backup providers, the state of the VM's disk images at
the time the backup is started is preserved via a snapshot-access
block node. Old data is moved to the fleecing image when new guest
writes come in. The snapshot-access block node, as well as the
associated bitmap in case of incremental backup, will be exported via
NBD to the external provider. The NBD export will be done by the
management layer, the missing functionality is setting up and tearing
down the snapshot-access block nodes, which this patch adds.

It is necessary to also set up fleecing for EFI and TPM disks, so that
old data can be moved out of the way when a new guest write comes in.

There can only be one regular backup or one active backup access at
a time, because both require replacing the original block node of the
drive. Thus the backup state is re-used, and checks are added to
prohibit regular backup while snapshot access is active and vice
versa.

The block nodes added by the backup-access-setup QMP call are not
tracked anywhere else (there is no job they are associated to like for
regular backup). This requires adding a callback for teardown when
QEMU exits, i.e. in qemu_cleanup(). Otherwise, there will be an
assertion failure that the block graph is not empty when QEMU exits
before the backup-access-teardown QMP command is called.

The code for the qmp_backup_access_setup() was based on the existing
qmp_backup() routine.

The return value for the setup QMP command contains information about
the snapshot-access block nodes that can be used by the management
layer to set up the NBD exports.

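As a rough illustration of how the management layer might drive these
commands from Perl (this is not part of the patch; the argument
spelling for the setup call and the shape of the returned list are
assumptions, only the teardown arguments are taken from the
qemu-server patch later in this series, and $vmid is assumed to hold
the guest ID):

use JSON;
use Data::Dumper;

use PVE::QemuServer::Monitor qw(mon_cmd);

# hypothetical target ID and device list
my $access_info = mon_cmd(
    $vmid,
    'backup-access-setup',
    'target-id' => 'external-provider-example',
    devlist => 'drive-scsi0,drive-efidisk0',
);

# each entry describes a snapshot-access block node (and its size) that can
# then be used to set up an NBD export for the external provider
print Dumper($access_info);

# ... the external provider reads the data via NBD ...

mon_cmd(
    $vmid,
    'backup-access-teardown',
    'target-id' => 'external-provider-example',
    success => JSON::true,
    timeout => 60,
);
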
Signed-off-by: Fiona Ebner 
---

Changes in v2:
* Also return the size of the block devices in the setup call.

 pve-backup.c | 273 +++
 pve-backup.h |  16 +++
 qapi/block-core.json |  45 +++
 system/runstate.c|   6 +
 4 files changed, 340 insertions(+)
 create mode 100644 pve-backup.h

diff --git a/pve-backup.c b/pve-backup.c
index d0593fc581..d3370d6744 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -1,4 +1,5 @@
 #include "proxmox-backup-client.h"
+#include "pve-backup.h"
 #include "vma.h"
 
 #include "qemu/osdep.h"
@@ -585,6 +586,37 @@ static int setup_snapshot_access(PVEBackupDevInfo *di, 
Error **errp)
 return 0;
 }
 
+static void setup_all_snapshot_access_bh(void *opaque)
+{
+assert(!qemu_in_coroutine());
+
+CoCtxData *data = (CoCtxData*)opaque;
+Error **errp = (Error**)data->data;
+
+Error *local_err = NULL;
+
+GList *l =  backup_state.di_list;
+while (l) {
+PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+l = g_list_next(l);
+
+bdrv_drained_begin(di->bs);
+
+if (setup_snapshot_access(di, &local_err) < 0) {
+cleanup_snapshot_access(di);
+bdrv_drained_end(di->bs);
+error_setg(errp, "%s - setting up snapshot access failed: %s", 
di->device_name,
+   local_err ? error_get_pretty(local_err) : "unknown 
error");
+break;
+}
+
+bdrv_drained_end(di->bs);
+}
+
+/* return */
+aio_co_enter(data->ctx, data->co);
+}
+
 /*
  * backup_job_create can *not* be run from a coroutine, so this can't either.
  * The caller is responsible that backup_mutex is held nonetheless.
@@ -722,6 +754,11 @@ static bool fleecing_no_efi_tpm(const char *device_id)
 return strncmp(device_id, "drive-efidisk", 13) && strncmp(device_id, 
"drive-tpmstate", 14);
 }
 
+static bool fleecing_all(const char *device_id)
+{
+return true;
+}
+
 /*
  * Returns a list of device infos, which needs to be freed by the caller. In
  * case of an error, errp will be set, but the returned value might still be a
@@ -810,6 +847,242 @@ err:
 return di_list;
 }
 
+BackupAccessInfoList *coroutine_fn qmp_backup_access_setup(
+const char *target_id,
+const char *devlist,
+Error **errp)
+{
+assert(qemu_in_coroutine());
+
+qemu_co_mutex_lock(&backup_state.backup_mutex);
+
+Error *local_err = NULL;
+GList *di_list = NULL;
+GList *l;
+
+if (backup_state.di_list) {
+error_set(errp, ERROR_CLASS_GENERIC_ERROR,
+  "previous backup by provider '%s' not finished", 
backup_state.target_id);
+qemu_co_mutex_unlock(&backup_state.backup_mutex);
+return NULL;
+}
+
+bdrv_graph_co_rdlock();
+di_list = get_device_info(devlist, fleecing_all, &local_err);
+bdrv_graph_co_rdunlock();
+if (local_err) {
+error_propagate(errp, local_err);
+goto err;
+}
+assert(di_list);
+
+size_t total = 0;
+
+l = di_list;
+while (l) {
+PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+l = g_list_next(l);
+
+ssize_t size = bdrv_getlength(di->bs);
+if (size < 0) {
+error_setg_errno(errp, -size, "bdrv_getlength failed");
+goto err;
+}
+di->size = size;
+total += size;
+
+di->comple

[pve-devel] [RFC qemu-server v2 18/25] backup: allow adding fleecing images also for EFI and TPM

2024-08-13 Thread Fiona Ebner
For the external backup API, it will be necessary to add a fleecing
image even for small disks like EFI and TPM, because there is no other
place the old data could be copied to when a new guest write comes in.

Signed-off-by: Fiona Ebner 
---

No changes in v2.

 PVE/VZDump/QemuServer.pm | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
index 98685127..4ad4a154 100644
--- a/PVE/VZDump/QemuServer.pm
+++ b/PVE/VZDump/QemuServer.pm
@@ -548,7 +548,7 @@ my sub cleanup_fleecing_images {
 }
 
 my sub allocate_fleecing_images {
-my ($self, $disks, $vmid, $fleecing_storeid, $format) = @_;
+my ($self, $disks, $vmid, $fleecing_storeid, $format, $all_images) = @_;
 
 die "internal error - no fleecing storage specified\n" if 
!$fleecing_storeid;
 
@@ -559,7 +559,8 @@ my sub allocate_fleecing_images {
my $n = 0; # counter for fleecing image names
 
for my $di ($disks->@*) {
-   next if $di->{virtdev} =~ m/^(?:tpmstate|efidisk)\d$/; # too small 
to be worth it
+   # EFI/TPM are usually too small to be worth it, but it's required 
for external providers
+   next if !$all_images && $di->{virtdev} =~ 
m/^(?:tpmstate|efidisk)\d$/;
if ($di->{type} eq 'block' || $di->{type} eq 'file') {
my $scfg = PVE::Storage::storage_config($self->{storecfg}, 
$fleecing_storeid);
my $name = "vm-$vmid-fleece-$n";
@@ -623,7 +624,7 @@ my sub attach_fleecing_images {
 }
 
 my sub check_and_prepare_fleecing {
-my ($self, $vmid, $fleecing_opts, $disks, $is_template, $qemu_support) = 
@_;
+my ($self, $vmid, $fleecing_opts, $disks, $is_template, $qemu_support, 
$all_images) = @_;
 
 # Even if the VM was started specifically for fleecing, it's possible that 
the VM is resumed and
 # then starts doing IO. For VMs that are not resumed the fleecing images 
will just stay empty,
@@ -644,7 +645,8 @@ my sub check_and_prepare_fleecing {
$self->{storecfg}, $fleecing_opts->{storage});
my $format = scalar(grep { $_ eq 'qcow2' } $valid_formats->@*) ? 
'qcow2' : 'raw';
 
-   allocate_fleecing_images($self, $disks, $vmid, 
$fleecing_opts->{storage}, $format);
+   allocate_fleecing_images(
+   $self, $disks, $vmid, $fleecing_opts->{storage}, $format, 
$all_images);
attach_fleecing_images($self, $disks, $vmid, $format);
 }
 
@@ -735,7 +737,7 @@ sub archive_pbs {
my $is_template = 
PVE::QemuConfig->is_template($self->{vmlist}->{$vmid});
 
$task->{'use-fleecing'} = check_and_prepare_fleecing(
-   $self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, 
$qemu_support);
+   $self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, 
$qemu_support, 0);
 
my $fs_frozen = $self->qga_fs_freeze($task, $vmid);
 
@@ -919,7 +921,7 @@ sub archive_vma {
$attach_tpmstate_drive->($self, $task, $vmid);
 
$task->{'use-fleecing'} = check_and_prepare_fleecing(
-   $self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, 
$qemu_support);
+   $self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, 
$qemu_support, 0);
 
my $outfh;
if ($opts->{stdout}) {
-- 
2.39.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [RFC qemu v2 09/25] PVE backup: implement bitmap support for external backup access

2024-08-13 Thread Fiona Ebner
There can be one dirty bitmap for each backup target ID (which are
tracked in the backup_access_bitmaps hash table). The QMP user can
specify the ID of the bitmap it likes to use. This ID is then compared
to the current one for the given target. If they match, the bitmap is
re-used (should it still exist on the drive, otherwise re-created). If
there is a mismatch, the old bitmap is removed and a new one is
created.

The return value of the QMP command includes information about what
bitmap action was taken, similar to what the query-backup QMP command
returns for regular backup. It also includes the bitmap name and
associated block node, so the management layer can then set up an NBD
export with the bitmap.

While the backup access is active, a background bitmap is also
required. This is necessary to implement bitmap handling according to
the original reference [0]. In particular:

- in the error case, new writes since the backup access was set up are
  in the background bitmap. Because of failure, the previously tracked
  writes from the backup access bitmap are still required too. Thus,
  the bitmap is merged with the background bitmap to get all new
  writes since the last backup.

- in the success case, continue tracking for the next incremental
  backup in the backup access bitmap. New writes since the backup
  access was set up are in the background bitmap. Because the backup
  was successful, clear the backup access bitmap and merge back the
  background bitmap to get only the new writes.

Since QEMU cannot know if the backup was successful or not (except if
failure already happens during the setup QMP command), the management
layer needs to tell it via the teardown QMP command.

The bitmap action is also recorded in the device info now.

[0]: 
https://lore.kernel.org/qemu-devel/b68833dd-8864-4d72-7c61-c134a9835...@ya.ru/

Signed-off-by: Fiona Ebner 
---

No changes in v2.

 pve-backup.c | 175 ++-
 pve-backup.h |   2 +-
 qapi/block-core.json |  22 +-
 system/runstate.c|   2 +-
 4 files changed, 193 insertions(+), 8 deletions(-)

diff --git a/pve-backup.c b/pve-backup.c
index d3370d6744..5f8dd396d5 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -15,6 +15,7 @@
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qerror.h"
 #include "qemu/cutils.h"
+#include "qemu/error-report.h"
 
 #if defined(CONFIG_MALLOC_TRIM)
 #include 
@@ -41,6 +42,7 @@
  */
 
 const char *PBS_BITMAP_NAME = "pbs-incremental-dirty-bitmap";
+const char *BACKGROUND_BITMAP_NAME = "backup-access-background-bitmap";
 
 static struct PVEBackupState {
 struct {
@@ -72,6 +74,7 @@ static struct PVEBackupState {
 CoMutex backup_mutex;
 CoMutex dump_callback_mutex;
 char *target_id;
+GHashTable *backup_access_bitmaps; // key=target_id, value=bitmap_name
 } backup_state;
 
 static void pvebackup_init(void)
@@ -99,6 +102,8 @@ typedef struct PVEBackupDevInfo {
 char* device_name;
 int completed_ret; // INT_MAX if not completed
 BdrvDirtyBitmap *bitmap;
+BdrvDirtyBitmap *background_bitmap; // used for external backup access
+PBSBitmapAction bitmap_action;
 BlockDriverState *target;
 BlockJob *job;
 } PVEBackupDevInfo;
@@ -362,6 +367,67 @@ static void coroutine_fn pvebackup_co_complete_stream(void 
*opaque)
 qemu_co_mutex_unlock(&backup_state.backup_mutex);
 }
 
+/*
+ * New writes since the backup access was set up are in the background bitmap. 
Because of failure,
+ * the previously tracked writes in di->bitmap are still required too. Thus, 
merge with the
+ * background bitmap to get all new writes since the last backup.
+ */
+static void handle_backup_access_bitmaps_in_error_case(PVEBackupDevInfo *di)
+{
+Error *local_err = NULL;
+
+if (di->bs && di->background_bitmap) {
+bdrv_drained_begin(di->bs);
+if (di->bitmap) {
+bdrv_enable_dirty_bitmap(di->bitmap);
+if (!bdrv_merge_dirty_bitmap(di->bitmap, di->background_bitmap, 
NULL, &local_err)) {
+warn_report("backup access: %s - could not merge bitmaps in 
error path - %s",
+di->device_name,
+local_err ? error_get_pretty(local_err) : "unknown 
error");
+/*
+ * Could not merge, drop original bitmap too.
+ */
+bdrv_release_dirty_bitmap(di->bitmap);
+}
+} else {
+warn_report("backup access: %s - expected bitmap not present", 
di->device_name);
+}
+bdrv_release_dirty_bitmap(di->background_bitmap);
+bdrv_drained_end(di->bs);
+}
+}
+
+/*
+ * Continue tracking for next incremental backup in di->bitmap. New writes 
since the backup access
+ * was set up are in the background bitmap. Because the backup was successful, 
clear di->bitmap and
+ * merge back the background bitmap to get only the new writes.
+ */
+static void handle_backup_access_bitmaps_afte

[pve-devel] [POC storage v2 13/25] Borg plugin

2024-08-13 Thread Fiona Ebner
Archive names start with the guest type and ID and then the same
timestamp format as PBS.

Container archives have the following structure:
guest.config
firewall.config
filesystem/ # containing the whole filesystem structure

VM archives have the following structure:
guest.config
firewall.config
volumes/ # containing a raw file for each device

A bind mount (respectively symlinks) is used to achieve this
structure, because Borg doesn't seem to support renaming on the fly.
(Prefix stripping via the "slashdot hack" would have helped slightly,
but is only in Borg >= 1.4
https://github.com/borgbackup/borg/actions/runs/7967940995)

NOTE: running via SSH was not yet tested. Bandwidth limit is not yet
honored and the task size is not calculated yet. Discard for VM
backups would also be nice to have, but it's not entirely clear how
(parsing progress and discarding according to that is one idea).
There is no dirty bitmap support, not sure if that is feasible to add.

Signed-off-by: Fiona Ebner 
---

New in v2.

 src/PVE/BackupProvider/Plugin/Borg.pm  | 373 ++
 src/PVE/BackupProvider/Plugin/Makefile |   2 +-
 src/PVE/Storage.pm |   2 +
 src/PVE/Storage/BorgBackupPlugin.pm| 506 +
 src/PVE/Storage/Makefile   |   1 +
 5 files changed, 883 insertions(+), 1 deletion(-)
 create mode 100644 src/PVE/BackupProvider/Plugin/Borg.pm
 create mode 100644 src/PVE/Storage/BorgBackupPlugin.pm

diff --git a/src/PVE/BackupProvider/Plugin/Borg.pm 
b/src/PVE/BackupProvider/Plugin/Borg.pm
new file mode 100644
index 000..b65321c
--- /dev/null
+++ b/src/PVE/BackupProvider/Plugin/Borg.pm
@@ -0,0 +1,373 @@
+package PVE::BackupProvider::Plugin::Borg;
+
+use strict;
+use warnings;
+
+use File::chdir;
+use File::Basename qw(basename);
+use File::Path qw(make_path remove_tree);
+use POSIX qw(strftime);
+
+use PVE::Tools;
+
+# ($vmtype, $vmid, $time_string)
+our $ARCHIVE_RE_3 = 
qr!^pve-(lxc|qemu)-([0-9]+)-([0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z)$!;
+
+sub archive_name {
+my ($vmtype, $vmid, $backup_time) = @_;
+
+return "pve-${vmtype}-${vmid}-" . strftime("%FT%TZ", gmtime($backup_time));
+}
+
+# remove_tree can be very verbose by default, do explicit error handling and 
limit to one message
+my sub _remove_tree {
+my ($path) = @_;
+
+remove_tree($path, { error => \my $err });
+if ($err && @$err) { # empty array if no error
+   for my $diag (@$err) {
+   my ($file, $message) = %$diag;
+   die "cannot remove_tree '$path': $message\n" if $file eq '';
+   die "cannot remove_tree '$path': unlinking $file failed - 
$message\n";
+   }
+}
+}
+
+my sub prepare_run_dir {
+my ($archive, $operation) = @_;
+
+my $run_dir = "/run/pve-storage-borg-plugin/${archive}.${operation}";
+_remove_tree($run_dir);
+make_path($run_dir);
+die "unable to create directory $run_dir\n" if !-d $run_dir;
+
+return $run_dir;
+}
+
+my sub log_info {
+my ($self, $message) = @_;
+
+$self->{'log-function'}->('info', $message);
+}
+
+my sub log_warning {
+my ($self, $message) = @_;
+
+$self->{'log-function'}->('warn', $message);
+}
+
+my sub log_error {
+my ($self, $message) = @_;
+
+$self->{'log-function'}->('err', $message);
+}
+
+my sub file_contents_from_archive {
+my ($self, $archive, $file) = @_;
+
+my $run_dir = prepare_run_dir($archive, "file-contents");
+
+my $raw;
+
+eval {
+   local $CWD = $run_dir;
+
+   $self->{'storage-plugin'}->borg_cmd_extract(
+   $self->{scfg},
+   $self->{storeid},
+   $archive,
+   [$file],
+   );
+
+   $raw = PVE::Tools::file_get_contents("${run_dir}/${file}");
+};
+my $err = $@;
+eval { _remove_tree($run_dir); };
+log_warning($self, $@) if $@;
+die $err if $err;
+
+return $raw;
+}
+
+# Plugin implementation
+
+sub new {
+my ($class, $storage_plugin, $scfg, $storeid, $log_function) = @_;
+
+my $self = bless {
+   scfg => $scfg,
+   storeid => $storeid,
+   'storage-plugin' => $storage_plugin,
+   'log-function' => $log_function,
+}, $class;
+
+return $self;
+}
+
+sub provider_name {
+my ($self) = @_;
+
+return "Borg";
+}
+
+sub job_hook {
+my ($self, $phase, $info) = @_;
+
+if ($phase eq 'start') {
+   $self->{'job-id'} = $info->{'start-time'};
+}
+
+return;
+}
+
+sub backup_hook {
+my ($self, $phase, $vmid, $vmtype, $info) = @_;
+
+if ($phase eq 'start') {
+   $self->{$vmid}->{'task-size'} = 0;
+}
+
+return;
+}
+
+sub backup_get_mechanism {
+my ($self, $vmid, $vmtype) = @_;
+
+return ('block-device', undef) if $vmtype eq 'qemu';
+return ('directory', undef) if $vmtype eq 'lxc';
+
+die "unsupported VM type '$vmtype'\n";
+}
+
+sub backup_get_archive_name {
+my ($self, $vmid, $vmtype, $backup_time) = @_;
+
+return $self->{$vmid}->{archive} = archive_name($vmtype, $

[pve-devel] [RFC qemu-server v2 19/25] backup: implement backup for external providers

2024-08-13 Thread Fiona Ebner
The state of the VM's disk images at the time the backup is started is
preserved via a snapshot-access block node. Old data is moved to the
fleecing image when new guest writes come in. The snapshot-access
block node, as well as the associated bitmap in case of incremental
backup, will be made available to the external provider. They are
exported via NBD and for the 'nbd' mechanism, the NBD socket path is
passed to the provider, while for the 'block-device' mechanism, the NBD
export is made accessible as a regular block device first and the
bitmap information is made available via a $next_dirty_region->()
function. For 'block-device', the 'nbdinfo' binary is required.

The provider can indicate that it wants to do an incremental backup by
returning the bitmap ID that was used for a previous backup and it
will then be told if the bitmap was newly created (either first backup
or old bitmap was invalid) or if the bitmap can be reused.

The provider then reads the parts of the NBD or block device it needs,
either the full disk for full backup, or the dirty parts according to
the bitmap for incremental backup. The bitmap has to be respected,
reads to other parts of the image will return an error. After backing
up each part of the disk, it should be discarded in the export to
avoid unnecessary space usage in the fleecing image (requires the
storage underlying the fleecing image to support discard too).

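To illustrate the 'block-device' mechanism from the provider's point
of view, a simplified, hypothetical read loop could look like the
following. The helper name and its parameters are made up for this
sketch; only the $next_dirty_region->() contract and the
discard-after-read idea are taken from the description above, and the
actual directory example plugin in this series handles this more
carefully:

use Fcntl qw(SEEK_SET);
use IO::File;

use constant BLKDISCARD => 0x1277; # see linux/fs.h

my sub backup_block_device {
    my ($device_path, $next_dirty_region, $out_path) = @_;

    my $in_fh = IO::File->new($device_path, 'r+')
        or die "unable to open block device '$device_path' - $!\n";
    my $out_fh = IO::File->new($out_path, 'w')
        or die "unable to open output file '$out_path' - $!\n";

    while (my ($offset, $length) = $next_dirty_region->()) {
        # NOTE: for simplicity, this assumes sysread() returns the full region;
        # a real implementation would read and write in smaller chunks.
        sysseek($in_fh, $offset, SEEK_SET) or die "seek failed - $!\n";
        my $data = '';
        sysread($in_fh, $data, $length) // die "read failed - $!\n";

        sysseek($out_fh, $offset, SEEK_SET) or die "seek failed - $!\n";
        syswrite($out_fh, $data) // die "write failed - $!\n";

        # discard the region in the snapshot access afterwards, to avoid
        # unnecessary space usage in the fleecing image
        my $range = pack('QQ', $offset, $length);
        ioctl($in_fh, BLKDISCARD, $range) or warn "discard failed - $!\n";
    }

    return;
}
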
Signed-off-by: Fiona Ebner 
---

Changes in v2:
* Support 'block-device' mechanism, by exposing the NBD export as a
  block device via qemu-nbd.
* Adapt to API changes, i.e. pass all volumes as well as configuration
  files to the provider at once.

 PVE/VZDump/QemuServer.pm | 308 ++-
 1 file changed, 307 insertions(+), 1 deletion(-)

diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
index 4ad4a154..9daebbc2 100644
--- a/PVE/VZDump/QemuServer.pm
+++ b/PVE/VZDump/QemuServer.pm
@@ -20,7 +20,7 @@ use PVE::QMPClient;
 use PVE::Storage::Plugin;
 use PVE::Storage::PBSPlugin;
 use PVE::Storage;
-use PVE::Tools;
+use PVE::Tools qw(run_command);
 use PVE::VZDump;
 use PVE::Format qw(render_duration render_bytes);
 
@@ -291,6 +291,8 @@ sub archive {
 
 if ($self->{vzdump}->{opts}->{pbs}) {
$self->archive_pbs($task, $vmid);
+} elsif ($self->{vzdump}->{'backup-provider'}) {
+   $self->archive_external($task, $vmid);
 } else {
$self->archive_vma($task, $vmid, $filename, $comp);
 }
@@ -1136,6 +1138,23 @@ sub cleanup {
 
 # If VM was started only for backup, it is already stopped now.
 if (PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
+   if ($task->{cleanup}->{'nbd-stop'}) {
+   eval { PVE::QemuServer::QMPHelpers::nbd_stop($vmid); };
+   $self->logerr($@) if $@;
+   }
+
+   if (my $info = $task->{cleanup}->{'backup-access-teardown'}) {
+   my $params = {
+   'target-id' => $info->{'target-id'},
+   timeout => 60,
+   success => $info->{success} ? JSON::true : JSON::false,
+   };
+
+   $self->loginfo("tearing down backup-access");
+   eval { mon_cmd($vmid, "backup-access-teardown", $params->%*) };
+   $self->logerr($@) if $@;
+   }
+
$detach_tpmstate_drive->($task, $vmid);
detach_fleecing_images($task->{disks}, $vmid) if 
$task->{'use-fleecing'};
 }
@@ -1147,4 +1166,291 @@ sub cleanup {
 }
 }
 
+my sub block_device_backup_cleanup {
+my ($self, $paths, $cpids) = @_;
+
+for my $path ($paths->@*) {
+   eval { run_command(["qemu-nbd", "-d", $path ]); };
+   $self->log('warn', "unable to disconnect NBD backup source '$path' - 
$@") if $@;
+}
+
+my $waited;
+my $wait_limit = 5;
+for ($waited = 0; $waited < $wait_limit && scalar(keys $cpids->%*); 
$waited++) {
+   while ((my $cpid = waitpid(-1, POSIX::WNOHANG)) > 0) {
+   delete($cpids->{$cpid});
+   }
+   if ($waited == 0) {
+   kill 15, $_ for keys $cpids->%*;
+   }
+   sleep 1;
+}
+if ($waited == $wait_limit && scalar(keys $cpids->%*)) {
+   kill 9, $_ for keys $cpids->%*;
+   sleep 1;
+   while ((my $cpid = waitpid(-1, POSIX::WNOHANG)) > 0) {
+   delete($cpids->{$cpid});
+   }
+   $self->log('warn', "unable to collect nbdinfo child process '$_'") for 
keys $cpids->%*;
+}
+}
+
+my sub block_device_backup_prepare {
+my ($self, $devicename, $size, $nbd_path, $bitmap_name, $count) = @_;
+
+my $nbd_info_uri = "nbd+unix:///${devicename}?socket=${nbd_path}";
+my $qemu_nbd_uri = "nbd:unix:${nbd_path}:exportname=${devicename}";
+
+my $cpid;
+my $error_fh;
+my $next_dirty_region;
+
+# If there is no dirty bitmap, it can be treated as if there's a full 
dirty one. The output of
+# nbdinfo is a list of tuples with offset, length, type, description. The 
first bit of 'type' is
+# set when the bitmap is dirty, see QEMU's docs/int

[pve-devel] [RFC container v2 23/25] backup: implement restore for external providers

2024-08-13 Thread Fiona Ebner
First, the provider is asked about what restore mechanism to use.
Currently, 'directory' and 'tar' are possible, for restoring either
from a directory containing the full filesystem structure (for which
rsync is used) or a potentially compressed tar file containing the
same.

The new functions are copied and adapted from the existing ones for
PBS or tar, and it might be worth factoring out the common parts.

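For context, the provider-side counterpart consumed by the code below
could look roughly like this. The method names and call signatures
match how restore_external_archive() uses them, while the bodies are
merely illustrative placeholders and not part of this patch:

# in the backup provider plugin (sketch)
sub restore_get_mechanism {
    my ($self, $volname, $storeid) = @_;

    # return the restore mechanism and the guest type of the archive
    return ('tar', 'lxc');
}

sub restore_container_init {
    my ($self, $volname, $storeid, $info) = @_;

    # for the 'tar' mechanism, return the path to the (possibly compressed)
    # tar archive; for 'directory', return 'archive-directory' instead
    return { 'tar-path' => '/path/to/archive.tar' };
}

sub restore_container_cleanup {
    my ($self, $volname, $storeid, $info) = @_;

    # remove any temporary files or mounts created in restore_container_init()
    return;
}
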
Signed-off-by: Fiona Ebner 
---

Changes in v2:
* Adapt to API changes.

 src/PVE/LXC/Create.pm | 141 ++
 1 file changed, 141 insertions(+)

diff --git a/src/PVE/LXC/Create.pm b/src/PVE/LXC/Create.pm
index 117103c..9d1c337 100644
--- a/src/PVE/LXC/Create.pm
+++ b/src/PVE/LXC/Create.pm
@@ -25,6 +25,24 @@ sub restore_archive {
if ($scfg->{type} eq 'pbs') {
return restore_proxmox_backup_archive($storage_cfg, $archive, 
$rootdir, $conf, $no_unpack_error, $bwlimit);
}
+   my $log_function = sub {
+   my ($log_level, $message) = @_;
+   my $prefix = $log_level eq 'err' ? 'ERROR' : uc($log_level);
+   print "$prefix: $message\n";
+   };
+   my $backup_provider =
+   PVE::Storage::new_backup_provider($storage_cfg, $storeid, 
$log_function);
+   if ($backup_provider) {
+   return restore_external_archive(
+   $backup_provider,
+   $storeid,
+   $volname,
+   $rootdir,
+   $conf,
+   $no_unpack_error,
+   $bwlimit,
+   );
+   }
 }
 
 $archive = PVE::Storage::abs_filesystem_path($storage_cfg, $archive) if 
$archive ne '-';
@@ -118,6 +136,55 @@ sub restore_tar_archive {
 die $err if $err && !$no_unpack_error;
 }
 
+sub restore_external_archive {
+my ($backup_provider, $storeid, $volname, $rootdir, $conf, 
$no_unpack_error, $bwlimit) = @_;
+
+my ($mechanism, $vmtype) = 
$backup_provider->restore_get_mechanism($volname, $storeid);
+die "cannot restore non-LXC guest of type '$vmtype'\n" if $vmtype ne 'lxc';
+
+my $info = $backup_provider->restore_container_init($volname, $storeid, 
{});
+eval {
+   if ($mechanism eq 'tar') {
+   my $tar_path = $info->{'tar-path'}
+   or die "did not get path to tar file from backup provider\n";
+   die "not a regular file '$tar_path'" if !-f $tar_path;
+   restore_tar_archive($tar_path, $rootdir, $conf, $no_unpack_error, 
$bwlimit);
+   } elsif ($mechanism eq 'directory') {
+   my $directory = $info->{'archive-directory'}
+   or die "did not get path to archive directory from backup 
provider\n";
+   die "not a directory '$directory'" if !-d $directory;
+
+   my $rsync = ['rsync', '--stats', '-h', '-X', '-A', '--numeric-ids', 
'-aH', '--delete',
+   '--no-whole-file', '--sparse', '--one-file-system', 
'--relative'];
+   push $rsync->@*, '--bwlimit', $bwlimit if $bwlimit;
+   push $rsync->@*, "${directory}/./", $rootdir;
+
+   my $transferred = '';
+   my $outfunc = sub {
+   return if $_[0] !~ /^Total transferred file size: (.+)$/;
+   $transferred = $1;
+   };
+   my $errfunc = sub { log_warn($_[0]); };
+
+   my $starttime = time();
+   PVE::Tools::run_command($rsync, outfunc => $outfunc, errfunc => 
$errfunc);
+   my $delay = time () - $starttime;
+
+   print "sync finished - transferred ${transferred} in ${delay}s\n";
+   } else {
+   die "mechanism '$mechanism' requested by backup provider is not 
supported for LXCs\n";
+   }
+};
+my $err = $@;
+eval { $backup_provider->restore_container_cleanup($volname, $storeid, 
{}); };
+if (my $cleanup_err = $@) {
+   die $cleanup_err if !$err;
+   warn $cleanup_err;
+}
+die $err if $err;
+
+}
+
 sub recover_config {
 my ($storage_cfg, $volid, $vmid) = @_;
 
@@ -126,6 +193,8 @@ sub recover_config {
my $scfg = PVE::Storage::storage_check_enabled($storage_cfg, $storeid);
if ($scfg->{type} eq 'pbs') {
return recover_config_from_proxmox_backup($storage_cfg, $volid, 
$vmid);
+   } elsif (PVE::Storage::new_backup_provider($storage_cfg, $storeid, sub 
{})) {
+   return recover_config_from_external_backup($storage_cfg, $volid, 
$vmid);
}
 }
 
@@ -200,6 +269,26 @@ sub recover_config_from_tar {
 return wantarray ? ($conf, $mp_param) : $conf;
 }
 
+sub recover_config_from_external_backup {
+my ($storage_cfg, $volid, $vmid) = @_;
+
+$vmid //= 0;
+
+my $raw = PVE::Storage::extract_vzdump_config($storage_cfg, $volid);
+
+my $conf = PVE::LXC::Config::parse_pct_config("/lxc/${vmid}.conf" , $raw);
+
+delete $conf->{snapshots};
+
+my $mp_param = {};
+PVE::LXC::Config->foreach_volume($conf, sub {
+   my ($ms, $mountpoint) = @_;
+   $mp_param->{$ms} = $conf->{$ms};
+});
+
+

[pve-devel] applied: [PATCH docs] pve-network: correct language errors

2024-08-13 Thread Fiona Ebner
Am 13.08.24 um 12:23 schrieb Alexander Zeidler:
> Signed-off-by: Alexander Zeidler 
> ---
>  pve-network.adoc | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/pve-network.adoc b/pve-network.adoc
> index acdcf39..434430d 100644
> --- a/pve-network.adoc
> +++ b/pve-network.adoc
> @@ -447,7 +447,7 @@ slaves in the single logical bonded interface such that 
> different
>  network-peers use different MAC addresses for their network packet
>  traffic.
>  
> -If your switch support the LACP (IEEE 802.3ad) protocol then we recommend 
> using
> +If your switch supports the LACP (IEEE 802.3ad) protocol then we recommend 
> using
>  the corresponding bonding mode (802.3ad). Otherwise you should generally use 
> the
>  active-backup mode.
>  
> @@ -490,7 +490,7 @@ iface vmbr0 inet static
>  
>  
>  [thumbnail="default-network-setup-bond.svg"]
> -Another possibility it to use the bond directly as bridge port.
> +Another possibility is to use the bond directly as bridge port.
>  This can be used to make the guest network fault-tolerant.
>  
>  .Example: Use a bond as bridge port

applied, thanks! While adding the missing comma Lukas noted and his R-b.


___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] applied: [PATCH docs] local-zfs: correct language errors

2024-08-13 Thread Fiona Ebner
Am 13.08.24 um 12:23 schrieb Alexander Zeidler:
> Signed-off-by: Alexander Zeidler 
> ---
>  local-zfs.adoc | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 

applied, thanks! Squashed in a change to wrap lines that would've been
longer than 80 characters.


___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



Re: [pve-devel] applied: [PATCH docs] pve-network: correct language errors

2024-08-13 Thread Fiona Ebner
Am 13.08.24 um 15:57 schrieb Fiona Ebner:
> Am 13.08.24 um 12:23 schrieb Alexander Zeidler:
>> Signed-off-by: Alexander Zeidler 
>> ---
>>  pve-network.adoc | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/pve-network.adoc b/pve-network.adoc
>> index acdcf39..434430d 100644
>> --- a/pve-network.adoc
>> +++ b/pve-network.adoc
>> @@ -447,7 +447,7 @@ slaves in the single logical bonded interface such that 
>> different
>>  network-peers use different MAC addresses for their network packet
>>  traffic.
>>  
>> -If your switch support the LACP (IEEE 802.3ad) protocol then we recommend 
>> using
>> +If your switch supports the LACP (IEEE 802.3ad) protocol then we recommend 
>> using
>>  the corresponding bonding mode (802.3ad). Otherwise you should generally 
>> use the
>>  active-backup mode.
>>  
>> @@ -490,7 +490,7 @@ iface vmbr0 inet static
>>  
>>  
>>  [thumbnail="default-network-setup-bond.svg"]
>> -Another possibility it to use the bond directly as bridge port.
>> +Another possibility is to use the bond directly as bridge port.
>>  This can be used to make the guest network fault-tolerant.
>>  
>>  .Example: Use a bond as bridge port
> 
> applied, thanks! While at it, I added the missing comma Lukas noted and his R-b.

I also added a follow-up using "as the bridge port" instead of "as
bridge port" while at it.


___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



Re: [pve-devel] [RFC firewall/proxmox{-ve-rs, -firewall, -perl-rs} 00/21] autogenerate ipsets for sdn objects

2024-08-13 Thread Max Carrara
On Wed Jun 26, 2024 at 2:15 PM CEST, Stefan Hanreich wrote:
> This patch series adds support for autogenerating ipsets for SDN objects. It
> autogenerates ipsets for every VNet as follows:
>
> * ipset containing all IP ranges of the VNet
> * ipset containing all gateways of the VNet
> * ipset containing all IP ranges of the subnet - except gateways
> * ipset containing all dhcp ranges of the vnet

Gave the entire RFC a spin - apart from a few minor version bumps and
one small rejected hunk, everything applied and built just fine.
Encountered no other issues in that regard.

My full review can be found below.

Review: RFC: autogenerate ipsets for sdn objects


Building


As I also mention in some patches inline, a couple version bumps were
necessary to get everything to build correctly.

Those are as follows:

- proxmox-ve-rs
  - proxmox-sys = "0.6.2"
  - serde_with = "3.8.1"

- proxmox-firewall
  - proxmox-sys = "0.6.2"

Additionally, patch 21 contained one rejected hunk:

diff a/pve-rs/Cargo.toml b/pve-rs/Cargo.toml(rejected hunks)
@@ -43,3 +43,4 @@ proxmox-subscription = "0.4"
 proxmox-sys = "0.5"
 proxmox-tfa = { version = "4.0.4", features = ["api"] }
 proxmox-time = "2"
+proxmox-ve-config = { version = "0.1.0" }


This got rejected because `proxmox-sys` was already bumped to "0.6" in
`pve-rs`.

Simply adding the dependency manually allowed me to build `pve-rs`.

Testing
---

You saw a lot of this off-list already (thanks again for the help btw),
but want to mention it here as well, so it doesn't get lost.

Setup
*

- Installed all packages on my development VM on which I then
  performed my tests.
- No issues were encountered during installation.
- Installed `dnsmasq` on the VM.
- Disabled `dnsmasq` permanently via `systemctl disable --now dnsmasq`.

Simple Zone & VNet
**

- Added a new simple zone in Datacenter > SDN > Zones.
  - Enabled automatic DHCP on the zone.
- Added a new VNet named `vnet0` and assigned it to the new simple zone.
  - Subnet: 172.16.100.0/24
- Gateway: 172.16.100.1
- DHCP Range: 172.16.100.100 - 172.16.100.200
- SNAT: enabled

Cluster Firewall


- Edited cluster firewall rules.
- Immediately noticed that the new ipsets appeared in the UI and were
  able to be selected.
- Configured a basic firewall rule as example.
  - Type: out
  - Action: REJECT
  - Macro: Web
  - Source: +dc/vnet0-all
  - Log: info
- While this does not block web traffic for VMs, this was nevertheless
  useful to check whether the ipsets and `iptables` FW config were
  generated correctly.
- Relevant output of `iptables-save`:

# iptables-save | grep -E '\-\-dport (80|443) '
-A PVEFW-HOST-OUT -p tcp -m set --match-set PVEFW-0-vnet0-all-v4 src -m tcp 
--dport 80 -m limit --limit 1/sec -j NFLOG --nflog-prefix ":0:6:PVEFW-HOST-OUT: 
REJECT: "
-A PVEFW-HOST-OUT -p tcp -m set --match-set PVEFW-0-vnet0-all-v4 src -m tcp 
--dport 80 -j PVEFW-reject
-A PVEFW-HOST-OUT -p tcp -m set --match-set PVEFW-0-vnet0-all-v4 src -m tcp 
--dport 443 -m limit --limit 1/sec -j NFLOG --nflog-prefix 
":0:6:PVEFW-HOST-OUT: REJECT: "
-A PVEFW-HOST-OUT -p tcp -m set --match-set PVEFW-0-vnet0-all-v4 src -m tcp 
--dport 443 -j PVEFW-reject

- The set passed via `--match-set` also appears in `ipset -L`:

# ipset -L | grep -A 8 'PVEFW-0-vnet0-all-v4'
Name: PVEFW-0-vnet0-all-v4
Type: hash:net
Revision: 7
Header: family inet hashsize 64 maxelem 64 bucketsize 12 initval 0xa4c09bc0
Size in memory: 504
References: 12
Number of entries: 1
Members:
172.16.100.0/24

- Very nice.

- All of the other ipsets for the VNet also appear in `ipset -L` as expected
  (-all, -dhcp, -gateway for v4 and v6 each)

- When removing `+dc/vnet0-all` from the firewall rule and
  leaving the source empty, outgoing web traffic was blocked, as
  expected.
  - Keeping it there does *not* block outgoing web traffic, as expected.

Host Firewall
*

- The cluster firewall rule above was deactivated.
- Otherwise, the exact same steps as above were performed, just on the
  host firewall.
- The results are exactly the same, as expected.

VM / CT ipsets
**

- All containers and VMs correctly got their own ipset
  (checked with `ipset -L`).
- Assigning a VM to the VNet makes it show up in IPAM and also updates
  its corresponding ipset correctly.
- Adding the same firewall rule as above to a VM blocks the VM's
  outgoing web traffic, as expected.
  - Changing the source to the VM's ipset, in this case `+guest-ipam-102`,
also blocks the VM's outgoing web traffic, as expected.

- Output of `iptables-save | grep -E '\-\-dport (80|443) '` on the node:

# iptables-save | grep -E '\-\-dport (80|443) ' 


  

Re: [pve-devel] [PATCH proxmox-ve-rs 01/21] debian: add files for packaging

2024-08-13 Thread Max Carrara
On Wed Jun 26, 2024 at 2:15 PM CEST, Stefan Hanreich wrote:
> Since we now have a standalone repository for Proxmox VE related
> crates, add the required files for packaging the crates contained in
> this repository.
>
> Signed-off-by: Stefan Hanreich 
> ---
>  .cargo/config.toml |  5 ++
>  .gitignore |  8 +++
>  Cargo.toml | 17 +++
>  Makefile   | 69 ++
>  build.sh   | 35 +
>  bump.sh| 44 
>  proxmox-ve-config/Cargo.toml   | 16 +++---
>  proxmox-ve-config/debian/changelog |  5 ++
>  proxmox-ve-config/debian/control   | 43 
>  proxmox-ve-config/debian/copyright | 19 +++
>  proxmox-ve-config/debian/debcargo.toml |  4 ++
>  11 files changed, 255 insertions(+), 10 deletions(-)
>  create mode 100644 .cargo/config.toml
>  create mode 100644 .gitignore
>  create mode 100644 Cargo.toml
>  create mode 100644 Makefile
>  create mode 100755 build.sh
>  create mode 100755 bump.sh
>  create mode 100644 proxmox-ve-config/debian/changelog
>  create mode 100644 proxmox-ve-config/debian/control
>  create mode 100644 proxmox-ve-config/debian/copyright
>  create mode 100644 proxmox-ve-config/debian/debcargo.toml
>
> diff --git a/.cargo/config.toml b/.cargo/config.toml
> new file mode 100644
> index 000..3b5b6e4
> --- /dev/null
> +++ b/.cargo/config.toml
> @@ -0,0 +1,5 @@
> +[source]
> +[source.debian-packages]
> +directory = "/usr/share/cargo/registry"
> +[source.crates-io]
> +replace-with = "debian-packages"
> diff --git a/.gitignore b/.gitignore
> new file mode 100644
> index 000..d72b68b
> --- /dev/null
> +++ b/.gitignore
> @@ -0,0 +1,8 @@
> +/target
> +/*/target
> +Cargo.lock
> +**/*.rs.bk
> +/*.buildinfo
> +/*.changes
> +/build
> +/*-deb
> diff --git a/Cargo.toml b/Cargo.toml
> new file mode 100644
> index 000..ab23d89
> --- /dev/null
> +++ b/Cargo.toml
> @@ -0,0 +1,17 @@
> +[workspace]
> +members = [
> +"proxmox-ve-config",
> +]
> +exclude = [
> +"build",
> +]
> +resolver = "2"
> +
> +[workspace.package]
> +authors = ["Proxmox Support Team "]
> +edition = "2021"
> +license = "AGPL-3"
> +homepage = "https://proxmox.com";
> +exclude = [ "debian" ]
> +rust-version = "1.70"
> +
> diff --git a/Makefile b/Makefile
> new file mode 100644
> index 000..0da9b74
> --- /dev/null
> +++ b/Makefile
> @@ -0,0 +1,69 @@
> +# Shortcut for common operations:
> +
> +CRATES != echo proxmox-*/Cargo.toml | sed -e 's|/Cargo.toml||g'
> +
> +# By default we just run checks:
> +.PHONY: all
> +all: check
> +
> +.PHONY: deb
> +deb: $(foreach c,$(CRATES), $c-deb)
> + echo $(foreach c,$(CRATES), $c-deb)
> + lintian build/*.deb
> +
> +.PHONY: dsc
> +dsc: $(foreach c,$(CRATES), $c-dsc)
> + echo $(foreach c,$(CRATES), $c-dsc)
> + lintian build/*.dsc
> +
> +.PHONY: autopkgtest
> +autopkgtest: $(foreach c,$(CRATES), $c-autopkgtest)
> +
> +.PHONY: dinstall
> +dinstall:
> + $(MAKE) clean
> + $(MAKE) deb
> + sudo -k dpkg -i build/librust-*.deb
> +
> +%-deb:
> + ./build.sh $*
> + touch $@
> +
> +%-dsc:
> + BUILDCMD='dpkg-buildpackage -S -us -uc -d' ./build.sh $*
> + touch $@
> +
> +%-autopkgtest:
> + autopkgtest build/$* build/*.deb -- null
> + touch $@
> +
> +.PHONY: check
> +check:
> + cargo test
> +
> +# Prints a diff between the current code and the one rustfmt would produce
> +.PHONY: fmt
> +fmt:
> + cargo +nightly fmt -- --check
> +
> +# Doc without dependencies
> +.PHONY: doc
> +doc:
> + cargo doc --no-deps
> +
> +.PHONY: clean
> +clean:
> + cargo clean
> + rm -rf build/
> + rm -f -- *-deb *-dsc *-autopkgtest *.build *.buildinfo *.changes
> +
> +.PHONY: update
> +update:
> + cargo update
> +
> +%-upload: %-deb
> + cd build; \
> + dcmd --deb rust-$*_*.changes \
> + | grep -v '.changes$$' \
> + | tar -cf "$@.tar" -T-; \
> + cat "$@.tar" | ssh -X repo...@repo.proxmox.com upload --product 
> devel --dist bookworm
> diff --git a/build.sh b/build.sh
> new file mode 100755
> index 000..39a8302
> --- /dev/null
> +++ b/build.sh
> @@ -0,0 +1,35 @@
> +#!/bin/sh
> +
> +set -eux
> +
> +export CARGO=/usr/bin/cargo
> +export RUSTC=/usr/bin/rustc
> +
> +CRATE=$1
> +BUILDCMD=${BUILDCMD:-"dpkg-buildpackage -b -uc -us"}
> +
> +mkdir -p build
> +echo system >build/rust-toolchain
> +rm -rf "build/${CRATE}"
> +
> +CONTROL="$PWD/${CRATE}/debian/control"
> +
> +if [ -e "$CONTROL" ]; then
> +# check but only warn, debcargo fails anyway if crates are missing
> +dpkg-checkbuilddeps $PWD/${CRATE}/debian/control || true
> +# rm -f "$PWD/${CRATE}/debian/control"
> +fi
> +
> +debcargo package \
> +--config "$PWD/${CRATE}/debian/debcargo.toml" \
> +--changelog-ready \
> +--no-overlay-write-back \
> +--directory "$PWD/build/${CRATE}" \
> +"${CRATE}" \
> + 

Re: [pve-devel] [PATCH proxmox-ve-rs 02/21] firewall: add ip range types

2024-08-13 Thread Max Carrara
On Wed Jun 26, 2024 at 2:15 PM CEST, Stefan Hanreich wrote:
> Currently we are using tuples to represent IP ranges which is
> suboptimal. Validation logic and invariant checking needs to happen at
> every site using the IP range rather than having a unified struct for
> enforcing those invariants.

That's something I completely support; as you know I'm a fan of
representing state / invariants / etc. via types ;)

>
> Signed-off-by: Stefan Hanreich 
> ---
>  .../src/firewall/types/address.rs | 230 +-
>  1 file changed, 228 insertions(+), 2 deletions(-)
>
> diff --git a/proxmox-ve-config/src/firewall/types/address.rs 
> b/proxmox-ve-config/src/firewall/types/address.rs
> index e48ac1b..ddf4652 100644
> --- a/proxmox-ve-config/src/firewall/types/address.rs
> +++ b/proxmox-ve-config/src/firewall/types/address.rs
> @@ -1,9 +1,9 @@
> -use std::fmt;
> +use std::fmt::{self, Display};
>  use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};
>  use std::ops::Deref;
>  
>  use anyhow::{bail, format_err, Error};
> -use serde_with::DeserializeFromStr;
> +use serde_with::{DeserializeFromStr, SerializeDisplay};
>  
>  #[derive(Clone, Copy, Debug, Eq, PartialEq)]
>  pub enum Family {
> @@ -239,6 +239,202 @@ impl> From for Ipv6Cidr {
>  }
>  }
>  
> +#[derive(Clone, Copy, Debug, PartialOrd, Ord, PartialEq, Eq, Hash)]
> +pub enum IpRangeError {
> +MismatchedFamilies,
> +StartGreaterThanEnd,
> +InvalidFormat,
> +}
> +
> +impl std::error::Error for IpRangeError {}
> +
> +impl Display for IpRangeError {
> +fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
> +f.write_str(match self {
> +IpRangeError::MismatchedFamilies => "mismatched ip address 
> families",
> +IpRangeError::StartGreaterThanEnd => "start is greater than end",
> +IpRangeError::InvalidFormat => "invalid ip range format",
> +})
> +}
> +}
> +
> +/// represents a range of IPv4 or IPv6 addresses

Small thing: I'd prefer

Represents a range of IPv4 or IPv6 addresses.

  instead.

Mainly because most docstrings are written that way, and you do it in
later patches in a couple places as well anyways. Just gonna mention
this here once as you do this a couple times in this series in order to
avoid unnecessary noise.

IMO it's a really minor thing, but since you told me off-list that
you're in the process of neatly documenting everything, I thought I'd
mention it here.

> +///
> +/// For more information see [`AddressRange`]
> +#[derive(Clone, Copy, Debug, PartialEq, Eq, SerializeDisplay, 
> DeserializeFromStr)]
> +pub enum IpRange {
> +V4(AddressRange),
> +V6(AddressRange),
> +}
> +
> +impl IpRange {
> +/// returns the family of the IpRange
> +pub fn family(&self) -> Family {
> +match self {
> +IpRange::V4(_) => Family::V4,
> +IpRange::V6(_) => Family::V6,
> +}
> +}
> +
> +/// creates a new [`IpRange`] from two [`IpAddr`]
> +///
> +/// # Errors
> +///
> +/// This function will return an error if start and end IP address are 
> not from the same family.
> +pub fn new(start: impl Into, end: impl Into) -> 
> Result {
> +match (start.into(), end.into()) {
> +(IpAddr::V4(start), IpAddr::V4(end)) => Self::new_v4(start, end),
> +(IpAddr::V6(start), IpAddr::V6(end)) => Self::new_v6(start, end),
> +_ => Err(IpRangeError::MismatchedFamilies),
> +}
> +}
> +
> +/// construct a new Ipv4 Range
> +pub fn new_v4(
> +start: impl Into,
> +end: impl Into,
> +) -> Result {
> +Ok(IpRange::V4(AddressRange::new_v4(start, end)?))
> +}
> +
> +pub fn new_v6(
> +start: impl Into,
> +end: impl Into,
> +) -> Result {
> +Ok(IpRange::V6(AddressRange::new_v6(start, end)?))
> +}
> +}
> +
> +impl std::str::FromStr for IpRange {
> +type Err = IpRangeError;
> +
> +fn from_str(s: &str) -> Result {
> +if let Ok(range) = s.parse() {
> +return Ok(IpRange::V4(range));
> +}
> +
> +if let Ok(range) = s.parse() {
> +return Ok(IpRange::V6(range));
> +}
> +
> +Err(IpRangeError::InvalidFormat)
> +}
> +}
> +
> +impl fmt::Display for IpRange {
> +fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
> +match self {
> +IpRange::V4(range) => range.fmt(f),
> +IpRange::V6(range) => range.fmt(f),
> +}
> +}
> +}
> +
> +/// represents a range of IP addresses from start to end
> +///
> +/// This type is for encapsulation purposes for the [`IpRange`] enum and 
> should be instantiated via
> +/// that enum.
> +///
> +/// # Invariants
> +///
> +/// * start and end have the same IP address family
> +/// * start is lesser than or equal to end
> +///
> +/// # Textual representation
> +///
> +/// Two IP addresses separated by a hyphen, e.g.: `127.0.0.1-127.0.0.255`
> +#[derive(Clone, Copy, Debug,

Re: [pve-devel] [PATCH proxmox-ve-rs 05/21] iprange: add methods for converting an ip range to cidrs

2024-08-13 Thread Max Carrara
On Wed Jun 26, 2024 at 2:15 PM CEST, Stefan Hanreich wrote:
> This is mainly used in proxmox-perl-rs, so the generated ipsets can be
> used in pve-firewall where only CIDRs are supported.
>
> Signed-off-by: Stefan Hanreich 
> ---
>  .../src/firewall/types/address.rs | 818 ++
>  1 file changed, 818 insertions(+)
>
> diff --git a/proxmox-ve-config/src/firewall/types/address.rs 
> b/proxmox-ve-config/src/firewall/types/address.rs
> index 8db3942..3238601 100644
> --- a/proxmox-ve-config/src/firewall/types/address.rs
> +++ b/proxmox-ve-config/src/firewall/types/address.rs
> @@ -303,6 +303,17 @@ impl IpRange {
>  ) -> Result {
>  Ok(IpRange::V6(AddressRange::new_v6(start, end)?))
>  }
> +
> +/// converts an IpRange into the minimal amount of CIDRs
> +///
> +/// see the concrete implementations of [`AddressRange`] or 
> [`AddressRange`]
> +/// respectively
> +pub fn to_cidrs(&self) -> Vec {
> +match self {
> +IpRange::V4(range) => 
> range.to_cidrs().into_iter().map(Cidr::from).collect(),
> +IpRange::V6(range) => 
> range.to_cidrs().into_iter().map(Cidr::from).collect(),
> +}
> +}
>  }
>  
>  impl std::str::FromStr for IpRange {
> @@ -362,6 +373,71 @@ impl AddressRange {
>  
>  Ok(Self { start, end })
>  }
> +
> +/// returns the minimum amount of CIDRs that exactly represent the range
> +///
> +/// The idea behind this algorithm is as follows:
> +///
> +/// Start iterating with current = start of the IP range
> +///
> +/// Find two netmasks
> +/// * The largest CIDR that the current IP can be the first of
> +/// * The largest CIDR that *only* contains IPs from current - end
> +///
> +/// Add the smaller of the two CIDRs to our result and current to the 
> first IP that is in
> +/// the range but not in the CIDR we just added. Proceed until we 
> reached the end of the IP
> +/// range.

Would maybe prefer some more inline formatting / minor rewording
regarding the algorithm's steps above, simply for readability's sake
(e.g. when rendering the docs).

Sort of like:


1. Start iteration: Set `current` to `start` of the IP range

2. Find two netmasks:
  - The largest CIDR that the `current` IP can be the first of
  - The largest CIDR that *only* contains IPs from `current` to `end`

3. Add the smaller of the two CIDRs to our result and `current` to the first IP 
that is in
the range but *not* in the CIDR we just added. Proceed until we reached the end 
of the IP
range.


Again, just a small thing, but thought I'd mention it.

> +///
> +pub fn to_cidrs(&self) -> Vec {
> +let mut cidrs = Vec::new();
> +
> +let mut current = u32::from_be_bytes(self.start.octets());
> +let end = u32::from_be_bytes(self.end.octets());
> +
> +if current == end {
> +// valid Ipv4 since netmask is 32
> +cidrs.push(Ipv4Cidr::new(current, 32).unwrap());
> +return cidrs;
> +}
> +
> +// special case this, since this is the only possibility of overflow
> +// when calculating delta_min_mask - makes everything a lot easier
> +if current == u32::MIN && end == u32::MAX {
> +// valid Ipv4 since it is `0.0.0.0/0`
> +cidrs.push(Ipv4Cidr::new(current, 0).unwrap());
> +return cidrs;
> +}
> +
> +while current <= end {
> +// netmask of largest CIDR that current IP can be the first of
> +// cast is safe, because trailing zeroes can at most be 32
> +let current_max_mask = IPV4_LENGTH - (current.trailing_zeros() 
> as u8);
> +
> +// netmask of largest CIDR that *only* contains IPs of the 
> remaining range
> +// is at most 32 due to unwrap_or returning 32 and ilog2 being 
> at most 31
> +let delta_min_mask = ((end - current) + 1) // safe due to 
> special case above
> +.checked_ilog2() // should never occur due to special case, 
> but for good measure
> +.map(|mask| IPV4_LENGTH - mask as u8)
> +.unwrap_or(IPV4_LENGTH);
> +
> +// at most 32, due to current/delta being at most 32
> +let netmask = u8::max(current_max_mask, delta_min_mask);
> +
> +// netmask is at most 32, therefore safe to unwrap
> +cidrs.push(Ipv4Cidr::new(current, netmask).unwrap());
> +
> +let delta = 2u32.saturating_pow((IPV4_LENGTH - netmask).into());
> +
> +if let Some(result) = current.checked_add(delta) {
> +current = result
> +} else {
> +// we reached the end of IP address space
> +break;
> +}
> +}
> +
> +cidrs
> +}
>  }
>  
>  impl AddressRange {
> @@ -377,6 +453,61 @@ impl AddressRange {
>  
>  Ok(Self { start, end })
>  }
> +
> +/// returns the minimum amount 

Re: [pve-devel] [PATCH proxmox-ve-rs 10/21] sdn: add ipam module

2024-08-13 Thread Max Carrara
On Wed Jun 26, 2024 at 2:15 PM CEST, Stefan Hanreich wrote:
> This module includes structs for representing the JSON schema from the
> PVE ipam. Those can be used to parse the current IPAM state.
>
> We also include a general Ipam struct, and provide a method for
> converting the PVE IPAM to the general struct. The idea behind this
> is that we have multiple IPAM plugins in PVE and will likely add
> support for importing them in the future. With the split, we can have
> our dedicated structs for representing the different data formats from
> the different IPAM plugins and then convert them into a common
> representation that can then be used internally, decoupling the
> concrete plugin from the code using the IPAM configuration.

Big fan of this - as I had already mentioned, I find it always nice to
have different types for such things.

IMO it would be neat if those types were logically grouped though, e.g.
the types for PVE could live in a separate `mod` inside the file. Or, if
you want, you can also convert `ipam.rs` to a `ipam/mod.rs` and add
further files depending on the types of different representations there.

Either solution would make it harder for these types to become
"intermingled" in the future, IMO. So this kind of grouping would simply
serve as a "decent barrier" between the separate representations.
Perhaps the module(s) could then also use a little bit of developer
documentation or something that describes how and why the types are
organized that way.

It's probably best to just add `mod`s for now, as those can be split up
into files later anyway. (Just don't `use super::*;` :P )
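
Roughly, the layout could then end up looking something like this (just an
illustration of the suggestion above, not taken from the patch):

    proxmox-ve-config/src/sdn/ipam/
        mod.rs - common, plugin-agnostic `Ipam` representation and re-exports
        pve.rs - types mirroring the PVE IPAM JSON schema (the IpamJsonData* structs)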

>
> Enforcing the invariants the way we currently do adds a bit of runtime
> complexity when building the object, but we get the upside of never
> being able to construct an invalid struct. For the amount of entries
> the ipam usually has, this should be fine. Should it turn out to be
> not performant enough we could always add a HashSet for looking up
> values and speeding up the validation. For now, I wanted to avoid the
> additional complexity.
>
> Signed-off-by: Stefan Hanreich 
> ---
>  .../src/firewall/types/address.rs |   8 +
>  proxmox-ve-config/src/guest/vm.rs |   4 +
>  proxmox-ve-config/src/sdn/ipam.rs | 330 ++
>  proxmox-ve-config/src/sdn/mod.rs  |   2 +
>  4 files changed, 344 insertions(+)
>  create mode 100644 proxmox-ve-config/src/sdn/ipam.rs
>
> diff --git a/proxmox-ve-config/src/firewall/types/address.rs 
> b/proxmox-ve-config/src/firewall/types/address.rs
> index a0b82c5..3ad1a7a 100644
> --- a/proxmox-ve-config/src/firewall/types/address.rs
> +++ b/proxmox-ve-config/src/firewall/types/address.rs
> @@ -61,6 +61,14 @@ impl Cidr {
>  pub fn is_ipv6(&self) -> bool {
>  matches!(self, Cidr::Ipv6(_))
>  }
> +
> +pub fn contains_address(&self, ip: &IpAddr) -> bool {
> +match (self, ip) {
> +(Cidr::Ipv4(cidr), IpAddr::V4(ip)) => cidr.contains_address(ip),
> +(Cidr::Ipv6(cidr), IpAddr::V6(ip)) => cidr.contains_address(ip),
> +_ => false,
> +}
> +}
>  }
>  
>  impl fmt::Display for Cidr {
> diff --git a/proxmox-ve-config/src/guest/vm.rs 
> b/proxmox-ve-config/src/guest/vm.rs
> index a7ea9bb..6a706c7 100644
> --- a/proxmox-ve-config/src/guest/vm.rs
> +++ b/proxmox-ve-config/src/guest/vm.rs
> @@ -17,6 +17,10 @@ static LOCAL_PART: [u8; 8] = [0xFE, 0x80, 0x00, 0x00, 
> 0x00, 0x00, 0x00, 0x00];
>  static EUI64_MIDDLE_PART: [u8; 2] = [0xFF, 0xFE];
>  
>  impl MacAddress {
> +pub fn new(address: [u8; 6]) -> Self {
> +Self(address)
> +}
> +
>  /// generates a link local IPv6-address according to RFC 4291 (Appendix 
> A)
>  pub fn eui64_link_local_address(&self) -> Ipv6Addr {
>  let head = &self.0[..3];
> diff --git a/proxmox-ve-config/src/sdn/ipam.rs 
> b/proxmox-ve-config/src/sdn/ipam.rs
> new file mode 100644
> index 000..682bbe7
> --- /dev/null
> +++ b/proxmox-ve-config/src/sdn/ipam.rs
> @@ -0,0 +1,330 @@
> +use std::{
> +collections::{BTreeMap, HashMap},
> +error::Error,
> +fmt::Display,
> +net::IpAddr,
> +};
> +
> +use serde::Deserialize;
> +
> +use crate::{
> +firewall::types::Cidr,
> +guest::{types::Vmid, vm::MacAddress},
> +sdn::{SdnNameError, SubnetName, ZoneName},
> +};
> +
> +/// struct for deserializing a gateway entry in PVE IPAM
> +///
> +/// They are automatically generated by the PVE SDN module when creating a 
> new subnet.
> +#[derive(Clone, Debug, Deserialize, PartialEq, Eq, Hash, PartialOrd, Ord)]
> +pub struct IpamJsonDataGateway {
> +#[serde(rename = "gateway")]
> +_gateway: u8,
> +}
> +
> +/// struct for deserializing a guest entry in PVE IPAM
> +///
> +/// They are automatically created when adding a guest to a VNet that has a 
> Subnet with DHCP
> +/// configured.
> +#[derive(Clone, Debug, Deserialize, PartialEq, Eq, Hash, PartialOrd, Ord)]
> +pub struct IpamJsonDataVm {
> +  

Re: [pve-devel] issues with Virtio-SCSI devicde on Proxmox...

2024-08-13 Thread DERUMIER, Alexandre via pve-devel
--- Begin Message ---
Hi,

Currently there is no way to add custom options for the virtio device
command line from the VM config, so qemu-server would have to be patched
to add support for an OpenVMS ostype and apply the special tuning.

For example: https://git.proxmox.com/?p=qemu-server.git;a=blob_plain;f=PVE/QemuServer.pm;hb=HEAD

sub print_netdevice_full {
    ...
    if ($conf->{ostype} && $conf->{ostype} eq 'openvms') {
        $tmpstr .= ",disable_legacy=on";
    }
    ...
}
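
A similar (purely hypothetical, untested) hook would be needed where the
virtio-scsi controller string itself is assembled, presumably in
config_to_command(); everything below is only illustrative, not actual
qemu-server code:

    # hypothetical sketch only - names and pci address are made up
    my $scsi_controller = "virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5";
    if ($conf->{ostype} && $conf->{ostype} eq 'openvms') {
        # note: on the QEMU command line the property names use dashes
        $scsi_controller .= ",disable-legacy=on,disable-modern=off";
    }
    push @$devices, '-device', $scsi_controller;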


>>I have tried both “virtio-scsi-pci” and “virtio-scsi-single”, but no
>>luck. The emulated virtio-scsi device is a legacy device
>>
>>
>>Questions:
>>* is there a way to configure a modern virtio-scsi devcie (i.e.
>>disable_legacy=on) ?
>>

Do you mean that currently on Proxmox the default is legacy mode and
not modern? I was pretty sure that qemu was defaulting to modern,

and that you need to use ",disable-legacy=off,disable-modern=true" to
switch to legacy.

(and AFAIK, the Proxmox qemu version doesn't have any patch forcing legacy)

but maybe I'm wrong



Maybe it's only enabled by default with PCI Express (using the q35 machine
model)?

https://github.com/qemu/qemu/blob/9eb51530c12ae645b91e308d16196c68563ea883/docs/pcie.txt
"Virtio devices plugged into PCI Express ports are PCI Express devices
and
have "1.0" behavior by default without IO support.
In both cases disable-legacy and disable-modern properties can be used
to override the behaviour."


in qemu code I see:

https://github.com/qemu/qemu/blob/9eb51530c12ae645b91e308d16196c68563ea883/hw/core/machine.c#L258
GlobalProperty hw_compat_2_6[] = {
{ "virtio-mmio", "format_transport_address", "off" },
/* Optional because not all virtio-pci devices support legacy mode
*/
{ "virtio-pci", "disable-modern", "on",  .optional = true },
{ "virtio-pci", "disable-legacy", "off", .optional = true },
};
const size_t hw_compat_2_6_len = G_N_ELEMENTS(hw_compat_2_6);

so I'm really not sure what the real default is.


Do you have the qemu command line of another KVM hypervisor
(oVirt, RHEV, ...) where it's working correctly? Maybe they are
enabling the modern bits manually when a virtio device supports it?
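
For a quick manual test (illustrative only, nothing PVE exposes in its API
or GUI), the modern-only mode could probably be forced through the VM's
'args' option, e.g. something like:

    qm set <vmid> --args '-device virtio-scsi-pci,id=scsihw1,disable-legacy=on,disable-modern=off'

with the usual caveat that devices added via 'args' are invisible to the
normal PVE disk handling, so a disk would have to be attached to that extra
controller manually as well.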



 Original message 
From: Christian Moser 
Reply-To: Proxmox VE development discussion 
To: pve-devel@lists.proxmox.com
Cc: Christian Moser 
Subject: [pve-devel] issues with Virtio-SCSI devicde on Proxmox...
Date: 12/08/2024 12:40:10

Hello,

I work for VSI (VMS Software Inc) which is porting the OpenVMS
operating system to x86. At this point we successfully on various
hypervisors, but have some issues on the KVM running on Proxmox.

The OpenVMS VM works just fine with SATA disks and it also works with
for example virtio-network device etc., but trying to use
virtio-scsi hangs  the driver. I have debugged this and I can
successfully configure the port/controller, send the IO request to the
device. It then gets processed by the device, which posts the results
and sets the interrupt bit in the ISR register, but it never
asserts the interrupt hence the driver never gets notified and the I/O
hangs.

I have tried both “virtio-scsi-pci” and “virtio-scsi-single”, but no
luck. The emulated virtio-scsi device is a legacy device. But
then again, the virtio-network device is also a legacy device and here
we are getting interrupts. One thing which bothers me is the
fact that the “legacy interrupt disable” bit is set in the PCI config
space of the virtio-scsi device (i.e. bit 10 at offset 4)

Questions:
*   is there a way to configure a modern virtio-scsi devcie (i.e.
disable_legacy=on) ?
*   why is the legacy interrupt bit set in the PCI config space ?
*   Are there any working driver for virtio-scsi on this KVM using
Q35 machine? i.e. any other OS

Any thoughts why these interrupts are not getting delivered on the PCIE
bus?

thanks

/cmos

___
Christian Moser
Mobile:    +358-40-5022105  
Email:  c...@maklee.com
URL:   www.maklee.com

___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

--- End Message ---
___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] [PATCH pve-firewall 19/21] add support for loading sdn firewall configuration

2024-08-13 Thread Max Carrara
On Wed Jun 26, 2024 at 2:15 PM CEST, Stefan Hanreich wrote:
> Signed-off-by: Stefan Hanreich 
> ---
>  src/PVE/Firewall.pm | 43 +--
>  1 file changed, 41 insertions(+), 2 deletions(-)
>
> diff --git a/src/PVE/Firewall.pm b/src/PVE/Firewall.pm
> index 09544ba..95325a0 100644
> --- a/src/PVE/Firewall.pm
> +++ b/src/PVE/Firewall.pm
> @@ -25,6 +25,7 @@ use PVE::Tools qw($IPV4RE $IPV6RE);
>  use PVE::Tools qw(run_command lock_file dir_glob_foreach);
>  
>  use PVE::Firewall::Helpers;
> +use PVE::RS::Firewall::SDN;
>  
>  my $pvefw_conf_dir = "/etc/pve/firewall";
>  my $clusterfw_conf_filename = "$pvefw_conf_dir/cluster.fw";
> @@ -3644,7 +3645,7 @@ sub lock_clusterfw_conf {
>  }
>  
>  sub load_clusterfw_conf {
> -my ($filename) = @_;
> +my ($filename, $load_sdn_config) = @_;

Small thing:

I would suggest using a prototype here and also accepting a hash reference
OR a hash as the last parameter, so that the call signature is a little
more readable.

E.g. right now it's:

load_clusterfw_conf(undef, 1)

VS:

load_clusterfw_conf(undef, { load_sdn_config => 1 })

Or:

load_clusterfw_conf(undef, load_sdn_config => 1)

I know we're gonna phase this whole thing out eventually, but little
things like this help a lot in the long run, IMO. It makes it a little
clearer what the subroutine does at call sites.

I'm not sure if these subroutines are used elsewhere (didn't really
bother to check, sorry), so perhaps you could pass `$filename` via the
hash as well, as an optional parameter. Then it's immediately clear what
*everything* stands for, because a sole `undef` "hides" what's actually
passed to the subroutine.
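
i.e. roughly something like this (only a sketch of the hash-ref variant):

    sub load_clusterfw_conf {
        my ($filename, $opts) = @_;
        $opts //= {};
        my $load_sdn_config = $opts->{load_sdn_config} // 0;

        $filename = $clusterfw_conf_filename if !defined($filename);
        # ... rest as before ...
    }

    # call site
    my $cluster_conf = load_clusterfw_conf(undef, { load_sdn_config => 1 });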

>  
>  $filename = $clusterfw_conf_filename if !defined($filename);
>  my $empty_conf = {
> @@ -3657,12 +3658,50 @@ sub load_clusterfw_conf {
>   ipset_comments => {},
>  };
>  
> +if ($load_sdn_config) {
> + my $sdn_conf = load_sdn_conf();
> + $empty_conf = { %$empty_conf, %$sdn_conf };
> +}
> +
>  my $cluster_conf = generic_fw_config_parser($filename, $empty_conf, 
> $empty_conf, 'cluster');
>  $set_global_log_ratelimit->($cluster_conf->{options});
>  
>  return $cluster_conf;
>  }
>  
> +sub load_sdn_conf {
> +my $rpcenv = PVE::RPCEnvironment::get();
> +my $authuser = $rpcenv->get_user();
> +
> +my $guests = PVE::Cluster::get_vmlist();
> +my $allowed_vms = [];
> +foreach my $vmid (sort keys %{$guests->{ids}}) {
> + next if !$rpcenv->check($authuser, "/vms/$vmid", [ 'VM.Audit' ], 1);
> + push @$allowed_vms, $vmid;
> +}
> +
> +my $vnets = PVE::Network::SDN::Vnets::config(1);
> +my $privs = [ 'SDN.Audit', 'SDN.Allocate' ];
> +my $allowed_vnets = [];
> +foreach my $vnet (sort keys %{$vnets->{ids}}) {
> + my $zone = $vnets->{ids}->{$vnet}->{zone};
> + next if !$rpcenv->check_any($authuser, "/sdn/zones/$zone/$vnet", 
> $privs, 1);
> + push @$allowed_vnets, $vnet;
> +}
> +
> +my $sdn_config = {
> + ipset => {} ,
> + ipset_comments => {},
> +};
> +
> +eval {
> + $sdn_config = PVE::RS::Firewall::SDN::config($allowed_vnets, 
> $allowed_vms);
> +};
> +warn $@ if $@;
> +
> +return $sdn_config;
> +}
> +
>  sub save_clusterfw_conf {
>  my ($cluster_conf) = @_;
>  
> @@ -4731,7 +4770,7 @@ sub init {
>  sub update {
>  my $code = sub {
>  
> - my $cluster_conf = load_clusterfw_conf();
> + my $cluster_conf = load_clusterfw_conf(undef, 1);
>   my $hostfw_conf = load_hostfw_conf($cluster_conf);
>  
>   if (!is_enabled_and_not_nftables($cluster_conf, $hostfw_conf)) {



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



Re: [pve-devel] [PATCH proxmox-perl-rs 21/21] add PVE::RS::Firewall::SDN module

2024-08-13 Thread Max Carrara
On Wed Jun 26, 2024 at 2:15 PM CEST, Stefan Hanreich wrote:
> Used for obtaining the IPSets that get autogenerated by the nftables
> firewall. The returned configuration has the same format as the
> pve-firewall uses internally, making it compatible with the existing
> pve-firewall code.
>
> Signed-off-by: Stefan Hanreich 
> ---
>  pve-rs/Cargo.toml  |   1 +
>  pve-rs/Makefile|   1 +
>  pve-rs/src/firewall/mod.rs |   1 +
>  pve-rs/src/firewall/sdn.rs | 130 +
>  pve-rs/src/lib.rs  |   1 +
>  5 files changed, 134 insertions(+)
>  create mode 100644 pve-rs/src/firewall/mod.rs
>  create mode 100644 pve-rs/src/firewall/sdn.rs
>
> diff --git a/pve-rs/Cargo.toml b/pve-rs/Cargo.toml
> index e40588d..f612b3a 100644
> --- a/pve-rs/Cargo.toml
> +++ b/pve-rs/Cargo.toml
> @@ -43,3 +43,4 @@ proxmox-subscription = "0.4"
>  proxmox-sys = "0.5"
>  proxmox-tfa = { version = "4.0.4", features = ["api"] }
>  proxmox-time = "2"
> +proxmox-ve-config = { version = "0.1.0" }

This hunk doesn't apply anymore because proxmox-sys was bumped.

Manually adding proxmox-ve-config as a dependency works just fine though,
so this needs just a little rebase.

> diff --git a/pve-rs/Makefile b/pve-rs/Makefile
> index c6b4e08..d01da69 100644
> --- a/pve-rs/Makefile
> +++ b/pve-rs/Makefile
> @@ -28,6 +28,7 @@ PERLMOD_GENPACKAGE := /usr/lib/perlmod/genpackage.pl \
>  
>  PERLMOD_PACKAGES := \
> PVE::RS::APT::Repositories \
> +   PVE::RS::Firewall::SDN \
> PVE::RS::OpenId \
> PVE::RS::ResourceScheduling::Static \
> PVE::RS::TFA
> diff --git a/pve-rs/src/firewall/mod.rs b/pve-rs/src/firewall/mod.rs
> new file mode 100644
> index 000..8bd18a8
> --- /dev/null
> +++ b/pve-rs/src/firewall/mod.rs
> @@ -0,0 +1 @@
> +pub mod sdn;
> diff --git a/pve-rs/src/firewall/sdn.rs b/pve-rs/src/firewall/sdn.rs
> new file mode 100644
> index 000..55f3e93
> --- /dev/null
> +++ b/pve-rs/src/firewall/sdn.rs
> @@ -0,0 +1,130 @@
> +#[perlmod::package(name = "PVE::RS::Firewall::SDN", lib = "pve_rs")]
> +mod export {
> +use std::collections::HashMap;
> +use std::{fs, io};
> +
> +use anyhow::{bail, Context, Error};
> +use serde::Serialize;
> +
> +use proxmox_ve_config::{
> +common::Allowlist,
> +firewall::types::ipset::{IpsetAddress, IpsetEntry},
> +firewall::types::Ipset,
> +guest::types::Vmid,
> +sdn::{
> +config::{RunningConfig, SdnConfig},
> +ipam::{Ipam, IpamJson},
> +SdnNameError, VnetName,

SdnNameError isn't used here.

> +},
> +};
> +
> +#[derive(Clone, Debug, Default, Serialize)]
> +pub struct LegacyIpsetEntry {
> +nomatch: bool,
> +cidr: String,
> +comment: Option,
> +}
> +
> +impl LegacyIpsetEntry {
> +pub fn from_ipset_entry(entry: &IpsetEntry) -> Vec 
> {
> +let mut entries = Vec::new();
> +
> +match &entry.address {
> +IpsetAddress::Alias(name) => {
> +entries.push(Self {
> +nomatch: entry.nomatch,
> +cidr: name.to_string(),
> +comment: entry.comment.clone(),
> +});
> +}
> +IpsetAddress::Cidr(cidr) => {
> +entries.push(Self {
> +nomatch: entry.nomatch,
> +cidr: cidr.to_string(),
> +comment: entry.comment.clone(),
> +});
> +}
> +IpsetAddress::Range(range) => {
> +entries.extend(range.to_cidrs().into_iter().map(|cidr| 
> Self {
> +nomatch: entry.nomatch,
> +cidr: cidr.to_string(),
> +comment: entry.comment.clone(),
> +}))
> +}
> +};
> +
> +entries
> +}
> +}
> +
> +#[derive(Clone, Debug, Default, Serialize)]
> +pub struct SdnFirewallConfig {
> +ipset: HashMap>,
> +ipset_comments: HashMap,
> +}
> +
> +impl SdnFirewallConfig {
> +pub fn new() -> Self {
> +Default::default()
> +}
> +
> +pub fn extend_ipsets(&mut self, ipsets: impl IntoIterator Ipset>) {
> +for ipset in ipsets {
> +let entries = ipset
> +.iter()
> +.flat_map(LegacyIpsetEntry::from_ipset_entry)
> +.collect();
> +
> +self.ipset.insert(ipset.name().name().to_string(), entries);
> +
> +if let Some(comment) = &ipset.comment {
> +self.ipset_comments
> +.insert(ipset.name().name().to_string(), 
> comment.to_string());
> +}
> +}
> +}
> +}
> +
> +const SDN_RUNNING_CONFIG: &str = "/etc/pve/sdn/.runn

[pve-devel] [PATCH installer v2 4/5] fix #5250: tui: expose new btrfs `compress` option

2024-08-13 Thread Christoph Heiss
Signed-off-by: Christoph Heiss 
---
Changes v1 -> v2:
  * rebased on master

 proxmox-tui-installer/src/views/bootdisk.rs | 37 +
 1 file changed, 24 insertions(+), 13 deletions(-)

diff --git a/proxmox-tui-installer/src/views/bootdisk.rs 
b/proxmox-tui-installer/src/views/bootdisk.rs
index 8db33dd..625374e 100644
--- a/proxmox-tui-installer/src/views/bootdisk.rs
+++ b/proxmox-tui-installer/src/views/bootdisk.rs
@@ -19,8 +19,9 @@ use proxmox_installer_common::{
 check_zfs_raid_config,
 },
 options::{
-AdvancedBootdiskOptions, BootdiskOptions, BtrfsBootdiskOptions, 
BtrfsCompressOption, Disk,
-FsType, LvmBootdiskOptions, ZfsBootdiskOptions, ZFS_CHECKSUM_OPTIONS, 
ZFS_COMPRESS_OPTIONS,
+AdvancedBootdiskOptions, BootdiskOptions, BtrfsBootdiskOptions, Disk, 
FsType,
+LvmBootdiskOptions, ZfsBootdiskOptions, BTRFS_COMPRESS_OPTIONS, 
ZFS_CHECKSUM_OPTIONS,
+ZFS_COMPRESS_OPTIONS,
 },
 setup::{BootType, ProductConfig, ProxmoxProduct, RuntimeInfo},
 };
@@ -573,12 +574,23 @@ struct BtrfsBootdiskOptionsView {
 
 impl BtrfsBootdiskOptionsView {
 fn new(runinfo: &RuntimeInfo, options: &BtrfsBootdiskOptions) -> Self {
-let view = MultiDiskOptionsView::new(
-&runinfo.disks,
-&options.selected_disks,
-FormView::new().child("hdsize", 
DiskSizeEditView::new().content(options.disk_size)),
-)
-.top_panel(TextView::new("Btrfs integration is a technology 
preview!").center());
+let inner = FormView::new()
+.child(
+"compress",
+SelectView::new()
+.popup()
+.with_all(BTRFS_COMPRESS_OPTIONS.iter().map(|o| 
(o.to_string(), *o)))
+.selected(
+BTRFS_COMPRESS_OPTIONS
+.iter()
+.position(|o| *o == options.compress)
+.unwrap_or_default(),
+),
+)
+.child("hdsize", 
DiskSizeEditView::new().content(options.disk_size));
+
+let view = MultiDiskOptionsView::new(&runinfo.disks, 
&options.selected_disks, inner)
+.top_panel(TextView::new("Btrfs integration is a technology 
preview!").center());
 
 Self { view }
 }
@@ -592,17 +604,16 @@ impl BtrfsBootdiskOptionsView {
 
 fn get_values(&mut self) -> Option<(Vec, BtrfsBootdiskOptions)> {
 let (disks, selected_disks) = self.view.get_disks_and_selection()?;
-let disk_size = self
-.view
-.get_options_view()?
-.get_value::(0)?;
+let view = self.view.get_options_view()?;
+let compress = view.get_value::, _>(0)?;
+let disk_size = view.get_value::(1)?;
 
 Some((
 disks,
 BtrfsBootdiskOptions {
 disk_size,
 selected_disks,
-compress: BtrfsCompressOption::default(),
+compress,
 },
 ))
 }
-- 
2.45.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [PATCH installer v2 1/5] fix #5250: install: config: add new `btrfs_opts` with `compress` config option

2024-08-13 Thread Christoph Heiss
Signed-off-by: Christoph Heiss 
---
Changes v1 -> v2:
  * no changes

 Proxmox/Install/Config.pm   | 15 ++
 proxmox-auto-installer/src/utils.rs |  1 +
 proxmox-installer-common/src/options.rs | 31 +
 proxmox-installer-common/src/setup.rs   | 21 --
 proxmox-tui-installer/src/setup.rs  |  2 ++
 proxmox-tui-installer/src/views/bootdisk.rs |  5 ++--
 6 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/Proxmox/Install/Config.pm b/Proxmox/Install/Config.pm
index ae70093..7e67f69 100644
--- a/Proxmox/Install/Config.pm
+++ b/Proxmox/Install/Config.pm
@@ -79,6 +79,9 @@ my sub init_cfg {
copies => 1,
arc_max => Proxmox::Install::RunEnv::default_zfs_arc_max(), # in MiB
},
+   btrfs_opts => {
+   compress => 'off',
+   },
# TODO: single disk selection config
target_hd => undef,
disk_selection => {},
@@ -173,6 +176,18 @@ sub get_zfs_opt {
 return defined($k) ? $zfs_opts->{$k} : $zfs_opts;
 }
 
+sub set_btrfs_opt {
+my ($k, $v) = @_;
+my $opts = get('btrfs_opts');
+croak "unknown btrfs opts key '$k'" if !exists($opts->{$k});
+$opts->{$k} = $v;
+}
+sub get_btrfs_opt {
+my ($k) = @_;
+my $opts = get('btrfs_opts');
+return defined($k) ? $opts->{$k} : $opts;
+}
+
 sub set_target_hd { set_key('target_hd', $_[0]); }
 sub get_target_hd { return get('target_hd'); }
 
diff --git a/proxmox-auto-installer/src/utils.rs 
b/proxmox-auto-installer/src/utils.rs
index 45ad222..ae7dbbd 100644
--- a/proxmox-auto-installer/src/utils.rs
+++ b/proxmox-auto-installer/src/utils.rs
@@ -340,6 +340,7 @@ pub fn parse_answer(
 minfree: None,
 maxvz: None,
 zfs_opts: None,
+btrfs_opts: None,
 target_hd: None,
 disk_selection: BTreeMap::new(),
 existing_storage_auto_rename: 1,
diff --git a/proxmox-installer-common/src/options.rs 
b/proxmox-installer-common/src/options.rs
index 9375ded..d99e26d 100644
--- a/proxmox-installer-common/src/options.rs
+++ b/proxmox-installer-common/src/options.rs
@@ -102,10 +102,40 @@ impl LvmBootdiskOptions {
 }
 }
 
+/// See the accompanying mount option in btrfs(5).
+#[derive(Copy, Clone, Debug, Default, Deserialize, Eq, PartialEq)]
+#[serde(rename_all(deserialize = "lowercase"))]
+pub enum BtrfsCompressOption {
+On,
+#[default]
+Off,
+Zlib,
+Lzo,
+Zstd,
+}
+
+impl fmt::Display for BtrfsCompressOption {
+fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
+write!(f, "{}", format!("{self:?}").to_lowercase())
+}
+}
+
+impl From<&BtrfsCompressOption> for String {
+fn from(value: &BtrfsCompressOption) -> Self {
+value.to_string()
+}
+}
+
+pub const BTRFS_COMPRESS_OPTIONS: &[BtrfsCompressOption] = {
+use BtrfsCompressOption::*;
+&[On, Off, Zlib, Lzo, Zstd]
+};
+
 #[derive(Clone, Debug)]
 pub struct BtrfsBootdiskOptions {
 pub disk_size: f64,
 pub selected_disks: Vec,
+pub compress: BtrfsCompressOption,
 }
 
 impl BtrfsBootdiskOptions {
@@ -115,6 +145,7 @@ impl BtrfsBootdiskOptions {
 Self {
 disk_size: disk.size,
 selected_disks: (0..disks.len()).collect(),
+compress: BtrfsCompressOption::default(),
 }
 }
 }
diff --git a/proxmox-installer-common/src/setup.rs 
b/proxmox-installer-common/src/setup.rs
index e4e609b..d7f1c47 100644
--- a/proxmox-installer-common/src/setup.rs
+++ b/proxmox-installer-common/src/setup.rs
@@ -13,8 +13,8 @@ use serde::{de, Deserialize, Deserializer, Serialize, 
Serializer};
 
 use crate::{
 options::{
-BtrfsRaidLevel, Disk, FsType, ZfsBootdiskOptions, ZfsChecksumOption, 
ZfsCompressOption,
-ZfsRaidLevel,
+BtrfsBootdiskOptions, BtrfsCompressOption, BtrfsRaidLevel, Disk, 
FsType,
+ZfsBootdiskOptions, ZfsChecksumOption, ZfsCompressOption, ZfsRaidLevel,
 },
 utils::CidrAddress,
 };
@@ -221,6 +221,20 @@ impl From for InstallZfsOption {
 }
 }
 
+#[derive(Debug, Deserialize, Serialize)]
+pub struct InstallBtrfsOption {
+#[serde(serialize_with = "serialize_as_display")]
+pub compress: BtrfsCompressOption,
+}
+
+impl From for InstallBtrfsOption {
+fn from(opts: BtrfsBootdiskOptions) -> Self {
+InstallBtrfsOption {
+compress: opts.compress,
+}
+}
+}
+
 pub fn read_json Deserialize<'de>, P: AsRef>(path: P) -> 
Result {
 let file = File::open(path).map_err(|err| err.to_string())?;
 let reader = BufReader::new(file);
@@ -475,6 +489,9 @@ pub struct InstallConfig {
 #[serde(skip_serializing_if = "Option::is_none")]
 pub zfs_opts: Option,
 
+#[serde(skip_serializing_if = "Option::is_none")]
+pub btrfs_opts: Option,
+
 #[serde(
 serialize_with = "serialize_disk_opt",
 skip_serializing_if = "Option::is_none",
diff --git a/proxmox-tui-installer/src/setup.rs 
b/proxmox-tui-installer/src/setup.rs
index e

[pve-devel] [PATCH installer v2 2/5] fix #5250: install: write btrfs `compress` option to fstab

2024-08-13 Thread Christoph Heiss
`compress` instead of `compress-force` is used, as the latter can have
unintended (performance) implications, as the name implies. That would
neither be expected by users, nor should such a decision be made without
the user explicitly opting for it.

Others do the same, e.g. the installer for RedHat/Fedora systems (aka.
Anaconda) opts for `compress` too.

Signed-off-by: Christoph Heiss 
---
Changes v1 -> v2:
  * no changes
  * moved some mount setup code here from next patch

 Proxmox/Install.pm | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/Proxmox/Install.pm b/Proxmox/Install.pm
index fa2702c..5c64c3d 100644
--- a/Proxmox/Install.pm
+++ b/Proxmox/Install.pm
@@ -1065,7 +1065,16 @@ sub extract_data {
 
die "unable to detect FS UUID" if !defined($fsuuid);
 
-   $fstab .= "UUID=$fsuuid / btrfs defaults 0 1\n";
+   my $btrfs_opts = Proxmox::Install::Config::get_btrfs_opt();
+
+   my $mountopts = 'defaults';
+   if ($btrfs_opts->{compress} eq 'on') {
+   $mountopts .= ',compress';
+   } elsif ($btrfs_opts->{compress} ne 'off') {
+   $mountopts .= ",compress=$btrfs_opts->{compress}";
+   }
+
+   $fstab .= "UUID=$fsuuid / btrfs $mountopts 0 1\n";
} else {
my $root_mountopt = $fssetup->{$filesys}->{root_mountopt} || 
'defaults';
$fstab .= "$rootdev / $filesys ${root_mountopt} 0 1\n";
-- 
2.45.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [PATCH installer v2 3/5] fix #5250: proxinstall: expose new btrfs `compress` option

2024-08-13 Thread Christoph Heiss
Signed-off-by: Christoph Heiss 
---
Changes v1 -> v2:
  * no changes
  * moved some mount option setup code to previous patch

 proxinstall | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/proxinstall b/proxinstall
index 12f3eaa..c7776f0 100755
--- a/proxinstall
+++ b/proxinstall
@@ -1160,6 +1160,21 @@ my $create_raid_advanced_grid = sub {
 my $create_btrfs_raid_advanced_grid = sub {
 my ($hdsize_btn) = @_;
 my $labeled_widgets = [];
+
+my $combo_compress = Gtk3::ComboBoxText->new();
+$combo_compress->set_tooltip_text("btrfs compression algorithm for boot 
volume");
+my $comp_opts = ["on", "off", "zlib", "lzo", "zstd"];
+foreach my $opt (@$comp_opts) {
+   $combo_compress->append($opt, $opt);
+}
+my $compress = Proxmox::Install::Config::get_btrfs_opt('compress') // 
'off';
+$combo_compress->set_active_id($compress);
+$combo_compress->signal_connect (changed => sub {
+   my $w = shift;
+   Proxmox::Install::Config::set_btrfs_opt('compress', 
$w->get_active_text());
+});
+push @$labeled_widgets, ['compress', $combo_compress];
+
 push @$labeled_widgets, ['hdsize', $hdsize_btn, 'GB'];
 return $create_label_widget_grid->($labeled_widgets);;
 };
-- 
2.45.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [PATCH installer v2 5/5] fix #5250: auto-installer: expose new btrfs `compress` option

2024-08-13 Thread Christoph Heiss
Signed-off-by: Christoph Heiss 
---
Changes v1 -> v2:
  * squashed in separate tests patch

 proxmox-auto-installer/src/answer.rs  |  6 -
 proxmox-auto-installer/src/utils.rs   |  6 -
 .../tests/resources/parse_answer/btrfs.json   | 24 +++
 .../tests/resources/parse_answer/btrfs.toml   | 17 +
 4 files changed, 51 insertions(+), 2 deletions(-)
 create mode 100644 
proxmox-auto-installer/tests/resources/parse_answer/btrfs.json
 create mode 100644 
proxmox-auto-installer/tests/resources/parse_answer/btrfs.toml

diff --git a/proxmox-auto-installer/src/answer.rs 
b/proxmox-auto-installer/src/answer.rs
index d691da1..fd0ed0d 100644
--- a/proxmox-auto-installer/src/answer.rs
+++ b/proxmox-auto-installer/src/answer.rs
@@ -1,6 +1,9 @@
 use clap::ValueEnum;
 use proxmox_installer_common::{
-options::{BtrfsRaidLevel, FsType, ZfsChecksumOption, ZfsCompressOption, 
ZfsRaidLevel},
+options::{
+BtrfsCompressOption, BtrfsRaidLevel, FsType, ZfsChecksumOption, 
ZfsCompressOption,
+ZfsRaidLevel,
+},
 utils::{CidrAddress, Fqdn},
 };
 use serde::{Deserialize, Serialize};
@@ -270,6 +273,7 @@ pub struct LvmOptions {
 pub struct BtrfsOptions {
 pub hdsize: Option,
 pub raid: Option,
+pub compress: Option,
 }
 
 #[derive(Clone, Deserialize, Serialize, Debug, PartialEq)]
diff --git a/proxmox-auto-installer/src/utils.rs 
b/proxmox-auto-installer/src/utils.rs
index ae7dbbd..9e86053 100644
--- a/proxmox-auto-installer/src/utils.rs
+++ b/proxmox-auto-installer/src/utils.rs
@@ -11,7 +11,8 @@ use crate::{
 use proxmox_installer_common::{
 options::{FsType, NetworkOptions, ZfsChecksumOption, ZfsCompressOption},
 setup::{
-InstallConfig, InstallRootPassword, InstallZfsOption, LocaleInfo, 
RuntimeInfo, SetupInfo,
+InstallBtrfsOption, InstallConfig, InstallRootPassword, 
InstallZfsOption, LocaleInfo,
+RuntimeInfo, SetupInfo,
 },
 };
 use serde::{Deserialize, Serialize};
@@ -394,6 +395,9 @@ pub fn parse_answer(
 config.hdsize = btrfs
 .hdsize
 .unwrap_or(runtime_info.disks[first_selected_disk].size);
+config.btrfs_opts = Some(InstallBtrfsOption {
+compress: btrfs.compress.unwrap_or_default(),
+})
 }
 }
 Ok(config)
diff --git a/proxmox-auto-installer/tests/resources/parse_answer/btrfs.json 
b/proxmox-auto-installer/tests/resources/parse_answer/btrfs.json
new file mode 100644
index 000..0330a38
--- /dev/null
+++ b/proxmox-auto-installer/tests/resources/parse_answer/btrfs.json
@@ -0,0 +1,24 @@
+{
+  "autoreboot": 1,
+  "cidr": "192.168.1.114/24",
+  "country": "at",
+  "dns": "192.168.1.254",
+  "domain": "testinstall",
+  "disk_selection": {
+"6": "6",
+"7": "7"
+  },
+  "filesys": "btrfs (RAID1)",
+  "gateway": "192.168.1.1",
+  "hdsize": 80.0,
+  "existing_storage_auto_rename": 1,
+  "hostname": "pveauto",
+  "keymap": "de",
+  "mailto": "mail@no.invalid",
+  "mngmt_nic": "eno1",
+  "root_password": { "plain": "123456" },
+  "timezone": "Europe/Vienna",
+  "btrfs_opts": {
+"compress": "zlib"
+  }
+}
diff --git a/proxmox-auto-installer/tests/resources/parse_answer/btrfs.toml 
b/proxmox-auto-installer/tests/resources/parse_answer/btrfs.toml
new file mode 100644
index 000..8fcd27d
--- /dev/null
+++ b/proxmox-auto-installer/tests/resources/parse_answer/btrfs.toml
@@ -0,0 +1,17 @@
+[global]
+keyboard = "de"
+country = "at"
+fqdn = "pveauto.testinstall"
+mailto = "mail@no.invalid"
+timezone = "Europe/Vienna"
+root_password = "123456"
+
+[network]
+source = "from-dhcp"
+
+[disk-setup]
+filesystem = "btrfs"
+btrfs.raid = "raid1"
+btrfs.compress = "zlib"
+btrfs.hdsize = 80
+disk_list = ["sda", "sdb"]
-- 
2.45.2



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



[pve-devel] [PATCH installer v2 0/5] fix #5250: add btrfs `compress` mount option support

2024-08-13 Thread Christoph Heiss
Fixes #5250 [0].

Pretty much as it says on the tin, this adds the `compress` mount option
for Btrfs much in the same way as the ZFS equivalent.

W.r.t. the discussion in #5250 - `compress` is used. For a detailed
explanation, see patch #2.

[0] https://bugzilla.proxmox.com/show_bug.cgi?id=5250

History
===
v1: https://lists.proxmox.com/pipermail/pve-devel/2024-May/063845.html

Notable changes v1 -> v2:
  * rebased on latest master
  * squashed auto-installer tests into feature patch

Diffstat


Christoph Heiss (5):
  fix #5250: install: config: add new `btrfs_opts` with `compress`
config option
  fix #5250: install: write btrfs `compress` option to fstab
  fix #5250: proxinstall: expose new btrfs `compress` option
  fix #5250: tui: expose new btrfs `compress` option
  fix #5250: auto-installer: expose new btrfs `compress` option

 Proxmox/Install.pm| 11 +-
 Proxmox/Install/Config.pm | 15 
 proxinstall   | 15 
 proxmox-auto-installer/src/answer.rs  |  6 +++-
 proxmox-auto-installer/src/utils.rs   |  7 +++-
 .../tests/resources/parse_answer/btrfs.json   | 24 +
 .../tests/resources/parse_answer/btrfs.toml   | 17 ++
 proxmox-installer-common/src/options.rs   | 31 +
 proxmox-installer-common/src/setup.rs | 21 ++--
 proxmox-tui-installer/src/setup.rs|  2 ++
 proxmox-tui-installer/src/views/bootdisk.rs   | 34 +--
 11 files changed, 167 insertions(+), 16 deletions(-)
 create mode 100644 
proxmox-auto-installer/tests/resources/parse_answer/btrfs.json
 create mode 100644 
proxmox-auto-installer/tests/resources/parse_answer/btrfs.toml

--
2.44.0



___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



Re: [pve-devel] issues with Virtio-SCSI devicde on Proxmox...

2024-08-13 Thread DERUMIER, Alexandre via pve-devel
--- Begin Message ---
Hi,

I didn't see Fiona's response, but indeed:

https://lists.gnu.org/archive/html/qemu-devel/2021-09/msg01567.html

"virtio devices can be exposed in upto three ways

 - Legacy - follows virtio 0.9 specification. always uses PCI
ID range 0x1000-0x103F
 - Transitional - follows virtio 0.9 specification by default, but
  can auto-negotiate with guest for 1.0 spce. Always
  uses PCI ID range 0x1000-0x103F
 - Modern - follows virtio 1.0 specification. always uses PCI
ID range 0x1040-0x107F

With QEMU, historically devices placed on a PCI bus will always default
to being in transitional mode, while devices placed on a PCI-E bus will
always dfault to being in modern mode.
"


 Original message 
From: Fiona Ebner 
Reply-To: Proxmox VE development discussion 
To: Proxmox VE development discussion ,
Christian Moser 
Subject: Re: [pve-devel] issues with Virtio-SCSI devicde on Proxmox...
Date: 13/08/2024 10:55:47

Hi,

Am 12.08.24 um 12:40 schrieb Christian Moser:
> Hello,
> 
> I work for VSI (VMS Software Inc) which is porting the OpenVMS
> operating system to x86. At this point we successfully on various
> hypervisors, but have some issues on the KVM running on Proxmox.
> 
> The OpenVMS VM works just fine with SATA disks and it also works with
> for example virtio-network device etc., but trying to use
> virtio-scsi hangs  the driver. I have debugged this and I can
> successfully configure the port/controller, send the IO request to
> the
> device. It then gets processed by the device, which posts the results
> and sets the interrupt bit in the ISR register, but it never
> asserts the interrupt hence the driver never gets notified and the
> I/O hangs.
> 
> I have tried both “virtio-scsi-pci” and “virtio-scsi-single”, but no
> luck. The emulated virtio-scsi device is a legacy device. But
> then again, the virtio-network device is also a legacy device and
> here we are getting interrupts. One thing which bothers me is the
> fact that the “legacy interrupt disable” bit is set in the PCI config
> space of the virtio-scsi device (i.e. bit 10 at offset 4)
> 
> Questions:
> * is there a way to configure a modern virtio-scsi devcie (i.e.
> disable_legacy=on) ?

I've already answered this question when you asked in a mail addressed
directly to me:

Am 12.08.24 um 11:58 schrieb Fiona Ebner:
> Hi,
> 
> It seems that you either need to attach the "virtio-scsi-pci" device
> to
> a pcie bus or explicitly set the "disable_legacy=on" option for the
> device, neither of which Proxmox VE currently does or allows
> configuring. The only way right now seems to be to attach the disk
> yourself via custom arguments (the 'args' in the Proxmox VE VM
> configuration), but then the disk will be invisible to Proxmox VE
> operations which look at specific disk keys in the configuration!
> 
> Feel free to open a feature request on our bug tracker to make this
> configurable:
> https://bugzilla.proxmox.com/
> 
> P.S. Please write to the developer list rather than individual
> developers for such questions in the feature:
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> Best Regards,
> Fiona

> * why is the legacy interrupt bit set in the PCI config space ?

Most likely because the virtio-scsi-pci is configured without the
"disable_legacy=on" option. If not explicitly set, the option will be
"disable_legacy=auto" and when not attached to PCIe (which is the case
for Proxmox VE currently), then legacy mode will be enabled.

> * Are there any working driver for virtio-scsi on this KVM using Q35
> machine? i.e. any other OS

The virtio drivers for Windows and the ones in Linux work just fine
with
our configuration.


> Any thoughts why these interrupts are not getting delivered on the
> PCIE bus?

We do not configure the virtio-scsi-pci on a PCIe bus currently, see my
initial answer.

Best Regards,
Fiona


___
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel