[pve-devel] [PATCH cluster/manager/storage/docs 0/9] fix #4886: improve SSH handling
this series replaces the old mechanism that used a cluster-wide merged known hosts file with distributing of each node's host key via pmxcfs, and pinning the distributed key explicitly for internal SSH connections. the main changes in pve-cluster somewhat break the old manager and storage versions, but only when such a partial upgrade is mixed with a host key rotation of some sort. pve-storage uses a newly introduced helper, so needs a versioned dependency accordingly. the last pve-docs patch has a placeholder for the actual version shipping the changes which needs to be replaced when applying. there's still some potential for follow-ups: - 'pvecm ssh' wrapper to debug and/or re-use the host key pinning (and other future changes) - also add non-RSA host keys - key (and thus authorized keys) and/or sshd disentangling (this potentially also affects external access, so might be done on a major release to give more heads up) cluster: Fabian Grünbichler (4): fix #4886: write node SSH hostkey to pmxcfs fix #4886: SSH: pin node's host key if available ssh: expose SSH options on their own pvecm: stop merging SSH known hosts by default src/PVE/CLI/pvecm.pm | 10 -- src/PVE/Cluster/Setup.pm | 24 +--- src/PVE/SSHInfo.pm | 31 +++ 3 files changed, 56 insertions(+), 9 deletions(-) docs: Fabian Grünbichler (2): ssh: make pitfalls a regular section instead of block ssh: document PVE-specific setup pvecm.adoc | 26 +- 1 file changed, 21 insertions(+), 5 deletions(-) manager: Fabian Grünbichler (2): vnc: use SSH command helper pvesh: use SSH command helper PVE/API2/Nodes.pm | 3 ++- PVE/CLI/pvesh.pm | 4 ++-- 2 files changed, 4 insertions(+), 3 deletions(-) storage: Fabian Grünbichler (1): upload: use SSH helper to get ssh/scp options src/PVE/API2/Storage/Status.pm | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH docs 2/2] ssh: document PVE-specific setup
such as adapted configs and managed files. Signed-off-by: Fabian Grünbichler --- Notes: actual version needs to be inserted! pvecm.adoc | 18 ++ 1 file changed, 18 insertions(+) diff --git a/pvecm.adoc b/pvecm.adoc index 5b5b27b..3a32cfb 100644 --- a/pvecm.adoc +++ b/pvecm.adoc @@ -918,6 +918,24 @@ transfer memory and disk contents. * Storage replication +SSH setup +~ + +On {pve} systems, the following changes are made to the SSH configuration/setup: + +* the `root` user's SSH client config gets setup to prefer `AES` over `ChaCha20` + +* the `root` user's `authorized_keys` file gets linked to + `/etc/pve/priv/authorized_keys`, merging all authorized keys within a cluster + +* `sshd` is configured to allow logging in as root with a password + +NOTE: Older systems might also have `/etc/ssh/ssh_known_hosts` set up as symlink +pointing to `/etc/pve/priv/known_hosts`, containing a merged version of all +node host keys. This system was replaced with explicit host key pinning in +`pve-cluster <>`, the symlink can be deconfigured if still in +place by running `pvecm updatecerts --unmerge-known-hosts`. + Pitfalls due to automatic execution of `.bashrc` and siblings ~ -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH docs 1/2] ssh: make pitfalls a regular section instead of block
because we'll add another one before it, and formatting is off otherwise. Signed-off-by: Fabian Grünbichler --- pvecm.adoc | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/pvecm.adoc b/pvecm.adoc index 1f78585..5b5b27b 100644 --- a/pvecm.adoc +++ b/pvecm.adoc @@ -918,9 +918,9 @@ transfer memory and disk contents. * Storage replication -.Pitfalls due to automatic execution of `.bashrc` and siblings -[IMPORTANT] - +Pitfalls due to automatic execution of `.bashrc` and siblings +~ + In case you have a custom `.bashrc`, or similar files that get executed on login by the configured shell, `ssh` will automatically run it once the session is established successfully. This can cause some unexpected behavior, as those @@ -940,8 +940,6 @@ case $- in *) return;; esac - - Corosync External Vote Support -- -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH cluster 2/4] fix #4886: SSH: pin node's host key if available
if the target node has already stored their SSH host key on pmxcfs, pin it and ignore the global known hosts information. Signed-off-by: Fabian Grünbichler --- src/PVE/SSHInfo.pm | 15 ++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/src/PVE/SSHInfo.pm b/src/PVE/SSHInfo.pm index c351148..fad23bf 100644 --- a/src/PVE/SSHInfo.pm +++ b/src/PVE/SSHInfo.pm @@ -49,11 +49,24 @@ sub get_ssh_info { sub ssh_info_to_command_base { my ($info, @extra_options) = @_; + +my $nodename = $info->{name}; + +my $known_hosts_file = "/etc/pve/nodes/$nodename/ssh_known_hosts"; +my $known_hosts_options = undef; +if (-f $known_hosts_file) { + $known_hosts_options = [ + '-o', "UserKnownHostsFile=$known_hosts_file", + '-o', 'GlobalKnownHostsFile=none', + ]; +} + return [ '/usr/bin/ssh', '-e', 'none', '-o', 'BatchMode=yes', - '-o', 'HostKeyAlias='.$info->{name}, + '-o', 'HostKeyAlias='.$nodename, + defined($known_hosts_options) ? @$known_hosts_options : (), @extra_options ]; } -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH cluster 1/4] fix #4886: write node SSH hostkey to pmxcfs
so that we can explicitly pin just this key when doing intra-cluster SSH connections. this works similar to the certificate cache we use for API proxying, but without automatic invalidation, since node A doesn't have access to node B's host key.. Signed-off-by: Fabian Grünbichler --- Notes: we could store more than just the RSA one there, but that would have some potential for fallout.. the filename could also be changed to reflect what it contains, not what is used for - e.g., "ssh_host_keys" src/PVE/Cluster/Setup.pm | 15 +++ 1 file changed, 15 insertions(+) diff --git a/src/PVE/Cluster/Setup.pm b/src/PVE/Cluster/Setup.pm index 07020d7..4b6f013 100644 --- a/src/PVE/Cluster/Setup.pm +++ b/src/PVE/Cluster/Setup.pm @@ -220,6 +220,20 @@ sub ssh_unmerge_known_hosts { PVE::Tools::file_set_contents($ssh_system_known_hosts, $old); } +sub ssh_create_node_known_hosts { +my ($nodename) = @_; + +my $hostkey = PVE::Tools::file_get_contents($ssh_host_rsa_id); +# Note: file sometimes containe empty lines at start, so we use multiline match +die "can't parse $ssh_host_rsa_id" if $hostkey !~ m/^(ssh-rsa\s\S+)(\s.*)?$/m; +$hostkey = $1; + +my $raw = "$nodename $hostkey"; +PVE::Tools::file_set_contents("/etc/pve/nodes/$nodename/ssh_known_hosts", $raw); + +# TODO: also setup custom keypair and client config here to disentangle entirely from /root/.ssh? +} + sub ssh_merge_known_hosts { my ($nodename, $ip_address, $createLink) = @_; @@ -823,6 +837,7 @@ sub updatecerts_and_ssh { $p->("merge authorized SSH keys and known hosts"); ssh_merge_keys(); ssh_merge_known_hosts($nodename, $local_ip_address, 1); +ssh_create_node_known_hosts($nodename); gen_pve_vzdump_files(); } -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH cluster 3/4] ssh: expose SSH options on their own
for example, to re-use with an scp command. Signed-off-by: Fabian Grünbichler --- this is used by pve-storage, versioned dependency needed accordingly. src/PVE/SSHInfo.pm | 16 +--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/src/PVE/SSHInfo.pm b/src/PVE/SSHInfo.pm index fad23bf..a26ae31 100644 --- a/src/PVE/SSHInfo.pm +++ b/src/PVE/SSHInfo.pm @@ -47,7 +47,7 @@ sub get_ssh_info { }; } -sub ssh_info_to_command_base { +sub ssh_info_to_ssh_opts { my ($info, @extra_options) = @_; my $nodename = $info->{name}; @@ -62,8 +62,6 @@ sub ssh_info_to_command_base { } return [ - '/usr/bin/ssh', - '-e', 'none', '-o', 'BatchMode=yes', '-o', 'HostKeyAlias='.$nodename, defined($known_hosts_options) ? @$known_hosts_options : (), @@ -71,6 +69,18 @@ sub ssh_info_to_command_base { ]; } +sub ssh_info_to_command_base { +my ($info, @extra_options) = @_; + +my $opts = ssh_info_to_ssh_opts($info, @extra_options); + +return [ + '/usr/bin/ssh', + '-e', 'none', # only works for ssh, not scp! + $opts->@*, +]; +} + sub ssh_info_to_command { my ($info, @extra_options) = @_; my $cmd = ssh_info_to_command_base($info, @extra_options); -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH cluster 4/4] pvecm: stop merging SSH known hosts by default
and allow explicitly unmerging to remove the symlink altogether. Signed-off-by: Fabian Grünbichler --- src/PVE/CLI/pvecm.pm | 10 -- src/PVE/Cluster/Setup.pm | 9 ++--- 2 files changed, 14 insertions(+), 5 deletions(-) diff --git a/src/PVE/CLI/pvecm.pm b/src/PVE/CLI/pvecm.pm index 0005e4b..0e8ca8f 100755 --- a/src/PVE/CLI/pvecm.pm +++ b/src/PVE/CLI/pvecm.pm @@ -567,12 +567,18 @@ __PACKAGE__->register_method ({ type => 'boolean', optional => 1, }, + 'unmerge-known-hosts' => { + description => "Unmerge legacy SSH known hosts.", + type => 'boolean', + optional => 1, + default => 0, + }, }, }, returns => { type => 'null' }, code => sub { my ($param) = @_; - my ($force_new_cert, $silent) = $param->@{qw(force silent)}; + my ($force_new_cert, $silent, $unmerge) = $param->@{qw(force silent unmerge-known-hosts)}; # pveproxy's ExecStartPre calls this, and as we do IO (on /etc/pve) that can hang # (uninterruptible D state) we could fail the whole service, rendering the API guaranteed @@ -585,7 +591,7 @@ __PACKAGE__->register_method ({ usleep(100 * 1000); } - PVE::Cluster::Setup::updatecerts_and_ssh($force_new_cert, $silent); + PVE::Cluster::Setup::updatecerts_and_ssh($force_new_cert, $silent, $unmerge); PVE::Cluster::prepare_observed_file_basedirs(); }); if ($got_timeout) { diff --git a/src/PVE/Cluster/Setup.pm b/src/PVE/Cluster/Setup.pm index 4b6f013..42dff85 100644 --- a/src/PVE/Cluster/Setup.pm +++ b/src/PVE/Cluster/Setup.pm @@ -816,7 +816,7 @@ sub generate_local_files { } sub updatecerts_and_ssh { -my ($force_new_cert, $silent) = @_; +my ($force_new_cert, $silent, $unmerge_ssh) = @_; my $p = sub { print "$_[0]\n" if !$silent }; @@ -834,9 +834,12 @@ sub updatecerts_and_ssh { $p->("generate new node certificate") if $force_new_cert; gen_pve_node_files($nodename, $local_ip_address, $force_new_cert); -$p->("merge authorized SSH keys and known hosts"); +$p->("merge authorized SSH keys"); ssh_merge_keys(); -ssh_merge_known_hosts($nodename, $local_ip_address, 1); +if ($unmerge_ssh) { + $p->("unmerge SSH known hosts"); + ssh_unmerge_known_hosts(); +} ssh_create_node_known_hosts($nodename); gen_pve_vzdump_files(); } -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH manager 2/2] pvesh: use SSH command helper
to benefit from future improvements like known host key pinning. Signed-off-by: Fabian Grünbichler --- PVE/CLI/pvesh.pm | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/PVE/CLI/pvesh.pm b/PVE/CLI/pvesh.pm index 44a65213c..d373ae29f 100755 --- a/PVE/CLI/pvesh.pm +++ b/PVE/CLI/pvesh.pm @@ -116,7 +116,7 @@ sub proxy_handler { } } -my @ssh_tunnel_cmd = ('ssh', '-o', 'BatchMode=yes', "root\@$remip"); +my $ssh_tunnel_cmd = PVE::SSHInfo::ssh_info_to_command({ ip => $remip, name => $node }); my @pvesh_cmd = ('pvesh', '--noproxy', $cmd, $path, '--output-format', 'json'); if (scalar(@$args)) { @@ -126,7 +126,7 @@ sub proxy_handler { my $res = ''; PVE::Tools::run_command( - [ @ssh_tunnel_cmd, '--', @pvesh_cmd ], + [ $ssh_tunnel_cmd->@*, '--', @pvesh_cmd ], errmsg => "proxy handler failed", outfunc => sub { $res .= shift }, ); -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH manager 1/2] vnc: use SSH command helper
to benefit from future improvements there, like pinning the known host key. Signed-off-by: Fabian Grünbichler --- PVE/API2/Nodes.pm | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/PVE/API2/Nodes.pm b/PVE/API2/Nodes.pm index 3619190de..39139d8a9 100644 --- a/PVE/API2/Nodes.pm +++ b/PVE/API2/Nodes.pm @@ -965,7 +965,8 @@ my $get_vnc_connection_info = sub { my ($remip, $family); if ($node ne 'localhost' && $node ne PVE::INotify::nodename()) { ($remip, $family) = PVE::Cluster::remote_node_ip($node); - $remote_cmd = ['/usr/bin/ssh', '-e', 'none', '-t', $remip , '--']; + $remote_cmd = PVE::SSHInfo::ssh_info_to_command({ ip => $remip, name => $node }, ('-t')); + push @$remote_cmd, '--'; } else { $family = PVE::Tools::get_host_address_family($node); } -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH storage 1/1] upload: use SSH helper to get ssh/scp options
Signed-off-by: Fabian Grünbichler --- requires versioned dependency on libpve-cluster-perl with the new helper src/PVE/API2/Storage/Status.pm | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/PVE/API2/Storage/Status.pm b/src/PVE/API2/Storage/Status.pm index b2336e6..d6de7fb 100644 --- a/src/PVE/API2/Storage/Status.pm +++ b/src/PVE/API2/Storage/Status.pm @@ -466,9 +466,9 @@ __PACKAGE__->register_method ({ if ($node ne 'localhost' && $node ne PVE::INotify::nodename()) { my $remip = PVE::Cluster::remote_node_ip($node); - my @ssh_options = ('-o', 'BatchMode=yes'); + my $ssh_options = PVE::SSHInfo::ssh_info_to_ssh_opts({ ip => $remip, name => $node }); - my @remcmd = ('/usr/bin/ssh', @ssh_options, $remip, '--'); + my @remcmd = ('/usr/bin/ssh', $ssh_options->@*, $remip, '--'); eval { # activate remote storage run_command([@remcmd, '/usr/sbin/pvesm', 'status', '--storage', $param->{storage}]); @@ -480,7 +480,7 @@ __PACKAGE__->register_method ({ errmsg => "mkdir failed", ); - $cmd = ['/usr/bin/scp', @ssh_options, '-p', '--', $tmpfilename, "[$remip]:" . PVE::Tools::shell_quote($dest)]; + $cmd = ['/usr/bin/scp', $ssh_options->@*, '-p', '--', $tmpfilename, "[$remip]:" . PVE::Tools::shell_quote($dest)]; $err_cleanup = sub { run_command([@remcmd, 'rm', '-f', '--', $dest]) }; } else { -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
Re: [pve-devel] [RFC cluster 0/2] fix #4886: improve SSH handling
--- Begin Message --- On Tue, Jan 09, 2024 at 09:57:30AM +0100, Fabian Grünbichler wrote: > > Esi Y via pve-devel hat am 09.01.2024 06:01 > > CET geschrieben: > > On Thu, Dec 21, 2023 at 10:53:11AM +0100, Fabian Grünbichler wrote: > > > RFC since this would be a bigger change in how we approach intra-cluster > > > SSH access. > > > > > > there are still a few parts that currently don't use SSHInfo, but > > > would need to be switched over if we want to pursue this approach: > > > > > > - get_vnc_connection_info in PVE::API2::Nodes > > > - 'upload' API endpoint in PVE::API2::Storage::Status > > > - SSH proxy in pvesh > > > > > > these changes would need to happen coordinated with the patches from > > > this RFC series! > > > > Not necessarily. > > if we do the unmerge as well, then yes - else a node with unmerged known host > would fail to connect to nodes that joined the cluster after unmerging. First of all, I have just seen the barrage of related patches re the helper, so let me say I think it's good to refactor and bring into consistency invoking ssh by that means. I apologise for late reply, it belongs to here though and I believe is still relevant with your currently pursued solution as well. PVE (and unsuspected users) would not need to depend on the helper if it had not depended on options such as -o HostKeyAlias. For the sake of brewity I filed bug report #5170 [1] re details of that, here I would just pitch that dropping in a config file into /etc/ssh/ssh_config.d/ would be much better than including options into each and every cluster SSH, a config file from which user connections would benefit as well and which would not run the risk of keeping stale entries in "remote" known_hosts partial blocks. The solution is to allow GlobalKnownHostsFile and UserKnownHostsFile as-is, but to add KnownHostsCommand to the client config. This allows for: 1. All entries to be considered alongside with cluster ones; 2. Check what is being sought for with %H and adapt approprietaly - no more worries about all the configs being in sync; 3. Log whenever something when wrong; 3. User never worrying about getting different ssh conn from a command-line and botching any one particulars machine local ones. [1] https://bugzilla.proxmox.com/show_bug.cgi?id=5170 > > > > next steps afterwards: > > > - unmerge known hosts in `pvecm updatecerts`, instead of merging > > > -- to disentangle regular ssh from intra-cluster SSH > > > > Both of these could be accomplished with codebase/complexity reduction in > > an approach to: > > I am not sure what "both" means here? there is only a single thing quoted ;) Fair enough, I expected there will still be some merging necessary, now those are just snippets. The current name could be confusing to a user, e.g. would be better called self_known_hosts or such, but see above, no need for that at all. > > 1. eliminate shared ssh config and key files, i.e. completely remove any > > need for symlinks; and > > that's the evaluate part further below. This is for another thread apparently, the issue is that this proposed known_hosts fix e.g. still does not do anything for the duplicates in authorized_keys. > > 2. instead initialise each node at join (first one at cluster creation) > > into their respective node-local files with ssh certs; whilst > > versus just copying the host key, which is far simpler? I will leave the certs discussion out of this thread to not have it convoluted, the issue with certs in current PVE architectural model is that the CA resides on all nodes anyhow. Initialising here meant nothing else than "HostCertificate /etc/ssh/clusterhostkey-cert.pub" > /etc/ssh/sshd_config.d/clustercerts.conf. > > 3. keeping the setup maintenance-free, since any single key > > addition/refresh does not need to propagate any individual keys around the > > cluster; meanwhile > > yes it does, changing the key would entail revoking the old one (and > distributing that!) and signing the new one. This is not fair counterpoint due to this fix still not addressing bug report #4670 [2] re authorized_keys. Revoking is an append to a file rather than deleting entries from existing, so simpler. [2] https://bugzilla.proxmox.com/show_bug.cgi?id=4670 > > 4. no requirement for additional -o options; > > we already need -o options anyway, so there is no downside to adding > additional ones See above detailed in bug #5170 [1], I just wish this does not exacetbate it, as it is an opportunity to fix 2in1. > > 5. no requirement for sshd config appends, just drop-ins; > > there's no need for sshd config changes either with the patches here? The approach as originally proposed (without KnownHostsCommand) is changing existing functionality - the keys are gone from the machine and it requires non-intuitive manual -o's. So no config changes, but behaviour change that would need documenting at least in the release notes. > > 6. existing X509
[pve-devel] [PATCH storage 1/2] lvm: ignore "activation skip" LV flag during LV activation
LVs with the "activation skip" flag (can be set via `lvchange -ky`) are only activated if `-K` is passed to the activation command. In the next major release, we intend to set the "activation skip" flag for newly created LVs. The goal is to prevent LVs from being auto-activated after boot, and thus fix bug #4997. In preparation for this future change, adjust activation commands issued by the LVM plugin to include the `-K` flag. For LVs without the "activation skip" flag, passing the `-K` flag during activation has no effect, so it should be harmless to always provide it. [1] https://bugzilla.proxmox.com/show_bug.cgi?id=4997 Suggested-by: Fabian Grünbichler Signed-off-by: Friedrich Weber --- src/PVE/Storage/LVMPlugin.pm | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/PVE/Storage/LVMPlugin.pm b/src/PVE/Storage/LVMPlugin.pm index 4b951e7..66b42de 100644 --- a/src/PVE/Storage/LVMPlugin.pm +++ b/src/PVE/Storage/LVMPlugin.pm @@ -342,7 +342,7 @@ sub lvcreate { $size .= "k"; # default to kilobytes } -my $cmd = ['/sbin/lvcreate', '-aly', '-Wy', '--yes', '--size', $size, '--name', $name]; +my $cmd = ['/sbin/lvcreate', '-aly', '-K', '-Wy', '--yes', '--size', $size, '--name', $name]; for my $tag (@$tags) { push @$cmd, '--addtag', $tag; } @@ -422,7 +422,7 @@ sub free_image { print "successfully removed volume $volname ($vg/del-$volname)\n"; }; -my $cmd = ['/sbin/lvchange', '-aly', "$vg/$volname"]; +my $cmd = ['/sbin/lvchange', '-aly', '-K', "$vg/$volname"]; run_command($cmd, errmsg => "can't activate LV '$vg/$volname' to zero-out its data"); $cmd = ['/sbin/lvchange', '--refresh', "$vg/$volname"]; run_command($cmd, errmsg => "can't refresh LV '$vg/$volname' to zero-out its data"); @@ -536,7 +536,7 @@ sub activate_volume { my $lvm_activate_mode = 'ey'; -my $cmd = ['/sbin/lvchange', "-a$lvm_activate_mode", $path]; +my $cmd = ['/sbin/lvchange', "-a$lvm_activate_mode", '-K', $path]; run_command($cmd, errmsg => "can't activate LV '$path'"); $cmd = ['/sbin/lvchange', '--refresh', $path]; run_command($cmd, errmsg => "can't refresh LV '$path' for activation"); -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH storage 2/2] fix #4997: lvm: set "activation skip" flag for newly created LVs
Activating an LV creates a device-mapper device. In a cluster with a shared LVM VG (e.g. on top of iSCSI) where an LV is active on nodes 1 and 2, deleting the LV on node 1 will not clean up the device-mapper device on node 2. If an LV with the same name is recreated later, the leftover device-mapper device will cause activation of that LV on node 2 to fail with: > device-mapper: create ioctl on [...] failed: Device or resource busy By default, LVM autoactivates all discovered LVs after boot, thus creating device-mapper device for all discovered LVs. As a result, certain combinations of guest removal (and thus LV removals) and node reboots can cause guest creation or VM live migration (which both entail LV activation) to fail with the above error message, see [1]. To avoid this issue in the future, adjust the LVM plugin to create new LVs with the "activation skip" flag. LVs with that flag are not activated unless `-K` is passed to the activation command. Consequently, new LVs will not be autoactivated after boot anymore, and removing LVs will not leave behind device-mapper devices on other nodes anymore. The LVM plugin is still able to activate LVs, as it already passes `-K` to all activation commands. Note that the flag is only set for newly created LVs, so LVs created before this patch can still trigger #4997. To avoid this, users can manually set the "activation skip" flag on existing LVs. [1] https://bugzilla.proxmox.com/show_bug.cgi?id=4997 Suggested-by: Fabian Grünbichler Signed-off-by: Friedrich Weber --- Notes: Should only be applied close to the next major release, see cover letter. src/PVE/Storage/LVMPlugin.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/PVE/Storage/LVMPlugin.pm b/src/PVE/Storage/LVMPlugin.pm index 66b42de..191e8d3 100644 --- a/src/PVE/Storage/LVMPlugin.pm +++ b/src/PVE/Storage/LVMPlugin.pm @@ -342,7 +342,7 @@ sub lvcreate { $size .= "k"; # default to kilobytes } -my $cmd = ['/sbin/lvcreate', '-aly', '-K', '-Wy', '--yes', '--size', $size, '--name', $name]; +my $cmd = ['/sbin/lvcreate', '-aly', '-K', '-ky', '-Wy', '--yes', '--size', $size, '--name', $name]; for my $tag (@$tags) { push @$cmd, '--addtag', $tag; } -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH storage 0/2] fix #4997: lvm: avoid autoactivating (new) LVs after boot
By default, LVM autoactivates LVs after boot. In a cluster with VM disks on a shared LVM VG (e.g. on top of iSCSI), this can indirectly cause guest creation or VM live-migration to fail. See bug #4997 [1] and patch #2 for details. The goal of this series is to avoid autoactivating LVs after boot. Fabian suggested to use the "activation skip" flag for LVs. LVs with that flag can only be activated if the `-K` flag is passed during activation (`-K` is not passed for autoactivation after boot). With patch #1, the LVM plugin passes the `-K` flag to activation commands. If the LV does not have the "activation skip" flag set, this should not have any effect, so it should be safe to apply this patch in the near future. It does not yet fix #4997, though. With patch #2, the LVM plugin sets the "activation skip" flag for newly created LVs. As this can be considered a breaking change, it may make sense to only apply it close to the PVE 9 release. If patch #1 has been available in the last PVE 8 minor release, this should ensure a smooth upgrade even if a cluster is temporarily mixed between PVE 8.x and 9 (PVE 8.x will be able to activate LVs created by PVE 9 with "activation skip"). This will fix #4997 for newly created LVs. Some points to discuss: * Fabian and I discussed whether it may be better to pass `-K` and set the "activation skip" flag only for LVs on a *shared* LVM storage. But this may cause issues for users that incorrectly mark an LVM storage as shared, create a bunch of LVs (with "activation skip" flag), then unset the "shared" flag, and won't be able to activate LVs afterwards (`lvchange -ay` without `-K` on an LV with "activation skip" is a noop). What do you think? * Even with patch #1 and #2 applied, users can still run into #4997 for LVs that were created before #2, so without the "activation skip" flag. So it might be good to include a note in the 8->9 release notes and/or a warning in the pve8to9 helper and/or ship a script that automatically sets the flag for all existing (PVE-owned) LVs. [1] https://bugzilla.proxmox.com/show_bug.cgi?id=4997 storage: Friedrich Weber (2): lvm: ignore "activation skip" LV flag during LV activation fix #4997: lvm: set "activation skip" flag for newly created LVs src/PVE/Storage/LVMPlugin.pm | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) Summary over all repositories: 1 files changed, 3 insertions(+), 0 deletions(-) -- Generated by git-murpp 0.5.0 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
[pve-devel] [PATCH storage] lvm: avoid warning due to human-readable text in vgs output
If the metadata archive under /etc/lvm/archive for a particular VG has a lot of files or is overly large, `vgs` occasionally prints a message to stdout [1]. The LVM plugin tries to parse this message and thus produces the following confusing warnings in the output of `pvesm status` or the journal (via pvestatd): Use of uninitialized value $size in int at [...]/LVMPlugin.pm line 133 Use of uninitialized value $free in int at [...]/LVMPlugin.pm line 133 Use of uninitialized value $lvcount in int [...]/LVMPlugin.pm line 133 Reported in enterprise support where LVM picked up on VGs on VM disks (due to a missing KRBD device filter in the LVM config), and since several VM disks had VGs with the same name, LVM quickly produced a lot of files in /etc/lvm/archive. Reproducible as follows with a VG named `spam`: for i in $(seq 8192); do vgrename spam spam2; vgrename spam2 spam; done rm /etc/lvm/backup/spam; vgs -o vg_name Output (linebreak for readability): Consider pruning spam VG archive with more then 8 MiB in 8268\n files (check archiving is needed in lvm.conf). VG spam [1] https://sourceware.org/git/?p=lvm2.git;a=blob;f=lib/format_text/archive.c;h=5acf0c04a;hb=38e0c7a1#l222 Signed-off-by: Friedrich Weber --- src/PVE/Storage/LVMPlugin.pm | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/PVE/Storage/LVMPlugin.pm b/src/PVE/Storage/LVMPlugin.pm index 4b951e7..557d602 100644 --- a/src/PVE/Storage/LVMPlugin.pm +++ b/src/PVE/Storage/LVMPlugin.pm @@ -130,6 +130,9 @@ sub lvm_vgs { my ($name, $size, $free, $lvcount, $pvname, $pvsize, $pvfree) = split (':', $line); + # skip human-readable messages that vgs occasionally prints to stdout + return if !defined($size); + $vgs->{$name} //= { size => int ($size), free => int ($free), -- 2.39.2 ___ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel