On 05.09.25 at 11:09 AM, Thomas Lamprecht wrote:
> On 05.09.25 at 10:54, Fiona Ebner wrote:
>> On 04.09.25 at 8:11 PM, Thomas Lamprecht wrote:
>>> On 04.09.25 at 14:42, Fiona Ebner wrote:
>>>> The virtual hardware is generated differently (at least for i440fx
>>>> machines) when host_mtu is set or not set on the netdev command line
>>>> [0]. When the MTU is the same value as the default 1500, Proxmox VE
>>>> did not add a host_mtu parameter. This is problematic for migration
>>>> where host_mtu is present on one end of the migration, but not on the
>>>> other [1]. Moreover, the effective setting in the guest (state) will
>>>> still be the host_mtu from the source side, even if a different value
>>>> is used for host_mtu on the target instance's command line. This will
>>>> not lead to an error loading the migration stream in QEMU, but having
>>>> a larger host_mtu than the bridge MTU is still problematic for certain
>>>> network traffic like
>>>>> iperf3 -c 10.10.10.11 -u -l 2k
>>>> when host_mtu=9000 and bridge MTU=1500.
>>>>
>>>> Pass the values from the source to the target during migration to be
>>>> able to preserve them.
>>>
>>> Which breaks migration from new to old, which can be fine, but seems
>>> avoidable given that we got a tunnel that we can query stuff over.
>>
>> How can we query? The old tunnel only supports very specific commands
>> like 'quit' and 'resume $vmid'. Note that remote migration using the new
>> tunnel version is not broken - an old node will just ignore the
>> additional parameter in the passed-along JSON.
> 
> The absence of a command also gives you information.

Okay, so you mean adding a new command and using that to detect that the
node is recent enough? What should that command be? The capabilities one
you suggest below?

>>
>> We could do something like
>>
>> ssh ... qm start 0 --nets-host-mtu
>>
>> and match for "Unknown option: nets-host-mtu" for detection.
> 
> Yeah, that's exactly what I wrote later in my reply.

I thought you meant matching the error for the actual command. My
suggestion is to use a dummy command for early detection and, based on
its result, to guard the use of the new option for the actual command.
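
To make that concrete, here is a rough sketch in Python (for illustration
only, this is not existing qemu-server code). The helper names are made up,
the option value is a placeholder, and it assumes that the probe output
contains the "Unknown option: nets-host-mtu" text mentioned above when the
target is too old:

```python
import re

# Error text as quoted in this thread; that an old target prints exactly
# this string is an assumption.
UNKNOWN_OPTION_RE = re.compile(r"Unknown option: nets-host-mtu\b")


def target_supports_nets_host_mtu(probe_output: str) -> bool:
    """Return False only if the probe output reports the option as unknown."""
    return UNKNOWN_OPTION_RE.search(probe_output) is None


def build_start_command(vmid: int, nets_host_mtu: str, supported: bool) -> list[str]:
    """Hypothetical helper: add the new option only if the target knows it."""
    cmd = ["qm", "start", str(vmid)]
    if supported:
        # The value format is a placeholder, not the real encoding.
        cmd += ["--nets-host-mtu", nets_host_mtu]
    return cmd
```

The point of the split is that the probe runs once, early, and the actual
start command never has to be retried because of an unknown option.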

>> Alternatively, we could bump the pve-manager version and guard adding
>> the option via the pmxcfs 'version-info' node kv. That mechanism wasn't
>> super reliable in the past though.
> 
> FWIW, we now re-broadcast that periodically and IIRC even on pmxcfs
> start up though.

Yes, and if we really can't get the info we can err on the side of
"assume it's recent enough".
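
A minimal sketch of such a guard, again in Python purely for illustration.
How the version string is read from the 'version-info' kv is left out, and
the function names and the minimum version passed in are assumptions:

```python
import re
from typing import Optional


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '9.0.6' into (9, 0, 6); anything after a '-' or '~' is ignored."""
    core = re.split(r"[-~]", version, maxsplit=1)[0]
    return tuple(int(part) for part in core.split("."))


def should_pass_host_mtu(target_version: Optional[str], minimum: str) -> bool:
    """Decide whether to pass the new option to the target node."""
    if target_version is None:
        # No info available: err on the side of "recent enough".
        return True
    try:
        return parse_version(target_version) >= parse_version(minimum)
    except ValueError:
        # Unparsable info is treated like missing info.
        return True
```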

>>> Maybe we could at least catch the "Unknown option: nets-host-mtu"
>>> error explicitly and add some context that the target likely just
>>> needs to be updated to make the migration work.
>>
>> If we don't want to go for either of the above or if there isn't
>> another way to query, I'll go for that?
> 
> Would be fine for me, it's the simplest thing to do for now.
> 
> Adding some more fleshed out general approach for such things might
> be nice to have available for the future. That could be some
> versioning or a more structured capabilities query, that is split
> into required ones (which block the migration) and hints, that are
> for best-effort stuff, probably also including some basic version
> info like qemu-server, as that often is needed to know if a
> capability is required or not. Like here: when migrating to
> another 8.x node it won't matter, but for a 9.x target node we
> should enforce that e.g. nets-host-mtu is available.

Sounds sensible.
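
One possible shape for such a check, sketched in Python for illustration
only. Nothing like this exists yet; the class, the function and the idea of
mapping each capability to the first major version that requires it are
assumptions made for the example:

```python
from dataclasses import dataclass, field


@dataclass
class CapabilityCheck:
    missing_required: list[str] = field(default_factory=list)
    missing_hints: list[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        """Only missing required capabilities block the migration."""
        return not self.missing_required


def check_capabilities(
    target_caps: set[str], target_major: int, wanted: dict[str, int]
) -> CapabilityCheck:
    """wanted maps a capability name to the first major version requiring it."""
    result = CapabilityCheck()
    for name, required_from in sorted(wanted.items()):
        if name in target_caps:
            continue
        if target_major >= required_from:
            result.missing_required.append(name)
        else:
            # Best-effort only: could be logged as a warning.
            result.missing_hints.append(name)
    return result
```

With wanted = {"nets-host-mtu": 9}, a target on 8.x lacking the capability
would only produce a hint, while a 9.x target lacking it would block the
migration.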


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
