Package: release.debian.org
Severity: normal
Tags: bookworm
User: release.debian....@packages.debian.org
Usertags: pu
X-Debbugs-Cc: live-b...@packages.debian.org
Control: affects -1 + src:live-boot

Dear stable release team,

[ Reason ]
I'd like to update live-boot to fix its PXE-booting process. Currently in Bookworm, when live-boot reaches the phase where it should do DHCP and fetch its squashfs over the network, it only attempts DHCP on the first NIC on which it detects a link.

[ Impact ]
tl;dr: my patch makes live-boot try DHCP on each and every NIC that has a link. Before the patch, live-boot attempts DHCP on a single NIC only. So if 2 NICs are connected, but only one has DHCP, boot may simply fail if the one without DHCP is discovered first.

In more detail, with my production system as an example:

We use a live Debian distribution (custom live image) that is booted over PXE to install Debian. DHCP / PXE is available only on the first NIC (for us, a 1 Gbit/s NIC, most of the time). The other 2 NICs (25 Gbit/s) are to be used in production. We do things this way for security, to segregate boot and production networking.

Unfortunately, if live-boot "sees" the 2x 25 Gbit/s NICs first, before the 1 Gbit/s one, it will attempt DHCP on them (well, in fact, on one of them), as they are connected (i.e. they have a "link up"). Since they get no DHCP offer, live-boot fails. It never attempts a DHCP request on the 1 Gbit/s NIC.

We already use the patch I'm proposing in production. With it, live-boot attempts DHCP on each and every NIC that has a link, one by one. So in the above scenario, it tries the 2x 25 Gbit/s NICs (since they are discovered first), but as DHCP fails on them, live-boot continues and tries the 1 Gbit/s NIC as well.

[ Tests ]
We've been patching our live systems' ramdisk with the modified components/9990-select-eth-device.sh script (uncompress the ramdisk, replace the script with the new version, and recompress). With this change, our servers boot correctly, trying all NICs.

Doing this is painful, and has to be done manually after live-boot builds its initrd. One cannot simply use a modified live-boot package, as live-boot is downloaded by live-build (from the URL provided in the config file) when the live image is created. So it would be very useful to have this fixed in stable, rather than pointing our users to a manual fix...

[ Risks ]
The code is trivial and easy to understand (a simple shell script).

[ Checklist ]
  [x] *all* changes are documented in the d/changelog
  [x] I reviewed all changes and I approve them
  [x] attach debdiff against the package in (old)stable
  [x] the issue is verified as fixed in unstable

[ Changes ]
See attached diff file.

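As an illustration of the selection behaviour described under [ Impact ], here is a minimal standalone sketch. It is not the patch itself: the function name collect_linked_ifaces and the SYSFS_NET variable are hypothetical, introduced only so the logic can be run against a fake directory tree instead of the real /sys/class/net. It collects every interface whose carrier file reads 1, skipping duplicates, and prints them as a space-separated list of the kind the patch writes to DEVICE.

```sh
#!/bin/sh
# Sketch only. SYSFS_NET and collect_linked_ifaces are hypothetical names,
# not part of live-boot; they exist to make the logic testable in isolation.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

collect_linked_ifaces ()
{
	found=""
	for iface in "$@"
	do
		# Skip an interface that is already in the list.
		case " ${found} " in
			*" ${iface} "*) continue ;;
		esac

		# A carrier value of 1 means a link was detected.
		carrier=$(cat "${SYSFS_NET}/${iface}/carrier" 2>/dev/null)
		if [ "${carrier}" = 1 ]
		then
			# Append with a single separating space, no leading space.
			found="${found:+${found} }${iface}"
		fi
	done
	echo "${found}"
}
```

Calling it as `collect_linked_ifaces eth0 eth1 eth2` prints the subset of those names that currently report a carrier, in the order given. The actual patch additionally brings each interface up and retries the scan several times, which this sketch leaves out.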
From b3469874e91b0facdbf292e41c92bcec3d842dbb Mon Sep 17 00:00:00 2001
From: Thomas Goirand <z...@debian.org>
Date: Thu, 28 Nov 2024 22:14:21 +0100
Subject: [PATCH] Do DHCP on multiple interfaces

---
 components/9990-select-eth-device.sh | 68 ++++++++++++++++------------
 debian/changelog                     |  7 +++
 2 files changed, 46 insertions(+), 29 deletions(-)

diff --git a/components/9990-select-eth-device.sh b/components/9990-select-eth-device.sh
index b660a3d..719a234 100755
--- a/components/9990-select-eth-device.sh
+++ b/components/9990-select-eth-device.sh
@@ -93,46 +93,56 @@ Select_eth_device ()
 	fi
 
 	found_eth_dev=""
-	while true
+	echo -n "Looking for a connected Ethernet interface."
+
+	for interface in $l_interfaces
 	do
-		echo -n "Looking for a connected Ethernet interface ..."
+		# ATTR{carrier} is not set if this is not done
+		echo -n " $interface ?"
+		ipconfig -c none -d $interface -t 1 >/dev/null 2>&1
+		sleep 1
+	done
+
+	echo ''
 
+	for step in 1 2 3 4 5
+	do
 		for interface in $l_interfaces
 		do
-			# ATTR{carrier} is not set if this is not done
-			echo -n " $interface ?"
-			ipconfig -c none -d $interface -t 1 >/dev/null 2>&1
-			sleep 1
-		done
-
-		echo ''
+			# Skip the interface if it's already found.
+			IN_IT=no
+			for DEV in $found_eth_dev ; do
+				if [ "${DEV}" = "$interface" ] ; then
+					IN_IT=yes
+				fi
+			done
 
-		for step in 1 2 3 4 5
-		do
-			for interface in $l_interfaces
-			do
+			if [ "${IN_IT}" = "no" ] ; then
 				ip link set $interface up
 				carrier=$(cat /sys/class/net/$interface/carrier \
 					2>/dev/null)
 				# link detected
-
-				case "${carrier}" in
-					1)
-						echo "Connected $interface found"
-						# inform initrd's init script :
+				if [ "${carrier}" = 1 ] ; then
+					echo "Connected $interface found"
+					# inform initrd's init script :
+					if [ -z "${found_eth_dev}" ] ; then
+						found_eth_dev="$interface"
+					else
 						found_eth_dev="$found_eth_dev $interface"
-						found_eth_dev="$(echo $found_eth_dev | sed -e "s/^[[:space:]]*//g")"
-						;;
-				esac
-			done
-			if [ -n "$found_eth_dev" ]
-			then
-				echo "DEVICE='$found_eth_dev'" >> /conf/param.conf
-				return
-			else
-				# wait a bit
-				sleep 1
+					fi
+				fi
 			fi
 		done
+		# wait a bit
+		sleep 1
 	done
+	if [ -n "$found_eth_dev" ]
+	then
+		echo "Done searching for connected Ethernet interface."
+		echo "Writing DEVICE='$found_eth_dev' in /conf/param.conf."
+		echo "DEVICE='$found_eth_dev'" >> /conf/param.conf
+	else
+		echo "Could not find an interface that is up: giving-up..."
+	fi
+	return
 }
diff --git a/debian/changelog b/debian/changelog
index 460c3b2..db6caa1 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,10 @@
+live-boot (1:20230131+deb12u1) bookworm; urgency=medium
+
+  * Add fix to get DHCP from all nics, not only the first one seen with link
+    up (Closes: #1069048).
+
+ -- Thomas Goirand <z...@debian.org>  Thu, 28 Nov 2024 22:14:44 +0100
+
 live-boot (1:20230131) unstable; urgency=medium
 
   [ Thore Sommer ]
-- 
2.39.5