Package: isc-dhcp-client
Version: 4.2.4-7ubuntu12.8
Severity: important
Tags: upstream

Dear isc-dhcp-client Maintainer,

There is a small omission in dhclient-script which may cause an IP address to be lost. Getting an address again typically takes 1-7 days (but possibly a year or more), unless action is taken to reset the interface, e.g. by ifdown/ifup.

* Evidence from syslog

sh[987]: DHCPDISCOVER on enp2s0 to 255.255.255.255 port 67 interval 9 (xid=0x93c8b31e)
dhclient[1001]: No DHCPOFFERS received.
sh[987]: No DHCPOFFERS received.
sh[987]: Trying recorded lease 172.50.55.101
dhclient[1001]: Trying recorded lease 172.50.55.101
sh[987]: PING 172.50.55.1 (172.50.55.1) 56(84) bytes of data.
sh[987]: --- 172.50.55.1 ping statistics ---
sh[987]: 1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
root: /etc/dhcp/dhclient-exit-hooks returned non-zero exit status 2
dhclient[1001]: bound: renewal in 477651 seconds.
sh[987]: bound: renewal in 477651 seconds.

* How it happens

Basically, "if ! command; then" in sh eats the return code: inside the "then" branch, $? is the status of the negated test (0), not the status of the command. The following script is instructive:

----------
#!/bin/sh
return_1() {
    return 1
}

return_1
echo "RC is $?"

if ! return_1; then
    echo "then RC is $?"
else
    echo "else RC is $?"
fi
----------
RC is 1
then RC is 0
----------

Let's assume a machine has an active lease for $interface (whether or not the interface is already configured), and does not have a file called "/etc/dhcp/dhclient-exit-hooks". Then dhclient restarts (e.g., because of a reboot or an apt-get upgrade) and tries to renew the still valid lease, but times out because the DHCP server is not responding for whatever reason.

What we expect to happen is a retry until an answer is received, with the interface maintaining its address (if it has one) until the original lease expires, or continuing to re-request the address until one is obtained.

What actually happens is as follows:

1. dhclient (C side) calls dhclient-script with the TIMEOUT reason and all parameters reflecting the existing lease.
2. The dhclient-script TIMEOUT logic tries to ping the first router in the lease and gets no reply.
3. dhclient-script therefore flushes all IPs from the interface using "ip -4 addr flush dev ${interface}", and runs "exit_with_hooks 2".
4. exit_with_hooks initializes exit_status=2, and calls "run_hook /etc/dhcp/dhclient-exit-hooks".
5. run_hook declares a local exit_status variable but does not initialize it, so (with dash as /bin/sh) it retains the value 2 from the calling scope.
6. There is no /etc/dhcp/dhclient-exit-hooks script to run, so exit_status=2 remains.
7. run_hook wrongly logs an error for the (nonexistent) /etc/dhcp/dhclient-exit-hooks using "logger -p".
8. run_hook still returns exit_status=2.
9. Back in exit_with_hooks, "run_hook" returned a non-zero status 2. However, the "if !" test replaces the $? return code with 0, and as a result the "then" part sets exit_status=0.
10. Let's assume that run_hookdir succeeds.
11. exit_with_hooks now exits with exit_status=0, indicating success.
12. dhclient (C side) notes the success, assumes that the interface has been set up (it has not, and even if it had been, step 3 above flushed it), and sleeps until the expiration of the original lease.

Result: we have an interface with no IP address, and no attempt to renew it until the original lease expires. Leases are typically 1-14 days, but some routers give leases for a year.

* Temporary solution for an affected system (if you can issue commands to it, e.g. through a console or another interface)

ifdown $interface && ifup $interface

This worked well for the system on which the problem was discovered.

* Potential Solution

1. In dhclient-script.linux, initialize the local exit_status variable to 0, as shown in the diff below.

----------
 # run given script
 run_hook() {
     local script
-    local exit_status
+    local exit_status=0
     script="$1"

     if [ -f $script ]; then
         . $script
         exit_status=$?
     fi

     if [ -n "$exit_status" ] && [ "$exit_status" -ne 0 ]; then
         logger -p daemon.err "$script returned non-zero exit status $exit_status"
     fi

     return $exit_status
 }
--------

2. In exit_with_hooks, replace both occurrences of the construct

--------
if ! run_hook .... ; then
    exit_status=$?
fi
--------

with something like

--------
run_hook ....
rc=$?
if [ "$rc" -ne 0 ]; then
    exit_status=$rc
fi
--------

* Note: I have not tried this in production yet. My bash/sh/dash is rusty, and I wanted to report while I test my setting.

-- System Information:
Debian Release: jessie/sid
  APT prefers trusty-updates
  APT policy: (500, 'trusty-updates'), (500, 'trusty-security'), (500, 'trusty'), (100, 'trusty-backports')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.4.0-47-generic (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages isc-dhcp-client depends on:
ii  debianutils      4.4
ii  iproute2         3.12.0-2ubuntu1
ii  isc-dhcp-common  4.2.4-7ubuntu12.8
ii  libc6            2.19-0ubuntu6.9

isc-dhcp-client recommends no packages.

Versions of packages isc-dhcp-client suggests:
ii  apparmor       2.8.95~2430-0ubuntu5.3
ii  avahi-autoipd  0.6.31-4ubuntu1.1
ii  resolvconf     1.69ubuntu1.1

-- no debconf information
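
* Addendum: side-by-side check of the exit_with_hooks pattern

As a rough sanity check of item 2 in the potential solution, the sketch below compares the current "if ! ...; then exit_status=$?" construct with the proposed "save $? first" construct. It is only an illustration under my own assumptions: failing_hook, buggy_pattern and fixed_pattern are made-up names standing in for run_hook and the logic in exit_with_hooks; they are not functions in dhclient-script.

```shell
#!/bin/sh
# Hypothetical stand-in for run_hook: always fails with status 2.
failing_hook() {
    return 2
}

# Construct currently used in exit_with_hooks: inside "then", $? is the
# status of the negated test (0), not the status of failing_hook.
buggy_pattern() {
    exit_status=0
    if ! failing_hook; then
        exit_status=$?
    fi
    echo "$exit_status"
}

# Proposed construct: save the status right after the call, before any
# other command or test can overwrite $?.
fixed_pattern() {
    exit_status=0
    failing_hook
    rc=$?
    if [ "$rc" -ne 0 ]; then
        exit_status=$rc
    fi
    echo "$exit_status"
}

echo "buggy: $(buggy_pattern)"
echo "fixed: $(fixed_pattern)"
```

With /bin/sh this should print "buggy: 0" and "fixed: 2", i.e. only the second construct propagates the failure of the hook to the caller.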