Public bug reported:

I am running the 3.13 series kernel on Ubuntu 14.04 LTS (Trusty Tahr).
A change introduced in version 3.13.0-66.108 of this kernel breaks UDP sockets under certain circumstances. The effect is that the recvfrom operation returns with an error, setting errno to EFAULT, even though the pointers passed to recvfrom are okay.

Using bisection, I could track down this problem to a single change:

2dde51aa53393a531b493e3a8194e4d467e194a3 is the first bad commit
commit 2dde51aa53393a531b493e3a8194e4d467e194a3
Author: Herbert Xu <herb...@gondor.apana.org.au>
Date:   Mon Jul 13 20:01:42 2015 +0800

    net: Fix skb csum races when peeking

    BugLink: http://bugs.launchpad.net/bugs/1500810

    [ Upstream commit 89c22d8c3b278212eef6a8cc66b570bc840a6f5a ]

    When we calculate the checksum on the recv path, we store the
    result in the skb as an optimisation in case we need the checksum
    again down the line.

    This is in fact bogus for the MSG_PEEK case as this is done without
    any locking. So multiple threads can peek and then store the result
    to the same skb, potentially resulting in bogus skb states.

    This patch fixes this by only storing the result if the skb is not
    shared. This preserves the optimisations for the few cases where
    it can be done safely due to locking or other reasons, e.g., SIOCINQ.

    Signed-off-by: Herbert Xu <herb...@gondor.apana.org.au>
    Acked-by: Eric Dumazet <eduma...@google.com>
    Signed-off-by: David S. Miller <da...@davemloft.net>
    Signed-off-by: Kamal Mostafa <ka...@canonical.com>
    Signed-off-by: Luis Henriques <luis.henriq...@canonical.com>

:040000 040000 423debc59ddbc7424283e647e609289fd40dc494 2511e80df4c30a7309737f6b3cee0260269a0ef7 M net

Steps to reproduce the problem:

Install freeradius, and have a radius client connect to the RADIUS server. After a short amount of time, freeradius spins at 100% CPU, alternating between a select and recvfrom call. The recvfrom call fails every time with error EFAULT.
As an alternative to freeradius, you can use the following minimal program that I wrote that also exhibits this problem:

    #include <stdio.h>
    #include <errno.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/select.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netinet/ip.h>

    /* Create a UDP socket with IP_PKTINFO enabled, bound to the given port. */
    int prepare_socket(int port) {
        int sock = socket(AF_INET, SOCK_DGRAM, 0);
        if (sock < 0) {
            printf("Could not create socket.\n");
            return -1;
        }
        int opt = 1;
        if (setsockopt(sock, SOL_IP, IP_PKTINFO, &opt, sizeof(opt)) < 0) {
            printf("setsockopt failed.\n");
            return -1;
        }
        struct sockaddr_in bind_addr;
        bind_addr.sin_family = AF_INET;
        bind_addr.sin_port = htons(port);
        bind_addr.sin_addr.s_addr = INADDR_ANY;
        int rc = bind(sock, (struct sockaddr *) &bind_addr, sizeof(bind_addr));
        if (rc < 0) {
            printf("Could not bind socket.\n");
            return -1;
        }
        return sock;
    }

    int main(int argc, char **argv) {
        int sock = prepare_socket(1812);
        if (sock < 0) {
            return 1;
        }
        for (;;) {
            unsigned char buffer[4];
            struct sockaddr src;
            socklen_t src_len = sizeof(src);
            /* Peek at the first 4 bytes; this is the call that fails with EFAULT. */
            ssize_t received_len = recvfrom(sock, buffer, sizeof(buffer), MSG_PEEK, &src, &src_len);
            if (received_len < 0) {
                if (errno == EAGAIN) {
                    printf("EAGAIN\n");
                    continue;
                }
                printf("recvfrom failed.\n");
                perror(NULL);
                return 1;
            }
            if (received_len == 4) {
                /* Read again without MSG_PEEK to remove the datagram from the queue. */
                src_len = sizeof(src);
                received_len = recvfrom(sock, buffer, sizeof(buffer), 0, &src, &src_len);
                if (received_len != 4) {
                    printf("Strange received length.\n");
                    return 1;
                }
            }
        }
        /* Never reached */
        return 0;
    }

I have not found out how to craft traffic that triggers the bug myself. However, the traffic from a RADIUS client (a WiFi AP in my case) reliably triggers it after a few seconds.

As this is perfectly legal code and the problem only appears with the change identified above, I think that this is a regression and the change in question should be removed from the stable kernel tree.
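As a starting point for experimenting with crafted traffic, a minimal sender could look like the sketch below. The function name send_test_datagram, the zero-filled payload, and the 4096-byte cap are assumptions made for illustration. Nothing in this report establishes which property of the RADIUS traffic triggers the failure, so this sender is not known to reproduce it. Because the offending commit touches receive-path checksum handling, traffic that skips software checksum verification (loopback is a likely example) may never trigger it, so sending from a second host is probably the more useful test.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical test sender: sends one UDP datagram of `len` zero bytes
 * (at most 4096) to `ip`:`port`. Returns the number of bytes sent,
 * or -1 on error.
 */
ssize_t send_test_datagram(const char *ip, int port, size_t len)
{
    unsigned char payload[4096];
    struct sockaddr_in dst;

    if (len > sizeof(payload))
        return -1;
    memset(payload, 0, sizeof(payload));

    /* Build the destination address from the dotted-quad string. */
    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons((unsigned short)port);
    if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
        return -1;

    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0)
        return -1;

    ssize_t sent = sendto(sock, payload, len, 0,
                          (struct sockaddr *)&dst, sizeof(dst));
    close(sock);
    return sent;
}
```

Calling this in a loop with varying lengths against port 1812 of the affected machine, while the minimal program above is running there, is one way to probe for a trigger.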
ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.13.0-66-generic 3.13.0-66.108
ProcVersionSignature: Ubuntu 3.13.0-66.108-generic 3.13.11-ckt27
Uname: Linux 3.13.0-66-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Oct 25 19:23 seq
 crw-rw---- 1 root audio 116, 33 Oct 25 19:23 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.14.1-0ubuntu3.16
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory: 'iw'
Date: Mon Oct 26 18:58:42 2015
HibernationDevice: RESUME=/dev/mapper/vg0-swap
InstallationDate: Installed on 2015-01-02 (296 days ago)
InstallationMedia: Ubuntu-Server 14.04.1 LTS "Trusty Tahr" - Release amd64 (20140722.3)
IwConfig:
 eth0 no wireless extensions.
 lo no wireless extensions.
 virbr0 no wireless extensions.
Lsusb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: QEMU Standard PC (i440FX + PIIX, 1996)
PciMultimedia:
ProcFB: 0 cirrusdrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.13.0-66-generic root=/dev/mapper/vg0-root ro
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-66-generic N/A
 linux-backports-modules-3.13.0-66-generic N/A
 linux-firmware 1.127.15
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 01/01/2011
dmi.bios.vendor: Bochs
dmi.bios.version: Bochs
dmi.chassis.type: 1
dmi.chassis.vendor: Bochs
dmi.modalias: dmi:bvnBochs:bvrBochs:bd01/01/2011:svnQEMU:pnStandardPC(i440FX+PIIX,1996):pvrpc-i440fx-trusty:cvnBochs:ct1:cvr:
dmi.product.name: Standard PC (i440FX + PIIX, 1996)
dmi.product.version: pc-i440fx-trusty
dmi.sys.vendor: QEMU

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: amd64 apport-bug trusty

-- 
You received this
bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1510213

Title:
  Regression: Stable kernel update to 3.13.0-66 breaks UDP sockets

Status in linux package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1510213/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp