On 01/09/2017 04:04 PM, Jesper Dangaard Brouer wrote: > This patch split the global and per (inet)peer ICMP-reply limiter > code, and moves the global limit check to earlier in the packet > processing path. Thus, avoid spending cycles on ICMP replies that > gets limited/suppressed anyhow. > > The global ICMP rate limiter icmp_global_allow() is a good solution, > it just happens too late in the process. The kernel goes through the > full route lookup (return path) for the ICMP message, before taking > the rate limit decision of not sending the ICMP reply. > > Details: The kernels global rate limiter for ICMP messages got added > in commit 4cdf507d5452 ("icmp: add a global rate limitation"). It is > a token bucket limiter with a global lock. It brilliantly avoids > locking congestion by only updating when 20ms (HZ/50) were elapsed. It > can then avoids taking lock when credit is exhausted (when under > pressure) and time constraint for refill is not yet meet.
This patch removed the rate limit bypass for localhost. As a result, it is impossible to write deterministic UDP client tests tests which exercise failover behavior in response to unreachable servers. H.J. Lu noted that a glibc test started failing on kernel 4.11 and identified the regression: https://sourceware.org/ml/libc-alpha/2017-06/msg00167.html (I have more tests which are afflicted by this, but are not yet in glibc upstream.) This is particularly annoying because we already run such tests in a network namespace for isolation, but the rate limit counter is global, so that doesn't help here. I'm attaching a self-contained test case. It fails for me with: localhost-icmp: iteration 50: no ICMP message (poll timeout) On kernel 4.10, it passes and runs within just a few milliseconds. Would you please fix this in some way? Thanks. Florian
#include <arpa/inet.h> #include <err.h> #include <netinet/in.h> #include <stdio.h> #include <sys/poll.h> #include <sys/socket.h> #include <unistd.h> /* How many UDP packets to send to a non-responding part. */ enum { ITERATIONS = 1000 }; int main (void) { /* Pick a port number which is likely unused. */ unsigned short port; { int sock = socket (AF_INET, SOCK_DGRAM, 0); if (sock < 0) err (1, "socket"); struct sockaddr_in sin = { .sin_family = AF_INET }; if (bind (sock, (struct sockaddr *) &sin, sizeof (sin)) < 0) err (1, "bind"); socklen_t sinlen = sizeof (sin); if (getsockname (sock, (struct sockaddr *) &sin, &sinlen)) err (1, "getsockname"); if (sinlen != sizeof (sin) || sin.sin_family != AF_INET) errx (1, "wrong address information for socket"); if (close (sock) < 0) err (1, "close"); port = sin.sin_port; } for (int i = 0; i < ITERATIONS; ++i) { int sock = socket (AF_INET, SOCK_DGRAM, 0); if (sock < 0) err (1, "socket"); struct sockaddr_in sin = { .sin_family = AF_INET, .sin_addr = { ntohl (INADDR_LOOPBACK) }, .sin_port = port, }; if (connect (sock, (struct sockaddr *) &sin, sizeof (sin)) < 0) err (1, "connect"); if (sendto (sock, "", 1, 0, NULL, 0) < 0) err (1, "sendto"); struct pollfd fd = { .fd = sock, .events = POLLIN }; int ret = poll (&fd, 1, 5000); if (ret < 0) err (1, "poll"); if (ret == 0) errx (1, "iteration %d: no ICMP message (poll timeout)", i); if (close (sock) < 0) err (1, "close"); } }