Hello, apologies in advance for the silly question. We are having some stability issues with our squid farms after a recent upgrade from CentOS/Squid 3.5.x to Ubuntu/Squid 5.7/6.9. I wonder if anyone here has seen something similar and might have a suggestion about what we are obviously missing?
In short, after running for a certain period the servers run out of file descriptors. We see a slowly growing number of TCP or TCPv6 socket handles that eventually hits the configured maximum, and the handles are not released until squid is restarted (-k restart).

It is somewhat similar to what is reported under https://access.redhat.com/solutions/3362211 . They state that:

- If an application fails to close() its socket descriptors and continues to allocate new sockets, it can use up all the system memory on TCP(v6) slab objects.

- Some of these sockets will not show up in /proc/net/sockstat(6). Sockets that still have a file descriptor but are in the TCP_CLOSE state will consume a slab object, but will not be accounted for in /proc/net/sockstat(6), "ss" or "netstat".

- Whether it is an application socket leak can be determined by stopping the application processes that are consuming sockets. If the slab objects in /proc/slabinfo are then freed, the application is responsible, as that means the destructor routines found open file descriptors to sockets in the process. "This is most likely to be a case of the application not handling error conditions correctly and not calling close() to free the FD and socket."

For example, on a server with squid 5.7 (unmodified package), the list of open files:

  lsof | wc -l
  56963

of which some 35K are TCPv6:

  lsof | grep proxy | grep TCPv6 | wc -l
  35301

Under /proc I see far fewer objects:

  cat /proc/net/tcp6 | wc -l
  3095

but the number of objects in the slabs is high:

  cat /proc/slabinfo | grep TCPv6
  MPTCPv6                0      0   2048   16   8 : tunables 0 0 0 : slabdata     0     0 0
  tw_sock_TCPv6       1155   1155    248   33   2 : tunables 0 0 0 : slabdata    35    35 0
  request_sock_TCPv6     0      0    304   26   2 : tunables 0 0 0 : slabdata     0     0 0
  TCPv6              38519  38519   2432   13   8 : tunables 0 0 0 : slabdata  2963  2963 0

and I have 35K lines like this:

  lsof | grep proxy | grep TCPv6 | more
  squid      1049  proxy   13u  sock  0,8  0t0   5428173  protocol: TCPv6
  squid      1049  proxy   14u  sock  0,8  0t0  27941608  protocol: TCPv6
  squid      1049  proxy   24u  sock  0,8  0t0  45124047  protocol: TCPv6
  squid      1049  proxy   25u  sock  0,8  0t0  50689821  protocol: TCPv6
  ...

We thought this might be a weird IPv6 thing, as we only route IPv4, so we compiled a more recent version of squid with no v6 support. The problem just moved to TCP4:

  lsof | wc -l
  120313

  cat /proc/slabinfo | grep TCP
  MPTCPv6                0      0   2048   16   8 : tunables 0 0 0 : slabdata     0     0 0
  tw_sock_TCPv6          0      0    248   33   2 : tunables 0 0 0 : slabdata     0     0 0
  request_sock_TCPv6     0      0    304   26   2 : tunables 0 0 0 : slabdata     0     0 0
  TCPv6                208    208   2432   13   8 : tunables 0 0 0 : slabdata    16    16 0
  MPTCP                  0      0   1856   17   8 : tunables 0 0 0 : slabdata     0     0 0
  tw_sock_TCP         5577   5577    248   33   2 : tunables 0 0 0 : slabdata   169   169 0
  request_sock_TCP    1898   2002    304   26   2 : tunables 0 0 0 : slabdata    77    77 0
  TCP               102452 113274   2240   14   8 : tunables 0 0 0 : slabdata  8091  8091 0

  cat /proc/net/tcp | wc -l
  255

After restarting squid the slab objects are released and the open file descriptor count drops to a reasonable value, which further suggests it is squid hanging on to these FDs:

  lsof | grep proxy | wc -l
  1221

Any suggestions? I guess it is something blatantly obvious, but we have been looking at this for a couple of days and are not getting anywhere... Thanks again.
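PS: in case anyone wants to watch the growth over time, something like the loop below captures the same counters we compared by hand above. It is only a minimal sketch: it assumes the squid workers run as user "proxy", that lsof and awk are available, and the log path is just an example.

  # Sample once a minute: lsof's view of the proxy user's FDs, the number
  # of active TCP/TCPv6 slab objects, and the sockets actually visible
  # under /proc/net. Assumes squid runs as "proxy"; adjust user/path.
  while sleep 60; do
      fds=$(lsof -u proxy 2>/dev/null | wc -l)
      slab=$(awk '$1 == "TCP" || $1 == "TCPv6" { sum += $2 } END { print sum }' /proc/slabinfo)
      socks=$(awk 'FNR > 1' /proc/net/tcp /proc/net/tcp6 2>/dev/null | wc -l)
      echo "$(date -Is)  lsof=$fds  tcp_slab_objs=$slab  procfs_socks=$socks"
  done >> /var/log/squid-fd-growth.log

On the numbers above, the lsof count and the TCP(v6) slab objects track each other while /proc/net stays low. Comparing the lsof figure against squid's own accounting (e.g. squidclient mgr:filedescriptors) might also show whether squid still knows about these descriptors at all.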