Hello, apologies in advance for the silly question. We are having some stability issues with our squid farms after a recent upgrade from CentOS/Squid 3.5.x to Ubuntu/Squid 5.7/6.9. I wonder if anyone here has seen something similar and might have a suggestion about what we are obviously missing?
In short, after running for a certain period the servers run out of file descriptors. We see a slowly growing number of TCP or TCPv6 socket handles that eventually hits the configured maximum, and the handles are not released until squid is restarted (-k restart).

It is somewhat similar to what is reported under https://access.redhat.com/solutions/3362211 . They state that:

- If an application fails to close() its socket descriptors and continues to allocate new sockets, it can use up all the system memory on TCP(v6) slab objects.

- Some of these sockets will not show up in /proc/net/sockstat(6). Sockets that still have a file descriptor but are in the TCP_CLOSE state will consume a slab object, but will not be accounted for in /proc/net/sockstat(6), "ss" or "netstat".

- Whether it is an application socket leak can be determined by stopping the application processes that are consuming sockets. If the slab objects in /proc/slabinfo are then freed, the application is responsible, as that means the destructor routines found open file descriptors to sockets in the process. "This is most likely to be a case of the application not handling error conditions correctly and not calling close() to free the FD and socket."

For example, on a server with squid 5.7 (unmodified package), the list of open files:

  lsof | wc -l
  56963

of which some 35K are TCPv6:

  lsof | grep proxy | grep TCPv6 | wc -l
  35301

Under /proc I see far fewer objects:

  cat /proc/net/tcp6 | wc -l
  3095

but the number of objects in the slabs is high:

  cat /proc/slabinfo | grep TCPv6
  MPTCPv6                0      0   2048   16   8 : tunables 0 0 0 : slabdata     0     0 0
  tw_sock_TCPv6       1155   1155    248   33   2 : tunables 0 0 0 : slabdata    35    35 0
  request_sock_TCPv6     0      0    304   26   2 : tunables 0 0 0 : slabdata     0     0 0
  TCPv6              38519  38519   2432   13   8 : tunables 0 0 0 : slabdata  2963  2963 0

and I have 35K lines like this:

  lsof | grep proxy | grep TCPv6 | more
  squid      1049  proxy   13u  sock  0,8  0t0   5428173  protocol: TCPv6
  squid      1049  proxy   14u  sock  0,8  0t0  27941608  protocol: TCPv6
  squid      1049  proxy   24u  sock  0,8  0t0  45124047  protocol: TCPv6
  squid      1049  proxy   25u  sock  0,8  0t0  50689821  protocol: TCPv6
  ...

We thought this might be a weird IPv6 thing, as we only route IPv4, so we compiled a more recent version of squid with no v6 support. The problem just moved to TCP4:

  lsof | wc -l
  120313

  cat /proc/slabinfo | grep TCP
  MPTCPv6                0      0   2048   16   8 : tunables 0 0 0 : slabdata     0     0 0
  tw_sock_TCPv6          0      0    248   33   2 : tunables 0 0 0 : slabdata     0     0 0
  request_sock_TCPv6     0      0    304   26   2 : tunables 0 0 0 : slabdata     0     0 0
  TCPv6                208    208   2432   13   8 : tunables 0 0 0 : slabdata    16    16 0
  MPTCP                  0      0   1856   17   8 : tunables 0 0 0 : slabdata     0     0 0
  tw_sock_TCP         5577   5577    248   33   2 : tunables 0 0 0 : slabdata   169   169 0
  request_sock_TCP    1898   2002    304   26   2 : tunables 0 0 0 : slabdata    77    77 0
  TCP               102452 113274   2240   14   8 : tunables 0 0 0 : slabdata  8091  8091 0

  cat /proc/net/tcp | wc -l
  255

After restarting squid the slab objects are released and the open file descriptor count drops to a reasonable value, which further suggests it is squid hanging on to these FDs:

  lsof | grep proxy | wc -l
  1221

Any suggestions? I guess it is something blatantly obvious, but we have been looking at this for a couple of days and are not getting anywhere... Thanks again.
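PS: in case anyone wants to watch the growth over time, something like the loop below captures the same counters we compared by hand above. It is only a minimal sketch: it assumes the squid workers run as user "proxy", that lsof and awk are available, and the log path is just an example.

  # Sample once a minute: lsof's view of the proxy user's FDs, the number
  # of active TCP/TCPv6 slab objects, and the sockets actually visible
  # under /proc/net. Assumes squid runs as "proxy"; adjust user/path.
  while sleep 60; do
      fds=$(lsof -u proxy 2>/dev/null | wc -l)
      slab=$(awk '$1 == "TCP" || $1 == "TCPv6" { sum += $2 } END { print sum }' /proc/slabinfo)
      socks=$(awk 'FNR > 1' /proc/net/tcp /proc/net/tcp6 2>/dev/null | wc -l)
      echo "$(date -Is)  lsof=$fds  tcp_slab_objs=$slab  procfs_socks=$socks"
  done >> /var/log/squid-fd-growth.log

On the numbers above, the lsof count and the TCP(v6) slab objects track each other while /proc/net stays low. Comparing the lsof figure against squid's own accounting (e.g. squidclient mgr:filedescriptors) might also show whether squid still knows about these descriptors at all.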