The underlying issue here is that the upstart script does not wait long
enough for squid3 to finish.

By default, when squid3 receives a SIGTERM it will close the socket for
incoming connections and will wait for existing connections to complet.
How long it waits before forcefully closing those connections is
configured by the shutdown_lifetime directive - 30 seconds by default.

Current setting, SIGKILL after 5 seconds:
=====
# strace -r -e trace=signal,file -p $(cat /var/run/squid3.pid)
Process 20865 attached - interrupt to quit
     0.000000 --- SIGTERM (Terminated) @ 0 (0) ---
     0.000440 rt_sigreturn(0xf)         = -1 EINTR (Interrupted system call)
     0.000418 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) 
= 0
     5.006483 +++ killed by SIGKILL +++

With "kill timeout 40" in upstart script, notice the pid file being unlinked 
(the ~30 seconds wait is due to the default value of shutdown_lifetime 
directive):
=====
# strace -r -e trace=signal,file -p $(cat /var/run/squid3.pid)
Process 20805 attached - interrupt to quit
     0.000000 --- SIGTERM (Terminated) @ 0 (0) ---
     0.000284 rt_sigreturn(0xf)         = -1 EINTR (Interrupted system call)
     0.000402 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) 
= 0
    31.611726 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) 
= 0
     0.001792 --- SIGCHLD (Child exited) @ 0 (0) ---
     0.000247 rt_sigreturn(0xffffffffffffffff) = 63
     0.006027 unlink("/var/run/squid3.pid") = 0
     0.000539 open("/proc/sys/kernel/ngroups_max", O_RDONLY) = 0
     0.000387 open("/etc/group", O_RDONLY|O_CLOEXEC) = 0
Process 20805 detached

Note that the current behavior is also quite bad and can corrupt cache
index files, which is a really bad thing as squid will take a long time
to start the next time when it detects a corrupted cache index file [1].

I'm willing to submit a SRU for this, but I wonder if we should replace
SIGTERM with SIGINT. Using SIGINT means that squid will drop outstanding
connections immediately and then proceed to write cache index files.
This will avoid squid slowing down shutdown times for users [2] and we
somewhat keep the current behavior (ie. the current 5-sec SIGKILL
already does that and I don't see anyone complaining).

With "kill signal SIGINT", which does not wait on outstanding connections:
=====
# strace -r -e trace=signal,file -p $(cat /var/run/squid3.pid)
Process 20891 attached - interrupt to quit
     0.000000 --- SIGINT (Interrupt) @ 0 (0) ---
     0.000312 rt_sigreturn(0x2)         = -1 EINTR (Interrupted system call)
     0.000237 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) 
= 0
     1.123564 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) 
= 0
     0.000754 --- SIGCHLD (Child exited) @ 0 (0) ---
     0.000103 rt_sigreturn(0xffffffffffffffff) = 0
     0.002031 unlink("/var/run/squid3.pid") = 0
     0.000154 open("/proc/sys/kernel/ngroups_max", O_RDONLY) = 0
     0.000142 open("/etc/group", O_RDONLY|O_CLOEXEC) = 0
Process 20891 detached

Still, let me reiterate that even with SIGINT enabled I believe we
should still increase "kill timeout" somewhat as we surely want to avoid
any cache corruption. According to [1] we don't need to wait that much,
so I guess anything between 20~40 should be fine. Again, since issuing a
SIGKILL during cache write *WILL* corrupt it, it might be sane to wait
even longer for cases when the system is under high load and/or has a
lot of cache dirs - this would/should be quite rare anyway: in most
cases squid shutdown time with SIGINT should be under 5 secs.

Of course, since there is still a chance that SIGKILL does get issued,
we will also need the fix proposed in #1 because there's a chance the
pid file was not removed.

[1] http://lists.squid-cache.org/pipermail/squid-users/2015-July/004752.html
[2] http://askubuntu.com/questions/18127/squid3-starts-and-stops-slowly

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1073478

Title:
  squid3 logrotate fails when squid is not running

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/squid3/+bug/1073478/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

Reply via email to