Hi, This are the logs i manage to capture during disconnection (I enabled 93 in debug_options)
2021/05/12 12:27:24.209| 93,5| ServiceRep.cc(378) noteTimeToUpdate: performs a regular options update [up] 2021/05/12 12:27:24.209| 93,6| ServiceRep.cc(611) startGettingOptions: will get new options [up] 2021/05/12 12:27:24.209| 93,5| AsyncJob.cc(34) AsyncJob: AsyncJob constructed, this=0x55a73f3bc468 type=Adaptation::Icap::OptXactLauncher [job31639] 2021/05/12 12:27:24.209| 93,5| AsyncCall.cc(26) AsyncCall: The AsyncCall AsyncJob::start constructed, this=0x55a73ef1ebc0 [call661700] 2021/05/12 12:27:24.209| 93,5| AsyncCall.cc(93) ScheduleCall: AsyncJob.cc(26) will call AsyncJob::start() [call661700] 2021/05/12 12:27:24.209| 93,5| AsyncCallQueue.cc(55) fireNext: entering AsyncJob::start() 2021/05/12 12:27:24.209| 93,5| AsyncCall.cc(38) make: make call AsyncJob::start [call661700] 2021/05/12 12:27:24.209| 93,5| AsyncJob.cc(123) callStart: Adaptation::Icap::OptXactLauncher status in: [ job31639] 2021/05/12 12:27:24.209| 93,4| Launcher.cc(49) launchXaction: launching first xaction #1 2021/05/12 12:27:24.209| 93,5| AsyncJob.cc(34) AsyncJob: AsyncJob constructed, this=0x55a73f3bca30 type=Adaptation::Icap::OptXact [job31640] 2021/05/12 12:27:24.209| 93,3| Xaction.cc(101) Xaction: Adaptation::Icap::OptXact constructed, this=0x55a73f3bc948 [icapxjob31640] 2021/05/12 12:27:24.209| 93,5| Xaction.cc(141) disableRepeats: Adaptation::Icap::OptXact from now on cannot be repeated because over icap_retry_limit [/ job31640] 2021/05/12 12:27:24.209| 93,5| AsyncCall.cc(26) AsyncCall: The AsyncCall AsyncJob::start constructed, this=0x55a73ef1ecb0 [call661701] 2021/05/12 12:27:24.209| 93,5| AsyncCall.cc(93) ScheduleCall: AsyncJob.cc(26) will call AsyncJob::start() [call661701] 2021/05/12 12:27:24.209| 93,5| AsyncJob.cc(154) callEnd: Adaptation::Icap::OptXactLauncher status out: [ job31639] 2021/05/12 12:27:24.209| 93,5| AsyncCallQueue.cc(57) fireNext: leaving AsyncJob::start() 2021/05/12 12:27:24.209| 93,5| AsyncCallQueue.cc(55) fireNext: entering AsyncJob::start() 2021/05/12 12:27:24.209| 93,5| AsyncCall.cc(38) make: make call AsyncJob::start [call661701] 2021/05/12 12:27:24.209| 93,5| AsyncJob.cc(123) callStart: Adaptation::Icap::OptXact status in: [/ job31640] 2021/05/12 12:27:24.209| 93,3| ServiceRep.cc(142) getConnection: got connection: 2021/05/12 12:27:24.209| 93,5| Xaction.cc(134) disableRetries: Adaptation::Icap::OptXact from now on cannot be retried [/ job31640] 2021/05/12 12:27:24.209| 93,3| Xaction.cc(189) openConnection: Adaptation::Icap::OptXact opens connection to skipper.proxy:14590 2021/05/12 12:27:24.209| 93,5| AsyncJob.cc(139) callEnd: AsyncJob::start() ends job [/ job31640] 2021/05/12 12:27:24.210| 93,5| Initiate.cc(64) swanSong: swan sings [/ job31640] 2021/05/12 12:27:24.210| 93,5| Initiate.cc(71) swanSong: swan sang [/ job31640] 2021/05/12 12:27:24.210| 93,3| Xaction.cc(113) ~Xaction: Adaptation::Icap::OptXact destructed, this=0x55a73f3bc948 [icapxjob31640] 2021/05/12 12:27:24.210| 93,5| AsyncJob.cc(40) ~AsyncJob: AsyncJob destructed, this=0x55a73f3bca30 type=Adaptation::Icap::OptXact [job31640] 2021/05/12 12:27:24.210| 93,6| AsyncJob.cc(149) callEnd: AsyncJob::start() ended 0x55a73f3bca30 2021/05/12 12:27:24.210| 93,5| AsyncCallQueue.cc(57) fireNext: leaving AsyncJob::start() 2021/05/12 12:27:24.210| 93,5| Launcher.cc(85) noteXactAbort: theXaction:0x55a73f3bc948/0x55a73f3bc948 launches: 1 2021/05/12 12:27:24.210| 93,9| Launcher.cc(127) canRepeat: 0 2021/05/12 12:27:24.210| 93,3| Launcher.cc(95) noteXactAbort: cannot retry or repeat a failed transaction 2021/05/12 12:27:24.210| 93,4| Answer.cc(20) Error: error: 0 2021/05/12 12:27:24.210| 93,5| AsyncCall.cc(26) AsyncCall: The AsyncCall Initiator::noteAdaptationAnswer constructed, this=0x55a73f27e950 [call661703] 2021/05/12 12:27:24.210| 93,5| AsyncCall.cc(93) ScheduleCall: Initiate.cc(83) will call Initiator::noteAdaptationAnswer(2) [call661703] 2021/05/12 12:27:24.210| 93,5| AsyncJob.cc(139) callEnd: Launcher::noteXactAbort(0,0) ends job [ job31639] 2021/05/12 12:27:24.210| 93,5| Initiate.cc(64) swanSong: swan sings [ job31639] 2021/05/12 12:27:24.210| 93,5| Initiate.cc(71) swanSong: swan sang [ job31639] 2021/05/12 12:27:24.210| 93,5| AsyncJob.cc(40) ~AsyncJob: AsyncJob destructed, this=0x55a73f3bc468 type=Adaptation::Icap::OptXactLauncher [job31639] 2021/05/12 12:27:24.210| 93,6| AsyncJob.cc(149) callEnd: Launcher::noteXactAbort(0,0) ended 0x55a73f3bc468 2021/05/12 12:27:24.210| 93,7| HttpRequest.cc(63) ~HttpRequest: destructed, this=0x55a73f3c9580 2021/05/12 12:27:24.210| 93,5| AsyncCallQueue.cc(55) fireNext: entering Initiator::noteAdaptationAnswer(2) 2021/05/12 12:27:24.210| 93,5| AsyncCall.cc(38) make: make call Initiator::noteAdaptationAnswer [call661703] 2021/05/12 12:27:24.210| 93,5| AsyncJob.cc(123) callStart: Adaptation::Icap::ServiceRep status in:[up,fetch] 2021/05/12 12:27:24.210| 93,3| ServiceRep.cc(554) noteAdaptationAnswer: failed to fetch options [up] 2021/05/12 12:27:24.210| 93,8| ServiceRep.cc(468) changeOptions: changes options from 0x55a73f3c9470 to 0 [up] 2021/05/12 12:27:24.210| optional ICAP service is down after an options fetch failure: icap://skipper.proxy:14590/response [down,!opt] 2021/05/12 12:27:24.210| 93,3| ServiceRep.cc(591) handleNewOptions: got new options and is now [down,!opt] 2021/05/12 12:27:24.210| 93,7| ServiceRep.cc(634) scheduleUpdate: raw OPTIONS fetch at 1620822474 or in 30 sec 2021/05/12 12:27:24.210| 93,9| ServiceRep.cc(636) scheduleUpdate: last fetched at 1620822444 or 0 sec ago 2021/05/12 12:27:24.210| 93,5| ServiceRep.cc(649) scheduleUpdate: will fetch OPTIONS in 30 sec 2021/05/12 12:27:24.210| 93,7| ServiceRep.cc(456) scheduleNotification: will notify 0 clients 2021/05/12 12:27:24.210| 93,5| AsyncCall.cc(26) AsyncCall: The AsyncCall Adaptation::Icap::ServiceRep::noteTimeToNotify constructed, this=0x55a73ef1ecb0 [call661704] 2021/05/12 12:27:24.210| 93,5| AsyncCall.cc(93) ScheduleCall: ServiceRep.cc(457) will call Adaptation::Icap::ServiceRep::noteTimeToNotify() [call661704] 2021/05/12 12:27:24.210| 93,5| AsyncJob.cc(154) callEnd: Adaptation::Icap::ServiceRep status out:[down,!opt] 2021/05/12 12:27:24.210| 93,5| AsyncCallQueue.cc(57) fireNext: leaving Initiator::noteAdaptationAnswer(2) 2021/05/12 12:27:24.210| 93,5| AsyncCallQueue.cc(55) fireNext: entering Adaptation::Icap::ServiceRep::noteTimeToNotify() 2021/05/12 12:27:24.210| 93,5| AsyncCall.cc(38) make: make call Adaptation::Icap::ServiceRep::noteTimeToNotify [call661704] 2021/05/12 12:27:24.210| 93,5| AsyncJob.cc(123) callStart: Adaptation::Icap::ServiceRep status in:[down,!opt] 2021/05/12 12:27:24.210| 93,7| ServiceRep.cc(397) noteTimeToNotify: notifies 0 clients [down,!opt,notif] 2021/05/12 12:27:24.210| 93,5| AsyncJob.cc(154) callEnd: Adaptation::Icap::ServiceRep status out:[down,!opt] 2021/05/12 12:27:24.210| 93,5| AsyncCallQueue.cc(57) fireNext: leaving Adaptation::Icap::ServiceRep::noteTimeToNotify() Thanks, Roie On Mon, Apr 19, 2021 at 12:07 PM Eliezer Croitoru <ngtech1...@gmail.com> wrote: > Hey Roie, > > > > From the output I assume it’s a dns resolution issue. > > In the past I remember that Docker was updating the hosts file with the > relevant names but it’s not working the same way now. > > Currently Docker is using a local network dns service which is being > accessed via 127.0.0.53. > > From I remember Squid is resolving the icap service name only at startup > or reload. > > Lately Alex published a testable patch that might fix specific issues with > icap services which are resolved by dns. ( sorry I don’t remember the bug > report) > > I assume you can try to test this patch first. > > If these services are static to some degree you might be able to create a > script that updates the hosts file and reload squid on each change. > > When using the hosts file it’s possible that some issues will disappear. > > > There is also another possibility which is a malformed ICAP response or > wrong sessions handling which cause this issue. > > You might be able to use tcpdump from either the host or the container > side to capture traffic when these goes down. > > Depends on your preference of debug level you might even be able to debug > specific debug_options like for ICAP services > and/or requests to the degree you might be able to see what happens on the > basic level of the ICAP encapsulation. > > If you really need help with a diagnosis and a solution you might be able > to use Alex and the measurement factory. > > All The Bests, > > Eliezer > > > > *From:* squid-users <squid-users-boun...@lists.squid-cache.org> *On > Behalf Of *roie rachamim > *Sent:* Monday, April 12, 2021 12:54 PM > *To:* squid-users@lists.squid-cache.org > *Subject:* [squid-users] All Adaptation ICAPs go down at the same time > > > > Hi, > > > > Our setup includes squid that runs in docker container with several ICAP > servers in additional containers. > > From time to time we see in cache.log the following messages: > 2021/04/12 00:22:39| optional ICAP service is down after an options fetch > failure: icap://icap1.proxy:14590/censor [down,!opt] > 2021/04/12 00:22:39| optional ICAP service is down after an options fetch > failure: icap://icap2.proxy:1344/request [down,!opt] > 2021/04/12 00:22:39| optional ICAP service is down after an options fetch > failure: icap://icap3.proxy:14590/response [down,!opt] > > 2021/04/12 06:10:45| optional ICAP service is down after an options fetch > failure: icap://icap1.proxy:14590/censor [down,!opt] > 2021/04/12 06:10:45| optional ICAP service is down after an options fetch > failure: icap://icap2.proxy:1344/request [down,!opt] > 2021/04/12 06:10:45| optional ICAP service is down after an options fetch > failure: icap://icap3.proxy:14590/response [down,!opt] > > > > We're trying to understand why it happens to all ICAPs at once. This > happens in 4.14 and in 5.0.4 > > Any thoughts about what might cause this ? > > Many Thanks, > > Roie >
_______________________________________________ squid-users mailing list squid-users@lists.squid-cache.org http://lists.squid-cache.org/listinfo/squid-users