On Tue, May 1, 2012 at 7:26 AM, P J <pauljfli...@gmail.com> wrote:

> On Tue, May 1, 2012 at 7:22 AM, P J <pauljfli...@gmail.com> wrote:
>
>> On Mon, Apr 30, 2012 at 10:37 AM, P J <pauljfli...@gmail.com> wrote:
>>
>>> On Mon, Apr 30, 2012 at 9:13 AM, Alexandr Normuradov <norma...@gmail.com> wrote:
>>>
>>>> cat /proc/$(pidof -s httpd)/limits
>>>>
>>>> To troubleshoot that you should have at least two additional outputs from
>>>>
>>>> netstat -pant, with connections states
>>>> and
>>>> service httpd fullstatus, listing current state of all the apache
>>>> procs/threads.
>>>>
>>>> What applications your Apache is serving?
>>>> PHP? is it mod_php, mod_python, mod_perl?
>>>>
>>>> What the vhost access log file for the most accessed vhost is showing?
>>>> Any pattern of slow, connections consuming attack?
>>>> If it is, and all tasks are in the Keep Alive wait then disable Keep
>>>> Alive and lower the general timeout to just 7 seconds.
>>>>
>>>> The error "connect to listener on [::]:80" error is quite unusual.
>>>>
>>>> ETIMEDOUT
>>>> Timeout while attempting connection. The server may be too busy to
>>>> accept new connections. Note that for IP sockets the timeout may be
>>>> very long when syncookies are enabled on the server.
>>>>
>>>> cat /proc/sys/fs/file-nr
>>>>
>>>> cat /proc/$(pidof -s httpd)/limits
>>>>
>>>>
>>>> Sincerely,
>>>> Alexandr Normalex
>>>>
>>>
>>> Hi Alexandr, thanks for taking a look at this with me.
>>>
>>> The traffic pattern for this website is at certain times of the day it
>>> receives huge spikes of traffic in very short periods of time, trying to
>>> tune Apache to accommodate it the best we can.
>>>
>>> cat /proc/$(pidof -s httpd)/limits
>>>
>>> Limit                     Soft Limit           Hard Limit           Units
>>> Max cpu time              unlimited            unlimited            seconds
>>> Max file size             unlimited            unlimited            bytes
>>> Max data size             unlimited            unlimited            bytes
>>> Max stack size            10485760             unlimited            bytes
>>> Max core file size        0                    unlimited            bytes
>>> Max resident set          unlimited            unlimited            bytes
>>> Max processes             55296                55296                processes
>>> Max open files            1024                 1024                 files
>>> Max locked memory         32768                32768                bytes
>>> Max address space         unlimited            unlimited            bytes
>>> Max file locks            unlimited            unlimited            locks
>>> Max pending signals       55296                55296                signals
>>> Max msgqueue size         819200               819200               bytes
>>> Max nice priority         0                    0
>>> Max realtime priority     0                    0
>>>
>>> cat /proc/sys/fs/file-nr
>>> 1530 0 560543
>>>
>>> Looking at Max open files I see what is likely the problem :)
>>> Max open files 1024
>>>
>>> I swear I modified this to 4096! I've changed the limit to 4096 now,
>>> I'll double check it tomorrow. Hopefully this will be the obvious fix!
>>>
>>> I will check service httpd fullstatus and netstat -pant tomorrow
>>> morning when this happens again. It happens at the same time every day. It
>>> is not an attack; the customer's application receives massive amounts of
>>> connections at certain times of the day.
>>>
>>> I've been working with Apache for 15 years and I've never seen the "connect
>>> to listener on [::]:80" error message before. I hope it's related to
>>> reaching Max open files.
>>>
>>> Thanks again for your help.
>>>
>>> --
>>> PJ
>>>
>>
>> I was hoping this would be fixed now that Max open files has been
>> updated, but it was the same issue this morning.
>>
>> cat /proc/$(pidof -s httpd)/limits
>>
>> Limit                     Soft Limit           Hard Limit           Units
>> Max cpu time              unlimited            unlimited            seconds
>> Max file size             unlimited            unlimited            bytes
>> Max data size             unlimited            unlimited            bytes
>> Max stack size            10485760             unlimited            bytes
>> Max core file size        0                    unlimited            bytes
>> Max resident set          unlimited            unlimited            bytes
>> Max processes             55296                55296                processes
>> Max open files            1024                 1024                 files
>> Max locked memory         32768                32768                bytes
>> Max address space         unlimited            unlimited            bytes
>> Max file locks            unlimited            unlimited            locks
>> Max pending signals       55296                55296                signals
>> Max msgqueue size         819200               819200               bytes
>> Max nice priority         0                    0
>> Max realtime priority     0                    0
>>
>> Once it reaches 1000 total children:
>>
>> [info] server seems busy, (you may need to increase StartServers, or Min/MaxSpareServers), spawning 32 children, there are 17 idle, and 1002 total children
>>
>> After 1000 total children:
>>
>> mpm_common.c(663): (70007)The timeout specified has expired: connect to listener on [::]:80
>> mpm_common.c(663): (70007)The timeout specified has expired: connect to listener on [::]:80
>> mpm_common.c(663): (70007)The timeout specified has expired: connect to listener on [::]:80
>>
>> Until apache is restarted.
>>
>> I tried to run service httpd fullstatus during this time but it wasn't able
>> to connect:
>>
>> ELinks: Connection refused.
>>
>> I did capture the output of netstat -pant, which shows many connections to
>> the MySQL DB as well.
>> I've double checked that MySQL has not reached max connections and that it's
>> still working during this time.
>>
>> The netstat output is so big I have to put it up on pastebin:
>> http://pastebin.com/0DjvDnJp
>>
>> I don't understand why this is happening at 1000 children. What limit is
>> it hitting?
>>
>> Apache config:
>>
>> Timeout 30
>>
>> KeepAlive On
>> MaxKeepAliveRequests 10000
>> KeepAliveTimeout 3
>>
>> <IfModule prefork.c>
>> StartServers 80
>> MinSpareServers 50
>> MaxSpareServers 120
>> ServerLimit 3500
>> MaxClients 3500
>> MaxRequestsPerChild 4000
>> </IfModule>
>>
>>
>> Any help would be greatly appreciated.
>>
>> --
>> PJ
>>
>
> Haha, Max open files still says 1024!! I hardcoded it to 16384 yesterday,
> something keeps resetting it!
>
> Let me figure this out before I keep bugging the list :)
>
> Thanks,
>
> --
> PJ
>

Same issue this morning:

[Wed May 02 07:01:57 2012] [info] server seems busy, (you may need to increase StartServers, or Min/MaxSpareServers), spawning 32 children, there are 48 idle, and 1004 total children
[Wed May 02 07:02:16 2012] [debug] mpm_common.c(663): (70007)The timeout specified has expired: connect to listener on [::]:80
[Wed May 02 07:02:23 2012] [debug] mpm_common.c(663): (70007)The timeout specified has expired: connect to listener on [::]:80
[Wed May 02 07:02:30 2012] [debug] mpm_common.c(663): (70007)The timeout specified has expired: connect to listener on [::]:80
--snip--

And the site was down.

I've confirmed the Max open files setting has been fixed:

Max open files            16384                16384                files

Anyone else have any insight on what the "(70007)The timeout specified has expired: connect to listener on [::]:80" error is and why it happens every day after reaching 1000 children?

Not sure where else to look. Thanks in advance.

--
PJ
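
Since the Max open files value kept reverting in this thread, one way to double check what limit a running httpd process actually has is to parse /proc/<pid>/limits for each child rather than only the one PID returned by pidof -s. Below is a minimal Python sketch, assuming a Linux /proc filesystem. The helper names parse_limits and read_limits and the script name are made up for illustration; they are not part of Apache or any library, and this does not explain the 1000-children ceiling by itself.

```python
import re
import sys
from pathlib import Path


def parse_limits(text):
    """Parse /proc/<pid>/limits content into {name: (soft, hard)}.

    Values are ints, or None where the kernel reports "unlimited".
    """
    def to_value(field):
        return None if field == "unlimited" else int(field)

    limits = {}
    for line in text.splitlines():
        line = line.strip()
        # Every limit row starts with "Max"; this skips the header row.
        if not line.startswith("Max"):
            continue
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line)
        if len(fields) >= 3:
            limits[fields[0]] = (to_value(fields[1]), to_value(fields[2]))
    return limits


def read_limits(pid):
    """Read and parse the limits of one running process (Linux only)."""
    return parse_limits(Path(f"/proc/{pid}/limits").read_text())


if __name__ == "__main__":
    # Hypothetical usage: python check_limits.py $(pidof httpd)
    pids = [arg for arg in sys.argv[1:] if arg.isdigit()]
    for pid in pids:
        soft, hard = read_limits(pid)["Max open files"]
        print(f"pid {pid}: Max open files soft={soft} hard={hard}")
```

Passing every PID from pidof httpd (without -s) shows whether the parent and all children picked up the new value after a restart, which is the thing that kept going wrong above.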