It was thus said that the Great André Warnier once stated:
>
> Another thing : it looks from your lsof list, that you are using the
> Apache "prefork" model.
> I don't remember precisely your configuration or the kind of load or
> processes you are running, but you might try the "worker" (threaded)
> model instead. I am not entirely sure, but I believe that this will
> also reduce the total number of files opened on your system, as threads
> may share a lot more in that respect, than individual child processes do.
Switching to the threaded model won't help if you are running out of open
files---the open file limit is *per process* (more on that below). And yes,
a child process will inherit all open files from the parent process, but
once the child process is created, it can close any open files without
affecting the parent (and by the same token, the parent can close files
without affecting any existing children).
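If you want to convince yourself of the inheritance part, here is a minimal sketch in Python (my choice of language, nothing to do with Apache itself; the function name is made up for illustration). It forks, has the child close its inherited copy of a descriptor, and then checks that the parent's copy is still usable. It assumes a Unix-like system where os.fork() is available.

```python
import os
import tempfile


def child_close_leaves_parent_fd_open():
    """Fork, close an inherited descriptor in the child, test it in the parent."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        pid = os.fork()
        if pid == 0:
            # Child: we inherited fd from the parent; close only our copy.
            os.close(fd)
            os._exit(0)  # leave without running the parent's cleanup code
        # Parent: wait for the child, then check our descriptor still works.
        os.waitpid(pid, 0)
        try:
            os.fstat(fd)  # raises OSError if fd were no longer open
        except OSError:
            return False
        return True


print("parent descriptor still open:", child_close_leaves_parent_fd_open())
```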
> But I would wait a few hours for a real expert to comment, which I'm
> sure one will do if I wrote something really stupid above.
First off, "ulimit -n" will report back the number of open files a
*PROCESS* can have open---this isn't system wide, but per-process. It's an
important distinction.
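A process can ask for its own limit, too. The sketch below uses Python's standard resource module (an assumption on my part that Python is handy; the helper name is hypothetical). It reads RLIMIT_NOFILE, which is the limit behind "ulimit -n": the soft value is what the shell reports by default, and the hard value is the ceiling the soft value can be raised to.

```python
import resource


def nofile_limits():
    """Return (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = nofile_limits()
print(f"soft limit (what 'ulimit -n' reports): {soft}")
print(f"hard limit (ceiling for the soft limit): {hard}")
```

Run it from two shells with different "ulimit -n" settings and you should see two different answers, since each process carries its own copy of the limit.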
Second, lsof does report all files used by a process, but that isn't the
whole story. For instance, here is the lsof output for one instance of
apache on my development server:
httpd 17155 apache cwd DIR 253,0 4096 2 /
httpd 17155 apache rtd DIR 253,0 4096 2 /
httpd 17155 apache txt REG 253,0 259488 5300352 /usr/sbin/httpd
httpd 17155 apache mem REG 253,0 50748 18187232 /lib/tls/librt-2.3.4.so
httpd 17155 apache mem REG 253,0 28544 18187230 /lib/libcrypt-2.3.4.so
httpd 17155 apache mem REG 253,0 1525004 18187219 /lib/tls/libc-2.3.4.so
httpd 17155 apache mem REG 253,0 81184 18187224 /lib/libresolv-2.3.4.so
httpd 17155 apache mem REG 253,0 213600 18187228 /lib/libssl.so.0.9.7a
httpd 17155 apache mem REG 253,0 7004 18186360 /lib/libcom_err.so.2.1
httpd 17155 apache mem REG 253,0 82320 5303493 /usr/lib/libsasl2.so.2.0.19
... a whole mess of output deleted
httpd 17155 apache mem REG 253,0 1261824 5345160 /usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE/libperl.so
httpd 17155 apache mem REG 253,0 804084 6424738 /usr/lib/libstdc++.so.6.0.3
httpd 17155 apache DEL REG 0,6 422353 /dev/zero
httpd 17155 apache DEL REG 0,6 37541 /dev/zero
httpd 17155 apache DEL REG 0,6 422347 /dev/zero
httpd 17155 apache 0r CHR 1,3 2029 /dev/null
httpd 17155 apache 1w CHR 1,3 2029 /dev/null
httpd 17155 apache 2w REG 253,0 8426 2360724 /var/log/httpd/error_log.1
httpd 17155 apache 3u IPv6 37481 TCP *:http (LISTEN)
httpd 17155 apache 4u IPv6 37483 TCP *:https (LISTEN)
httpd 17155 apache 5r FIFO 0,7 422346 pipe
httpd 17155 apache 6w FIFO 0,7 422346 pipe
httpd 17155 apache 7w REG 253,0 8426 2360724 /var/log/httpd/error_log.1
httpd 17155 apache 8w REG 253,0 262156 2360561 /var/log/httpd/access_log.1
httpd 17155 apache 9w REG 253,0 15126 11783673 /etc/httpd/www.roswell.area51
httpd 17155 apache 10w REG 253,0 151095 11782872 /etc/httpd/wwwtest.roswell.area51
httpd 17155 apache 11w REG 253,0 3046 11781743 /etc/httpd/s-secure.roswell.area51
httpd 17155 apache 12w REG 253,0 4263 11783866 /etc/httpd/secure.roswell.area51
It's 146 lines, but it's not 146 "open" files. Since Linux can page
directly from executables and libraries, technically, they're "open" in the
sense that yes, the kernel is reading from them as the program runs, but
they're not counted against the "ulimit -n" limit (unless I'm terribly
mistaken, and I could be, but bear with me for a bit). You can see such
memory mappings if you check the maps file in the appropriate /proc
directory:
[r...@lucy 17155]# pwd
/proc/17155
[r...@lucy 17155]# more maps
00111000-00119000 r-xp 00000000 fd:00 18187232 /lib/tls/librt-2.3.4.so
00119000-0011a000 r--p 00007000 fd:00 18187232 /lib/tls/librt-2.3.4.so
0011a000-0011b000 rw-p 00008000 fd:00 18187232 /lib/tls/librt-2.3.4.so
0011b000-00125000 rw-p 0011b000 00:00 0
00125000-0012a000 r-xp 00000000 fd:00 18187230 /lib/libcrypt-2.3.4.so
0012a000-0012b000 r--p 00004000 fd:00 18187230 /lib/libcrypt-2.3.4.so
0012b000-0012c000 rw-p 00005000 fd:00 18187230 /lib/libcrypt-2.3.4.so
0012c000-00153000 rw-p 0012c000 00:00 0
00153000-00278000 r-xp 00000000 fd:00 18187219 /lib/tls/libc-2.3.4.so
00278000-0027a000 r--p 00124000 fd:00 18187219 /lib/tls/libc-2.3.4.so
0027a000-0027c000 rw-p 00126000 fd:00 18187219 /lib/tls/libc-2.3.4.so
0027c000-0027e000 rw-p 0027c000 00:00 0
0027e000-0028d000 r-xp 00000000 fd:00 18187224 /lib/libresolv-2.3.4.so
0028d000-0028e000 r--p 0000f000 fd:00 18187224 /lib/libresolv-2.3.4.so
0028e000-0028f000 rw-p 00010000 fd:00 18187224 /lib/libresolv-2.3.4.so
0028f000-00291000 rw-p 0028f000 00:00 0
00291000-002c2000 r-xp 00000000 fd:00 18187228 /lib/libssl.so.0.9.7a
002c2000-002c5000 rw-p 00031000 fd:00 18187228 /lib/libssl.so.0.9.7a
002c5000-002c7000 r-xp 00000000 fd:00 18186360 /lib/libcom_err.so.2.1
002c7000-002c8000 rw-p 00001000 fd:00 18186360 /lib/libcom_err.so.2.1
... rest snipped
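If you would rather not eyeball that listing, here is a rough sketch of pulling the distinct mapped files out of maps-style text. It's in Python, the function name is my own invention, and it assumes the six-column layout shown above (address range, permissions, offset, device, inode, optional path). The sample text is a few lines copied from the listing above.

```python
def mapped_files(maps_text):
    """Return the sorted, de-duplicated file paths named in maps-style text."""
    paths = set()
    for line in maps_text.splitlines():
        # Columns: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if path.startswith("/"):  # skip anonymous mappings and pseudo-entries
                paths.add(path)
    return sorted(paths)


sample = """\
00111000-00119000 r-xp 00000000 fd:00 18187232 /lib/tls/librt-2.3.4.so
0011b000-00125000 rw-p 0011b000 00:00 0
00125000-0012a000 r-xp 00000000 fd:00 18187230 /lib/libcrypt-2.3.4.so
0012a000-0012b000 r--p 00004000 fd:00 18187230 /lib/libcrypt-2.3.4.so
"""

for path in mapped_files(sample):
    print(path)
```

On a Linux box you could feed it the contents of /proc/self/maps (or the maps file of any process you're allowed to read) instead of the sample.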
For files that are actually "opened" (as in, via the open() system call),
you can check the fd directory under the same /proc entry:
[r...@lucy fd]# pwd
/proc/17155/fd
[r...@lucy fd]# ll
total 13
lr-x------ 1 root root 64 May 20 17:46 0 -> /dev/null
l-wx------ 1 root root 64 May 20 17:46 1 -> /dev/null
l-wx------ 1 root root 64 May 20 17:46 10 -> /etc/httpd/wwwtest.roswell.area51
l-wx------ 1 root root 64 May 20 17:46 11 -> /etc/httpd/s-secure.roswell.area51
l-wx------ 1 root root 64 May 20 17:46 12 -> /etc/httpd/secure.roswell.area51
l-wx------ 1 root root 64 May 20 17:46 2 -> /var/log/httpd/error_log.1
lrwx------ 1 root root 64 May 20 17:46 3 -> socket:[37481]
lrwx------ 1 root root 64 May 20 17:46 4 -> socket:[37483]
lr-x------ 1 root root 64 May 20 17:46 5 -> pipe:[422346]
l-wx------ 1 root root 64 May 20 17:46 6 -> pipe:[422346]
l-wx------ 1 root root 64 May 20 17:46 7 -> /var/log/httpd/error_log.1
l-wx------ 1 root root 64 May 20 17:46 8 -> /var/log/httpd/access_log.1
l-wx------ 1 root root 64 May 20 17:46 9 -> /etc/httpd/www.roswell.area51
These are the files that have actually been "open()"ed by the program (and I
see that logrotate on the development system is still borked, but I
digress). Files 0 and 1 are STDIN and STDOUT respectively and since this
is the webserver, they're remapped to '/dev/null'. File 2 is STDERR and
that's redirected to '/var/log/httpd/error_log.1' (should be error_log, but
like I said, there's something wrong with logrotate on that system). Files
3 and 4 are the listening sockets (port 80 and 443---if you cross reference
the lsof output, you can find out which sockets are listening to which
ports), and well, I can go on (the two pipes are probably there for CGIs,
and you have the various config and log files) but I think you get the idea.
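The same listing can be had programmatically. This is a minimal sketch, assuming Linux with /proc mounted and Python available; open_fds is a name I made up. It reads the symlinks in /proc/<pid>/fd, so the number of entries it returns is the number that counts against "ulimit -n" for that process.

```python
import os


def open_fds(pid="self"):
    """Map descriptor number -> target for one process, read from /proc."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor went away between listdir() and readlink(),
            # for example the one listdir() itself used to read the directory.
            continue
    return result


if os.path.isdir("/proc/self/fd"):
    for fd, target in sorted(open_fds().items()):
        print(fd, "->", target)
```

Reading another user's process (say, an httpd child) this way generally needs root, same as the "ll" in the listing above.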
In any case, this is what you should be concerned with. And there are
several ways you can proceed:
1. You can increase the open files limit with ulimit.
2. You can consolidate all log files to two (access_log and error_log).
3. You can consolidate all configuration files into one large file.
4. Check Apache modules for open files and modify accordingly.
5. Some combination of the above.
But what you actually do depends upon what you are doing with your
webserver (whenever I've encountered this problem, I first consolidate error
logs, then increase ulimit).
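For what option 1 amounts to under the hood, here is a small sketch using Python's resource module (again my own illustration with a hypothetical function name, not anything Apache ships). It raises the soft open-file limit of the current process toward a wanted value without going past the hard limit, which an unprivileged process cannot exceed.

```python
import resource


def raise_nofile_soft_limit(wanted):
    """Raise the soft RLIMIT_NOFILE toward `wanted`, capped at the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to raise
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)  # cannot go above the hard limit
    if wanted > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


print("soft limit is now:", raise_nofile_soft_limit(1024))
```

Keep in mind the limit is per process and is inherited by children, so for a webserver it has to be raised in whatever starts the server (the shell or startup script), not in some unrelated process.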
-spc (Hope this helps some ... )
---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.