We recently ran into a problem in our production environment: the survey emperor couldn’t serve any more requests because the system had run out of file descriptors. When we investigated, we found a very large number of fds per vassal (about 200 each), and an even larger number for the emperor itself (fairly small right after a restart, then creeping up into the thousands). Here’s a sample for a vassal:
# ls -l /proc/46248/fd
total 0
lr-x------ 1 s1 s1 64 May 7 08:54 0 -> /dev/null
lrwx------ 1 s1 s1 64 May 7 08:54 1 -> /home/log_intellisurvey/7.0/uwsgi/survey/rni05407104p.log
lrwx------ 1 s1 s1 64 May 7 08:54 10 -> socket:[1281427374]
lrwx------ 1 s1 s1 64 May 7 08:54 100 -> socket:[1281440771]
lrwx------ 1 s1 s1 64 May 7 08:54 101 -> socket:[1281440772]
lrwx------ 1 s1 s1 64 May 7 08:54 102 -> socket:[1281440773]
lrwx------ 1 s1 s1 64 May 7 08:54 103 -> socket:[1281440774]
lrwx------ 1 s1 s1 64 May 7 08:54 104 -> socket:[1281440775]
lrwx------ 1 s1 s1 64 May 7 08:54 105 -> socket:[1281440776]
lrwx------ 1 s1 s1 64 May 7 08:54 106 -> socket:[1281440777]
lrwx------ 1 s1 s1 64 May 7 08:54 107 -> socket:[1281440778]
lrwx------ 1 s1 s1 64 May 7 08:54 108 -> socket:[1281440779]
lrwx------ 1 s1 s1 64 May 7 08:54 109 -> socket:[1281440780]
lrwx------ 1 s1 s1 64 May 7 08:54 11 -> socket:[1281432897]
lrwx------ 1 s1 s1 64 May 7 08:54 110 -> socket:[1281440781]
lrwx------ 1 s1 s1 64 May 7 08:54 111 -> socket:[1281440782]
lrwx------ 1 s1 s1 64 May 7 08:54 112 -> socket:[1281440783]
lrwx------ 1 s1 s1 64 May 7 08:54 113 -> socket:[1281440784]
lrwx------ 1 s1 s1 64 May 7 08:54 114 -> socket:[1281440785]
lrwx------ 1 s1 s1 64 May 7 08:54 115 -> socket:[1281440786]
lrwx------ 1 s1 s1 64 May 7 08:54 116 -> socket:[1281440787]
lrwx------ 1 s1 s1 64 May 7 08:54 117 -> socket:[1281440788]
lrwx------ 1 s1 s1 64 May 7 08:54 118 -> socket:[1281440789]
lrwx------ 1 s1 s1 64 May 7 08:54 119 -> socket:[1281440790]
lrwx------ 1 s1 s1 64 May 7 08:54 12 -> /home/log_intellisurvey/7.0/uwsgi/survey_emperor.log
lrwx------ 1 s1 s1 64 May 7 08:54 120 -> socket:[1281440791]
lrwx------ 1 s1 s1 64 May 7 08:54 121 -> socket:[1281440792]
lrwx------ 1 s1 s1 64 May 7 08:54 122 -> socket:[1281440793]
lrwx------ 1 s1 s1 64 May 7 08:54 123 -> socket:[1281440794]

… and so on, for a total of 220, mostly sockets. For the emperor, again it is mostly socket connections.
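For anyone who wants to compare numbers, here is a rough sketch of how counts like these could be gathered per process instead of reading the ls output by eye. It is only an illustration: it assumes Linux with /proc mounted and permission to read the target process's fd directory, and the function names (classify_target, summarize_fds) are made up for this example, not part of uWSGI.

```python
import os
import sys
from collections import Counter


def classify_target(target):
    """Map a /proc/<pid>/fd symlink target to a coarse category."""
    if target.startswith("socket:"):
        return "socket"
    if target.startswith("pipe:"):
        return "pipe"
    if target.startswith("anon_inode:"):
        return "anon_inode"
    if target.startswith("/"):
        return "file"
    return "other"


def summarize_fds(pid):
    """Count the open descriptors of one process by category (Linux only)."""
    fd_dir = "/proc/{}/fd".format(pid)
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        counts[classify_target(target)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical usage: pass one or more PIDs (emperor, vassals) as arguments.
    for pid in sys.argv[1:]:
        counts = summarize_fds(pid)
        print(pid, sum(counts.values()), dict(counts))
```

Run periodically against the emperor's PID, something like this would show whether the socket count is the part that keeps growing.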
Reading the docs and searching for file descriptors, I see a recommendation for “close-on-exec”. We will try that, but I also thought I’d post in case anyone else has seen this kind of behavior and has any recommendations. We are running uWSGI 2.1 with Perl and use a few advanced features such as the fork-server option, so perhaps that has something to do with it?

Thanks in advance for any comments or advice.

Rob Messer
_______________________________________________ uWSGI mailing list uWSGI@lists.unbit.it http://lists.unbit.it/cgi-bin/mailman/listinfo/uwsgi