I found same problem in IIS recently. When SSL connection established and read to eof,I can't write anything to stream again. It's trow :SSL operation failed with code 1. OpenSSL Error messages: error:140D00CF:SSL routines:func(208):reason(207). My php version is 5.2.9
2009/9/8 Eloy Bote Falcon <eloyb...@gmail.com>: > Hi everybody! > > Three months ago one of my PHP applications started to hang webserver > processes. The schema of the application is as follows (it's a webservice): > > -Main server, with LINUX, Apache and PHP, handles the requests from the > clients. > -Once the client is authenticated, and the request is valid, it relays the > request to one of the four slave servers (let's call them nodes). > -One of the nodes uses PHP4 (works fine), and the other three use LINUX, > Apache and PHP 5.1.6 with the following SSL configuration: > OpenSSL support => enabled > OpenSSL Version => OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008 > > -The nodes send the request to a remote webservice, and reply it back to the > main server. > -The main server processes the response and sends back the response to the > client. > > > The problem is that, eventually, the PHP5 nodes ends with blocked (sleeping) > processes from Apache. Finally the system runs out of resources because all > the blocked-sleeping processes have a ESTABLISHED socket. > First I noticed that, at least in that version of PHP, the blocking sockets > can have a send and recive timeout. Then I switched to non blocking sockets > and select. Now the application works fine in the upper level of the > communication with the other webservices, but I noticed that it still ends > with sleeping processes. I "straced" the Apache and this is what I found > (the bound IP and the remote webservice IP are not real; the second one is > from google.com): > > Block 1 - > > 5503 socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 20 > 5503 bind(20, {sa_family=AF_INET, sin_port=htons(0), > sin_addr=inet_addr("10.6.0.11")}, 16) = 0 > 5503 fcntl64(20, F_GETFL) = 0x2 (flags O_RDWR) > 5503 fcntl64(20, F_SETFL, O_RDWR|O_NONBLOCK) = 0 > 5503 connect(20, {sa_family=AF_INET, sin_port=htons(443), > sin_addr=inet_addr("74.125.127.100")}, 16) = -1 EINPROGRESS (Operation now > in progress) > 5503 poll([{fd=20, events=POLLIN|POLLOUT|POLLERR|POLLHUP}], 1, 1370327) = 1 > ([{fd=20, revents=POLLOUT}]) > > Block 2 - > > 5503 getsockopt(20, SOL_SOCKET, SO_ERROR, [0], [4]) = 0 > 5503 fcntl64(20, F_SETFL, O_RDWR) = 0 > 5503 time(NULL) = 1251911098 > 5503 brk(0xb9e39000) = 0xb9e39000 > 5503 time(NULL) = 1251911098 > 5503 write(20, > "\200d\1\3\1\0K\0\0\0\20\0\0009\0\0008\0\0005\0\0\26\0\0\23\0\0\n\7\0\300"..., > 102) = 102 > 5503 read(20, "\26\3\1\0J\2\0", 7) = 7 > 5503 time(NULL) = 1251911098 > 5503 time(NULL) = 1251911098 > 5503 read(20, > "\0F\3\1J\236\245\273^\v\375\357<\205\223\334n}\226k\232\\\311\300\21Q\256\267\304\257o#"..., > 72) = 72 > 5503 read(20, "\26\3\1\6\252", 5) = 5 > 5503 read(20, > "\v\0\6\246\0\6\243\0\3\2400\202\3\2340\202\3\5\240\3\2\1\2\2\4<\260Eh0\r\6"..., > 1706) = 1376 > 5503 read(20, " 2 CA1\r0\v\6\3U\4\3\23\4CRL10+\6\3U\35\20\4$0\"\200"..., > 330) = 330 > 5503 read(20, "\26\3\1\1^", 5) = 5 > 5503 read(20, "\r\0\1v\3\...@\1p\0m0k1\v0\t\6\3u\4\6\23\2es1\0170\r\6"..., > 350) = 350 > 5503 write(20, > "\26\3\1\0\7\v\0\0\3\0\0\0\26\3\1\0\206\20\0\0\202\0\200\215\30 > \tT\312\313\255\210"..., 210) = 210 > 5503 read(20, "\24\3\1\0\1", 5) = 5 > 5503 read(20, "\1", 1) = 1 > 5503 read(20, "\26\3\1\0000", 5) = 5 > 5503 read(20, > "\364\35k\17\25\224\314]|\263Y\307z\310\223\314\210\235h\224t\345\305\373\267\241\377\t\1s\345\250"..., > 48) = 48 > > Block 3 - > > 5503 fcntl64(20, F_GETFL) = 0x2 (flags O_RDWR) > 5503 fcntl64(20, F_SETFL, O_RDWR|O_NONBLOCK) = 0 > > Block 4 - > > 5503 select(21, [], [20], [], {44320, 0}) = 1 (out [20], left {44320, 0}) > 5503 gettimeofday({1251911098, 885029}, NULL) = 0 > 5503 write(13, "[Wed Sep 02 19:04:58 2009] [erro"..., 161) = 161 > 5503 write(20, > "\27\3\1\2\240\10\260\325k\v\37{\251\355t_R\274\312\16\354r\315\1\277\204b\342xU\30L"..., > 677) = 677 > 5503 select(21, [20], [], [], {44320, 0} <unfinished ...> > 5503 <... select resumed> ) = 1 (in [20], left {44319, 828000}) > 5503 read(20, "\27\3\1\0\240", 5) = 5 > 5503 read(20, > "\304\310\273s\370\304e\335\273\1\346\376N\234\376\226\276\0\fG9T\313\235\2&\31\340B=\f\""..., > 160) = 160 > 5503 select(21, [20], [], [], {44320, 0}) = 1 (in [20], left {44320, 0}) > 5503 read(20, "\27\3\1\4P", 5) = 5 > 5503 read(20, > "\333`\341\212},\215\3476\370\325\273\"\260\340\234\365\324\324\10a\25v\327\266'+\340\34\1\317\364"..., > 1104) = 1104 > > Block 5 - > > 5503 write(20, "\25\3\1\0 > \305\10\236\5\10\226JS\330\365\204\21\230\246\362\247R\332\220:\210o\2\333\10.\27"..., > 37) = 37 > 5503 brk(0xb9e2d000) = 0xb9e2d000 > 5503 close(20) = 0 > > Block 6 - > > 5503 brk(0xb9e2b000) = 0xb9e2b000 > 5503 chdir("/") = 0 > 5503 setitimer(ITIMER_PROF, {it_interval={0, 0}, it_value={0, 0}}, NULL) = > 0 > 5503 writev(19, > [{"\27\3\1\0\260n?\301\204\263\0341yi\204\v\273\332\335\263\277\335M\t\223\364$\266\214\212c\362"..., > 714}], 1) = 714 > 5503 gettimeofday({1251911099, 58549}, NULL) = 0 > 5503 write(14, "10.6.0.1 - noone [02/Sep"..., 97) = 97 > 5503 gettimeofday({1251911099, 58696}, NULL) = 0 > 5503 times({tms_utime=1, tms_stime=0, tms_cutime=0, tms_cstime=0}) = > 369262016 > 5503 gettimeofday({1251911099, 58799}, NULL) = 0 > 5503 writev(19, [{"\25\3\1\0 > MJi\215B\321Q\2657E\30cw\341n\264o\215\17\237{ej\342n\222\202"..., 37}], 1) > = 37 > 5503 brk(0xb9e2a000) = 0xb9e2a000 > 5503 shutdown(19, 1 /* send */) = 0 > 5503 poll([{fd=19, events=POLLIN}], 1, 2000) = 1 ([{fd=19, > revents=POLLIN}]) > 5503 read(19, "\25\3\1\0 > \230\220\315\210n\342zc\201\367H\r'xPt\277\243\346\227\320g\2223;U\352"..., > 512) = 37 > 5503 gettimeofday({1251911099, 59317}, NULL) = 0 > 5503 gettimeofday({1251911099, 59372}, NULL) = 0 > 5503 poll([{fd=19, events=POLLIN}], 1, 2000) = 1 ([{fd=19, > revents=POLLIN|POLLERR|POLLHUP}]) > 5503 read(19, 0xbf81c504, 512) = -1 ECONNRESET (Connection reset by > peer) > 5503 close(19) = 0 > 5503 read(7, 0xbf81c713, 1) = -1 EAGAIN (Resource temporarily > unavailable) > 5503 close(18) = 0 > 5503 close(8) = 0 > 5503 close(7) = 0 > 5503 exit_group(0) = ? > > As you can see I devided in 6 blocks this piece of strace output. In the > first block, the socket it's created, its bound to a network interface, and > it's connected to the remote host; fine. But in the block 2 the socket it's > switched back to blocking and then starts the SSL communication. In the > block 3 the socket it's switched to non blocking (this is the first time > that our application have chenged the socket options, because we don't have > the stream until the stream_socket_client function returns). Then, in the > block 4, the PHP that I wrote starts to work and the send / receive takes > place. In the block 5 there is a write without select... why? Well, I think > that is another "feature" of the underlaying socket communications, related > to the fclose of the stream. At last, the application ends in the block 6 > sending the result back to the main server and closing resources. > > After saying all of this, my question is... why the SSL implementation uses > potentially blocking code? The only work around that I found (inside > PHP/Apache) for facing this issue is setting a max execution time of the > apache processes because the PHP max_execution_time doesn't count the > sleeping time. I can't test right now this code in a latest version of PHP, > but I'm afraid that the strace output will be the same. > > > Regards, > > Eloy Bote. > -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php