> From: [email protected] On Behalf Of Sam Jantz
> Sent: Monday, 30 August, 2010 13:50
> I have just now fixed the bug. The source of the problem
> was an SSL_read call on the client half of the proxy. This was triggering
This is ambiguous; do you mean the connection to the client (where the proxy
is acting as server) or the connection to the server (proxy acting as
client)?
The latter makes more sense to me, see below.
> an error SSL_ERROR_SYSCALL with a ret of zero. According to the documentation
> this is normally caused by an improperly shutdown SSL connection, however
> rescheduling the read for when the socket was ready (using a select statement)
> fixed this issue. I have tested it up to a 5MB file, and it works perfectly.
This isn't entirely clear; are you saying SSL_read returned 0 (which indicates
TCP disconnect, aka "EOF") and *then* SSL_get_error returned _SYSCALL (and not
_ZERO_RETURN)? That would mean the peer disconnected without sending a
shutdown alert first. It's arguable whether this is "improper", but it's
at least "suboptimal" or "possibly worrisome".
You didn't say, and I didn't think to ask: how do you know you are sending
the whole request? If you are, say, SSL_read'ing only the first 32K or 100K
or 1M or such of the request from the client, SSL_write'ing that to the server,
and then SSL_read'ing the server, of course you can't get a valid response.
In that case I would expect gmail to time out and disconnect -- as a huge
public service, it can't afford to keep open potentially huge numbers of
connections from wedged clients. If they did just disconnect, it might be
politer to do SSL-shutdown, but depending on the software structure that might
be difficult.
You might have been on the right track originally about keepalive.
For HTTP/1.0 you can simply do connection-oriented:
  while ((n = SSL_read(fromcli, buf, sizeof buf)) > 0)
    SSL_write(tosvr, buf, n);
  if (n == 0) { /* EOF=disconnect=request complete, now do response */
    while ((n = SSL_read(fromsvr, buf, sizeof buf)) > 0)
      SSL_write(tocli, buf, n);
    close(tocli); /* indicate end */
    close(tosvr); /* clean up */
  } else /* n == -1: error, possibly incomplete request */ ??
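The connection-oriented loop above can be sketched as a small C function.
This is only a sketch under stated assumptions: read_fn and write_fn are
hypothetical stand-ins for SSL_read and SSL_write (same return convention:
>0 bytes, 0 on EOF, <0 on error), and struct membuf, mem_read and mem_write
are made-up in-memory endpoints so the loop can be run without OpenSSL.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for SSL_read / SSL_write (not OpenSSL types):
   both return >0 bytes transferred, 0 on EOF, <0 on error. */
typedef int (*read_fn)(void *conn, void *buf, int len);
typedef int (*write_fn)(void *conn, const void *buf, int len);

/* Copy from src to dst until src reports EOF.
   Returns the total number of bytes relayed, or -1 on any error.
   A short write is retried until the whole chunk has been forwarded. */
long relay_until_eof(void *src, read_fn rd, void *dst, write_fn wr)
{
    char buf[4096];
    long total = 0;
    int n;

    while ((n = rd(src, buf, (int)sizeof buf)) > 0) {
        int off = 0;
        while (off < n) {
            int w = wr(dst, buf + off, n - off);
            if (w <= 0)
                return -1;      /* write failed: give up on this exchange */
            off += w;
        }
        total += n;
    }
    return n == 0 ? total : -1; /* 0 = EOF (data complete), <0 = error */
}

/* In-memory endpoint, used only to exercise the loop. */
struct membuf { char data[256]; int len; int pos; };

int mem_read(void *c, void *buf, int len)
{
    struct membuf *m = c;
    int left = m->len - m->pos;
    int n = left < len ? left : len;
    if (n > 5)
        n = 5;                  /* deliver small pieces, like a slow peer */
    memcpy(buf, m->data + m->pos, (size_t)n);
    m->pos += n;
    return n;                   /* 0 once everything has been consumed */
}

int mem_write(void *c, const void *buf, int len)
{
    struct membuf *m = c;
    int n = len > 3 ? 3 : len;  /* accept at most 3 bytes: a short write */
    assert(len > 0);
    if (m->len + n > (int)sizeof m->data)
        return -1;              /* buffer full, reported as an error */
    memcpy(m->data + m->len, buf, (size_t)n);
    m->len += n;
    return n;
}
```

In a real proxy you would call this once for client-to-server and once for
server-to-client, passing thin wrappers around SSL_read and SSL_write. The
point of the -1 return is that a failed relay is kept distinct from a clean
EOF, which is the only case where the data is known to be complete.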
But for HTTP/1.1, if keepalive is enabled (and all browsers I have used
enable it, although technically it is optional) you *won't* get EOF following
(and delimiting) the request, so in general you must either:
1. parse the request headers (at least if there is a body,
which there is for POST) and do:
  while (more_req() && (n = SSL_read(cli, buf, sizeof buf)) > 0)
    SSL_write(svr, buf, n);
  if error or incomplete ??
  /* similarly for the response if it has a body, which most responses do,
     with the additional complication of possible chunked transfer encoding */
  loop for next request+response, until EOF on either side
or 2. do 'full-duplex' which works for any HTTP sequence:
while forever or until manually interrupted
when data available from cli read and write to svr
when data available from svr read and write to cli
in between do something that doesn't hog CPU
but if an error happens you don't know what the HTTP state is
and can't even try to recover. Using select-readable *on both sides*
gives you a good approximation to this, but in general SSL and thus
openssl may need to both send and receive even on a connection that
is logically write-only or read-only, so instead of just select'ing
for readable (or writable), the robust way is to use nonblocking
sockets, try SSL_read or SSL_write, let it return -1 and SSL_get_error
will tell you _WANT_READ or _WANT_WRITE, and (remember and) select
for that. This is described in both the SSL_read/write and SSL_get_error
manpages. Or, less efficient but simpler, just (re)try _read (and _write
when needed) every X milliseconds, and it will progress when it can.
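The "remember what it wanted and select for that" step can be sketched as
below. This is a minimal sketch, not OpenSSL code: enum io_want and
wait_for_want are hypothetical names, with WANT_READ and WANT_WRITE standing
in for the SSL_ERROR_WANT_READ and SSL_ERROR_WANT_WRITE results of
SSL_get_error, so that the example builds with only POSIX select().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>   /* pipe(), write(): handy when trying this on a pipe */

/* Hypothetical mirror of the two retry conditions SSL_get_error() can
   report; real code would switch on the OpenSSL constants directly. */
enum io_want { WANT_NONE, WANT_READ, WANT_WRITE };

/* Wait until fd is ready in the direction the TLS layer asked for, or
   until timeout_ms elapses.
   Returns 1 if ready, 0 on timeout, -1 if select() itself failed. */
int wait_for_want(int fd, enum io_want want, int timeout_ms)
{
    fd_set rfds, wfds;
    struct timeval tv;
    int r;

    FD_ZERO(&rfds);
    FD_ZERO(&wfds);
    if (want == WANT_READ)
        FD_SET(fd, &rfds);
    else if (want == WANT_WRITE)
        FD_SET(fd, &wfds);
    else
        return 1;               /* nothing to wait for: retry right away */

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    r = select(fd + 1, &rfds, &wfds, NULL, &tv);
    if (r < 0)
        return -1;
    return r > 0 ? 1 : 0;
}
```

The intended use is: call SSL_read (or SSL_write) on a nonblocking socket;
if it returns -1, map the SSL_get_error result to WANT_READ or WANT_WRITE,
call wait_for_want on that socket, then repeat the same SSL call. Note that
an SSL_read may ask for WANT_WRITE and an SSL_write for WANT_READ, which is
why the wanted direction has to be remembered rather than assumed. A proxy
watching both sides would put both sockets into one select() call instead
of waiting on one at a time.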
> I am a little confused on why I was getting the error in the first place
> still though. What would cause SSL_ERROR_SYSCALL to be flagged, and have
> an empty error queue if the socket was not closed improperly on the other
> side?
First, EOF isn't really an error. Second, when SSL_read etc. (calls the socket
BIO, which) gets a socket error, it returns -1 and SSL_get_error returns
_SYSCALL, but the error is not (usually?) put in the ERR_ queue. You must
instead use errno on Unix or [WSA]GetLastError() on Windows. The manpage for
SSL_get_error says this "may" be the case and in my experience it always is.
(Note that internally, at the OS level, [WSA]EWOULDBLOCK/etc. for nonblocking
are treated as errors, but openssl handles them internally and reports them as
_WANT_READ/_WANT_WRITE, so your code only sees 'real' errors as _SYSCALL.)
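To make the two _SYSCALL cases above concrete, here is a small sketch of how
they could be told apart once the error queue is known to be empty. The
function describe_syscall is a hypothetical helper, not part of OpenSSL; it
assumes you saved errno immediately after the failing SSL_read/SSL_write
(on Windows you would save WSAGetLastError() instead and format it
differently).

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: describe an SSL_ERROR_SYSCALL result, given the
   return value of the SSL call and the errno captured right after it.
   Writes a message into out (at most outlen bytes) and returns out. */
const char *describe_syscall(int ret, int saved_errno, char *out, size_t outlen)
{
    if (ret == 0)
        /* EOF on the socket with no shutdown alert received first */
        snprintf(out, outlen, "EOF: peer closed TCP without a shutdown alert");
    else if (saved_errno != 0)
        /* a genuine socket error: the detail is in errno, not the ERR_ queue */
        snprintf(out, outlen, "socket error %d: %s",
                 saved_errno, strerror(saved_errno));
    else
        snprintf(out, outlen, "I/O error, but errno was not set");
    return out;
}
```

Capturing errno straight after the SSL call matters because almost any
later library call (including logging) may overwrite it.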
______________________________________________________________________
OpenSSL Project http://www.openssl.org
User Support Mailing List [email protected]
Automated List Manager [email protected]