On Thu, Feb 19, 2004, Paul L. Allen wrote:

> Paul L. Allen wrote:
> >Dr. Stephen Henson wrote:
> >
> >>On Wed, Feb 18, 2004, Paul L. Allen wrote:
> >>
> >>
> >>>[ ... problem statement omitted ...]
> >>>
> >>
> >>
> >>Firstly I hope you are checking the return values from BIO_gets(),
> >>BIO_puts() and BIO_flush().
> >
> >
> >Yes, I am.  All are OK up to the hang.
> >
> >>Presumably you are using a buffering BIO to get the BIO_gets()
> >>functionality. It is possible something odd is happening in there. Are
> >>the lines you are using very long?
> >
> >
> >If I'm understanding it right, there's a socket BIO associated with the
> >underlying socket.  An SSL structure is associated with the input and
> >output of the socket BIO.  An SSL BIO is connected to the SSL struct, and
> >finally a buffered BIO is pushed on in front of the SSL BIO.  The server
> >has a setup that matches, and the whole contraption generally works fine.
> >
> >Some lines are long.  In one case, the client sends a request of 4534
> >bytes.  The server responds with a query to the client of 32 bytes.  The
> >client sends back a response of 35 bytes.  The server then sends back a
> >response to the original request of 92 bytes.  The client then sends a new
> >message of four bytes and hangs in BIO_flush().  I've been sending long
> >messages like this since about this time last year, but it's possible that
> >I've crept up a bit in size over time.  Is there a boundary at 4096, or
> >something?
> >
> >>It might be an idea to place some debugging printfs around the low level
> >>socket calls in crypto/bio/bss_sock.c to see if the hang is occurring at
> >>that level.
> >
> >
> >I can do that.  Thanks!
> 
> OK, I've instrumented crypto/bio/bss_sock.c with fprintf/fflush pairs
> on entry and exit from the socket calls.  BIO_puts() ends up using
> sock_write() and the call to sock_write() is usually triggered by a
> call to BIO_flush().  When it hangs, it stops somewhere in BIO_flush()
> prior to the call to sock_write().  After a timeout of 60 seconds or
> so, sock_write() gets called and returns, and then the process dies with
> an Alarm Clock error.  If I catch SIGALRM with a null handler, the process
> hangs forever without calling sock_write().  While the process is hung,
> it consumes all available CPU and issues no system calls.  Increasing
> the read and write buffer sizes (to 8192 from the default 1024) of the
> buffered BIOs on each end has no effect.
> 
> I guess I'll start instrumenting the code that implements the buffered
> and SSL BIOs next.  Anybody have any other suggestions?
> 

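OK, that seems to rule out the low-level socket read and write calls as the
cause: a process that burns CPU without issuing any system calls is spinning
in user space, not blocked on the socket.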
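
For anyone following along, the kind of entry/exit instrumentation described
above can be sketched as a thin wrapper around the raw write call. This is
only an illustration, not the code in crypto/bio/bss_sock.c; the name
traced_write is made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of OpenSSL): log entry to and exit from
 * write(2) so a hang can be placed before, inside, or after the call. */
ssize_t traced_write(int fd, const void *buf, size_t len)
{
    ssize_t ret;
    int saved_errno;

    assert(buf != NULL || len == 0);

    fprintf(stderr, "write: enter fd=%d len=%lu\n", fd, (unsigned long)len);
    fflush(stderr);   /* flush so the line survives a hang or crash */

    ret = write(fd, buf, len);
    saved_errno = errno;   /* the logging below may clobber errno */

    if (ret < 0)
        fprintf(stderr, "write: exit ret=%ld (%s)\n",
                (long)ret, strerror(saved_errno));
    else
        fprintf(stderr, "write: exit ret=%ld\n", (long)ret);
    fflush(stderr);

    errno = saved_errno;   /* restore for the caller */
    return ret;
}
```

If the "enter" line never appears while the process is hung, the loop is
somewhere above the socket layer, which is what the report above describes.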

Have you tried this in the latest 0.9.7 snapshot BTW? IIRC some fixes have
been made to buffering BIOs.

Can you break out of the program when it hangs to see what it is doing?
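
If attaching a debugger to the hung process is awkward, one alternative is to
have a signal handler print the call stack. The following is a rough sketch
that assumes glibc (backtrace() and backtrace_symbols_fd() come from
<execinfo.h> and are not available everywhere); dump_stack and
install_stack_dump are names invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <execinfo.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Signal handler: write the current call stack to stderr. */
void dump_stack(int signo)
{
    void *frames[MAX_FRAMES];
    int n;

    (void)signo;
    n = backtrace(frames, MAX_FRAMES);
    /* backtrace_symbols_fd() writes straight to the descriptor
     * instead of returning allocated strings. */
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
}

/* Install dump_stack for signo (for example SIGUSR1 or SIGALRM).
 * Returns 0 on success, -1 on failure, as sigaction() does. */
int install_stack_dump(int signo)
{
    struct sigaction sa;
    void *warmup[1];

    assert(signo > 0);

    /* Call backtrace() once up front so any lazy library loading
     * happens here rather than inside the signal handler. */
    backtrace(warmup, 1);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = dump_stack;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

With this installed for SIGUSR1, sending that signal to the spinning process
from another shell should print the stack at the point where it is looping.
Function names may show up only as addresses unless the program is linked
with -rdynamic.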

I think the buffering BIO is a likely suspect.

Steve.
--
Dr Stephen N. Henson. Email, S/MIME and PGP keys: see homepage
OpenSSL project core developer and freelance consultant.
Funding needed! Details on homepage.
Homepage: http://www.drh-consultancy.demon.co.uk
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]
