(kept HTML because otherwise too much status lost, but my Outlook tends to screw up formatting when editing HTML; sorry for any glitches)
_____

From: owner-openssl-us...@openssl.org [mailto:owner-openssl-us...@openssl.org] On Behalf Of Stéphane Charette
Sent: Saturday, 21 April, 2012 04:14
To: openssl-users@openssl.org
Subject: Re: a question about openssl sessions

On Thu, Apr 19, 2012 at 19:45, Dave Thompson <dthomp...@prinpay.com> wrote:

> From: owner-openssl-us...@openssl.org On Behalf Of Stéphane Charette
> Sent: Sunday, 15 April, 2012 20:31

> I'm using Openssl to talk to a server that expects to re-use ssl
> sessions when a client needs to open many SSL connections. I have
> the same code working on Linux and Windows.

Using classic resumption (sessionid) or RFC4507 ticket?

Thanks for the reply, Dave. I believe this is using the classic resumption (sessionid). I did write up some sample code to demonstrate the problem. And using some Mac/iPhone/iPad app to establish SSL connections to FileZilla, this has been confirmed on many devices, so I'm almost certain it isn't just my code. Unless I happen to have made the exact same mistake in the sample code as the application has done.

This isn't clear. Do you mean other FTP client apps work while yours doesn't? Or do you mean other apps also fail? Also hang, or any different kind of failure?

Here is the sample application that works on Linux/Windows, but which hangs when the SSL connection is first established on the Mac: http://charette.no-ip.com:81/asio-openssl/

This code establishes the first SSL connection, then attempts to reuse the session ID to open up a 2nd connection. On a Mac, iPhone, and iPad, it hangs when the 2nd connection is established.

This appears to involve a whole layer of boost stuff I know nothing about, so I comment only on the OpenSSL part. If that layer is doing something to your socket(s), especially if it's OS-dependent (which system-library type stuff sometimes is) that could be part of your problem.

Your posted code below doesn't check for error from SSL_connect; if you do check what do you see?
Note that my code does check for errors. In the e-mail and in the sample code, I did trim a lot of lines to try and make a more concise posting.

Good. In general when posting code, if you want to suppress irrelevant sections it's a good idea to leave a comment. But where your question actually involves handling an error, it's better to leave *that* part in.

Specifically here: The name SSL_get_error may be misleading; its return isn't always an 'error', just a condition to which your code may need to respond differently. The man page calls it a result code. When you get any return other than success from SSL_connect, SSL_read, etc. you should call SSL_get_error, and if that returns SSL_ERROR_SSL you should look at the error-queue. That is simplest with ERR_print_errors_fp if you have a suitable FILE*, typically stdout or stderr, or with ERR_print_errors if you have a suitable BIO; or use custom logic with ERR_get_error, ERR_error_string et al. Note ERR_get_error != SSL_get_error. For SSL_ERROR_SYSCALL you should usually try both the error-queue and the OS-level socket error, which in Unix (including AFAIK MacOSX) is errno.

The SSL_ERROR_WANT_* returns should occur only(?) if you use nonblocking sockets (and boost::asio sounds to me like something that might use nonblocking) or certain unusual callbacks (not evident here), and your code needs to re-try the SSL_connect etc. call at a suitable later time, which probably depends on how you manage your threads, which you say nothing about. You might be better off doing a single-thread program first before trying multithreading.

Your comments say you got SSL_connect() != 1 but not what you got from SSL_get_error, or whether it's the same on different OSes, much less the error-queue and/or errno.

And for non-protocol SSL* calls like _set_session, _load_verify_locations, _use_PrivateKey that have a 'failure' return (usually 0 or NULL), and (most?) libcrypto calls like EVP*, BIO*, RSA* etc. that do so, again you should look at the error-queue (skipping SSL_get_error).
Can you recreate the problem with commandline s_client with -sess_out on the first connection and -sess_in on the second, with or without -no_ticket? If so, -debug and -state will probably be helpful.

Can I re-create the problem with the command-line ssl tool since it requires copying and re-using a ssl sessionid while the first control ssl socket is still active and in use? Is this what you're saying with -sess_out and -sess_in, that I can export the ssl session and re-import it even though it is a different context in a different application?

More exactly, it requires copying and reusing the whole 'session' which includes session-id, negotiated ciphersuite etc., mostly-exchanged master secret, and some other information. Within a process, including threads, you can just use pointers to a single session object as your code does, but across multiple processes you have to write the session out and read it back in. And that's exactly what commandline s_client -sess_out and -sess_in do. -sess_out connects with a full handshake, creating a session, and writes it to a file. -sess_in reads from the file and uses it for a resumed connection, subject to server agreement. The timing is manual; you can do -sess_in after the -sess_out process/connection completes or while it still exists.

If by 'different application' you mean a different program (using OpenSSL) not just a different process running the same program, that is possible but you have to write it; OpenSSL distro only provides s_client.

Specifically, prior to doing (any/all) SSL_new(ctx) I assume. And I assume you aren't changing settings like cipherlist and compression between connections. Sharing the session *should* override these, but maybe something might slip through a crack. Even if so, I don't see any reason it would differ on Mac.

No, I'm not changing any of these. Please see the sample code I link to above.

Both get1_session and set_session increment the refcount, so I believe your session object(s?) will not get cleaned up even if all connections using them go away and the cache times-out. But in the usage you describe this is probably just a quite small memory leak and doesn't matter.

Ooh, thanks for pointing that out. I'll confirm with valgrind, should be obvious if I'm leaking as the application has the potential to create a lot of these secondary ssl connections.

It's always nicer to clean up properly, but to be clear what matters here for memory usage is not the number of 'secondary' connections (all sharing an existing session) but the number of different sessions.