Hello Ryan,

On Thu., Apr 09, 2009, Ryan NOVOSIELSKI wrote:
>baculal...@encambio.com wrote:
>> On Wed., Apr 08, 2009, Dan LANGILLE wrote:
>>> baculal...@encambio.com wrote:
>>>>   Director hostname back1.host.com: Solaris x86 11 (nv-b91)
>>>>   File daemon hostname back1.host.com: Solaris x86 11 (nv-b91)
>>>>   Errors seen on the director:
>>>>   08-Apr 09:36 bacsrv-dir JobId 40: Start Backup JobId 40, 
>>>> Job=Debut.2009-04-08_09.36.52.03
>>>>   08-Apr 09:36 bacsrv-dir JobId 40: Using Device "FileStorage"
>>>>   08-Apr 09:37 bacsrv-dir JobId 0: Error: openssl.c:86 Connect failure: 
>>>> ERR=error:1408F10B:SSL routines:SSL3_GET_RECORD:wrong version number
>>>>   08-Apr 09:37 bacsrv-dir JobId 40: Fatal error: TLS negotiation failed 
>>>> with FD at "back1.host.com:9102".
>>> I Googled. I found:
>>> http://www.mail-archive.com/bacula-users@lists.sourceforge.net/msg04842.html
>>> Does that help?
>> Very little. I've checked that my certs are correct (permissions,
>> CN=, etc.) In the bacula config files I've added hostnames (matching
>> CN=) with 'TLS Allowed CN' in every possible place (according to the
>> '-t' option to check config files.)
>What documentation have you used to set up Bacula with TLS? I seem to
>recall, actually, that there was one source of documentation that
>mentioned one step that wasn't in another (I believe the best one was
>written by Landon Fuller -- I forget where I found it). Perhaps you
>might want to search the list archives for discussions I had on this
>subject maybe 6-9 months ago as I believe I was pointed in the right
Good ideas, I did see some configuration advice from Landon. His
does help, as does http://www.devco.net/pubwiki/Bacula/TLS/ from

I ran truss(1) on the bacula-fd process and debugged the code to find
that the SSL logic calls read(2) on a socket on which a read would
block (the same socket on which the director was CRAM-MD5 authorized).
Because lib/tls.c had set this socket to be nonblocking, the read(2)
returns with the error EAGAIN (errno 11). The function
openssl_bsock_session_start in lib/tls.c is where this all happens,
and it finally returns false (via the "Socket Error Occured" branch).
That is why the connection is rejected.
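
To make the EAGAIN behaviour concrete, here is a minimal sketch in plain C. It is not Bacula code: the function name errno_of_idle_nonblocking_read is made up for illustration, and a local socketpair(2) stands in for the director connection. It only shows that a read(2) on a nonblocking socket with nothing queued fails at once instead of waiting.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Illustration only (not Bacula code): create a connected socket pair,
 * switch one end to non-blocking mode, and read from it while the peer
 * has sent nothing. Returns the errno of that read, 0 if the read
 * unexpectedly succeeded, or -1 if the setup itself failed.
 */
static int errno_of_idle_nonblocking_read(void)
{
    int sv[2];
    char buf[5];            /* the failing read in the trace asked for 5 bytes */
    int result = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    assert(sv[0] >= 0 && sv[1] >= 0);

    /* Same two steps as in the trace: F_GETFL, then F_SETFL with O_NONBLOCK */
    int flags = fcntl(sv[0], F_GETFL, 0);
    if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        result = -1;
    } else if (read(sv[0], buf, sizeof buf) < 0) {
        result = errno;     /* expected: EAGAIN (or EWOULDBLOCK) */
    }

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

Calling errno_of_idle_nonblocking_read() should give EAGAIN (or EWOULDBLOCK on systems where the two differ), so the EAGAIN by itself need not mean the socket is broken, only that no data had arrived yet.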

The trace:

  $ truss /pfx/sbin/bacula-fd -f ...
  /3:   read(6, "\0\0\0  ", 4)                          = 4
  /3:   read(6, " H e l l o   D i r e c t".., 32)       = 32
  /3:   read(6, " a u t h   c r a m - m d".., 52)       = 52
  /3:   read(6, " 1 0 0 0   O K   a u t h".., 13)       = 13
  /3:   fcntl(6, F_GETFL)                               = 2
  /3:   fcntl(6, F_SETFL, FWRITE|FNONBLOCK)             = 0
  /3:   time()                                          = 1239322002
  /3:   time()                                          = 1239322002
  /3:   time()                                          = 1239322002
  /3:   brk(0x081EE990)                                 = 0
  /3:   brk(0x081F4990)                                 = 0
  /3:   brk(0x081F4990)                                 = 0
  /3:   brk(0x081F8990)                                 = 0
  /3:   brk(0x081F8990)                                 = 0
  /3:   brk(0x081FC990)                                 = 0
  /3:   brk(0x081FC990)                                 = 0
  /3:   brk(0x081FE990)                                 = 0
  /3:   read(6, 0x081F34A0, 5)                          Err#11 EAGAIN

The code (in src/lib/tls.c):

static inline bool openssl_bsock_session_start(BSOCK *bsock, bool server)
{
   TLS_CONNECTION *tls = bsock->tls;
   [...]

   /* Ensure that socket is non-blocking */
   flags = bsock->set_nonblocking();
   [...]

   for (;;) {
      if (server) {
         err = SSL_accept(tls->openssl);
      [...]

      /* Handle errors */
      switch (SSL_get_error(tls->openssl, err)) {
      case SSL_ERROR_NONE:
         stat = true;
         goto cleanup;
      [...]
         /* Socket Error Occured */
         openssl_post_errors(M_ERROR, _("Connect failure"));
         stat = false;
         goto cleanup;
      [...]

   [...]
   /* Restore saved flags */
   [...]
   /* Clear timer */
   bsock->timer_start = 0;

   return stat;
}
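
For comparison, a nonblocking loop normally treats "no data yet" as a reason to wait and retry, not as a fatal error. With OpenSSL that condition is reported as SSL_ERROR_WANT_READ. The sketch below shows the same pattern with plain read(2) and select(3C) so that it compiles without OpenSSL. The helper name read_retry_on_eagain and its timeout parameter are assumptions made for this example. It is not the Bacula implementation.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Bacula): read up to len bytes from fd
 * in non-blocking mode, waiting at most timeout_sec seconds per attempt
 * for data to arrive. Returns the byte count, 0 on EOF, or -1 on error
 * or timeout (errno is ETIMEDOUT on timeout).
 */
static ssize_t read_retry_on_eagain(int fd, void *buf, size_t len,
                                    int timeout_sec)
{
    ssize_t n = -1;
    int saved_errno = 0;

    assert(fd >= 0 && fd < FD_SETSIZE);

    /* Save the current flags and switch the socket to non-blocking */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        return -1;
    }

    for (;;) {
        n = read(fd, buf, len);
        if (n >= 0) {
            break;                      /* got data, or EOF */
        }
        if (errno == EINTR) {
            continue;                   /* interrupted, just retry */
        }
        if (errno != EAGAIN && errno != EWOULDBLOCK) {
            saved_errno = errno;        /* a real socket error */
            break;
        }

        /* No data yet: sleep in select() until readable or timed out */
        fd_set rfds;
        struct timeval tv;
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;
        int ready = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (ready == 0) {
            saved_errno = ETIMEDOUT;    /* nothing arrived in time */
            break;
        }
        if (ready < 0 && errno != EINTR) {
            saved_errno = errno;
            break;
        }
    }

    /* Restore the saved flags before returning */
    fcntl(fd, F_SETFL, flags);
    if (n < 0) {
        errno = saved_errno;
    }
    return n;
}
```

If the handshake loop on Solaris never reaches an equivalent wait-and-retry branch, the first EAGAIN would end up being reported as a socket error, which would match the trace. That is a guess to check against the full source, not a confirmed diagnosis.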

If I remove the fcntl(2) where the socket is set to nonblocking,
things go further but in the end the client is unable to read
anything and the director reports 'Fatal error: FD gave bad response
to JobId command: No data available.'

Is anybody familiar with the logic around openssl_bsock_session_start,
or does anybody have an idea of what might be going on? Is anybody
besides me using Solaris? Remember that Solaris has its own (not the
BSD variant) socket API.


Bacula-users mailing list
