On Wed, Mar 14, 2018 at 04:27:58PM +0100, Mischa wrote:
> Hi Claudio,
>
> > On 25 Dec 2017, at 15:54, Mischa <obs...@high5.nl> wrote:
> >
> >> On 24 Dec 2017, at 19:07, Claudio Jeker <cje...@diehard.n-r-g.com> wrote:
> >> On Sat, Dec 23, 2017 at 02:04:19PM +0100, Mischa Peters wrote:
> >>>> On 23 Dec 2017, at 13:08, Claudio Jeker <cje...@diehard.n-r-g.com> wrote:
> >>>>> On Sat, Dec 23, 2017 at 11:40:57AM +0100, Mischa wrote:
> >>>>> Hi All,
> >>>>>
> >>>>> Since OpenBSD 6.2, just confirmed this in the latest snapshot
> >>>>> (GENERIC.MP#305) as well, for some reason relayd stops processing
> >>>>> traffic and starts flooding the log file with the following message:
> >>>>>
> >>>>> Dec 23 11:19:11 lb2 relayd[22515]: rsae_send_imsg: poll timeout
> >>>>> Dec 23 11:19:12 lb2 relayd[52110]: rsae_send_imsg: poll timeout
> >>>>> Dec 23 11:19:12 lb2 relayd[69641]: rsae_send_imsg: poll timeout
> >>>>> Dec 23 11:19:12 lb2 relayd[22515]: rsae_send_imsg: poll timeout
> >>>>> [snip]
> >>>>> Dec 23 11:19:17 lb2 relayd[69641]: rsae_send_imsg: poll timeout
> >>>>> Dec 23 11:19:18 lb2 relayd[22515]: rsae_send_imsg: poll timeout
> >>>>> Dec 23 11:19:18 lb2 relayd[52110]: rsae_send_imsg: poll timeout
> >>>>> Dec 23 11:19:18 lb2 relayd[69641]: rsae_send_imsg: poll timeout
> >>>>> ...etc...
> >>>>>
> >>>>> Restarting the daemon "fixes" the problem.
> >>>>> Not sure how to troubleshoot this but I am able to reproduce this
> >>>>> consistently by pointing SSLLabs towards relayd.
> >>>>> Would be great to get some pointers.
> >>>>>
> >>>>
> >>>> I have seen this as well on our production systems. This is a problem in
> >>>> the privsep part of the TLS code. I could not do more testing yet but my
> >>>> assumption is that a new option / feature is freaking this code out.
> >>>
> >>> Anything I can do or collect to give you more information?
> >>
> >> So, I think I found the problem. The ca process did not handle errors from
> >> RSA_private_encrypt correctly. So once you got a bad signature in the
> >> system choked and stopped. This diff seems to work for me (against
> >> SSLlabs).
> >
> > Awesome! Can confirm that it continues processing traffic when hitting it
> > with sslabs.
> > Will also move it to a busier machine to see how that handles.
> >
> > I am seeing the following messages now:
> > Dec 25 15:51:07 lb2 relayd[7541]: ca_dispatch_relay: error:04FFF06B:rsa routines:CRYPTO_internal:block type is not 02
> > Dec 25 15:51:08 lb2 relayd[27420]: ca_dispatch_relay: error:04FFF071:rsa routines:CRYPTO_internal:null before block missing
> > Dec 25 15:51:17 lb2 relayd[7541]: ca_dispatch_relay: error:04FFF072:rsa routines:CRYPTO_internal:padding check failed
> > Dec 25 15:51:33 lb2 relayd[73631]: ca_dispatch_relay: error:04FFF071:rsa routines:CRYPTO_internal:null before block missing
>
> Not sure if this is supposed to be taken care of, but I am still seeing the
> following messages in 6.3-beta.
> $ uname -a
> OpenBSD lb2l 6.3 GENERIC.MP#58 amd64
>
> Mar 13 23:43:38 lb2 relayd[96581]: ca_dispatch_relay: error:04FFF06B:rsa routines:CRYPTO_internal:block type is not 02
> Mar 13 23:43:39 lb2 relayd[96581]: ca_dispatch_relay: error:04FFF072:rsa routines:CRYPTO_internal:padding check failed
> Mar 13 23:43:48 lb2 relayd[14775]: ca_dispatch_relay: error:04FFF06B:rsa routines:CRYPTO_internal:block type is not 02
> Mar 13 23:44:03 lb2 relayd[96581]: ca_dispatch_relay: error:04FFF071:rsa routines:CRYPTO_internal:null before block missing
>
> Any knobs that need to be turned?

Unsure. The errors are from OpenSSL and explain why the key exchange blew up.
Currently this resets the connection. I wanted to log them so they are more
visible. Somebody with deep knowledge of TLS and libcrypto may be able to
answer if those are harmless or if somebody is probing.

-- 
:wq Claudio
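
To judge how often each of the ca_dispatch_relay failures occurs, the logged
error strings can be decoded and tallied. The following is only a minimal
sketch, not part of relayd. It assumes the classic OpenSSL 1.0.x / LibreSSL
error-code packing (8-bit library, 12-bit function, 12-bit reason), and the
names decode_line, tally and SAMPLE are hypothetical helpers introduced here.
The sample lines are the ones quoted in this thread.

```python
import re
from collections import Counter
from typing import Iterable, NamedTuple, Optional

# Matches the "error:<8 hex digits>:<library>:<function>:<reason>" tail
# of a logged libcrypto error string.
ERR_RE = re.compile(
    r"error:(?P<code>[0-9A-Fa-f]{8}):(?P<lib>[^:]*):(?P<func>[^:]*):(?P<reason>.*)$"
)


class CryptoError(NamedTuple):
    lib: int      # library number (4 is the RSA library)
    func: int     # function number
    reason: int   # reason number
    text: str     # human-readable reason as logged


def decode_line(line: str) -> Optional[CryptoError]:
    """Return the decoded error from one log line, or None if there is none."""
    m = ERR_RE.search(line)
    if m is None:
        return None
    code = int(m.group("code"), 16)
    # Assumed packing: lib in bits 24-31, func in bits 12-23, reason in bits 0-11.
    return CryptoError(
        lib=(code >> 24) & 0xFF,
        func=(code >> 12) & 0xFFF,
        reason=code & 0xFFF,
        text=m.group("reason").strip(),
    )


def tally(lines: Iterable[str]) -> Counter:
    """Count occurrences of each reason text across the given log lines."""
    counts: Counter = Counter()
    for line in lines:
        err = decode_line(line)
        if err is not None:
            counts[err.text] += 1
    return counts


SAMPLE = [
    "Mar 13 23:43:38 lb2 relayd[96581]: ca_dispatch_relay: error:04FFF06B:rsa routines:CRYPTO_internal:block type is not 02",
    "Mar 13 23:43:39 lb2 relayd[96581]: ca_dispatch_relay: error:04FFF072:rsa routines:CRYPTO_internal:padding check failed",
    "Mar 13 23:43:48 lb2 relayd[14775]: ca_dispatch_relay: error:04FFF06B:rsa routines:CRYPTO_internal:block type is not 02",
    "Mar 13 23:44:03 lb2 relayd[96581]: ca_dispatch_relay: error:04FFF071:rsa routines:CRYPTO_internal:null before block missing",
]

if __name__ == "__main__":
    for reason, count in tally(SAMPLE).most_common():
        print(f"{count:4d}  {reason}")
```

All three reasons seen here decode to library 4 (RSA) and concern the padding
of the decrypted block. A tally like this only shows frequency; it does not
by itself tell whether the traffic is a scanner or a broken client.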