Hi Claudio,

> On 25 Dec 2017, at 15:54, Mischa <obs...@high5.nl> wrote:
> 
>> On 24 Dec 2017, at 19:07, Claudio Jeker <cje...@diehard.n-r-g.com> wrote:
>> On Sat, Dec 23, 2017 at 02:04:19PM +0100, Mischa Peters wrote:
>>>> On 23 Dec 2017, at 13:08, Claudio Jeker <cje...@diehard.n-r-g.com> wrote:
>>>>> On Sat, Dec 23, 2017 at 11:40:57AM +0100, Mischa wrote:
>>>>> Hi All,
>>>>> 
>>>>> Since OpenBSD 6.2, just confirmed this in the latest snapshot 
>>>>> (GENERIC.MP#305) as well, for some reason relayd stops processing traffic 
>>>>> and starts flooding the log file with the following message:
>>>>> 
>>>>> Dec 23 11:19:11 lb2 relayd[22515]: rsae_send_imsg: poll timeout
>>>>> Dec 23 11:19:12 lb2 relayd[52110]: rsae_send_imsg: poll timeout
>>>>> Dec 23 11:19:12 lb2 relayd[69641]: rsae_send_imsg: poll timeout
>>>>> Dec 23 11:19:12 lb2 relayd[22515]: rsae_send_imsg: poll timeout
>>>>> [snip]
>>>>> Dec 23 11:19:17 lb2 relayd[69641]: rsae_send_imsg: poll timeout
>>>>> Dec 23 11:19:18 lb2 relayd[22515]: rsae_send_imsg: poll timeout
>>>>> Dec 23 11:19:18 lb2 relayd[52110]: rsae_send_imsg: poll timeout
>>>>> Dec 23 11:19:18 lb2 relayd[69641]: rsae_send_imsg: poll timeout
>>>>> ...etc...
>>>>> 
>>>>> Restarting the daemon "fixes" the problem.
>>>>> Not sure how to troubleshoot this but I am able to reproduce this 
>>>>> consistently by pointing SSL Labs at relayd.
>>>>> Would be great to get some pointers.
>>>>> 
>>>> 
>>>> I have seen this as well on our production systems. This is a problem in
>>>> the privsep part of the TLS code. I could not do more testing yet but my
>>>> assumption is that a new option / feature is freaking this code out.
>>> 
>>> Anything I can do or collect to give you more information? 
>> 
>> So, I think I found the problem. The ca process did not handle errors from
>> RSA_private_encrypt correctly. So once you got a bad signature, the
>> system choked and stopped. This diff seems to work for me (against
>> SSL Labs).
> 
> Awesome! Can confirm that it continues processing traffic when hitting it 
> with SSL Labs.
> Will also move it to a busier machine to see how that handles it.
> 
> I am seeing the following messages now:
> Dec 25 15:51:07 lb2 relayd[7541]: ca_dispatch_relay: error:04FFF06B:rsa routines:CRYPTO_internal:block type is not 02
> Dec 25 15:51:08 lb2 relayd[27420]: ca_dispatch_relay: error:04FFF071:rsa routines:CRYPTO_internal:null before block missing
> Dec 25 15:51:17 lb2 relayd[7541]: ca_dispatch_relay: error:04FFF072:rsa routines:CRYPTO_internal:padding check failed
> Dec 25 15:51:33 lb2 relayd[73631]: ca_dispatch_relay: error:04FFF071:rsa routines:CRYPTO_internal:null before block missing

Not sure if this is supposed to be taken care of, but I am still seeing the 
following messages in 6.3-beta.
$ uname -a
OpenBSD lb2l 6.3 GENERIC.MP#58 amd64

Mar 13 23:43:38 lb2 relayd[96581]: ca_dispatch_relay: error:04FFF06B:rsa routines:CRYPTO_internal:block type is not 02
Mar 13 23:43:39 lb2 relayd[96581]: ca_dispatch_relay: error:04FFF072:rsa routines:CRYPTO_internal:padding check failed
Mar 13 23:43:48 lb2 relayd[14775]: ca_dispatch_relay: error:04FFF06B:rsa routines:CRYPTO_internal:block type is not 02
Mar 13 23:44:03 lb2 relayd[96581]: ca_dispatch_relay: error:04FFF071:rsa routines:CRYPTO_internal:null before block missing
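
For what it's worth, here is a minimal sketch of how the hex code in these messages breaks down. It assumes the classic OpenSSL/LibreSSL error packing (8 bits library, 12 bits function, 12 bits reason, as extracted by the ERR_GET_LIB, ERR_GET_FUNC and ERR_GET_REASON macros). The helper name decode_err is made up for illustration and is not part of relayd or LibreSSL.

```python
import re

# Hypothetical helper, not part of relayd or LibreSSL.
# Assumption: the 32-bit code is packed as lib (8 bits), func (12 bits),
# reason (12 bits), matching ERR_GET_LIB / ERR_GET_FUNC / ERR_GET_REASON.
ERR_RE = re.compile(r"error:([0-9A-Fa-f]{8}):([^:]*):([^:]*):(.*)")


def decode_err(line):
    """Split an 'error:XXXXXXXX:lib:func:reason' log fragment into fields."""
    m = ERR_RE.search(line)
    if m is None:
        return None  # line does not contain an error string in this format
    code = int(m.group(1), 16)
    return {
        "lib": (code >> 24) & 0xFF,     # top 8 bits: library number
        "func": (code >> 12) & 0xFFF,   # middle 12 bits: function code
        "reason": code & 0xFFF,         # low 12 bits: reason code
        "text": m.group(4).strip(),     # human-readable reason string
    }


msg = ("relayd[96581]: ca_dispatch_relay: "
       "error:04FFF06B:rsa routines:CRYPTO_internal:block type is not 02")
print(decode_err(msg))
```

Under that assumption, all three messages share the same library (0x04, the RSA routines) and function (0xFFF) fields and differ only in the reason field (0x06B, 0x071, 0x072).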

Any knobs that need to be turned?

Mischa
