Hi,

On Mon, Feb 5, 2018 at 7:52 PM, Jonathan K. Bullard <jkbull...@gmail.com> wrote:
> Hi, I'd like to reopen this patch -- it seems to have gotten lost.
>
> The patch is so old the line numbers are wrong but the code doesn't
> seem to have changed.
>
> I'm top-posting because this thread doesn't show up in the SourceForge
> archive[1] (???) so I've extracted the relevant parts below. There was
> some top-posting but nothing was really out-of-order, so I've tried to
> make the thread intelligible and include everything relevant below.
>
> On Tue, Apr 12, 2016 at 11:42 AM, Fish <fish.t...@gmail.com> wrote:
>> In `link_socket_init_phase2`, it used to be the case that any received
>> signal will be overwritten by SIGUSR1 if the socket cannot be created
>> (e.g. when DNS resolution fails), which consequently prevents OpenVPN from
>> responding to SIGTERM and exiting. This patch adds an additional check of
>> whether the received signal during DNS resolution is SIGTERM or not, and
>> prematurely exits from `link_socket_init_phase2` when receiving a SIGTERM.
>> ---
>>  src/openvpn/socket.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/src/openvpn/socket.c b/src/openvpn/socket.c
>> index 9bcf4d4..5e8abe1 100644
>> --- a/src/openvpn/socket.c
>> +++ b/src/openvpn/socket.c
>> @@ -1905,6 +1905,11 @@ link_socket_init_phase2 (struct link_socket *sock,
>>             }
>>         }
>>
>> +      if (sig_info->signal_received == SIGTERM)
>> +       {
>> +         goto done;
>> +       }
>> +
>>        /* Socket still undefined, give a warning and abort connection */
>>        if (sock->sd == SOCKET_UNDEFINED)
>>         {
>
> On Tue, Apr 12, 2016 at 11:46 AM, Gert Doering <g...@greenie.muc.de> wrote:
>> That is the "on DNS failure, the client loops forever and is not killable"
>> problem, right?
>>
>> (I'm not sure I'm reading the description right, to understand the
>> actual issue this is fixing - but if I'm reading it right, then this
>> makes sense :-)  - what about SIGINT?)
>
>
> On Tue, Apr 12, 2016 at 11:48 AM, Fish Wang <fish.t...@gmail.com> wrote:
>>
>> Right, it's for the "on DNS failure, the client loops forever and is not 
>> killable" problem.
>
>
> On Tue, Apr 12, 2016 at 12:25 PM, Jonathan K. Bullard
> <jkbull...@gmail.com> wrote:
>> Feature ACK; this should make DNS hangs much easier to deal with for
>> GUIs such as Tunnelblick.
>>
>> The "hang forever" problem can be avoided by using --resolve-retry.
>>
>
> On Tue, Apr 12, 2016 at 1:18 PM, Arne Schwabe <a...@rfc2549.org> wrote:
>> Am 12.04.16 um 17:42 schrieb Fish:
>>> In `link_socket_init_phase2`, it used to be the case that any received
>>> signal will be overwritten by SIGUSR1 if the socket cannot be created
>>> (e.g. when DNS resolution fails)
>>
>> That should not happen in the first place, doesn't register_signal keep
>> care of not overwriting higher priority signals?
>
> On Tue, Apr 12, 2016 at 10:07 PM, Fish Wang <fish.t...@gmail.com> wrote:
>>
>> No, the patch won't fix that issue. I may send out another patch later to 
>> fix that problem if I have free time this week.
>
> On Tue, Apr 12, 2016 at 10:11 PM, Fish Wang <fish.t...@gmail.com> wrote:
>> Check out the code in socket.c [1]. This is where the received signal is 
>> overwritten by SIGUSR1.
>>
>>       /* Socket still undefined, give a warning and abort connection */
>>       if (sock->sd == SOCKET_UNDEFINED)
>>         {
>>           msg (M_WARN, "Could not determine IPv4/IPv6 protocol");
>>           sig_info->signal_received = SIGUSR1;
>>           goto done;
>>         }
>> [1] https://github.com/OpenVPN/openvpn/blob/master/src/openvpn/socket.c#L1909
>
> Note that theses lines are immediately below the patch. The last three
> lines of the patch (which are not changed in the patch) are the same
> as the first three lines shown above.

This jogs my memory.

We do have signal overwrite issues in more than a couple of places in
the code and signal handling after getaddrinfo failure does need
fixing. I think this is a good opportunity to discuss whether we
continue tip-toeing around this or take a closer look at signal handling
and make some bolder changes.

When this patch was being discussed 2 years ago I had looked into our
signal handling code. A number of issues and some solutions were
identified but never finished for one reason or other. My stab at
improving signal handling is in this feature branch:
https://github.com/selvanair/openvpn/tree/signals-v5

I'd like to resurrect it but it would be useful to know whether there
is any interest in a revamping the signals code.

For those who may not want look through code based on a 2 years
old branch, with a longish commit _not_ yet split into bite sized pieces,
here is a gist:

1. In many places in the code, signal_received is directly assigned to:
always use register_signal() and/or provide similar abstractions for
setting signals

2. Low priority signals can currently write over high priority ones
like SIGTERM: prioritize signals and to avoid losing signals. Doing
this in a fool-proof way is hard, especially when we have many signal
sources (hard, soft, management).  But moving to POSIX signals would
help (see #3)

3. signal() follows BSD semantics on some systems, system V semantics
on others (including solaris, AIX). On Linux its more complex: signal() syscall
in the kernel apparently follows system V behaviour but probably is
not widely used: instead, glibc  (> version 2) provides its own
implementation with BSD or system V personality based on feature set
macros.
Its much better move to POSIX sigaction() for consistency, for
properly blocking signals in the handler, to mask signals etc. sigaction() is
supported by all unix-like OSes we support.

4. Windows "signals" (SetEvent, ctrl-C, ctrl-break) are not picked up
during pre-init phase or while in openvpn_sleep() unless management
interface is active. Note that on unix sleep() is interrupted by signals
but on windows ctrl-C doesnt interrupt sleep(). Reports like "impossible to
break out of dns resolution loop on Windows when run from command
line" is related to this. Not hard to fix.

That said, getaddrinfo() will continue to block and cannot be
interrupted by signals -- it never returns EINTR but continues trying
until an internal timeout expires. This is known issue and I know of
no fixes. But we can make sure that signal received during
getaddrinfo() is properly acted upon when it returns. Currently we
can't guarantee even that. This is important as that call can block for a
long time (15 seconds?)  and is a very likely place to receive user interrupt.

Finally, I never completed this mainly because the changes are not
"perfect": its only an attempt to tidy-up the existing signal
infrastructure. The use of signals is deeply entrenched in openvpn
code-base and is used for multiple purposes, has multiple origins
(hard, soft, management). And the code is hard to follow. Only
consolation is that there are no multiple threads to worry about.

I definitely do not know how to replace it with something clean. So
comments and suggestions would help.

Thanks,

Selva

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Openvpn-devel mailing list
Openvpn-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openvpn-devel

Reply via email to