Hi,

Here's the summary of the IRC meeting.

---

COMMUNITY MEETING

Place: #openvpn-meeting on libera.chat
Date: Wed 9th March 2022
Time: 10:30 CET (9:30 UTC)

Planned meeting topics for this meeting were here:

<https://community.openvpn.net/openvpn/wiki/Topics-2022-03-09>

Your local meeting time is easy to check from services such as

<http://www.timeanddate.com/worldclock>

SUMMARY

cron2, dazo, d12fk, mattock, MaxF, novaflash, plaisthos participated in this meeting.

---

Talked about the upcoming OpenSSL release (with security fixes) and agreed that'd be a good time to do an OpenVPN 2.5 release as well.

---

Talked about the multi-defer auth loophole. We have patches that fix the immediate problem, but "fix for good" needs a more massive rewrite of the deferred auth handling.

Agreed to fix the loophole now and do the proper fix later.

--

Full chatlog attached
(11.39.44) cron2: ah
(11.40.10) cron2: so, let's start
(11.40.53) cron2: a new openssl release is coming next tuesday (3.0.2 and 
1.1.1*) with a security thing.  So that would be a good time to do the upcoming 
2.5.x release
(11.41.38) mattock: +1
(11.41.39) dazo: +1 ... do we get the multi-defer auth stuff part of this?
(11.42.13) plaisthos: severity ofn the openssl release?
(11.42.43) cron2: dazo: we should - it's waiting for you
(11.44.08) cron2: there was a discussion on details and how to further improve 
the plugin for automated testing, and then it basically ended... but we have 
ACKs on the general patch ("it will fix the problem"), and we seem to agree 
that further improvement could be done (like, remove defer/simple.c)
(11.44.22) dazo: cron2: I might be at loss ... ordex acked it ... you had some 
concerns about the fatal behaviour, but that's what we agreed on was the right 
solution. For testing, lets not test that now in automated testing until we've 
done refinements
(11.45.09) dazo: kicking out defer/simple.c should not block this change.  
That's just another patch removing it, right?
(11.46.32) cron2: I'd remove it right away (so git can see "it got moved and 
then improved") but we can do this in two steps
(11.49.41) dazo: Okay, I see in the mail dialoge there were some complaints 
about NULL==var vs var==NULL ... I can fix those, rename multi-auth.c to 
defer.c and update testing doc?  Is that a plan to move forward?  I'll have 
that done today
(11.49.58) cron2: anyway, the other thing (ending openvpn instead of just 
returning AUTH FAILED) - I missed that part of the discussion, it seems.  Why 
end the process, instead of just fail auth (with a clear message)?
(11.50.09) cron2: the latter would enable automated testing
(11.50.25) cron2: which is something I do a lot, and it finds a lot of corner 
cases :-)
(11.51.02) dazo: Basically, because the behaviour was broken ... and to really 
lead admins into the right direction of trying to understand why auth failed.  
They would have to look at the log files for openvpn far more closely ("Why 
does openvpn stop running?")
(11.51.47) cron2: nah, not "rename multi-auth.c to defer.c" - the new name is 
good, but if you remove the old source in the same go, git will understand "ah, 
file has moved, and then was improved"
(11.51.52) dazo: and also to ensure we don't end up in other corner cases with 
various states which could lead to another circumvention of auth
(11.51.58) plaisthos: dazo: that feels like a very weak argument
(11.52.19) cron2: I think "the server always refuses login" is about as clear 
as "the server dies" - you need to look into the log
(11.52.21) plaisthos: the first part
(11.52.31) dazo: plaisthos: that was one of the arguments brought up, I don't 
recall from whom ... but it was to force admins to look into what's going on
(11.55.21) cron2: if the server always refuses login, don't you think the admin 
will look into it as well?
(11.56.10) dazo: many admins will start saying "have you reset your OTP tokens, 
have done this have you done that" to their users  .... and then they might 
start looking into logs.  that's my experience with a bit too many "admins"
(11.56.18) dazo: blame the user first
(11.57.34) dazo: And by stopping openvpn on an illconfigured authentication 
setup, from a security point of view, is safer than just kicking rejects.  It 
really ensures the misconfiguration is getting fixed properly.
(11.59.27) dazo: the only argument of not stopping, is automated testing ... 
which is understandable, but perhaps the test framework needs to be able to 
handle crashes of the processes as well?  Isn't that going to improve the 
testing possibilities better?
(12.00.05) cron2: given that systemd will restart openvpn right away, I wonder 
if that is really very useful - it will hide the error behind new startup 
messages
(12.00.40) dazo: systemd will not restart under any circumstances, only some 
.... but let me check what happens with our current systemd unit
(12.00.47) cron2: under normal conditions, openvpn on the server *never* exits, 
so if it does, it hints at "plaistos messed up something" :-)
(12.01.02) cron2: so that would be a severe bug, not something I want to add to 
regular testing
(12.07.48) d12fk: what about temporary auth failures?
(12.10.24) cron2: d12fk: not sure I understand your question
(12.13.54) d12fk: like when auth fails for other reasons than wrong credentials 
e.g. short network outage, lost packet, etc
(12.14.03) d12fk: will the process die then as well?
(12.14.09) cron2: never
(12.14.20) cron2: the openvpn process *never* ends itself, except on out of 
memory
(12.14.37) dazo: I've just done quick testing on the exit code when this 
happens, which is 1 ... systemd unit file has Restart=on-failure, which will 
trigger restart if exit code is not 0
(12.15.25) cron2: (some of the frame patches managed to introduce SIGSEGV "in 
some corner cases", which crashed the server and was a very obvious "something 
really bad is happening" signal)
(12.15.47) d12fk: cron2: so I have read this wrong then, that we'll start 
exiting on auth failed?
(12.16.17) cron2: d12fk: with the proposed patch, we exit if the auth handler 
notices that there are 2(+) active plugins that use deferred authentication
(12.16.38) cron2: we know that we can not handle this, so the quick "fix" is 
"detect that situation, and <dosomething>"
(12.16.50) cron2: I think "dosomething" could be AUTH_FAIL + a good log message
(12.16.56) cron2: dazo's patch is "abort server!"
(12.17.35) d12fk: ah sorry, got it wrong then
(12.23.29) mattock: running out of time - how would we condense the above 
discussion?
(12.23.53) d12fk: so, do we have a plan on how to fix this for good? if we do, 
I think the discussion is moot, as there will be follow up soonish anyways
(12.24.34) dazo: we don't have a plan for "fix this for good" ... but we need 
to close this loophole soonish.
(12.25.00) cron2: "fix for good" need a more massive rewrite of the deferred 
auth handling
(12.25.02) dazo: a proper fix will require a more intrusive re-factoring of the 
plugin.c code
(12.29.33) d12fk: hm, not sure what the current "fix" adds for ppl who run a 
server, their fix needs to be, don't run 2+ auth plugins, but they had a reason 
to have them in the first place
(12.30.21) cron2: 2+ auth plugins is fine
(12.30.34) cron2: 1 with deferred auth, 1+ with non-deferred is fine
(12.30.34) d12fk: not doing deferred auth is probably also no solution if one 
has a sufficiently large user base
(12.31.25) cron2: 2+ deferred means "depending on timing, if one plugin returns 
SUCCESS and the other returns FAIL, OpenVPN might let the user in, even though 
it *should* do FAIL"
(12.32.02) d12fk: but yeah, as this is just a quick hack, I think we need to 
work on the real solution instead of sugar coating this one
(12.32.21) cron2: +1
(12.32.37) dazo: yes, but we want to close this loophole asap first
(12.33.08) cron2: but now we want/need to make this clear "this is not a secure 
setup, it has a realistic chance of letting in unauthenticated users"
(12.33.28) dazo: ^^^that
(12.33.41) d12fk: agreed, lets get the code in even if it has smells, then 
revert and built the real thing
(12.34.26) mattock: +1
(12.34.34) dazo: the thing is .... deferred auth was not properly designed for 
use-cases with multiple auth plugins to start with
(12.36.28) d12fk: hence the dending re-design
(12.36.35) d12fk: *pending
(12.36.43) mattock: anything else?
(12.37.10) cron2: we still have no formal agreement on "end process" or "log 
warning, and return AUTH_FAIL"
(12.37.57) d12fk: my take is: it doesn't matter
(12.37.59) novaflash: why not take a vote
(12.38.16) cron2: d12fk: it does, as it makes automated testing much harder for 
me
(12.39.20) novaflash: then i guess the method that doesn't make it harder for 
you is the logical thing to do?
(12.40.28) d12fk: there are options: 1) don't test multiple deferred auth while 
it's worked around 2) don't exit, but retry instead
(12.40.56) d12fk: either one works for me, as it's only temporary
(12.41.05) cron2: d12fk: of course I can do "do not test things", that will 
make my life *much* easier
(12.41.26) d12fk: not *all the things* =)
(12.41.37) cron2: that is a nice suggestion actually :-)
(12.41.40) dazo: d12fk++
(12.42.20) cron2: but really - the point is, dazo wrote a plugin to enable 
testing this.  So it makes sense to test the case, to ensure we won't regress.  
But if the server dies, it's much more cumbersome (as in "manual work") to test
(12.43.10) dazo: lets fix that later on ... before this new plug-in we didn't 
test this case, so it's not a regression
(12.44.16) d12fk: what timeframe are we aiming at for the final fix to this 
issue?
(12.45.11) cron2: "we did not test this before" is not a very good argument
(12.45.19) cron2: but anyway, I give up
(12.45.23) cron2: too much time wasted on this already
(12.45.39) cron2: I maintain my stance "the server must never exit", but if 
dazo insists, I will just not test it
(12.49.27) d12fk: so, to put it another way: who can start to look into a fix 
that fixes the issue?
(12.51.14) cron2: plaisthos understands the whole "multiple authentication 
paths, some of them deferred, some not" best
(12.51.26) cron2: there's, I think, an easy way, and a hard way
(12.51.37) cron2: the easy way would be to sequentialize plugins
(12.51.53) cron2: so, if you have multiple plugins, and one of them defers, do 
not progress plugins until that one is done
(12.52.08) cron2: (so, 3 deferred plugins would run sequentially, each of them 
deferred)
(12.52.25) cron2: the hard way would be "make plugins fully independent, run 
multiple deferred plugins in parallel"
(12.53.00) cron2: (we do parallel deferred for plugin/script/management today, 
but keep track with 3 sets of state variable/files, and not "one per plugin")
(12.53.13) d12fk: sounds like the hard way is the right way [tm]
(12.53.35) plaisthos: yeah
(12.53.43) ***cron2 shrugs
(12.53.53) plaisthos: basically the code is really suited currently to handle 
multiple deferred plugins
(12.54.05) cron2: "not"
(12.54.11) plaisthos: yes
(12.54.15) cron2: :-)
(12.54.22) plaisthos: I am not going to going down the rabbit hole of 
implementing that
(12.54.53) plaisthos: that would require major refactoring and should then also 
make script/management just use the same mechanism
(12.54.55) cron2: this is why "sequentialize plugins" should be fairly easy 
(one auth control file, state variable, etc) while "all in parallel" means 
"change all that auth stuff to be fully table driven"
(12.55.00) cron2: yes
(12.55.21) plaisthos: even the sequentialise plugins is harder than it 
initially sounds 
(12.55.27) cron2: "should", "fairly" and "easy" in openvpn terms :-)
(12.56.33) cron2: damn, I have no time... but you nerdsniped me... I'll have a 
look ;-)
(12.56.55) cron2: mattock: can we do 2.5.x release next wednesday?
(12.56.56) d12fk: heh
(12.57.00) cron2: and 2.4.x release?
(13.01.22) mattock: that should be doable
(13.02.13) plaisthos: I am through the dns patch and have some comments I will 
send to the mailing list in a few minutes
(13.02.33) d12fk: cool
_______________________________________________
Openvpn-devel mailing list
Openvpn-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openvpn-devel

Reply via email to