On Tue, 2013-07-09 at 22:36 +0200, Stefan (metze) Metzmacher wrote:
> On 09.07.2013 18:03, Stefan (metze) Metzmacher wrote:
> > On 09.07.2013 17:33, Stefan (metze) Metzmacher wrote:
> >> Hi Andrew,
> >>
> >> On 03.07.2013 09:44, Andrew Bartlett wrote:
> >>> On Thu, 2013-06-27 at 11:42 +1000, Andrew Bartlett wrote:
> >>>> On Wed, 2013-06-26 at 20:39 +1000, Andrew Bartlett wrote:
> >>>>> On Mon, 2013-06-24 at 15:26 +0000, philippe.simo...@swisscom.com wrote:
> >>>>>> Hi Andrew, and by putting more num-callers :
> >>>>>>
> >>>>>> valgrind --num-callers=50 samba -i -M single
> >>>>>
> >>>>> Thanks for getting me that. I've managed to reproduce it here, but not
> >>>>> under valgrind, and only when I hack the code to force a timeout. At
> >>>>> least this should help me figure out why we process the winbind socket
> >>>>> close, which is the crux of this issue.
> >>>>
> >>>> I think I've found the cause of the issue you are hitting. There is
> >>>> still another issue with the nested event loop in the krb5 libs, but
> >>>> these two patches should help significantly.
> >>>>
> >>>> As you have had more luck than I in reproducing this in an unaltered
> >>>> setting, please let me know if this helps.
> >>>>
> >>>> Patches are for git master, but may apply to 4.0 as well.
> >>>
> >>> G'Day,
> >>>
> >>> The original reporter has confirmed to me that this removes the segfault
> >>> for him. It changes it to a 105 sec hang (due to the winbind client
> >>> trying for 5 seconds at a time, many times).
> >>>
> >>> Can I get a review on it so we can rid master and eventually 4.0 of this
> >>> nasty crash?
> >>
> >> I've looked through these patches and have some improvements.
> >> The main problem is that we're not sure wbsrv_call_loop() is called again
> >> on the terminated connection when the last pending request is finished.
> >> That's why I remember all broken connections and try to clean them up
> >> before accepting a new connection or processing any new request on any
> >> connection.
> >> This way we're sure the connection gets removed eventually.
> >>
> >> I'm currently running some autobuild with the attached patches,
> >> they might also fix the current flakey crashes, e.g.
> >> https://git.samba.org/autobuild.flakey/2013-07-08-0055/samba.stderr
> >
> > Here's the next try, which hopefully doesn't crash in make test :-)
>
> Ok, it passed 4 times on master and 4 times on v4-0-test,
> if you're ok with it I'll squash my changes, add the missing
> Pair-programmed-with:, Signed-off-by:, Reviewed-by: tags and push it...
>
> Are you fine with that?
Thanks, please do that.

Signed-off-by: Andrew Bartlett <abart...@samba.org>
Reviewed-by: Andrew Bartlett <abart...@samba.org>

Andrew Bartlett

-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org
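
For readers following the thread, the cleanup approach described in the quoted review (remember every broken connection, then try to clean them up before accepting a new connection or processing a new request) might look roughly like the sketch below. This is only an illustration of the general pattern, not the code from the attached patches: struct conn, struct server, mark_broken(), cleanup_broken() and accept_connection() are all hypothetical names, and plain calloc()/free() stand in for whatever memory handling the real winbind server code uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical connection record; not Samba's actual connection struct. */
struct conn {
    int id;
    int pending_requests;   /* requests still in flight on this connection */
    bool broken;            /* set once the connection has been terminated */
    struct conn *next;      /* link in the server's broken list */
};

/* Hypothetical server state that remembers all broken connections. */
struct server {
    struct conn *broken_list;
    int freed;              /* number of connections freed so far */
};

/* Remember a terminated connection instead of freeing it right away. */
void mark_broken(struct server *srv, struct conn *c)
{
    assert(srv != NULL && c != NULL);
    if (c->broken) {
        return;             /* already on the broken list */
    }
    c->broken = true;
    c->next = srv->broken_list;
    srv->broken_list = c;
}

/*
 * Free every broken connection that has no pending requests left.
 * Connections with requests still in flight stay on the list and are
 * retried on the next pass, so they get removed eventually.
 */
void cleanup_broken(struct server *srv)
{
    struct conn **pp = &srv->broken_list;

    while (*pp != NULL) {
        struct conn *c = *pp;
        if (c->pending_requests == 0) {
            *pp = c->next;  /* unlink before freeing */
            free(c);
            srv->freed++;
        } else {
            pp = &c->next;
        }
    }
}

/* Run the cleanup pass before accepting a new connection. */
struct conn *accept_connection(struct server *srv, int id)
{
    struct conn *c;

    cleanup_broken(srv);
    c = calloc(1, sizeof(*c));
    if (c != NULL) {
        c->id = id;
    }
    return c;
}
```

The point of the pattern is that removal no longer depends on a particular per-connection callback running again after the last pending request finishes: any later accept (or, in the same way, any later request on any connection) triggers the cleanup pass.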