Re: crash

2020-04-14 Thread Udo van den Heuvel via devel
Hal, On 14-04-2020 05:07, Hal Murray wrote: > I just pushed a fix. Please test. With this fix the ntpd appears to be running a few hours now without issue. Udo ___ devel mailing list devel@ntpsec.org http://lists.ntpsec.org/mailman/listinfo/devel

Re: crash

2020-04-13 Thread Hal Murray via devel
> -rw------- 1 root root 1708 Dec 13 11:05 ./keys/_key-certbot.pem > Anything wrong in here? Your configure line includes early-droproot. Your command line includes -u ntp:ntp. With that combination, it's probably trying to read the key after switching to user ntp. -- These are my opin
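The diagnosis above (a root-only key read after the switch to user ntp) can be checked mechanically. What follows is only a minimal sketch, assuming classic Unix owner/group/other permission bits and ignoring ACLs and capabilities. The function name `can_read` and the numeric ids for the ntp user are invented for illustration; none of this is ntpd code.

```python
import stat

def can_read(mode, file_uid, file_gid, uid, gids):
    """Return True if a process with `uid` and group set `gids` may read a
    file with the given st_mode and ownership (classic Unix rules, no ACLs)."""
    if uid == 0:
        return True  # root bypasses the permission bits
    if uid == file_uid:
        return bool(mode & stat.S_IRUSR)
    if file_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)

# The key in the listing is owned by root:root and readable by its owner only
# (assumed here to be mode 0600).
KEY_MODE = stat.S_IFREG | 0o600

# Hypothetical ids for the ntp user; real values come from pwd.getpwnam("ntp").
NTP_UID, NTP_GID = 38, 38

print(can_read(KEY_MODE, 0, 0, 0, {0}))              # while still root
print(can_read(KEY_MODE, 0, 0, NTP_UID, {NTP_GID}))  # after switching to ntp
```

If the second check fails on a real system, the usual options are to read the key before dropping root or to make the key readable by the ntp group.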

Re: crash

2020-04-13 Thread Udo van den Heuvel via devel
On 14-04-2020 07:22, Hal Murray wrote: > Given that you have tested most of the rest of your ntp.conf, my guess would > be file permissions on the certificate or key. The key is most likely since > there is no reason to hide the certificate. # cd /etc/letsencrypt/ # find . -exec ls -ld {} \; dr

Re: crash

2020-04-13 Thread Hal Murray via devel
udo...@xs4all.nl said: >> If you want the server side to support NTS, you need to add "nts enable" > With that in ntp.conf the ntpd does not start. Config needed I guess. The log file should have a useful message. It may take more than a few seconds to find due to all the cruft that is useful

Re: crash

2020-04-13 Thread Udo van den Heuvel via devel
On 14-04-2020 05:07, Hal Murray wrote: > >> # grep nts /etc/ntp.conf >> nts key /etc/letsencrypt/keys/_key-certbot.pem >> nts cert /etc/letsencrypt/csr/_csr-certbot.pem >> server time.cloudflare.com:1234 nts # TLS1.3 only > ... > > Thanks. > > I just pushed a fix. Please test. Will do

Re: crash

2020-04-13 Thread Hal Murray via devel
> # grep nts /etc/ntp.conf > nts key /etc/letsencrypt/keys/_key-certbot.pem > nts cert /etc/letsencrypt/csr/_csr-certbot.pem > server time.cloudflare.com:1234 nts # TLS1.3 only ... Thanks. I just pushed a fix. Please test. If you want the server side to support NTS, you need to add "

Re: crash

2020-04-13 Thread Udo van den Heuvel via devel
On 13-04-2020 20:18, Hal Murray wrote: > It's dying while trying to reload the certificate file. > > Is that happening after running for an hour? Yes. > > That turns into 2 questions. Why is it trying to reload the certificates, > and > why is it crashing? > > What's in your ntp.conf? I do

Re: crash

2020-04-13 Thread Udo van den Heuvel via devel
On 13-04-2020 19:39, Hal Murray wrote: >> Or will I do the debug build? > > Please do it again with symbols. > > How long does it run before it crashes? Seconds? Hours? ... (gdb) bt #0 use_certificate_chain_file (ctx=ctx@entry=0x0, ssl=ssl@entry=0x0, file=file@entry=0x555f9640 "/etc/let

Re: crash

2020-04-13 Thread Hal Murray via devel
I think I've found a way for that to happen. Were you missing an "nts enable" in your config file, but did have an "nts cert ..." pointing to a valid file? -- These are my opinions. I hate spam.
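The combination asked about here ("nts cert" or "nts key" present, "nts enable" absent) is easy to spot before starting the daemon. This is a rough sketch, not ntpd's config parser: it assumes one directive per line with `#` comments and only looks at the three directives named in this thread. The function name `nts_config_problems` is made up.

```python
def nts_config_problems(conf_text):
    """Return warnings for ntp.conf text that names NTS server files
    ("nts cert", "nts key") without an "nts enable" line."""
    has_enable = False
    file_directives = []
    for raw in conf_text.splitlines():
        words = raw.split("#", 1)[0].split()  # drop comments, split on spaces
        if len(words) >= 2 and words[0] == "nts":
            if words[1] == "enable":
                has_enable = True
            elif words[1] in ("cert", "key"):
                file_directives.append(words[1])
    if file_directives and not has_enable:
        return ['"nts %s" given without "nts enable"' % d for d in file_directives]
    return []

# The nts lines quoted elsewhere in this thread.
SAMPLE = """\
nts key /etc/letsencrypt/keys/_key-certbot.pem
nts cert /etc/letsencrypt/csr/_csr-certbot.pem
server time.cloudflare.com:1234 nts # TLS1.3 only
"""

for warning in nts_config_problems(SAMPLE):
    print(warning)
```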

Re: crash

2020-04-13 Thread Hal Murray via devel
Thanks. It's dying while trying to reload the certificate file. Is that happening after running for an hour? That turns into 2 questions. Why is it trying to reload the certificates, and why is it crashing? What's in your ntp.conf? I don't need the whole thing, just the lines with "nts". Di

Re: crash

2020-04-13 Thread Hal Murray via devel
> Or will I do the debug build? Please do it again with symbols. How long does it run before it crashes? Seconds? Hours? ...

Re: crash

2020-04-13 Thread Udo van den Heuvel via devel
On 13-04-2020 16:01, Hal Murray wrote: > > udo...@xs4all.nl said: >> Started things this way. One gdb line worries me a bit: (No debugging symbols >> found in build/main/ntpd/ntpd) > >> Perhaps a different build is needed? > > I'm not sure how that stuff works. > > configure has an --enable-de

Re: crash

2020-04-13 Thread folkert via devel
> > udo...@xs4all.nl said: > >> Started things this way. One gdb line worries me a bit: (No debugging > >> symbols > >> found in build/main/ntpd/ntpd) > > > >> Perhaps a different build is needed? > > > > I'm not sure how that stuff works. > > > > configure has an --enable-debug-gdb option. T

Re: crash

2020-04-13 Thread Udo van den Heuvel via devel
On 13-04-2020 16:01, Hal Murray wrote: > > udo...@xs4all.nl said: >> Started things this way. One gdb line worries me a bit: (No debugging symbols >> found in build/main/ntpd/ntpd) > >> Perhaps a different build is needed? > > I'm not sure how that stuff works. > > configure has an --enable-de

Re: crash

2020-04-13 Thread Hal Murray via devel
udo...@xs4all.nl said: > I could disable NTSc for now to avoid crashes. Or if you have a patch I can > test with that one? Changing that may break (fix?) the crash. I'd like to understand that before we change anything else. Fixing Cloudflare will break all other NTS servers

Re: crash

2020-04-13 Thread Udo van den Heuvel via devel
On 13-04-2020 14:48, Hal Murray wrote: >> Apr 13 06:10:27 doos ntpd[204063]: EX-REP: Count=1 Print=1, Score=0.500, M4 >> V4 from [2606:4700:f1::1]:123, lng=84 > > That's saying the NTS stuff isn't working. 2606:4700:f1::1 is Cloudflare. > They have updated their servers to use the latest tweak

Re: crash

2020-04-13 Thread Hal Murray via devel
udo...@xs4all.nl said: > Started things this way. One gdb line worries me a bit: (No debugging symbols > found in build/main/ntpd/ntpd) > Perhaps a different build is needed? I'm not sure how that stuff works. configure has an --enable-debug-gdb option. That may do it. -- These are my op

Re: crash

2020-04-13 Thread Udo van den Heuvel via devel
On 13-04-2020 15:23, Hal Murray wrote: > when it crashes, you should get back to gdb > then > bt should give you a stack trace Started things this way. One gdb line worries me a bit: (No debugging symbols found in build/main/ntpd/ntpd) Perhaps a different build is needed? Udo

Re: crash

2020-04-13 Thread Hal Murray via devel
udo...@xs4all.nl said: > I did not find a core dump. How else can I get a stack dump?
Use gdb. You need to add -n to the command line args or ntpd will detach itself.
  cd build dir
  gdb build/main/ntpd/ntpd
  run -n

Re: crash

2020-04-13 Thread Udo van den Heuvel via devel
On 13-04-2020 14:48, Hal Murray wrote: > Can you get a stack trace? I did not find a core dump. How else can I get a stack dump? > What were your configure options? CFLAGS="-O2" %{__python3} ./waf configure \ --prefix=/usr\ --enable-early-droproot\

Re: crash

2020-04-13 Thread Hal Murray via devel
> Apr 13 07:10:23 doos kernel: ntpd[204063]: segfault at 17f8 ip > 7f9d70252a70 sp 7ffe3665adc0 error 4 in libssl.so.1.1.1d[7f9d7022e000+ > 5] Can you get a stack trace? What were your configure options? > Apr 13 06:10:27 doos ntpd[204063]: EX-REP: Count=1 Print=1, Score=0.500, M

Re: crash

2020-04-13 Thread Udo van den Heuvel via devel
On 13-04-2020 14:13, Udo van den Heuvel via devel wrote: > All, > > This happens since yesterday: This is with a fairly recent 1.1.8 git build. Fedora is up to date. Udo

crash

2020-04-13 Thread Udo van den Heuvel via devel
All, This happens since yesterday:
Apr 13 06:10:23 doos ntpd[204062]: INIT: ntpd ntpsec-1.1.8 2019-08-02T00:00:00Z: Starting
Apr 13 06:10:23 doos ntpd[204062]: INIT: Command line: /usr/sbin/ntpd -u ntp:ntp -g -N -p /var/run/ntpd.pid
Apr 13 06:10:23 doos ntpd[204063]: INIT: precision = 1.397 usec

Re: Usefuleness of noval (Was: Re: NTS crash...)

2019-03-27 Thread Gary E. Miller via devel
Yo Richard! On Wed, 27 Mar 2019 16:23:19 -0500 Richard Laager via devel wrote: > On 3/26/19 4:27 PM, Gary E. Miller via devel wrote: > > I added noval, still can not connect: > > > > server 204.17.205.23 maxpoll 5 nts noval # pi3 > > I wonder if we should revisit "noval". I think I originall

Usefuleness of noval (Was: Re: NTS crash...)

2019-03-27 Thread Richard Laager via devel
On 3/26/19 4:27 PM, Gary E. Miller via devel wrote: > I added noval, still can not connect: > > server 204.17.205.23 maxpoll 5 nts noval # pi3 I wonder if we should revisit "noval". I think I originally argued in favor of having it, as a standard TLS client knob. But IIRC, Daniel suggested it was

Re: NTS crash...

2019-03-26 Thread Gary E. Miller via devel
Yo Hal! On Tue, 26 Mar 2019 14:49:52 -0700 Hal Murray via devel wrote: > > Now it does not crash, anyway to make it work? > > I need to use some IPs for private, offgrid, networking. > > I use /etc/hosts, so that hasn't been a problem for me. Now I have two files to

Re: NTS crash...

2019-03-26 Thread Hal Murray via devel
> Now it does not crash, anyway to make it work? > I need to use some IPs for private, offgrid, networking. I use /etc/hosts, so that hasn't been a problem for me. > If you only need the name for the cert, and you are not checking the cert, it > should work. Yes, but I need

Re: NTS crash...

2019-03-26 Thread Gary E. Miller via devel
> But not my fast RasPi: > > > server 204.17.205.23 maxpoll 5 nts # pi3 > > That sure looks like an IP Address to me. Yup, I guess not enough coffee yet. Now it does not crash, anyway to make it work? I need to use some IPs for private, offgrid, networking. I added noval, sti

Re: NTS crash...

2019-03-26 Thread Hal Murray via devel
>> Are you trying to use NTS on an IP Address? Known bug. [That >> "(null)" happens on that case.] > Nope. Here is the line from ntp.conf that crashes my slow RasPi. But not my > fast RasPi: > server 204.17.205.23 maxpoll 5 nts # pi3 That sure looks like an IP Address to me. -- These a
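Since NTS on a bare IP address is described here as a known bug, a quick scan of ntp.conf for such lines may save a debugging round. The sketch below is an illustration under assumptions: it treats the second word of a `server` line as the host, does not handle `host:port` with an IP literal, and is not how ntpd itself parses the file. Both function names are invented.

```python
import ipaddress

def is_ip_literal(host):
    """True if `host` is an IPv4/IPv6 literal rather than a DNS name."""
    try:
        ipaddress.ip_address(host.strip("[]"))  # allow bracketed IPv6
        return True
    except ValueError:
        return False

def nts_servers_by_address(conf_text):
    """Return hosts from 'server <host> ... nts' lines that are IP literals."""
    hits = []
    for raw in conf_text.splitlines():
        words = raw.split("#", 1)[0].split()  # drop trailing comments
        if len(words) >= 3 and words[0] == "server" and "nts" in words[2:]:
            if is_ip_literal(words[1]):
                hits.append(words[1])
    return hits

# The line quoted above, plus a name-based server from another thread.
SAMPLE = """\
server 204.17.205.23 maxpoll 5 nts # pi3
server time.cloudflare.com:1234 nts # TLS1.3 only
"""

print(nts_servers_by_address(SAMPLE))
```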

Re: NTS crash...

2019-03-26 Thread Gary E. Miller via devel
Yo Hal! On Tue, 26 Mar 2019 13:44:41 -0700 Hal Murray wrote: > > Always fails for me. On several different RasPi. > > 2019-03-26T13:35:09 ntpd[26050]: DNS: dns_probe: (null), > cast_flags:1, flags:21001 > > Are you trying to use NTS on an IP Address? Known bug. [That > "(null)" happens

Re: NTS crash...

2019-03-26 Thread Hal Murray via devel
> Always fails for me. On several different RasPi. 2019-03-26T13:35:09 ntpd[26050]: DNS: dns_probe: (null), cast_flags:1, flags:21001 Are you trying to use NTS on an IP Address? Known bug. [That "(null)" happens on that case.] I thought I mentioned that case before but I guess I wasn't lou

Re: NTS crash...

2019-03-26 Thread Gary E. Miller via devel
Yo Hal! On Tue, 26 Mar 2019 13:29:45 -0700 Hal Murray via devel wrote: > > I applied today's NTPsec git head, with the mysyslog patches. > > Older and slower RasPi still crash on startup if they try to be an > > NTS client. > > Works for me. It's old e

NTS crash...

2019-03-26 Thread Hal Murray via devel
> I applied today's NTPsec git head, with the mysyslog patches. Older and > slower RasPi still crash on startup if they try to be an NTS client. Works for me. It's old enough that it has only 2 USB ports. -- These are my opinio

✘NTS crash...

2019-03-26 Thread Gary E. Miller via devel
Yo All! I applied today's NTPsec git head, with the mysyslog patches. Older and slower RasPi still crash on startup if they try to be an NTS client. RGDS GARY --- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E,

Re: Dave Morgan's report on the mystery crash

2017-09-06 Thread Eric S. Raymond via devel
Dave Morgan : > All, > I am at work at moment. If logs still needed I will send in about 10 > hours when back home. Thanks, we've found and fixed the problem. -- Eric S. Raymond, http://www.catb.org/~esr/ Please consider contributing to my Patreon page at https://www.patreon.com

Re: Dave Morgan's report on the mystery crash

2017-09-06 Thread Dave Morgan via devel
All, I am at work at moment. If logs still needed I will send in about 10 hours when back home. Dave On 05/09/2017, Eric S. Raymond via devel wrote: > Dave Morgan sent me a report on two instances of the mystery crash > that happened to him last week (he also said the installation ha

Dave Morgan's report on the mystery crash

2017-09-05 Thread Eric S. Raymond via devel
Dave Morgan sent me a report on two instances of the mystery crash that happened to him last week (he also said the installation had been stable since). Alas, I somehow fat-fingered my copy of that mail. Dave, please repost to the list so we can all stare at your logs and config

All hands alert - crash of unknown origin

2017-09-05 Thread Eric S. Raymond via devel
Everyone should read this thread: https://gitlab.com/NTPsec/ntpsec/issues/375 The only empirical clue we have is that it only seems to manifest under the kind of high load characteristic of pool service. I have a suspicion that something is causing memory usage to spike and the OOM killer is rea

Re: ✘pyntpq crash

2016-10-27 Thread Eric S. Raymond
Gary E. Miller :
> Yo Eric!
>
> Whoops:
>
> # ntpq/pyntpq -p
> Traceback (most recent call last):
>   File "ntpq/pyntpq", line 1441, in <module>
>     interpreter.onecmd(cmd)
>   File "/usr/lib64/python2.7/cmd.py", line 221, in onecmd
>     return func(arg)
>   File "ntpq/pyntpq", line 1051, in do_peers

✘pyntpq crash

2016-10-27 Thread Gary E. Miller
Yo Eric! Whoops:
# ntpq/pyntpq -p
Traceback (most recent call last):
  File "ntpq/pyntpq", line 1441, in <module>
    interpreter.onecmd(cmd)
  File "/usr/lib64/python2.7/cmd.py", line 221, in onecmd
    return func(arg)
  File "ntpq/pyntpq", line 1051, in do_peers
    self.__dopeers(showall=False, mode

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-08-29 Thread Eric S. Raymond
Processing old mail... Hal Murray : > > I believe you're right that these platforms don't have it. The question is, > > how important is that fact? Is the performance hit from synchronous DNS > > really a showstopper? I don't know the answer. > > There are two cases I know of where ntpd does

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-29 Thread Matthew Selsky
On Tue, Jun 28, 2016 at 11:39:16PM -0700, Hal Murray wrote: > > matthew.sel...@twosigma.com said: > > "rlimit memlock 0" using Classic causes ntpd to die after 3 minutes with > > this error 2016-06-29T00:13:21.903+00:00 host.example.com ntpd[27206]: > > libgcc_s.so.1 must be installed for pthread

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-28 Thread Hal Murray
matthew.sel...@twosigma.com said: > "rlimit memlock 0" using Classic causes ntpd to die after 3 minutes with > this error 2016-06-29T00:13:21.903+00:00 host.example.com ntpd[27206]: > libgcc_s.so.1 must be installed for pthread_cancel to work What version of Classic are you running? I though t

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-28 Thread Matthew Selsky
On Tue, Jun 28, 2016 at 07:26:39PM -0400, Eric S. Raymond wrote: > Hal Murray : > > I think you have extrapolated from some modern systems to our whole target > > environment. I don't remember any discussion supporting memlock not being > > interesting/important. > > There were actually two thr

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-28 Thread Gary E. Miller
Yo Eric! On Tue, 28 Jun 2016 19:47:14 -0400 "Eric S. Raymond" wrote: > Gary E. Miller : > > Yo Eric! > > > > On Tue, 28 Jun 2016 19:26:39 -0400 > > "Eric S. Raymond" wrote: > > > > > (You should camp on #ntpsec. Also join our Signal channel - > > > because that's secured, most of the vuln

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-28 Thread Eric S. Raymond
Gary E. Miller : > Yo Eric! > > On Tue, 28 Jun 2016 19:26:39 -0400 > "Eric S. Raymond" wrote: > > > (You should camp on #ntpsec. Also join our Signal channel - because > > that's secured, most of the vuln discussions happen there.) > > Ah, how do we join the Signal channel? Install Signal on

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-28 Thread Gary E. Miller
Yo Eric! On Tue, 28 Jun 2016 19:26:39 -0400 "Eric S. Raymond" wrote: > (You should camp on #ntpsec. Also join our Signal channel - because > that's secured, most of the vuln discussions happen there.) Ah, how do we join the Signal channel? RGDS GARY --

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-28 Thread Eric S. Raymond
Hal Murray : > I think you have extrapolated from some modern systems to our whole target > environment. I don't remember any discussion supporting memlock not being > interesting/important. There were actually two threads about this attached to memlock-related bug reports in Classic. They ini

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-28 Thread Hal Murray
e...@thyrsus.com said: > After discussion with Daniel about the performance and security issues I > deleted the memlock code. As the comment explains: I think changes like that are worthy of a general announcement. > on modern systems, which swap so seldom > that many people don't bother wit

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-27 Thread Hal Murray
cbwie...@gmail.com said: > I was thinking of setting up associations using the DNS lookup code. If the > mechanism for adding new pool servers was blocking on the DNS call but > asynchronous to the rest of the daemon, I was figuring to call the lookup > with the name provided by the server direct

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-27 Thread Clark B. Wierda
On Mon, Jun 27, 2016 at 3:47 PM, Hal Murray wrote: > > cbwie...@gmail.com said: > > How are pool entries added when the service decides it needs more? > > There is some background stuff that roughly says "need more?", and if so > fires off the DNS lookup. > > > > Would it be possible to leverage

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-27 Thread Hal Murray
cbwie...@gmail.com said: > How are pool entries added when the service decides it needs more? There is some background stuff that roughly says "need more?", and if so fires off the DNS lookup. > Would it be possible to leverage this code for adding all servers specified > by name? Probably n

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-27 Thread Clark B. Wierda
A question: How are pool entries added when the service decides it needs more? Would it be possible to leverage this code for adding all servers specified by name? The DNS cost would be the same. The only difference is the name used for the query. Once a server is associated, the IP is used.

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Eric S. Raymond
Hal Murray : > > e...@thyrsus.com said: > > Ugh. Our options have just narrowed. I've just seen > > libgcc_s.so.1 must be installed for pthread_cancel to work Aborted (core > > dumped) > > > with memlock off in the build. > > Can you reproduce it? > > My guess is that you didn't really get me

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Hal Murray
e...@thyrsus.com said: > Ugh. Our options have just narrowed. I've just seen > libgcc_s.so.1 must be installed for pthread_cancel to work Aborted (core > dumped) > with memlock off in the build. Can you reproduce it? My guess is that you didn't really get memlock turned off. How about puttin

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Hal Murray
Possible crazy idea... How about we never kill the DNS helper thread. Just let it sit there in case it gets more work to do. The only cost is a bit of memory. Or maybe only do that if we are locking stuff into memory. -- These are my opinions. I hate spam.
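The idea floated here, a helper thread that is started once and never killed, can be illustrated outside ntpd. The following is a sketch in Python rather than ntpd's C, with made-up names (`start_dns_helper`, `fake_resolve`); it shows only the pattern of an idle thread blocking on a queue, so no thread teardown (and hence no pthread_cancel) is ever needed. A stub resolver stands in for the real lookup so the example does not touch the network.

```python
import queue
import socket
import threading

def start_dns_helper(resolve=socket.getaddrinfo):
    """Start one long-lived helper thread; return (requests, results) queues."""
    requests, results = queue.Queue(), queue.Queue()

    def worker():
        while True:
            name = requests.get()  # blocks while idle; the thread never exits
            try:
                results.put((name, resolve(name, 123)))
            except OSError as err:  # socket.gaierror is a subclass of OSError
                results.put((name, err))

    # daemon=True only so this demo process can exit; the helper itself
    # is never cancelled or joined.
    threading.Thread(target=worker, daemon=True).start()
    return requests, results

def fake_resolve(name, port):
    """Stand-in resolver for the demo; returns a fixed documentation address."""
    return [("192.0.2.1", port)]

requests, results = start_dns_helper(resolve=fake_resolve)
for host in ("a.example", "b.example"):  # the same thread serves both lookups
    requests.put(host)
    print(results.get(timeout=5))
```

The cost is the one named in the message: a thread's worth of memory that stays allocated while the helper sits idle.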

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Hal Murray
e...@thyrsus.com said: >> We could try simplifying things to only supporting lock-everything-I-need >> rather than specifying how much. There might be a slippery slope if >> something like a thread stack needs a sane size specified. > I'm not intimate with mlockall, but it looks like it works

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Eric S. Raymond
Hal Murray : > If it uses threads, we still have the problem of not being able to load the > thread cleanup code. Maybe. We don't know if the libc implementation is vulnerable to that bug or not. I should do an experimental implementation on a branch and find out. -- http://www

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Hal Murray
e...@thyrsus.com said: >> Is getaddrinfo_a() in RTEMS? QNX? BSD? > It's not an OS thing, it's a toolchain thing. getaddrinfo_a() is > implemented using standard C and POSIX threads, it doesn't need OS-specific > support. Or it's in an optional extra library. > Linux has it because Linux uses

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Eric S. Raymond
Mark Atwood : > Is getaddrinfo_a() in RTEMS? QNX? BSD? It's not an OS thing, it's a toolchain thing. getaddrinfo_a() is implemented using standard C and POSIX threads, it doesn't need OS-specific support. Linux has it because Linux uses libc whether you're compiling with gcc or clang. Any of

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Mark Atwood
Is getaddrinfo_a() in RTEMS? QNX? BSD? On Sun, Jun 26, 2016 at 7:06 AM Eric S. Raymond wrote: > Eric S. Raymond : > > > What would you do if we discovered a case where we wanted it? > > > > Cry a lot. Then add logic to force synchronous DNS when memlocking is > > selected, and document this

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Eric S. Raymond
Eric S. Raymond : > > What would you do if we discovered a case where we wanted it? > > Cry a lot. Then add logic to force synchronous DNS when memlocking is > selected, and document this as a workaround for a bug we haven't fixed yet. Ugh. Our options have just narrowed. I've just seen libgc

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Eric S. Raymond
Hal Murray : > > e...@thyrsus.com said: > > In this case, we have two possible complexity-reducing fixes. One is to > > drop the memlock feature entirely. The other is to drop the buggy homebrew > > asynchronous-DNS lookup from Classic and use libc's. > > Dropping memlock is an interesting idea

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Hal Murray
e...@thyrsus.com said: > In this case, we have two possible complexity-reducing fixes. One is to > drop the memlock feature entirely. The other is to drop the buggy homebrew > asynchronous-DNS lookup from Classic and use libc's. Dropping memlock is an interesting idea. I can't think of any pla

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Eric S. Raymond
ad code not > getting locked into memory. I think that is what you are running into. > > The other is a tangle of error handling on out-of-memory issues by things > like pthread_create and DNS lookup. I think the latter end up with a retry > error code. I think I fixed som

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Hal Murray
a tangle of error handling on out-of-memory issues by things like pthread_create and DNS lookup. I think the latter end up with a retry error code. I think I fixed some/many of them to crash rather than retry on the assumption that memory wasn't going to get freed and I didn't know of

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Eric S. Raymond
Kurt Roeckx : > > This matches what I remember, except for "use more memory". There was a > > third > > workaround involved weird linker options to force early loading of the > > library. > > Like -WL,-z,now? That's not such a weird option. No, something related to the message I got when I cam

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Kurt Roeckx
On Sat, Jun 25, 2016 at 06:13:56PM -0400, Eric S. Raymond wrote: > Hal Murray : > > > > e...@thyrsus.com said: > > > 1. Apply Classic's workaround for the problem, which I don't remember the > > > details of but involved some dodgy nonstandard linker hacks done through > > > the > > > build syste

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Eric S. Raymond
g of the library. > > 2. Fix the actual problem. Well, that'd be nice, but Hal looked into it > > months ago and said he understood it but couldn't generate a fix. IIRC, he > > said it needed a full rewrite. That tells me the code is probably not > > salvage

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Eric S. Raymond
Kurt Roeckx : > On Sat, Jun 25, 2016 at 11:00:39AM -0400, Eric S. Raymond wrote: > > > > While this did enable me to recover from my errors, it also turned up > > a serious problem. The combination of the buggy async-DNS code we > > inherited from Classic and use of pool servers causes *very* fre

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Hal Murray
able. I don't remember that part. I use the pool command on several systems. I haven't seen a crash in ages. There was another interesting problem in this area. It was a bug in FreeBSD's trap handler. ntpd managed to trigger it consistently. . > I favor #4. I favor und

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Kurt Roeckx
On Sat, Jun 25, 2016 at 11:00:39AM -0400, Eric S. Raymond wrote: > > While this did enable me to recover from my errors, it also turned up > a serious problem. The combination of the buggy async-DNS code we > inherited from Classic and use of pool servers causes *very* frequent > crashes. Can yo

Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Eric S. Raymond
Yesterday I pushed some erroneous commits that got out because my smoke-test procedure was throwing false negatives. To deal with this, I've improved the way I test; everything now gets tried on snark before being pushed to the public repo so the test farm machines can see it. While this did enabl