Re: [gpsd-dev] refclock 28 gone wacky on me

2016-06-19 Thread Hal Murray
g...@rellim.com said: > Yes, that is expected. You need to tell the Skytraq to force the top of > the second, and save to flash. What does that mean? bellyac...@gmail.com said: > Did that which is why I didn't understand the delivery coming at near the > end of the second. It appears tho

Fix for startup bug - please test

2016-06-19 Thread Hal Murray
Details in https://gitlab.com/NTPsec/ntpsec/issues/68 -- These are my opinions. I hate spam. ___ devel mailing list devel@ntpsec.org http://lists.ntpsec.org/mailman/listinfo/devel

Fedora kernels missing hardpps

2016-06-20 Thread Hal Murray
Mumble. Long story. There are two parts of PPS processing in the kernel. One is RFC 2783 which describes an API for capturing the time when a pulse happens. The other is RFC 1589 which describes a PLL which basically moves all the timekeeping work into the kernel. If you turn on flag3 wit

Re: Documenting some progress - magic refclock addresses are almost gone

2016-06-25 Thread Hal Murray
e...@thyrsus.com said: > Does anyone on the list understand mode 6 well enough to answer questions? > My main one is: if I add a field to a mode 6 response, is it going to break > old ntpqs or will they silently ignore it? I think they ignore it, but try it to be sure. > (The response field I i

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Hal Murray
e...@thyrsus.com said: > 1. Apply Classic's workaround for the problem, which I don't remember the > details of but involved some dodgy nonstandard linker hacks done through the > build system. *However, I did not trust this method when I understood it.* > It seemed sure to cause porting difficul

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Hal Murray
e...@thyrsus.com said: > I think the hack is to force libgcc_s to be loaded early. I don't know how > to do that in waf. There are two problems in this area. One is the end-of-thread code not getting locked into memory. I think that is what you are running into. The other is a tangle of erro

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Hal Murray
e...@thyrsus.com said: > In this case, we have two possible complexity-reducing fixes. One is to > drop the memlock feature entirely. The other is to drop the buggy homebrew > asynchronous-DNS lookup from Classic and use libc's. Dropping memlock is an interesting idea. I can't think of any pla

Re: My first positive structural change to NTP

2016-06-25 Thread Hal Murray
> Here's how I think it should look: > -- > refclock shm unit 0 refid GPS > refclock shm unit 1 prefer refid PPS > -- I think you should start a list of that sor

Re: My first positive structural change to NTP

2016-06-26 Thread Hal Murray
strom...@nexgo.de said: > I think that's still perpetuating a mistake. This whole business of having > to specify two servers (or refclocks) for the same thing should go away. There is a fundamental issue. With a PPS, there really are two sources of time. Internally, ntpd needs two different

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Hal Murray
e...@thyrsus.com said: >> Is getaddrinfo_a() in RTEMS? QNX? BSD? > It's not an OS thing, it's a toolchain thing. getaddrinfo_a() is > implemented using standard C and POSIX threads, it doesn't need OS-specific > support. Or it's in an optional extra library. > Linux has it because Linux uses

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Hal Murray
e...@thyrsus.com said: >> We could try simplifying things to only supporting lock-everything-I-need >> rather than specifying how much. There might be a slippery slope if >> something like a thread stack needs a sane size specified. > I'm not intimate with mlockall, but it looks like it works

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Hal Murray
Possible crazy idea... How about we never kill the DNS helper thread. Just let it sit there in case it gets more work to do. The only cost is a bit of memory. Or maybe only do that if we are locking stuff into memory.

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Hal Murray
e...@thyrsus.com said: > Ugh. Our options have just narrowed. I've just seen > libgcc_s.so.1 must be installed for pthread_cancel to work Aborted (core > dumped) > with memlock off in the build. Can you reproduce it? My guess is that you didn't really get memlock turned off. How about puttin

Re: Wonky NTP startup and the incremental-configuration problem

2016-06-26 Thread Hal Murray
An alternative option would be to implement rereading ntp.conf. For each line in ntp.conf, there are 3 possibilities. It's new or the value has changed, nothing has changed, or the item was dropped. The latter is the tricky case. The idea is to save a parsed copy of the old ntp.conf. As the
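
A minimal sketch of the three-way comparison described above, in Python rather than ntpd's C. The dictionary representation of a parsed ntp.conf and the name `diff_config` are assumptions made for illustration; they are not NTPsec code.

```python
def diff_config(old, new):
    """Compare a freshly parsed config against the saved parsed copy.

    old, new: dicts mapping a config entry (hypothetical key format,
    e.g. "server a") to its option string.
    Returns three sorted lists of keys:
    (new_or_changed, unchanged, dropped).
    """
    new_or_changed = sorted(k for k in new if k not in old or old[k] != new[k])
    unchanged = sorted(k for k in new if k in old and old[k] == new[k])
    # The tricky case: present in the old config, absent from the new one.
    dropped = sorted(k for k in old if k not in new)
    return new_or_changed, unchanged, dropped
```

Entries in the first list would be applied, the second list ignored, and the third list torn down; how ntpd would actually tear down a dropped item is not covered here.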

head broken if no refclocks

2016-06-26 Thread Hal Murray
after a simple ./waf configure
[murray@fed raw]$ ./waf build
--- building host ---
Waf: Entering directory `/home/murray/ntpsec/raw/build/host'
[1/5] Processing ntpd/ntp_parser.y
[2/5] Compiling build/host/ntpd/ntp_parser.tab.c
/home/murray/ntpsec/raw/ntpd/ntp_parser.y: In function ‘yyparse’:

Our testing sucks

2016-06-26 Thread Hal Murray
1007 ./waf configure --refclock=20,22 --enable-debug-gdb
1008 ./waf build
1009 gdb ./build/main/ntpq/ntpq
(gdb) run -p
Starting program: /home/murray/ntpsec/raw/build/main/ntpq/ntpq -p
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.21-13.fc22.x86_64
[Thread debugging using

waf list shouldn't need to be configured

2016-06-26 Thread Hal Murray
$ ./waf --list
--- building host ---
The cache directory is empty: reconfigure the project
$

Re: Our testing sucks

2016-06-26 Thread Hal Murray
1010 ./waf configure
1011 ./waf build
[ 74/206] Compiling ntpd/ntp_intercept.c
../../ntpd/ntp_control.c: In function ‘ctl_putpeer’:
../../ntpd/ntp_control.c:2319:8: error: ‘struct peer’ has no member named ‘procptr’
  if (p->procptr != NULL) {
       ^
../../ntpd/ntp_control.c:

waf --list needs to show old numbers as well as new names

2016-06-27 Thread Hal Murray
It's handy if you are updating a script.

New ntpq peers chops refclocks to 6 characters

2016-06-27 Thread Hal Murray
But there is lots more room in that column. I think it will hold a worst case IPv4 numerical address. remote refid st t when poll reach delay offset jitter == HP5850 .GPS.0

Re: The new refclock directive is implemented and documented

2016-06-27 Thread Hal Murray
e...@thyrsus.com said: > and the > noun/verb "fudge" is reserved for the two time offset options. Why? What's the difference between a flag that gets set to 0 or 1 and a time that gets set to a number? > There will be a *limited* open period for bikeshedding about the driver > names. hp58

Re: waf --list needs to show old numbers as well as new names

2016-06-27 Thread Hal Murray
> Can you show me an example of this sort of script? How do you build things for your collection of systems? Do you really type the configuration in by hand each time? Do you use --refclock=all? Here is a fragment that I translated by hand: --refclock=irig,nmea,pps,hp58503a,shm,gpsd

Re: New ntpq peers chops refclocks to 6 characters

2016-06-27 Thread Hal Murray
How does your new stuff handle multiple instances of a refclock type? For a test case, I suggest a USB driver in addition to a HAT. Try both NMEA/PPS as well as both SHM and various combinations. The JSON driver uses the high bit of the unit to enable/disable the PPS. The NMEA and HP driver

Re: The new refclock directive is implemented and documented

2016-06-27 Thread Hal Murray
e...@thyrsus.com said: > I don't think shm needs to change at all. It says what it is - data coming > over System V shm, which defines its own format by the shared structure I like SHM. I think there are non-gpsd sources of SHM data. I have no strong preferences for gpsd vs json.

Re: waf --list needs to show old numbers as well as new names

2016-06-27 Thread Hal Murray
e...@thyrsus.com said: >> --refclock=irig,nmea,pps,hp58503a,shm,gpsd > I'm not seeing a problem here. Isn't it trivial to get those names from, > e.g., https://docs.ntpsec.org/latest/refclock.html ? The problem is not to "get the names", it's to translate an old number to the new name. You may

Re: New ntpq peers chops refclocks to 6 characters

2016-06-27 Thread Hal Murray
e...@thyrsus.com said: > More suggestions like this, please. bps may not be enough. There is also the parity and stop bits, but I don't think they are fiddled much. The HP driver uses one mode bit to switch from whatever the default is to a different baud rate and parity. It may be simpler t

offset: time1 or time2

2016-06-27 Thread Hal Murray
e...@thyrsus.com said: > Which reminds me: an addition I'm considering is adding "offset" as a > synonym for time1 or time2, whichever one usually sets an offset for time > reported from the unit. Only, I'm not clear which it should be; either it > varies by driver or I'm not understanding the do

Re: The new refclock directive is implemented and documented

2016-06-27 Thread Hal Murray
e...@thyrsus.com said: >> hp58503a should probably be hpgps. It works for several devices. > OK. Can you enumerate some other devices so I can list them in the header > comment and on the driver page? The documentation already mentions the Z3801A. There are a lot of them in the ham/hacker co

Re: The new refclock directive is implemented and documented

2016-06-27 Thread Hal Murray
e...@thyrsus.com said: > An argument for "json", maybe. But not a really compelling one, because > GPSD defined the protocol and anything else emitting it would probably be > emulating GPSD deliberately. I think I prefer JSON for the same reason I like SHM. I think the real question is does the

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-27 Thread Hal Murray
cbwie...@gmail.com said: > How are pool entries added when the service decides it needs more? There is some background stuff that roughly says "need more?", and if so fires off the DNS lookup. > Would it be possible to leverage this code for adding all servers specified > by name? Probably n

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-27 Thread Hal Murray
cbwie...@gmail.com said: > I was thinking of setting up associations using the DNS lookup code. If the > mechanism for adding new pool servers was blocking on the DNS call but > asynchronous to the rest of the daemon, I was figuring to call the lookup > with the name provided by the server direct

ntpq mrulist: cpu hog

2016-06-28 Thread Hal Murray
I have a pool server. mru maxmem is set big enough to capture a whole day. Each midnight, a cron job fires off to capture everything to a file. The file is 100 megabytes. While that is going on, ntpq is using 95% of the cpu. If anybody is looking for a nice distraction, it would be interesti

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-28 Thread Hal Murray
e...@thyrsus.com said: > After discussion with Daniel about the performance and security issues I > deleted the memlock code. As the comment explains: I think changes like that are worthy of a general announcement. > on modern systems, which swap so seldom > that many people don't bother wit

Re: Device driver mode bits and other skulduggery.

2016-06-28 Thread Hal Murray
e...@thyrsus.com said: > One thing that jumps out at me is that several drivers have a clockstats > verbosity option, always flag4 (which, alas, is used for other things too). There may have been a general idea that flag4 would be used to enable clockstats from an individual driver instance. T

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-28 Thread Hal Murray
matthew.sel...@twosigma.com said: > "rlimit memlock 0" using Classic causes ntpd to die after 3 minutes with > this error 2016-06-29T00:13:21.903+00:00 host.example.com ntpd[27206]: > libgcc_s.so.1 must be installed for pthread_cancel to work What version of Classic are you running? I though t

Kernel PPS processing

2016-06-29 Thread Hal Murray
http://users.megapathdsl.net/~hmurray/ntpsec/glypnod-pps-kernel.png If you turn on flag3 for a PPS driver on a Linux system, you get this error message: 06-20T12:25:32 ntpd[988]: refclock_params: kernel PLL (hardpps, RFC 1589) not implemented I poked around a bit. Those options are in drivers/

Re: adns is looking plausible

2016-06-29 Thread Hal Murray
e...@thyrsus.com said: > I haven't looked at the code itself yet, but from reading the C header file > and the website, adns is looking like a plausible replacement for our > homebrew async-DNS. Good find! One feature that pushes me in that direction is being able to get at the TTL. > and redu

Re: Kernel PPS processing

2016-06-29 Thread Hal Murray
> Can you quantify the better? I would have expected identical... Did you look at the graph? http://users.megapathdsl.net/~hmurray/ntpsec/glypnod-pps-kernel.png I'm not sure why you would expect performance to be identical. Dave Mills and crew went to a lot of effort to get code into various

Re: Technical strategy and performance

2016-06-29 Thread Hal Murray
fallenpega...@gmail.com said: > Thank you Eric. Have read, am pondering, and welcome other people to weigh > in. The big picture question that comes to mind is why did we start by forking ntp classic? Why not start from scratch? Did anybody consider chrony? What other options are/were ther

Re: Kernel PPS processing

2016-06-29 Thread Hal Murray
matthew.sel...@twosigma.com said: > We tested booting with "nohz=off intel_idle.max_cstate=0" and it made a > difference in our production clocks. Interesting. Thanks. How did you decide to go there? Did you try those 2 changes separately? Was that with PPS or just a typical system? Are you

Re: Technical strategy and performance

2016-06-29 Thread Hal Murray
Thanks. I didn't see any surprises. I'm happy with the general idea, it's the details that get interesting. Removing cruft is good. Removing features is not. There is a trade off between the cruftiness of the code and the importance of any features it includes. This example gets tangled up

Re: Kernel PPS processing

2016-06-29 Thread Hal Murray
g...@rellim.com said: >> I'm not sure why you would expect performance to be identical. > Because they use the same kernel generated time stamp and PLL algorithm. There are two chunks of PPS code in the kernel with separate RFCs. One is getting the time stamp. The other is doing the PLL. Th

Re: Kernel PPS processing

2016-06-29 Thread Hal Murray
g...@rellim.com said: > Wow. I thought something was wrong. My local clock offset (peerstats file) > has always been hanging around 100ppm. Stable to ±1ppm so I figured > that was normal. > After reboot the local clock offset started at 9ppm and has been slowly > going down, now under 2ppm.

Re: Kernel PPS processing

2016-06-29 Thread Hal Murray
> Local clock frequency offset, as opposed to local clock time offset. Most NTP documentation calls that drift. Its magnitude is not very interesting when discussing quality of time. Changes over time can be interesting. It's usually much more interesting to look at the clock offset. There a
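
As a worked illustration of why drift (a frequency error, in ppm) and clock offset (a time error, in seconds) are different quantities, the sketch below converts one into the rate of change of the other. The function name is hypothetical; the only fact used is that 1 ppm is one part in a million and a day has 86400 seconds.

```python
SECONDS_PER_DAY = 86400

def drift_seconds_per_day(ppm):
    """Time error an uncorrected clock would accumulate in one day
    if its frequency were off by `ppm` parts per million."""
    return ppm * 1e-6 * SECONDS_PER_DAY
```

A free-running clock that is 100 ppm fast would gain about 8.64 seconds per day, while a clock that ntpd is steering can have that same 100 ppm drift and still sit within microseconds of the correct time. That is why the offset, not the size of the drift, says how good the time is.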

Re: My task list

2016-06-30 Thread Hal Murray
> 1. Try replacing our buggy async-DNS code with the c-ares library. You keep calling the existing code "buggy". Is that correct, or are you just being sloppy since you don't like it (perhaps justifiably) and it has triggered bugs/quirks in other parts of the system? As far as I can tell, our

Re: Technical strategy and performance

2016-06-30 Thread Hal Murray
e...@thyrsus.com said: > In many cases, especially in government, they *can't* -- they have lengthy > certification requirements for new infrastructure components. If they are on the ball, they will have to do almost as much work to (re)certify after all the changes we have made. >> Wher

Re: Kernel PPS processing

2016-06-30 Thread Hal Murray
g...@rellim.com said: > I took another look, and realized I misunderstood the y axis. And that you > are plotting loopstats and I'm looking at offsets. So not as bad as I > thought. I can't figure out what that means. I was plotting the offset column from loopstats. > To get apples and apples,

Re: Master Does Not Compile on Centos 6

2016-06-30 Thread Hal Murray
j...@rtems.org said: > This likely fails on other platforms since it is a mismatched brace: Yes. Anything without a refclock. Amar: Buildbot needs a few more build runs with various configurations to catch things like this. They don't need to be run on all systems but they should be run on

Re: Master Does Not Compile on Centos 6

2016-06-30 Thread Hal Murray
>> This likely fails on other platforms since it is a mismatched brace: > Yes. Anything without a refclock. Fix pushed.

Re: Technical strategy and performance

2016-06-30 Thread Hal Murray
ja...@azze.org said: > This is why I try to make noise when things are broken on RHEL/CentOS 6.x. I > don't see a builder for that OS on buildbot.ntpsec.org. The Red Hat > Enterprise family (RHEL, CentOS, Scientific Linux, Oracle Enterprise Linux) > and SuSE Linux Enterprise Server are where we bo

Re: Technical strategy and performance

2016-06-30 Thread Hal Murray
e...@thyrsus.com said: > There are some prerequisites. Libraries need the library installed to run > and in addition, the development headers installed to build. > Python 2.x, x >= 5 > bison > libevent 2.x > libcap > OpenSSL > GNU readline > BSD libedit > sys/timepps.h > asciidoc, a2x It's (muc

asciidoc tables

2016-07-01 Thread Hal Murray
The table widths have things like: [width="100%",cols="<34%,<33%,<33%"] I find that makes a table that is ugly and hard to read. I could tune the widths, but I don't know how wide the viewer's display will be. Is there a better way to do things? I'd like to say "make this column as wide a

Add usestats to collect resource usage statistics

2016-07-01 Thread Hal Murray
I just pushed the code. You will get things like this:
57570 76638.360 3600 19.221 29.499 1541 0 0 0 2984 288288 2123 0 8428
57570 80238.357 3600 20.812 25.956 1062 0 0 0 3024 246608 2274 0 12652
57570 83838.357 3600 23.353 26.497 833 0 0 0 2992 255329 2556 0 16164
57571 1038.358 3600 31.154 31.3

Re: Avoiding merge bubbles

2016-07-02 Thread Hal Murray
Thanks. I hate that crap as much as anybody. > git pull --rebase I missed the --rebase part. Is there any way to set things up so --rebase is the default with pull? Is there any way to recover after I forget? Can we fix the push process to reject pushes if they have that type of comment? (I

Re: Avoiding merge bubbles

2016-07-02 Thread Hal Murray
e...@thyrsus.com said: >> Is there any way to set things up so --rebase is the default with pull? > Yes. If you look in your .git/config, adding the "rebase = true" line will > set --rebase for all pulls from master. Thanks. Where should that be documented? I think I set that when you sent ou

memory locking

2016-07-02 Thread Hal Murray
e...@thyrsus.com said: > BTW, I think I've knocked the mlockall/threads/async bug on the head. I > swiped some code from chrony that does memlocking after telling ntpd it can > have as much memory as it wants - ntpd's worst-case memory requirement ain't > much. I've had that version running contin

Re: memory locking

2016-07-02 Thread Hal Murray
e...@thyrsus.com said: > I'm not sure what the referent of "that" is. The statistics-gathering I've > seen seems to be all about writing line-at-a-time records to various stats > files; I can't see that generating a lot of memory pressure. > If there's somewhere in the code that is allocating mem

Is digest mode working for mailing lists?

2016-07-02 Thread Hal Murray
A few weeks ago, I signed up for bugs and vc in digest mode. I thought I got one message, maybe one each list, but I haven't seen anything since. I see stuff in the archives for vc but the archives for bugs is empty.

Refclock quirk

2016-07-03 Thread Hal Murray
I'm seeing things like this:
     remote           refid      st t when poll reach   delay   offset  jitter
==
+fe80::21e:c9ff: .PPS.            1 u   86 1024  377    0.483   -3.643   0.330
+fe80::226:2dff: .PPS.

Re: memory locking

2016-07-03 Thread Hal Murray
> esr@snark:~/software/ntp-rescue/ntpsec$ ntpq -c mrulist > ***Command `mrulist' unknown I don't know what's wrong on your end. When I cut/paste that line, I get things like this: Ctrl-C will stop MRU retrieval and display partial results. Retrieved 5 unique MRU entries and 0 updates. lstint av

Re: Refclock quirk

2016-07-03 Thread Hal Murray
e...@thyrsus.com said: > Doesn't show up in bisection, and now doesn't reproduce with the head > revision either. There is no point in bisecting unless you have a test case that fails on head. Please try using NMEA and PPS rather than SHM. You have to wait a while. I'm not sure how long. I th

Re: Refclock quirk

2016-07-03 Thread Hal Murray
hmur...@megapathdsl.net said: > You have to wait a while. I'm not sure how long. I think it's the normal > ramp-up on polling interval. It takes about 6 minutes. remote refid st t when poll reach delay offset jitter =

Re: Zero-configuration ntpd

2016-07-03 Thread Hal Murray
> Default servers should be the global NTP pool. In general, it's a very bad idea to wire names or addresses into code, especially if you don't own/control the resource being used. This case is less-bad than many others since it is possible (maybe even easy) to change/fix. The problem is that

Anti-DDoS

2016-07-03 Thread Hal Murray
Is there consensus on what we should be doing? Actually, I'm looking for a bigger picture of what all UDP services should be doing. DNS is the other obvious example. If you had asked me a year or two ago, I would have said "rate limiting" and thought that solved the problem. It does solve t

Re: Zero-configuration ntpd

2016-07-03 Thread Hal Murray
g...@rellim.com said: > Default for statistics can be no stats gathering. > Agreed. They just grow forever. Ditto ntp.log that should default to the > system syslog. The main log file does default to syslog > Off-topic: ntpd should have a max number of saved logs. The default is no log files.

Re: asciidoc tables

2016-07-03 Thread Hal Murray
e...@thyrsus.com said: > Sadly, proportional is all you can do in the table model of XML-DocBook > (which is what asciidoc uses as a back end). Can I specify the total width in characters? Can we assume the width is appropriate for a man page? That might look ugly with narrow or wide web page

Re: question about upgrading from Classic to NTPsec (packaging issue)

2016-07-04 Thread Hal Murray
j...@systemsartisans.com said: > I am in the process of trying to create an RPM package from the repo's > current head. Given that I would expect this to be used by sysadmins, etc. > who might already have installed the Classic version (very possibly from > their distro's package sources), ho

Re: Kernel PPS processing

2016-07-04 Thread Hal Murray
strom...@nexgo.de said: > On another tangent back to NTP, I'm wondering if it wouldn't make sense to > offload the timestamp filtering at least to the VC4. Most NTP boxes would > run headless anyway, so there'd be 16 processors sitting idle for that sort > of thing. Not likely. That sort of wo

Re: Kernel PPS processing

2016-07-05 Thread Hal Murray
g...@rellim.com said: > The big thing for NTP and gpsd would be the 64 bit math. Both do a lot of > 64 bit math. You can do 64 bit arithmetic without using 64 bit pointers. Somebody mentioned that the plan is to have one boot file that runs on all Raspberry Pis. Are things set up so that user co

Re: Requesting code review on possible fix for nopeer/pool conflict

2016-07-05 Thread Hal Murray
dfoxfra...@gmail.com said: > The whole receive() function you're looking at is about to get blown away in > my ntp_proto refactor. Can you hold off on touching it until next week? Please don't push any big changes until Eric and/or I get the polling tangle fixed. dfoxfra...@gmail.com said: >

Re: Requesting code review on possible fix for nopeer/pool conflict

2016-07-05 Thread Hal Murray
dfoxfra...@gmail.com said: > What exactly is the "polling tangle" you're referring to? I talked to Eric > about this earlier today, and he mentioned something about the polling > interval drifting to 1024 seconds on a consistently reachable server. But > AFAIK, nothing has changed and that's alway

Refclock polling ramp up

2016-07-06 Thread Hal Murray
I just pushed a fix. Would you please sanity check... For servers, minpoll and maxpoll default to 6 and 10. There is also a check to make sure that minpoll isn't greater than maxpoll. For refclocks, minpoll defaults to 6 and maxpoll defaults to minpoll. The problem is that you were storing m
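
The defaults described above can be sketched as follows. This is a Python illustration under stated assumptions, not the ntpd code: the function name is hypothetical, and the way an inverted pair is resolved (lowering minpoll to maxpoll) is an assumption, since the message only says that a check exists. Poll values are log2 of the interval in seconds, so 6 means 64 s and 10 means 1024 s.

```python
DEFAULT_MINPOLL = 6          # 2**6  = 64 seconds
DEFAULT_SERVER_MAXPOLL = 10  # 2**10 = 1024 seconds

def effective_poll(is_refclock, minpoll=None, maxpoll=None):
    """Return (minpoll, maxpoll) after applying defaults.

    None means "not given in the config".
    """
    if minpoll is None:
        minpoll = DEFAULT_MINPOLL
    if maxpoll is None:
        # Refclocks: maxpoll follows minpoll. Servers: fixed default.
        maxpoll = minpoll if is_refclock else DEFAULT_SERVER_MAXPOLL
    # Assumption: resolve minpoll > maxpoll by lowering minpoll.
    if minpoll > maxpoll:
        minpoll = maxpoll
    return minpoll, maxpoll
```

With these rules a refclock configured with nothing but `minpoll 4` polls at a fixed 16 seconds, while an unconfigured server is free to ramp from 64 up to 1024 seconds.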

Do we have a list of user visible changes from ntp classic?

2016-07-06 Thread Hal Murray
It's probably all in NEWS (or should be), but that's chronological and seems hard to read. For example, the deleted refclocks are scattered all over the place. I think I'm suggesting something like CHANGES-from-ntp-classic

Re: Removing interleaved mode

2016-07-06 Thread Hal Murray
dfoxfra...@gmail.com said: > With Eric's permission, I have removed support for interleaved mode in my > proto-refactor branch. Here is its commit-message eulogy: Seems fine with me. I've never used it. We should test things to make sure nothing strange happens. I think that requires 4 syste

Linux capabilities check broken on NetBSD

2016-07-06 Thread Hal Murray
On NetBSD: 07-06T15:42:17 ntpd[4940]: root can't be dropped due to missing capabilities.

Re: Weirdest bug yet.

2016-07-06 Thread Hal Murray
No rawstats or protostats either. e...@thyrsus.com said: > It was adding "subtype" as an alias for "mode" in the lexical analyzer. This > somehow confuses the crap out of the parser's FSM. ... Remember the saveconfigandquit stuff you ripped out? That would have caught this. (if we had used it

Re: Linux capabilities check broken on NetBSD

2016-07-07 Thread Hal Murray
matthew.sel...@twosigma.com said: > NetBSD should be using the clockctl interface: > http://netbsd.gw.com/cgi-bin/man-cgi?clockctl+4.i386+NetBSD-7.0 Thanks. Eric, I should probably fix it since I have a test case. Should we add HAVE_SYS_CLOCKCTL to waf, or just test for __NetBSD__?

Re: Linux capabilities check broken on NetBSD

2016-07-07 Thread Hal Murray
> Attempted port fix pushed. Please test. Missing the _H on HAVE_SYS_CLOCKCTL Fix pushed. More testing in the pipeline.

Re: Requesting review of "Eliminate some pointless gymnastics in the config parser."

2016-07-08 Thread Hal Murray
e...@thyrsus.com said: > About eight hours ago I removed some code that looked so stupid that I now > wonder if it was serving some purpose I don't understand. I don't know of any reason for the old code. Your change looks sane to me. I don't see how to fully test it. The iburst case seems t

Any important bugs/quirks?

2016-07-08 Thread Hal Murray
Things have been a bit, well, "interesting" the past few days. I think everything has been put back together. Is there anything that needs fixing that I/we have missed? (I'm not looking for new features we haven't implemented yet, just things that we broke or things we changed that don't wor

SIGHUP catcher, issue #78

2016-07-10 Thread Hal Murray
I just pushed code that catches SIGHUP and reopens the log file if it has changed and checks for a new leapseconds file. You can poke it by hand with killall -HUP ntpd We should get a chance to test the new leap file stuff soon. It's time for a new one. Besides, a day or two ago, the news
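
A rough Python sketch of the log-reopening half of this pattern, for readers unfamiliar with it. It is not the ntpd implementation: all names are hypothetical, "changed" is assumed to mean that the path now refers to a different file than the one held open (as happens after log rotation), and the leap-seconds check is left out. It assumes a Unix system where SIGHUP exists.

```python
import os
import signal

sighup_seen = False  # set by the handler, polled from the main loop

def on_sighup(signum, frame):
    # Keep the handler trivial: only record that the signal arrived.
    global sighup_seen
    sighup_seen = True

def log_was_rotated(log_file, path):
    """True if `path` no longer names the file that `log_file` has open."""
    try:
        on_disk = os.stat(path)
    except FileNotFoundError:
        return True
    opened = os.fstat(log_file.fileno())
    return (on_disk.st_dev, on_disk.st_ino) != (opened.st_dev, opened.st_ino)

def maybe_reopen(log_file, path):
    """Call from the main loop; returns the file object to keep using."""
    global sighup_seen
    if sighup_seen:
        sighup_seen = False
        if log_was_rotated(log_file, path):
            log_file.close()
            log_file = open(path, "a")
    return log_file

def install_handler():
    signal.signal(signal.SIGHUP, on_sighup)
```

Doing the real work in the main loop rather than in the handler avoids calling non-reentrant code from signal context.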

Sandboxing: How important is seccomp?

2016-07-11 Thread Hal Murray
[For those not familiar with it, seccomp gives the kernel a list of syscalls that the program is allowed to use. All others become illegal. So if a bad guy finds a stack overflow there is (hopefully) a good chance that any code he tries to run will crash.] I've got it working on Intel. It d

Anybody know how to debug things like this?

2016-07-14 Thread Hal Murray
I'm working on seccomp. I'm at the stage where things mostly work and I'm trying to find obscure code paths that use a syscall that isn't yet on the OK list. The SIGSYS means it tried to call something that wasn't on the list. Normally, a simple backtrace will let me figure out what it is

Re: Anybody know how to debug things like this?

2016-07-14 Thread Hal Murray
> Seems like a situation made for investigating with Mozilla rr. Could you please say a bit more? I don't know anything about Mozilla rr. Why is that likely to help me in this case? I think I have tracked down the problem. It's trying to start a new thread. The clone syscall wasn't on the

Re: Anybody know how to debug things like this?

2016-07-14 Thread Hal Murray
e...@thyrsus.com said: > It's like a symbolic debugger that keeps an execution trace and lets you > step backwards in time. Under rr you could induce the crash, then step back > to the last syscall. I don't think that's going to help. I'm in a signal handler from the current attempted syscall.

Re: Anybody know how to debug things like this?

2016-07-15 Thread Hal Murray
e...@thyrsus.com said: > The only safe alternative would be to force the initial DNS lookups to be > synchronous. That doesn't work for the pool case. It wants to get more servers if some of the ones it is using stop responding. > A: get configuration (that's the early thread launch) We coul

Re: Anybody know how to debug things like this?

2016-07-16 Thread Hal Murray
e...@thyrsus.com said: > I'm in favor of cleaning up and fixing some of these order dependencies, but > I'd rather get us to a safe and functioning state first. Accordingly, > splitting out seccomp() implementation to do it early and keeping droproot > late is looking better and better. I found

Odds and ends...

2016-07-17 Thread Hal Murray
There is some ugly code in ntp_loopfilter that's setting up a signal handler in case ntp_adjtime doesn't work. It's the sort of stuff Eric loves to rip out. I can't figure out why that code would be useful. I expect we should figure that out at build time. I've commented it out. It's still

Re: adev.py

2016-07-21 Thread Hal Murray
> Are you saying the unix time stamp result in the output is wrong? I didn't look that far.

Re: Removing the worst cruft

2016-07-23 Thread Hal Murray
e...@thyrsus.com said: > But AUSTRON/IRIG/CHU...I think there's a good (though not absolutely > dispositive) case for simply dropping them all. The Austron driver uses Loran. It was unplugged in the US several years ago. I think it's still used in Northern Europe. It may come back in the US

Re: Removing the worst cruft

2016-07-23 Thread Hal Murray
g...@rellim.com said: > Several commercial NTP products do it, we want them to convert from NTP > Classic to NTPsec. Are they sending IRIG or listening to it?

Re: Removing the worst cruft

2016-07-23 Thread Hal Murray
e...@thyrsus.com said: > No, you were right the first time - and it's something I should have > noticed. That driver is designed for an obsolete class of sound card. I don't know much about audio. What is the right API to use? All we need is a batch of samples and the time they arrived.

Re: Removing the worst cruft

2016-07-23 Thread Hal Murray
fallenpega...@gmail.com said: > What I am wishing for, would be for someone to write a standalone in its own > daemon process IRIG driver, that then speaks GPSD or SHM to NTPsec. But > testing such a beast would be a specialized task. I think things are much more complicated than it seems. That do

Re: Removing the worst cruft

2016-07-24 Thread Hal Murray
e...@thyrsus.com said: > According to Wikipedia LORAN is dead. The principal station chains shut down > in 1979-1980. Last live use was in China in the 1990s. > What you are probably thinking of is DECCA, which was a hyperbolic radio > navigation system (very similar operating principle to LORAN

Possible cleanup

2016-07-24 Thread Hal Murray
There is a SAVE_ERRNO macro that wraps around some code to preserve errno. It's only used in a few places. The first place I saw was calling msyslog. That would make sense if following code did something that depended on the error, but I checked all 4 cases and they never looked at errno. T

Re: Removing the worst cruft

2016-07-24 Thread Hal Murray
e...@thyrsus.com said: > Drivers that very well might fail the ten-year test: truetime, magnavox, > palisade, oncore, jupiter. Palisade is in use. It covers Trimble TSIP which includes the Thunderbolt which was widely available surplus only a few years ago and is popular with time-nuts. It s

Re: Removing the worst cruft

2016-07-31 Thread Hal Murray
> Can the palisade/trimble driver be replaced with a parse driver? I doubt it, but I'm far from familiar with the parse driver. Based on Eric's previous comments, the parse driver handles devices that provide the time in an easy to parse format. TSIP might fit that if all goes well. But there

Kernel PLL graphs

2016-08-01 Thread Hal Murray
There are two parts to PPS processing in the kernel. RFC 2783 describes an API for capturing time stamps. RFC 1589 describes a PLL that lives in the kernel. Most Linux distros don't support RFC 1589. The code is in the kernel, but it doesn't work with the shipped kernels. It requires !NO_H

Re: drift

2016-08-03 Thread Hal Murray
g...@rellim.com said: > 1. On startup chronyd checks the time stamp on the drift file. > if the timestamp > sysclock, the sysclock is set to the timestamp I vote that we don't do anything, not even make it optional behind a command line switch. We have more important things to do. The OS
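
For reference, the startup check quoted above amounts to the comparison sketched below. This is an illustration of the quoted description only, not chronyd's or ntpd's code; the function names are hypothetical, and actually setting the system clock is deliberately left out.

```python
import os
import time

def startup_clock_floor(drift_mtime, sysclock):
    """Return the time the clock would be stepped to, or None to leave it.

    drift_mtime: modification time of the drift file (seconds since epoch).
    sysclock:    current system time (seconds since epoch).
    """
    # A drift file "from the future" means the system clock is behind
    # the last moment the daemon is known to have been running.
    return drift_mtime if drift_mtime > sysclock else None

def check_drift_file(path):
    """Apply the comparison to a real file and the real clock."""
    return startup_clock_floor(os.path.getmtime(path), time.time())
```

The check can only move a badly wrong clock forward to a lower bound; it says nothing about how far behind the clock still is.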

Re: Kernel PLL graphs

2016-08-03 Thread Hal Murray
matthew.sel...@twosigma.com said: > I'm using maxpoll of 1 on my stratum 1 servers. And I have !NO_HZ set. My > offsets stay below 1 microsecond as reported by ntpq. If we switched the > units to nanoseconds, that might be interesting. Time to make sure I've got the right number of negatives.
