Re: What's the best way to fix warnings from unused result

2019-04-07 Thread Hal Murray via devel


> Does a simple void cast work?  E.g.:
>   (void) strerror_r(...)

I haven't found the magic using that approach.

../../ntpd/nts.c:214:16: warning: ignoring return value of ‘strerror_r’, declared with attribute warn_unused_result [-Wunused-result]
 (void) strerror_r(errno, errbuf, sizeof(errbuf));
^


-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Re: switch missing default case

2019-04-07 Thread Hal Murray via devel
> New warning on arm64:

Also happens on Fedora, both 64 and 32 bit.


Is it reasonable to fix the CI system to complain about warnings except for a 
(hopefully short) list of known ones that we can't fix?




Re: switch missing default case

2019-04-07 Thread Hal Murray via devel
> I'm unaware of any we can't fix, except the bison one.

There are several others.

From old CentOS:
> ntp_parser.tab.c:389:6: warning: "YYENABLE_NLS" is not defined
> ntp_parser.tab.c:1323:6: warning: "YYLTYPE_IS_TRIVIAL" is not defined 

From NetBSD on a Raspberry Pi:
> /usr/pkg/lib/libpython2.7.so: warning: warning: tmpnam()
> possibly used unsafely, use mkstemp() or mkdtemp()
> /usr/pkg/lib/libpython2.7.so: warning: warning: tempnam()
> possibly used unsafely, use mkstemp() or mkdtemp() 

From old (but still supported) NetBSD:
> ../../ntpd/ntp_control.c:1305:17: warning: array subscript
> has type 'char' [-Wchar-subscripts]







Re: switch missing default case

2019-04-08 Thread Hal Murray via devel


Gary said:
> I fixed the libjsmn missing default one.

Thanks.

And thanks to whoever fixed the macOS glitch.




NTS: update to ntpq -c nts

2019-04-10 Thread Hal Murray via devel


The old code had several cases where there were 2 counters for things like 
received NTS packets, total and bad.  I changed that to good and bad.

A mix of old/new ntpq/ntpd won't show the total or good.

--

There is a lot of crap out there on the big bad internet.

NTS KE serves good:  21
NTS KE serves_bad:   81

I assume most of that is bad guys looking for SSH running on alternate ports 
or something similar.

 1 Apr 10:09:05 ntpd[825]: NTSs: TCP accept-ed from 106.75.3.52:44784
 1 Apr 10:09:06 ntpd[825]: NTSs: SSL accept from 106.75.3.52:44784 failed, 0.360 sec
 1 Apr 10:09:06 ntpd[825]: NTS: error:1408F09C:SSL routines:ssl3_get_record:http request

 1 Apr 10:09:16 ntpd[825]: NTSs: TCP accept-ed from 106.75.3.52:40560
 1 Apr 10:09:16 ntpd[825]: NTSs: SSL accept from 106.75.3.52:40560 failed, 0.433 sec
 1 Apr 10:09:16 ntpd[825]: NTS: error:1408F10B:SSL routines:ssl3_get_record:wrong version number




ntpq flakiness

2019-04-10 Thread Hal Murray via devel


I'm seeing things like this when doing ntpq -p to a far away site with lots of 
opportunities for lost packets.
  ***No information returned for association 21216

Has anybody seen anything similar?

I only started seeing it recently.  It's probably because my DSL line has gone 
flaky.  I don't remember any recent changes to ntpq, but it seems worthwhile 
to inquire.




Re: ntpq flakiness

2019-04-10 Thread Hal Murray via devel
Thanks.

There is another quirk that seems related to retransmissions.  I forget the 
details.  I'm pretty sure there is a bug report on it.


> I do remember that there was a very old issue with flaky behavior of ntpq
> over WiFi that we thought might be due to a bug in the fragment reassembly

If I interpret "WiFi" as lost packets, that fits in with the other collected 
observations.

My guess is that this new quirk is just what happens with "peers" when lots of 
packets get lost and we just haven't seen it enough before now to get reported.




Re: shm refclock

2019-04-10 Thread Hal Murray via devel


g...@rellim.com said:
> I would go further and say that order matters not at all.  What matters is to
> start both as root.  Depending on whether I am working on gpsd or ntpd I will
> just keep restarting the one I am working on.  Never an issue. 

How do you configure ntpsec?

I think the order matters if you use --enable-early-droproot




Re: ntpq flakiness

2019-04-10 Thread Hal Murray via devel


> It's one of the few times I've gone on an expedition like that and completely
> failed.  Whatever it is, it's not going to be obvious. 

Here is an interesting possibility.  How about the code is working as designed 
but the parameters are set wrong?  Maybe not "wrong".  How about "not 
aggressive enough for crappy conditions"?

I think you said it did one retransmission after 5 seconds.  Can you easily 
patch that to be 3 or adjustable from the command line?  It should double the 
time each retry, but you can start lower if you collect a few samples to learn 
what to expect.
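
A minimal sketch of the retry schedule described above, assuming a configurable initial timeout that doubles on each retry.  The function name and its parameters are hypothetical; this is not ntpq's actual code.

```python
def retry_timeouts(initial=3.0, retries=4):
    """Return the wait (in seconds) before each retransmission.

    Hypothetical helper: starts at `initial` and doubles on every retry.
    """
    timeouts = []
    wait = initial
    for _ in range(retries):
        timeouts.append(wait)
        wait *= 2
    return timeouts


if __name__ == "__main__":
    print(retry_timeouts(3.0, 4))
    print(retry_timeouts(5.0, 3))
```

Starting at 3 seconds instead of 5 gets the first retransmission out sooner while the doubling still backs off quickly on a link that really is down.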





Copyright

2019-04-10 Thread Hal Murray via devel


I just updated the NTS code to include a Copyright, copied from another module.

If this isn't appropriate, please tell me what it should be.

/*
 * nts_cookie.c - Network Time Security (NTS) cookie processing
 * Copyright 2019 by the NTPsec project contributors
 * SPDX-License-Identifier: BSD-4-Clause-UC
 *
...



Should we update all the copyright dates based on the most recent git checkin 
dates?




Re: shm refclock

2019-04-10 Thread Hal Murray via devel


Gary (on users) said:
> Sure feels like a droproot permission problem.

It's a feature, not a bug.  ;(

If gpsd runs first, it needs to set things up so user ntpd can write to the 
SHM it creates.  ntpd would have the same problem if gpsd had an 
early-droproot.

Can we fix this by putting users ntpd and gpsd in the same group?

--

We really should fix the SHM stuff so the reader can be read only.




NTS: What next?

2019-04-11 Thread Hal Murray via devel


I'm close to finishing cleaning up all the FIXMEs I had left behind.

What's next?

There are 2 major items on my list:

More and/or alternate certificate checking.
  There are lots of possibilities in this area.  I haven't found one that 
looks clean and simple.  We can afford modest amounts of setup work since we 
know the target servers and they don't change often.  (as compared to a web 
client where you may see a new server on each mouse click, and the user wants 
it right-now)  We also have to consider getting started without knowing the 
time.  Eric: Have you looked at this area?

Cluster support.
  I carefully said "cluster" rather than "pool".  I'm interested in the case 
where all the NTP servers are run by the same organization or at least the 
server operators are well known and trusted by the cluster admin.

I think the current code is ready for a release and more testers.  We can't do 
a "Supports NTS" release until the draft turns into an RFC since some numbers 
haven't been assigned by IANA yet.  Call it pre-beta or some good weasel-word 
like that.

The current code has one interesting quirk.  The cookie key gets rotated every 
hour rather than every day.  If your polling interval ramps up enough, your 
cookies become invalid and that tests the retry NTS-KE logic.  We will want to 
remove that before a non-beta release.  Maybe sooner.

There is a potential item: We may need the NTP server to listen on some port 
in addition to 123 in order to get around various filtering rules left over 
from NTP DDoS amplification attacks.  We need more data, which a release would 
help collect.


Medium size items:

I think we need a couple of scripts for monitoring certificates.   Details 
TBD, but I'm thinking of things like when does my certificate expire and tell 
me ??? about the certificates on the servers I'm using.  This will get more 
interesting when we figure out what to do about additional certificate 
checking.  It would be half just a convenient reminder of how to do things and 
maybe a cron job to warn you that your certificates are timing out or some 
server's certs are busted.
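
A rough sketch of the "when does my certificate expire" half of such a script, using only the Python standard library.  The function names are made up for illustration, and the host and port are placeholders to be filled in by whoever runs it.

```python
import socket
import ssl
import time


def days_until(not_after, now=None):
    """Days left before a certificate's notAfter time.

    `not_after` uses the format returned by getpeercert(),
    for example "Jan  5 09:34:43 2018 GMT".
    """
    if now is None:
        now = time.time()
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400.0


def server_cert_days_left(host, port, timeout=5.0):
    """Connect with TLS and report days until the server cert expires.

    `host` and `port` are placeholders supplied by the caller.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return days_until(tls.getpeercert()["notAfter"])
```

A cron job could call server_cert_days_left() for each configured server and complain when the answer drops below some threshold.  Note that the default context verifies the certificate against the local clock, so an already-expired certificate shows up as a failed handshake rather than a negative number.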

Save a couple of cookies to disk so we can restart/reboot without going 
through NTS-KE.  They need to be refreshed occasionally.  That also means 
saving S2C and C2S.  Maybe we should go through the NTS-KE exchange 
occasionally to refresh S2C and C2S.

We need a good writeup on getting started when the system time may not be 
known.  That means we have to understand the issues.

We need to think about statistics.  This is more than an NTS issue.  Currently, 
"ntpq -c nts" does what I want, but nothing gets logged to disk.  Other 
log-to-disk chunks clear the counters that ntpq shows.  Sometimes, I want to 
see the totals.  Mumble.  Maybe we should save the totals and have ntpq show 2 
columns: totals and recent.  Or maybe a mode where it divides the columns by 
the time.

--

How is the documentation?

I thought there was work on a HOWTO level doc, but I can't find it.

--

Back burner:

I want to make 2 programs, a client and server for NTS-KE.  Stand alone.  
Simple.  Partly as an example of how to use OpenSSL, what I was looking for 
when I was getting started, and partly for debugging NTS-KE, just lots of 
printout to show what the other end is doing.  (Eric: This might be an 
interesting Go exercise.)

Next is an NTP client using a cookie saved from above.  I want to hack this to 
be able to measure server performance.   Shared key too.






Re: Certificate rollover

2019-04-11 Thread Hal Murray via devel


> I just realised something: LetsEncrypt certs are max 90 days.  When I renew
> them, will I need to restart NTPd? 

Interesting timing.  Richard's recent message reminded me of that issue.

Currently, you have to restart NTPD.

There is already code for doing things like that on SIGHUP.  We need to make 
it a bit smarter.



Re: logging

2019-04-12 Thread Hal Murray via devel


The "JUNK" stuff is for debugging NTS.  The most important part is the length 
at the end.  It's rate limited so there shouldn't be any serious problems with 
clutter in the log file - just minor potential confusion like this.

Somebody on 2600:1700:6731:6c0:f2de:f1ff:fe20:1bbe is sending you packets that 
don't make sense.  Same for 68.75.8.147.


I haven't had time to look carefully at the CLOCK problems.  What sort of 
hardware is that running on?




Re: logging

2019-04-12 Thread Hal Murray via devel


Gary said:
>> Somebody on 2600:1700:6731:6c0:f2de:f1ff:fe20:1bbe is sending you
>> packets that don't make sense.  Same for 68.75.8.147.
> Those two hit my hackathon server as well.  But the connection is a normal
> NTPv4 exchange on UDP. 

Depends on what you mean by "normal".  How much did you investigate?

From my sample:
 6 Apr 07:44:56 ntpd[10742]: JUNK: M3 V4 0/23 1 4ef 48/ 0 0 020 from 68.75.8.147:36693, lng=80
 6 Apr 07:45:47 ntpd[10742]: JUNK: M3 V4 0/23 1 4ef 48/ 0 0 030 from 68.75.8.147:34025, lng=96
...
The packet lengths are growing in steps of 16 bytes.  The 48/ stuff prints out 
the next 4 bytes in hex.  So that would be extension type 0 with lengths of 20 
(hex), 30, ...  20 hex is 32 decimal.  32+48 for the basic NTP packet is 80 as 
reported.  So there is a type 0 extension with 32 bytes.  Doesn't seem normal 
to me.  I'd bet on probing for a bug.
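
The length arithmetic above can be checked with a few lines.  This is only a worked example; the helper name is made up, and it assumes the last field of the JUNK line is the extension length in hex as described.

```python
NTP_HEADER_LEN = 48  # basic NTP packet, in bytes


def total_length(ext_len_hex):
    """Packet length implied by one extension of the given hex length."""
    return NTP_HEADER_LEN + int(ext_len_hex, 16)


if __name__ == "__main__":
    # (hex length field from the log, lng= value from the log)
    for field, logged in (("020", 80), ("030", 96)):
        print(field, total_length(field), logged)
```

Both samples agree with the logged lng values, which fits the reading of a single type 0 extension growing by 16 bytes per probe.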




Re: logging

2019-04-13 Thread Hal Murray via devel
Here is a typical batch of the confusing CLOCK printout:

Apr 12 14:37:35 box.local ntpd[2109]: CLOCK: ts_prev 1555072655 s + 712154322 ns, ts_min 1555072655 s + 712153764 ns
Apr 12 14:37:35 box.local ntpd[2109]: CLOCK: ts 1555072655 s + 712154322 ns
Apr 12 14:37:35 box.local ntpd[2109]: CLOCK: sys_fuzz 1816 nsec, prior fuzz 0.000001594
Apr 12 14:37:35 box.local ntpd[2109]: CLOCK: this fuzz -0.000001722
Apr 12 14:37:35 box.local ntpd[2109]: CLOCK: prev get_systime 0xe05b050f.b64fb1ce is 0.000000942 later than 0xe05b050f.b64fa201

It looks like something is screwed up with your system clock.

This comes from an area of ntpd that I don't really understand.  I don't 
remember investigating anything like this before.

The general idea is that if your system clock goes tick, tick, tick, in great 
big steps, you want to fill in the bottom bits with randomness.  The code does 
that by assuming that the tick size is the time it takes to read the clock - 
difference in times between 2 back-to-back readings.  That's not right, but 
doesn't normally cause any troubles.  Maybe it skips samples that don't change.

That made sense a long time ago when the system clock was updated on a 10 ms 
clock interrupt.

The error messages say that filling in the low bits made clock go backwards.
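
Here is a toy model of how that can happen.  It is a sketch under my own assumptions, not the ntpd code: each raw reading gets uniform noise below an assumed tick size, and we count how often a fuzzed reading lands before the previous one.

```python
import random


def fuzz(reading_ns, tick_ns, rng):
    """Add uniform noise below the assumed tick size."""
    return reading_ns + rng.randrange(tick_ns)


def count_backward_steps(readings_ns, tick_ns, seed=1):
    """Count fuzzed readings that are earlier than the one before."""
    rng = random.Random(seed)
    prev = None
    backwards = 0
    for reading in readings_ns:
        fuzzed = fuzz(reading, tick_ns, rng)
        if prev is not None and fuzzed < prev:
            backwards += 1
        prev = fuzzed
    return backwards


if __name__ == "__main__":
    # Clock really steps 1000 ns and we fuzz below 1000 ns: never backwards.
    print(count_backward_steps(range(0, 1000000, 1000), 1000))
    # Clock really steps 100 ns but we fuzz below 1000 ns: goes backwards.
    print(count_backward_steps(range(0, 100000, 100), 1000))
```

When the assumed tick is no bigger than the real gap between readings, the fuzz can't reorder them.  When the assumed tick is too big, it can.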

I'd really like to rip out all that stuff, but I have at least one Raspberry 
Pi where I can read the clock faster than it ticks.


> Fedora 29 on x86_64 with Garmin gps18x on rs232.

x86_64 covers a lot of ground.  Anything interesting about that system?  Did 
it work correctly a while ago?  Does it work without any refclocks?  Did you 
update the kernel recently?  ...

What do we know about the system timekeeping?  I'm looking for two things.  
The first is from ntpd, probably in your syslog, before switching logging to 
the typical log file.  It should be something like this:
  Apr 12 22:51:29 hgm ntpd[724]: INIT: precision = 0.157 usec (-23)

The other comes from the Linux kernel at boot time.  I don't know a simple 
grep expression to get what I want.  Grep for clocksource is a good start, but 
poke around to see if there is anything interesting nearby.  Grep for MHz may 
be interesting.

Here is what is in my system:
Apr 12 22:50:01 hgm kernel: tsc: Refined TSC clocksource calibration: 3292.520 MHz
Apr 12 22:50:01 hgm kernel: clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x2f75b1eac7a, max_idle_ns: 440795210882 ns
Apr 12 22:50:01 hgm kernel: clocksource: Switched to clocksource tsc

That says it's using the TSC on a 3.3 GHz system.

TSC is Intel's Time Stamp Counter.  It's a register that counts CPU clock 
cycles.  (or maybe it runs off a pseudo clock so timekeeping works when the 
CPU speed is adjusted to save power)

The basic idea is that you read and save the TSC when you set the clock.  
Then, when you want to know the time, read the TSC again, subtract to get the 
ticks since the last known time, multiply by the time per tick (inverse of 
clock frequency) and add that to the last known time.

You need to be careful with the arithmetic.  The ns per tick variable gets 
adjusted by the clock drift.
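
A sketch of that conversion using integer fixed-point arithmetic.  The names, the 32-bit shift and the sign convention for the drift are my assumptions for illustration; the kernel's real code is more involved.

```python
SHIFT = 32  # fixed-point scale, an arbitrary choice for this sketch


def make_mult(tsc_hz, drift_ppm=0.0):
    """Nanoseconds per tick, scaled by 2**SHIFT.

    drift_ppm nudges the ns-per-tick value; the sign convention here
    is an assumption.
    """
    ns_per_tick = 1e9 / tsc_hz * (1.0 + drift_ppm * 1e-6)
    return round(ns_per_tick * (1 << SHIFT))


def tsc_to_ns(tsc_now, tsc_base, base_time_ns, mult):
    """Time in ns: last known time plus elapsed ticks times ns-per-tick."""
    ticks = tsc_now - tsc_base
    return base_time_ns + ((ticks * mult) >> SHIFT)


if __name__ == "__main__":
    mult = make_mult(3292520000)  # the 3292.520 MHz from the log above
    # One second's worth of ticks should come out very close to 1e9 ns.
    print(tsc_to_ns(3292520000, 0, 0, mult))
```

Doing the multiply in integers avoids losing low bits when the tick count gets large, which is the sort of care the arithmetic needs.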





Re: logging

2019-04-13 Thread Hal Murray via devel


> clocksource is fixed at hpet since the previous situations where clock sync
> was weird/gone/etc.

> I never ever saw these before.

Something changed.  All we have to do is figure out what/when.

Was the switch to using HPET recent?

Did you do a recent git pull?  Do you know how to drive git bisect?




Re: logging

2019-04-13 Thread Hal Murray via devel
> I know about bisect but it is quite a task.

HPET works for me.

So far, you have the only test case.

Please give it a quick try to see if ntpsec is the problem.  How about just 
trying the 1.1.3 release?



Re: logging

2019-04-14 Thread Hal Murray via devel


udo...@xs4all.nl said:
> ntpsec 1.1.3's ntpd from ftp://ftp.ntpsec.org/pub/releases/ntpsec-1.1.3.tar.gz
>  gives me after startup:

> Apr 13 15:53:50 bla ntpd[12382]: CLOCK: ts_prev 1555163630 s + 594156272 ns,
> ts_min 1555163630 s + 594155713 ns 
...

Thanks.

So now we know it wasn't a recent change to ntpsec.

Were there more clusters like that, or only one?

I just pushed some tweaks.  Would you please try attic/clock and 
attic/backwards from a recent git.  clock should print some stuff and exit.  
backwards runs forever.  ^C when you get bored.

clock should show something like this for tsc:
       res   avg   min   dups  CLOCK
         1    25    17         CLOCK_REALTIME
   1000000     7   112     -6  CLOCK_REALTIME_COARSE
         1    18    17         CLOCK_MONOTONIC
         1   292   283         CLOCK_MONOTONIC_RAW
         1   298   289         CLOCK_BOOTTIME

Histogram: CLOCK_REALTIME, 1 ns per bucket, 1000000 samples.
      ns      hits
      18        13
      19         7
      20     71131
      21    232969
      22    651647
      23     38228
      24      3609
      25       130
      26       164
      27       149
1953 samples were bigger than 27.


and this for hpet:
       res   avg   min   dups  CLOCK
         1   888   558         CLOCK_REALTIME
   1000000     7    99     -6  CLOCK_REALTIME_COARSE
         1   878   558         CLOCK_MONOTONIC
         1   874   558         CLOCK_MONOTONIC_RAW
         1   888   489         CLOCK_BOOTTIME

Histogram: CLOCK_REALTIME, 1 ns per bucket, 1000000 samples.
      ns      hits
     558         2
     559        13
     628        68
     629        75
     698       178
     699       101
     768       272
     769        96
     838     77929
     839      9227
     907     47895
     908    858449
5695 samples were bigger than 908.




Re: logging

2019-04-14 Thread Hal Murray via devel


devel@ntpsec.org said:
> That's a fantastically weird distribution.  Here's what my old single core
> Athlon64 does: 

Your sample is what I would expect from a system that isn't doing much.  If 
there is other activity going on, the clean bell curve gets spread out due to 
cache reloads and such.





Re: logging

2019-04-14 Thread Hal Murray via devel



> HPET is a travel out to ACPI system registers mapped into memory, this should
> never be cached.

Yes.  But there is still the cache for code and data.

This sort of code is amazingly delicate.  Minor changes can make interesting 
changes in the results.

For example:
  for (i = 0; i < BATCHSIZE; i++) {
    clock_gettime(type, &start);  /* warm up cache */
    clock_gettime(type, &stop);
    clock_gettime(type, &start);
    clock_gettime(type, &stop);
    ...

The extra pair of clock_gettime-s cleaned things up a lot.  At least on one of 
my systems at some point in time.
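
For anybody who wants to poke at this without building the attic code, here is a rough Python analogue of that loop.  It is only a sketch (Unix only, names are mine), and the interpreter overhead is far larger than the clock read itself, so it shows the shape of the test rather than real numbers.

```python
import time
from collections import Counter


def read_deltas(samples=100000, clock=time.CLOCK_MONOTONIC):
    """Histogram of differences between back-to-back clock reads, in ns."""
    deltas = Counter()
    for _ in range(samples):
        time.clock_gettime_ns(clock)           # warm up cache
        time.clock_gettime_ns(clock)
        start = time.clock_gettime_ns(clock)
        stop = time.clock_gettime_ns(clock)
        deltas[stop - start] += 1
    return deltas


if __name__ == "__main__":
    hist = read_deltas()
    # Print the ten most common differences, smallest first.
    for ns, hits in sorted(hist.most_common(10)):
        print("%8d %8d" % (ns, hits))
```

A difference of 0 in the output corresponds to a duplicated sample, that is, two reads that returned the same time.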

At one point in time, I got different results when I reran a test.  Sometimes 
it would get duplicate samples.  Try again and they didn't happen.  That was 
on a Pi 1.  Ah.  I just got another example.

       res   avg   min   dups  CLOCK
         1  1069   999  52564  CLOCK_REALTIME
  10000000   820   898    -84  CLOCK_REALTIME_COARSE
         1  1075   999  12920  CLOCK_MONOTONIC
         1  1042  1000  62481  CLOCK_MONOTONIC_RAW
         1  1521   999         CLOCK_BOOTTIME

Histogram: CLOCK_REALTIME, 1 ns per bucket, 1000000 samples.
      ns      hits
     999      9577
    1000    930001
    1999       149
    2000      6856
    2999         2
    3000       152
49105 samples were duplicated.
4158 samples were bigger than 3498.


       res   avg   min   dups  CLOCK
         1  1347   999         CLOCK_REALTIME
  10000000  1028   890   -109  CLOCK_REALTIME_COARSE
         1  1349   999         CLOCK_MONOTONIC
         1  1279  1000         CLOCK_MONOTONIC_RAW
         1  1838   999         CLOCK_BOOTTIME

Histogram: CLOCK_REALTIME, 1 ns per bucket, 1000000 samples.
      ns      hits
     999      8357
    1000    765081
    1999      4814
    2000    215894
    2999        21
    3000       632
5201 samples were bigger than 3498.

I'm guessing that it's something like the cache conflicts changing because of 
address space randomization.






Re: logging

2019-04-14 Thread Hal Murray via devel
> Also the PLL goes up and offsets rise. (just like before)

Another way to maybe learn something.

Can you grab a copy of
  http://users.megapathdsl.net/~hmurray/time-nuts/60Hz/60Hz.py

It's a hack to measure line frequency using the PPS capture logic.  The idea 
is to turn that inside out and use it to measure the CPU frequency by watching 
a known-good PPS.

You may have to change the assert to clear - ntpd disables the one it isn't 
using.

If you stop ntpd, nobody should change the drift.  You can use ntptime -f 0 to 
clear it.

-

You don't actually need the program.  It's just what I use to log stuff so I 
can feed it to gnuplot.

You can get the data with:
  cat /sys/class/pps/pps0/assert
It will show something like:
  1555283247.999730528#84947
The number after the # is the number of pulses.  The number before is the time 
stamp of the last pulse.
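
If you want to turn two of those readings into a rate, something like this would do.  It is a minimal sketch; the function names are mine, and the second sample line in the example is invented for illustration.

```python
def parse_assert(line):
    """Split 'seconds.nanoseconds#count' into (time_ns, pulse_count)."""
    stamp, count = line.strip().split("#")
    sec, nsec = stamp.split(".")
    nsec = nsec.ljust(9, "0")[:9]   # the fraction is nanoseconds
    return int(sec) * 1000000000 + int(nsec), int(count)


def pulses_per_second(first, second):
    """Average pulse rate between two readings of the assert file."""
    t0, n0 = parse_assert(first)
    t1, n1 = parse_assert(second)
    return (n1 - n0) * 1e9 / (t1 - t0)


if __name__ == "__main__":
    a = "1555283247.999730528#84947"
    b = "1555283257.999730528#84957"   # made-up second reading
    print(parse_assert(a))
    print(pulses_per_second(a, b))
```

With a known-good PPS, any departure of the measured rate from exactly 1 pulse per second is the error of the local clock over that interval.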




Re: logging

2019-04-14 Thread Hal Murray via devel
> No, that description only holds for what are called "coarse" clocks.

Do you understand this area?

I think the term I've been missing is "dither".  I don't understand that area 
well enough to explain it to anybody.  Interesting timing.  I was at a talk a 
few weeks ago that covered dithering.  The context was audio processing and I 
wasn't smart enough to notice the NTP connection.

-

The idea with dithering is to add noise in bits below what you can measure.

There are several interesting quirks with the current code.

There isn't a sensible way to measure the step size of the system clock.  The 
current code fuzzes below the time to read the clock.  That has nothing to do 
with the clock tick size.  It's just a pipeline delay.  And the time-to-read 
that it uses includes lots more than just raw reading the clock: 150 ns vs 30 
ns on a reasonably modern Intel CPU.

You can see the actual clock step size in the histogram output of attic/clocks.
I'm not sure how to automate that.

I haven't studied what ntpd does with coarse clocks.  I don't have a sample to 
test with.  The step size on Intel HPET  is ~70ns.  The time to read it is 
500-600 ns.

Step size on a Pi 1 is 1000 ns.  ~50 ns on a Pi 2 and Pi 3.


With the current code, get_systime() is fuzzed.  It's called from quite a few 
places.  The only ones that need fuzzing are the ones used for timekeeping.  
There are only 2 of those, one for sending requests and the other for sending 
replies.  The 2 packet receive time stamps don't get fuzzed.  Neither do the 
PPS time stamps.

There are several calls in the NTS code - just measuring elapsed times to do 
KE.  The API is convenient.  We should set up something to avoid fuzzing.






Re: libntpd contents and the build order of doom

2019-04-23 Thread Hal Murray via devel


Ian said:
> In trying to write tests for nts_client.c I have run into a problem I do  not
> know how to solve, as it involved much of the structure of the  codebase as
> well as the build system.

> Some of the code in nts_client.c calls the dns_take* series of  functions.
> These functions are defined in ntp_proto.c.

> nts_client.c is listed in the build system as part of the libntpd_obj
> target. ntp_proto.c is in the ntpd target.

> When ntpd is built this works because ntp_proto.c in being built, but  when
> the tests are built it is not, and the build fails as a result. 

As Gary suggested, your test environment wants to catch dns_take_*
You probably want them to be no-ops for now, but if you get far enough, that's 
how the main thread learns the answer from all the work that the DNS or NTS 
thread has done.

The flow goes like this...

The transmit routine that would normally send a client/request packet, notices 
the DNS or NTS flag and calls out to a routine that sets up a new thread to do 
the work.  There is only one DNS/NTS thread, so that has to get checked for.

Communication between main and DNS/NTS worker thread is via global variables.

When the worker thread is finished, it leaves info in global variables and 
sends a signal to the main thread.  The signal handler sets a flag.  The main 
loop (eventually) notices that flag and calls over to the DNS/NTS module which 
looks at the global variables and calls back in to the dns_take_ routines.
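
The flow above, boiled down to a toy.  This is a sketch only: ntpd is C and uses a real signal plus a flag, while here a threading.Event stands in for both, and all the names are made up.

```python
import threading

result = None                    # filled in by the worker thread
work_done = threading.Event()    # stands in for the signal and the flag


def worker(name):
    """Pretend to do the slow DNS or NTS-KE work."""
    global result
    result = "answer for " + name
    work_done.set()              # tell the main thread we are finished


def take_result(value):
    """Stand-in for the dns_take_* callbacks in the main thread."""
    return value


def main_loop(name):
    thread = threading.Thread(target=worker, args=(name,))
    thread.start()
    while not work_done.wait(timeout=0.1):
        pass                     # the real main loop does other work here
    thread.join()
    work_done.clear()            # ready for the next (single) worker
    return take_result(result)


if __name__ == "__main__":
    print(main_loop("example.com"))
```

For a test harness, the only piece that has to exist is the take_result() stand-in, and it can be a no-op until the tests get far enough to care about the answer.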


There is at least one other place where the test routines have a routine to 
keep the linker happy, but I can't think of what it is.




WWVB (from NANOG)

2019-05-10 Thread Hal Murray via devel
> An eBay search for "EverSet ES100 WWVB BPSK Phase Modulation Receiver Kit"
> should prove fruitful. I have one - but I haven't had time to tinker with
> it yet.
> 
> The kit comes with the double-antenna setup that appears to be key to the
> improved reception. In the clocks, the antennas are at 90 degrees relative
> to each other.

Alas. In concept, that is extremely interesting. But a bit too
bare-metal for me; first I'd have to recruit help to design and build
it into something one of my computers can talk to.

OTOH, I have written successful I2C code; if something like this
hardware were a Raspberry Pi HAT I'd have bought one before I finished
typing this reply and probably have a test system up in 24 hours. So
it's close.  Real close.  Relevant link:

--

You should be able to plug it into a Pi without a hat.  It takes 3 wires: 
ground, power, and signal.

Connect the signal to a PPS pin.

I've seen code to decode WWVB but don't remember where.  My first guess would 
be one of the drivers we dropped.



Re: WWVB (from NANOG)

2019-05-10 Thread Hal Murray via devel


> You should be able to plug it into a Pi without a hat.  It takes 3 wires:
> ground, power, and signal. 

I screwed up.  It's 2 signal wires.

I got one when they were announced, but it fell through the cracks.  Time to 
clean up my desk.




Talk at Stanford: Nanosecond-level Clock Synchronization in a Data Center

2019-05-13 Thread Hal Murray via devel


https://www.youtube.com/watch?v=Opf9CBwP5R8

Several ideas.

One is to use the PTP-style time-stamping available on many modern Ethernet 
interfaces.  Does anybody know how that works?  API?  I assume there is a 
counter in the Ethernet hardware.  How does that counter get converted into a 
time stamp?

Their other approach is to collect much more data.  There are a couple of good 
slides showing the offset with an empty band in the middle.





Re: Talk at Stanford: Nanosecond-level Clock Synchronization in a Data Center

2019-05-13 Thread Hal Murray via devel

> I'm not going to look at that stuff on YouTube… any link to oldfashioned
> non-multimedia? 

Here is a Usenix paper that covers the same ground:
  https://www.usenix.org/system/files/conference/nsdi18/nsdi18-geng.pdf

The interesting graphs are on page 7 (86).




Re: NTS and TLS

2019-06-13 Thread Hal Murray via devel
> I think the other end is on TLS 1.3 only, but my end only supports TLS 1.2

Well, if that's the setup, it's not going to work.  It should be possible to 
set up a test case.

We might be able to produce a better error message.  What was the surrounding 
log info?
(That stuff has fallen out of my cache.)
 



Re: NTS and TLS

2019-06-13 Thread Hal Murray via devel
> I'm inclined to ignore TLS 1.3 until the openssl 1.1.1 bugs are worked out.

I've forgotten that stuff.  What is/was the problem?  OpenSSL is up to 1.1.1c.
I have a vague memory of 1.1.1a fixing something we were interested in.




Re: ntpdig and NTS

2019-06-16 Thread Hal Murray via devel


>> Does ntpdig know about NTS?  Seems like it should be able to use NTS.
> That would be pretty tricky, actually.  We'd have to expose the NTS crypto
> and key-exchange primitives as a C extension to Python. 

Another possibility...

On my back burner, is a pair of NTS-KE programs, one for server and the other 
for client.  The idea is simple code and lots of printout, sample code and 
such.  Mostly, the sort of thing I was looking for when I started trying to 
write code.

It would be reasonable to have the client side write out the keys and cookies. 
 Then you would have to expose the crypto routines.




Re: ntpdig and NTS

2019-06-16 Thread Hal Murray via devel


> Yeah, but I'm not sure it would be efficient to pull the trigger on anything
> like this until we figure out how many months out the Go port is. 

We could also use the "simple" client/server NTS-KE samples as warm ups on the 
Go port.  I think the client side is pretty simple - OpenSSL does all the 
work.  The server side is a bit more complicated.  It needs the crypto 
routines to make the cookies.



Re: ntpdig and NTS

2019-06-17 Thread Hal Murray via devel


> Which means it's time for a serious on-list conversation about what our next
> major objective beyond wrapping up NTS is.

Other ideas to consider...

Randomize client side ports.  (big messy discussion on IETF list)

We may want/need servers supporting NTS to support a non-standard port number, 
probably in addition to rather than instead of 123.  That's a hack to bypass 
filtering in various places to prevent the DDoS amplification from ages ago.  
I gather it's not uncommon to filter packets to/from port 123 longer than 48 
bytes which drops NTP packets using NTS.






Release for NTS?

2019-06-17 Thread Hal Murray via devel


NTS has been working for a while without any serious problems.  We may have to 
tweak a few details when the actual RFC gets published but it seems unlikely 
there will be any major changes.

Is there any reason not to do a release now/soon?

If not, I'll tweak the NTS documentation to be post-release.


-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


docs/NTS-QuickStart.adoc

2019-06-20 Thread Hal Murray via devel


I made an update pass.  More eyeballs would be good.


-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Cloudflare announces public NTP/NTS servers

2019-06-21 Thread Hal Murray via devel


Introducing time.cloudflare.com
  https://blog.cloudflare.com/secure-time/amp/



-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Polling interval, Allan Deviation

2019-06-26 Thread Hal Murray via devel


Consider something like the collection of samples from a PPS.  Assume ntpd is 
not running so the system clock is not bouncing around.

There is some noise with each sample.  If you collect several samples, you can 
average them to get a better number.  You probably want 2 numbers rather than 
1, a straight line fit, time offset and frequency offset, rather than just a 
simple average time offset.
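
A minimal sketch of that straight-line fit, assuming a list of (time, offset) samples in seconds.  The function name and the sample values are made up for illustration.  The intercept is the time offset and the slope is the frequency offset.

```python
def fit_offset_and_frequency(times, offsets):
    """Least-squares straight line through (time, offset) samples.

    Returns (time_offset, freq_offset): the fitted offset at t = 0 and the
    slope in seconds of offset per second of elapsed time.
    """
    n = len(times)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(times) / n
    mean_x = sum(offsets) / n
    # Slope = covariance(t, x) / variance(t)
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (x - mean_x) for t, x in zip(times, offsets))
    freq_offset = sxy / sxx
    time_offset = mean_x - freq_offset * mean_t
    return time_offset, freq_offset


# Hypothetical PPS samples: 5 us initial offset, drifting at 2 ppm.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
offsets = [5e-6 + 2e-6 * t for t in times]
time_offset, freq_offset = fit_offset_and_frequency(times, offsets)
```

With real data the samples are noisy, so the fit only gets better up to the point where drift starts to bend the line.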

But if you average over too many samples, the temperature will change or the 
crystal on your system will drift with age.  There is a sweet spot in the 
middle between too few samples and too many.

The graph of goodness vs averaging time is called the Allan Deviation.
  https://en.wikipedia.org/wiki/Allan_variance

http://www.leapsecond.com/pages/adev/adev-why.htm
http://www.leapsecond.com/pages/adev-avg/
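
For anyone who wants to try this on their own data, here is a rough sketch of the overlapping Allan deviation computed from phase (time offset) samples.  It assumes the samples are evenly spaced tau0 seconds apart.  The function name is mine, and this is the textbook formula rather than anything taken from ntpd.

```python
import math


def allan_deviation(phase, tau0, m):
    """Overlapping Allan deviation at averaging time tau = m * tau0.

    phase: time-offset samples in seconds, evenly spaced tau0 seconds apart.
    m:     averaging factor (number of samples per averaging interval).
    """
    n = len(phase)
    if m < 1 or n < 2 * m + 1:
        raise ValueError("not enough samples for this averaging factor")
    tau = m * tau0
    total = 0.0
    for i in range(n - 2 * m):
        # Second difference of the phase over two adjacent intervals.
        d = phase[i + 2 * m] - 2.0 * phase[i + m] + phase[i]
        total += d * d
    return math.sqrt(total / (2.0 * (n - 2 * m) * tau * tau))


# Hypothetical data: pure linear frequency drift of 1e-9 per second,
# which gives phase x(t) = 0.5 * D * t**2 and ADEV = D * tau / sqrt(2).
drift = 1e-9
samples = [0.5 * drift * i * i for i in range(21)]
adev = allan_deviation(samples, tau0=1.0, m=2)
```

Running it for a range of m values and plotting the results on log-log axes gives the V-shaped graph described in those links.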



You can stand on your head, or turn things inside out, or ...  If the PC clock 
is "good", you can collect data on a less good external time source.  If your 
external time source is good, you can collect data on your PC clock.



Another way of looking at this problem is that you want your PLL to filter out 
the high frequency noise, but let the low frequency drift through so the PLL 
can fix it.  There is a sweet spot on the filter bandwidth or time constant.  
That's at the bottom of the V of the ADEV graph.



That all assumes things are nice in a mathematical sense.   Normal 
distribution and such.

If you are working with a PC, the lab temperature can change or the PC 
temperature can change when the CPU does some work.  Optimizing the polling 
interval for good conditions may (will?) not work well when conditions are not 
nice.

-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Anybody know anything about Windows?

2019-06-29 Thread Hal Murray via devel


Has anybody tried building ntpsec on Windows?  Cygwin?  I'm just curious.  How 
close are we?

-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Compiler warnings

2019-07-11 Thread Hal Murray via devel


Is there a reason that warnings don't default to on?


When configured with --enable-warnings, I get this on an old gcc.
  gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)

../../ntpd/ntp_wrapdate.c: In function 'eval_gps_time':
../../ntpd/ntp_wrapdate.c:226: warning: declaration of 'refclock_name' 
shadows a global declaration
../../include/ntp_refclock.h:192: warning: shadowed declaration is here

It looks like a legitimate warning to me.

The question is why don't we get similar warnings on newer compilers?



-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Anybody know anything about flatpak?

2019-07-11 Thread Hal Murray via devel


Description  : flatpak is a system for building, distributing and running
 : sandboxed desktop applications on Linux. See
 : https://wiki.gnome.org/Projects/SandboxedApps for more
 : information.

I'm guessing it's targeted at things more complicated than ntpq.


-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Testing

2019-07-12 Thread Hal Murray via devel
(Context is that I went to edit a config file to test something and I ran into 
some cruft leftover from testing something else.)

Handwave...

There are a zillion corner cases that I'd like to be able to test.  A typical 
example is something like: with configuration X, Y should happen.  You can 
check that by looking in the log file or using ntpq.

With something like a compiler that shouldn't be timing dependent, it's 
straightforward to set up a system for feeding a small chunk of code to the 
compiler and comparing the output to a known good pattern.  Once you have 
that, then you can collect all the small chunks used to diagnose bugs and add 
a wrapper to run the whole collection.

But NTP is all about timing.

Eric: What is the name/term for your attempt at capturing and replaying 
things?  Is there a good writeup of why it didn't work?

Anybody know anything about finding things in log files?  I think I could 
set up a fuzzy pattern to search for.  I can think of 3 results: pass, fail, 
and timeout.
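
A rough sketch of what that three-way check could look like, assuming the log is fed in line by line and that the caller stops feeding lines when its deadline expires.  The function name and patterns are hypothetical.  The sample lines are in the style of ntpd.log.

```python
import re


def check_log(lines, pass_pattern, fail_pattern=None):
    """Return 'pass', 'fail', or 'timeout' for an iterable of log lines.

    'timeout' means the lines ran out before either pattern matched.
    """
    pass_re = re.compile(pass_pattern)
    fail_re = re.compile(fail_pattern) if fail_pattern else None
    for line in lines:
        if fail_re and fail_re.search(line):
            return "fail"
        if pass_re.search(line):
            return "pass"
    return "timeout"


LOG = [
    "26 Jul 13:43:10 ntpd[714]: NMEA(0) serial /dev/gps0 open at 9600 bps",
    "26 Jul 13:43:11 ntpd[714]: PROTO: NMEA(0) 801b 8b clock_event clk_bad_format",
    "26 Jul 13:44:10 ntpd[714]: PROTO: NMEA(0) 8014 84 reachable",
]
PASS = r"PROTO: NMEA\(0\) \S+ \S+ reachable"
FAIL = r"clk_bad_format"
verdict = check_log(LOG, PASS, FAIL)
```

The regular expressions are the "fuzzy" part: timestamps and PIDs are simply not mentioned in the pattern, so they can change from run to run.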

Matching ntpq output gets more complicated.  I think we would need a way to 
identify slots and constrain the value.
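
One possible shape for the slot idea, as a sketch only.  It assumes the usual ten columns of the ntpq -p peer display (remote, refid, st, t, when, poll, reach, delay, offset, jitter) with an optional tally character in front.  The host name, values and constraint table are invented for the example.

```python
COLUMNS = ("remote", "refid", "st", "t", "when", "poll", "reach",
           "delay", "offset", "jitter")


def parse_peer_line(line):
    """Split one peer line from `ntpq -p` into named slots (all strings)."""
    tally = " "
    if line and not line[0].isalnum():
        # Leading tally character such as '*', '+', '-' or 'x'.
        tally, line = line[0], line[1:]
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError("unexpected peer line: %r" % line)
    peer = dict(zip(COLUMNS, fields))
    peer["tally"] = tally
    return peer


def failed_slots(peer, constraints):
    """Return the names of slots whose value violates its constraint."""
    return [name for name, ok in constraints.items() if not ok(peer[name])]


# Hypothetical peer line and constraints (offset and jitter in ms).
SAMPLE = "*ntp-a.example.net .GPS.   1 u   29   64  377    0.512   -0.043   0.021"
CONSTRAINTS = {
    "tally": lambda v: v == "*",
    "reach": lambda v: v == "377",
    "offset": lambda v: abs(float(v)) < 1.0,
    "jitter": lambda v: float(v) < 1.0,
}
result = failed_slots(parse_peer_line(SAMPLE), CONSTRAINTS)
```

Slots that are not listed in the constraint table (when, delay, ...) are free to take any value, which is what makes the match tolerant of timing noise.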

If I had something like that working in my environment, how would we run as 
much as possible on your environment?  Maybe variables in the config file for 
server names?  So I could use my local servers and you could use yours.

Does any of that make sense?


-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Re: Testing

2019-07-13 Thread Hal Murray via devel


e...@thyrsus.com said:
> https://blog.ntpsec.org/2017/02/22/testframe-the-epic-failure.html
> Read that and think about it for a while.  This is a very hard problem.  I
> hit it and bounced.

Thanks.

From the blog page:
> In effect, the entire logic of the sync algorithms is a gigantic free
> parameter with no real equivalent in the simple, straight-line data
> transformations of gpsd, and only a weak analogy with the somewhat more
> complex but variable-free ones of reposurgeon. 

> Under these assumptions, there is some mutation rate threshold above which
> attempting deterministic replay simply stops being useful at all, because the
> gains from it are exceeded by the complexity costs of updating tests. GPSD
> and reposurgeon are well below that threshold; I now think ntpd is above it. 

I think I understand the ideas, but it's not making sense.

I agree that the core FSM is complicated, but how often do we change it?  I 
can't think of any changes, aka the mutation rate is close to zero.  Things 
like NTS are on the periphery.  It would be great to be able to run regression 
tests after adding NTS.  Yes, we would have to add new test cases in order to 
test NTS, but all the old tests should keep on working.


-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Re: Testing

2019-07-13 Thread Hal Murray via devel


e...@thyrsus.com said:
> A lot of configuration options - even things like minsane - effectively
> change the FSM. 

Right.  But as you said, that's a configuration option.

> Sure, you can think of the config as part of the input state - this isn't a
> code mutation. But it also means you can only ever test very tiny parts of
> the input-state space, with no way to know when a config change might produce
> a boojum and typically no way to have real confidence about how a test relates
> to behavior under any change in ...

Sure, but we can't currently test any of the state space.  I'd be very happy if 
we could test parts of the state space that are known to be interesting.

Your writeup focuses on code mutations rather than state space.  (Or maybe I 
didn't read what you intended.)

How do code mutations interact with the state space?  Do you have an example of 
a code mutation that would change things and that we wouldn't want to know 
about?

I expect changes in the logging would be the most common problem.  In most 
cases, I'd expect it would be a simple eyeball check on a diff and poke a 
button to accept the new version.  Did TESTFRAME have separate logging for the 
gettime call used for logging?



The "known to be interesting" phrase gets back to my query that started this 
thread.  I'm looking for a way to test corner cases.  Would TESTFRAME have 
done that?

If we don't like TESTFRAME, what else can we do?

Can we look for patterns in the log file from a live run?  This has the 
advantage of not requiring any changes to ntpd.

It gets complicated with lost packets and widely variable timing on the big bad 
internet.  A local net might be stable enough.

Suppose that works.  Can we describe the required configuration so that tests 
that start on my environment can be run on other environments?  I'm thinking of 
things like $BOB is a local stratum 2 server, $TED is local stratum 3.  ...
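
A small sketch of how the $BOB / $TED idea might work, assuming the test configs are kept as templates and each site supplies its own server names.  The host names are placeholders and the helper name is made up.

```python
from string import Template

# Test config written once, with site-specific servers left as variables.
CONFIG_TEMPLATE = Template("""\
server $BOB iburst
server $TED iburst
""")


def render_config(template, site):
    """Fill in a site's server names.

    substitute() raises KeyError if the site has not defined a name,
    so a missing server shows up before ntpd is ever started.
    """
    return template.substitute(site)


# Hypothetical site definition: BOB is a local stratum 2, TED a stratum 3.
my_site = {"BOB": "ntp-a.example.net", "TED": "ntp-b.example.net"}
config_text = render_config(CONFIG_TEMPLATE, my_site)
```

Each environment would then only need its own small site file, and the test cases themselves could be shared.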

---

I'd like a way to test various OSes and hardware platforms.  I can think of two 
interesting areas.

One is the basic timekeeping - ntp_adjtime and friends.  Does normal ntpd work 
in some simple configuration?

The other is OpenSSL and friends.  That library gets a lot of attention, but 
it's large and we could be hitting a corner that doesn't get tested.  Various 
OSes/distros are running different versions and different patch levels and our 
code could have bugs in my attempts to dance around API changes.
 

-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Re: Testing

2019-07-14 Thread Hal Murray via devel


> Especially the idea of verifying key parts of the state space, even if we
> can't verify it all. And especially if there was a way to usefully log the
> relative timing of various important state transitions.  (That is something
> on the wishlist of the AWS NTP Kronos team.) 

What are they looking for?

I think all the major state transitions get logged in protostats.  (if enabled)

-

Eric:  Did your work on TESTFRAME capture the calls to the stats routines?  
(or their output, if enabled)


-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Re: Testing

2019-07-14 Thread Hal Murray via devel


> It's...hm...maybe a good way to put it is that the structure of the NTPsec
> state space and sync algorithms is extremely hostile to testing.

I still don't have a good understanding of why TESTFRAME didn't work.  I can't 
explain it to somebody.

We've got
  code mutations
  hidden variables in the FSM
  hostile

So what makes it hostile?  Is it more than just complexity?

Why isn't this sort of testing even more valuable when things get complex?

-

> If you try to do this kind of eyeballing in NTPsec it will make your brain
> hurt.  It's not just that the input and output packets are binary, that's
> superficial and fixable with textualization tools I can write in my sleep.
> Fine, let's say you've done that. You've got an interleaved stream of input
> and output timestamps.  How do you reason through the sync algorithms to know
> whether the relationships are correct? 

How do we tell that it is working without TESTFRAME?  I eyeball ntpq -p and/or 
graphs of loopstats and friends.  That's using the stats files as a summary of 
the internal state.

Did TESTFRAME capture the stats files?

With a bit more logging, we could probably log enough data so that it would be 
possible to do the manual verification of what is going on.  We would have to 
write a memo explaining how it works, maybe that would include chunks of pseudo 
code.

How much of the problem is that Eric didn't/doesn't understand the way the 
inner parts of ntpd work?  I've read the descriptions many times but I still 
don't understand it well enough to explain it to somebody.  Maybe I could work 
up a presentation with the code in one hand and the descriptions in the other 
hand.  It would take a while.  That is, I know the general idea and recognize 
all the pieces but don't have a good feel for how the pieces fit together to 
make up the big picture.


> Not only are there time-dependent hidden inputs to the computation from the
> kernel clock and PLL, but they're going to be qualitatively different
> depending on whether you have an adjtimex or not.

There wasn't supposed to be anything hidden.  TESTFRAME was supposed to 
intercept all the relevant calls like getting the time from the kernel.

I'm pretty sure we gave up on systems that don't support adjtimex.  OpenBSD 
doesn't have it, but does have enough to slew the clock.  We dropped support 
for OpenBSD when that shim was removed.



How far did you get with TESTFRAME?  Do you remember why you decided to give 
up?  Was there something in particular, or did you just get tired of banging 
your head against the wall?

How many lines of code went away when you removed it?

Would it be interesting for me to take a try?  Now isn't a good time and there 
may be more important things to work on, but I think we should explore and 
understand this option.



But back to the big picture.  How can we test corner cases?

Is it reasonable to look for patterns in the log file?

Is it reasonable to look for patterns in the output of ntpq -p?  Graphs?

When you do a Go port, what can you do to make testing easier?


-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Re: Testing

2019-07-14 Thread Hal Murray via devel
> Can you get them to specify exactly what they want?

One thing to add to the list if you are going to collect NTP data...

If you know that the clocks at both ends are accurate, rawstats will give you 
the transit times in each direction.

NTP assumes the transit times in each direction are equal.  There are two 
common reasons that isn't true.  One is asymmetric routing.  The other is 
queuing delays.
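
To make that concrete, here is a sketch of the arithmetic on the four timestamps of one exchange.  The names t1..t4 and the sample values are mine.  Pulling the timestamps out of a rawstats line is left out.

```python
def transit_times(t1, t2, t3, t4):
    """One NTP exchange, all timestamps in seconds.

    t1: client transmit    t2: server receive
    t3: server transmit    t4: client receive
    """
    outbound = t2 - t1            # client -> server (needs both clocks right)
    inbound = t4 - t3             # server -> client (needs both clocks right)
    delay = (t4 - t1) - (t3 - t2)           # round trip, independent of clocks
    offset = ((t2 - t1) + (t3 - t4)) / 2.0  # what NTP computes
    return outbound, inbound, delay, offset


# Hypothetical asymmetric path: 10 ms out, 30 ms back, clocks in agreement.
outbound, inbound, delay, offset = transit_times(0.000, 0.010, 0.011, 0.041)
```

With both clocks correct, the computed offset comes out as half the difference between the two directions (here -10 ms), which is exactly the error the equal-transit assumption introduces.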

-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Stanford talk: Jupyter Notebooks, Fernando Perez and Guido van Rossum

2019-07-15 Thread Hal Murray via devel
Mark's comment about lots of data reminded me that I meant to send this a 
month ago.  I guess it fell through the cracks.

Most of the talk is about open systems, not much about Jupyter itself.  
Nothing NTP related.  The first 30 minutes is Guido then Fernando describing 
history and culture.  The last 40 minutes is Q&A with crappy audio pickup of 
the questions.

Anybody know anything about Jupyter Notebooks?  Can we use it to visualize NTP 
data?

--

Jupyter Notebooks and Academic Publication
  https://www.youtube.com/watch?v=BWlJFy3Waro

Fernando Perez and Guido van Rossum

https://ee.stanford.edu/event/seminar/ee380-computer-systems-colloquium-jupyter-notebooks-and-academic-publication

---

Berkeley now has a major in Data Science.  Fernando has a good talk: 

When Jupyter Becomes Pervasive at a University?
  https://www.youtube.com/watch?v=Wd6a3JIFH0s
15 minutes.

The entry level undergraduate class is 1200 students each semester!

An interesting problem.  Undergraduates now know more about this area than some 
faculty or PIs.




-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Re: Testing

2019-07-15 Thread Hal Murray via devel


tenterl...@gmail.com said:
> I come from a scientific background, where we compare results somewhat as
> analog values. If the test result is off the expected by 1000%, that's bad.
> If it's off 1%, better. If the error is .1%, probably within  achievable
> accuracy. 

There is a difference between running the same experiment again to get new 
data and running new software on old data.

Are the specs and implementation for IEEE floating point tight enough so that 
I should get the exact same result if I run a test on a different CPU chip?  
Or is there room for things like holding extra bits in temporary results so 
the bottom bits might be different due to round off or such?


-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Re: Testing

2019-07-23 Thread Hal Murray via devel


Sorry for the delay.  I got distracted on other things.

While I think of it, did TESTFRAME dump floating point numbers in ASCII or 
hex?  If ASCII, there are likely to be round off errors when you read them 
back in.  Were there any floating point numbers that were read back in?  (as 
compared to loopstats where they are written out)
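
A quick illustration of the round-off concern, using Python rather than whatever TESTFRAME actually did (I don't know its format).  The helper names are made up.  Fixed-precision decimal text loses bits, while hex text or 17 significant digits reads back exactly.

```python
def roundtrip_decimal(value, digits=6):
    """Write with a fixed number of decimals, then read back."""
    return float("%.*f" % (digits, value))


def roundtrip_hex(value):
    """Write as C99-style hex float text, then read back."""
    return float.fromhex(value.hex())


x = 1.0 / 3.0
lossy = roundtrip_decimal(x)      # 0.333333, no longer equal to x
exact_hex = roundtrip_hex(x)      # bit-for-bit the same double
exact_17g = float("%.17g" % x)    # 17 significant digits also suffice
```

So ASCII is fine for replay as long as enough digits are written; the loopstats-style short decimals are only safe for output that is never read back.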


> How do you tell that any given capture represents correct operation? What
> check do you apply to the captured I/O history to verify that the sync
> algorithms were functioning as intended when the capture was taken?  *That's*
> the hard part - not the mechanics of TESTFRAME  itself, which is just
> tooling. 

OK, maybe we are getting someplace.

Is this a half full vs half empty problem?  I'm willing to assume that it is 
working correctly unless we get hints otherwise.

I'm trusting the guy who collects and contributes the sample data to verify 
that things are working correctly by looking at ntpq printout and/or graphs 
from log files and/or carefully reading ntpd.log and/or maybe other sources.  
This isn't a proof, but it's what we are currently using - the best we can do 
today.

In this context, there are two types of log files.  loopstats is probably best 
checked graphically.  ntpd.log may have single events or several related 
events that indicate that something happened.


> (You still don't know how to compose captures to trigger specified corner
> cases, but there's no point in worrying about that problem until you have
> your check procedure.)

Maybe "corner case" isn't the right term for the things I've been thinking 
about.  There are lots of cases that are reasonably easy to set up, but since 
there are a lot of them it would be nice if we could automate the procedure so 
I didn't have to go through the whole list every time we do a release.  
Testing at commit/push time is just a bonus.  Examples:
  Does "pool" work?
  Do all the forms of crypto work?
  All OSes/distros?
  Does a single server work?
  Does local clock work?
  Can 2 truechimers outvote 1 falseticker?

There may be actual obscure corner cases that are hard to generate.  I'd be 
willing to add logging for them so we can tell if an operational system hits 
them.

Do you remember anything about performance when collecting data?  I assume it 
would be reasonable to collect data on a client.  How much would it slow down 
a server?  How much disk space would a test require?

--

Things are likely to break if we change a constant deep in the system or add a 
sanity check for an obscure case.  We haven't done that in a long time, so I'm 
not too worried about that problem.  Breaking tests may be a feature rather 
than a bug.  It will make us think twice about making that change.  If we 
decide the change is worthwhile, then we have to start the collection over.



-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


FWD: [Ntp] NTS in Go

2019-07-25 Thread Hal Murray via devel


--- Forwarded Message

From: Michael Cardell Widerkrantz 
To: n...@ietf.org
Date: Thu, 25 Jul 2019 11:45:52 +0200
Message-ID: <877e86qwfj@tp1.hack.org>
Subject: [Ntp] NTS in Go

Martin Samuelsson, Daniel Lublin and I participated remotely during the
recent IETF hackathon. Our friend omni participated for a while as well.
Some results:

- A friendly fork of beevik/ntp with NTS support:

  https://github.com/mchackorg/ntp

  Use it like this:

    options := ntp.QueryOptions{ NTS: true, C2s: c2sKey, S2c: s2cKey }
    resp, err := ntp.QueryWithOptions(server, options)

  Authenticated time now available in resp.Time.
  
- An NTS-KE library:

  https://gitlab.com/hacklunch/ntske

- A small NTS client using the above libraries:

  https://gitlab.com/hacklunch/ntsclient/

This is still a work in progress but seems to work fine against for
example time.cloudflare.com:1234 and zoo.weinigel.se:4446.

The remote hackathon was sponsored by Netnod and held in its Malmö
office but most of the participants have no current relation to Netnod
and none of us work on this full time. Thanks to Netnod for sponsoring
our mini hackathon!

-- 
MC, https://hack.org/mc/
XMPP OTR: f4c09b50 e6d7b04f 7afd37c1 bd3a077e 5ea94a64

___
ntp mailing list
n...@ietf.org
https://www.ietf.org/mailman/listinfo/ntp


--- End of Forwarded Message



-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Interesting NMEA WNRO quirk

2019-07-26 Thread Hal Murray via devel
Anybody seen this yet?  (I assume not, or it would have been fixed.)

Has the new ntp_wrapdate.c been tested?  Should the warp be positive or 
negative?

This is on a 32 bit system.  The GPS chip is a MTK-3301

ntpq -p shows:
xNMEA(0)  .GPS.0 l   29   64  377   0. 5918409  24.7495

ntpd.log shows:

26 Jul 13:43:10 ntpd[714]: NMEA(0) serial /dev/gps0 open at 9600 bps
26 Jul 13:43:11 ntpd[714]: PROTO: NMEA(0) 801b 8b clock_event clk_bad_format
26 Jul 13:43:11 ntpd[714]: NMEA(0) Changed GPS epoch warp to -4096 weeks
26 Jul 13:44:10 ntpd[714]: PROTO: NMEA(0) 8014 84 reachable

58691 2651.935 127.127.20.1 $GPRMC,004411.000,A,3726.1087,N,12212.2399,W,0.00,2.07,111299,,,A*77  384 64 0 0 64 0

---

It was working May 22.
58625 67186.457 127.127.20.1 $GPRMC,183945.000,A,3726.0859,N,12212.2567,W,0.00,339.73,220519,,,A*7E  448 64 0 0 64 0

I don't have clockstats for May 23.

It's broken on May 24:
58627 1967.044 127.127.20.0 $GPRMC,003249.561,V,8960.,N,0.,E,0.00,0.00,081099,,,N*76  48 0 12 0 12 0
58627 86343.440 127.127.20.1 $GPRMC,235902.000,A,3726.0912,N,12212.2418,W,0.00,173.06,081099,,,A*7E  384 64 0 0 64 0
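
A sanity check on the size and sign of the error, as a sketch.  It assumes the date is the tenth comma-separated field of $GPRMC (ddmmyy) and picks an arbitrary two-digit-year pivot.  The helper names are mine.  The clockstats day number is an MJD.

```python
from datetime import date, timedelta


def rmc_date(sentence, pivot=80):
    """Date field (ddmmyy) of a $GPRMC sentence.

    pivot is an assumption for this sketch: yy >= pivot means 19yy.
    """
    ddmmyy = sentence.split(",")[9]
    day, month, yy = int(ddmmyy[0:2]), int(ddmmyy[2:4]), int(ddmmyy[4:6])
    year = 1900 + yy if yy >= pivot else 2000 + yy
    return date(year, month, day)


def mjd_to_date(mjd):
    """Modified Julian Day number to calendar date (MJD 0 = 1858-11-17)."""
    return date(1858, 11, 17) + timedelta(days=mjd)


SENTENCE = ("$GPRMC,004411.000,A,3726.1087,N,12212.2399,W,"
            "0.00,2.07,111299,,,A*77")
reported = rmc_date(SENTENCE)     # what the GPS chip says
logged = mjd_to_date(58691)       # day stamp from the clockstats line
gap = logged - reported
```

For the 26 Jul sample the chip is exactly 1024 weeks behind the system date, one 10-bit GPS week rollover, which suggests the needed correction is positive.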


-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Re: Does broadcast *server* mode still exist?

2019-08-18 Thread Hal Murray via devel


e...@thyrsus.com said:
> Go ahead.  Whatever broadcast code was left is pretty much a vermiform
> appendix, anyway. 

The only leftovers that I know about are packet types.  They may be used by 
ntpq.

I remember a comment about there being no way to do broadcast securely.  It 
would be good to include an expanded version of that in the documentation.


-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Re: Does broadcast *server* mode still exist?

2019-08-18 Thread Hal Murray via devel


e...@thyrsus.com said:
> That's covered. In the page on NTPsec changes:
> * Broadcast- and multicast modes, which are impossible to
>   secure, have been removed. 

I was looking for more information.  Why can't we secure it?

What's wrong with using a public/private key to sign each broadcast packet?

(It's hard to prove a negative like "impossible to secure", but maybe security 
geeks know things that I don't.)

---

I'm not sad to see broadcast modes gone.  It was tangled up with a state 
machine which I never really understood.

In general, broadcasting is evil.  That's another reason to drop it.

But there might be good reasons to use it.  Maybe simplifying the config file 
for some deployment applications?


-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Re: Does broadcast *server* mode still exist?

2019-08-18 Thread Hal Murray via devel
>> What's wrong with using a public/private key to sign each broadcast packet?

> However, there is not reasonable protection against delayed or replayed
> packets.

Thanks.  That's what I was looking for.




-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Re: Does broadcast *server* mode still exist?

2019-08-18 Thread Hal Murray via devel


>> I was looking for more information.  Why can't we secure it?
> Daniel explained it to me once, but I've forgotten the details. Perhaps he'll
> speak up. 

The delay/replay problem is fatal, at least with a simple public key system 
like I proposed.

There is probably something like a FAQ entry that explains that if you want to 
get time relevant data from A to B, you have to start by sending something 
from B to A, a nonce if nothing else.

You could eliminate duplicates by having the sender include a sequence number. 
 You would have to add a dance to get started.

I don't see how to protect against delays without sending something from B to 
A -- or knowing the time.
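
A toy sketch of why the B-to-A step matters.  It uses an HMAC with a shared key as a stand-in for a real signature (not the public-key scheme I proposed), and all names are invented.  The point is only that a reply bound to the client's fresh nonce cannot be replayed to answer a later request.

```python
import hashlib
import hmac
import os

KEY = b"shared-secret-for-illustration-only"


def make_reply(nonce, timestamp, key=KEY):
    """Server side: authenticate the timestamp together with the nonce."""
    body = str(timestamp).encode()
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return body, tag


def verify_reply(nonce, body, tag, key=KEY):
    """Client side: accept only a reply made for the nonce we just sent."""
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


# First exchange: client picks a fresh nonce, server answers.
nonce1 = os.urandom(16)
body, tag = make_reply(nonce1, 1566172800)
fresh_ok = verify_reply(nonce1, body, tag)

# Later exchange: an attacker replays the old reply against a new nonce.
nonce2 = os.urandom(16)
replay_ok = verify_reply(nonce2, body, tag)
```

A one-way broadcast has no nonce to bind to, so an authenticated packet stays valid no matter how long an attacker holds on to it.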



>> I'm not sad to see broadcast modes gone.  It was tangled up with a
>> state machine which I never really understood.

> And may no longer exist since Daniel's massive refactor of the protocol
> engine! 

I removed the state machine after we had removed enough stuff (like broadcast 
and peers) so that the remaining cases were simple enough to understand.  That 
was a while ago.


-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Re: Does broadcast *server* mode still exist?

2019-08-19 Thread Hal Murray via devel


> I was not sure if broadcast as a server was dropped for similar reasons. 

The text for broadcast and multicast is not in the keyword-generating file.


-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Re: ✘NTS and ALPN

2019-08-19 Thread Hal Murray via devel


> But, it will break existing NTPsec NTS.  So upgrade to git head now if you
> use NTS. 

What's the nature of the breakage?


-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Re: ✘NTS and ALPN

2019-08-20 Thread Hal Murray via devel
> Is there some plan to have more than one NTS protocol??

ALPN also lets you negotiate different versions of the same protocol.

I doubt there are any near-term plans for a new version but we will want the 
extra loop (or equivalent) if/when we ever implement a new version of NTS.

I'm happy to wait until we implement another version to add the second loop.  A 
comment in the code would be a good thing.


-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


Re: ✘NTS and ALPN

2019-08-20 Thread Hal Murray via devel
>>> But, it will break existing NTPsec NTS.  So upgrade to git head now
>>> if you use NTS.
>> What's the nature of the breakage?

> The ALPN changed to what the other NTS implementations are using.

I think I see what's going on.

Our NTS client doesn't check the ALPN string from the server.  So any 
combination of our old/new clients and servers can talk to each other.  (new 
meaning git head and old meaning a week ago)

If other implementations of NTS client are checking the ALPN string from the 
server, they won't interoperate with our old server.
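
For illustration, a sketch of a strict versus lenient client check using Python's ssl module rather than our C code.  "ntske/1" is the ALPN ID from the NTS draft.  The helper names and the strict flag are made up, and no connection is opened here.

```python
import ssl

NTSKE_ALPN = "ntske/1"


def make_client_context():
    """TLS client context that offers the NTS-KE ALPN ID to the server."""
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols([NTSKE_ALPN])
    return ctx


def alpn_ok(selected, strict=True):
    """Decide whether to continue after the TLS handshake.

    selected is the value returned by SSLSocket.selected_alpn_protocol():
    a string, or None if the server did not select anything.
    """
    if selected == NTSKE_ALPN:
        return True
    # A lenient client (like ours) carries on regardless.
    return not strict
```

A strict client talking to a server that answers with a different string would hang up, while a lenient one never notices the mismatch.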

--

This issue had been going on for a long time.  I never got the word that other 
clients were having interoperability troubles.

In hindsight, it's obvious, but I probably assumed that other clients weren't 
(yet?) checking the ALPN string returned from the server just like ours 
doesn't.  Something like that is needed for backward compatibility while ALPN 
is implemented.


-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


_XOPEN_SOURCE in ntpd/refclock_gpsd.c => warnings on BSD

2019-08-21 Thread Hal Murray via devel


ntpd/refclock_gpsd.c has:
#define _XOPEN_SOURCE 600

I see the following warning:

NetBSD:
../../ntpd/refclock_gpsd.c:2118:6: warning: implicit declaration of function 
'strlcpy' [-Wimplicit-function-declaration]

FreeBSD:
../../ntpd/refclock_gpsd.c:2118:6: warning: implicit declaration of function 
'strlcpy' is invalid in C99 [-Wimplicit-function-declaration]
strlcpy(pp->a_lastcode, tc, sizeof(pp->a_lastcode));
^
../../ntpd/refclock_gpsd.c:2118:6: warning: this function declaration is not a 
prototype [-Wstrict-prototypes]
2 warnings generated.

I don't understand this area.  My tests build without warnings when I comment 
out that line.

I think somebody cleaned up that sort of #define a while ago.  Did they forget 
to remove this one?

--

./libparse/clk_sel240x.c also references _XOPEN_SOURCE
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600
#endif

My tests also build without that.


-- 
These are my opinions.  I hate spam.



___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel


git head broken on NetBSD 7.2 - weird

2019-08-22 Thread Hal Murray via devel


I've traced it as far as it's compiling some code that I think should be 
ifdef-ed out.

./waf configure build says:
...
[37/94] Compiling ntpd/ntp_monitor.c
[38/94] Compiling ntpd/nts_server.c
[39/94] Compiling ntpd/nts_client.c
[40/94] Compiling ntpd/ntp_leapsec.c
...
[78/94] Linking build/main/ntpd/ntpd
ntpd/nts_server.c.1.o: In function `nts_server_init':
nts_server.c:(.text+0x229): undefined reference to `SSL_CTX_set_alpn_select_cb'
ntpd/nts_client.c.1.o: In function `set_hostname':
nts_client.c:(.text+0x3a7): undefined reference to `SSL_get0_param'
nts_client.c:(.text+0x3cd): undefined reference to 
`X509_VERIFY_PARAM_set1_host'
ntpd/nts_client.c.1.o: In function `make_ssl_client_ctx':
nts_client.c:(.text+0x9c6): undefined reference to `SSL_CTX_set_alpn_protos'

Note that it doesn't complain when it compiles nts_server or nts_client.

With different configure options, I get:
[88/98] Linking bob2/main/ntpd/ntpd
ntpd/nts_server.c.1.o: In function `nts_server_init':
/home/murray/ntpsec/raw/bob2/main/../../ntpd/nts_server.c:106: undefined 
reference to `SSL_CTX_set_alpn_select_cb'
ntpd/nts_client.c.1.o: In function `set_hostname':
/home/murray/ntpsec/raw/bob2/main/../../ntpd/nts_client.c:331: undefined 
reference to `SSL_get0_param'
/home/murray/ntpsec/raw/bob2/main/../../ntpd/nts_client.c:332: undefined 
reference to `X509_VERIFY_PARAM_set1_host'
ntpd/nts_client.c.1.o: In function `make_ssl_client_ctx':
/home/murray/ntpsec/raw/bob2/main/../../ntpd/nts_client.c:218: undefined 
reference to `SSL_CTX_set_alpn_protos'

Here is that chunk from nts_server:
#if (OPENSSL_VERSION_NUMBER > 0x1000200fL)
SSL_CTX_set_alpn_select_cb(server_ctx, alpn_select_cb, NULL);
#endif

-bash-5.0$ grep OPENSSL_VERSION_NUMBER /usr/include/openssl/ -r
/usr/include/openssl/crypto.h:# define SSLEAY_VERSION_NUMBER   
OPENSSL_VERSION_NUMBER
/usr/include/openssl/opensslv.h:# define OPENSSL_VERSION_NUMBER  0x1000115fL
-bash-5.0$ 

-bash-5.0$ grep SSL_CTX_set_alpn_select_cb /usr/include/openssl/ -r
-bash-5.0$

I'm pretty sure that 0x1000115fL is not bigger than 0x1000200fL.
I can't see why it's compiling that line of code at all and I don't understand 
why it didn't give any undefined warnings at compile time.
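
Just to double check the comparison, a small sketch that decodes the two numbers.  It assumes the pre-3.0 OpenSSL layout 0xMNNFFPPS (major, minor, fix, patch, status) described in opensslv.h.  The function name is mine.

```python
def decode_openssl_version(number):
    """Decode a pre-3.0 OPENSSL_VERSION_NUMBER (layout 0xMNNFFPPS).

    Returns (version_string, status); status 0xF means a release build.
    Patch values map to letters: 1 -> 'a', 2 -> 'b', ...
    """
    major = (number >> 28) & 0xF
    minor = (number >> 20) & 0xFF
    fix = (number >> 12) & 0xFF
    patch = (number >> 4) & 0xFF
    status = number & 0xF
    letter = chr(ord("a") + patch - 1) if patch else ""
    return "%d.%d.%d%s" % (major, minor, fix, letter), status


installed = 0x1000115F    # from /usr/include/openssl/opensslv.h on this box
threshold = 0x1000200F    # the value tested in nts_server.c
installed_name, _ = decode_openssl_version(installed)
threshold_name, _ = decode_openssl_version(threshold)
guard_passes = installed > threshold
```

That gives 1.0.1u against a 1.0.2 threshold, so with these headers the guarded line should not be compiled.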

It worked a month ago.  The system is a year old.

I probably updated something on that system recently, but I don't remember 
what.  gcc and cpp say:
gcc (nb2 20150115) 4.8.5
cpp (nb2 20150115) 4.8.5
-r-xr-xr-x  2 root  wheel  654499 Aug 29  2018 /usr/bin/gcc
-r-xr-xr-x  2 root  wheel  654543 Aug 29  2018 /usr/bin/cpp


Has anybody seen anything like this before?

Assuming "no", I'll try bisecting tomorrow.



-- 
These are my opinions.  I hate spam.





Re: Point release of NTPSec

2019-08-23 Thread Hal Murray via devel


>> Is there time to add an IPv4 only initialization option for NTS?
> Options are bad, in general. Explain why you want this.

(I was going to poke you on this.)

There is a bug waiting for your comment - #666.  Looks like you found it.

> I hate options.  They add complexity and proliferate test path. The codebase
> has quite enough of them already.

> I'm going to go with "We can change the current code to give a warning but
> not crash." 

Should be easy to fix.  If nobody else gets to it, I will, but that may not 
happen as soon as you might like.

Note that in this case, an option might be easier to test.  To test the 
warning path we have to set up a no-IPv6 box.


-- 
These are my opinions.  I hate spam.





Re: Point release of NTPSec

2019-08-23 Thread Hal Murray via devel


> Does it make sense to call this 1.2.0 instead of 1.1.7? Especially since we
> have the ALPN compatibility fix? 

I suggest 1.1.7 soon, then a round of testing and cleanup before 1.2.0.

Or maybe wait for the NTS RFC to come out so we have a good reason for the 
release.


-- 
These are my opinions.  I hate spam.





Re: Point release of NTPSec

2019-08-23 Thread Hal Murray via devel


devel@ntpsec.org said:
>> Or maybe wait for the NTS RFC to come out so we have a good
>> reason for the release.
> What is the official (or barring that, consensus) view on the quality of NTS
> in NTPsec? Is it something that is usable in production? If so, please cut a
> release sooner rather than later. If it's still experimental, a release is
> less critical. 

I think the code is in good shape.  I don't know of any reason not to use it 
in production.  (aside from the normal growing pains, like #606)

My comment was intended to suggest waiting for a big (1.2) release after doing 
a little release (1.1.7) soon.


-- 
These are my opinions.  I hate spam.





Re: _XOPEN_SOURCE in ntpd/refclock_gpsd.c => warnings on BSD

2019-08-23 Thread Hal Murray via devel


> I see no changes related to _XOPEN_SOURCE since 2017.  Perhaps you're
> thinking of GPSD, where there was bunch of rework in that area just before
> the 3.19 release.  

Thanks.  You are probably right.

Eric:  This area just got more complicated.  See #614
strerror_r() has two variants, selected by feature-test macros.  From the man 
page:
   The XSI-compliant version is provided if:
   (_POSIX_C_SOURCE >= 200112L) && ! _GNU_SOURCE
   Otherwise, the GNU-specific version is provided.
Our code is (sometimes?) assuming the wrong one.  The GNU version (sometimes?) 
doesn't put the string into the buffer we are printing.




-- 
These are my opinions.  I hate spam.





Re: git head broken on NetBSD 7.2 - weird

2019-08-23 Thread Hal Murray via devel


> Has anybody seen anything like this before?
> Assuming "no", I'll try bisecting tomorrow. 

My attempt at bisecting hit a brick wall.  I backed up many months and it 
still fails.

I guessed that something strange had happened to that system.  I set up a fresh 
version of 7.2 on a different box.  It also gets that error.

A quick attempt at making a test case didn't fail.

I'll put this on the back burner for a while.



-- 
These are my opinions.  I hate spam.





Re: Point release of NTPSec

2019-08-23 Thread Hal Murray via devel


> Would this be a better formulation? The NTS ALPN negotiation sequence now
> checks for length of the handshake string. This may break interoperability
> with other, non-compliant, NTS implementations.

> Basically, I wish to highlight that things may *break* with pre 1.2.0 

Do we have any hints that anybody else who interoperated with our old code 
depended on our bug?

It seems much more important to indicate that we have fixed a bug and things 
should work better.

I would suggest:
  The NTS-KE server now returns the correct ALPN negotiation string.


-- 
These are my opinions.  I hate spam.





Re: Point release of NTPSec

2019-08-23 Thread Hal Murray via devel


e...@thyrsus.com said:
> But doing the right thing is better than a switch.  And the test is a cost
> that only needs to be paid once. 

I think your no-switch approach is good for things where the choice is A or B, 
like picking the right baud rate.

But this isn't one of those cases.  This is an "X doesn't work" case.  Did the 
user intend that or is something broken?  In the obscure case where IPv6 is not 
enabled on the system, I'm happy to add a "-4" to ntp.conf to tell the system 
I expect that.

I've got the code working, but it also ignores lots of other cases where I 
want it to crash.  I should be able to fix that, just more code and I need to 
get the internal interface right.

I also disagree with your "only need to test once".  If we only need to test 
once, why are we maintaining a complicated test package?  I agree that this 
sort of code is not likely to break due to system upgrades so the need for 
continual testing is not high.  On the other hand, it would be nice to test it 
on all OSes.




-- 
These are my opinions.  I hate spam.





Re: Point release of NTPSec

2019-08-24 Thread Hal Murray via devel


I just changed the NTS key rotation timer from 1 hour to 1 day.

The spec is set up for 1 day.  The 1 hour value made testing easier.


-- 
These are my opinions.  I hate spam.





gitlab testing broken for Fedora

2019-08-24 Thread Hal Murray via devel
Stage: build
Name: fedora-rawhide-refclocks-gpsd
Trace:  GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-31-x86_64
Public key for glibc-common-2.30.9000-1.fc32.x86_64.rpm is not installed. Failing package is: glibc-common-2.30.9000-1.fc32.x86_64
 GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-31-x86_64
Public key for glibc-minimal-langpack-2.30.9000-1.fc32.x86_64.rpm is not installed. Failing package is: glibc-minimal-langpack-2.30.9000-1.fc32.x86_64
 GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-31-x86_64
The downloaded packages were saved in cache until the next successful transaction.
You can remove cached packages by executing 'dnf clean packages'.
Error: GPG check FAILED
section_end:1566637579:build_script
section_start:1566637579:after_script
section_end:1566637580:after_script
section_start:1566637580:upload_artifacts_on_failure
section_end:1566637582:upload_artifacts_on_failure
ERROR: Job failed: exit code 1



-- 
These are my opinions.  I hate spam.





ALPN checking

2019-08-24 Thread Hal Murray via devel


I just pushed the code for the NTS client to check the ALPN selection returned 
from the NTS server.

It logs one of 3 messages.  Here are samples of 2 of them:

24 Aug 13:18:38 ntpd[28519]: NTSc: No ALPN from spidey.rellim.com (TLSv1.2)
24 Aug 13:18:43 ntpd[28519]: NTSc: Good ALPN from: time.cloudflare.com:1234

The 3rd case is when it gets back something other than "ntske/1".
I haven't found a test case for that one yet.  If anybody has a system 
still running our old/buggy code, please let me know the IP Address.

Note that many systems are still using old versions of OpenSSL which only 
support TLSv1.2 and don't support ALPN.


-- 
These are my opinions.  I hate spam.





Re: ALPN checking

2019-08-24 Thread Hal Murray via devel
> Hal, 203.123.48.1 has been downgraded to NTPsec_1_1_6-3-g8e3daaf0b

Thanks.

24 Aug 22:07:32 ntpd[6053]: NTSc: Strange ALPN returned: *ntske/1 (8)

The "*" replaces a non-graphic character.
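
The three log cases can be sketched roughly as below.  This is a Python illustration, not the C code in ntpd.  The function names are made up, and the sample input b"\x07ntske/1" is an assumption: a length-prefixed string would be 8 bytes long and would print like the line above.

```python
EXPECTED_ALPN = b"ntske/1"


def printable(raw):
    """Replace every non-graphic byte (outside 0x21..0x7e) with '*'."""
    return "".join(chr(b) if 0x21 <= b <= 0x7E else "*" for b in raw)


def classify_alpn(selected):
    """Return a short description of the ALPN value the server selected.

    selected is the raw bytes returned by the server, or None/empty if
    the server did not return an ALPN selection at all.
    """
    if not selected:
        return "No ALPN"
    if selected == EXPECTED_ALPN:
        return "Good ALPN"
    return "Strange ALPN returned: %s (%d)" % (printable(selected),
                                               len(selected))


if __name__ == "__main__":
    # Hypothetical sample: the expected string with a leading length byte.
    print(classify_alpn(b"\x07ntske/1"))
```

Logging the length next to the sanitized string makes a stray leading or trailing byte visible even when it is not printable.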


-- 
These are my opinions.  I hate spam.





Re: prep for point release of NTPSec, suggest 2019-07-31

2019-08-24 Thread Hal Murray via devel
> How does everyone feel about next Saturday, Aug 31   2019-07-31?

Looks good to me.


-- 
These are my opinions.  I hate spam.





Re: Interesting talk on Chronos

2019-08-25 Thread Hal Murray via devel
> Any updates or thoughts?

[I watched the talk but not all the Q&A.]

I think our efforts would be much more productive helping to deploy NTS.

They are trying to protect against MitM attacks, but they assume that only 
some of the servers can be attacked.  That misses the important case where the 
bad guy can attack all your packets.

It depends on a large pool.  From a security standpoint, keeping the bad guys 
out of the pool is impossible.  (or at least an interesting research project, 
probably for lawyers rather than geeks)

They are reinventing the wheel.  What should you do with good samples?  That 
area of our code is incredibly complicated, but it has a long track record.  
She said "average" without saying anything more.  Maybe the paper has more 
details.

We could start an interesting project on what to do with "good" samples.


-- 
These are my opinions.  I hate spam.





%m, #614

2019-08-25 Thread Hal Murray via devel
I think it should be fixed for the release, but I don't know how to do it.

There used to be code in the msyslog processing that handled %m if it wasn't 
included in the local printf.  I'm guessing it was removed to eliminate 
warnings on some systems that don't support %m.

All the %m cases were changed to use %s and strerror(errno).

But that doesn't work with threads.  The NTS code uses threads.  (I think the 
DNS code avoided the problem by not doing any logging from the worker thread 
but I'd have to check to be sure.)

Using strerror_r seems like the obvious solution, but there are 2 variations 
to the API.  BSD only supports the XSI version.  On Linux, we get the GNU 
version.

waf turns on _GNU_SOURCE
wscript says:
# FIXME: We'd like this to be -D_POSIX_C_SOURCE=200809L -D_XOPEN_SOURCE=600
# rather than -D_GNU_SOURCE, but that runs into problems in two places:
# (1) The ISC net handling stuff, where struct in6_addr’ loses a member
# named s6_addr32 that the macros need, and (2) three BSD functions
# related to chroot jailing in the sandbox code.

Is this the time to fix it?  This area is above my pay grade.

---

Otherwise, the best approach I see would be to make a my_strerror_r that has 
the API we want.  I think that needs a configure time test to determine which 
version we get and a simple #ifdef in the implementation of my_strerror_r().

Anybody got better ideas?

Would it be better to go back to using %m and tolerate the warnings?  Does 
anybody remember which systems generated the warnings?  Was the %m code in 
msyslog thread safe?


-- 
These are my opinions.  I hate spam.





Re: git head broken on NetBSD 7.2 - weird

2019-08-25 Thread Hal Murray via devel


I've tracked down the problem, but don't know how to fix it.

NetBSD 7.2 has an optional newer version of OpenSSL.  pkgin has installed it.  
I didn't do that explicitly so I assume it was dragged in by something I did 
install.  Looks like python37-3.7.3nb1

waf is using the new headers.
  -I/usr/pkg/include
but obviously not getting the new library.

Is anybody familiar with this area?  Is there a simple way to tell waf not to 
look there and/or to link with the right module?

/usr/lib/libssl.a
/usr/lib/libssl.so
/usr/lib/libssl.so.10
/usr/lib/libssl.so.10.6
/usr/lib/libssl_p.a
/usr/lib/libssl_pic.a

/usr/pkg/lib/libssl.a
/usr/pkg/lib/libssl.so
/usr/pkg/lib/libssl.so.1.0.0
/usr/pkg/lib/pkgconfig/libssl.pc


-- 
These are my opinions.  I hate spam.





waf checking - fail on warnings?

2019-08-26 Thread Hal Murray via devel


How do I tell waf to fail on warnings?

I'm trying to use this to detect which API I'm getting.

STRERROR_FRAG = """
#include <string.h>
int main(void) {
  char buf [100];
  const char *foo = strerror_r(6, buf, sizeof(buf));
  return foo == NULL;
}
"""

ctx.check_cc(
fragment=STRERROR_FRAG,
define_name="STRERROR_CHAR",
features="c",
msg="Checking if strerror_r returns char*",
mandatory=False,
comment="Whether strerror_r returns char*"
)

Unfortunately, it gives a warning rather than an error so it looks like it 
works.
../../test.c:5:21: warning: initialization makes pointer from integer without a 
cast [-Wint-conversion]
   const char *foo = strerror_r(6, buf, sizeof(buf));

How do I tell waf to fail on warnings?
Plan B would be to run the test code and see if the answer is NULL.



-- 
These are my opinions.  I hate spam.





Re: waf checking - fail on warnings?

2019-08-26 Thread Hal Murray via devel
> A relatively quick search suggests mandatory=True as an argument.

That makes waf fail if the code chunk doesn't work.

I'm trying to make the code chunk not-work if it gets a warning, but then let 
waf put a comment in config.h to indicate that a feature test didn't work 
rather than put a #define to indicate that it did work.


-- 
These are my opinions.  I hate spam.





Fix for #614

2019-08-27 Thread Hal Murray via devel


I think fixing the _GNU_SOURCE issue is going to take a lot of work, and we 
should probably give a change of that nature lots of time for extra testing.  
So I implemented a local wrapper that uses a configure time test.

The code for mystrerror is at the bottom of libntp/msyslog.c
The waf code is in wafhelpers/check_strerror.py

Please feel free to improve them.

The waf code is ugly, but seems to work.
It adds -Werror to CFLAGS to turn a warning into an error, then removes it.
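
The add-then-remove step might look roughly like the sketch below.  It is not the actual contents of wafhelpers/check_strerror.py.  The check_cc() arguments are the ones from the earlier test fragment in this thread, the function name check_strerror is made up, and the string.h include is an assumption about which header the fragment needs.

```python
STRERROR_FRAG = """
#include <string.h>
int main(void) {
  char buf [100];
  const char *foo = strerror_r(6, buf, sizeof(buf));
  return foo == NULL;
}
"""


def check_strerror(ctx):
    """Run the strerror_r probe with warnings promoted to errors.

    ctx is expected to be a waf configuration context.  With -Werror,
    the int-to-pointer warning from the XSI variant makes the compile
    fail, so STRERROR_CHAR is only defined for the GNU variant.
    """
    old_cflags = list(ctx.env.CFLAGS)
    ctx.env.CFLAGS = old_cflags + ["-Werror"]
    try:
        ctx.check_cc(
            fragment=STRERROR_FRAG,
            define_name="STRERROR_CHAR",
            features="c",
            msg="Checking if strerror_r returns char*",
            mandatory=False,
            comment="Whether strerror_r returns char*",
        )
    finally:
        # Restore the flags even if the check raises.
        ctx.env.CFLAGS = old_cflags
```

Restoring CFLAGS in a finally block keeps -Werror from leaking into the rest of the configure run if the check throws.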

-

I've tested on Fedora and NetBSD.

My suggestion for testing is to change the mode on your NTS cookies file to 
make it unreadable.
Default is:
  include/nts.h:#define NTS_COOKIE_KEY_FILE "/var/lib/ntp/nts-keys"
Or wherever you put it via "nts cookie " in your ntp.conf

26 Aug 19:22:17 ntpd[20786]: NTSs: can't read old cookie file: 
/var/lib/ntp/nts-keys=>Permission denied


-- 
These are my opinions.  I hate spam.





Re: waf checking - fail on warnings?

2019-08-27 Thread Hal Murray via devel


matthew.sel...@twosigma.com said:
> And you want to pass "-Werror" (I'm not certain how off the top of my head)
> to the compiler so that warnings are fatal. waf sees the compiler exit zero
> with or without warnings, so they look the same. 

I put it into ctx.env.CFLAGS, then restored CFLAGS back to the original after 
running the test.


-- 
These are my opinions.  I hate spam.





Re: ntpsec | Converted stat_count struct to a module level global. (!1026)

2019-08-28 Thread Hal Murray via devel
Merge Request !1026 was merged

The Subject says "Converted stat_count struct to a module level global"

The code looks like it is un-struct-ing things.

Was that "a module level global" supposed to be "module level globals"?

What's the policy on this area?  I thought the general idea was to put things 
into structs, but I never saw a good story on why that's a good idea.

Is the problem one of name space structure/clutter?  If I'm reading the code, 
there isn't any difference between foo.counter and foo_counter.  The actual 
names used are critical.


+extern uptime_t stat_stattime(void);
+extern uint64_t stat_received(void);
+extern uint64_t stat_processed(void);
...

-struct statistics_counters {
-   uptime_t sys_stattime;   /* time since sysstats reset */
-   uint64_t sys_received;   /* packets received */
-   uint64_t sys_processed;  /* packets for this host */
...


-- 
These are my opinions.  I hate spam.





Re: ntpsec | Converted stat_count struct to a module level global. (!1026)

2019-08-28 Thread Hal Murray via devel


Thanks.

Ahh...  Part of the problem is that I misread the diffs.  I didn't notice that 
the additions were procedures.  I thought it was restoring the individual 
counters.


ianbru...@gmail.com said:
> It is reducing unnecessary globals because globals are a good thing to
> reduce.

In general, I agree that reducing globals is a good idea.  It's not globals 
that are evil, it's the complexity they normally introduce.

It's not clear that replacing a global with a procedure to read it reduces 
that complexity when the counter is a simple counter.


> !1026 moves the variables from a struct defined globally for the entire
> program to a struct defined globally within a single module  (ntp_proto.c).
> Most usage of those fields is in that module, and almost  all of the
> exceptions are reads in ntp_control. Now all access from  outside of
> ntp_proto is done via functions. 

If I was trying to clean up this area, I think the important step would be to 
move the API for those counters out of ntpd.h into a new header file.  
(Mostly, that's reducing clutter in ntpd.h.  I may be more sensitive to 
clutter than other people.)

Are counters an important enough concept that they deserve special treatment?

--

This discussion reminds me that the counters for DNS and NTS can be bumped by 
various threads so they need a lock.
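
As an illustration of the idea only: the ntpd counters are C variables and would more likely use a pthread mutex or atomics.  The Python sketch below shows a counter whose updates and reads both go through one lock.  The class and method names are hypothetical.

```python
import threading


class LockedCounter(object):
    """A counter that is safe to bump from several threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def bump(self, amount=1):
        # The read-modify-write must happen under the lock.
        with self._lock:
            self._value += amount

    def read(self):
        with self._lock:
            return self._value


if __name__ == "__main__":
    counter = LockedCounter()
    workers = [
        threading.Thread(target=lambda: [counter.bump() for _ in range(1000)])
        for _ in range(4)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    print(counter.read())
```

Hiding the counter behind bump() and read() also fits the accessor style discussed above, since callers never touch the variable directly.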



-- 
These are my opinions.  I hate spam.





Re: Code freeze

2019-08-28 Thread Hal Murray via devel


Sanjeev Gupta said:
> Gary, ALPN string checking.  The commit mentioned that it would break with
> previous NTPSec versions. 

No.  The client didn't check the returned ALPN string.  It didn't even look to 
see if there was a returned ALPN string.

I added that checking recently.  It doesn't bail on mismatch, just logs things.


-- 
These are my opinions.  I hate spam.





Re: %m, #614

2019-08-29 Thread Hal Murray via devel


Gary said:
[API for strerror_r()]
> On Linux, yes.  But not on all distros.  For example, on Android, which gpsd
> supports, strerror_r() always returns an int.  No options. 

Same on NetBSD and FreeBSD.

The quirk that's not in the man page is that the GNU version doesn't use the 
buffer you provide, at least on some implementations and/or some conditions.


-- 
These are my opinions.  I hate spam.





Re: disable_dynamic_updates report

2019-08-29 Thread Hal Murray via devel


>> By "floating", you mean uninitialized?  In C that's going to mean it's false

> Yes. My understanding of C is that anything not explicitly set has  whatever
> random value happens to be in that memory location. Possibly  changed if
> certain unknown compiler options are chosen. 

I thought global variables were initialized to zero.


-- 
These are my opinions.  I hate spam.





Re: ntpsec | WIP: Strerror (!1027)

2019-08-31 Thread Hal Murray via devel
Richard Laager said:
> This is an untested (beyond building cleanly) patch for cleaning up the 
> strerror_r() API issue.

What environment did you use?

I get several warnings on Fedora.



../../libntp/isc_net.c: In function ‘try_proto’:
../../libntp/isc_net.c:56:4: warning: implicit declaration of function 
‘ntp_strerror_r’; did you mean ‘strerror_r’? 
[-Wimplicit-function-declaration]
   56 |ntp_strerror_r(errno, strbuf, sizeof(strbuf));
  |^~
  |strerror_r

---

../../libaes_siv/aes_siv.c:5: warning: "_POSIX_C_SOURCE" redefined
5 | #define _POSIX_C_SOURCE 200112L
  |
: note: this is the location of the previous definition

In file included from /usr/include/python2.7/pyconfig.h:6,
 from /usr/include/python2.7/Python.h:8,
 from ../../libntp/pymodule.c:7:
/usr/include/python2.7/pyconfig-64.h:1229: warning: "_POSIX_C_SOURCE" redefined
 1229 | #define _POSIX_C_SOURCE 200112L
  | 
: note: this is the location of the previous definition
In file included from /usr/include/python2.7/pyconfig.h:6,
 from /usr/include/python2.7/Python.h:8,
 from ../../libntp/pymodule.c:7:
/usr/include/python2.7/pyconfig-64.h:1251: warning: "_XOPEN_SOURCE" redefined
 1251 | #define _XOPEN_SOURCE 600
  |
: note: this is the location of the previous definition

-

../../ntpd/refclock_gpsd.c:67: warning: "_XOPEN_SOURCE" redefined
   67 | #define _XOPEN_SOURCE 600
  |
: note: this is the location of the previous definition


-- 
These are my opinions.  I hate spam.





Cleanup opportunity - include/isc_error.h

2019-09-02 Thread Hal Murray via devel


I missed some uses of strerror_r() in the ISC routines.

I think all uses of UNEXPECTED_ERROR should switch to msyslog
Then we can delete include/isc_error.h and libntp/isc_error.c

-- 
These are my opinions.  I hate spam.





Does anybody understand waf's linking libraries?

2019-09-03 Thread Hal Murray via devel


Context is issue #615

The system is NetBSD 7.2, old but still supported.

It has a newer OpenSSL installed in /usr/pkg/
/usr/include/openssl/opensslv.h:# define OPENSSL_VERSION_NUMBER  0x1000115fL
/usr/pkg/include/openssl/opensslv.h:# define OPENSSL_VERSION_NUMBER  
0x1000210fL
The old version doesn't support ALPN.

The symptom is that build is finding the new includes but linking with the old 
libraries.

linking ntpd:
ntpd/nts_server.c.1.o: In function `nts_server_init':
/home/murray/ntpsec/raw/bob2/main/../../ntpd/nts_server.c:107: undefined 
reference to `SSL_CTX_set_alpn_select_cb'
(and several others)

wscript contains:
elif ctx.env.DEST_OS == "netbsd":
ctx.env.PLATFORM_INCLUDES = ["/usr/pkg/include"]
ctx.env.PLATFORM_LIBPATH = ["/usr/lib", "/usr/pkg/lib"]

from ./waf build -v
[79/98] Linking bob2/main/ntpd/ntpd
17:51:03 runner ['/usr/bin/gcc', '-pie', 'ntpd/ntp_config.c.5.o', 
'ntpd/ntp_io.c.5.o', 'ntpd/ntp_loopfilter.c.5.o', 'ntpd/ntp_packetstamp.c.5.o',
 'ntpd/ntp_peer.c.5.o', 'ntpd/ntp_proto.c.5.o', 'ntpd/ntp_sandbox.c.5.o', 
'ntpd/ntp_scanner.c.5.o', 'ntpd/ntp_signd.c.5.o', 'ntpd/ntp_timer.c.5.o', 
'ntpd/ntp_dns.c.5.o', 'ntpd/ntpd.c.5.o', 'bob2/host/ntpd/ntp_parser.tab.c.5.o',
 'ntpd/ntp_control.c.1.o', 'ntpd/ntp_filegen.c.1.o', 'ntpd/ntp_leapsec.c.1.o', 
'ntpd/ntp_monitor.c.1.o', 'ntpd/ntp_recvbuff.c.1.o', 
'ntpd/ntp_restrict.c.1.o', 'ntpd/ntp_util.c.1.o', 'ntpd/nts.c.1.o', 
'ntpd/nts_server.c.1.o', 'ntpd/nts_client.c.1.o', 'ntpd/nts_cookie.c.1.o', 
'ntpd/nts_extens.c.1.o', 'ntpd/ntp_refclock.c.3.o', 'ntpd/ntp_wrapdate.c.3.o', 
'ntpd/refclock_conf.c.3.o', 'ntpd/refclock_nmea.c.4.o', '-o', 
'/home/murray/ntpsec/raw/bob2/main/ntpd/ntpd', '-Wl,-Bstatic', '-Llibaes_siv', 
'-Llibntp', '-laes_siv', '-lntp', '-Wl,-Bdynamic', '-lcrypto', '-lssl', '-lm', 
'-lrt', '-lpthread', '-ldns_sd', '-Wl,-z,now', '-Wl,-z,relro']

There is no -L/usr/pkg/lib in there.

If I comment out the INCLUDES line in wscript, ntpd builds.  ldd shows it using 
the old crypto.so
bob2/main/ntpd/ntpd:
-lcrypto.8 => /usr/lib/libcrypto.so.8
...

But now ntpc.so is broken:
  File "/home/murray/ntpsec/raw/wafhelpers/bin_test.py", line 11, in <module>
import ntp.util
  File "/home/murray/ntpsec/raw/bob2/main/tests/pylib/ntp/util.py", line 16, 
in <module>
import ntp.ntpc
ImportError: Shared object "libcrypto.so.1.0.0" not found

./bob2/main/pylib/ntpc.so:
-lpython2.7.1.0 => /usr/lib/libpython2.7.so.1.0
-lutil.7 => /usr/lib/libutil.so.7
-lgcc_s.1 => /usr/lib/libgcc_s.so.1
-lc.12 => /usr/lib/libc.so.12
-lm.0 => /usr/lib/libm.so.0
-lpthread.1 => /usr/lib/libpthread.so.1
-lrt.1 => /usr/lib/librt.so.1
-lcrypto.1.0.0 => not found

This time, it does have -L/usr/pkg/lib

[61/98] Linking bob2/main/pylib/ntpc.so
18:02:23 runner ['/usr/bin/gcc', '-shared', '-pthread', '-pthread', 
'-Wl,--export-dynamic', 'libntp/pymodule.c.2.o', 'libntp/assert.c.2.o', 
'libntp/clockwork.c.2.o', 'libntp/emalloc.c.2.o', 'libntp/hextolfp.c.2.o', 
'libntp/lib_strbuf.c.2.o', 'libntp/msyslog.c.2.o', 
'libntp/ntp_calendar.c.2.o', 'libntp/ntp_random.c.2.o', 
'libntp/prettydate.c.2.o', 'libntp/statestr.c.2.o', 'libntp/systime.c.2.o', 
'libntp/timespecops.c.2.o', '-o', '/home/murray/ntpsec/raw/bob2/main/pylib/ntpc
.so', '-Wl,-Bstatic', '-Wl,-Bdynamic', '-L/usr/pkg/lib', '-lpython2.7', 
'-lutil', '-lm', '-lpython2.7', '-lutil', '-lm', '-lm', '-lrt', '-lcrypto', 
'-Wl,-z,now', '-Wl,-z,relro']

If I change the LIBPATH line to:
ctx.env.PLATFORM_LIBPATH = ["/usr/lib"]
and rebuild, I get the same error.

The -L/usr/pkg/lib is still in the load command.

Anybody know where it is coming from?  I can't find the string /usr/pkg/lib 
anyplace else in our code.  I assume waf is doing some magic.

I'll be happy with either old or new version.


-- 
These are my opinions.  I hate spam.





Re: Does anybody understand waf's linking libraries?

2019-09-03 Thread Hal Murray via devel


I think I have figured out the big picture.  PLATFORM_INCLUDES and 
PLATFORM_LIBPATH are our variables rather than something waf knows about.  (I 
downloaded both source and book for waf, no hits.)

PLATFORM_LIBPATH is write only.

-bash-5.0$ grep PLATFORM_LIBPATH . -r
./bob2/c4che/main_cache.py:PLATFORM_LIBPATH = ['/usr/lib']
./wscript:ctx.env.PLATFORM_LIBPATH = ["/usr/local/lib"]
./wscript:ctx.env.PLATFORM_LIBPATH = ["/usr/lib", "/usr/pkg/lib"]
./wscript:ctx.env.PLATFORM_LIBPATH = ["/usr/lib"]
./wscript:ctx.env.PLATFORM_LIBPATH = ["/opt/local/lib"]
-bash-5.0$ 

PLATFORM_INCLUDES is used occasionally in places like this in ntpd/wscript:
includes=[ctx.bldnode.parent.abspath(), "../include", "../libaes_siv"] +
ctx.env.PLATFORM_INCLUDES,

I think all we have to do is fill in the places where PLATFORM_LIBPATH should 
get used.

--

Ahh...  I found some documentation for things like INCLUDES_NTPD
I'm guessing that PLATFORM_xxx is leftover from previous versions of waf.

In the waf book:
  10.3.3. Foreign libraries and flags

After removing the PLATFORM_ part, the current problem is:
bob2/main/ntpd/ntpd:
-lcrypto.1.0.0 => not found
-lssl.1.0.0 => not found
-lm.0 => /usr/lib/libm.so.0
...

What flags do I have to add to the link step to get it to remember where it 
found the library files?
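
One common answer is an rpath (with GNU ld, something like -Wl,-rpath,/usr/pkg/lib), which records the library directory in the binary.  waf's C tools read per-library variables such as INCLUDES_SSL, LIBPATH_SSL and RPATH_SSL for targets that list SSL in use=.  The sketch below sets those for SSL and CRYPTO.  It is untested against this tree, the helper name add_pkg_prefix is made up, and it assumes nothing else in the configure step overwrites these variables.

```python
def add_pkg_prefix(env, prefix="/usr/pkg", libs=("SSL", "CRYPTO")):
    """Point the named uselib entries at headers/libraries under prefix.

    env is expected to be a waf ConfigSet (ctx.env); any object that
    accepts attribute assignment works for experimenting.
    """
    for name in libs:
        setattr(env, "INCLUDES_" + name, [prefix + "/include"])
        setattr(env, "LIBPATH_" + name, [prefix + "/lib"])
        # RPATH stores the directory in the linked binary so the
        # run-time linker can find the shared library again.
        setattr(env, "RPATH_" + name, [prefix + "/lib"])
    return env
```

In a wscript this would be called from the netbsd branch, for example add_pkg_prefix(ctx.env), so that headers, link-time libraries and run-time libraries all come from the same prefix.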


--

ctx.env.LINKFLAGS_NTPD looks unused.
same for ctx.env.CFLAGS_bin



-- 
These are my opinions.  I hate spam.





FWD: Forthcoming OpenSSL Releases - Sep 10

2019-09-03 Thread Hal Murray via devel
The OpenSSL project team would like to announce the forthcoming release
of OpenSSL versions 1.1.1d, 1.1.0l and 1.0.2t.

These releases will be made available on 10th September 2019 between
approximately 1200-1600 UTC.

These are security fix releases. The highest severity security issue
fixed by these releases is rated as LOW.

Please note that this is expected to be the last release of 1.1.0 before
it goes out of support on 11th September 2019.

Yours

The OpenSSL Project Team


-- 
These are my opinions.  I hate spam.





Certificates

2019-09-11 Thread Hal Murray via devel


Any openssl command line wizards?

What do I type to find out when my certificate expires?  We should make a 
script that can be called from cron.

What do I type to figure out which cert in the root collection for my 
OS/distro that a NTS-KE server is using?  I'd like some code I can cut-paste 
to do that and/or a script that will do that for all the servers in ntp.conf 
that are using nts.

I'm pretty sure their man pages have all the info and with enough work I can 
work out the details.  But I won't bother if somebody is familiar with that 
area.
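
For the expiry question, one option that avoids the openssl command line is Python's ssl module.  The sketch below is a starting point rather than a finished cron script.  The function names are made up, the host and port are left to the caller, and it checks the certificate a remote server presents, not a local PEM file.

```python
import socket
import ssl
import time


def days_left(not_after, now=None):
    """Days until a certificate's notAfter time (negative if expired).

    not_after uses the format returned by getpeercert(), for example
    "Dec 31 00:00:00 2019 GMT".
    """
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400.0


def fetch_not_after(host, port):
    """Connect with TLS and return the server certificate's notAfter."""
    ctx = ssl.create_default_context()      # uses the OS root collection
    ctx.set_alpn_protocols(["ntske/1"])     # offer the NTS-KE ALPN name
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A cron job could call days_left(fetch_not_after(host, port)) for each NTS server and complain when the result drops below some threshold.  This does not answer the second question about which root certificate a server chains to.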




-- 
These are my opinions.  I hate spam.





Future directions

2019-09-14 Thread Hal Murray via devel
-* We intend to fully support Network Time Security and to be first or
-  second interop on that standard once it is finalized.  At that
-  point, older insecure authentication methods (MAC and MS-SNTP) may
-  be removed.
+* Now that we have full Network Time Security, a neasr-future
+  direction is to remove older insecure authentication methods (MAC
+  and MS-SNTP).

The old MAC mode is not insecure.  It's inconvenient to set up on a large scale 
since it requires manual intervention on the server for each new client.  It's 
a kludge since it doesn't use an extension.  But it's not insecure.

NIST supports it.

From a code standpoint, it's not that ugly.  I think it should stay.


The MS-SNTP stuff is needed as a bridge to MS Active Directory.  I know next 
to nothing about MS.

It is a kludge in the sense that it calls out using TCP with associated waits 
that break the fundamental never-wait assumption of ntpd.  That's OK on a 
lightly loaded system.

I won't complain (much) if you remove it, but you will be cutting yourself off 
from some (potential?) MS users.  It's tangled up with Samba which I don't use.

from ntpd/ntp_signd.c
 * Dependency on NTP packet structure removed by ESR.
 * This code now only knows about the length of an NTP packet header,
 * not its content. Note that the signing technique never handled anything
 * but unextended and MACless packet headers, so it can't be used with NTS.

Using it with MAC or NTS doesn't make sense.  The MS end doesn't know about 
either.  The whole point of the MS-SNTP stuff is to support a type of 
authentication that MS does understand.

Has anybody tried using it with our code?  I wonder if it still works.


-- 
These are my opinions.  I hate spam.





waf confusion

2019-09-15 Thread Hal Murray via devel
wscript and friends have various things like:
if ctx.env.DEST_OS in ["freebsd", "openbsd"]:
ctx.env.PLATFORM_INCLUDES = ["/usr/local/include"]

I think the PLATFORM_ part is leftover from an old old version of waf.

ctx.env.PLATFORM_INCLUDES works because our code has things like:
includes=[
ctx.bldnode.parent.abspath(), "../include",
"%s/host/ntpd/" % ctx.bldnode.parent.abspath(), "."
] + ctx.env.PLATFORM_INCLUDES,

I think we should remove all the PLATFORM_ stuff in that area and remove all 
the ctx.env.PLATFORM_INCLUDES from all the includes.

I'll make the edits, but I'm not confident that I won't break something.  I 
have tested it with NetBSD which is how I got this far.



I'm not having much luck with the waf documentation.  Where is the section 
that explains what ctx.env.INCLUDES does?  How about others like LIBPATH, 
LDFLAGS, ...?

Another chunk of documentation I'm looking for is how libraries work.  
ntpd/wscript says:
    use="libntpd_obj ntp M parse RT CAP SECCOMP PTHREAD NTPD "
        "SSL CRYPTO DNS_SD %s SOCKET NSL SCF" % use_refclock,
I'd like to understand that area.



PS: We dropped support for OpenBSD a long time ago.  Should we leave things 
like that around as a reminder, or clean them up to reduce clutter?


-- 
These are my opinions.  I hate spam.





Cruft

2019-09-15 Thread Hal Murray via devel


There are various #ifdefs testing RLIMIT_MEMLOCK and friends.

The Linux man page for setrlimit says:
   getrlimit(), setrlimit(): POSIX.1-2001, POSIX.1-2008, SVr4, 4.3BSD.
So I think we can assume it exists and remove the #ifdefs.

Any reason not to?



Why is the -D PYTHON stuff cluttering up the command line when compiling?  I 
can't find anyplace in the code that references any of them.

16:57:54 runner ['/usr/lib64/ccache/gcc', '-DUNITY_EXCLUDE_FLOAT_PRINT', 
'-fstack-protector-all', '-Wshadow', '-Wpacked', '-Wcast-qual', 
'-Wmissing-declarations', '-Wdisabled-optimization', 
'-Wimplicit-function-declaration', '-Winvalid-pch', '-Wpointer-arith', 
'-Wwrite-strings', '-Winit-self', '-Wfloat-equal', '-Wformat', 
'-Wformat-signedness', '-Wformat-security', '-Wsuggest-attribute=noreturn', 
'-Wimplicit-fallthrough=3', '-fPIC', '-O1', '-Wall', '-Wextra', 
'-Wmissing-prototypes', '-Wstrict-prototypes', '-Wundef', '-Wunused', 
'-Winline', '-Wswitch-default', '-g', '-std=c99', '-D_GNU_SOURCE', '-I..', 
'-Iinclude', '-I../../include', '-Ilibaes_siv', '-I../../libaes_siv', 
'-DPYTHONDIR="/usr/local/lib/python2.7/site-packages"', 
'-DPYTHONARCHDIR="/usr/local/lib64/python2.7/site-packages"', 
'-DHAVE_PYEXT=1', '-DHAVE_PYTHON_H=1', '../../ntpd/nts_extens.c', '-c', 
'-o/home/murray/ntpsec/play/hgm/main/ntpd/nts_extens.c.1.o']



-- 
These are my opinions.  I hate spam.





Re: Future directions

2019-09-15 Thread Hal Murray via devel


Two areas to consider:

Port randomization:
  https://tools.ietf.org/html/draft-gont-ntp-port-randomization-04

The basic idea is for the client side to use a random port number when sending 
packets so bad guys have a harder time attacking with junk packets if they 
can't monitor the traffic to see the port number.  (Without this feature, they 
know the port number is 123.)

There are two ways to implement it: random per-packet, or random per-target.  
Some routers include the source port when hashing to select among alternate 
routes.  The draft suggests per-target, which will hide the timing changes due to 
different paths.  I would have picked per-packet in order to get some good 
data at the cost of discarding a higher percentage of samples.  This area 
seems ripe for some experimentation and data collection.
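
A minimal sketch of the per-target variant, assuming IPv4 and plain BSD 
sockets.  The helper names open_client_socket() and local_port() are made up 
for illustration; this is not code from ntpd.  Binding to port 0 asks the 
kernel for an ephemeral source port, and how random that choice is depends on 
the OS.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open one UDP socket per server (per-target).
 * Binding to port 0 lets the kernel choose the source port.
 * Returns the socket, or -1 on error. */
int open_client_socket(const char *server_ip, unsigned short server_port)
{
    struct sockaddr_in local, remote;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    local.sin_port = htons(0);          /* 0 = kernel picks the port */
    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0) {
        close(fd);
        return -1;
    }

    memset(&remote, 0, sizeof(remote));
    remote.sin_family = AF_INET;
    remote.sin_port = htons(server_port);
    if (inet_pton(AF_INET, server_ip, &remote.sin_addr) != 1 ||
        connect(fd, (struct sockaddr *)&remote, sizeof(remote)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Report the source port the kernel chose, or -1 on error. */
int local_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

Keeping the socket open for the life of the association gives per-target 
behavior.  Closing it and opening a fresh one before each request would 
approximate per-packet.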

I'd guess somewhere between a day and a week to implement this.

---

Better transmit timing...

We get good time stamps on the receive path.  The transmit path is messy.  We 
grab the time before the call to send the packet so it is early.  This gets 
worse with NTS or shared key MACs.

There is a draft in the works:

NTP Interleaved Modes
  https://tools.ietf.org/html/draft-ietf-ntp-interleaved-modes-02
It requires per-client state at the server.  That's ugly in principle, but not 
a major problem if you believe that modern systems have lots of memory.

I've skimmed it, but not studied it.

I've been thinking of measuring the timing and bumping the time stamp to 
correct for the delay.  Again, this seems appropriate for some experimentation 
and data collection.
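
A rough sketch of the arithmetic, assuming the delay is measured by taking 
clock_gettime() readings around the send call on earlier packets.  The names 
elapsed_ns() and bump_timestamp() are hypothetical, and whether the measured 
delay is stable enough to be useful is exactly what the experiment would have 
to show.

```c
#include <stdint.h>
#include <time.h>

#define NS_PER_SEC 1000000000L

/* Elapsed nanoseconds from start to end. */
int64_t elapsed_ns(struct timespec start, struct timespec end)
{
    return (int64_t)(end.tv_sec - start.tv_sec) * NS_PER_SEC
         + (end.tv_nsec - start.tv_nsec);
}

/* Return ts advanced by delay_ns (assumed >= 0), keeping tv_nsec
 * in the range 0..NS_PER_SEC-1. */
struct timespec bump_timestamp(struct timespec ts, int64_t delay_ns)
{
    ts.tv_sec  += delay_ns / NS_PER_SEC;
    ts.tv_nsec += delay_ns % NS_PER_SEC;
    if (ts.tv_nsec >= NS_PER_SEC) {
        ts.tv_sec  += 1;
        ts.tv_nsec -= NS_PER_SEC;
    }
    return ts;
}
```

The transmit time stamp for the next packet would then be the grabbed time 
passed through bump_timestamp() with some smoothed value of the delays seen 
so far.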

I'd guess a week or two.  Maybe more if the data gets interesting.

--

We should check RFC 8633 - NTP Best Current Practices
  https://www.rfc-editor.org/rfc/rfc8633.html

  

-- 
These are my opinions.  I hate spam.





Re: Future directions

2019-09-16 Thread Hal Murray via devel


devel@ntpsec.org said:
> I'd like to see "struct shmTime" cleaned up and moved into a header file for
> system use.  Right now it is not in any header file, so clients like gpsd
> need their own copies. 

Assume we had a nice header file.  Where would it live?  What are the 
mechanics of getting it to ntpsec or gpsd?

Maybe another project with the sample/test code?

-

I think we should put the current stuff on the back burner and make a new SHM 
interface where the clients are read only.

Is shmget/shmat the right API to use?  I remember discussions of there being a 
wrong API but don't remember any details.



-- 
These are my opinions.  I hate spam.





Re: Future directions

2019-09-16 Thread Hal Murray via devel


> I always liked the idea of moving to a shm or a local socket "clockd"
> interface.

My comment and Gary's were only about cleaning up the existing SHM interface.  
No changes outside refclock_shm.c (and whatever it takes to support a 
new/clean header file).

>  (Under the hood, a UNIX domain socket or a 127/8 localhost socket is nothing
> more than a shm segment and two semaphore locks.)

Are UNIX domain sockets available on all OSes?  If so, that's worth 
considering.

Every now and then over the years, somebody comes along and wants us to clean 
up the SHM area to use a semaphore.  I've never figured out how to initialize 
that.  We have 2 independent processes that can start/stop/crash in any order.

> A clockd interface was, in fact, part of the original plan.  Maybe make it
> the plan again? 

I never saw a good proposal along those lines.  It would fit in with Eric's 
desire to rip stuff out - we could get rid of all the refclock_* code.  Not 
really, we/somebody would still have to support it, but it could be moved to a 
different project.  It's not high on my list.


-- 
These are my opinions.  I hate spam.





Fwd: Future directions

2019-09-16 Thread Hal Murray via devel


> - a multicast DNS broadcaster for NTS.
> - additions to the DNS code to allow non-A/AAAA pools. (cname/srv probably)
> - Additions to the DNS code to allow for mdns monitoring. 

I'm not a DNS wizard.  That area is slightly ugly in that the DNS work is done 
in a separate thread so there is some work to get to/from that thread.  But 
that code exists and once inside that thread you can do whatever is needed.

There are restrictions on DNS for NTS.  You start with a text string from a 
config file.  You need to turn that into an IP Address and then the host at 
that address has to have a certificate that matches that text.  That is all 
straightforward if the text is a host name.  I don't see how multicast helps.  
cnames are no problem as long as the host you contact has a certificate that 
matches the initial text/name.  If you want to do anything more complicated, 
then you are dragging DNS into the NTS security model.  That seems like a good 
area to avoid.


> - replace mode6 with a tcp service. (it was only IIRC in v2-3 RFCs)
> - - or work on the auth code for ntpq a bit.

The current mode6 has 2 levels of auth.  There is a simple cookie handshake to 
make sure you are not responding to a forged IP Address.  Then there is a 
password to enable writing.  I think that password is sent in the clear.  
restrict gets tangled up in here.  I'd have to check the details.  I'd be 
happy to fix the password in the clear.

TCP drags in a pile of complications.  You have to limit the number of 
connections and then worry about bad guys tying them up.  We already have 
those problems with the NTS-KE servers.

Using TLS rather than raw TCP seems like the way to go.  The mode6 TLS server 
could use UDP to talk to the old mode6 ntpd server.

I never use ntpq to write anything so none of that is high on my list.


> Given the increase in threading would it be possible to shove smb auth into a
> thread? 

Possible?  Sure.  But not worth much effort unless we find an interested user.



-- 
These are my opinions.  I hate spam.





Re: Future directions

2019-09-16 Thread Hal Murray via devel


> Are UNIX domain sockets available on all OSes?  If so, that's worth
> considering. 

One advantage of the read-only SHM mode is that it supports multiple readers.  
So you could run shmmon while ntpd is also running.


-- 
These are my opinions.  I hate spam.





Re: Future directions

2019-09-16 Thread Hal Murray via devel


> Another is that you can allow anyone on the localhost to read the SHM without
> any risk of them messing up the SHM data.

Good point.

> It is also fast since it is memory mapped.  No system call overhead. 

CPU cycles aren't important for refclocks.  (at least within reason)

The disadvantage of SHM is that there is no way to wake up a reader when new 
data is available.  Readers have to poll.
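
For illustration, here is one way a polled, read-only reader could look, using 
a sequence counter instead of a semaphore so there is no lock to initialize.  
This is only a sketch under assumptions: struct shm_clock is a made-up layout 
(not the existing struct shmTime), there is a single writer, and the data 
fields are plain (non-atomic) for brevity.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sample and segment layout, for illustration only. */
struct sample {
    int64_t sec;
    int32_t nsec;
};

struct shm_clock {
    atomic_uint seq;        /* odd while the writer is mid-update */
    struct sample data;
};

/* Writer side (single writer assumed): bump seq before and after. */
void sample_publish(struct shm_clock *shm, int64_t sec, int32_t nsec)
{
    atomic_fetch_add(&shm->seq, 1);     /* now odd: update in progress */
    shm->data.sec = sec;
    shm->data.nsec = nsec;
    atomic_fetch_add(&shm->seq, 1);     /* even again: update complete */
}

/* Reader side: only loads from the segment, never stores to it.
 * Returns true and fills *out when a complete sample newer than
 * *last_seq was copied; returns false otherwise. */
bool sample_poll(struct shm_clock *shm, unsigned *last_seq,
                 struct sample *out)
{
    unsigned before = atomic_load(&shm->seq);
    if ((before & 1u) || before == *last_seq)
        return false;       /* writer busy, or nothing new yet */
    *out = shm->data;
    atomic_thread_fence(memory_order_acquire);
    if (atomic_load(&shm->seq) != before)
        return false;       /* changed during the copy; retry next poll */
    *last_seq = before;
    return true;
}
```

A reader would call sample_poll() on a timer.  Since readers never write, any 
number of them can share the segment, and either side can restart without 
leaving a lock held.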

-- 
These are my opinions.  I hate spam.




