I've spent the last week triaging and resolving items from the NTPsec issue tracker. We're making excellent progress; the count of unresolved issues has gone from 41 to 15.
I shall round up the remaining issues and discuss where I think our priorities need to be.

Summary:

* I need to work on #356: reverse function for restrict.

* unpeer should be made to work fully from ntpq :config (#348). This one is mine too.

* In my opinion, our only real blocker is #347: ntpd doesn't synchronize quickly. This is one for Gary or Hal (our guys with operations and measurement experience), and I'd appreciate it if one of you stepped up.

* There are two waf recipe bugs that I'm completely blocked on, despite having stared at them a lot. We need a waf expert, but I don't know where to find one.

* waf configure needs a --unitdir option. Matt Selsky was going to do it but it hasn't landed yet. Matt, can you schedule time to complete this?

* We need RPM packaging. No volunteer has followed through on this yet.

* We have a NetBSD port bug that should be easy to fix, but I can't do it; no test access. Matt Selsky is the logical person to tackle this.

* I have written and documented an implementation of config directories that some of our other devs don't like. I don't think we'll have time to resolve that argument before 1.0, so I'm going to mark this feature unstable/experimental in the documentation and hope we don't get flamed if we change it.

* We have a couple of serious issues with the GPSD_JSON driver, a half-baked experimental feature of Classic.

Following the details section I have a summary of requests to our devs.

Details:

---------------------------------------------------------------------------

#356: RFE: reverse function for restrict
https://gitlab.com/NTPsec/ntpsec/issues/356

Hans Meyer: "The current implementation of NTPsec allows to configure detailed restrictions. Command line tool "ntpq" can be used to define restrictions during runtime. But the current implementation doesn't allow to remove already defined restrictions. "restrict" can only add definitions even if the attributes define less permissions.
Therefore I ask for a reverse function like 'release' or 'unblock'."

I was going to let this RFE slide until after 1.0, but there are two reasons not to. One is that we're light on user-visible features for a 1.0. The other is that Meyer has been our most persistent outside beta tester, and making him happy to keep him engaged seems like a good idea.

I have to do this one; nobody else knows the configuration machinery well enough. Difficulty seems moderate. Probably a couple of days of work.

---------------------------------------------------------------------------

#348: server statement not checking for valid IP to be resolvable
https://gitlab.com/NTPsec/ntpsec/issues/348

Configuring a server with a typo in its name produces a bogus peer entry that (naturally) hangs in INIT state forever. It can't be removed with unpeer.

There are two issues here. One is that unpeer is not doing what it should. That is a bug and needs to be fixed. The other is whether ntpd should retry failed peer-name lookups, and if so how often; there's an argument about this in the bug thread. Currently it does not retry.

Arguments for: (1) Allows recovery from temporary DNS failures, (2) deals with any possible boot-time race between DNS coming up and NTP coming up. (I note, however, that the latter seems to be only a theoretical problem; I've never seen a bug report that clearly matches this scenario.)

Arguments against: (1) Additional code complexity, (2) DDoS risk.

In my mind, "against" wins. Here's why:

The users of ntpsec will be divided into two cohorts. 99% will never use anything but a canned configuration that talks to pool servers. For these people, a new set of retry-policy knobs will be useless; they never even look at their configs!

The other 1% is experienced time sysadmins who use ntpq and are quite capable of noticing an entry stuck in INIT or STEP state and dealing with it manually.
At best, adding another policy knob could only help part of that 1% - and people in that group don't qualify new hosts very often, anyway.

Conclusion: adding a retry facility Classic never had isn't a good idea. Making unpeer work, on the other hand, seems worth doing.

(Anybody who wants to argue with this decision should do so in the issue thread, not here.)

---------------------------------------------------------------------------

#347: ntpd doesn't synchronize quickly
https://gitlab.com/NTPsec/ntpsec/issues/347

Expected time to first sync has increased since 0.9.7. I consider this an important place to not let the competition win. This is the only tracker bug I consider a release blocker.

We need to bisect and figure out what change slowed us down, and fix it. Hal suspects his DNS changes of a few months ago might be implicated. He's the logical person to work this.

---------------------------------------------------------------------------

#312: pyc generated files do not have matching timestamps
https://gitlab.com/NTPsec/ntpsec/issues/312

Something is not quite right in our waf recipe. The three files in question are generated with some rather odd productions in pylib/wscript that tla helped me develop.

The fix for this would almost certainly be trivial if we knew what it was. The real problem here is that waf is so badly documented that troubleshooting problems like this is extremely difficult.

We need a waf expert. I don't know where to find one. I've stared at this problem a lot but gotten nowhere.

---------------------------------------------------------------------------

#273: No repo or cache detected
https://gitlab.com/NTPsec/ntpsec/issues/273

Another waf recipe problem I have not been able to gain a clue about. As before, we need a waf expert.

---------------------------------------------------------------------------

#270: Loss of precision in step_systime()
https://gitlab.com/NTPsec/ntpsec/issues/270

This isn't going to get done in 1.0.
Gary and I need to have a design argument (with Hal pitching in) about how pivoting works, and should work. This is a particularly murky area of Mills's code - I'm not sure *any* of us understands it right.

---------------------------------------------------------------------------

#269: Update and install systemd services if user requires them
https://gitlab.com/NTPsec/ntpsec/issues/269

This one seems mostly resolved. Matt Selsky promised to add a --unitdir option that would do the rest. Matt, can you finish that?

---------------------------------------------------------------------------

#252: Need an RPM package
https://gitlab.com/NTPsec/ntpsec/issues/252

Yes, we do. Occasionally we get a volunteer surfacing on #ntpsec to do this, but nobody has followed up yet.

I've put my apprentice Keane (Dr. Daemoneye) on this problem. He thinks he can have results this week.

---------------------------------------------------------------------------

#251: Add fudge option to server config
https://gitlab.com/NTPsec/ntpsec/issues/251

Gary and Daniel are having an argument over whether this is a good idea. Me, I'd rather not do it. Just to keep life simple. But they understand the terrain in ways I don't.

---------------------------------------------------------------------------

#220: ntpc.so is unable to resolve libpython2.7.1.0 on NetBSD
https://gitlab.com/NTPsec/ntpsec/issues/220

This appears to be a waf recipe problem, not passing -R/usr/pkg/lib to the linker as it should.

Matt, you can test on NetBSD. Can you follow up on this?

---------------------------------------------------------------------------

#204: Support /etc/ntp.d
https://gitlab.com/NTPsec/ntpsec/issues/204

There is disagreement about how this should work. Probably not to be resolved before 1.0.

---------------------------------------------------------------------------

#62: Refclock #20 behaves perversely on GPS signal loss.
https://gitlab.com/NTPsec/ntpsec/issues/62

I see the problem Gary is describing, but I don't know if a fix is possible even in principle.

Gary, if you have a problem analysis that suggests a fix, please describe it in the issue thread. If you don't, tell me so we can document this as a known (unsolvable) problem.

---------------------------------------------------------------------------

#57: Refclock #46, GPSD_JSON, bad NMEA time
https://gitlab.com/NTPsec/ntpsec/issues/57

#55: ntpd refclock #46 just stops working.
https://gitlab.com/NTPsec/ntpsec/issues/55

I've grouped these together because they are aspects of the same problem: the GPSD_JSON driver was a bad idea to begin with and is in pretty crappy shape internally.

As the designer of GPSD_JSON, I am in a unique position to be able to say to the world "this was a bad idea and I'm killing it". I intend to do exactly that before 1.0 if it doesn't get fixed.

---------------------------------------------------------------------------

#44: Confusion with drift at the rail
https://gitlab.com/NTPsec/ntpsec/issues/44

I don't fully understand this issue. I need Hal, who raised it, to suggest at least a theoretical fix.

---------------------------------------------------------------------------

Work requests:

I don't normally like to try to hand out assignments or get people to commit to doing them, but coming up on a release I need to have some idea what we can realistically get done and where we need to somehow recruit extra help.

Gary: Our top priority needs to be #347, slow startup. I need to know that either you or Hal is on this and will nail it down. Also, it's up to you to save the GPSD_JSON driver (#57, #55). I don't think anyone else is invested in it, and I'd frankly prefer dropping it to trying to fix it. I also need a better characterization of #62. If you can, please tackle these in roughly the order listed.

Matt: You took on being our build-system expert a while back, which puts #312, #273, #269, and #220 on your list. I hate to stick you with trying to decrypt the waf docs, but there isn't anyone obviously better equipped.

Hal: Either Gary needs to be on #347 or you do. There's also #44, our oldest open bug.

Keane: You've taken on #252.

Myself: #356 and #348 are obviously mine. And I'm the backstop for everybody else, which is why I'm not assigning myself more up front.

I've put corresponding assignments on the tracker issues.

RSVP, everybody. I need to know what you can do and are willing to do. Remember, September 28th. If we get through these there are maybe some more fun things we can do before release.

--
Eric S. Raymond
http://www.catb.org/~esr/