On Nov 20, 2012, at 2:28 PM, Jay Ashworth <j...@baylink.com> wrote:

> ----- Original Message -----
>> From: "Leo Bicknell" <bickn...@ufp.org>
>
>> To protect against two falseticking servers (tick and tock, as we saw on
>> the 19th) you need _FIVE_ servers minimum configured if they are both in
>> the list. More importantly, if you want to protect against a source
>> (GPS, CDMA, IRIG, WWIV, ACTS, etc) false ticking, you need a minimum of
>> _FOUR_ different source technologies in the list as well.
>>
>> It's not hard, my box that I posted the logs from peers with 18
>> servers using 8 source technologies, all freely available on the Internet...
>
> I'm curious, Leo, what your internal setup looks like. Do you have an
> internal pair of masters, all slaved to those externals and one another,
> with your machines homed to them? Full mesh? Or something else?
>
> In my last big gig, it was recommended to me that I have all the machines
> which had to speak to my DBMS NTP *to it*, and have only it connect to the
> rest of my NTP infrastructure. It coming unstuck was of less operational
> impact than *pieces of it* going out of sync with one another...

here's a sample ntp config from one of my systems.

-- snip --
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
server 0.fedora.pool.ntp.org
server 1.fedora.pool.ntp.org
server 2.fedora.pool.ntp.org
server 3.fedora.pool.ntp.org
#
server 0.us.pool.ntp.org iburst maxpoll 9
server 1.us.pool.ntp.org iburst maxpoll 9
server 2.us.pool.ntp.org iburst maxpoll 9
server 129.250.35.250 iburst maxpoll 9
server 129.250.35.251 iburst maxpoll 9
-- snip --

You can audit its operation like this:

nat:~$ ntpq -p -n -c ass
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
-129.250.35.250  164.244.221.197  2 u   68  512  377   19.248   -0.135   3.195
+129.250.35.251  192.5.41.40      2 u  439  512  377   41.817    1.109  15.660
-206.57.44.17    204.123.2.5      2 u  126  512  377   37.133   -6.443   9.631
+4.53.160.75     209.81.9.7       2 u   48  512  377   25.209    1.551   8.804
-64.73.32.135    192.5.41.41      2 u  349  512  377   23.418   -0.703   1.721
*50.116.38.157   64.250.177.145   2 u  380  512  377   43.021    1.267   2.136
+208.87.221.228  10.0.22.49       2 u  517  512  377   92.000    0.974   0.678
-206.212.242.132 128.252.19.1     2 u  323  512  377   21.781   -2.873   1.304
+38.229.71.1     204.123.2.72     2 u  211  512  377   21.977   -0.055   2.274

ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1 39973  931a   yes   yes  none   outlyer    sys_peer  1
  2 39974  941a   yes   yes  none candidate    sys_peer  1
  3 39975  9324   yes   yes  none   outlyer   reachable  2
  4 39976  942a   yes   yes  none candidate    sys_peer  2
  5 39977  931a   yes   yes  none   outlyer    sys_peer  1
  6 39978  961a   yes   yes  none  sys.peer    sys_peer  1
  7 39979  9414   yes   yes  none candidate   reachable  1
  8 39980  931a   yes   yes  none   outlyer    sys_peer  1
  9 39981  941a   yes   yes  none candidate    sys_peer  1

What you would have seen is a falseticker from the impacted clocks.

This is a fairly reasonable setup.
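
If you'd rather script that audit than eyeball it, here is a rough sketch in Python. It is not anything I actually run: the names (parse_peers, falsetickers, SAMPLE) are made up for illustration, and it assumes the usual ten-column "ntpq -p -n" peer layout with the tally character in the first column ('*' system peer, '+' candidate, '-' outlier, 'x' falseticker). The sample rows are the peer lines from the output above.

```python
# Tally characters from the first column of ntpq's peer billboard.
TALLY = {
    " ": "reject",
    "x": "falseticker",
    ".": "excess",
    "-": "outlier",
    "+": "candidate",
    "#": "backup",
    "*": "sys.peer",
    "o": "pps.peer",
}

# Peer rows copied from the ntpq output shown in this message.
SAMPLE = """\
-129.250.35.250  164.244.221.197  2 u   68  512  377   19.248   -0.135   3.195
+129.250.35.251  192.5.41.40      2 u  439  512  377   41.817    1.109  15.660
-206.57.44.17    204.123.2.5      2 u  126  512  377   37.133   -6.443   9.631
+4.53.160.75     209.81.9.7       2 u   48  512  377   25.209    1.551   8.804
-64.73.32.135    192.5.41.41      2 u  349  512  377   23.418   -0.703   1.721
*50.116.38.157   64.250.177.145   2 u  380  512  377   43.021    1.267   2.136
+208.87.221.228  10.0.22.49       2 u  517  512  377   92.000    0.974   0.678
-206.212.242.132 128.252.19.1     2 u  323  512  377   21.781   -2.873   1.304
+38.229.71.1     204.123.2.72     2 u  211  512  377   21.977   -0.055   2.274
"""


def parse_peers(text):
    """Turn 'ntpq -p -n' peer rows into dicts; header and rule lines are skipped."""
    peers = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(("remote", "=")):
            continue
        if line[0] in TALLY:
            tally, fields = line[0], line[1:].split()
        else:
            # No recognizable tally character: treat the peer as rejected.
            tally, fields = " ", line.split()
        if len(fields) < 10:
            continue  # not a peer row in the assumed ten-column layout
        peers.append({
            "remote": fields[0],
            "refid": fields[1],
            "status": TALLY[tally],
            "offset_ms": float(fields[8]),
        })
    return peers


def falsetickers(peers):
    """Return the remotes that ntpd has flagged as falsetickers."""
    return [p["remote"] for p in peers if p["status"] == "falseticker"]


if __name__ == "__main__":
    peers = parse_peers(SAMPLE)
    bad = falsetickers(peers)
    print("falsetickers:", ", ".join(bad) if bad else "none")
```

In practice you would feed it the real output (for example from subprocess.run(["ntpq", "-p", "-n"], capture_output=True, text=True).stdout) and alert whenever falsetickers() comes back non-empty.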

I've also been looking at an item like this:

http://www.netburnerstore.com/ProductDetails.asp?ProductCode=PK70EX-NTP

which is about $300 + misc parts. Should be well worth it to avoid a
'major outage' that some folks had with needing to reboot their servers, etc.

- Jared