TL;DR: Here be nitpicks masquerading as answers, humor
       in wisdom's drag.

On 2016-04-05 12:47, Edward Ned Harvey (lopser) wrote:
> It's clear that the industry best practice is to use a real
> time source, but there is a flip side: If you deviate from the
> industry best practice, how much do you risk?

Hi Edward,

From your posts here, I've got tons of respect for your opinions
and insights, some of which I've snagged for posterity. Below,
when I say "you" I'm really addressing the List -- who as a rule
are less subtle, less skilled, and more in need of a sharp push.
I'm saddened by the irrational optimism often expressed here.
Please bear with me.

  | As the IT person, I never trust anything.  Every piece of hardware,
  | every software, every system, is expected to fail, and it's my
  | job as IT person, to minimize harm caused by these failures. So
  | right now, I'm transitioning from "never trust anything" to
  | "holy crap, look at that Corillian Death Ray."
  | -- Edward Ned Harvey, 2013-05-12

The biggest problems facing sysadmins in general are
risk-related. A previous employer saw backups as a time-consuming
predator that was never going to pay for itself. The problem was
not that they hated backups, but that they couldn't see the
existential risk to the business that backups help reduce. The ops
manager's predecessor had a career-limiting incident with a
failed system for which there was no backup. And yet nothing was
learned.

> When instructed to
> deviate from best practice, should you be upset and insist that
> your boss put the request in writing, creating friction between
> yourself and him/her? Should you refuse to do it completely?
> Should you just roll with it?

There's an over-arching game being played; the rules are baked
into our primate behavior. If your boss is supportive and you
have a warm and reciprocal relationship, mention that you need
resources to keep the lights on and wait for them to make it
rain.

  | When the customer has beaten upon you long enough, give him what
  | he asks for, instead of what he needs. This is very strong
  | medicine, and is normally only required once.
  | --Vadim Vygonets, a.s.r

Otherwise, carefully transact Vadim's advice and discreetly
carry an umbrella. You can use the tip to prod the most suitable
orifice if funds are delayed. If your manager misunderstands
their role and thinks they're a problem-solver or maybe
a dominatrix, you've lost the game.

  | "It's called a shovel," said the Senior Wrangler. "I've seen
  | the gardeners use them. You stick the sharp end in the ground.
  | Then it gets a bit technical." -- Terry Pratchett, "Reaper Man"

> If you have a specific need, then it's important to cater to
> that need, and you might have to insist against deviation from
> best practice.

Paul Culmsee and K. Awati wrote a book-length screed about
best practices that I highly recommend, ISBN 1938908406.
Spoiler alert: they don't work.

> But if instead, all you have to worry about is AD
> remaining functional, and approximate correct timestamps on
> files and such, and users knowing the correct time to show up at
> meetings, then your need is much less strict.

Timestamp accuracy requirements are relative. AD's +/- 5 minutes
(based, I think, on the Kerberos requirements for ticket expiry)
is the guard-rail of timekeeping. Maybe you didn't plunge off
the highway down the embankment but you're definitely not
driving within the lane. At the other end of the scale, when
systems have gone pear-shaped and 1 msec clock uncertainty has
sundered your ability to separate cause from effect, the cost
savings of having run a loose ship will melt away into the lake
of overtime, downtime, and lost sales. If you're lucky you can
force the bill onto the DBAs and developers who will mop up the
mess. Just don't sit with your backs to them.
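
For the List, a minimal Python sketch of the cause-versus-effect
point, not anything lifted from a real tool. It assumes each log
timestamp carries a worst-case clock error (the `uncertainty`
field, which is my invention, as are the `Stamp` and `ordering`
names and the host names) and says two events can be ordered only
when their error bars don't overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stamp:
    host: str           # hypothetical host name
    seconds: float      # logged timestamp, seconds since the epoch
    uncertainty: float  # assumed worst-case clock error, in seconds


def ordering(a: Stamp, b: Stamp) -> str:
    """Return 'a<b', 'b<a', or 'unknown' when the error bars overlap."""
    # a's latest possible time is still before b's earliest possible time
    if a.seconds + a.uncertainty < b.seconds - b.uncertainty:
        return "a<b"
    # b's latest possible time is still before a's earliest possible time
    if b.seconds + b.uncertainty < a.seconds - a.uncertainty:
        return "b<a"
    # Intervals overlap: the logs alone cannot say which came first
    return "unknown"
```

Two events logged 1.5 msec apart on hosts that are each good to
0.1 msec can be ordered; with 1 msec of uncertainty on each host
the answer is "unknown", and that is where the overtime starts.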

> I hear people on this list referring to "unstable time source"
> and "false ticker." This is a valid concern, sort-of. In
> reality, your guest machines are no better at tracking time than
> the VM time server is, so when there's any deviation between the
> client and your time server, it doesn't result in a false
> ticker. The way you get a false ticker is when you have
> configured several time servers, and some of them agree with
> each other by quorum, but there is an outlier. In that case, the
> clients still get the correct time, but the one outlier server
> needs to be corrected.

Break some canary server loose from the tyranny of accurate
timestamps and wait for the alarm. If alarms don't sound, have
a talk with your PFY about the nature of systems, geometry,
causal chains, overtime, risk management, and so on, until one
day they come (metaphorically) running over waving a printout
saying "holy sh*t the clock on canary666 is way out of spec
how do we fix that?" Until that moment arrives, keep 3 to 6
months' wages in highly liquid form. Single malt is good.
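
What the canary alarm has to compute can be sketched in a few
lines of Python. The offset formula is the usual NTP one, built
from the four timestamps of a request/response exchange. The
function names and the warn/crit thresholds are hypothetical,
picked for illustration; set them to whatever your
troubleshooting needs actually demand.

```python
def ntp_offset(t0: float, t1: float, t2: float, t3: float) -> float:
    """Estimate the local clock's offset from the server, in seconds.

    t0: client transmit time (client clock)
    t1: server receive time  (server clock)
    t2: server transmit time (server clock)
    t3: client receive time  (client clock)
    A positive result means the local clock is behind the server.
    """
    return ((t1 - t0) + (t2 - t3)) / 2.0


def check_canary(offset_seconds: float,
                 warn: float = 0.1,
                 crit: float = 1.0) -> str:
    """Map an offset to an alarm level; thresholds are illustrative."""
    magnitude = abs(offset_seconds)
    if magnitude >= crit:
        return "CRITICAL"
    if magnitude >= warn:
        return "WARNING"
    return "OK"
```

Feed `check_canary` whatever offset your monitoring already
collects; the point is that somebody gets paged when the canary
drifts, rather than a human noticing that the laptop and the
phone disagree.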

> There is something to be said for anecdotal evidence, when
> there's a lot of it. I've seen many dozens of environments where
> AD is run in virtual servers, which get their time from the
> internet (non-local stratum 2), and AD serves time to the
> clients on the LAN without noticeable problems. Sure the time
> may drift a few seconds in either direction, but again - unless
> you have a specific need, that's good enough for general use.

The key thing is to have very tight control over clocks within
the blast radius of your troubleshooting domain. The above
reasonable assumptions once went wrong when a junior SA
misconfigured a subdomain in a way that got hundreds of systems
pulling time from the first workstation to boot in the morning.

  | The ticket was closed with 'Colonel Mustard, in the
  | datacenter, with the keyboard.'

> Most environments don't have a dedicated stratum 1 time
> source, and most AD is run on VM's.

Is this the right place to remind everyone that VMs experience
time travel? One moment it's ten after two, then there's a
snapshot revert, and suddenly it's twenty 'til. Oh my, what is
the luckless bastard of an NTP client going to make of the giant
jumping timeslop? (Hint: answer varies depending on OS make,
model, and vintage. Some exceptions apply. Results not
guaranteed in all states. When in doubt ask your doctor.)
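
One hedged way to at least notice a step from inside the guest,
sketched in Python: compare how far the wall clock moved against
how far the monotonic clock moved over the same interval. The
`sample` and `clock_step` names are mine. Whether the monotonic
clock itself survives a suspend or revert intact depends on the
OS and hypervisor, so treat this as a tripwire for steps applied
while the process is running, not as proof of good time.

```python
import time


def sample() -> tuple[float, float]:
    """Return (wall clock, monotonic clock) readings taken together."""
    return time.time(), time.monotonic()


def clock_step(prev_wall: float, prev_mono: float,
               now_wall: float, now_mono: float) -> float:
    """Wall-clock elapsed time minus monotonic elapsed time, in seconds.

    Close to zero when the wall clock is only being slewed gently;
    strongly negative if it was stepped backward between samples,
    strongly positive if it was stepped forward.
    """
    return (now_wall - prev_wall) - (now_mono - prev_mono)
```

Take a `sample()` every so often, pass consecutive pairs to
`clock_step`, and log anything whose magnitude is bigger than
you are comfortable with.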

> In those many dozen environments, I've seen many hundreds of
> VM's running as guests on peoples' laptops. Most common is the
> windows VM running inside someone's macbook. This introduces yet
> another level of VM time-fuzziness, and yet again, I've never
> seen it cause a problem, because the only requirements in those
> environments have been to keep AD running, and clients
> operational, and users showing up to meetings at the right time.

Right -- the real world, of users and desktops. System
administration of the back end is less forgiving.

> I have some further anecdotal evidence: I have actually seen a
> situation or two, where the AD server was no longer able to get
> time from the internet (due to firewall change - once hardware,
> and once software). So the AD time source slowly drifted off,
> and all the clients slowly followed. We discovered when a human
> noticed, "Why does my laptop say 4:05, when my phone says 4:13?"
> So then we restored the ability for AD to follow the internet,
> and AD slowly adjusted back to the right time, and all the
> clients slowly followed.

> Nevermind stratum 1. That's a VM client, following a VM
> server, following a non-local stratum 2 time source. It's about
> as bad as you can get, but the problem was caused by firewall
> blockage, and the behavior of the system was about as ideal as
> you can get, once the firewall problems were corrected.
> 
> Unless you have a specific need, in this case, the risk of
> deviation from best practice is pretty low.

> Further evidence: Despite trying to demonstrate a problem with
> this, to prove your boss wrong, you couldn't demonstrate a
> problem.

-- 
Charles Polisher
