Client Command Hooks Proposal, v2

Mark Chu-Carroll Thu, 03 Apr 2014 14:25:17 -0700

A while ago, I sent around a proposal for how to do hooks in the v2 command
line for Aurora. Since then, I've been talking to a variety of people about
what they'd like to be able to do with hooks, and I've revised the
proposal. Please respond with any comments, criticisms, praise, or
brickbats.


    -Mark

# Command Hooks for the Aurora Client

## Introduction/Motivation

We've got hooks in the client that surround API calls. These are
pretty awkward, because they don't correlate with user actions. For
example, suppose we wanted a policy that said users weren't allowed to
kill all instances of a production job at once.

Right now, all that we could hook would be the "killJob" API call. But
kill (at least in newer versions of the client) normally runs in
batches. If a user called killall, what we would see at the API level
is a series of "killJob" calls, each of which specified a batch of
instances. We wouldn't be able to distinguish between really killing
all instances of a job (which is forbidden under this policy) and
carefully killing in batches (which is permitted). In each case, the
hook would just see a series of API calls, and couldn't find out what
the actual command being executed was!

For most policy enforcement, what we really want to be able to do is
look at and vet the commands that a user is performing, not the API
calls that the client uses to implement those commands.

So I propose that we add a new kind of hook, which surrounds noun/verb
commands. A hook will register itself to handle a collection of (noun,
verb) pairs. Whenever any of those noun/verb commands is invoked, the
hook's methods will be called around the execution of the verb. A
pre-hook will have the ability to reject a command, preventing the
verb from being executed.

## Registering Hooks

These hooks can be registered in three ways:
* System hooks file. There will be a global configuration file, much like the
  current `clusters.json`, which can define hooks.
* Project hooks file. If a file named `AuroraHooks` is in the project directory
  where an aurora command is being executed, that file will be read,
  and its hooks will be registered.
* Configuration plugins. A configuration plugin can register hooks using an API.
  Hooks registered this way are, effectively, hardwired into the client
  executable.

The order of execution of hooks is unspecified: they may be called in
any order. There is no way to guarantee that one hook will execute
before some other hook.


### Global Hooks

Hooks registered by the Python call are called _global_ hooks,
because they will run for all configurations, whether or not they
specify any hooks in the configuration file.

In the implementation, hooks are registered in the module
`apache.aurora.client.cli.hooks`, using the class `HookRegistry`.  A
global hook can be registered by calling `HookRegistry.register_hook`
in a configuration plugin.

### Hook Files

A hook file is a file containing Python source code. It will be
dynamically loaded by the Aurora command line executable. After
loading, the client will check the module for a global variable named
"hooks", which contains a list of hook objects, which will be added to
the hook registry.
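
As a rough illustration of the loading step, the sketch below reads a hook file and returns its `hooks` list. It is written against modern Python 3 `importlib` rather than whatever the client itself would use, and the function name `load_hooks_from_file` and the default module name are made up for this example.

```python
import importlib.machinery
import importlib.util


def load_hooks_from_file(path, module_name="aurora_hooks_file"):
    """Load a Python hook file and return the contents of its `hooks` list.

    `load_hooks_from_file` and `module_name` are hypothetical names used
    only for this sketch; they are not part of the proposal.
    """
    # Hook files such as `AuroraHooks` have no .py suffix, so the source
    # loader is named explicitly instead of relying on the file extension.
    loader = importlib.machinery.SourceFileLoader(module_name, path)
    spec = importlib.util.spec_from_loader(module_name, loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    # Assumption: a file that defines no `hooks` variable contributes nothing.
    return list(getattr(module, "hooks", []))
```

Each object in the returned list would then be handed to the hook registry. Note that loading a hook file executes arbitrary Python, so the file has to be trusted to the same degree as the client itself.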

The system hooks file will, by default, be located in
`/etc/aurora/hooks`. A project hooks file will be named `AuroraHooks`,
and will be located in either the directory where the command is being
executed, or one of its parent directories, up to the nearest git
repository base.
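
The search for a project hooks file could look roughly like the following. This is a sketch under two assumptions that the proposal does not spell out: a directory containing a `.git` entry is treated as the repository base, and outside of any repository the search simply continues to the filesystem root. The function name `find_project_hooks_file` is invented for the example.

```python
import os


def find_project_hooks_file(start_dir, filename="AuroraHooks"):
    """Return the path of the nearest project hooks file, or None.

    Checks start_dir, then each parent directory in turn, and stops after
    checking the first directory that contains a `.git` entry.
    """
    current = os.path.abspath(start_dir)
    while True:
        candidate = os.path.join(current, filename)
        if os.path.isfile(candidate):
            return candidate
        # Assumption: a `.git` entry marks the git repository base, so the
        # search does not continue above this directory.
        if os.path.exists(os.path.join(current, ".git")):
            return None
        parent = os.path.dirname(current)
        if parent == current:
            # Reached the filesystem root without finding a hooks file.
            return None
        current = parent
```

Stopping at the repository base keeps a hooks file that lives outside the project from silently applying to it.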

### The API

    from abc import abstractmethod

    class Hook(object):
      @property
      def id(self):
        """Returns an identifier for the hook."""

      def get_nouns(self):
        """Return the nouns that have verbs that should invoke this hook."""

      def get_verbs(self, noun):
        """Return the verbs for a particular noun that should invoke this hook."""

      @abstractmethod
      def pre_command(self, noun, verb, context, commandline):
        """Execute a hook before invoking a verb.
        * noun: the noun being invoked.
        * verb: the verb being invoked.
        * context: the context object that will be used to invoke the verb.
          The options object will be initialized before calling the hook.
        * commandline: the original argv collection used to invoke the client.
        Returns: True if the command should be allowed to proceed; False if
        the command should be rejected.
        """

      def post_command(self, noun, verb, context, commandline, result):
        """Execute a hook after invoking a verb.
        * noun: the noun being invoked.
        * verb: the verb being invoked.
        * context: the context object that was used to invoke the verb.
          The options object will be initialized before calling the hook.
        * commandline: the original argv collection used to invoke the client.
        * result: the result code returned by the verb.
        Returns: nothing
        """

    class HookRegistry(object):
      @classmethod
      def register_hook(cls, hook):
        # Add the hook to the registry (implementation omitted here).
        pass
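
To make the API more concrete, here is a sketch of a hook for the policy from the introduction, together with one possible way the client could wrap a verb with hooks. Everything in it is illustrative: the `Hook` class is a minimal stand-in for the proposed base class, and `NoProdKillAllHook`, `run_with_hooks`, `HOOK_REJECTED`, the `prod` environment name, and the assumption that job keys look like `cluster/role/environment/name` are all invented for the example.

```python
class Hook(object):
    """Minimal stand-in for the proposed Hook base class (assumption)."""

    def get_nouns(self):
        return []

    def get_verbs(self, noun):
        return []

    def pre_command(self, noun, verb, context, commandline):
        return True

    def post_command(self, noun, verb, context, commandline, result):
        pass


class NoProdKillAllHook(Hook):
    """Hypothetical policy: forbid `job killall` on `prod` jobs."""

    @property
    def id(self):
        return "no_prod_killall"

    def get_nouns(self):
        return ["job"]

    def get_verbs(self, noun):
        return ["killall"] if noun == "job" else []

    def pre_command(self, noun, verb, context, commandline):
        for arg in commandline:
            parts = arg.split("/")
            # Assumption: job keys look like cluster/role/environment/name.
            if len(parts) == 4 and parts[2] == "prod":
                return False
        return True


# Sentinel returned when a pre-hook rejects the command (hypothetical).
HOOK_REJECTED = object()


def run_with_hooks(hooks, noun, verb, context, commandline, execute_verb):
    """Run execute_verb() surrounded by every hook matching (noun, verb)."""
    matching = [h for h in hooks
                if noun in h.get_nouns() and verb in h.get_verbs(noun)]
    for hook in matching:
        if not hook.pre_command(noun, verb, context, commandline):
            # A single rejection is enough: the verb is never executed.
            return HOOK_REJECTED
    result = execute_verb()
    for hook in matching:
        hook.post_command(noun, verb, context, commandline, result)
    return result
```

With this sketch, `aurora job killall east/bozo/prod/myjob` would be rejected before any API call is made, while `aurora job kill` on the same job would not trigger the hook at all, which is exactly the distinction an API-level hook cannot draw.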

## Skipping Hooks

In a perfect world, hooks would represent a global property or policy
that should always be enforced. Unfortunately, we don't live in a
perfect world, which means that sometimes, every rule needs to get
broken.

For example, an organization could decide that every configuration
must be checked in to source control before it could be
deployed. That's an entirely reasonable policy. It would be easy to
implement it using a hook. But what if there's a problem, and the
source repository is down?

The easiest solution is just to allow a user to add a `--skip-hooks`
flag to the command-line. But doing that means that an organization
can't actually use hooks to enforce policy, because users can skip
them whenever they want.

Instead, we'd like to have a system where it's possible to create
hooks to enforce policy, and then include a way of building policy
about when hooks can be skipped.

I'm using sudo as a rough model for this. Many organizations need to
give people the ability to run privileged commands, but they still
want to have some control. Sudo allows them to specify who is allowed
to run a privileged command; where they're allowed to run it; and what
command(s) they're allowed to run.  All of that is specified in a
special system file located in `/etc/sudoers` on a typical unix
machine.

### Specifying when hooks can be skipped

The sudoers file has a terrible syntax, so I'm not going to try to
adopt it; instead, I'm going to stick with the Pystachio-based
configuration syntax that we use in Aurora. A rule that permits a
group of users to skip hooks is defined using a Pystachio struct:

    class HookRule(Struct):
      roles = List(String)
      commands = Map(String, List(String))
      arg_patterns = List(String)
      hooks = List(String)

* `roles` is a list of role names, or regular expressions that range over role
  names. This rule gives permission to those users to skip hooks.
* `commands` is a map from nouns to lists of verbs. If a command `aurora n v`
  is being executed, this rule allows the hooks to be skipped if
  `v` is in `commands[n]`. If this is empty, then hooks can be skipped for
  all commands.
* `arg_patterns` is a list of regular expressions ranging over parameters.
  If any of the parameters of the command matches a pattern in this list,
  the hook can be skipped.
* `hooks` is a list of hook identifiers which can be skipped by a user
  that satisfies this rule.

The hooks file defines a global variable `hook_rules`, which is a list of
`HookRule` objects. If any of the hook rules matches, then the command
can be run with hooks skipped.

For example, the following is a hook rules file which allows:
* The admin (role admin) to skip any hook.
* Any user to skip hooks for test jobs.
* A specific group of users to skip hooks for jobs in cluster `east`.
* Another group of users to skip hooks for `job kill` in cluster `west`.

    allow_admin = HookRule(roles=['admin'])
    allow_test = HookRule(roles=['.*'], arg_patterns=['.*/.*/test/.*'])
    allow_east_users = HookRule(roles=['john', 'mary', 'mike', 'sue'],
        arg_patterns=['east/.*/.*/.*'])
    allow_west_kills = HookRule(roles=['anne', 'bill', 'chris'],
        commands={'job': ['kill']}, arg_patterns=['west/.*/.*/.*'])

    hook_rules = [allow_admin, allow_test, allow_east_users, allow_west_kills]
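
One way the client might evaluate such rules is sketched below. To keep the example runnable without Pystachio, `HookRule` here is a plain `namedtuple` stand-in built with a hypothetical `make_rule` helper. The sketch also assumes that patterns must match the whole role or parameter, and that an empty `arg_patterns` or `hooks` field means "no restriction"; the proposal states that only for `commands`, though the `allow_admin` rule suggests it.

```python
import re
from collections import namedtuple

# Plain stand-in for the Pystachio HookRule struct (assumption), so that
# the sketch runs without Pystachio installed.
HookRule = namedtuple("HookRule", ["roles", "commands", "arg_patterns", "hooks"])


def make_rule(roles, commands=None, arg_patterns=None, hooks=None):
    """Hypothetical helper that fills omitted fields with empty values."""
    return HookRule(roles, commands or {}, arg_patterns or [], hooks or [])


def _any_fullmatch(patterns, values):
    # Assumption: a pattern has to match the entire value, not a substring.
    return any(re.fullmatch(p, v) for p in patterns for v in values)


def rule_allows_skip(rule, role, noun, verb, args, hook_id):
    """Return True if `rule` lets `role` skip hook `hook_id` for this command."""
    if not _any_fullmatch(rule.roles, [role]):
        return False
    # An empty field is treated as "no restriction".
    if rule.commands and verb not in rule.commands.get(noun, []):
        return False
    if rule.arg_patterns and not _any_fullmatch(rule.arg_patterns, args):
        return False
    if rule.hooks and hook_id not in rule.hooks:
        return False
    return True


def can_skip(hook_rules, role, noun, verb, args, hook_id):
    """A hook may be skipped if any rule in hook_rules matches."""
    return any(rule_allows_skip(rule, role, noun, verb, args, hook_id)
               for rule in hook_rules)
```

Under these assumptions, a rule like `allow_west_kills` would let `anne` skip hooks for `aurora job kill west/bozo/devel/myjob`, but not for `job create` on the same job, and not for `job kill` on a job in cluster `east`.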

## Skipping Hooks on the Command Line

To skip a hook, a user passes the command-line option `--skip-hooks`. The
option can either name specific hooks to skip, or specify "all":

* `aurora --skip-hooks=all job create east/bozo/devel/myjob` will create a job
  without running any hooks.
* `aurora --skip-hooks=test,iq job create east/bozo/devel/myjob` will create a
  job, and will skip only the hooks named "test" and "iq".
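
Parsing the option might look something like the sketch below, which uses the standard `argparse` module. The parser layout (a `noun`, a `verb`, then free-form parameters) and the names `parse_skip_hooks` and `build_parser` are assumptions made for illustration and do not describe the actual client parser.

```python
import argparse


def parse_skip_hooks(value):
    """Turn a --skip-hooks value into "all" or a list of hook identifiers."""
    if value == "all":
        return "all"
    # Comma-separated hook identifiers; empty entries are ignored.
    return [name.strip() for name in value.split(",") if name.strip()]


def build_parser():
    """Build a simplified, hypothetical stand-in for the client parser."""
    parser = argparse.ArgumentParser(prog="aurora")
    parser.add_argument("--skip-hooks", dest="skip_hooks",
                        type=parse_skip_hooks, default=[],
                        help='"all", or a comma-separated list of hook ids')
    parser.add_argument("noun")
    parser.add_argument("verb")
    parser.add_argument("params", nargs="*")
    return parser
```

The parsed value only records what the user asked to skip. Whether each requested hook may actually be skipped still has to be checked against the hook rules before the command runs.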


## Changes

Major changes between this and the last version of this proposal:
* Command hooks can't be declared in a configuration file. There's a simple
  reason why: hooks run before a command's implementation is invoked.
  Config files are read during the command's invocation if necessary. If the
  hook is declared in the config file, by the time you know that it should
  have been run, it's too late. So I've removed config-hooks from the
  proposal. (API hooks defined by configs still work.)
* Skipping hooks. We expect aurora to be used inside of large
  organizations. One of the primary use-cases of hooks is to create
  enforceable policies that are specific to an organization. If hooks
  can just be skipped because a user wants to skip them, it means that
  the policy can't be enforced, which defeats the purpose of having them.
  So in this update, I propose a mechanism, loosely modeled on `sudo`,
  for defining when hooks can be skipped.
