Hi,

In working on the implementation of the encrypted secrets feature of
Zuul v3, I have found some things that warrant further discussion.  It's
important to be deliberate about this and I welcome any feedback.

For reference, here is the relevant portion of the Zuul v3 spec:

http://specs.openstack.org/openstack-infra/infra-specs/specs/zuulv3.html#secrets

And here is an implementation of that:

https://review.openstack.org/#/q/status:open+topic:secrets+project:openstack-infra/zuul

The short version is that we want to allow users to store private keys
in the public git repos which Zuul uses to run jobs.  To do this, we
propose to use asymmetric cryptography (RSA) to encrypt the data.  The
specification suggests implementing PKCS#1-OAEP, a standard for
implementing RSA encryption.

Note that RSA is not able to encrypt a message longer than the key, and
PKCS#1 includes some overhead which eats into that.  If we use 4096 bit
RSA keys in Zuul, we will be able to encrypt 3760 bits (or 470 bytes) of
information.

Further, note that value only holds if we use SHA-1.  It has been
suggested that we may want to consider using SHA-256 with PKCS#1.  If we
do, we will be able to encrypt slightly less data.  However, I'm not
sure that the Python cryptography library allows this (yet?).  Also, see
this answer for why it may not be necessary to use SHA-256 (and also,
why we may want to anyway):

https://security.stackexchange.com/questions/112029/should-sha-1-be-used-with-rsa-oaep

One thing to note is that the OpenSSL CLI utility uses SHA-1.  Right
now, I have a utility script which uses that to encrypt secrets so that
it's easy for anyone to encrypt a secret without installing many
dependencies.  Switching to another hash function would probably mean we
wouldn't be able to use that anymore.  But that's also true for other
systems (see below).

In short, PKCS#1 pros: Simple, nicely packaged asymmetric encryption,
hides plaintext message length (up to its limit).  Cons: limited to 470
bytes (or less).

Generally, when faced with the prospect of encrypting longer messages,
the advice is to adopt a hybrid encryption scheme (as opposed to, say,
chaining RSA messages together, or increasing the RSA key size) which
uses symmetric encryption with a single-use key for the message and
asymmetric encryption to hide the key.  If we want Zuul to support the
encryption of longer secrets, we may want to adopt the hybrid approach.
A frequent hybrid approach is to encrypt the message with AES, and then
encrypt the AES key with RSA.

The hiera-eyaml work which originally inspired some of this is based on
PKCS#7 with AES as the cipher -- ultimately a hybrid approach.  An
interesting aspect of that implementation is that the use of PKCS#7 as a
message passing format allows for multiple possible underlying ciphers
since the message is wrapped in ASN.1 and is self-descriptive.  We might
have simply chosen to go with that except that there don't seem to be
many good options for implementing this in Python, largely because of
the nightmare that is ASN.1 parsing.

The system we have devised for including encrypted content in our YAML
files involves a YAML tag which specifies the encryption scheme.  So we
can evolve our use to add or remove systems as needed in the future.

So to break this down into a series of actionable questions:

1) Do we want a system to support encrypting longer secrets?  Our PKCS#1
system supports up to 470 bytes.  That should be sufficient for most
passwords and API keys, but unlikely to be sufficient for some
certificate related systems, etc.

2) If so, what system should we use?

   2.1a) GPG?  This has hybrid encryption and transport combined.
   Implementation is likely to be a bit awkward, probably involving
   popen to external processes.

   2.1b) RSA+AES?  This recommendation from the pycryptodome
   documentation illustrates a typical hybrid approach:
   
https://pycryptodome.readthedocs.io/en/latest/src/examples.html#encrypt-data-with-rsa
   The transport protocol would likely just be the concatenation of
   the RSA and AES encrypted data, as it is in that example.  We can
   port that example to use the python-cryptography primatives, or we
   can switch to pycryptodome and use it exactly.

   2.1c) RSA+Fernet?  We can stay closer to the friendly recipes in
   python-cryptography.  While there is no complete hybrid recipe,
   there is a symmetric recipe for "Fernet" which is essentially a
   recipe for AES encryption and transport.  We could encode the
   Fernet key with RSA and concatenate the Fernet token.
   https://github.com/fernet/spec/blob/master/Spec.md

   2.1d) NaCL?  A "sealed box" in libsodium (which underlies PyNaCL)
   would do what we want with a completely different set of
   algorithms.
   https://github.com/pyca/pynacl/issues/189

3) Do we think it is important to hide the length of the secret?  AES
will expose the approximate length of the secret up to the block size
(16 bytes).  This is probably not important for long secrets, but for
short ones, it may at least indicate the order of magnitude of a
password, for instance.  If we want, we can pad the secret further
before encrypting it.

4) If we adopt a system for longer secrets, do we still want to include
PKCS#1 for shorter ones?  The PKCS#1 ciphertext will be shorter than the
same secret would be in a hybrid system.  But either way, we're talking
about a fairly big blob (about 9 lines of base64 for PKCS#1).

5) If we keep PKCS#1, are we okay using SHA-1 or should we see about
using SHA-256?

6) Considering all of that, should we:

  A) Stick with PKCS#1 for now (we can always add something else later)
  B) Keep PKCS#1 and add something else now
  C) Drop PKCS#1 and replace it with something else

Thanks,

Jim


__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Reply via email to