Hi,

In working on the implementation of the encrypted secrets feature of Zuul v3, I have found some things that warrant further discussion. It's important to be deliberate about this and I welcome any feedback.

For reference, here is the relevant portion of the Zuul v3 spec:

http://specs.openstack.org/openstack-infra/infra-specs/specs/zuulv3.html#secrets

And here is an implementation of that:

https://review.openstack.org/#/q/status:open+topic:secrets+project:openstack-infra/zuul

The short version is that we want to allow users to store private keys in the public git repos which Zuul uses to run jobs. To do this, we propose to use asymmetric cryptography (RSA) to encrypt the data. The specification suggests implementing PKCS#1-OAEP, a standard for implementing RSA encryption.

Note that RSA is not able to encrypt a message longer than the key, and PKCS#1 includes some overhead which eats into that. If we use 4096 bit RSA keys in Zuul, we will be able to encrypt 3760 bits (or 470 bytes) of information.

Note also that this figure only holds if we use SHA-1 as the OAEP hash function. It has been suggested that we may want to consider using SHA-256 with PKCS#1. If we do, the larger digest adds overhead and we will be able to encrypt slightly less data (446 bytes with the same key). However, I'm not sure that the Python cryptography library allows this (yet?). Also, see this answer for why it may not be necessary to use SHA-256 (and also, why we may want to anyway):

https://security.stackexchange.com/questions/112029/should-sha-1-be-used-with-rsa-oaep

One thing to note is that the OpenSSL CLI utility uses SHA-1. Right now, I have a utility script which uses that to encrypt secrets, so that it's easy for anyone to encrypt a secret without installing many dependencies. Switching to another hash function would probably mean we wouldn't be able to use that anymore. But that's also true for other systems (see below).

In short, PKCS#1 pros: simple, nicely packaged asymmetric encryption; hides plaintext message length (up to its limit). Cons: limited to 470 bytes (or less).

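To make the size arithmetic concrete, here is a minimal sketch of where the 470-byte figure comes from. It assumes the RSAES-OAEP limit from RFC 8017, where a message may be at most k - 2*hLen - 2 bytes (k is the modulus length in bytes, hLen the hash output length). The helper name oaep_max_message_bytes is just something I made up for illustration; it is not part of Zuul or of any library.

```python
import hashlib


def oaep_max_message_bytes(key_bits: int, hash_name: str = "sha1") -> int:
    """Largest plaintext (in bytes) that fits in one RSAES-OAEP block.

    Hypothetical helper for illustration only; it applies the
    RFC 8017 limit: mLen <= k - 2*hLen - 2.
    """
    k = (key_bits + 7) // 8                       # modulus length in bytes
    h_len = hashlib.new(hash_name).digest_size    # hash output length in bytes
    return k - 2 * h_len - 2


if __name__ == "__main__":
    for name in ("sha1", "sha256"):
        limit = oaep_max_message_bytes(4096, name)
        print(f"4096-bit key, {name}: {limit} bytes ({limit * 8} bits)")
```

With a 4096-bit key this gives 470 bytes for SHA-1 and 446 bytes for SHA-256, so switching hashes costs 24 bytes of capacity.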
Generally, when faced with the prospect of encrypting longer messages, the advice is to adopt a hybrid encryption scheme (as opposed to, say, chaining RSA messages together, or increasing the RSA key size), which uses symmetric encryption with a single-use key for the message and asymmetric encryption to hide the key. If we want Zuul to support the encryption of longer secrets, we may want to adopt the hybrid approach. A common hybrid approach is to encrypt the message with AES, and then encrypt the AES key with RSA.

The hiera-eyaml work which originally inspired some of this is based on PKCS#7 with AES as the cipher -- ultimately a hybrid approach. An interesting aspect of that implementation is that the use of PKCS#7 as a message passing format allows for multiple possible underlying ciphers, since the message is wrapped in ASN.1 and is self-descriptive. We might have simply chosen to go with that, except that there don't seem to be many good options for implementing this in Python, largely because of the nightmare that is ASN.1 parsing.

The system we have devised for including encrypted content in our YAML files involves a YAML tag which specifies the encryption scheme, so we can evolve our use to add or remove systems as needed in the future.

So to break this down into a series of actionable questions:

1) Do we want a system to support encrypting longer secrets? Our PKCS#1 system supports up to 470 bytes. That should be sufficient for most passwords and API keys, but is unlikely to be sufficient for some certificate-related systems, etc.

2) If so, what system should we use?

   2.1a) GPG? This has hybrid encryption and transport combined. Implementation is likely to be a bit awkward, probably involving popen to external processes.

   2.1b) RSA+AES? This recommendation from the pycryptodome documentation illustrates a typical hybrid approach:

   https://pycryptodome.readthedocs.io/en/latest/src/examples.html#encrypt-data-with-rsa

   The transport protocol would likely just be the concatenation of the RSA and AES encrypted data, as it is in that example. We can port that example to use the python-cryptography primitives, or we can switch to pycryptodome and use it exactly.

   2.1c) RSA+Fernet? We can stay closer to the friendly recipes in python-cryptography. While there is no complete hybrid recipe, there is a symmetric recipe for "Fernet" which is essentially a recipe for AES encryption and transport. We could encrypt the Fernet key with RSA and concatenate the Fernet token.

   https://github.com/fernet/spec/blob/master/Spec.md

   2.1d) NaCl? A "sealed box" in libsodium (which underlies PyNaCl) would do what we want with a completely different set of algorithms.

   https://github.com/pyca/pynacl/issues/189

3) Do we think it is important to hide the length of the secret? AES will expose the approximate length of the secret up to the block size (16 bytes). This is probably not important for long secrets, but for short ones, it may at least indicate the order of magnitude of a password, for instance. If we want, we can pad the secret further before encrypting it.

4) If we adopt a system for longer secrets, do we still want to include PKCS#1 for shorter ones? The PKCS#1 ciphertext will be shorter than the same secret would be in a hybrid system. But either way, we're talking about a fairly big blob (about 9 lines of base64 for PKCS#1).

5) If we keep PKCS#1, are we okay using SHA-1 or should we see about using SHA-256?

6) Considering all of that, should we:

   A) Stick with PKCS#1 for now (we can always add something else later)
   B) Keep PKCS#1 and add something else now
   C) Drop PKCS#1 and replace it with something else

Thanks,

Jim

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev