On 5/22/25 23:00, Willy Tarreau wrote:
> On Thu, May 22, 2025 at 10:51:14PM -0400, Demi Marie Obenour wrote:
>> On 5/22/25 22:35, Willy Tarreau wrote:
>>> On Thu, May 22, 2025 at 09:49:20PM -0400, Demi Marie Obenour wrote:
>>>>> What do you think you would need ? For randoms, what do you qualify
>>>>> as strong random numbers for example ? Is it based on the period
>>>>> length ? On the way it's seeded ? Right now our PRNG has a 2^128 bits
>>>>> period and is seeded from: time, pid, ppid, urandom, RAND_bytes() when
>>>>> openssl is available, random(), ASLR, execution time, and host name,
>>>>> i.e. about everything we can locally collect that can differ between
>>>>> boots and machines.
>>>>
>>>> "Strong" means "suitable for cryptographic purposes".  The simplest
>>>> approach would be to get random numbers directly from OpenSSL, and
>>>> to use XChaCha20-Poly1305 with a random nonce for the encryption.
>>>> The XChaCha20-Poly1305 nonce is long enough that it can be generated
>>>> at random.
>>>
>>> Then we'd need a separate strong_random() function that is only
>>> available when openssl is enabled.
>>
>> On Linux, another option is to just call getrandom() directly, which
>> works on any recent kernel.
> 
> ... and blocks on some of them if lacking entropy at boot. It's only
> very recently that it finally adopted a timeout (which is still quite
> long by the way). We've had this problem already on some products. I'd
> rather seed a secure prng and only rely on it later, this also has the
> advantage of being portable.

That *should* be easy to mitigate by calling getrandom() during HAProxy
startup.  Once it stops blocking it should never block again.  Also,
there is RDRAND on many x86 CPUs.
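
A minimal sketch of that startup call, assuming Linux with glibc 2.25 or
later. The helper names wait_for_kernel_rng() and fill_random() are made up
for illustration and are not existing HAProxy functions:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/* Hypothetical helper: block until the kernel CSPRNG is initialized.
 * Meant to be called once during startup, before serving traffic.
 * Returns 0 on success, -1 on error (errno is set by getrandom()). */
static int wait_for_kernel_rng(void)
{
	unsigned char probe;

	for (;;) {
		/* flags == 0: blocks only while the pool is not yet
		 * initialized, which can happen early during boot. */
		ssize_t n = getrandom(&probe, sizeof(probe), 0);

		if (n == (ssize_t)sizeof(probe))
			return 0;
		if (n < 0 && errno != EINTR)
			return -1;
	}
}

/* Hypothetical helper: fill buf with len random bytes, retrying on
 * short reads and EINTR.  Returns 0 on success, -1 on error. */
static int fill_random(void *buf, size_t len)
{
	unsigned char *p = buf;

	while (len > 0) {
		ssize_t n = getrandom(p, len, 0);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```

Passing GRND_NONBLOCK instead of 0 would make getrandom() fail with EAGAIN
rather than block, which could be used to log a warning before waiting.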

>>>>> And for base64, if the output of the HMAC is of
>>>>> fixed size, base64 will be as well, so the part of the string used
>>>>> to construct it will be made of only table lookups. If your concern
>>>>> is that this property is not guaranteed over time, I can understand,
>>>>> but then we could simply add a comment on top of the function to
>>>>> mention that the processing time per input byte must remain constant.
>>>>
>>>> Table lookups are vulnerable to timing attacks, as shown by
>>>> Daniel J. Bernstein in 2005 [1].  libsodium has a constant-time
>>>> base64 decoder under a permissive license.
>>>>
>>>> [1]: https://cr.yp.to/antiforgery/cachetiming-20050414.pdf
>>>
>>> Thanks for the link. With that said, the attack described above only
>>> works because the tables do not fit in a cache line. For base64 the
>>> tables are a single cache line (64B).
>>
>> Unfortunately, this isn't enough: it turns out that access latency
>> can be different even within a cache line.
> 
> It will depend on architectures but sometimes that's true.
> 
>>> But we could pretty well replace our implementation with a constant
>>> time one if needed. From memory we have two implementations, the
>>> normal one and a URL-safe one. But do not hesitate to have a look at
>>> its replacement. If you need to import a file from libsodium, please
>>> place it in src/ for .c, or in include/import/ for .h, and try to
>>> change the least possible files there so that it's possible to update
>>> from time to time (like we do for xxhash, slz, trees etc). Otherwise
>>> if it's just a matter of replacing a function or two, it's OK to
>>> just copy them into base64.c, but then please mention in the comment
>>> on top of the function where it comes from (both to help check for
>>> updates and for crediting the original author).
>>
>> That should be doable, though I don't have any immediate plans to do
>> this in the near term.
> 
> OK!
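
For reference, the table-free approach can be sketched as below. This is
not libsodium's code, only an illustration of the technique for the
standard alphabet; the function names are made up, padding ('=') is not
handled, and a compiler is still free to turn the arithmetic back into
branches, so the generated code would need to be checked:

```c
/* Returns 0xFF if lo <= c <= hi, else 0x00, with no branch or lookup.
 * All arguments must be in 0..255.  If c is out of range, one of the
 * two differences wraps around and sets the bits above bit 7. */
static unsigned int ct_in_range(unsigned int c, unsigned int lo,
                                unsigned int hi)
{
	return ((((c - lo) | (hi - c)) >> 8) & 0xFF) ^ 0xFF;
}

/* Decode one base64 character to its 6-bit value.
 * Returns 0xFF for any character outside the alphabet. */
static unsigned int ct_b64_char_to_val(unsigned char ch)
{
	unsigned int c = ch;
	unsigned int upper = ct_in_range(c, 'A', 'Z');
	unsigned int lower = ct_in_range(c, 'a', 'z');
	unsigned int digit = ct_in_range(c, '0', '9');
	unsigned int plus  = ct_in_range(c, '+', '+');
	unsigned int slash = ct_in_range(c, '/', '/');
	unsigned int valid = upper | lower | digit | plus | slash;
	unsigned int v;

	/* Every candidate is computed; the masks select exactly one. */
	v = (upper & (c - 'A'))
	  | (lower & (c - 'a' + 26))
	  | (digit & (c - '0' + 52))
	  | (plus  & 62)
	  | (slash & 63);
	return v | (valid ^ 0xFF);
}

/* Encode one 6-bit value (0..63) to its base64 character. */
static unsigned int ct_b64_val_to_char(unsigned int v)
{
	return (ct_in_range(v, 0, 25)  & (v + 'A'))
	     | (ct_in_range(v, 26, 51) & (v - 26 + 'a'))
	     | (ct_in_range(v, 52, 61) & (v - 52 + '0'))
	     | (ct_in_range(v, 62, 62) & '+')
	     | (ct_in_range(v, 63, 63) & '/');
}
```

The URL-safe variant would only differ in the last two characters
('-' and '_' instead of '+' and '/').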

One change that I would also like to see (but don’t have any immediate
plans to implement) is a “just use it” high-level encryption/decryption
function.  I'm thinking something like this:

root-secret /path/to/secret/key/file [auto-populate=true]
  Sets the "root" secret used for high-level encryption and decryption
  operations.  The file must be at least 32 bytes long and its contents
  must actually be secret.  The contents of the file are read before
  HAProxy chroots, so the path need not (and should not) be inside the
  chroot.  The file must be owned by the user HAProxy is launched as or
  by root, and must not be readable by any other user.

  HAProxy hashes the contents of the file before using it, so the format
  of the file doesn't matter.  The only thing that matters is the amount
  of entropy (randomness) in the file.  If you are managing HAProxy with
  a job management system with support for providing secrets, using one
  of these secrets for the contents of the file is a good idea.  For
  instance, both systemd credentials and Kubernetes secrets are suitable.

  If you don't use this directive, HAProxy will generate a good quality
  secret every time it starts up.  If you use this directive with the
  auto-populate option, and the file doesn't exist or is empty, HAProxy
  will save the secret into this file during startup.  This is a good
  default for distribution-provided configurations.

  If you use a cluster of HAProxy instances, all of them should use the
  same root secret to ensure that data (such as session cookies) encrypted
  by one instance can be decrypted by another.  If possible, you should
  use the secret-management facilities provided by your cluster manager to
  provide this secret.

  If you use this directive multiple times, HAProxy will use the last
  value for encryption, but will be able to decrypt data encrypted with
  any of the provided keys.  This allows for seamless key rotation without
  downtime or data loss.  You can use the control socket to provide new
  keys without a reload.

  Keys can be used by both converters in the configuration file and by Lua
  scripts.  Data encrypted via the converters can be decrypted in Lua
  and vice versa.
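
The ownership, permission and size rules described for the secret file
could be checked roughly as follows. This is only a sketch of the proposed
behavior: check_root_secret_file() and ROOT_SECRET_MIN_LEN are hypothetical
names, and nothing like this exists in HAProxy today:

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define ROOT_SECRET_MIN_LEN 32	/* "at least 32 bytes long" */

/* Hypothetical check, run before chroot and before dropping privileges.
 * Returns 0 if the file is acceptable, -1 otherwise. */
static int check_root_secret_file(const char *path)
{
	struct stat st;
	int ret = -1;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	/* fstat() on the open descriptor avoids a race between the
	 * check and a later read of the same path. */
	if (fstat(fd, &st) == 0 &&
	    S_ISREG(st.st_mode) &&
	    /* owned by root or by the user HAProxy was launched as */
	    (st.st_uid == 0 || st.st_uid == geteuid()) &&
	    /* not readable by group or others */
	    (st.st_mode & (S_IRGRP | S_IROTH)) == 0 &&
	    st.st_size >= ROOT_SECRET_MIN_LEN)
		ret = 0;

	close(fd);
	return ret;
}
```

A real implementation would read the secret from the same descriptor
instead of reopening the path after the check.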

min-encryption-version version
  The minimum version of cryptography HAProxy should use.  Currently, the
  default and only supported version is 1.  This serves as a safeguard
  in case there is a flaw in HAProxy's encryption implementation, to allow
  invalidation of data encrypted with old versions.

haproxy_encrypt(domain,associated_data)
  Encrypts the raw byte input using the root secret, the given domain-separation
  string, and the given associated data.  Returns the encrypted data.

  This performs authenticated encryption with associated data (AEAD).
  Data can only be decrypted by haproxy_decrypt() if the domain-separation
  string and associated data are correct, the same root key is used,
  and the encrypted data has not been tampered with by someone without
  access to the root secret or the key derived from it.  The
  domain-separation string must be a constant and is used for key derivation
  at startup.  The associated data can either be a variable name (in which
  case the contents of the variable are used) or a quoted string.

haproxy_decrypt(domain,associated_data,success_var)
  Decrypts the raw byte input using the root secret, the given domain-separation
  string, and the given associated data.  On success, sets success_var to 1
  and returns the decrypted data.  On failure, sets success_var to 0 and
  returns the empty string.  The domain-separation string must be a constant
  and is used for key derivation at startup.  The associated data can either
  be a variable name (in which case the contents of the variable are used) or
  a quoted string.

>>>> Is loading a Lua module written in C an option?  Obviously that can't
>>>> be done at runtime, but would it make sense to do that at startup?
>>>
>>> I know that some do it to extend haproxy, but your C code needs to be
>>> careful not to make blocking calls nor to take too much time. That's
>>> the problem with calling external code, you need to be certain it was
>>> designed with extremely low latency in mind (and the reason why some
>>> existing Lua libs are causing problems). I don't know how that fits
>>> with lua-load-per-thread by the way.
>>
>> Is verifying an asymmetric signature too slow?
> 
> Normally not :-)

That's good :-)

>> lua-load-per-thread is
>> the only approach that makes sense here, as you certainly don't want to
>> be doing asymmetric cryptography while holding a global lock.
> 
> I agree! We'd even like to deprecate lua-load in favor of an explicit
> keyword that makes users conscious of the performance impact of the lock.

It might be useful to provide some form of sharded counter for rate limiting
and other purposes.
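
A sharded counter in this sense could look roughly like the sketch below,
using C11 atomics. The type and function names, the shard count and the
64-byte cache line size are all assumptions made for illustration; this is
not an existing HAProxy API:

```c
#include <stdatomic.h>
#include <stddef.h>

#define COUNTER_SHARDS 64	/* assumption: >= number of threads */
#define CACHE_LINE     64	/* assumption: 64-byte cache lines */

/* One shard per cache line so that threads updating different shards
 * do not bounce the same line between cores. */
struct counter_shard {
	_Alignas(CACHE_LINE) atomic_ulong value;
};

struct sharded_counter {
	struct counter_shard shards[COUNTER_SHARDS];
};

/* Add n to the shard selected by the caller's thread index. */
static void sharded_counter_add(struct sharded_counter *c,
                                unsigned int thread_idx, unsigned long n)
{
	atomic_fetch_add_explicit(&c->shards[thread_idx % COUNTER_SHARDS].value,
	                          n, memory_order_relaxed);
}

/* Sum all shards.  The result is a snapshot that may miss updates made
 * while the loop runs, which is usually acceptable for rate limiting. */
static unsigned long sharded_counter_read(struct sharded_counter *c)
{
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < COUNTER_SHARDS; i++)
		sum += atomic_load_explicit(&c->shards[i].value,
		                            memory_order_relaxed);
	return sum;
}
```

Writes stay cheap because each thread only touches its own shard, while
reads pay the cost of visiting every shard.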

>>> At the very least it can be an option to easily experiment with extra
>>> code without having to patch haproxy. Another option for testing is to
>>> rely on LD_PRELOAD, but then you need to be super careful to respect
>>> the exact internal API and ABI (any build option counts).
>>
>> This would be purely a Lua extension, not reliant on any of HAProxy's
>> header files.
> 
> In this case that's totally fine.

Someone actually implemented Rust bindings for the Lua API at
https://github.com/khvzak/haproxy-api-rs, which is a LOT more
complex than anything I would come up with :-).

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
