On Monday, 4 May 2026 at 08:17, Eric Biggers <[email protected]> wrote:

> The zero-copy support is one of the riskiest aspects of AF_ALG.  It
> allows userspace to request cryptographic operations directly on
> pagecache pages of files like the 'su' binary.  It also allows userspace
> to concurrently modify the memory which is being operated on, a huge
> recipe for TOCTOU vulnerabilities.
>
> While zero-copy support is more valuable in other areas of the kernel
> like the frequently used networking and file I/O code, it has far less
> value in AF_ALG, which is a niche UAPI.  AF_ALG primarily just exists
> for backwards compatibility with a small set of userspace programs such
> as 'iwd' that haven't yet been fixed to use userspace crypto code.
>
> Originally AF_ALG was intended to be used to access hardware crypto
> accelerators.  However, it isn't an efficient interface for that anyway,
> and it turned out to be rarely used in this way in practice.
>
> Thus, the risks of the zero-copy support in AF_ALG vastly outweigh its
> benefits.  Just remove it.
>
> Note that this isn't a hard break, since the splice syscall is still
> supported.  The data is just now copied instead.  So it still works,
> just a bit slower in some cases.

Syd sandbox uses AF_ALG zero-copy for its Force Sandboxing[1] and Crypt 
Sandboxing[1].
Zero-copy means Syd does not have to copy sandbox process data into its own 
address
space providing safety and security. Switching to read/write rather than pipes 
and
splice breaks a fundamental safety guarantee for the sandbox. Please do not 
break
userspace.

Will sendfile(2) continue to work? How can i test? Please help me.

[1]: https://man.exherbo.org/syd.7.html#Force_Sandboxing
[2]: https://man.exherbo.org/syd.7.html#Crypt_Sandboxing

> Tested with libkcapi/test.sh.  All its test cases still pass.  I also
> verified that this would have prevented the copy.fail exploit as well.
>
> Fixes: 8ff590903d5f ("crypto: algif_skcipher - User-space interface for 
> skcipher operations")
> Fixes: 400c40cf78da ("crypto: algif - add AEAD support")
> Reported-by: Taeyang Lee <[email protected]>
> Reported-by: Feng Ning <[email protected]>
> Cc: [email protected]
> Signed-off-by: Eric Biggers <[email protected]>
> ---
>  Documentation/crypto/userspace-if.rst | 30 ++---------
>  crypto/af_alg.c                       | 73 +++++++++------------------
>  crypto/algif_aead.c                   |  8 +--
>  3 files changed, 32 insertions(+), 79 deletions(-)
>
> diff --git a/Documentation/crypto/userspace-if.rst 
> b/Documentation/crypto/userspace-if.rst
> index 021759198fe7..80eb2819901a 100644
> --- a/Documentation/crypto/userspace-if.rst
> +++ b/Documentation/crypto/userspace-if.rst
> @@ -325,37 +325,13 @@ CRYPTO_USER_API_RNG_CAVP option:
>     but only after the entropy has been set.
>
>  Zero-Copy Interface
>  -------------------
>
> -In addition to the send/write/read/recv system call family, the AF_ALG
> -interface can be accessed with the zero-copy interface of
> -splice/vmsplice. As the name indicates, the kernel tries to avoid a copy
> -operation into kernel space.
> -
> -The zero-copy operation requires data to be aligned at the page
> -boundary. Non-aligned data can be used as well, but may require more
> -operations of the kernel which would defeat the speed gains obtained
> -from the zero-copy interface.
> -
> -The system-inherent limit for the size of one zero-copy operation is 16
> -pages. If more data is to be sent to AF_ALG, user space must slice the
> -input into segments with a maximum size of 16 pages.
> -
> -Zero-copy can be used with the following code example (a complete
> -working example is provided with libkcapi):
> -
> -::
> -
> -    int pipes[2];
> -
> -    pipe(pipes);
> -    /* input data in iov */
> -    vmsplice(pipes[1], iov, iovlen, SPLICE_F_GIFT);
> -    /* opfd is the file descriptor returned from accept() system call */
> -    splice(pipes[0], NULL, opfd, NULL, ret, 0);
> -    read(opfd, out, outlen);
> +AF_ALG used to have zero-copy support, but it was removed due to it being a
> +frequent source of vulnerabilities.  For backwards compatibility the splice
> +system call is still supported, but the data will simply be copied.
>
>
>  Setsockopt Interface
>  --------------------
>
> diff --git a/crypto/af_alg.c b/crypto/af_alg.c
> index 5a00c18eb145..fce0b87c2b65 100644
> --- a/crypto/af_alg.c
> +++ b/crypto/af_alg.c
> @@ -971,11 +971,11 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr 
> *msg, size_t size,
>               struct scatterlist *sg;
>               size_t len = size;
>               ssize_t plen;
>
>               /* use the existing memory in an allocated page */
> -             if (ctx->merge && !(msg->msg_flags & MSG_SPLICE_PAGES)) {
> +             if (ctx->merge) {
>                       sgl = list_entry(ctx->tsgl_list.prev,
>                                        struct af_alg_tsgl, list);
>                       sg = sgl->sg + sgl->cur - 1;
>                       len = min_t(size_t, len,
>                                   PAGE_SIZE - sg->offset - sg->length);
> @@ -1015,64 +1015,41 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr 
> *msg, size_t size,
>                                list);
>               sg = sgl->sg;
>               if (sgl->cur)
>                       sg_unmark_end(sg + sgl->cur - 1);
>
> -             if (msg->msg_flags & MSG_SPLICE_PAGES) {
> -                     struct sg_table sgtable = {
> -                             .sgl            = sg,
> -                             .nents          = sgl->cur,
> -                             .orig_nents     = sgl->cur,
> -                     };
> -
> -                     plen = extract_iter_to_sg(&msg->msg_iter, len, &sgtable,
> -                                               MAX_SGL_ENTS - sgl->cur, 0);
> -                     if (plen < 0) {
> -                             err = plen;
> +             do {
> +                     struct page *pg;
> +                     unsigned int i = sgl->cur;
> +
> +                     plen = min_t(size_t, len, PAGE_SIZE);
> +
> +                     pg = alloc_page(GFP_KERNEL);
> +                     if (!pg) {
> +                             err = -ENOMEM;
>                               goto unlock;
>                       }
>
> -                     for (; sgl->cur < sgtable.nents; sgl->cur++)
> -                             get_page(sg_page(&sg[sgl->cur]));
> +                     sg_assign_page(sg + i, pg);
> +
> +                     err = memcpy_from_msg(page_address(sg_page(sg + i)),
> +                                           msg, plen);
> +                     if (err) {
> +                             __free_page(sg_page(sg + i));
> +                             sg_assign_page(sg + i, NULL);
> +                             goto unlock;
> +                     }
> +
> +                     sg[i].length = plen;
>                       len -= plen;
>                       ctx->used += plen;
>                       copied += plen;
>                       size -= plen;
> -             } else {
> -                     do {
> -                             struct page *pg;
> -                             unsigned int i = sgl->cur;
> -
> -                             plen = min_t(size_t, len, PAGE_SIZE);
> -
> -                             pg = alloc_page(GFP_KERNEL);
> -                             if (!pg) {
> -                                     err = -ENOMEM;
> -                                     goto unlock;
> -                             }
> -
> -                             sg_assign_page(sg + i, pg);
> -
> -                             err = memcpy_from_msg(
> -                                     page_address(sg_page(sg + i)),
> -                                     msg, plen);
> -                             if (err) {
> -                                     __free_page(sg_page(sg + i));
> -                                     sg_assign_page(sg + i, NULL);
> -                                     goto unlock;
> -                             }
> -
> -                             sg[i].length = plen;
> -                             len -= plen;
> -                             ctx->used += plen;
> -                             copied += plen;
> -                             size -= plen;
> -                             sgl->cur++;
> -                     } while (len && sgl->cur < MAX_SGL_ENTS);
> -
> -                     ctx->merge = plen & (PAGE_SIZE - 1);
> -             }
> +                     sgl->cur++;
> +             } while (len && sgl->cur < MAX_SGL_ENTS);
> +
> +             ctx->merge = plen & (PAGE_SIZE - 1);
>
>               if (!size)
>                       sg_mark_end(sg + sgl->cur - 1);
>       }
>
> diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
> index cb651ab58d62..c6c2ce21895d 100644
> --- a/crypto/algif_aead.c
> +++ b/crypto/algif_aead.c
> @@ -7,14 +7,14 @@
>   * This file provides the user-space API for AEAD ciphers.
>   *
>   * The following concept of the memory management is used:
>   *
>   * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is
> - * filled by user space with the data submitted via sendmsg (maybe with
> - * MSG_SPLICE_PAGES).  Filling up the TX SGL does not cause a crypto 
> operation
> - * -- the data will only be tracked by the kernel. Upon receipt of one 
> recvmsg
> - * call, the caller must provide a buffer which is tracked with the RX SGL.
> + * filled by user space with the data submitted via sendmsg.  Filling up the 
> TX
> + * SGL does not cause a crypto operation -- the data will only be tracked by 
> the
> + * kernel. Upon receipt of one recvmsg call, the caller must provide a buffer
> + * which is tracked with the RX SGL.
>   *
>   * During the processing of the recvmsg operation, the cipher request is
>   * allocated and prepared. As part of the recvmsg operation, the processed
>   * TX buffers are extracted from the TX SGL into a separate SGL.
>   *
>
> base-commit: 6d35786de28116ecf78797a62b84e6bf3c45aa5a
> --
> 2.54.0
>
>
>

Attachment: publickey - [email protected] - 0x55838BF3.asc
Description: application/pgp-keys

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to