From: Arjun Roy
TCP zerocopy receive is used by high performance network applications
to further scale. For RX zerocopy, the memory containing the network
data filled by the network driver is directly mapped into the address
space of high performance applications. To keep the TLB cost low,
these
From: Arjun Roy
TCP zerocopy receive is used by high performance network applications
to further scale. For RX zerocopy, the memory containing the network
data filled by the network driver is directly mapped into the address
space of high performance applications. To keep the TLB cost low,
these
From: Arjun Roy
getsockopt(TCP_ZEROCOPY_RECEIVE) has a bug where we read a
user-provided "len" field of type signed int, and then compare the
value to the result of an "offsetofend" operation, which is unsigned.
Negative values provided by the user will be promoted to la
From: Arjun Roy
Explicitly define reserved field and require it and any subsequent
fields to be zero-valued for now. Additionally, limit the valid CMSG
flags that tcp_zerocopy_receive accepts.
Fixes: 7eeba1706eba ("tcp: Add receive timestamp support for receive zerocopy.")
Signed-off
From: Arjun Roy
Explicitly define reserved field and require it to be 0-valued.
Fixes: 7eeba1706eba ("tcp: Add receive timestamp support for receive zerocopy.")
Signed-off-by: Arjun Roy
Signed-off-by: Eric Dumazet
Signed-off-by: Soheil Hassas Yeganeh
Suggested-by: David Ahern
Su
From: Arjun Roy
Explicitly define reserved field and require it to be 0-valued.
Fixes: 7eeba1706eba ("tcp: Add receive timestamp support for receive zerocopy.")
Signed-off-by: Arjun Roy
Signed-off-by: Eric Dumazet
Signed-off-by: Soheil Hassas Yeganeh
Suggested-by: David Ahern
Su
From: Arjun Roy
Explicitly define reserved field and require it to be 0-valued.
Fixes: 7eeba1706eba ("tcp: Add receive timestamp support for receive zerocopy.")
Signed-off-by: Arjun Roy
Signed-off-by: Eric Dumazet
Signed-off-by: Soheil Hassas Yeganeh
Suggested-by: David Ahern
Su
From: Arjun Roy
At present, tcp_recvmsg() uses flags to track if any CMSGs are pending
and what those CMSGs are. These flags are currently magic numbers,
used only within tcp_recvmsg().
To prepare for receive timestamp support in tcp receive zerocopy,
gently refactor these magic numbers into
From: Arjun Roy
tcp_recvmsg() uses the CMSG mechanism to receive control information
like packet receive timestamps. This patch adds CMSG fields to
struct tcp_zerocopy_receive, and provides receive timestamps
if available to the user.
Signed-off-by: Arjun Roy
---
include/uapi/linux/tcp.h
From: Arjun Roy
Provide CMSG and receive timestamp support to TCP
receive zerocopy. Patch 1 refactors CMSG pending state for
tcp_recvmsg() to avoid the use of magic numbers; patch 2 implements
receive timestamp via CMSG support for receive zerocopy, and uses the
constants added in patch 1.
v2
From: Arjun Roy
At present, tcp_recvmsg() uses flags to track if any CMSGs are pending
and what those CMSGs are. These flags are currently magic numbers,
used only within tcp_recvmsg().
To prepare for receive timestamp support in tcp receive zerocopy,
gently refactor these magic numbers into
From: Arjun Roy
This patch series provides CMSG and receive timestamp support to TCP
receive zerocopy. Patch 1 refactors CMSG pending state for
tcp_recvmsg() to avoid the use of magic numbers; patch 2 implements
receive timestamp via CMSG support for receive zerocopy, and uses the
constants
From: Arjun Roy
tcp_recvmsg() uses the CMSG mechanism to receive control information
like packet receive timestamps. This patch adds CMSG fields to
struct tcp_zerocopy_receive, and provides receive timestamps
if available to the user.
Signed-off-by: Arjun Roy
Signed-off-by: Eric Dumazet
From: Arjun Roy
A prior patch increased the size of struct tcp_zerocopy_receive
but did not update do_tcp_getsockopt() handling to properly account
for this.
This patch simply reintroduces content erroneously cut from the
referenced prior patch that handles the new struct size.
Fixes
From: Arjun Roy
Refactor skb frag fast-forwarding for tcp receive zerocopy. This is
part of a patch set that introduces short-circuited hybrid copies
for small receive operations, which results in roughly 33% fewer
syscalls for small RPC scenarios.
skb_advance_to_frag(), given a skb and an
From: Arjun Roy
Refactor frag-is-remappable test for tcp receive zerocopy. This is
part of a patch set that introduces short-circuited hybrid copies
for small receive operations, which results in roughly 33% fewer
syscalls for small RPC scenarios.
Signed-off-by: Arjun Roy
Signed-off-by: Eric
From: Arjun Roy
Zapping pages is required only if we are calling vm_insert_page into a
region where pages had previously been mapped. Receive zerocopy allows
reusing such regions, and hitherto called zap_page_range() before
calling vm_insert_page() in that range.
zap_page_range() can also be
From: Arjun Roy
Refactor tcp_recvmsg() by splitting it into locked and unlocked
portions. Callers already holding the socket lock and not using
ERRQUEUE/cmsg/busy polling can simply call tcp_recvmsg_locked().
This is in preparation for a short-circuit copy performed by
TCP receive zerocopy for
From: Arjun Roy
Sometimes, we may call tcp receive zerocopy when inq is 0,
or inq < PAGE_SIZE, or inq is generally small enough that
it is cheaper to copy rather than remap pages.
In these cases, we may want to either return early (inq=0) or
attempt to use the provided copy buffer to sim
From: Arjun Roy
Set zerocopy hint, event when falling back to copy, so that the
pending data can be efficiently received using zerocopy when
possible.
Signed-off-by: Arjun Roy
Signed-off-by: Eric Dumazet
Signed-off-by: Soheil Hassas Yeganeh
---
net/ipv4/tcp.c | 45
From: Arjun Roy
Sometimes, we may call tcp receive zerocopy when inq is 0,
or inq < PAGE_SIZE, in which case we cannot remap pages. In this case,
simply return the appropriate hint for regular copying without taking
mmap_sem.
Signed-off-by: Arjun Roy
Signed-off-by: Eric Dumazet
Signed-off
From: Arjun Roy
When TCP receive zerocopy does not successfully map the entire
requested space, it outputs a 'hint' that the caller should recvmsg().
Augment zerocopy to accept a user buffer that it tries to copy this
hint into - if it is possible to copy the entire hint, it will d
From: Arjun Roy
This patchset contains several optimizations for TCP Recv. Zerocopy.
v3: Fixes 32-bit compilation, stylistic issues and re-adds signoffs.
Summarized:
1. It is possible that a read payload is not exactly page aligned -
that there may exist "straggler" bytes that we
From: Arjun Roy
Sometimes, we may call tcp receive zerocopy when inq is 0,
or inq < PAGE_SIZE, or inq is generally small enough that
it is cheaper to copy rather than remap pages.
In these cases, we may want to either return early (inq=0) or
attempt to use the provided copy buffer to sim
From: Arjun Roy
Zapping pages is required only if we are calling vm_insert_page into a
region where pages had previously been mapped. Receive zerocopy allows
reusing such regions, and hitherto called zap_page_range() before
calling vm_insert_page() in that range.
zap_page_range() can also be
From: Arjun Roy
Set zerocopy hint, event when falling back to copy, so that the
pending data can be efficiently received using zerocopy when
possible.
---
net/ipv4/tcp.c | 45 +
1 file changed, 45 insertions(+)
diff --git a/net/ipv4/tcp.c b/net/ipv4
From: Arjun Roy
Sometimes, we may call tcp receive zerocopy when inq is 0,
or inq < PAGE_SIZE, in which case we cannot remap pages. In this case,
simply return the appropriate hint for regular copying without taking
mmap_sem.
---
net/ipv4/tcp.c | 8
1 file changed, 8 inserti
From: Arjun Roy
Refactor frag-is-remappable test for tcp receive zerocopy. This is
part of a patch set that introduces short-circuited hybrid copies
for small receive operations, which results in roughly 33% fewer
syscalls for small RPC scenarios.
---
net/ipv4/tcp.c | 34
From: Arjun Roy
Refactor tcp_recvmsg() by splitting it into locked and unlocked
portions. Callers already holding the socket lock and not using
ERRQUEUE/cmsg/busy polling can simply call tcp_recvmsg_locked().
This is in preparation for a short-circuit copy performed by
TCP receive zerocopy for
From: Arjun Roy
This patchset contains several optimizations for TCP Recv. Zerocopy.
Note this is v2 of the patchset, fixing two 32-bit compilation errors
and a stylistic error.
Summarized:
1. It is possible that a read payload is not exactly page aligned -
that there may exist "stra
From: Arjun Roy
When TCP receive zerocopy does not successfully map the entire
requested space, it outputs a 'hint' that the caller should recvmsg().
Augment zerocopy to accept a user buffer that it tries to copy this
hint into - if it is possible to copy the entire hint, it will d
From: Arjun Roy
Refactor skb frag fast-forwarding for tcp receive zerocopy. This is
part of a patch set that introduces short-circuited hybrid copies
for small receive operations, which results in roughly 33% fewer
syscalls for small RPC scenarios.
skb_advance_to_frag(), given a skb and an
From: Arjun Roy
Zapping pages is required only if we are calling vm_insert_page into a
region where pages had previously been mapped. Receive zerocopy allows
reusing such regions, and hitherto called zap_page_range() before
calling vm_insert_page() in that range.
zap_page_range() can also be
From: Arjun Roy
This patchset contains several optimizations for TCP Recv. Zerocopy.
Summarized:
1. It is possible that a read payload is not exactly page aligned -
that there may exist "straggler" bytes that we cannot map into the
caller's address space cleanly. For this, we a
From: Arjun Roy
Sometimes, we may call tcp receive zerocopy when inq is 0,
or inq < PAGE_SIZE, or inq is generally small enough that
it is cheaper to copy rather than remap pages.
In these cases, we may want to either return early (inq=0) or
attempt to use the provided copy buffer to sim
From: Arjun Roy
Sometimes, we may call tcp receive zerocopy when inq is 0,
or inq < PAGE_SIZE, in which case we cannot remap pages. In this case,
simply return the appropriate hint for regular copying without taking
mmap_sem.
Signed-off-by: Arjun Roy
Signed-off-by: Eric Dumazet
Signed-off
From: Arjun Roy
Set zerocopy hint, event when falling back to copy, so that the
pending data can be efficiently received using zerocopy when
possible.
Signed-off-by: Arjun Roy
Signed-off-by: Eric Dumazet
Signed-off-by: Soheil Hassas Yeganeh
---
net/ipv4/tcp.c | 45
From: Arjun Roy
Refactor frag-is-remappable test for tcp receive zerocopy. This is
part of a patch set that introduces short-circuited hybrid copies
for small receive operations, which results in roughly 33% fewer
syscalls for small RPC scenarios.
Signed-off-by: Arjun Roy
Signed-off-by: Eric
From: Arjun Roy
When TCP receive zerocopy does not successfully map the entire
requested space, it outputs a 'hint' that the caller should recvmsg().
Augment zerocopy to accept a user buffer that it tries to copy this
hint into - if it is possible to copy the entire hint, it will d
From: Arjun Roy
Refactor skb frag fast-forwarding for tcp receive zerocopy. This is
part of a patch set that introduces short-circuited hybrid copies
for small receive operations, which results in roughly 33% fewer
syscalls for small RPC scenarios.
skb_advance_to_frag(), given a skb and an
From: Arjun Roy
Refactor tcp_recvmsg() by splitting it into locked and unlocked
portions. Callers already holding the socket lock and not using
ERRQUEUE/cmsg/busy polling can simply call tcp_recvmsg_locked().
This is in preparation for a short-circuit copy performed by
TCP receive zerocopy for
From: Arjun Roy
With SO_RCVLOWAT, under memory pressure,
it is possible to enter a state where:
1. We have not received enough bytes to satisfy SO_RCVLOWAT.
2. We have not entered buffer pressure (see tcp_rmem_pressure()).
3. But, we do not have enough buffer space to accept more packets.
In
From: Arjun Roy
With SO_RCVLOWAT, under memory pressure,
it is possible to enter a state where:
1. We have not received enough bytes to satisfy SO_RCVLOWAT.
2. We have not entered buffer pressure (see tcp_rmem_pressure()).
3. But, we do not have enough buffer space to accept more packets.
In
43 matches
Mail list logo