Although the epoll_wait timeout defaults to 10ms, the gap between calls that actually do I/O can be much longer. After epoll() returns, a callback is invoked on a VConnection for each data transfer, and therefore I/O will not be done again until all of the pending callbacks have been handled. This can take a non-trivial amount of time. You may want to look at the event loop stats to see if that is happening.
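For example (hedged: the exact metric names depend on your ATS version, see the stats page linked below), a prefix match on the event loop counters should show how often the loop runs and how long it takes:

$ traffic_ctl metric match proxy.process.eventloop

If the per-loop times reported there are regularly well above the 10ms poll timeout, the delay between writes is likely coming from the event loop doing other work rather than from epoll_wait itself.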
https://docs.trafficserver.apache.org/en/8.0.x/admin-guide/monitoring/statistics/core/misc.en.html#stat-proxy-process-eventloop-count

On Thu, Sep 12, 2019 at 10:30 PM Leif Hedstrom <zw...@apache.org> wrote:
>
>
> > On Sep 12, 2019, at 4:57 PM, Bryan Call <bc...@apache.org> wrote:
> >
> > I would double check your buffer settings just in case:
> > $ sysctl -a | grep tcp | grep mem
> > net.ipv4.tcp_rmem = 4096 87380 6291456
> > net.ipv4.tcp_wmem = 4096 16384 4194304
> >
> > $ traffic_ctl config match buffer
> > proxy.config.net.sock_send_buffer_size_in: 2097152
> > proxy.config.net.sock_recv_buffer_size_in: 0
> > proxy.config.net.sock_send_buffer_size_out: 0
> > proxy.config.net.sock_recv_buffer_size_out: 2097152
> >
> The other thing to watch out for is that if you increase the ATS buffers like the above, every connection will always use that much memory. We had a situation where that caused things to consume all of the kernel's allowed memory for sockets, and then things got sour really quickly (that max memory is a percentage of all available memory, another sysctl). You really do not want that to happen :).
>
> We've since removed the ATS settings (setting them all to 0), allowing the two sysctls above to take effect and autotune the buffer sizes. However, we increased those as well accordingly; I believe upstream has also changed the defaults, but something like this might be acceptable:
>
> 32768 131072 8388608
>
> And yes, as Bryan points out, for some use cases that might make things slightly slower (due to BDP), but it's a tradeoff. We reclaimed significant amounts of memory by letting the autotuning try its best (with the modified min / initial / max settings).
>
> Cheers,
>
> — leif
>
> > You can take a look at the buffer and window sizes in the kernel, but it is kinda hard to match that up with what ATS is doing. You might be able to take the ATS logs, if you have all the milestone information, and correlate them with a connection or the strace logs.
> > $ ss -tnei
> >
> > Another possibility is to have ATS get the tcpinfo information every time it does a write, to see if there has been any delay in the socket connection.
> >
> > -Bryan
> >
> >
> >> On Sep 12, 2019, at 11:30 AM, Chou, Peter <pbc...@labs.att.com> wrote:
> >>
> >> Bryan,
> >>
> >> Thanks for the response. Good reminder about the transmission buffer limiting the TCP transmission window, which needs to be sized for the bandwidth-delay product. I am an L2/L3 guy, so not a TCP expert :-). However, I don't know whether the default (I believe 1MB in our RHEL release) causes any problems in our situation.
> >>
> >> We were more focused on whether a smaller TCP transmission buffer size required more frequent servicing by ATS, and whether ATS had problems keeping the buffer from emptying while data still needed to be sent. We did some straces of ATS behavior, and we found that sometimes the delay between successive writev() calls following an EAGAIN event was fairly long (long enough that it could jeopardize time constraints for delivery of streaming data). On my development box, I saw the delay between EAGAIN and the retry vary between 8ms and 1300ms. I believe Jeremy saw a situation in the lab where it was 3 seconds!
> >>
> >> My setup was just a single VM with ATS 7.1.4 and curl (rate-limit option set to 1MB) fetching a previously cached 10MB data file.
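(As an aside, a rough sketch of how a test like the one described above could be reproduced and traced; the URL, rate, and process name here are placeholders, not taken from the original setup:

$ # rate-limited client fetching an already-cached object
$ curl --limit-rate 1M -o /dev/null http://cache.example.com/10mb.bin

$ # time-stamped trace of the write path, to measure the gap after EAGAIN
$ strace -f -tt -e trace=writev,epoll_wait -p $(pidof traffic_server)

The interesting number in the strace output is the wall-clock gap between a writev() returning EAGAIN and the next writev() on the same fd.)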
> >> I took a look at the code, and it seems that on Linux ATS should be using the epoll_wait mechanism (10 ms time-out), which is driven by a polling continuation. I did not see anything there that should cause delays in retries of 1+ seconds. Any thoughts?
> >>
> >> Thanks,
> >> Peter
> >>
> >> -----Original Message-----
> >> From: Bryan Call <bc...@apache.org>
> >> Sent: Thursday, September 12, 2019 9:24 AM
> >> To: dev <dev@trafficserver.apache.org>
> >> Subject: Re: TCP socket buffer size.
> >>
> >> I have seen issues where you can't reach the max throughput of the network connection without increasing the TCP buffers, because it affects the max TCP window size (bandwidth-delay product). Here is a calculator I have used before to figure out what your buffer size should be: https://www.switch.ch/network/tools/tcp_throughput/
> >>
> >> Theoretically there should be some latency difference between having a small buffer size vs a larger one (up to some limit), but my guess is it would be hard to measure because it would be so small.
> >>
> >> -Bryan
> >>
> >>
> >>> On Sep 11, 2019, at 11:50 AM, Chou, Peter <pbc...@labs.att.com> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> Sometimes we see lots of EAGAIN result codes from ATS trying to write to the TCP socket file descriptor. I presume this is typically due to congestion or a rate mismatch between the client and ATS. Is there any benefit to increasing the TCP socket buffer size, which would reduce the number of these write operations? Specifically, should we expect any kind of latency difference, as there is some concern about how long it takes ATS to re-schedule that particular VC for another write attempt?
> >>>
> >>> Thanks,
> >>> Peter
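As a rough illustration of the bandwidth-delay product sizing discussed above (the thread links to a calculator; the numbers here are only an example): for a 100 Mbit/s path with a 70 ms RTT, the window, and therefore the send buffer, needed to keep the pipe full is roughly

$ echo $(( 100 * 1000 * 1000 / 8 * 70 / 1000 ))
875000

i.e. about 875 KB, which is why a too-small tcp_wmem maximum (or a hard-coded proxy.config.net.sock_send_buffer_size_* value) can cap throughput on high-latency paths even though it saves memory per connection.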