On Tue, Apr 14, 2015 at 01:20:58PM -0500, Andrew Noonan wrote:

> I've noticed that at busy times, sends to Gmail in particular get
> backed up, causing delays of 10-30 minutes.  This was likely happening
> before, but with the larger queue sizes, I didn't notice.  The systems
> are largely idle, and with in-memory queues, disk I/O isn't an issue.

If your connections are tarpitted by the destination, there's not
much you can do, other than work with the receiving site to allow
better throughput.  When input rates exceed output rates queues grow.

> We split out gmail into it's own transport ages ago, as Gmail
> addresses have been our #1 destination for years now (about 10M a day
> today).  Even at peak, each VM is sending under 100K emails an hour
> (bottlenecks on the sending program), and Google happily accepts the
> email from us as fast as we send it without deferral,

What does "as fast as we send it" mean?  Since your queue is growing,
clearly that's not the case.

    http://www.postfix.org/QSHAPE_README.html

You need to analyze the delays=a/b/c/d log entries.  In particular
your throughput is essentially the concurrency limit divided by the
"typical" "c+d".

> maximal_backoff_time = 1800

Increase this to 2 hours.

> minimal_backoff_time = 600
> queue_run_delay = 600

Reduce these to 300s.

> gmail      unix  -       -       n       -       50      smtp
>    -o smtp_connection_cache_time_limit=15

This setting is too long.  Don't hog idle connections.  Leave this
at the default value.

>    -o smtp_destination_concurrency_limit=100

This has no effect.  The parameter needs to be set in main.cf since
it is used by the queue manager, the prefix is the transport name, not
the delivery agent name.

    main.cf:
        gmail_destination_concurrency_limit=100

Assuming of course that Google will tolerate this concurrency level.
You've actually been using the default which is 20.  Don't raise it
too quickly, try 50 and see whether that makes things better or worse.

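To make the "concurrency limit divided by c+d" estimate concrete, here is
a rough Python sketch.  It is only an illustration: the function names
are made up for this example, the sample line reuses the delays values
from the log entry quoted in this thread, and 20 is the default
concurrency mentioned above.  A real analysis would use the "typical"
c+d over many log entries, not a single line.

```python
import re

# Matches the delays=a/b/c/d field of a Postfix delivery log line.
DELAYS_RE = re.compile(r"delays=([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+)")


def transaction_latency(log_line):
    """Return c+d in seconds for one log line, or None if no delays= field.

    Hypothetical helper for illustration; a and b are parsed but ignored
    because only c+d enters the throughput estimate.
    """
    match = DELAYS_RE.search(log_line)
    if match is None:
        return None
    _a, _b, c, d = (float(value) for value in match.groups())
    return c + d


def estimated_throughput(concurrency, latency):
    """Messages per second with `concurrency` parallel deliveries."""
    return concurrency / latency


# Sample field values taken from the log entry quoted in this thread.
line = "delay=465, delays=0.1/463/0.29/1.6, dsn=2.0.0, status=sent"
latency = transaction_latency(line)
rate = estimated_throughput(20, latency)  # assumes the default limit of 20
print(f"c+d = {latency:.2f}s, ~{rate:.1f} msg/s, ~{rate * 3600:.0f} msg/hour")
```

Averaging transaction_latency() over a busy hour of smtp log lines for
the gmail transport would give a more representative figure than any
single delivery.
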
> Apr 13 23:52:16 xxx-mail88 postfix88/qmgr[22994]: 56F77E7784AD:
> from=<al...@example.com>, size=73004, nrcpt=1 (queue active)
> Apr 14 00:00:00 xxx-mail88 postfix88/smtp[13542]: 56F77E7784AD:
> to=<persongettingtheal...@gmail.com>,
> relay=gmail-smtp-in.l.google.com[173.194.72.27]:25, conn_use=19,
> delay=465, delays=0.1/463/0.29/1.6, dsn=2.0.0, status=sent (250 2.0.0
> OK 1428987600 no10si18948696pdb.63 - gsmtp)

The transaction latency is close to 2 seconds.  Thus your throughput
is ~20/2 or 10 messages per second, that's 36k per hour (under 100k
per hour as you noted).

> Usually I'm seeing about 20-25 of the gmail smtp processes in the
> process list at peak, but maybe 10000 gmail emails in the active queue
> as reported by qshape.

  * Learn to live with the delays.  Users who get a once a week
    email can probably tolerate a few hours delay.

OR

  * Reduce latency:

      Improve your DNS so that Google's queries for your
      DNS data are answered faster.

      Work with Gmail to process your email more quickly.
      Get whitelisted, ...

      Improve your network capacity, ...

      Colocate your servers closer to Google's datacenters, ...

OR

  * Increase concurrency, if Google will let you.

OR

  * Simplify your message content, so that Google's content
    scanners spend less time analyzing its URLs.

-- 
        Viktor.