On Wed, Jul 27, 2011 at 1:15 AM, Ralf Hildebrandt
<ralf.hildebra...@charite.de> wrote:
> * Steve Jenkins <stevejenk...@gmail.com>:
>
>> QSHAPE is one tool we were already using, and the good news is that
>> even during a send process (one of which is going on right now), the
>> active queue is generally very small. Like so:
>
> The output looks very good. No room for optimization!

Cool - that's what we were seeing, too.

>> We do this one, because each message is unique (has to have an
>> individual unsub link and contains the subscriber's name)
>
> Good! :)

>> As far as parallel submissions, we're only doing three at a time
>> (three SwiftMail processes sending at a time). Our in_flow_delay
>> parameter is set to 0. We aren't receiving a lot of mail on this box,
>> so I'm not sure that delay would even kick in if it were set to the 1s
>> default. Beyond this, we're not sure how to check to see if the disk
>> is being "overwhelmed with mail submissions." Our iowait% is 0.23, so
>> the CPU isn't waiting for the disk. How else can we tell if we're
>> overwhelming the disk?
>
> When you're overwhelming the disk, all IO would be dedicated to
> accepting the mail from SwiftMail, not sending the mail out.

Okay - makes sense.

>> We're not really seeing problematic destinations. The mail is getting
>> delivered right away when we attempt to deliver. It's just that our
>> attempts don't seem to happen very quickly.
>
> Maybe your upstream network link is saturated?

Not likely, it's a pretty large pipe, and everything looks clear there
(and all other traffic through our switch is moving normally).

>> Before we just blindly throw bigger hardware at the issue, we'd still
>> love some ideas to help research what else could be slowing us down.
>
> default_process_limit is set to what?

We haven't set it, so I presume that means it's using the default of
100. I just checked, and we're using 44 of them.

I'm gonna take a hard look at the PHP script running the mailer through
SwiftMailer today. At first I was thinking disk IO, but CPU now looks
like a strong culprit.

Thanks,

SteveJ
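
For anyone following along: rather than presuming the default, the effective default_process_limit can be read back from Postfix itself. Below is a minimal Python sketch, assuming `postconf` is on the PATH. The helper names (`parse_postconf`, `process_headroom`, `effective_settings`) are made up for illustration and are not part of Postfix, and the sample text only mirrors the values mentioned in this thread.

```python
import subprocess


def parse_postconf(output: str) -> dict:
    """Parse 'name = value' lines in the format printed by `postconf`."""
    settings = {}
    for line in output.splitlines():
        name, sep, value = line.partition("=")
        if sep:  # skip lines that carry no '='
            settings[name.strip()] = value.strip()
    return settings


def process_headroom(limit: int, in_use: int) -> float:
    """Return the fraction of the process limit that is still free."""
    return (limit - in_use) / limit


def effective_settings(*names: str) -> dict:
    """Query the local Postfix for effective values (needs postconf installed)."""
    result = subprocess.run(
        ["postconf", *names], capture_output=True, text=True, check=True
    )
    return parse_postconf(result.stdout)


if __name__ == "__main__":
    # Hypothetical sample mirroring the numbers in this thread, not real output.
    sample = "default_process_limit = 100\nin_flow_delay = 0\n"
    limit = int(parse_postconf(sample)["default_process_limit"])
    print(f"free process slots: {process_headroom(limit, 44):.0%}")
```

On a live box, `effective_settings("default_process_limit", "in_flow_delay")` would replace the sample text. With 44 of 100 slots in use there is still plenty of headroom, so the process limit is unlikely to be the bottleneck here.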