* Victor Duchovni <victor.ducho...@morganstanley.com>:
> On Thu, Jan 28, 2010 at 06:13:33PM +0100, Stefan Foerster wrote:
>
> > If in a mail cluster, with multiple machines having access to a shared
> > storage device (SAN, iSCSI) which is presented to the host as a normal
> > block device (e.g. /dev/sda, hosting a normal ext3 filesystem), one of
> > the mail nodes fails, what are the necessary Postfix steps to take
> > over the queue on another host?
> >
> > I _think_ it is sufficient to provide the same configuration files as
> > on the node which failed,
>
> If path names for the queue, data and configuration directory are different,
> you may need to adjust these in the config files.

Well, that's kind of obvious :-)

> > execute "postsuper -s" until the queue file
> > names stop changing (which shouldn't happen at all, because it is the
> > same physical filesystem)
>
> Only needed when restoring from backups, copying queue files, ... Not
> needed when mounting a filesystem.

I think the manpage for postsuper recommends executing it at least once
before starting up Postfix. Can it do any harm in this specific
scenario?

> > What would happen to mails which weren't completely received when the
> > original node crashed? Can I prevent qmgr from trying to deliver
> > those?
>
> Nothing needs to be done.

This one was giving me a headache. Good to know, thank you.

One last thing: If the clocks are perfectly synchronized and the
takeover didn't happen immediately but e.g. after 60 minutes
(virtualized system, dynamic resource/node allocation), it could happen
that the deferred queue holds a large number of messages which are due
for a delivery retry. Or, to quote QSHAPE_README:

,----
| When a host with lots of deferred mail is down for some time, it is
| possible for the entire deferred queue to reach its retry time
| simultaneously. This can lead to a very full active queue once the
| host comes back up. The phenomenon can repeat approximately every
| maximal_backoff_time seconds if the messages are again deferred after
| a brief burst of congestion.
`----

If the node doesn't have to process any new incoming mail, will qmgr be
able to handle six-digit deferred queues?

Stefan
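
To put a rough number on the QSHAPE_README scenario quoted above, here is a
back-of-the-envelope sketch. It is not Postfix code and it measures nothing.
The queue size, the 4000-second backoff ceiling and the even spread of retry
times before the crash are all assumptions made up for illustration, and
fraction_due is a hypothetical helper name.

```python
# Illustrative numbers only; nothing here is read from a real Postfix system.
MAXIMAL_BACKOFF = 4000   # seconds; assumed value for maximal_backoff_time
DOWNTIME = 3600          # seconds; the 60-minute takeover delay from the mail
QUEUE_SIZE = 100_000     # assumed six-digit deferred queue


def fraction_due(retry_offsets, downtime):
    """Return the fraction of messages whose retry time passed during downtime.

    retry_offsets: seconds from the crash until each message's next retry.
    """
    due = sum(1 for offset in retry_offsets if offset < downtime)
    return due / len(retry_offsets)


# Assumption: before the crash, the scheduled retries were spread evenly
# over one maximal backoff interval.
offsets = [i * MAXIMAL_BACKOFF / QUEUE_SIZE for i in range(QUEUE_SIZE)]

print(f"{fraction_due(offsets, DOWNTIME):.0%} of the deferred queue is due at takeover")
```

Under these assumptions most of the deferred queue is eligible the moment the
surviving node starts Postfix, which is the burst the README describes. The
actual spread depends on message ages and the configured backoff settings.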