> Anyone aware of a particularly good discussion of building a farm of
> vpopmail "compliant" front-end machines for user access to a central
> file server via NFS on linux?  I'm concerned that I haven't thought
> through issues in how to properly account for webmail/IMAP, MySQL for
> storing smtp-auth IPs for relay control, and a few other topics.
> Googling hasn't yielded much but a few threads from the *BSD folks.
>
I was involved in setting up and administering a rather large cluster
while consulting for a mid-sized college.  It's probably bigger than what
you want to get into, but I'll share a few items:

> My tentative thinking is 2+ front end machines that draw from a
> common/identical configuration that provide the client interfaces via:
> - SMTPd, smtp-auth, pop3d, send, IMAPd, anti-virus, anti-spam, webmail
> (apache + squirrelmail)
> - CHKUSER talking to the backend server
> - Local /var/qmail/ (typical) install for queue, bin, supervise,
> etc...   possibly taken from the central, backend server via nightly
> rsync where needed.
> - NFS client communication to the central backend server
>
> A single, large server provides the "backend" services to these machines
> for:
> - MySQL server (for smtp-auth tracking, squirrelmail prefs/abook/sigs,
> users, domains)
> - NFS Service providing Client-mounted folder(s) for the domains' email.
>
We had local installs of qmail/vpopmail/etc. on all of the front-end
servers, and then nfs-mounted certain shared things from the back-end
instead of the local folders.  Originally we nfs-mounted them directly,
but network problems with nested NFS mounts led us to mount them in a
separate tree and use symlinks - e.g. /var/qmail/control was symlinked to
/mnt/qmail-control, which in turn nfs-mounted
backend:/export/qmail/control.  We imported the following from the
back-end server:
  /var/qmail/control
  /var/qmail/users
  /var/vpopmail/domains
  /var/vpopmail/domains/<primary domain>/<hash dirs>
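
On each front end the wiring looked roughly like the sketch below - the
host name, export path, and mount options are just examples, not our
exact setup:

  # /etc/fstab - mount the shared piece into a separate tree
  backend:/export/qmail/control  /mnt/qmail-control  nfs  rw,hard,intr  0 0

  # then point qmail at it with a symlink
  mv /var/qmail/control /var/qmail/control.local
  ln -s /mnt/qmail-control /var/qmail/control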

Let me explain the last item in that list - we hosted several domains on
the cluster, but only the primary one was big, so we left the rest on a
single centralized FS.  We then created each hash dir (0-9, A-Z, a-z)
under the primary domain as a separate FS and exported them individually
- the speedup from having far fewer emails on each FS more than made up
for the NFS overhead (there's a rough exports sketch after this
paragraph).  It was originally all one FS, then we went to multiple FSs
on a single server, and eventually we migrated to 4 back-end servers,
each exporting a quarter of the hash dirs.  We also installed vpopmail
and POP/IMAP servers on the back end, and used perdition on the front
ends to send POP/IMAP connections directly to the correct back-end server
whenever possible. 
We kept one server with all the NFS mounts as a fallback for POP/IMAP, but
it was rarely used.  One other thing we did was write a perl script with
a DB back end that tracked which NFS mount came from where - it let us
shuffle mounts around as needed, and also worked around the problem of
trying to mount the NFS shares before the network ports had come up,
which happens a lot with certain combinations of NICs and switches...
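
To make the hash-dir split concrete, the exports looked roughly like
this - hostnames and paths are made up, and you repeat the pattern for
each hash dir:

  # /etc/exports on a back-end server (one entry per hash-dir FS)
  /export/domains/example.com/0  frontend1(rw,no_root_squash) frontend2(rw,no_root_squash)
  /export/domains/example.com/1  frontend1(rw,no_root_squash) frontend2(rw,no_root_squash)

  # front ends mount each one into the separate tree and symlink it in,
  # same trick as the control dir:
  mount backend1:/export/domains/example.com/0 /mnt/hash-0
  ln -s /mnt/hash-0 /var/vpopmail/domains/example.com/0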

For inbound mail we still needed the NFS mounts, although I was working
on (and eventually testing) a method of doing distributed SMTP as well:
the front-end systems queued the mail and then figured out which back-end
(also running qmail/vpopmail) to hand it to.  As far as I know it never
left testing after I moved on, but it did work while I was testing it.

For the SQL, we used one of the back-ends as a master and all of the
other systems (front- and back-end) as slaves, then had vpopmail do all
reads from the local system and send writes to the master, which
replicated the updates back out.  IIRC there was also a weekly script
early Sunday morning that tarred up the master DB and refreshed the
slaves, which kept the binlogs from piling up.
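
A minimal sketch of the replication wiring (plain MySQL master/slave of
that era; server IDs, hostnames, and credentials are made up):

  # my.cnf on the master back-end
  [mysqld]
  server-id = 1
  log-bin   = mysql-bin

  # my.cnf on every other box (unique server-id per machine)
  [mysqld]
  server-id = 2

  # on each slave, point it at the master and start replicating
  # (the repl user needs REPLICATION SLAVE privileges on the master)
  mysql> CHANGE MASTER TO MASTER_HOST='backend1', MASTER_USER='repl',
      ->   MASTER_PASSWORD='secret', MASTER_LOG_FILE='mysql-bin.000001',
      ->   MASTER_LOG_POS=4;
  mysql> START SLAVE;

vpopmail then just needs to read from localhost and write to the master -
check your vpopmail version's MySQL docs for how it splits read and
update servers.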

Other than webmail, every front-end system could do anything, and we
would routinely change the DNS entries to have different servers provide
different services, so we could take systems offline for maintenance or
clear out large queues.  Webmail was only on a couple of systems, one
live and one for testing.  If you're going to use webmail, a caching IMAP
proxy is a must - it makes a huge difference in webmail performance.
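
A caching proxy like up-imapproxy between the webmail box and the IMAP
server it talks to is the usual way to do this - the settings below are
from memory and only a guess at a sane starting point, so check the
sample imapproxy.conf that ships with it:

  # imapproxy.conf - webmail connects to localhost:143, the proxy keeps
  # connections to the real IMAP server open between webmail page loads
  server_hostname        imap.example.com
  server_port            143
  listen_port            143
  cache_size             2048
  cache_expiration_time  300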

Oh, put your NFS traffic on a separate physical network from your public
communications - we used a second interface running at gigabit speed on
RFC1918 address space for the front-end to back-end communications, which
helped a lot when the public connections were being stressed.
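
Concretely, that just means a second NIC on a private subnet and making
sure the hostnames used for the mounts resolve to the private addresses -
for example, on a Red Hat-style box (all addresses made up):

  # /etc/sysconfig/network-scripts/ifcfg-eth1
  DEVICE=eth1
  IPADDR=10.10.10.11
  NETMASK=255.255.255.0
  ONBOOT=yes

  # /etc/hosts - NFS and MySQL traffic then stays on the private wire
  10.10.10.1   backend1
  10.10.10.11  frontend1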

> Any special compile/configuration suggestions to support this that I
> wouldn't normally use on a single-box solution?  Should the client
> machines be logging to their local drives, to an NFS mounted drive, or
> log over the network (like syslog-ng, even possible with multilog???) to
> any particular host?
>
We logged locally - we were looking at a way to get everything aggregated
as well, but never really found a good answer.  In any case, I wrote a
patch to qmail that adds the machine name to the Received header
("Received by qmail _on <hostname>_") to make tracing mails easier - I
don't think I've ever publicly released it, but I'm currently working on
releasing a large patch cluster including it, and I plan on putting the
individual patches up as well - email me if you want/need it before I get
around to putting the cluster up...

> Any administrative issues that grow through this distributed model?  I'm
> thinking about whether vqadmin or qmailadmin will continue to function
> correctly when run from any of the "farm" machines?  Would I just allow
> one "admin" machine for vqadmin/qmailadmin to prevent issues?
>
If you use a SQL back-end for vpopmail and have vpopmail installed and
properly configured on whatever machine you run the admin tools on, you
should be fine.  Just make sure you have write access to the NFS-mounted
directories (via no_root_squash) or you'll be in for a world of hurt :) 
Also, the /service directories were handled locally - it was safer to
copy updated run files (many of which needed to be customized for
individual machines) across the cluster than to try to run them from NFS.
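
For run files that really were identical across machines, pushing an
update looked something like this (hostnames and the service name are
just examples; per-machine files we edited by hand afterwards):

  for host in frontend1 frontend2; do
    rsync -av /var/qmail/supervise/qmail-smtpd/run \
      $host:/var/qmail/supervise/qmail-smtpd/run
    ssh $host 'svc -t /service/qmail-smtpd'   # restart under daemontools
  done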

> Any risks of data collision/overlap or other issues that might surface
> with this multi-server model?  Central MySQL should solve most of this,
> right?
> THANKS!!!!!
> D.

None that I've seen, except if you need individualized control files on
separate machines.  The one I remember most was the outgoingip control
file (if you use that patch), which we solved by making
/var/qmail/control/outgoingip (on the NFS mount) a symlink to
/etc/qmail/outgoingip, and then having a separate local file on each
server with the correct address in it.  I know there were a few more, but
I don't remember which.
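
The outgoingip trick, spelled out (addresses are examples; this assumes
the patch reads a single IP from control/outgoingip):

  # on the shared (NFS-exported) control directory:
  ln -s /etc/qmail/outgoingip /var/qmail/control/outgoingip

  # on each front end, a local file with that machine's address:
  mkdir -p /etc/qmail
  echo 192.0.2.11 > /etc/qmail/outgoingip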

One last suggestion - we used identical machines with hardware RAID
mirroring for the front-end systems.  If we ever needed to add a machine
to the cluster, it was as simple as pulling one drive from the mirror set
(replacing it immediately with a spare - the rebuild of a 36GB U320 SCSI
drive took ~6 hours) and popping it into a new system along with a
replacement drive to rebuild onto.  Keep the network unplugged the first
time you boot it, change all of the hostname and IP references (and purge
the queue if you need to - VERY important to avoid duplicate
deliveries!!!), and then bring it back up.  Instant mail server, just add
platters :)

Josh
-- 
Joshua Megerman
SJGames MIB #5273 - OGRE AI Testing Division
You can't win; You can't break even; You can't even quit the game.
  - Layman's translation of the Laws of Thermodynamics
[EMAIL PROTECTED]

