I have been working on a little Perl project that takes a list of
550,000 email addresses and generates an SQL statement for each to create
an entry in a DbMail MySQL database. This will be my new test
database. Unfortunately, numerous things arose in August and I have had to put
dev resources to other use. I have too many bosses. I will get back to
it shortly. I think it will be a worthwhile test Db.
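
As a rough illustration of the kind of generator described above, here is a minimal sketch in Python rather than Perl. The table name `test_users` and the column `userid` are placeholders chosen for the example, not DbMail's actual schema, and the quoting assumes MySQL's default backslash-escape mode. For a real load, parameterized queries through a database driver would likely be the safer route.

```python
def sql_quote(value: str) -> str:
    """Quote a string as an SQL literal.

    Assumes MySQL's default mode, where backslash is an escape
    character, so backslashes are doubled as well as single quotes.
    """
    escaped = value.replace("\\", "\\\\").replace("'", "''")
    return "'" + escaped + "'"


def insert_statements(addresses, table="test_users"):
    """Yield one INSERT statement per non-empty address.

    The table and column names are hypothetical placeholders,
    not DbMail's real schema.
    """
    for raw in addresses:
        address = raw.strip()
        if not address:
            continue  # skip blank lines in the input list
        yield f"INSERT INTO {table} (userid) VALUES ({sql_quote(address)});"


if __name__ == "__main__":
    sample = ["alice@example.com", "", "o'brien@example.org\n"]
    for statement in insert_statements(sample):
        print(statement)
```

Because the function is a generator, a 550,000-line file could be streamed through it line by line without holding the whole list in memory.
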
I came back to DbMail in June. It has changed a lot since 2000.
I wrote us a programme called DBMA (DbMail Administrator)
(http://library.mobrien.com/dbmailadministrator/) to administer DbMail
on the assumption of substantial scalability. I needed to prove my
scalability theories internally. So far I have worked with 22,000
accounts in various tests. I am pretty sure that's a small number
compared to OP's experiences. My aim is to push the totals as near as
possible to the full half million I mentioned, as soon as possible, come
what may. The target is the end of September.
Postfix with DbMail is the way to go. I use 2.2 dev Postfix and have
experience with most everything else. (SHUDDER) Once you have worked for
a while with Postfix, there is no other MTA.
I have used DbMail on both PostgreSQL and MySQL with much the same results,
except that as the user numbers grow, MySQL begins to outperform PostgreSQL in
speed. That could be me. I am not as adept with PostgreSQL. One thing is
for sure, whatever lags there may be, pgsql always holds up and all
queues are finished. I have dumped MySQL cores a few times on account of
my test code errors and loops but not PostgreSQL. It tolerates my wicked
tests -- just makes me wait. MySQL dumps. Go figure.
SAP isn't SAP anymore is it? MaxDB now. I think MySQL AB is making some
interesting promises about future development and some commonality and
interoperability dev work. Might be pie in the sky. Might be an
interesting prospect for DbMail.
DbMail has been around for a few years. Maybe it wasn't getting the
interest it has had recently. Probably folks lacked confidence in the
concept of stuffing mail into databases. The database engines are better
now and DbMail is coming into its own time. I wonder if it is hard to
find people who understand both mail and SQL. Perhaps that's changing
too. And DbMail is very good so it is getting much more attention this year.
What I am getting at is that DbMail is now production quality and a
serious way forward at the enterprise level, and is therefore worth some
level of commitment to its advancement. From its conceptual core it has
good 'future-proof' features, whether by design or happenstance; they
work well nonetheless and give it greater potential.
I have a hard time getting worried about its scalability. I haven't seen
anything discouraging as yet and I don't see expensive resources being
an issue. Apart from the CPU demands of the database engine, IMHO, it's
all about handling network traffic. The faster you move the volume the
better things work.
MySQL AB has some ideas about clustering but I haven't seen them well
implemented. Seems that production efforts mainly fall back to the tried
and true. (Hand the problem to the router guys.) The thoughts expressed
in this thread about IP load sharing make good sense. MySQL replication
is now working great and that is a major plus. One needs to be cautious
about selecting the MySQL version though. Don't jump out on the bleeding
edge versions or everything falls apart.
I believe load balancing via IP switching is as good as it gets. You
can have as many DbMail front ends on the same database as you want.
I don't see any reason why you can't give users a gig of mail storage. I
won't get into the whys of that. It has already been suggested in this
thread, and so you already have ideas about why the world is now in a
place where a gig of mail is not outlandish. DBMA uses gigs as one of
its possible defaults. I would prefer users put that on their own
computers and I think the email-client developers need a good head
shake. You would never want all users to be hitting quota but you need
to be ready for that eventuality.
I wonder about splitting the load by domains or network segments. In
other words assigning front-end DbMail hosts with designated unique
back-end database clusters to various subsets of the user universe.
That creates a complex failover issue but diminishes, if not
statistically eliminates, the likelihood of a global failure. A single
failover setup could be designed to handle the universe at a likely
reduction in speed and thus be able to take a failover from any segment.
Maybe this is where you bring out the 32-CPU Sparcs. This concept would
use task assignment as a solution rather than making it a monster
hardware issue and the little 2-4 smb units would handle daily traffic
in segmented lots.
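
To make the segmenting idea a little more concrete, here is a minimal sketch in Python. It assumes a static map from mail domain to a dedicated back-end cluster plus one shared failover cluster that can absorb any segment. All of the names (`SEGMENTS`, `FAILOVER`, `backend_for`, the cluster labels) are invented for illustration and are not part of DbMail.

```python
# Hypothetical assignment of mail domains to dedicated back-end clusters.
SEGMENTS = {
    "example.com": "cluster-a",
    "example.org": "cluster-b",
}

# Hypothetical shared cluster able to take over any segment, more slowly.
FAILOVER = "cluster-failover"


def backend_for(address: str, healthy: set) -> str:
    """Pick the back-end cluster for an address.

    Uses the domain's own cluster when it is in the healthy set.
    Falls back to the shared failover cluster when the segment is
    down or the domain has no assigned segment.
    """
    domain = address.rsplit("@", 1)[-1].lower()
    primary = SEGMENTS.get(domain)
    if primary is not None and primary in healthy:
        return primary
    return FAILOVER


if __name__ == "__main__":
    up = {"cluster-a"}  # pretend cluster-b has failed
    print(backend_for("mike@example.com", up))
    print(backend_for("mike@example.org", up))
```

The point of the design is that a failure in one segment only reroutes that segment's users, while everyone else stays on their own small cluster.
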
The Google folks did not create a single system to handle 3 billion web
pages. Instead, a great many distributed systems, incorporating
interoperability at the kernel level and machine learning features in the
application software, each look after one small subset of the total in a
peer collaboration network.
You folks may be on the wrong track. Think of logically breaking down
the universe of users. I had to do this with music once. I used the
alphabet. Each server got a letter. Sorting was by author. Fewer than 100
servers now handle all the music in the world. (Or so the people tell
me. :o))
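
The alphabet split can be sketched in a few lines of Python. This is only an illustration of the idea, assuming one server per letter of the author's name and a catch-all for anything that does not start with a letter. The server names and the `server_for` function are made up for the example.

```python
import string


def server_for(author: str) -> str:
    """Map an author name to a server by its first letter.

    Names that are empty or do not start with an ASCII letter go to a
    hypothetical catch-all server.
    """
    first = author.strip()[:1].lower()
    if first and first in string.ascii_lowercase:
        return f"server-{first}"
    return "server-other"


if __name__ == "__main__":
    for name in ["Bach", "  mozart", "50 Cent"]:
        print(name.strip(), "->", server_for(name))
```

A plain first-letter split will not balance load evenly, since some letters are far more common than others, so a real deployment would probably need to subdivide the busy letters.
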
This might be a good time to think what you can do to help DbMail along.
I don't know. You would need to ask someone important. Not me. But if
it is clear to you that this project has potential for attaining the
goals you wish to achieve, don't wait for it to happen, jump in (the
water's fine) and help make it happen.
Have many happy days...
Mike