Mike,
I'm basically doing almost everything you suggest. I'm going
to start over with a new build, moving to 8.0.3 on postgres and
probably 2.0.4-svn.
We did find a duplex mismatch between the dbmail server and
the database, so that could be part of the issue.
I don't think the Xeons that I have are HT. The memory is
set up as (2) 512M chips, one in each bank.
--
David A. Niblett | email: [EMAIL PROTECTED]
Network Administrator | Phone: (352) 334-3400
Gainesville Regional Utilities | Web: http://www.gru.net/
-----Original Message-----
From: M. J. [Mike] O'Brien [mailto:[EMAIL PROTECTED]]
Sent: Saturday, July 16, 2005 4:44 AM
To: DBMail mailinglist
Subject: [Dbmail] DBMail + PostgreSQL Problem
Hello David:
* I see a couple of issues. Would you consider a default postgresql.conf
with minor tweaks?
Like:
shared_buffers = 600
sort_mem = 1024 (and no higher than about 8192, as you seem to have many users)
* Your comments about orphan message hunting on the multiple joins
(dbmail-util) make me wonder if all your indices are intact.
I am running a big server on 7.3.6 and smaller setups on 8.0.3. The former
(7.3.6) is on Intel, not Xeons but a pair of 1.4G Tualatins -
3gig mem but mail doesn't get it all cuz the machine does other mem-intensive
admin stuff too.
The server handles the "alerts" from hundreds of monitored servers on
Micromuse Netcool and other Entmgt monitors. Some admins literally kick the
crap out of it with an overabundance of senseless yellow alerts :o( that they've
been asked a billion times to drop. Hundreds of accounts get thousands of
little messages and hundreds of big ones every hour. Never been a problem on
psql. Thing is bullet-proof stable. Cron runs a minor dbmail-util every 40
minutes and a full dbmail-util every 6hrs. It scoots through these pretty
quickly but I agree, the join on the message orphans is a little rough --
but at worst a few extra minutes, not days.
Questions:
a) In your para (1) you say the Dbmail 'master daemon dies' ... which daemon do
you mean by that (lmtpd, imapd, pop3d)? (Is PostgreSQL denying a connection
because it is out of memory, I wonder?)
b) ...have you traced the connect status between the two servers -- is it
consistent? Permissions good?
Troubleshooting ideas:
1) After 'hiding' your ambitious conf and using a more tame, quasi-default
conf, list your indices and see if they are all there, none missing. (I
pasted a full set below so you can compare to a working server.)
2) You likely did this: check your psql error log and see what, if anything,
it says the problem is during the time of the dbmail-util run. Also syslog?
Also see if you can run 'systat -vmstat' (or 'vmstat' on Linux) to get a
picture of phys and swap mem status while things are haywire (if you have
any CPU left at all). Also, if you like, go to trace level 5 for lmtpd and
see how it is doing with its database connects (are they consistent?) - it
might tell you something you didn't know.
3) Another suggestion for database checking: Using the latest DbMail 2.0.4
SVN ~/dbmail_snapshot/sql/postgresql/create_tables.sql, create another
(empty) database on your server. Call it dbmail2 or something; compare the
schema; and run some tests against it by changing dbmail.conf on the
database server. (If you don't have dbmail installed on the database
server, it would be a good idea to do so. That way you can run your tools on
the DB server instead of across the LAN.)
4) Consider a fresh dev rebuild using PGSQL 8+ ... it's quite nice* and a
good excuse for a step-by-step rebuild of your system, using more
conservative approaches to configs until all is up and running well. *The
nice stuff in 8+ includes Savepoints, Improved Buffer Management,
Checkpoint, Vacuum, and Point-In-Time Recovery, which address your last point:
"I'd also be very interested in knowing a better way to... etc"
5) Hardware: (>DB Server - Dual Xeon 3.06GHz, 1GB RAM, SCSI RAID Ultra320
drives.<) You might not have enough memory for your aggressive
configuration. Gotta be 'HT Xeons'. (Is PostgreSQL threading across all 4
CPUs? Are any threads going linear on account of a broken network connection
or other issue -- this could eat memory and push into swap and even cause
broken sequences.)
Memory: Are there two mem banks on the board, one for each CPU? When only
one bank is used, that should be CPU0 for most boards. Check manual. I
wonder about a memory issue with mismatched Dual Rank x4/x8 400MHz memory or
wrong memory for the board. 'HT Xeon' boards can be picky. Run a mem test to
make certain the mem sticks are paired. (If you are running a single stick
of 1 gig it seldom goes in the first slot. Check your manual.) What can happen,
say with a pair of mismatched 512s on an HT Xeon board, is that Linux will
manage memory well until it must *reuse* phys memory past the first 512 ...
it can then have troubles... with bizarre symptoms.
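To put rough numbers on the "not enough memory for an aggressive
configuration" worry in item 5, here is a minimal Python sketch. It assumes
the stock 8 kB page size for shared_buffers and that sort_mem is counted in
kB per sort operation; the figure of 20 concurrent sorts is purely
illustrative and was not measured on any server in this thread. The function
names are made up for the example.

```python
# Rough worst-case memory estimate for a PostgreSQL 7.x-style config.
# Assumptions (not from the thread): 8 kB buffer pages, sort_mem given
# in kB per sort operation, and a hypothetical count of concurrent sorts.

PAGE_KB = 8  # default PostgreSQL block size


def shared_buffers_mb(buffers: int) -> float:
    """Memory taken by shared_buffers, given as a count of 8 kB pages."""
    return buffers * PAGE_KB / 1024


def worst_case_sort_mb(sort_mem_kb: int, concurrent_sorts: int) -> float:
    """Upper bound if every concurrent sort uses its full sort_mem."""
    return sort_mem_kb * concurrent_sorts / 1024


if __name__ == "__main__":
    ram_mb = 1024  # the DB server in the thread has 1GB RAM
    sorts = 20     # illustrative guess only
    print(f"shared_buffers = 600 -> {shared_buffers_mb(600):.1f} MB")
    for label, sort_mem_kb in (("100M setting", 102400),
                               ("suggested", 1024),
                               ("upper suggestion", 8192)):
        need = worst_case_sort_mb(sort_mem_kb, sorts)
        print(f"{label}: up to {need:.0f} MB for {sorts} sorts "
              f"on a {ram_mb} MB machine")
```

Under these assumptions the 102400 setting could ask for more memory than
the box has once a couple of dozen sorts run at the same time, while 1024 to
8192 stays well inside 1GB. Real usage depends on the actual query mix.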
PSQL Indices:
dbmail_acl_pkey
dbmail_aliases_alias_idx
dbmail_aliases_alias_low_idx
dbmail_aliases_pkey
dbmail_auto_notifications_pkey
dbmail_auto_replies_pkey
dbmail_idx_ipnumber
dbmail_idx_since
dbmail_mailboxes_name_idx
dbmail_mailboxes_owner_idx
dbmail_mailboxes_owner_name_idx
dbmail_mailboxes_pkey
dbmail_messageblks_physmessage_idx
dbmail_messageblks_physmessage_is_header_idx
dbmail_messageblks_pkey
dbmail_messages_7
dbmail_messages_8
dbmail_messages_mailbox_idx
dbmail_messages_physmessage_idx
dbmail_messages_pkey
dbmail_messages_seen_flag_idx
dbmail_messages_status_idx
dbmail_messages_status_notdeleted_idx
dbmail_messages_unique_id_idx
dbmail_pbsp_pkey
dbmail_physmessage_pkey
dbmail_subscription_pkey
dbmail_users_name_idx
dbmail_users_pkey
Sequences:
dbmail_alias_idnr_seq
dbmail_mailbox_idnr_seq
dbmail_message_idnr_seq
dbmail_messageblk_idnr_seq
dbmail_physmessage_id_seq
dbmail_seq_pbsp_id
dbmail_user_idnr_seq
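Comparing a list like the one above against a second server by eye is
error-prone, so here is a small Python sketch that does the set difference.
It assumes you have saved the index names from your own server as plain
text, for example from \di in psql or from
SELECT indexname FROM pg_indexes WHERE schemaname = 'public'; (assuming the
dbmail tables live in the public schema). The helper names and the sample
"actual" list are hypothetical.

```python
# Compare the indices found on a server against a reference list.
# The names below are a short excerpt of the reference list; paste the
# full list, and the output from your own server, in their place.
from __future__ import annotations


def parse_names(text: str) -> set[str]:
    """Split whitespace-separated index names into a set."""
    return set(text.split())


def missing_indices(reference: str, actual: str) -> list[str]:
    """Names present in the reference list but absent on the server."""
    return sorted(parse_names(reference) - parse_names(actual))


if __name__ == "__main__":
    reference = """
        dbmail_messageblks_physmessage_idx
        dbmail_messages_mailbox_idx
        dbmail_messages_physmessage_idx
        dbmail_messages_status_idx
    """
    # Hypothetical server output with one index gone.
    actual = """
        dbmail_messageblks_physmessage_idx
        dbmail_messages_mailbox_idx
        dbmail_messages_status_idx
    """
    for name in missing_indices(reference, actual):
        print("missing:", name)
```

Splitting on whitespace means it does not matter whether the names arrive
one per line or several per line. The same function works for the sequence
names.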
Hope this helps...
best...
Mike
----- Original Message -----
From: "Niblett, David A" <[EMAIL PROTECTED]>
To: <dbmail@dbmail.org>
Sent: Friday, July 15, 2005 9:07 AM
Subject: [Dbmail] DBMail + PostgreSQL Problem
Hello all,
I'm in serious need of help here. I'm about at my wits'
end dealing with dbmail and getting it to work. I've
had a couple of database crashes now and found things like
if I run dbmail-util my DB process load skyrockets and
the server becomes unusable.
I'm hoping that maybe I'm just incapable of tuning PostgreSQL for
performance. At this point I'd like to know if there is anyone out
there that has experience with dbmail-2.0.4 on PostgreSQL-7.4.7 with a
moderately large table size (9-10GB). We are very interested in paying
for some help in the form of consulting if need be. At this point if
we can't work out the bugs then we are going to scrap the entire thing
and go back to our simple Windows based NTMail system.
Some things that seem to happen are:
1) If I stop/reload the postgres database (normal nice stop which
should allow all transactions to finish) the dbmail master daemon dies
and we seem to get a lot of unconnected messages suddenly in the
database. We see these in the form of no-user, no-subject messages in
users' mailboxes.
2) When dbmail-util runs, the process load just skyrockets on the db
server. It seems to be related to the large join done for finding the
messageblks that are not connected.
3) When we vacuum the database the process load screams sky high (like
160+) on the server. The last time we did this it took 4 days for the
vacuum to finish. We believe we have this fixed by using the
pg_autovacuum daemon.
Our setup is:
DB Server - Dual Xeon 3.06GHz, 1GB RAM, SCSI RAID Ultra320 drives.
DBMail Server - P4 3.06GHz, 1GB RAM, SATA drives.
Running: dbmail-2.0.4, postgresql-7.4.7 on Gentoo with Linux 2.6 kernel
As far as tweaking goes, I've set sort_mem and vacuum_mem on postgres
each to 100M (102400) to try to stop swapping. I've also increased
the shared memory limit from 32M to 100M.
I'd also be very interested in knowing a better way to limit the
database transaction logs such that should I suffer a crash I'm not
having to dump the database and restore. I never really had this
issue with MSSQL. I expect to lose things like the message that is
being delivered, but not corrupt the dbmail_users table and everything
else.
HELP... TIA
--
David A. Niblett | email: [EMAIL PROTECTED]
Network Administrator | Phone: (352) 334-3400
Gainesville Regional Utilities | Web: http://www.gru.net/
_______________________________________________
Dbmail mailing list
Dbmail@dbmail.org https://mailman.fastxs.nl/mailman/listinfo/dbmail