Hello David:

* I see a couple of issues. Would you consider a default postgresql.conf
with minor tweaks?
Like:
shared_buffers = 600
sort_mem = 1024 (not much higher than 8192, as you seem to have many users
and sort_mem is allocated per sort, per backend)

* Your comments about orphan message hunting on the multiple joins
(dbmail-util) make me wonder if all your indices are intact.
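
To put rough numbers on the first point, here is a small sketch of the worst-case arithmetic. It is only an estimate under stated assumptions: 8 kB buffer pages, sort_mem counted in kB, one sort per backend at a time, and a made-up figure of 50 concurrent connections. The function name is hypothetical, not part of any tool.

```python
# Rough worst-case memory estimate for a PostgreSQL 7.x-style configuration.
# Assumptions (not from the original post): 8 kB buffer pages, one sort per
# backend at a time, and a hypothetical count of 50 concurrent connections.

BUFFER_PAGE_KB = 8  # shared_buffers is counted in buffer pages, 8 kB by default


def worst_case_mb(shared_buffers, sort_mem_kb, connections, sorts_per_backend=1):
    """Return shared buffer memory plus peak sort memory, in megabytes."""
    shared_kb = shared_buffers * BUFFER_PAGE_KB
    sort_kb = sort_mem_kb * connections * sorts_per_backend
    return (shared_kb + sort_kb) / 1024.0


if __name__ == "__main__":
    connections = 50  # hypothetical; substitute your real connection count
    settings = (("tame", 1024), ("upper bound", 8192), ("current", 102400))
    for label, sort_mem in settings:
        mb = worst_case_mb(600, sort_mem, connections)
        print("%-12s sort_mem=%6d kB -> about %.0f MB" % (label, sort_mem, mb))
```

With sort_mem at 102400 the worst case lands far above the 1GB in the DB server, which is why the tame values are worth trying first.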

I am running a big server on 7.3.6 and smaller setups on 8.0.3.
The former (7.3.6) is on Intel, not Xeons but a pair of 1.4GHz Tualatins,
with 3GB of memory, but mail doesn't get it all because the machine does other
memory-intensive admin stuff too.
The server handles the "alerts" from hundreds of monitored servers on
Micromuse Netcool and other enterprise management monitors. Some admins
literally kick the crap out of it with an overabundance of senseless yellow
alerts :o( that they've been asked a billion times to drop. Hundreds of
accounts get thousands of little messages and hundreds of big ones every hour.
Never been a problem on PostgreSQL. The thing is bulletproof stable. Cron runs
a minor dbmail-util every 40 minutes and a full dbmail-util every 6 hours. It
scoots through these pretty quickly, but I agree, the join on the message
orphans is a little rough. At worst, though, it costs a few extra minutes,
not days.

Questions:
a) In your para (1) you say the Dbmail 'master daemon dies' ... which daemon
do you mean (lmtpd, imapd, pop3d)? (Is PostgreSQL denying a connection because
it is out of memory, I wonder?)
b) Have you traced the connect status between the two servers? Is it
consistent? Permissions good?

Troubleshooting ideas:
1) After 'hiding' your ambitious conf and using a more tame, quasi-default
conf, list your indices and see if they are all there, none missing. (I
pasted a full set below so you can compare to a working server.)

2) You likely did this already: check your PostgreSQL error log and see what,
if anything, it says the problem is during the dbmail-util run. Also syslog?
Also see if you can run 'systat -vmstat' to get a picture of physical and swap
memory status while things are haywire (if you have any CPU left at all).
Also, if you like, go to trace level 5 for lmtpd and see how it is
doing with its database connects (are they consistent?); it might tell you
something you didn't know.

3) Another suggestion for database checking: using the latest DbMail 2.0.4
SVN ~/dbmail_snapshot/sql/postgresql/create_tables.sql, create another
(empty) database on your server. Call it dbmail2 or something, compare the
schema, and run some tests against it by changing dbmail.conf on the
database server. (If you don't have dbmail installed on the database
server, it would be a good idea to do so. That way you can run your tools on
the DB server instead of across the LAN.)

4) Consider a fresh dev rebuild using PGSQL 8+ ... it's quite nice* and a
good excuse for a step-by-step rebuild of your system, using more
conservative approaches to configs until all is up and running well. *The
nice stuff in 8+ includes Savepoints, Improved Buffer Management,
Checkpoint and Vacuum improvements, and Point-In-Time Recovery, which
address your last point:
"I'd also be very interested in knowing a better way to... etc"

5) Hardware: (>DB Server - Dual Xeon 3.06GHz, 1GB RAM, SCSI RAID Ultra320
drives.<) You might not have enough memory for your aggressive
configuration. Those have to be 'HT Xeons'. (Is PostgreSQL spreading its
backends across all 4 logical CPUs? Are any of them going linear on account of
a broken network connection or other issue? This could eat memory, push into
swap and even cause broken sequences.)

Memory: Are there two memory banks on the board, one for each CPU? When only
one bank is used, that should be CPU0 for most boards. Check the manual. I
wonder about a memory issue with mismatched dual-rank x4/x8 400MHz memory or
the wrong memory for the board. 'HT Xeon' boards can be picky. Run a memory
test to make certain the sticks are paired. (If you are running a single 1GB
stick, it seldom goes in the first slot. Check your manual.) What can happen,
say with a pair of mismatched 512s on an HT Xeon board, is that Linux will
manage memory well until it must *reuse* physical memory past the first 512MB
... it can then have troubles, with bizarre symptoms.
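
For idea 2 above, if systat is not handy, a few lines of Python can log physical and swap memory while dbmail-util runs. This is a minimal sketch that assumes a Linux box exposing /proc/meminfo with the usual MemFree, SwapTotal and SwapFree fields. The function names are made up for illustration.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: integer value}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])  # kB for most fields
    return values


def swap_used_kb(info):
    """Swap in use, derived from the SwapTotal and SwapFree fields."""
    return info["SwapTotal"] - info["SwapFree"]


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


def watch(samples=10, interval=5.0):
    """Print free memory and swap use a fixed number of times."""
    for _ in range(samples):
        info = read_meminfo()
        stamp = time.strftime("%H:%M:%S")
        print("%s free=%d kB swap_used=%d kB"
              % (stamp, info["MemFree"], swap_used_kb(info)))
        time.sleep(interval)
```

Calling watch() in one terminal while dbmail-util runs in another should show whether swap use climbs along with the load.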

PSQL Indices:

dbmail_acl_pkey
dbmail_aliases_alias_idx
dbmail_aliases_alias_low_idx
dbmail_aliases_pkey
dbmail_auto_notifications_pkey
dbmail_auto_replies_pkey
dbmail_idx_ipnumber
dbmail_idx_since
dbmail_mailboxes_name_idx
dbmail_mailboxes_owner_idx
dbmail_mailboxes_owner_name_idx
dbmail_mailboxes_pkey
dbmail_messageblks_physmessage_idx
dbmail_messageblks_physmessage_is_header_idx
dbmail_messageblks_pkey
dbmail_messages_7
dbmail_messages_8
dbmail_messages_mailbox_idx
dbmail_messages_physmessage_idx
dbmail_messages_pkey
dbmail_messages_seen_flag_idx
dbmail_messages_status_idx
dbmail_messages_status_notdeleted_idx
dbmail_messages_unique_id_idx
dbmail_pbsp_pkey
dbmail_physmessage_pkey
dbmail_subscription_pkey
dbmail_users_name_idx
dbmail_users_pkey

Sequences:
dbmail_alias_idnr_seq
dbmail_mailbox_idnr_seq
dbmail_message_idnr_seq
dbmail_messageblk_idnr_seq
dbmail_physmessage_id_seq
dbmail_seq_pbsp_id
dbmail_user_idnr_seq
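
The two lists above can be compared against your own server mechanically rather than by eye. As a sketch: save the reference names to one file and the output of something like psql -At -c "SELECT indexname FROM pg_indexes WHERE indexname LIKE 'dbmail%'" to another, then diff them as sets. The helper names below are hypothetical, and the sample names in the demo are just a few taken from the list.

```python
def names(text):
    """Split whitespace-separated object names into a set, ignoring blanks."""
    return set(text.split())


def diff_names(expected_text, actual_text):
    """Return (missing, extra): names only in expected, names only in actual."""
    expected = names(expected_text)
    actual = names(actual_text)
    return sorted(expected - actual), sorted(actual - expected)


if __name__ == "__main__":
    # Demo data only: a few reference names, and a server missing one of them.
    expected = """
        dbmail_messageblks_physmessage_idx
        dbmail_messages_physmessage_idx
        dbmail_messages_pkey
    """
    actual = """
        dbmail_messageblks_physmessage_idx
        dbmail_messages_pkey
    """
    missing, extra = diff_names(expected, actual)
    print("missing on your server:", missing)
    print("not in the reference list:", extra)
```

The same comparison works for the sequences if you feed it the sequence names instead.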

Hope this helps...
best...
Mike

----- Original Message -----
From: "Niblett, David A" <[EMAIL PROTECTED]>
To: <dbmail@dbmail.org>
Sent: Friday, July 15, 2005 9:07 AM
Subject: [Dbmail] DBMail + PostgreSQL Problem


Hello all,

I'm in serious need of help here.  I'm about at my wits'
end dealing with dbmail and getting it to work.  I've
had a couple of database crashes now and found things like
if I run dbmail-util my DB process load skyrockets and
the server becomes unusable.

I'm hoping that maybe I'm just incapable of tuning PostgreSQL
for performance.  At this point I'd like to know if there is
anyone out there that has experience with dbmail-2.0.4 on
PostgreSQL-7.4.7 in a moderately large table size (9-10GB).
We are very interested in paying for some help in the form of
consulting if need be.  At this point if we can't work out the
bugs then we are going to scrap the entire thing and go back
to our simple Windows based NTMail system.

Some items that seem to happen are:

1) If I stop/reload the postgres database (normal nice stop which
should allow all transactions to finish) the dbmail master daemon
dies and we seem to get a lot of unconnected messages suddenly in
the database.  We see these in the form of no-user, no-subject
messages in users' mailboxes.

2) When dbmail-util runs, the process load just skyrockets on
the db server.  It seems to be related to the large join done
for finding the messageblks that are not connected.

3) When we vacuum the database the process load screams sky high
(like 160+) on the server.  The last time we did this it took 4
days for the vacuum to finish.  We believe we have this fixed by
using the pg_autovacuum daemon.

Our set up is:
DB Server - Dual Xeon 3.06GHz, 1GB RAM, SCSI RAID Ultra320 drives.
DBMail Server - P4 3.06GHz, 1GB RAM, SATA drives.
Running: dbmail-2.0.4, postgresql-7.4.7 on Gentoo with Linux 2.6
kernel

As far as tweaking goes, I've set sort_mem and vacuum_mem on postgres
to each 100M (102400) to help try and stop swaps.  I've also increased
the shared memory limit from 32M to 100M.

I'd also be very interested in knowing a better way to limit the
database transaction logs such that should I suffer a crash I'm not
having to dump the database and restore.  I never really had this
issue with MSSQL.  I expect to lose things like the message that is
being delivered, but not corrupt the dbmail_users table and everything
else.

HELP... TIA

--
David A. Niblett               | email: [EMAIL PROTECTED]
Network Administrator          | Phone: (352) 334-3400
Gainesville Regional Utilities | Web: http://www.gru.net/

_______________________________________________
Dbmail mailing list
Dbmail@dbmail.org
https://mailman.fastxs.nl/mailman/listinfo/dbmail

