Hi monochromec. I have a similar installation and was also having various
problems and delays. I'm on a similarly sized system (Ubuntu 24.04 with
venv installations of mailman3; 32GB memory).

What I found was that tuning resources had a very significant impact.
The out-of-the-box tuning parameters for the various components were
not sufficient to keep mail flowing.

The "system info" tab in the web interface tells you where things are held
up, if they are. I was having trouble with the 'out' queue getting slow,
and the postfix 'mailq' was showing thousands of messages waiting to be
delivered. I was running out of smtpd processes. Archiving was delayed
(though not as much as yours).
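
If you want numbers rather than the web view, here is a minimal sketch that counts the files sitting in each Mailman queue directory and reports the age of the oldest one. The function name `queue_backlog` is made up for illustration, and the queue root is an assumption taken from the path in your message (a venv install may keep its var directory elsewhere). It only looks at file modification times; it does not unpickle anything.

```python
import time
from pathlib import Path


def queue_backlog(queue_root, now=None):
    """Return {queue_name: (file_count, oldest_age_seconds)} per queue subdirectory.

    Hypothetical helper, not part of Mailman. Age is derived from file mtime.
    """
    now = time.time() if now is None else now
    stats = {}
    root = Path(queue_root)
    if not root.is_dir():
        return stats
    for qdir in sorted(p for p in root.iterdir() if p.is_dir()):
        ages = []
        for entry in qdir.iterdir():
            try:
                if entry.is_file():
                    ages.append(now - entry.stat().st_mtime)
            except FileNotFoundError:
                # A runner may consume the file between listing and stat().
                continue
        stats[qdir.name] = (len(ages), max(ages, default=0.0))
    return stats


if __name__ == "__main__":
    # Assumed path, copied from the quoted message; adjust for your install.
    for name, (count, oldest) in queue_backlog("/var/lib/mailman3/queue").items():
        print(f"{name:12s} {count:6d} files, oldest {oldest / 3600:.1f} h")
```

Run it as a user that can read the queue directories. A queue whose file count or oldest age keeps growing between runs is the one to look at first.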

Some of the things I did:
- Increase PostgreSQL resource limits. Some of the parameters in my
configuration files:
# Tuning, via https://pgtune.leopard.in.ua/ for 'data warehouse' app.
max_connections = 100 # default
shared_buffers = 8GB
effective_cache_size = 24GB
maintenance_work_mem = 2GB
# wal_buffers = 16MB
default_statistics_target = 500
effective_io_concurrency = 2
work_mem = 26214kB
min_wal_size = 4GB
max_wal_size = 16GB
max_parallel_workers_per_gather = 4
max_parallel_maintenance_workers = 4

- Increase the number of mailman processes in mailman.cfg
# Note: Values need to be a power of 2. Make sure PostgreSQL allows more
# connections than the number of instances.
[runner.in]
class: mailman.runners.incoming.IncomingRunner
instances: 16
# instances: 8

[runner.out]
class: mailman.runners.outgoing.OutgoingRunner
instances: 64

The "out" setting, notably, decreased mail delivery times.

- Increase postfix resources. Notably in main.cf:
# https://www.postfix.org/TUNING_README.html
# default_process_limit = 100 # default
default_process_limit = 250
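
To double-check the note about runner instances and PostgreSQL connections, something like the following rough sketch may help. It reads the `[runner.*]` sections from mailman.cfg text with Python's standard `configparser`, checks that each `instances` value is a power of 2, and compares the total against a `max_connections` value you pass in. The names `runner_instances` and `check_runners` are made up for this example. It is a conservative check under two assumptions: it only sees runners that appear in the text you give it (not built-in defaults), and it does not account for connections used by HyperKitty or Postorius.

```python
import configparser

# Sample text mirroring the runner settings shown above.
SAMPLE_CFG = """\
[runner.in]
class: mailman.runners.incoming.IncomingRunner
instances: 16
# instances: 8

[runner.out]
class: mailman.runners.outgoing.OutgoingRunner
instances: 64
"""


def runner_instances(cfg_text):
    """Return {runner_name: instances} for every [runner.*] section that sets it."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(cfg_text)
    counts = {}
    for section in parser.sections():
        if section.startswith("runner.") and parser.has_option(section, "instances"):
            counts[section[len("runner."):]] = parser.getint(section, "instances")
    return counts


def is_power_of_two(n):
    return n > 0 and n & (n - 1) == 0


def check_runners(cfg_text, max_connections):
    """Return (total_instances, list_of_problem_strings)."""
    counts = runner_instances(cfg_text)
    problems = [
        f"runner.{name}: instances={n} is not a power of 2"
        for name, n in counts.items()
        if not is_power_of_two(n)
    ]
    total = sum(counts.values())
    if total >= max_connections:
        problems.append(
            f"{total} runner instances but max_connections={max_connections}"
        )
    return total, problems


if __name__ == "__main__":
    total, problems = check_runners(SAMPLE_CFG, max_connections=100)
    print("total runner instances:", total)
    for line in problems:
        print("WARNING:", line)
```

With the values above the total is 80, which fits under max_connections = 100 but does not leave much room for the web components, so it may be worth raising that limit as well.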

Maybe these will point you in the right direction. I realize you are
focused on hyperkitty, but my experience was that all the major components
(database, MTA and mailman3) needed more resources.
  ~ Greg


On Sat, Dec 28, 2024 at 7:26 AM monochromec via Mailman-users <
mailman-users@mailman3.org> wrote:

> Following the root cause analysis which Tobias started all those weeks ago
> we (the admin team behind the installation) are still struggling with the
> following phenomenon: messages on average take more than 24 hours to be
> processed (more precisely, the average lifetime of a pickled message object
> in `/var/lib/mailman3/queue/pipeline` clocks in at around 26 hours).
>
> Couple of stats of the installation: standard installation from Bookworm
> OS repos, Hyperkitty as archiver and Postorius as web frontend as explained
> above, running Python 3.11.2 from the standard systemd service as packaged
> with the Bookworm deb file. All backends (Core + Hyperkitty) are supported
> by Postgres version 15+248. The MTA is a standard Postfix installation,
> again from OS packages.
>
> The underlying VM has 7 cores with just under 24 GB of main memory. This
> production instance is handling fewer than 130 mailing lists (MLs) with an
> average of fewer than 10 postings per day per ML. CPU core utilisation
> hovers around 50% with the lion's share allocated to the four pipeline
> runners as part of the MM configuration.
>
> OS resource utilisation is well below bounds (approx. 8 GB of main memory
> allocated to running processes), with plenty of available socket space (I
> noticed some transient `Connection lost during _handle_client()` warnings
> in the logs so I checked that the SMTP runner can connect to Postfix for
> delivering the messages after processing by checking the socket allocation
> of the running processes).
>
> Cursory review of the corresponding Core classes (runner + pipeline
> implementation in addition to `posting_pipeline`) didn't reveal any further
> pointers. What I did notice, though, is that increasing the logging levels of
> the components (namely `smtp`, `pipeline` and `lmtp` to `debug`) in
> `/etc/mailman3/mailman.cfg` didn't add any useful information to the logs
> as configured after restarting the Core.
>
> As outlined above, Hyperkitty doesn't seem to do a check based on ML and
> message ID before archiving a message in the database. But this only adds a
> REST roundtrip and Postgres communication through Hyperkitty's underlying
> Django framework to the overall system load and the driving UWSGI instance
> is well within CPU cycle bounds.
>
> Any pointers are appreciated - more than happy to provide more info if
> required.
> _______________________________________________
> Mailman-users mailing list -- mailman-users@mailman3.org
> To unsubscribe send an email to mailman-users-le...@mailman3.org
> https://lists.mailman3.org/mailman3/lists/mailman-users.mailman3.org/
> Archived at:
> https://lists.mailman3.org/archives/list/mailman-users@mailman3.org/message/OTY7D5VAGWAIMWERK75RLPVJE7XW75C5/
>
> This message sent to gbne...@petascale.org
>
Archived at: 
https://lists.mailman3.org/archives/list/mailman-users@mailman3.org/message/AHR3QWFHGPWEHCMMT2QQ2FHNNJNBPCIL/
