Spam detection software, running on the system "smtp2.ece.ubc.ca",
has identified this incoming email as possible spam. The original
message has been attached to this so you can view it or label
similar future email. If you have any questions, see
i...@ece.ubc.ca for details.
Content preview: I'm seeing very occasional "Insufficient system storage"
errors
in my /var/log/mail.log, but it's not the usual lack of disk space. Here
is an excerpt from /var/log/mail.log: Oct 14 20:07:10 smtp2
postfix/smtpd[473668]:
connect from m144-mta1-lv.wordfly.com[52.124.61.163] Oct 14 20:07:13 smtp2
postfix/smtpd[473668]: 01F6F1FE18:
client=m144-mta1-lv.wordfly.com[52.124.61.163
[...]
Content analysis details: (6.2 points, 6.0 required)
pts rule name description
---- ---------------------- --------------------------------------------------
2.0 PDS_OTHER_BAD_TLD Untrustworthy TLDs
[URI: galasone.top (top)]
-0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from
author's domain
0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily
valid
-0.1 DKIM_VALID Message has at least one valid DKIM or DK signature
2.5 URIBL_DBL_SPAM Contains a spam URL listed in the Spamhaus DBL
blocklist
[URIs: galasone.top]
0.1 URIBL_CSS_A Contains URL's A record listed in the Spamhaus CSS
blocklist
[URIs: pmta205.journal-oncology.info]
1.7 URIBL_BLACK Contains an URL listed in the URIBL blacklist
[URIs: journal-oncology.info]
0.0 UNPARSEABLE_RELAY Informational: message has unparseable relay
lines
--- Begin Message ---
I'm seeing very occasional "Insufficient system storage" errors in my
/var/log/mail.log, but it's not the usual lack of disk space. Here is an
excerpt from /var/log/mail.log:
Oct 14 20:07:10 smtp2 postfix/smtpd[473668]: connect from
m144-mta1-lv.wordfly.com[52.124.61.163]
Oct 14 20:07:13 smtp2 postfix/smtpd[473668]: 01F6F1FE18:
client=m144-mta1-lv.wordfly.com[52.124.61.163]
Oct 14 20:07:13 smtp2 opendkim[28076]: 01F6F1FE18: s=wordfly02 d=wordfly.com
a=rsa-sha256 SSL
Oct 14 20:07:16 smtp2 postfix/cleanup[473260]: 45CF420090:
message-id=<80fc24fc-0e93-11eb-bca1-redac...@facebookmail.com>
Oct 14 20:07:16 smtp2 opendkim[28076]: 45CF420090: s=s1024-2013-q3
d�cebookmail.com a=rsa-sha256 SSL
Oct 14 20:07:25 smtp2 postfix/smtpd[473668]: disconnect from
m144-mta1-lv.wordfly.com[52.124.61.163] ehlo=2 starttls=1 mail=1 rcpt=1 data=1
quit=1 commands=7
Oct 14 20:07:26 smtp2 postfix/local[474024]: 45CF420090:
to=<moredac...@ece.ubc.ca>, relay=local, delay, delays/0/0/0.01, dsn=2.0.0,
status=sent (forwarded as 541A820093)
Oct 14 20:07:30 smtp2 postfix/smtpd[473637]: connect from
pmta205.journal-oncology.info[191.101.211.207]
Oct 14 20:07:30 smtp2 postfix/smtpd[473668]: connect from
m144-mta1-lv.wordfly.com[52.124.61.163]
Oct 14 20:07:30 smtp2 postfix/smtpd[473668]: NOQUEUE: reject: MAIL from
m144-mta1-lv.wordfly.com[52.124.61.163]: 452 4.3.1 Insufficient system storage;
proto=ESMTP helo=<m144-mta1-lv.wordfly.com>
Oct 14 20:07:30 smtp2 postfix/smtpd[473668]: warning: not enough free space in
mail queue: 0 bytes < 1.5*message size limit
Oct 14 20:07:30 smtp2 postfix/smtpd[473637]: NOQUEUE: reject: MAIL from
pmta205.journal-oncology.info[191.101.211.207]: 452 4.3.1 Insufficient system
storage; proto=ESMTP helo=<pmta205.journal-oncology.info>
Oct 14 20:07:30 smtp2 postfix/smtpd[473637]: warning: not enough free space in
mail queue: 0 bytes < 1.5*message size limit
Oct 14 20:07:30 smtp2 postfix/cleanup[473260]: 9F3F120331:
message-id=<20201015030730.9f3f120...@smtp2.ece.ubc.ca>
Oct 14 20:07:30 smtp2 postfix/cleanup[473260]: warning: 9F3F120331: write queue
file: No space left on device
Oct 14 20:07:30 smtp2 postfix/smtpd[473637]: disconnect from
pmta205.journal-oncology.info[191.101.211.207] ehlo=2 starttls=1 mail=0/1
quit=1 commands=4/5
Oct 14 20:07:35 smtp2 postfix/qmgr[407832]: 501A91FB53:
from=<double-bou...@ece.ubc.ca>, size02, nrcpt=1 (queue active)
Oct 14 20:07:35 smtp2 postfix/smtpd[473668]: disconnect from
m144-mta1-lv.wordfly.com[52.124.61.163] ehlo=2 starttls=1 mail=0/1 quit=1
commands=4/5
Oct 14 20:07:36 smtp2 postfix/smtp[474095]: 541A820093:
to=<mnredac...@gmail.com>, orig_to=<moredac...@ece.ubc.ca>, relay=none, delay,
delays=0/0.02/10/0, dsn=4.4.3, status�ferred (Host or domain name not found.
Name service error for name=gmail.com type=MX: Host not found, try again)
Oct 14 20:07:45 smtp2 postfix/local[473993]: 01F6F1FE18:
to=<heredac...@ece.ubc.ca>, relay=local, delay6, delays=9.8/0/0/26, dsn=2.0.0,
status=sent (delivered to command: /usr/bin/procmail -p)
Oct 14 20:07:45 smtp2 postfix/smtpd[473637]: connect from
unknown[45.193.210.121]
Oct 14 20:07:46 smtp2 postfix/smtpd[473637]: NOQUEUE: reject: RCPT from
unknown[45.193.210.121]: 450 4.7.25 Client host rejected: cannot find your
hostname, [45.193.210.121];
from=<bounce-108980-1597160-105716-redac...@galasone.com>
to=<cgredac...@ece.ubc.ca> proto=ESMTP helo=<galasone.top>
Oct 14 20:07:46 smtp2 postfix/smtpd[473637]: disconnect from
unknown[45.193.210.121] ehlo=1 mail=1 rcpt=0/1 data=0/1 quit=1 commands=3/5
Observations
===========
0. Postfix 3.4.13 on Ubuntu 20.04 (Linux kernel 5.4.0-48-generic) in a vSphere
virtual machine. Also previously observed on Ubuntu 18.04, but not 16.04.
1. This happens very infrequently, but seemingly randomly. For a rough sense
of scale, I count 17500 instances of "delivered" in mail.log, but only 16
instances of "Insufficient system storage" in the same mail.log.
2. When it happens, it may happen to more than one process at around the same
time. The excerpt above shows two such complaints from two different smtpd
processes, which is too tight of a cluster to be a coincidence, given the
rarity. The instance before that was at Oct 14 14:50:30. And it has been fine
for at least two hours since then.
3. Sometimes, these errors result in a double-bounce notification to
postmaster. Here is what the double-bounce notification generated at 20:07:35
says:
===== BEGIN double-bounce BODY 1 ====Transcript of session follows.
Out: 220 mx.ece.ubc.ca ESMTP Postfix
In: EHLO m144-mta1-lv.wordfly.com
Out: 250-smtp2.ece.ubc.ca
Out: 250-PIPELINING
Out: 250-SIZE 52428800
Out: 250-ETRN
Out: 250-STARTTLS
Out: 250-ENHANCEDSTATUSCODES
Out: 250-8BITMIME
Out: 250-DSN
Out: 250-SMTPUTF8
Out: 250 CHUNKING
In: STARTTLS
Out: 220 2.0.0 Ready to start TLS
In: EHLO m144-mta1-lv.wordfly.com
Out: 250-smtp2.ece.ubc.ca
Out: 250-PIPELINING
Out: 250-SIZE 52428800
Out: 250-ETRN
Out: 250-AUTH PLAIN LOGIN
Out: 250-AUTH=PLAIN LOGIN
Out: 250-ENHANCEDSTATUSCODES
Out: 250-8BITMIME
Out: 250-DSN
Out: 250-SMTPUTF8
Out: 250 CHUNKING
In: MAIL FROM:<redac...@return.wordfly.com>
Out: 452 4.3.1 Insufficient system storage
In: QUIT
Out: 221 2.0.0 Bye
For other details, see the local mail logfile
===== END double-bounce BODY 1 ====
But it can also happen at other phases of the SMTP transaction:
===== BEGIN double-bounce BODY 2 ====Transcript of session follows.
Out: 220 mx.ece.ubc.ca ESMTP Postfix
In: HELO lampe16.ece.ubc.ca
Out: 250 smtp2.ece.ubc.ca
In: MAIL FROM: <cfeng...@lampe16.ece.ubc.ca>
Out: 452 4.3.1 Insufficient system storage
Session aborted, reason: lost connection
For other details, see the local mail logfile
===== BEGIN double-bounce BODY 2 ====
What I have tried
================
root@smtp2:~# df -h /var
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg-var 7.8G 4.7G 2.7G 64% /var
root@smtp2:~# df -i /var
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/mapper/vg-var 518144 7638 510506 2% /var
root@smtp2:~# postconf message_size_limit queue_minfree
message_size_limit = 52428800
queue_minfree = 0
I've turned on debug_peer_list, and managed to get some verbose logging for the
lampe16.ece.ubc.ca example above:
Oct 14 13:03:00 smtp2 postfix/smtpd[433514]: ...[SNIP]...
Oct 14 13:03:00 smtp2 postfix/smtpd[433514]: milter8_mail_event: milter
unix:/clamav/clamav-milter.ctl: mail <cfeng...@lampe16.ece.ubc.ca>
Oct 14 13:03:00 smtp2 postfix/smtpd[433514]: event: SMFIC_MAIL; macros:
{mail_addr}�eng...@lampe16.ece.ubc.ca {mail_host}=smtp2.ece.ubc.ca
{mail_mailer}=local
Oct 14 13:03:00 smtp2 postfix/smtpd[433514]: reply: SMFIR_CONTINUE data 0 bytes
Oct 14 13:03:00 smtp2 postfix/smtpd[433514]: smtpd_check_rewrite: trying:
permit_inet_interfaces
Oct 14 13:03:00 smtp2 postfix/smtpd[433514]: permit_inet_interfaces:
lampe16.ece.ubc.ca 137.82.57.92
Oct 14 13:03:00 smtp2 postfix/smtpd[433514]: fsspace: .: block size 4096,
blocks free 0
Oct 14 13:03:00 smtp2 postfix/smtpd[433514]: smtpd_check_queue: blocks 4096
avail 0 min_free 0 msg_size_limit 52428800
Oct 14 13:03:00 smtp2 postfix/smtpd[433514]: NOQUEUE: reject: MAIL from
lampe16.ece.ubc.ca[137.82.57.92]: 452 4.3.1 Insufficient system storage;
proto=SMTP helo=<lampe16.ece.ubc.ca>
Oct 14 13:03:00 smtp2 postfix/smtpd[433514]: warning: not enough free space in
mail queue: 0 bytes < 1.5*message size limit
Oct 14 13:03:00 smtp2 postfix/smtpd[433514]: abort all milters
Oct 14 13:03:00 smtp2 postfix/smtpd[433514]: milter8_abort: abort milter
inet:localhost:8892
Oct 14 13:03:00 smtp2 postfix/smtpd[433514]: milter8_abort: abort milter
unix:/spamass/spamass.sock
Oct 14 13:03:00 smtp2 postfix/smtpd[433514]: milter8_abort: abort milter
unix:/clamav/clamav-milter.ctl
Oct 14 13:03:00 smtp2 postfix/smtpd[433514]: >
lampe16.ece.ubc.ca[137.82.57.92]: 452 4.3.1 Insufficient system storage
Oct 14 13:03:00 smtp2 postfix/smtpd[433514]: watchdog_pat: 0x56190ce2a700
Oct 14 13:03:00 smtp2 postfix/smtpd[433514]: smtp_get: EOF
Oct 14 13:03:00 smtp2 postfix/smtpd[433514]: ...[SNIP]...
The error message is coming from smtpd_check_queue() in
src/smtpd/smtpd_check.c. The disk space check uses fsspace() in
src/util/fsspace.c, which on Linux calls statfs(2). The code confirms that
when it complains of insufficient system storage, it means disk blocks, not
inodes, memory, shared memory, or anything else.
Which filesystem is it testing when it does fsspace(".", &fsbuf)? If I look at
a typical smtpd process, /proc/$pid/cwd is a symlink to /var/spool/postfix, as
I would expect, so it should be testing my /var filesystem. (I don't have
separate filesystems under /var such as /var/log, /var/spool, or anything.)
Perhaps smtpd is doing a sneaky momentary chdir(2) to the wrong place before
calling fsspace()? I grepped the entire Postfix source code, and it's always
either chdir(var_queue_dir), or chdir(var_daemon_dir), or
chdir(var_command_dir), or chdir("/") in a chrooted context — all of which
should be in the same /var filesystem. The exception is chdir(args.cwd) in
src/global/pipe_command.c, but surely that's irrelevant? Anyway, as I noted,
it affects multiple processes simultaneously.
Perhaps there is really something that is momentarily putting huge files in my
/var and immediately deleting them? I monitored that too:
root@smtp2:~# tail -f /var/log/mail.log | while read -r line ; do case "$line"
in *Insufficient*) date ; df --block-size=4K /var ; esac ; done
Wed 14 Oct 2020 08:07:30 PM PDT
Filesystem 4K-blocks Used Available Use% Mounted on
/dev/mapper/vg-var 2020590 1191809 721210 63% /var
Wed 14 Oct 2020 08:07:30 PM PDT
Filesystem 4K-blocks Used Available Use% Mounted on
/dev/mapper/vg-var 2020590 1191809 721210 63% /var
There's nothing interesting in /var/log/messages during the times that this
happens.
Whenever it happens, it's always complaining that there are ZERO bytes free,
not some small number of bytes free. Besides, my /var/log/mail.log is on the
same filesystem, and as far as I can tell, my syslog is working fine.
So, at this point, I'm stumped. Any ideas?
Dara Poon
IT Services, Department of Electrical and Computer Engineering, University of
British Columbia
--- End Message ---