On Thu, 25 Jul 2024, Daniel K. via mailop wrote:
We feed received DMARC reports through Open-Report-Parser and visualize
with Open DMARC Analyzer.
Sometimes the ingestion step fails, because we receive aggregate DMARC
reports with invalid contents. Particularly from senders that seem to
have just started sending DMARC reports.
RFC 7489 says:
The aggregate data MUST be an XML file that
SHOULD be subjected to GZIP compression.
The filename format is then specified, and the extension again implies
that gzip is the only (optional) compression method.
extension = "xml" / "xml.gz"
The extension MUST be "xml" for a plain XML file,
or "xml.gz" for an XML file compressed using GZIP.
However, some send zip compressed files, with a ".zip" extension, one
notable sender being google.com.
I have also seen zip files declared as application/gzip, and gzip files
declared as application/zip. They are few and far between, but cause
annoying failures.
Regarding the first type of mis-declared content-types, I've resorted to
using a Sieve script to fix up the content-type on a case-by-case basis
while waiting for the sender to fix their process, but would it perhaps
be better if the processing tool did its own content-type sniffing?
Could this type of sniffing have or cause other problems?
AIUI the main problem with file type sniffing is that if the result is
different from what the (human) reader expected from the .suffix, they
might invoke an unexpected and unsafe parser (e.g. running a bash
script called image.png).
As long as your gunzip safely handles an invalid or malicious gzip file,
I see no danger in passing it an attachment with a .zip suffix.
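One way to read "handles an invalid or malicious gzip file" is that the
decompressor rejects non-gzip input cleanly and never expands a report
beyond a fixed size. The following is only a sketch of that idea in
Python; the function name gunzip_bounded and the 50 MiB cap are
assumptions for illustration, not part of any DMARC tool.

```python
import gzip
import io
import zlib

# Hypothetical cap; pick whatever is sensible for your report volume.
MAX_REPORT_BYTES = 50 * 1024 * 1024


def gunzip_bounded(data: bytes, limit: int = MAX_REPORT_BYTES) -> bytes:
    """Decompress gzip data, refusing invalid input and oversized output."""
    chunks = []
    total = 0
    try:
        with gzip.GzipFile(fileobj=io.BytesIO(data)) as stream:
            while True:
                chunk = stream.read(64 * 1024)
                if not chunk:
                    break
                total += len(chunk)
                if total > limit:
                    # Stop early instead of inflating a decompression bomb.
                    raise ValueError("decompressed report exceeds size limit")
                chunks.append(chunk)
    except (OSError, EOFError, zlib.error) as exc:
        # BadGzipFile is an OSError; truncated streams raise EOFError.
        raise ValueError(f"not a valid gzip stream: {exc}") from exc
    return b"".join(chunks)
```

Every failure surfaces as a single ValueError, so the caller can log the
message and skip the attachment without caring which way it was broken.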
I'm thinking about improving the ingestion step by adding Content-Type
sniffing, but am wondering if it should be done at all, or if it will
be considered an anti-feature.
Are you trying to throw out bad input or fix broken input?
As long as your decompressors all fail safely with broken and malicious
input I would use the input, rather than the metadata, to determine
which decompressor to use (but using metadata might be quicker).
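Choosing the decompressor from the input itself could look roughly like
the sketch below, which checks the leading magic bytes (1f 8b for gzip,
"PK\x03\x04" for a zip local file header) and ignores the declared
Content-Type and suffix. The function names are made up for this
example, and a real ingester would also want an output size limit.

```python
import gzip
import io
import zipfile

GZIP_MAGIC = b"\x1f\x8b"
ZIP_MAGIC = b"PK\x03\x04"
UTF8_BOM = b"\xef\xbb\xbf"


def sniff_report_format(data: bytes) -> str:
    """Guess the container format from the first bytes of the attachment."""
    if data.startswith(GZIP_MAGIC):
        return "gzip"
    if data.startswith(ZIP_MAGIC):
        return "zip"
    body = data[len(UTF8_BOM):] if data.startswith(UTF8_BOM) else data
    if body.lstrip().startswith(b"<"):
        return "xml"
    return "unknown"


def extract_report_xml(data: bytes) -> bytes:
    """Return the XML bytes, whatever the metadata claimed."""
    kind = sniff_report_format(data)
    if kind == "gzip":
        return gzip.decompress(data)
    if kind == "zip":
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = archive.namelist()
            # Assumption: a report archive holds exactly one XML member.
            if len(names) != 1:
                raise ValueError(f"expected one file in zip, found {len(names)}")
            return archive.read(names[0])
    if kind == "xml":
        return data
    raise ValueError("attachment is not gzip, zip, or XML")
```

With this, a zip file declared as application/gzip, or the reverse, ends
up at the right decompressor, and anything unrecognized is rejected
rather than guessed at.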
--
Andrew C. Aitchison Kendal, UK
and...@aitchison.me.uk