Hi,

I am trying to diagnose why certain rules do not fire as expected at the beginning of lines. Here is a minimal working example (MWE) e-mail:

"""
From: f...@addr.com
To: t...@addr.com
Subject: email's subject

Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

To: Aa
To: Bb
To: Cc

To: Dd

To: Ee
"""

Here are the rules to test:

# tests for SA list
body        T_BODY_TO_NOMULTI       /(^|\n|\r)To: \S/i   # should hit "To: " at start of line
tflags      T_BODY_TO_NOMULTI       multiple maxhits=3
body        T_BODY_TO_MULTI         /(^|\n|\r)To: \S/im  # I am unsure how multiline interacts here
tflags      T_BODY_TO_MULTI         multiple maxhits=3
rawbody     T_RAWBODY_TO_NOMULTI    /(^|\n|\r)To: \S/i   # I am unsure how rawbody changes things
tflags      T_RAWBODY_TO_NOMULTI    multiple maxhits=3
rawbody     T_RAWBODY_TO_MULTI      /^To: \S/im
tflags      T_RAWBODY_TO_MULTI      multiple maxhits=3
body        T_BODY_TO_NOCARET       /To: \S/i
tflags      T_BODY_TO_NOCARET       multiple maxhits=3   # should (and does) hit

The purpose of these rules is to detect "To:" at the start of a line in the body, in order to check whether the e-mail might be a reply. When run on the e-mail above, I expected the following hits for every rule:

To: A
To: B
To: C

Here are the actual results for all rules:

spamassassin -D 2>&1 < t2.eml | grep BODY_TO
jan 29 10:56:14.310 [11274] dbg: rules: ran body rule T_BODY_TO_NOMULTI ======> got hit: "To: A"
jan 29 10:56:14.310 [11274] dbg: rules: ran body rule T_BODY_TO_NOMULTI ======> got hit: "To: D" #should be B
jan 29 10:56:14.310 [11274] dbg: rules: ran body rule T_BODY_TO_NOMULTI ======> got hit: "To: E"
jan 29 10:56:14.335 [11274] dbg: rules: ran body rule T_BODY_TO_MULTI ======> got hit: "To: A"
jan 29 10:56:14.335 [11274] dbg: rules: ran body rule T_BODY_TO_MULTI ======> got hit: "To: D" #should be B
jan 29 10:56:14.335 [11274] dbg: rules: ran body rule T_BODY_TO_MULTI ======> got hit: "To: E"
jan 29 10:56:14.345 [11274] dbg: rules: ran body rule T_BODY_TO_NOCARET ======> got hit: "To: A"
jan 29 10:56:14.345 [11274] dbg: rules: ran body rule T_BODY_TO_NOCARET ======> got hit: "To: B" #as expected, but does not test beginning of line
jan 29 10:56:14.345 [11274] dbg: rules: ran body rule T_BODY_TO_NOCARET ======> got hit: "To: C"
jan 29 10:56:14.545 [11274] dbg: rules: ran rawbody rule T_RAWBODY_TO_MULTI ======> got hit: "To: A"
jan 29 10:56:14.546 [11274] dbg: rules: ran rawbody rule T_RAWBODY_TO_MULTI ======> got hit: "To: B" #as expected
jan 29 10:56:14.546 [11274] dbg: rules: ran rawbody rule T_RAWBODY_TO_MULTI ======> got hit: "To: C"
jan 29 10:56:14.550 [11274] dbg: rules: ran rawbody rule T_RAWBODY_TO_NOMULTI ======> got hit: " #The other (closing) quote is just gone! Where did it go? What is matched?
jan 29 10:56:14.550 [11274] dbg: rules: ran rawbody rule T_RAWBODY_TO_NOMULTI ======> got hit: "
jan 29 10:56:14.550 [11274] dbg: rules: ran rawbody rule T_RAWBODY_TO_NOMULTI ======> got hit: "
[...]

My main question is why neither T_BODY_TO_NOMULTI nor T_BODY_TO_MULTI hits as expected. There appears to be some interaction with the previous line that I do not understand. Am I interpreting (^|\n|\r) incorrectly? Is there any reason to search for \n or \r instead of ^? Is there a way to match on a newline with "body" instead of "rawbody"?
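
To make the question concrete, here is a minimal sketch in Python (not SpamAssassin's own Perl code) of one hypothesis that would produce the A/D/E pattern. It assumes, as the documentation for "body" suggests but which I have not verified in the source, that body rules see each paragraph with its line breaks collapsed into spaces, while rawbody rules see the text line by line. The names raw_body, hits, paragraph_hits and line_hits are made up for this illustration.

```python
import re

# Stand-in for the "To:" part of the test message body above.
raw_body = "To: Aa\nTo: Bb\nTo: Cc\n\nTo: Dd\n\nTo: Ee\n"

# Same pattern as T_BODY_TO_NOMULTI: case-insensitive, no multiline flag,
# so ^ only matches at the very start of each string that is tested.
pattern = re.compile(r"(^|\n|\r)To: \S", re.IGNORECASE)

def hits(chunks, max_hits=3):
    """Collect matches chunk by chunk, stopping at max_hits (like maxhits=3)."""
    found = []
    for chunk in chunks:
        for m in pattern.finditer(chunk):
            found.append(m.group(0).strip())
            if len(found) == max_hits:
                return found
    return found

# Assumption: split on blank lines, then collapse each paragraph to one line.
paragraphs = [" ".join(p.split())
              for p in re.split(r"\n\s*\n", raw_body) if p.strip()]
paragraph_hits = hits(paragraphs)

# Assumption: the text is tested one line at a time.
line_hits = hits(raw_body.splitlines())

print(paragraph_hits)  # ['To: A', 'To: D', 'To: E']
print(line_hits)       # ['To: A', 'To: B', 'To: C']
```

If this model is right, there is no \n or \r left inside a "body" paragraph for (^|\n|\r) to match, and ^ only anchors at the start of each paragraph, which would explain why Bb and Cc are skipped. I would appreciate confirmation or correction.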

Using:
SpamAssassin version 3.4.1
  running on Perl version 5.14.2
on Ubuntu 12.04

Thanks in advance

-Olivier
