Please keep list emails on the list.

I don't think you could do a simple regex match for what you want. As I said previously, this would require a plugin both to build the custom regex(s) (or DB query) and to search for the previous emails. You would want to keep the prior email information in a database of some sort since doing a search of a large text file for every incoming email would probably be too slow.

Bowie

On 9/28/2016 10:05 AM, Nicola Piazzi wrote:

Flow:

I receive an email with the subject “Federal Express Important invoice number 20”

The plugin runs a regex search over the maillog database for mail from the past 10 days, and the search matches one or more rows

So we know that similar mail was received in the past

It is normal to receive similar text, but it is not so normal to receive the same subject from different addresses directed to different internal users
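
The flow above could be sketched as follows, again in Python for illustration only. The function names, the word-overlap rule (the "3 of 4 words" idea from the original request quoted further down), and the requirement that both the sender domain and the recipient differ are assumptions about the intended behaviour. Previous mail is expected as (subject, sender, recipient) tuples from whatever store holds it.

```python
import re


def words(subject):
    """Lower-cased word tokens of a subject line."""
    return re.findall(r"\w+", subject.lower())


def similar(subject, earlier, minimum=3):
    """True if at least `minimum` distinct words of `subject` occur in `earlier`.

    Subjects with fewer than `minimum` distinct words must match on all of them.
    """
    wanted = set(words(subject))
    if not wanted:
        return False
    shared = wanted & set(words(earlier))
    return len(shared) >= min(minimum, len(wanted))


def domain(address):
    """Part of an address after the last '@', lower-cased."""
    return address.rsplit("@", 1)[-1].lower()


def looks_like_campaign(subject, sender, recipient, previous):
    """True if a similar subject was seen from another domain to another user."""
    for old_subject, old_sender, old_recipient in previous:
        if (similar(subject, old_subject)
                and domain(old_sender) != domain(sender)
                and old_recipient.lower() != recipient.lower()):
            return True
    return False
```

Counting shared words in code avoids having to build one regex per subject. A plugin could turn a positive result into a score, and the threshold of 3 words would need tuning against real mail.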

Nicola Piazzi
CED - Sistemi
COMET s.p.a.
Via Michelino, 105 - 40127 Bologna – Italia
Tel.  +39 051.6079.293
Cell. +39 328.21.73.470
Web: www.gruppocomet.it <http://www.gruppocomet.it/>

*From:* Bowie Bailey [mailto:bowie_bai...@buc.com]
*Sent:* Wednesday, 28 September 2016 16:01
*To:* users@spamassassin.apache.org
*Subject:* Re: R: regular expression needed

I'm still not clear on exactly what you are trying to do, but in order to test anything against previous messages, you will need a custom SA plugin and some sort of database to store the information about previous emails. That is beyond my area of expertise.

If you just need a regex to match something, I'd be happy to help, but I would need a more explicit description of what you are trying to match.

Bowie

On 9/28/2016 9:29 AM, Nicola Piazzi wrote:

    Bowie, yours is a manual way; it works but it is not automated

    Automation would be a plugin that checks for similar words in
    older messages (for example, 3 of 4 words match)

    Then the plugin checks whether the sender domain is different and
    the recipient is different

    *From:* Bowie Bailey [mailto:bowie_bai...@buc.com]
    *Sent:* Wednesday, 28 September 2016 15:26
    *To:* users@spamassassin.apache.org
    *Subject:* Re: regular expression needed

    On 9/28/2016 9:02 AM, Nicola Piazzi wrote:


        We regularly receive spam with subjects like these examples,
        listed in chronological order:



        Subject                                   From               To
        FedEx Shipment 702193383647 Notification  j...@company1.com  s...@mycompany.it
        FedEx Shipment 722566383641 Notification  a...@other.com     a...@mycompany.it
        FedEx Shipment 734563383644 Notification  i...@company1.com  lo...@mycompany.it
        A package for you jim                     b...@cocacola.com  j...@mycompany.it
        A package for you sue                     j...@buster.com    s...@mycompany.it

        These come from viruses that infect different PCs around the
        world, which then send the same spam

        I want to write a plugin that tests each email and gives a
        penalty to these mails

        Detection routine:

        A mail arrives.

        The subject is: FedEx Shipment 702193383647 Notification

        I search the maillog table with a regex that matches "FedEx
        Shipment 702193383647 Notification" and also matches "FedEx
        Shipment 722566383641 Notification" and "FedEx Shipment
        734563383644 Notification".

        If it matches, I verify that the FROM DOMAIN IS DIFFERENT,
        and then I verify that the TO ADDRESS IS DIFFERENT.

        Now I need a regex syntax that takes all the words extracted
        from the phrase "FedEx Shipment 734563383644 Notification" and
        matches if it finds at least 3 of the 4 words.

        Can someone help?


    I don't follow exactly what you are trying to do in the
    description above, but for that problem, I would start with
    something like this:

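    # Sender address is at fedex.com (anchored so look-alike domains don't match)
    header  __FEDEX_ADDR  From:addr =~ /\@fedex\.com$/i
    header  __FEDEX_SUBJ  Subject =~ /FedEx Shipment/
    # Subject looks like FedEx, but the sender is not at fedex.com
    meta    FEDEX_SPAM    __FEDEX_SUBJ && !__FEDEX_ADDR
    score   FEDEX_SPAM    2.0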

    (Off the top of my head and completely untested.  Adjust score as
    required.)

    This will hit any email with "FedEx Shipment" in the subject that
    doesn't come from fedex.com.  Note that it will also hit on any
    legitimate FedEx emails that have been forwarded.  You could
    minimize this by constraining the subject match to be at the
    beginning of the line (/^FedEx Shipment/).  This may or may not
    have an effect on spam detection.  You could also do a test for
    non-FedEx URLs in the body rather than looking at the sender.

    You could use a simple subject line test for the "A package for
    you" emails, unless you know of a valid delivery service that uses
    that phrase.

-- Bowie

