Re: Airline reservations get tagged

Paul Boven Tue, 27 Jun 2006 07:49:03 -0700

Hi Ralf,

Ralf Hildebrandt wrote:

>> Although our SA setup works very well in general, one issue that has come up a few times recently is airline E-tickets/reservations. These tend to be ALL CAPS and have quite a few other trigger words. Our company seems to do business with more than one travel agent, so just whitelisting isn't quite enough. These mails hit the following rules:
>> X-Spam-Score: ***** (5.696) BAYES_99,HTML_30_40,HTML_MESSAGE,NO_REAL_NAME,
>>  SARE_OBFU_TBL_03,UPPERCASE_50_75,autolearn=no
>
> You could feed these to the bayes DB as "ham"

You are right, of course. But Bayes is more of a statistical tool, and given the total number of mails stored in Bayes already, I fear it will take quite a bit of learning to offset the current high scoring.
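
To put a rough number on that worry, here is a minimal sketch in Python. It assumes a simplified frequency-ratio estimate of a token's spam probability, not SpamAssassin's exact formula, and the message counts are made up for illustration; they are not taken from our database. The function names are hypothetical.

```python
def token_spam_prob(spam_hits, ham_hits, n_spam, n_ham):
    """Simplified estimate of P(spam | token) from message counts.

    spam_hits / ham_hits: learned spam / ham messages containing the token.
    n_spam / n_ham: total learned spam / ham messages.
    """
    spam_freq = spam_hits / n_spam
    ham_freq = ham_hits / n_ham
    return spam_freq / (spam_freq + ham_freq)


def ham_needed(spam_hits, ham_hits, n_spam, n_ham, target=0.5):
    """Count how many extra ham messages containing the token must be
    learned before the token's estimate drops to `target` or below."""
    learned = 0
    while token_spam_prob(spam_hits, ham_hits + learned,
                          n_spam, n_ham + learned) > target:
        learned += 1
    return learned


# Hypothetical database: 10000 spam and 10000 ham learned; the token
# appears in 400 of the spam messages and in 1 ham message.
print(token_spam_prob(400, 1, 10000, 10000))
print(ham_needed(400, 1, 10000, 10000))
```

With these made-up counts the token starts out above 0.99, and it takes more than four hundred ham messages containing it just to bring it down to neutral, which is far more E-tickets than we receive.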


Our current Bayes setup is:
Company-wide Bayes database
bayes_auto_learn 1
bayes_auto_learn_threshold_nonspam -0.1
score BAYES_99 5.0
score BAYES_95 4.0

Perhaps I should lower my BAYES_99 and BAYES_95 a bit, though these settings are based on past experience where Bayes alone was not able to put clearly spammy mails over the threshold.

These E-tickets just look terribly spammy to Bayes because of the language used, it seems. Some high-scoring words for this one are:


bayes token 'visa' => 0.997839158297152
bayes token 'refund' => 0.997646909307943
bayes token 'drinks' => 0.997585038685398
bayes token 'NUMBER' => 0.990398319296953
bayes token 'nights' => 0.98853871069642
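
To illustrate why a handful of such tokens is enough to push a message into BAYES_99 territory, here is a small Python sketch. It uses the plain naive-Bayes product rule as a stand-in; as far as I know SpamAssassin's own combiner is chi-squared based, so the exact result will differ, but the effect is similar. The function name `combine` is my own.

```python
import math

# Token probabilities as reported above for this E-ticket.
tokens = {
    "visa": 0.997839158297152,
    "refund": 0.997646909307943,
    "drinks": 0.997585038685398,
    "NUMBER": 0.990398319296953,
    "nights": 0.98853871069642,
}


def combine(probs):
    """Naive-Bayes combination of per-token spam probabilities.

    This is an illustrative stand-in, not SpamAssassin's combiner.
    """
    spam = math.prod(probs)
    ham = math.prod(1.0 - p for p in probs)
    return spam / (spam + ham)


combined = combine(list(tokens.values()))
print(combined)
print(combined >= 0.99)  # would fall in the BAYES_99 range
```

Under this simplified rule, the five tokens alone already give a combined probability well above 0.99, so a few neutral words elsewhere in the mail will not rescue it.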

Regards, Paul Boven.
