Hi,

> However, many of the tokens in even Forbes and WP newsletters may occur in
> different spammy newsletters, so be careful when training even these.
>
This is exactly what I was thinking. When going through the quarantine, it's very difficult not only to identify which newsletters may have been miscategorized or trained incorrectly, but also to correct an improperly trained newsletter (or email in general) afterwards.

> If you get the score down enough not to be classified as spam, you've won
> and should not continue (unless you are willing to check all BAYES_0 mail
> for suspicious newsletters and train those as spam, seeing how much it
> affects the mentioned Forbes and WP newsletters).

Too bad it isn't possible to build a shared list of trusted newsletters/senders to compensate for these mistakes.

On a related note, how about emails with only an image attachment? People use email to send pictures, screenshots and the like, with nothing in the body and sometimes not even a subject, and these aren't spam. The ones I see in the quarantine are almost always ham, and despite training them as ham (even with --max-size 0), they continue to be tagged as spam. I've also always had difficulty marking them so that DCC ignores them.