Hello,

Your message is a few months old, but I see no answer to it, and I stumbled upon it while writing an enhanced version of the normalize_charset feature, so I thought I could perhaps help.
Jay Sekora wrote:
> Hi. We're running SpamAssassin 3.3.1, and pursuant to some advice I've
> seen in archives of this list and spamassassin-dev, I am *not*
> using normalize_charset.

I do not know much about the original bug, but until recently I used Unicode normalization without observing any problems. Perhaps I was lucky, or did not look closely enough.

However, that is beside the point: whether or not you use normalization, as long as you need to match non-ASCII patterns you have to write rules in Unicode anyway, because you cannot reject Unicode messages. So disabling normalization only makes your situation worse. Not only do you still have to write rules in UTF-8 (and so risk that they will be slow), but in addition you have to write rules for every other character set that can arrive (and you wrote that your server needs to accept email in all possible languages, so there would be dozens of different character sets). That is an inhuman task, and the number of rules or their complexity would possibly slow down your server more than the bug does (if it still exists).

In my opinion, anyone who needs to write rules for a multinational server and for Asian languages cannot avoid normalization. Otherwise they have to stick with mostly ASCII-only rules (which are not of much use for Asian languages).

Another possibility may be normalizing to plain 7-bit US-ASCII instead of UTF-8. The currently proposed patch for ASCII normalization also transliterates non-Latin alphabets. The patch was posted to the dev list, so impatient and courageous users might want to try it on a non-production server, but be warned that it is not official code (at least not yet) and has so far been tested very little.

Ivo

--
View this message in context: http://spamassassin.1065346.n5.nabble.com/Current-best-practices-around-normalize-charset-tp105840p108513.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
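
P.S. To illustrate the point about character sets, here is a small sketch in Python (not SpamAssassin rule syntax, and not how SpamAssassin implements normalize_charset internally). The sample word and the three charsets are my own assumptions, chosen only for illustration. It shows that a byte-level pattern written for UTF-8 misses the same word sent in other encodings, while a single pattern is enough once every body has been decoded from its declared charset.

```python
import re

# Hypothetical sample word, for illustration only (Russian for "discount").
WORD = "скидка"

# The same word as it could arrive on the wire in three Cyrillic charsets.
CHARSETS = ["utf-8", "koi8-r", "windows-1251"]
raw_bodies = {cs: WORD.encode(cs) for cs in CHARSETS}

# Without normalization: a byte-level pattern written for UTF-8 only.
utf8_rule = re.compile(re.escape(WORD.encode("utf-8")))
raw_hits = {cs: bool(utf8_rule.search(body)) for cs, body in raw_bodies.items()}

# With normalization: decode each body using its declared charset,
# then match one single text-level pattern against the result.
text_rule = re.compile(re.escape(WORD))
normalized_hits = {
    cs: bool(text_rule.search(body.decode(cs))) for cs, body in raw_bodies.items()
}

for cs in CHARSETS:
    print(f"{cs:14} raw match: {raw_hits[cs]!s:5}  normalized match: {normalized_hits[cs]}")
```

Without decoding, only the UTF-8 body matches, so each additional charset would need its own copy of the rule. With decoding, one rule covers all three.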