Daniel, thanks, great work. It's getting late now, and I have a big breakfast meeting early tomorrow, so I'll take a look at this sometime after noon. Is it kosher to roll this, with the language-detection stuff and all, into the SA distribution then? Sounds like you've got the upstream author's OK to do that. I don't want to accidentally step on his/her toes, though. If it's OK, then after I patch my local tree and do a test or two, I'll check it into CVS. I'm sure by the time we get around to rolling 2.30 it'll be stable enough.
One thing that would be useful here is to get a couple of foreign-language messages for test purposes, along with creating t/language_ok.t. I'll do those too: the latter is pretty easy, and the former ought to be straightforward by just copy/pasting some text from random foreign-language websites.

As far as accuracy goes, I understand that if the thing thinks it can't tell what the real language is, it'll try to be overly broad rather than overly narrow, but do you have any stats on how often it rules out the actual language of a message? I suppose that'll be factored into the score for the rule.

Thanks again,
C

PS Apologies for my getting a G4 pbook, not a Crusoe machine.

Daniel Quinlan wrote:

DQ> I'm basically finished adapting TextCat, an open source language
DQ> guesser, for use in SA. Thanks to the upstream author, it is now
DQ> licensed under the same terms as Perl. At this point, I'm looking for
DQ> testing help and comments.