Daniel,

Thanks, great work.  It's getting late now, and I have a big breakfast meeting
early tomorrow, so I'll take a look at this sometime after noon.  Is it kosher
to roll this with the language-detection stuff and all into the SA distribution
then?  Sounds like you've got the upstream author's OK to do that.  I don't want
to accidentally step on his/her toes, though.  If it's OK, then after I patch my
local tree and do a test or two, I'll check it into CVS.  I'm sure by the time
we get around to rolling 2.30 it'll be stable enough.

One thing that would be useful here is probably to get a couple of
foreign-language messages for test purposes, along with creation of
t/language_ok.t -- I'll do those too -- the latter is pretty easy, the former
ought to be straightforward by just copy/pasting some text from random
foreign-language websites.

As far as accuracy goes, I understand that if the thing thinks it can't tell what
the real language is, it'll try to be overly broad rather than overly narrow, but
do you have any stats on how often it rules out the actual language of a message?
I suppose that'll be factored into the score for the rule.
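
To be concrete about the stat I mean, here's a rough sketch of how it could be
measured over a hand-labelled corpus.  Everything in it is made up for
illustration: exclusion_rate and toy_guesser are hypothetical names, and
toy_guesser is only a stand-in for the real guesser, not TextCat's actual
interface.

```python
def exclusion_rate(samples, guess_languages):
    """Fraction of labelled samples whose true language is NOT in the guessed set.

    samples: list of (text, true_language) pairs.
    guess_languages: callable returning a set of candidate language codes.
    """
    if not samples:
        raise ValueError("need at least one labelled sample")
    missed = sum(
        1 for text, true_lang in samples
        if true_lang not in guess_languages(text)
    )
    return missed / len(samples)


def toy_guesser(text):
    """Hypothetical stand-in guesser: matches a few common function words."""
    words = set(text.lower().split())
    guesses = set()
    if words & {"der", "und", "nicht"}:
        guesses.add("de")
    if words & {"le", "et", "pas"}:
        guesses.add("fr")
    if words & {"the", "and", "not"}:
        guesses.add("en")
    # Mimic the "overly broad" behaviour: when unsure, allow every language.
    return guesses or {"de", "fr", "en"}


samples = [
    ("Das ist nicht der Fall", "de"),
    ("Ce n'est pas le cas", "fr"),
    ("This is not the case", "en"),
    ("Der Hund und die Katze", "en"),  # deliberately mislabelled, so it counts as a miss
]

print(exclusion_rate(samples, toy_guesser))  # 0.25
```

A low number here would mean the rule rarely fires on mail that really is in
one of the user's languages, which is the case I'd worry about most.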

Thanks again,

C

PS Apologies for my getting a G4 pbook, not a Crusoe machine

Daniel Quinlan wrote:

DQ> I'm basically finished adapting TextCat, an open source language
DQ> guesser, for use in SA.  Thanks to the upstream author, it is now
DQ> licensed under the same terms as Perl.  At this point, I'm looking for
DQ> testing help and comments.


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk