3 maj 2007 kl. 22.06 skrev Mordo, Aviran (EXP N-NANNATEK):
Anyone knows of a good language detection library that can detect what
language a document (text) is ?
I posted this some time back:
https://issues.apache.org/jira/browse/LUCENE-826
A bit of proof-of-concept:ish, but it does the job well if you ask
me. Uses Weka (GPL) and requires at least 150 characters to be trusted.
--
karl
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]