Re: Language detection library

2007-05-07 Thread Bob Carpenter
> Anyone knows of a good language detection library that can detect what language a document (text) is ?
Language detection is easy. It's just a simple text classification problem. One way you can do this is using Lucene itself. Create a so-called pseudo-document for each language consi
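The snippet above is cut off before it says what each per-language pseudo-document contains, so the following is only a minimal sketch of the general idea of treating language detection as text classification. It assumes character trigrams as features and cosine similarity as the score, and it deliberately uses plain Java collections rather than Lucene. The class name NgramLanguageGuesser, its methods, and the sample sentences are all hypothetical and are not taken from the thread.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration: one trigram "pseudo-document" (profile) per language.
public class NgramLanguageGuesser {
    private static final int N = 3; // assumed n-gram size

    // language code -> trigram counts accumulated from training text
    private final Map<String, Map<String, Integer>> profiles = new LinkedHashMap<>();

    // Count character trigrams of the lower-cased, whitespace-normalized text.
    public static Map<String, Integer> ngrams(String text) {
        String s = " " + text.toLowerCase(Locale.ROOT).replaceAll("\\s+", " ").trim() + " ";
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + N <= s.length(); i++) {
            counts.merge(s.substring(i, i + N), 1, Integer::sum);
        }
        return counts;
    }

    // Add sample text to the profile of the given language.
    public void train(String language, String sampleText) {
        Map<String, Integer> profile = profiles.computeIfAbsent(language, k -> new HashMap<>());
        ngrams(sampleText).forEach((gram, count) -> profile.merge(gram, count, Integer::sum));
    }

    // Cosine similarity between two trigram count vectors; 0.0 if either is empty.
    public static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * (double) other;
            }
        }
        double norms = norm(a) * norm(b);
        return norms == 0.0 ? 0.0 : dot / norms;
    }

    private static double norm(Map<String, Integer> m) {
        double sum = 0.0;
        for (int c : m.values()) {
            sum += (double) c * c;
        }
        return Math.sqrt(sum);
    }

    // Return the language whose profile scores highest, or null if nothing was trained.
    // On a tie (including all-zero scores) the language trained first wins.
    public String guess(String text) {
        Map<String, Integer> query = ngrams(text);
        String best = null;
        double bestScore = -1.0;
        for (Map.Entry<String, Map<String, Integer>> e : profiles.entrySet()) {
            double score = cosine(query, e.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        NgramLanguageGuesser guesser = new NgramLanguageGuesser();
        guesser.train("en", "the quick brown fox jumps over the lazy dog and this is the way that they were");
        guesser.train("de", "der schnelle braune fuchs springt über den faulen hund und das ist nicht der weg");
        System.out.println(guesser.guess("this is the house that they were in"));
        System.out.println(guesser.guess("das ist der hund"));
    }
}
```

Profiles built from one sentence per language are far too small for real use; a realistic setup would train on much larger samples, and very short inputs are likely to be classified unreliably.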

RE: Language detection library

2007-05-04 Thread Mordo, Aviran (EXP N-NANNATEK)
Thank you, I got the Nutch plugin, and it is working great. -----Original Message----- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent: Thursday, May 03, 2007 4:17 PM To: java-user@lucene.apache.org Subject: Re: Language detection library LingPipe - commercial unless your data/product

Re: Language detection library

2007-05-03 Thread karl wettin
of a good language detection library that can detect what > language a document (text) is ? I posted this some time back: https://issues.apache.org/jira/browse/LUCENE-826 A bit of proof-of-concept:ish, but it does the job well if you ask me. Uses Weka (GPL) and requires at least 150 char

Re: Language detection library

2007-05-03 Thread Chris Lu
://search.dbsight.com Lucene Database Search in 3 minutes: http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes On 5/3/07, karl wettin <[EMAIL PROTECTED]> wrote: 3 maj 2007 kl. 22.06 skrev Mordo, Aviran (EXP N-NANNATEK): > Anyone knows of a good language detectio

Re: Language detection library

2007-05-03 Thread karl wettin
3 maj 2007 kl. 22.06 skrev Mordo, Aviran (EXP N-NANNATEK): Anyone knows of a good language detection library that can detect what language a document (text) is ? I posted this some time back: https://issues.apache.org/jira/browse/LUCENE-826 A bit of proof-of-concept:ish, but it does the

Re: Language detection library

2007-05-03 Thread Andrzej Bialecki
Jason Pump wrote: http://software.wise-guys.nl/libtextcat/ ... which is what Nutch implements in its language-identifier plugin. -- Best regards, Andrzej Bialecki

Re: Language detection library

2007-05-03 Thread Jason Pump
- Original Message From: "Mordo, Aviran (EXP N-NANNATEK)" <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Thursday, May 3, 2007 4:06:04 PM Subject: Language detection library Anyone knows of a good language detection library that can detect what language a do

Re: Language detection library

2007-05-03 Thread Otis Gospodnetic
<[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Thursday, May 3, 2007 4:06:04 PM Subject: Language detection library Anyone knows of a good language detection library that can detect what language a document (text) is ?

Language detection library

2007-05-03 Thread Mordo, Aviran (EXP N-NANNATEK)
Anyone knows of a good language detection library that can detect what language a document (text) is ?