There are LSI and LSI like implementations in open source, but I am unsure what the state of play is in Java. The most interesting work I know of is in Perl available (cvs permitting) from NITLE.<http://www.nitle.org/tools/semantic/search.htm> What I would like to see is an implementation of Magnus Sahlgren, *Random Indexing *which is described as "a very simple and effective approach to automatic [which can also be used for] bilingual lexicon acquisition. The approach is cooccurrence-based, and uses the Random Indexing vector space methodology applied to aligned bilingual data. The approach is simple, efficient and scalable, and generate promising results when compared to a manually compiled lexicon." The point here is its scalability since implementations of LSI are inefficient when documents are removed from the collection. It seems to me that it is also less computationalyy intensive, and should be possible to implement quite simply in Java. Does anyone know of such an attempt? Adam
On 10/5/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > > Hi all, > > I am looking for LSI implementation i lucene. Is it available. I couldnt > find it in the website. I searched in the archives but no help. could some > one tell me if it is available or not. > > Could you tell me where can i see to find if there are any Language > processing tools for Indexing and retrieval stuff available in Lucene > > thanks > >