Hi,
Mahout and Carrot2 can both cluster documents from a Lucene index.
ahmet
On Tuesday, November 11, 2014 10:37 PM, Elshaimaa Ali
wrote:
Hi All,
I have a Lucene index built with Lucene 4.9 for 584 text documents. I need to
extract a Document-term matrix, and Document Document similarity matri
The semanticvectors project might do what you are looking for.
paul
On 11 nov. 2014, at 22:37, parnab kumar wrote:
> hi,
>
> While indexing the documents, store the Term Vectors for the content
> field. Now for each document you will have an array of terms and their
> corresponding fre
hi,
While indexing the documents, store the term vectors for the content
field. Then, for each document, you will have an array of terms and their
corresponding frequencies in the document. Using the IndexReader, you can
retrieve these term vectors. Similarity between two documents can be
computed as
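
A rough sketch of the idea above, in Python rather than Lucene's own Java API, and assuming the per-document term frequencies have already been read out of the index into plain dictionaries. Cosine similarity is used here as one common choice of measure; it is an assumption, not necessarily the one the poster had in mind. The function names (document_term_matrix, cosine_similarity, similarity_matrix) are hypothetical helpers for illustration only.

```python
import math


def document_term_matrix(doc_term_freqs):
    """Build a dense document-term matrix from per-document term frequencies.

    doc_term_freqs: list of dicts mapping term -> frequency, one per document
    (assumed to have been extracted from the stored term vectors).
    Returns (vocabulary, matrix), where matrix[i][j] is the frequency of
    vocabulary[j] in document i.
    """
    vocabulary = sorted({term for freqs in doc_term_freqs for term in freqs})
    matrix = [[freqs.get(term, 0) for term in vocabulary]
              for freqs in doc_term_freqs]
    return vocabulary, matrix


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length frequency vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # An empty document has no direction; treat it as dissimilar to all.
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(matrix):
    """Document-document similarity matrix: entry [i][j] compares docs i and j."""
    return [[cosine_similarity(row_i, row_j) for row_j in matrix]
            for row_i in matrix]


if __name__ == "__main__":
    # Toy term frequencies standing in for what the term vectors would give.
    docs = [
        {"lucene": 2, "index": 1},
        {"lucene": 1, "search": 1},
    ]
    vocabulary, dtm = document_term_matrix(docs)
    print(vocabulary)
    print(dtm)
    print(similarity_matrix(dtm))
```

For 584 documents a dense matrix like this is small enough to hold in memory; with a much larger index a sparse representation would probably be the better choice.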