Thanks for the aid, Soren!!! They are successful to extract the matrix. But with collections of large documents is not one too much expensive solution? it is possible to extract the matrix from the indexing file?
Mario Sören Pekrul wrote: > > Hello Mario, > > I had a similar problem a few weeks ago (thread "How to get Term Weights > (document term matrix)?", 2006-11-02, > http://www.gossamer-threads.com/lists/lucene/java-user/41726). > > I think there is no simple function creating a document term matrix or > accessing it. I extracted the matrix from my index and stored the matrix > in a database. > > To create the matrix I iterated the terms and the documents for each term: > TermEnum terms=IndexReader.terms(); > while(terms.next()) { > TermDocs docs=IndexReader.termDocs(terms.term()); > while(docs.next()) { > //store the term, the document and the weight > //document frequency: indexreader.docFreq(term) > //term frequency: termdoc.freq() > } > } > > Sören > > mariolone wrote: >> Hi!!!! >> I have a problem: >> i must create a matrix term for document in which every element of the >> matrix it represents the number of occurrences of that term in the >> document. >> How can I do? >> Can someone help me? >> Thanks to all.... >> >> P.S. I must applicate LSA to this matrix. > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > > -- View this message in context: http://www.nabble.com/Lucene---LSA-tf2815727.html#a7870561 Sent from the Lucene - Java Users mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]