The problem is that the default codec's TermVectorsFormat (Lucene40TermVectorsFormat) does not currently store this statistic per document. We could in theory fix this ... maybe open an issue / make a patch if it's important?
A return value of -1 is actually "valid": it means this statistic is not available.

Mike McCandless

http://blog.mikemccandless.com

On Fri, Jan 4, 2013 at 2:39 AM, 장용석 <need4...@gmail.com> wrote:
> Hello.
> I have some questions.
>
> Document 1 : "learning perl learning java learning ruby"
> Document 2 : "perl test"
>
> I have indexed these documents with StoreTermVectors(true) and
> IndexOptions.DOCS_AND_FREQS.
> Field name is "f".
>
> And I executed this code.
>
> IndexReader ir = IndexReader.open(dir);
> Terms terms = ir.getTermVector(0, "f");
>
> System.out.println(terms.getDocCount()); -> 1
> System.out.println(terms.getSumDocFreq()); -> 4
> System.out.println(terms.getSumTotalTermFreq()); -> -1
>
> I think this terms instance acts like a single-document inverted index.
>
> So getDocCount is 1 (single document), and getSumDocFreq is 4 (because
> each term's docFreq is 1).
> Is this right?
>
> But I can't understand why the getSumTotalTermFreq method returns -1.
> In the javadoc, getSumTotalTermFreq is the sum of TermsEnum.totalTermFreq.
>
> I think in Document 1, each term's totalTermFreq is [learning, 3], [java,
> 1], [perl, 1], [ruby, 1].
> So the getSumTotalTermFreq method's result should be 6, not -1.
>
> Why does the terms.getSumTotalTermFreq() method return -1?
>
>
> Thanks in advance.
> --
> DEV용식
> http://devyongsik.tistory.com
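
For reference, the values the original poster expected for Document 1 can be checked by hand. The sketch below is plain Java with no Lucene dependency. The class TermVectorStats and its methods are hypothetical names made up for this illustration, and whitespace splitting is only an assumed stand-in for whatever analyzer was actually used at index time. It treats the single document as its own tiny index, so each distinct term has a docFreq of 1.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: recomputes by hand the
// statistics discussed in this thread for a single document.
public class TermVectorStats {

    // Counts how often each whitespace-separated token occurs in the text.
    // Assumption: plain whitespace tokenization, no analyzer.
    public static Map<String, Integer> termFreqs(String text) {
        Map<String, Integer> freqs = new LinkedHashMap<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                freqs.merge(token, 1, Integer::sum);
            }
        }
        return freqs;
    }

    // In a one-document "index" every distinct term has docFreq 1,
    // so the sum of docFreq equals the number of distinct terms.
    public static long sumDocFreq(Map<String, Integer> freqs) {
        return freqs.size();
    }

    // Sum of the per-term occurrence counts, i.e. the total token count.
    public static long sumTotalTermFreq(Map<String, Integer> freqs) {
        long sum = 0;
        for (int freq : freqs.values()) {
            sum += freq;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> freqs =
            termFreqs("learning perl learning java learning ruby");
        System.out.println("sumDocFreq = " + sumDocFreq(freqs));
        System.out.println("sumTotalTermFreq = " + sumTotalTermFreq(freqs));
    }
}
```

For Document 1 this gives a sumDocFreq of 4 and a sumTotalTermFreq of 6, which matches the poster's reasoning. The -1 from getSumTotalTermFreq() does not contradict that arithmetic; it only signals that the term vectors format does not have the figure available.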