This is expected/intentional, because computing the "true" unique term count across multiple segments is exceptionally costly (you have to do the merge sort to de-dup).
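To make the cost concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: each "segment" is just a sorted, duplicate-free list of terms, and the class and method names (UniqueTermCountSketch, upperBound, exactCount) are hypothetical. It shows that summing per-segment counts is cheap but can overcount, while the exact count needs a merge across all segments that visits every term.

```java
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Toy model only: a "segment" is a sorted, duplicate-free list of terms.
// Nothing here is Lucene API; all names are hypothetical.
final class UniqueTermCountSketch {

    // Cheap: sum the per-segment counts. A term present in several
    // segments is counted once per segment, so this is an upper bound.
    static long upperBound(List<List<String>> segments) {
        long sum = 0;
        for (List<String> segment : segments) {
            sum += segment.size();
        }
        return sum;
    }

    // Exact: k-way merge over the sorted segments. Every term of every
    // segment is visited, which is why this is costly on a large index.
    static long exactCount(List<List<String>> segments) {
        // Heap entries are {segmentIndex, positionInSegment}, ordered by term.
        PriorityQueue<int[]> heap = new PriorityQueue<>(
            (a, b) -> segments.get(a[0]).get(a[1])
                .compareTo(segments.get(b[0]).get(b[1])));
        for (int i = 0; i < segments.size(); i++) {
            if (!segments.get(i).isEmpty()) {
                heap.add(new int[] {i, 0});
            }
        }
        long count = 0;
        String previous = null;
        while (!heap.isEmpty()) {
            int[] top = heap.poll();
            String term = segments.get(top[0]).get(top[1]);
            // Terms come out in sorted order, so duplicates are adjacent.
            if (!term.equals(previous)) {
                count++;
                previous = term;
            }
            if (top[1] + 1 < segments.get(top[0]).size()) {
                heap.add(new int[] {top[0], top[1] + 1});
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<List<String>> segments = Arrays.asList(
            Arrays.asList("apple", "banana", "cherry"),
            Arrays.asList("banana", "date"));
        System.out.println(upperBound(segments)); // 5 ("banana" counted twice)
        System.out.println(exactCount(segments)); // 4
    }
}
```

The gap between the two numbers depends on how many terms the segments share, which cannot be known without doing the merge.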
If you really want the true count, you can pull the TermsEnum and call .next() until exhaustion. Alternatively, you can use IndexReader.getSequentialSubReaders(), then step through each SegmentReader calling its .getUniqueTermCount(), and then somehow "approximate" (e.g. the sum will be an upper bound of the total unique count).

Mike

On Tue, Sep 7, 2010 at 2:34 AM, Ryan McKinley <[email protected]> wrote:
> Hello-
>
> I'm looking at using the new terms.getUniqueTermCount() to give a
> quick count for the LukeRequestHandler rather than needing to walk all
> the terms.
>
> When solr index reader has just one segment, it works great.  However
> with more segments I get:
>
> java.lang.UnsupportedOperationException: this reader does not
> implement getUniqueTermCount()
>        at org.apache.lucene.index.Terms.getUniqueTermCount(Terms.java:84)
>
> Is this expected?  Is there any way around that?
>
> I am getting the terms using:
>
>    Terms terms = MultiFields.getTerms(reader, fieldName);
>    long cnt = (terms==null) ? 0 : terms.getUniqueTermCount();
>
> Thanks
> ryan
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
