If you index the n-grams in their own field using ShingleFilter, you can get the statistics from the same term API on that field, because its terms *are* the n-grams. Queries work the same way.
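
For illustration, here is a minimal sketch of that approach. It assumes a recent Lucene (8.x or 9.x) with the core and analysis-common modules on the classpath; older releases have different constructors and package names. The class name `ShingleStats`, the field name `body_bigrams`, and the sample texts are made up for the example. The per-million figure uses the total number of bigram tokens in the field as its denominator, which is one possible reading of "per million terms"; a unigram field's total could be substituted.

```java
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.shingle.ShingleFilter;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class ShingleStats {

    // Hypothetical field name: holds bigrams only, kept separate from any unigram field.
    static final String FIELD = "body_bigrams";

    /** Analyzer that splits on whitespace and emits only 2-word shingles. */
    static Analyzer bigramAnalyzer() {
        return new Analyzer() {
            @Override
            protected TokenStreamComponents createComponents(String fieldName) {
                Tokenizer source = new WhitespaceTokenizer();
                ShingleFilter shingles = new ShingleFilter(source, 2, 2);
                shingles.setOutputUnigrams(false); // index bigrams only
                return new TokenStreamComponents(source, shingles);
            }
        };
    }

    /**
     * Indexes the texts in memory and returns
     * { occurrences of the bigram, total bigram tokens in the field }.
     */
    static long[] bigramStats(String[] texts, String bigram) throws IOException {
        try (Directory dir = new ByteBuffersDirectory();
             Analyzer analyzer = bigramAnalyzer()) {
            try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(analyzer))) {
                for (String text : texts) {
                    Document doc = new Document();
                    doc.add(new TextField(FIELD, text, Field.Store.NO));
                    writer.addDocument(doc);
                }
            }
            try (IndexReader reader = DirectoryReader.open(dir)) {
                // The shingle is an ordinary term here; words are joined by a single space.
                long count = reader.totalTermFreq(new Term(FIELD, bigram));
                long total = reader.getSumTotalTermFreq(FIELD);
                return new long[] { count, total };
            }
        }
    }

    /** Occurrences per million tokens; returns 0 for an empty field. */
    static double perMillion(long count, long total) {
        return total == 0 ? 0.0 : count * 1_000_000.0 / total;
    }

    public static void main(String[] args) throws IOException {
        String[] texts = { "new york is big", "i love new york" };
        long[] stats = bigramStats(texts, "new york");
        System.out.println("count=" + stats[0] + " total=" + stats[1]
                + " perMillion=" + perMillion(stats[0], stats[1]));
    }
}
```

On the query side, a `TermQuery` on `new Term("body_bigrams", "new york")` would match the same shingle term. Note that these statistics do not account for deleted documents until segments are merged.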

-Mike

On 12/02/2014 03:38 PM, Peter Organisciak wrote:
Is it possible to get a total corpus frequency for bigram queries or
higher, i.e. how many times the query occurs in the corpus?

I'm looking to implement a count of occurrences per million terms. I know
that for a single term I can use `TermsEnum.totalTermFreq()`; is there any
comparable way to do so for a bigram or other simple query?

Thank you,

Peter



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
