If you index the n-grams in their own field using ShingleFilter, you can
get statistics for them with the same term API on that field, because there
the terms *are* the n-grams. The same applies to queries.
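
To make the arithmetic concrete, here is a minimal sketch in plain Java of
what a shingled field amounts to. It is not Lucene code: `ShingleStats` and
its methods are hypothetical names, tokenization is naive whitespace
splitting, and only bigrams are handled. In Lucene itself, the two numbers
would presumably come from `IndexReader.totalTermFreq(Term)` on the shingle
field and `IndexReader.getSumTotalTermFreq(String)` on the unigram field.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Toy in-memory model of a bigram (shingle) field. Hypothetical, not Lucene. */
final class ShingleStats {
    // Each bigram is stored as one "term", the way a shingled field would hold it.
    private final Map<String, Long> bigramFreq = new HashMap<>();
    // Total number of single tokens seen, used as the per-million denominator.
    private long totalTerms = 0;

    /** Lowercase, trim, and collapse whitespace runs to single spaces. */
    private static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }

    /** Adds one document; bigrams never span two documents. */
    public void addDocument(String text) {
        String normalized = normalize(text);
        if (normalized.isEmpty()) {
            return;
        }
        String[] tokens = normalized.split(" ");
        totalTerms += tokens.length;
        for (int i = 0; i + 1 < tokens.length; i++) {
            bigramFreq.merge(tokens[i] + " " + tokens[i + 1], 1L, Long::sum);
        }
    }

    /** Total number of occurrences of the bigram across all documents. */
    public long totalFreq(String bigram) {
        return bigramFreq.getOrDefault(normalize(bigram), 0L);
    }

    public long totalTerms() {
        return totalTerms;
    }

    /** Occurrences of the bigram per million single tokens in the corpus. */
    public double perMillion(String bigram) {
        if (totalTerms == 0) {
            return 0.0;
        }
        return totalFreq(bigram) * 1_000_000.0 / totalTerms;
    }

    public static void main(String[] args) {
        ShingleStats stats = new ShingleStats();
        stats.addDocument("the quick brown fox");
        stats.addDocument("the quick red fox");
        stats.addDocument("a quick brown dog");
        System.out.println(stats.totalFreq("quick brown"));
        System.out.println(stats.perMillion("quick brown"));
    }
}
```

The denominator here is the count of single tokens, which matches
"occurrences per million terms"; dividing by the total number of bigrams
instead would give a slightly different rate.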
-Mike
On 12/02/2014 03:38 PM, Peter Organisciak wrote:
Is it possible to get a total corpus frequency for bigram queries or
higher? That is, how many times does the query occur in the corpus?
I'm looking to implement a count of occurrences per million terms. I know
that for a single term I can use `TermsEnum.totalTermFreq()`. Is there any
comparable way to do so for a bigram or other simple query?
Thank you,
Peter