I think this is the problem that you're running into, though maybe a person
with more expertise can confirm...
ZP, if you look at section 5.1 of the Zhai and Lafferty paper (
http://www.cs.cmu.edu/~lafferty/pub/smooth-tois.ps), they note that the
"term weight is log(1+(1-\lambda)p_ml(q_i|d) / \lambda
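For anyone who wants to see that weight as code, here is a minimal sketch. The quoted expression is cut off above, so the denominator used here, \lambda times the collection probability p(q_i|C), is my assumption based on the usual Jelinek-Mercer form. The function name and the `p_coll` parameter are made up for illustration and are not Lucene API.

```python
import math


def jm_term_weight(p_ml, p_coll, lam):
    """Jelinek-Mercer style weight for one matched query term.

    p_ml   -- maximum-likelihood estimate p_ml(q_i|d) in the document
    p_coll -- collection probability p(q_i|C) (assumed denominator term)
    lam    -- smoothing parameter lambda, strictly between 0 and 1
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must be strictly between 0 and 1")
    if p_coll <= 0.0:
        raise ValueError("p_coll must be positive")
    # log(1 + (1 - lambda) * p_ml / (lambda * p_coll))
    return math.log(1.0 + (1.0 - lam) * p_ml / (lam * p_coll))
```

Note that the weight is zero when the term does not occur in the document (p_ml = 0), and it grows as \lambda shrinks, since less weight is then given to the collection model.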
Is it possible to get a total corpus frequency for bigram queries or
higher, i.e., how many times does the query occur in the corpus?
I'm looking to implement a count of occurrences per million terms. I know
for a single term I can use `TermsEnum.totalTermFreq()`; is there any
comparable way to do
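
To make the target number concrete, here is a rough sketch of the arithmetic only, over a plain token list. It does not use any Lucene API; in a real index both the n-gram count and the total term count would have to come from the index itself. The names `ngram_count` and `per_million` are hypothetical helpers for illustration.

```python
def ngram_count(tokens, ngram):
    """Count how many times the token sequence `ngram` occurs in `tokens`."""
    target = tuple(ngram)
    n = len(target)
    if n == 0:
        raise ValueError("ngram must contain at least one token")
    # Slide a window of length n over the corpus and compare each window.
    return sum(
        1
        for i in range(len(tokens) - n + 1)
        if tuple(tokens[i:i + n]) == target
    )


def per_million(count, total_terms):
    """Normalize a raw count to occurrences per million terms."""
    if total_terms <= 0:
        raise ValueError("total_terms must be positive")
    return count * 1_000_000 / total_terms


# Tiny made-up corpus, just to show the calls.
corpus = "new york is not new jersey but new york is big".split()
bigram_hits = ngram_count(corpus, ["new", "york"])
rate = per_million(bigram_hits, len(corpus))
```

The normalization uses the total number of terms as the denominator, the same way a single-term count would be normalized, so unigram and bigram rates stay comparable.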