Tom Burton-West
http://www.hathitrust.org/blogs/large-scale-search
-----Original Message-----
From: Mike O'Leary [mailto:tmole...@uw.edu]
Sent: Thursday, December 15, 2011 12:34 PM
To: java-user@lucene.apache.org
Subject: Obtaining IDF values for the terms in a document set
We have a large s
all of the terms that occur in the
document set and obtain their IDF values.
Thanks,
Mike
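
For illustration only, here is a minimal sketch of the computation being asked about: list every term in a document set along with an IDF value. It is plain Java and does not touch the Lucene API. The class name IdfSketch, the naive tokenizer, and the sample documents are all made up for the example. The formula, 1 + ln(numDocs / (docFreq + 1)), is the classic one I believe Lucene's DefaultSimilarity used in the 3.x line; treat that as an assumption and check it against the version you run. With a real index, the document frequencies would come from the index's term dictionary rather than from re-tokenizing the text.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper class; not part of Lucene.
class IdfSketch {

    // For each term, count the number of documents that contain it at least once.
    static Map<String, Integer> documentFrequencies(List<String> docs) {
        Map<String, Integer> df = new TreeMap<>();
        for (String doc : docs) {
            // Naive tokenizer (an assumption): lowercase, split on anything
            // that is not a letter or digit. A real analyzer would differ.
            Set<String> unique = new HashSet<>(
                    Arrays.asList(doc.toLowerCase().split("[^\\p{L}\\p{N}]+")));
            unique.remove(""); // split can yield an empty leading token
            for (String term : unique) {
                df.merge(term, 1, Integer::sum);
            }
        }
        return df;
    }

    // Classic IDF as assumed above: 1 + ln(numDocs / (docFreq + 1)).
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // Map every term in the document set to its IDF value, sorted by term.
    static Map<String, Double> idfValues(List<String> docs) {
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, Integer> e : documentFrequencies(docs).entrySet()) {
            result.put(e.getKey(), idf(e.getValue(), docs.size()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "the cat sat", "The dog ran.", "the bird");
        // Terms with the lowest IDF occur in the most documents and are
        // the natural stopword candidates.
        for (Map.Entry<String, Double> e : idfValues(docs).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

Sorting the resulting map by value in ascending order puts the most widespread terms first, which is the list you would review by hand when building a starter stopword set.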
-----Original Message-----
From: Simon Willnauer [mailto:simon.willna...@googlemail.com]
Sent: Thursday, December 15, 2011 11:44 AM
To: java-user@lucene.apache.org
Subject: Re: Obtaining IDF values for the terms in a document set
On Thu, Dec 15, 2011 at 6:33 PM, Mike O'Leary wrote:
> We have a large set of documents that we would like to index with a
> customized stopword list. We have run tests by indexing a random set of about
> 10% of the documents, and we'd like to generate a list of the terms in that
> smaller set
We have a large set of documents that we would like to index with a customized
stopword list. We have run tests by indexing a random set of about 10% of the
documents, and we'd like to generate a list of the terms in that smaller set
and their IDF values as a way to create a starter set of stopw