On Mar 24, 2010, at 9:20 AM, Shashi Kant wrote:
> Add the common terms such as "University", "School", "Medicine",
> "Institute" etc. to stopwords list, so you are left with Stanford,
> "Palo Alto" etc.
I don't know if I would remove them, but you might consider using the
CommonGram or n-gram a
Add the common terms such as "University", "School", "Medicine",
"Institute" etc. to stopwords list, so you are left with Stanford,
"Palo Alto" etc.
Then use Ahmet's suggestion of calling
BooleanQuery.setMinimumNumberShouldMatch() to require (say) 75% of the
query terms to match.
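
As a rough sketch of the two steps above (drop the domain stopwords, then work out how many of the remaining terms must match), something like the following might do. It is plain Java and does not call Lucene; the class and method names are hypothetical, the tokenization is a naive split rather than a real Analyzer, and adding "of" to the stopword list is my own assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene: it only prepares the inputs
// (filtered terms and a minimum-match count) for a BooleanQuery.
final class AffiliationQuerySketch {

    // Domain stopwords from the suggestion above; "of" is an assumed addition.
    static final Set<String> DOMAIN_STOPWORDS = new HashSet<>(
            Arrays.asList("university", "school", "medicine", "institute", "of"));

    // Naive tokenization: lowercase, split on anything that is not a letter
    // or digit, then drop the domain stopwords.
    static List<String> significantTerms(String query) {
        List<String> terms = new ArrayList<>();
        for (String token : query.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty() && !DOMAIN_STOPWORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Number of optional (SHOULD) clauses that must match: the given fraction
    // of the term count, rounded up, and at least 1 when there are any terms.
    static int minShouldMatch(int termCount, double fraction) {
        if (termCount <= 0) {
            return 0;
        }
        return Math.max(1, (int) Math.ceil(termCount * fraction));
    }
}
```

Each surviving term would then be added to the BooleanQuery as a SHOULD clause, and the computed count passed to setMinimumNumberShouldMatch() (on BooleanQuery itself in the Lucene releases current at the time of this thread, on BooleanQuery.Builder in later ones).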
Finally, if you wish to
Hi Aaron,
Your "false positives" comments point to a mismatch between what you're
currently asking Lucene for (any document matching any one of the terms in the
query) and what you want (only fully "correct" matches).
You need to identify the terms of the query that MUST match and tell Lucene
> Hi all, I have been playing
> with Lucene for a while now, but I am stuck on a perplexing
> issue.
>
> I have an index with a field "Affiliation"; some example
> values are:
>
> - "Stanford University School of Medicine, Palo Alto, CA
> USA",
> - "Institute of Neurobiology, School of Medicine, Sta