Re: More like this returning similarities that are too generic

2006-08-08 Thread Chad Hardin
You're so right! I'm totally new to Lucene (and text analysis, searching, etc.), but now that you showed me, I "get it". Thank you so much for your reply. Chad On Aug 8, 2006, at 12:45 AM, Chris Hostetter wrote: I've never used MoreLikeThis myself, but based on how I know it works, your

Re: More like this returning similarities that are too generic

2006-08-07 Thread Chad Hardin
swer . Would it work to create your own list of stop words (possibly very large) to use for indexing and/or searching? This would simply exclude the "less common" words (as you define them). StandardAnalyzer, for instance, can take a File of stop words in one of its constructors...
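To make the stop-word suggestion concrete, here is a minimal plain-Java sketch of the idea: tokens that appear in a user-supplied stop list are dropped before any similarity comparison sees them. It deliberately does not call Lucene. The class name StopWordFilter, its methods, and the tokenization rule are all hypothetical and chosen only for illustration; in a real index the filtering would be done by the analyzer configured with the stop-word list, as the message above suggests.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, NOT part of Lucene: illustrates what a custom
// stop-word list does to the terms available for similarity matching.
class StopWordFilter {
    private final Set<String> stopWords = new HashSet<>();

    // Builds the filter from any collection of stop words,
    // normalizing each entry to lower case and skipping blanks.
    StopWordFilter(Iterable<String> words) {
        for (String word : words) {
            String normalized = word.trim().toLowerCase(Locale.ROOT);
            if (!normalized.isEmpty()) {
                stopWords.add(normalized);
            }
        }
    }

    // Lower-cases the text, splits on runs of non-letter/non-digit
    // characters, and keeps only tokens that are not stop words.
    List<String> filter(String text) {
        List<String> kept = new ArrayList<>();
        String[] tokens = text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+");
        for (String token : tokens) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

With a stop list containing "from", "the" and "a", filtering the text "A letter from the editor" keeps only "letter" and "editor", so a common word such as "from" can no longer drive a match. The same list has to be applied consistently at indexing and at query time, otherwise the excluded words can still appear on one side of the comparison.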

More like this returning similarities that are too generic

2006-08-07 Thread Chad Hardin
Hi all, I'm new to Lucene but I'm loving it! I'm writing a prototype that links documents together based upon similarities. Obviously the first thing I did was use MoreLikeThis. However, it seems to be finding matches based upon words that are too common, in this case the words "from"