Re: How to search for terms containing negation

2014-03-17 Thread Tri Cao
StandardAnalyzer has a constructor that takes a stop word set, so I guess you can pass it an empty set: http://lucene.apache.org/core/4_6_1/analyzers-common/org/apache/lucene/analysis/standard/StandardAnalyzer.html#StandardAnalyzer(org.apache.lucene.util.Version, org.apache.lucene.analysis.util.Char
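
To illustrate why an empty stop word set helps, here is a minimal sketch in plain Java. It is not Lucene code: `StopWordSketch`, `analyze`, and `SAMPLE_STOP_WORDS` are hypothetical names, and the stop list is a small illustrative sample assumed to contain "no" and "not", as suggested in the thread. It only shows the effect of filtering tokens against a stop set versus an empty set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for an analyzer with a configurable stop set; not Lucene's API.
class StopWordSketch {
    // Illustrative sample only; assumed to include the negation words "no" and "not".
    static final Set<String> SAMPLE_STOP_WORDS = Set.of("a", "an", "and", "the", "no", "not");

    // Lowercase the text, split on non-letter characters, and drop tokens found in stopWords.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // With the sample stop set, the negation is lost.
        System.out.println(analyze("not happy", SAMPLE_STOP_WORDS)); // prints [happy]
        // With an empty stop set, the negation survives as a token.
        System.out.println(analyze("not happy", Set.of()));          // prints [not, happy]
    }
}
```

If the same idea holds for the real analyzer, the empty stop set would need to be used at both index time and query time so that the two sides produce matching tokens.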

Re: How to search for terms containing negation

2014-03-17 Thread Natalia Connolly
Hi Tri, Thank you so much for your message! Yes, it looks like the negation terms have indeed been filtered out; when I query on "no" or "not", I get no results. I am just using StandardAnalyzer and the classic QueryParser: Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_47); Que

Re: How to search for terms containing negation

2014-03-17 Thread Tri Cao
Natalia, First make sure that your analyzers (both index and query analyzers) do not filter these out as stop words. I think the standard StopFilter list has "no" and "not". You can check whether your index has these terms by querying for "no" as a TermQuery. If there is no match for that query, th
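
The diagnostic above can be sketched with a toy inverted index in plain Java. This is an assumption-laden illustration, not Lucene code: `TermCheckSketch`, `buildIndex`, and `lookup` are hypothetical names, and the stop list is a small sample. The exact-term `lookup` plays the role of a raw term query that bypasses query-side analysis, so an empty result means the term never made it into the index.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index used to show the "query the raw term" check; not Lucene's API.
class TermCheckSketch {
    // Illustrative sample only; assumed to include "no" and "not".
    static final Set<String> SAMPLE_STOP_WORDS = Set.of("no", "not", "the");

    // Build a map from each surviving term to the ids of the documents containing it.
    static Map<String, Set<Integer>> buildIndex(List<String> docs, Set<String> stopWords) {
        Map<String, Set<Integer>> index = new HashMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            for (String token : docs.get(docId).toLowerCase(Locale.ROOT).split("[^a-z]+")) {
                if (token.isEmpty() || stopWords.contains(token)) {
                    continue; // stop words are never written to the index
                }
                index.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            }
        }
        return index;
    }

    // Exact term lookup with no analysis applied to the query term.
    static Set<Integer> lookup(Map<String, Set<Integer>> index, String term) {
        return index.getOrDefault(term, Set.of());
    }

    public static void main(String[] args) {
        List<String> docs = List.of("not happy", "no idea");
        // Indexed with stop word filtering: the term "no" is absent.
        System.out.println(lookup(buildIndex(docs, SAMPLE_STOP_WORDS), "no")); // prints []
        // Indexed without filtering: the term "no" is found in document 1.
        System.out.println(lookup(buildIndex(docs, Set.of()), "no"));          // prints [1]
    }
}
```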

How to search for terms containing negation

2014-03-17 Thread Natalia Connolly
Hi All, Is there any way I could construct a query that would not automatically exclude negation terms (such as "no", "not", etc)? For example, I need to find strings like "not happy", "no idea", "never available". I tried using a simple analyzer with combinations such as "not AND happy", an

Fwd: any project for record linkage, fuzzy grouping, and deduplication based on Solr/Lucene?

2014-03-17 Thread Mobius ReX
-- Forwarded message -- Subject: any project for record linkage, fuzzy grouping, and deduplication based on Solr/Lucene? For example, consider a large new department formed by merging three departments. A few employees worked for two or three of the departments before the merger. That means, the attri