Re: Porter stemming problem

Erick Erickson Fri, 22 Jun 2007 09:12:53 -0700

Yes, you should also stem the query terms. Otherwise you'll have
indexed "working" as "work", but a search for "working" will look
for the literal term "working" and won't match, which I'm sure is
not what you want.


BTW, Query.toString() will tell you a lot about how queries are
processed: print the parsed query and you can see which terms it
will actually search for.

In general, unless you're very sure what the effects are, you should
use the same analyzer for indexing as you use for searching.
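
To make the same-analyzer point concrete, here is a minimal sketch in
plain Java with no Lucene dependency. Everything in it is an assumption
made for illustration: toyStem() is a crude suffix stripper, NOT the
Porter algorithm, and index() and search() are hypothetical stand-ins
for the indexing and query sides. It only shows that a term stemmed at
index time is found only when the query term goes through the same
processing.

```java
import java.util.HashSet;
import java.util.Set;

class StemMatchDemo {
    // Toy suffix stripper for illustration only; this is NOT the Porter algorithm.
    static String toyStem(String term) {
        String t = term.toLowerCase();
        for (String suffix : new String[] {"ation", "ing", "ed", "s"}) {
            // Strip the first matching suffix, keeping at least 3 characters.
            if (t.endsWith(suffix) && t.length() - suffix.length() >= 3) {
                return t.substring(0, t.length() - suffix.length());
            }
        }
        return t;
    }

    // "Index" a text: split on whitespace and store the stemmed terms.
    static Set<String> index(String text) {
        Set<String> terms = new HashSet<>();
        for (String token : text.split("\\s+")) {
            if (!token.isEmpty()) {
                terms.add(toyStem(token));
            }
        }
        return terms;
    }

    // Look up a single query term, with or without stemming it first.
    static boolean search(Set<String> index, String queryTerm, boolean stemQuery) {
        String term = stemQuery ? toyStem(queryTerm) : queryTerm.toLowerCase();
        return index.contains(term);
    }

    public static void main(String[] args) {
        Set<String> idx = index("working on relaxation");
        // Unstemmed query: looks for "working", but the index holds "work".
        System.out.println(search(idx, "working", false));
        // Stemmed query: looks for "work" and matches.
        System.out.println(search(idx, "working", true));
        // "relaxing" and "relaxation" both reduce to "relax" in this toy stemmer.
        System.out.println(search(idx, "relaxing", true));
    }
}
```

In Lucene terms, passing the same analyzer to both the IndexWriter and
the QueryParser plays the role of calling toyStem() on both sides here.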

Best
Erick

On 6/22/07, Robert Walpole <[EMAIL PROTECTED]> wrote:


Hi,

I am using the PorterStemAnalyzer class (attached) to provide stemming
for a Lucene index.

To stem the terms in the index we use the following...

//open an index writer in append mode
IndexWriter idxWriter = new IndexWriter(LUCENE_INDEX_PATH,
        new PorterStemAnalyzer(), false);

//add the lucene document to the index
idxWriter.addDocument(idxDoc);

Having inspected the index using Luke, I can confirm that the terms are
being stemmed as expected. However, I am not clear whether, for this to
work properly, I also need to stem the search terms that users enter.

For example there is a term "relax" in the index which I guess is
stemmed from "relaxation". If the user searches on "relaxing" do I need
to stem the search term in order for it to return the result?

At the moment I am attempting to do this as follows...

analyzer = new PorterStemAnalyzer();
parser = new QueryParser("content", analyzer);
Query query = parser.parse("keywords: relaxing");
Hits hits = idxSearcher.search(query);

...but this is not returning any matches.

Thanks
Rob Walpole
Devon Portal Developer
Email [EMAIL PROTECTED]
Web http://www.devonline.gov.uk



<<PorterStemAnalyzer.java>>


