har c)
{
    return char.IsLetterOrDigit(c);
}
}
DIGY
-----Original Message-----
From: jchang [mailto:jchangkihat...@gmail.com]
Sent: Tuesday, February 02, 2010 11:16 PM
To: java-user@lucene.apache.org
Subject: Re: Can't get tokenization/stop works working
I am using org.apache.lucene.analysis.snowball.SnowballAnalyzer.
Looking through Luke, I see that www.fubar.com was indexed, not fubar. So,
clearly, I'm not stripping out the stop words www and com. Any ideas?
If you make com a stop word then you won't be able to search for it,
but a search for fubar should have worked. Are you sure your analyzer
is doing what you want? You don't tell us what analyzer you are
using.
Tips:
use Luke to see what has been indexed
read the FAQ entry
http://wiki.apache.
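To make the stop-word point above concrete, here is a minimal sketch in plain Java. It is not Lucene's API. It assumes an analysis chain that splits on anything that is not a letter or digit, lowercases each token, and then drops a hypothetical stop set of www and com. The class name StopWordSketch and the method analyze are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class StopWordSketch {
    // Hypothetical stop set, chosen only to mirror the question in this thread.
    static final Set<String> STOP_WORDS = Set.of("www", "com");

    // Splits text on every character that is not a letter or digit,
    // lowercases the pieces, and drops any piece found in STOP_WORDS.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(Character.toLowerCase(c));
            } else {
                flush(current, tokens);
            }
        }
        flush(current, tokens); // emit the last token, if any
        return tokens;
    }

    // Adds the pending token unless it is empty or a stop word, then resets it.
    private static void flush(StringBuilder current, List<String> tokens) {
        if (current.length() > 0) {
            String token = current.toString();
            if (!STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(analyze("www.fubar.com")); // indexed text
        System.out.println(analyze("fubar.com"));     // query text
        System.out.println(analyze("com"));           // stop word only
    }
}
```

Under these assumptions both the indexed text www.fubar.com and the query fubar.com reduce to the single term fubar, so they would match, while a query for com alone reduces to no terms at all. If Luke shows www.fubar.com as one term, the tokenizer in use is not splitting on the dots, and no stop list will help until it does.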
I want to be able to store a doc with a field with this as a substring:
www.fubar.com
And then I want this document to get returned when I query on
fubar or
fubar.com
I assume what I should do is make www and com stop words, and make sure the
field is tokenized, so it will break it up along