FrenchAnalyzer has a stemmer built in. You are seeing the result of that
stemmer in action. If you don't want stemming, take a look at the source
for FrenchAnalyzer and copy it to make your own analyzer; just leave out
the FrenchStemFilter.
- Mark
Jolinar13 wrote:
In the end, I used the StandardAnalyzer with some custom stop words:
le,la,les,l',un,une,des,d',à,au,de,et,en,dans,se,sont,qui,a,est,il,pour,que,du,sa,par,mais,sur,avec,aux,ce,d,s,l,ou,pas,ses
Thanks anyway
Florian
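
For readers who want to see the idea in isolation, here is a minimal sketch of an analysis chain with no stemming step: split into words, lowercase, drop stop words. It is plain Java, not Lucene's API; the class name and the `analyze` method are made up for illustration. The stop list is the one quoted above, minus the apostrophe forms (l', d'), which this toy tokenizer never produces because it splits on every non-letter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy illustration only: this is NOT Lucene's StandardAnalyzer.
class UnstemmedAnalysisSketch {
    // Stop words from the message above. "\u00e0" is the letter a-grave,
    // written as an escape so the source file encoding does not matter.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
        "le", "la", "les", "un", "une", "des", "\u00e0", "au", "de", "et",
        "en", "dans", "se", "sont", "qui", "a", "est", "il", "pour", "que",
        "du", "sa", "par", "mais", "sur", "avec", "aux", "ce", "d", "s",
        "l", "ou", "pas", "ses"));

    // Split on runs of non-letters, lowercase each token, drop stop words.
    // There is deliberately no stemming step, so terms keep their endings.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.split("[^\\p{L}]+")) {
            String term = token.toLowerCase(Locale.ROOT);
            if (!term.isEmpty() && !STOP_WORDS.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // prints [vehicles, giovanni, paris]
        System.out.println(analyze("Les VEHICLES de Giovanni sont \u00e0 Paris"));
    }
}
```

Because nothing is stemmed, "paris" and "giovanni" stay whole, at the cost that "vehicle" and "vehicles" become two different terms.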
Jolinar13 wrote:
It looks like it removes the last letter of a word if the word ends with
'a', 'e' or 'i'.
Femelles => all:femel
Is this expected?
How should FrenchAnalyzer be used?
Thanks
Florian
Jolinar13 wrote:
Some terms I tested:
vehicle => all:vehicl
vehiCle => all:vehicle
Vehicle => all:vehicl
VeHicle => all:vehicle
VEHICLE => all:vehicle
vehicles => all:vehicl
paris => all:par
:S
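
One hypothesis that would be consistent with the outputs listed above (it is not confirmed anywhere in this thread): the analyzer stems before it lowercases, and the stemmer skips any term that has an uppercase letter after its first character. The sketch below is plain Java, not Lucene code. `isStemmable` and `toyStem` are hypothetical helpers, and `toyStem` is a crude suffix stripper, not the real French stemming algorithm.

```java
import java.util.Locale;

// Toy model of "stem first, lowercase afterwards". All names are hypothetical.
class StemThenLowercaseSketch {
    // Assumed rule: stem only all-letter terms whose only uppercase
    // letter, if any, is the first character.
    static boolean isStemmable(String term) {
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (!Character.isLetter(c)) {
                return false;
            }
            if (i > 0 && Character.isUpperCase(c)) {
                return false;
            }
        }
        return true;
    }

    // Crude stand-in for a stemmer: drop a trailing "s", then a trailing
    // 'a', 'e' or 'i', as long as more than three characters remain.
    static String toyStem(String term) {
        String t = term;
        if (t.length() > 3 && t.endsWith("s")) {
            t = t.substring(0, t.length() - 1);
        }
        if (t.length() > 3 && "aei".indexOf(t.charAt(t.length() - 1)) >= 0) {
            t = t.substring(0, t.length() - 1);
        }
        return t;
    }

    // Stemming happens BEFORE lowercasing, so case decides whether a term is stemmed.
    static String analyze(String term) {
        String stemmed = isStemmable(term) ? toyStem(term) : term;
        return stemmed.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String[] terms = {"vehicle", "vehiCle", "Vehicle", "VeHicle",
                          "VEHICLE", "vehicles", "paris", "giovanni"};
        for (String term : terms) {
            System.out.println(term + " => " + analyze(term));
        }
    }
}
```

Under these assumptions, a document containing "VEHICLE" is indexed as "vehicle", while the query "vehicle" is analyzed to "vehicl", so the two never meet. A query such as "vehiCle" escapes stemming and matches.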
Jolinar13 wrote:
Thanks to Luke, I realized my terms were not being analyzed the way I
expected, and this has nothing to do with upper case!
It seems to happen when the word ends with "i". For example, "giovanni"
is indexed as "giovann".
Any ideas about this?
Florian
Jolinar13 wrote:
Hello Mark!
Thanks a lot for your answer.
You were right about Luke: my version was too old. My bad.
But with the newer Luke I still observe the problem I described.
Any idea how to sort this out?
Maybe this has to do with the fact that I use Compass?
Thank you
Florian
markrmiller wrote:
FrenchAnalyzer does lowercase, and using it would not in any way alter
Luke's ability to read your index.
- Mark
Jolinar13 wrote:
Hello Erick,
Still no idea about my problem?
Anybody here using the FrenchAnalyzer?
Thanks,
Florian
Jolinar13 wrote:
Hello,
Thank you for your quick answer.
I use Luke to examine the index, but since I switched to FrenchAnalyzer,
it says 'Not a Lucene index'.
If I open the index files in a text viewer, the strings are in UPPER
case.
I do use the same analyzer to index and search.
So, do I have to configure FrenchAnalyzer to be case-insensitive? How
do I do that?
Thanks a lot
Florian
Erick Erickson wrote:
First, have you gotten a copy of Luke to examine your index and see
what's actually indexed?
The default behavior is usually to lowercase everything, but I'm not
entirely sure whether the French analyzer does this. I suspect so.
Searches are case sensitive. To get caseless searching, you need
to put everything in the same case. This is usually done for you with
any of the standard analyzers, but check specifically.
Are you using the same analyzer at index AND search time?
Best
Erick
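
To make the point about case concrete, here is a small sketch of a toy inverted index in plain Java (not Lucene; every name in it is invented for illustration). Terms are lowercased when they are indexed, so a query only matches if it goes through the same normalization.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index: maps each normalized term to the ids of the documents containing it.
class SameNormalizationSketch {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // The single normalization step shared by indexing and searching.
    static String normalize(String term) {
        return term.toLowerCase(Locale.ROOT);
    }

    // Index time: split on whitespace and store the normalized form of each token.
    void index(int docId, String text) {
        for (String token : text.split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(normalize(token), k -> new TreeSet<>()).add(docId);
        }
    }

    // Search time, same normalization as indexing: case no longer matters.
    Set<Integer> searchNormalized(String term) {
        return postings.getOrDefault(normalize(term), new TreeSet<>());
    }

    // Search time, no normalization: the lookup is an exact, case-sensitive match.
    Set<Integer> searchRaw(String term) {
        return postings.getOrDefault(term, new TreeSet<>());
    }

    public static void main(String[] args) {
        SameNormalizationSketch idx = new SameNormalizationSketch();
        idx.index(1, "VEHICLE rouge");
        System.out.println(idx.searchNormalized("Vehicle")); // prints [1]
        System.out.println(idx.searchRaw("VEHICLE"));        // prints []
    }
}
```

The raw search for "VEHICLE" finds nothing even though that is exactly what the document contains, because only the lowercased form was stored.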
On 5/21/07, Jolinar13 <[EMAIL PROTECTED]> wrote:
Hello,
I tried org.apache.lucene.analysis.fr.FrenchAnalyzer and I got strange
search results on strings in uppercase (example: VEHICLE).
When I search for the string in lower case, I get no results. I get
results if I use "vehicle*" or "vehiclE", or "vehicLe" etc.
What is odd is that it affects only some of the strings, not all of
them.
Has anyone experienced this problem?
Thanks,
Florian
--
View this message in context:
http://www.nabble.com/Very-odd-behaviour-of-FrenchAnalyzer-with-strings-in-capital-letters-tf3789153.html#a10715673
Sent from the Lucene - Java Users mailing list archive at
Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]