Hello fellow Lucener :) Firstly, I'm no tutor, but I will try my best to
explain. If anyone believes I'm wrong, please say so, so our friend doesn't
get the wrong idea.
Here it goes. O_o
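Stemming is reducing a word to a root form, usually by stripping endings
with simple rules, whereas lemmatisation is concerned with linguistics, I
believe: it maps a word to its dictionary form (the lemma).
Lemmatisation would group "go", "gone", "going", "goes" and "went" under
"go", even though "went" looks nothing like the others.
Stemming a word would be, for example, reducing "gone" to "go" (the exact
result depends on the stemmer), so it can be matched to other stemmed words
such as "going", since "going" stemmed would also be "go".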
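To make the difference concrete, here is a rough Python sketch (not Lucene code). Everything in it is made up for illustration: the tiny LEMMAS table stands in for a real dictionary, and the suffix list in toy_stem is a hypothetical rule set, not the Porter algorithm. The point is only that the lemmatiser looks words up, while the stemmer blindly strips endings.

```python
# Toy lemma dictionary. A real lemmatiser uses a full lexicon; this table
# is an assumption made up for the example.
LEMMAS = {"go": "go", "goes": "go", "going": "go", "gone": "go", "went": "go"}

# Toy suffix rules, tried in order. These are NOT the Porter rules.
SUFFIXES = ("ing", "es", "ne")


def toy_lemmatise(word):
    """Look the word up in the dictionary; unknown words are returned as-is."""
    return LEMMAS.get(word, word)


def toy_stem(word):
    """Strip the first matching suffix, keeping at least two letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: -len(suffix)]
    return word


for w in ("go", "goes", "going", "gone", "went"):
    print(w, "lemma:", toy_lemmatise(w), "stem:", toy_stem(w))
```

With these toy rules, "went" gets the lemma "go" from the table, but the stemmer leaves it as "went", because no suffix rule can turn it into "go".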
A better example:
"engineering", "engineers", "engineered", "engineer"
These four words would not match if they were tested for equality.
However, by stemming these words we can reduce them to a more basic form:
engineering --> engineer
engineers --> engineer
engineered --> engineer
engineer --> engineer
Now that the words are stemmed, they will match for equality. So if I search
for the word "engineer" (the query is stemmed the same way), documents on
engineering, engineers and engineered would be returned from a stemmed
index/database.
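Here is a small Python sketch of that idea, again not Lucene code. The suffix rules, the sample documents and the function names (toy_stem, build_index, search) are all assumptions invented for the example; a real analyzer would use a proper stemmer such as Porter's. It stems every token at index time and stems the query the same way at search time.

```python
from collections import defaultdict

# Toy suffix rules, tried in order. Hypothetical, NOT the Porter algorithm.
SUFFIXES = ("ing", "ed", "s")


def toy_stem(word):
    """Lowercase the word and strip the first matching suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Map each stemmed token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[toy_stem(token)].add(doc_id)
    return index


def search(index, query):
    """Stem the query term the same way and return matching ids, sorted."""
    return sorted(index.get(toy_stem(query), set()))


# Made-up sample documents.
DOCS = {
    1: "engineering handbook",
    2: "two engineers",
    3: "well engineered bridge",
    4: "cooking recipes",
}

index = build_index(DOCS)
print(search(index, "engineer"))     # [1, 2, 3]
print(search(index, "engineering"))  # [1, 2, 3]
```

Because both sides go through the same stemmer, it does not matter which form of the word the user types.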
I hope that helps. Do a search for "Porter stemmer" for more information.
_gk
----- Original Message -----
From: "Koji Sekiguchi" <[EMAIL PROTECTED]>
To: <java-user@lucene.apache.org>
Sent: Sunday, November 20, 2005 3:48 PM
Subject: What is stemming?
Hello, Luceners!
What is "stemming"?
I have Lucene in Action and found the following definitions on page 103:
- reducing words to a root form (stemming)
- changing words into the basic form (lemmatization)
but I cannot see the difference between them.
I'm also confused by the following words on page 71:
- convert terms to base word forms (stemming)
It seems "lemmatization" to me...
Could someone explain what "stemming" is?
Some illustrative cases would be much appreciated.
My mother tongue is not English.
Thank you,
Koji
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]