Hello all, We have done some preliminary research on Porter2 and K-stem algorithms and have some questions.
Porter2 was found to be a 'strong' stemming algorithm where it strips off both inflectional suffixes (-s, -es, -ed) and derivational suffixes (-able, -aciousness, -ability). K-Stem seemed to be a weak stemming algorithm as it strips off only the inflectional suffixes (-s, -es, -ed). In IR, it is usually recommended using a "weak" stemmer, as the "weak" stemmer seldom hurts performance, but it usually provides significant improvement with precision. However, Porter2 is the most widely used stemming algorithm AND it is a 'strong' stemmer which is contrary to what is said above. Can you share your ideas, experiences with stemmer algorithms? Thanks in advance. Sibel --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]