Hi Mathieu,

You can't add TokenFilters to an existing Analyzer. However, implementing an Analyzer that acts just like the StandardAnalyzer plus your stemmer is pretty straightforward. StandardAnalyzer.tokenStream() looks like:
  /** Constructs a {@link StandardTokenizer} filtered by a {@link
      StandardFilter}, a {@link LowerCaseFilter} and a {@link StopFilter}. */
  public TokenStream tokenStream(String fieldName, Reader reader) {
    TokenStream result = new StandardTokenizer(reader);
    result = new StandardFilter(result);
    result = new LowerCaseFilter(result);
    result = new StopFilter(result, stopSet);
    // Add your stemming filter here, e.g. result = new PorterStemFilter(result);
    // (or one line above, if your stop word list works off of stemmed words)
    return result;
  }

So just create a new Analyzer that has these same filters, plus your stemming TokenFilter. Looking at the source of SnowballAnalyzer (contrib/snowball) may also be useful.

FWIW, it is not that hard to make a "configurable" analyzer similar to what Solr does, if you find you need to change the filters in your analyzer a lot.
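
To make the "configurable" idea a bit more concrete, here is a rough, self-contained sketch of the pattern: the analyzer holds an ordered list of filter factories and wraps the tokenizer's output with each one in turn. Note that every type in it (Tokens, FilterFactory, ChainAnalyzer) and the toy suffix-stripping "stemmer" are stand-ins invented for illustration. None of them are Lucene or Solr classes, and a real version would wrap TokenStream/TokenFilter instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustration only: stand-in types, NOT the Lucene or Solr API.
class ConfigurableChainSketch {

    /** Stand-in for a token stream: returns the next token, or null when exhausted. */
    interface Tokens {
        String next();
    }

    /** Stand-in for a filter factory: wraps one stream in another. */
    interface FilterFactory {
        Tokens wrap(Tokens input);
    }

    /** Plays the role of the tokenizer: splits the text on whitespace. */
    static Tokens whitespace(String text) {
        String trimmed = text.trim();
        Iterator<String> it = trimmed.isEmpty()
                ? Collections.<String>emptyIterator()
                : Arrays.asList(trimmed.split("\\s+")).iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Lower-cases every token. */
    static final FilterFactory LOWER_CASE = in -> () -> {
        String t = in.next();
        return t == null ? null : t.toLowerCase(Locale.ROOT);
    };

    /** Drops every token contained in the given stop word set. */
    static FilterFactory stop(Set<String> stopWords) {
        return in -> () -> {
            String t;
            while ((t = in.next()) != null && stopWords.contains(t)) {
                // skip stop words
            }
            return t;
        };
    }

    /** Toy "stemmer" (hypothetical): strips a trailing "s". Not a real stemming algorithm. */
    static final FilterFactory TOY_STEM = in -> () -> {
        String t = in.next();
        if (t == null) {
            return null;
        }
        return (t.length() > 3 && t.endsWith("s")) ? t.substring(0, t.length() - 1) : t;
    };

    /** The "configurable analyzer": applies the configured factories in list order. */
    static final class ChainAnalyzer {
        private final List<FilterFactory> factories;

        ChainAnalyzer(List<FilterFactory> factories) {
            this.factories = new ArrayList<>(factories);
        }

        Tokens tokenStream(String text) {
            Tokens result = whitespace(text);
            for (FilterFactory factory : factories) {
                result = factory.wrap(result);
            }
            return result;
        }
    }

    /** Drains a stream into a list, for easy inspection. */
    static List<String> analyze(ChainAnalyzer analyzer, String text) {
        List<String> out = new ArrayList<>();
        Tokens ts = analyzer.tokenStream(text);
        for (String t = ts.next(); t != null; t = ts.next()) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        ChainAnalyzer analyzer = new ChainAnalyzer(Arrays.asList(
                LOWER_CASE,
                stop(new HashSet<>(Arrays.asList("the", "a"))),
                TOY_STEM));
        System.out.println(analyze(analyzer, "The Dogs chase a ball"));
    }
}
```

Changing the chain then only means changing the list passed to the constructor (for instance, moving the stemmer ahead of the stop filter if the stop word list holds stemmed words), which could just as well be read from a config file.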

Cheers,
Grant


On Mar 5, 2007, at 1:25 PM, DECAFFMEYER MATHIEU wrote:


Hi,
This is a very simple question, but I just can't find the resources I need...
I am using the StandardAnalyzer:
StandardAnalyzer stdAnalyzer;
if ((stopWordList != null) && (stopWordList.length != 0)) {
    stdAnalyzer = new StandardAnalyzer(stopWordList);
} else {
    stdAnalyzer = new StandardAnalyzer();
}
What I want to achieve is to be able to use an English stemmer,
but I can't find any method to associate my stemmer with my Analyzer.
I appreciate any help, thank you.

__________________________________

   Mathieu Decaffmeyer
   Web Developer
   Fortis Banque Luxembourg
   50, avenue J. F. Kennedy
   L-2951 Luxembourg
   IS Retail Banking - Web Content Management
   Mobile : 0032  479 / 69 . 42 . 96




--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org

Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/LuceneFAQ

