I forgot about the Solr/Lucene code shuffling. Back in 3.4, WDF was in Solr rather than Lucene. Here's the code:

http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_3_4/solr/core/src/java/org/apache/solr/analysis/WordDelimiterFilter.java?revision=1166268&view=markup

-- Jack Krupansky

-----Original Message----- From: Paul Hill
Sent: Tuesday, June 12, 2012 7:43 PM
To: java-user@lucene.apache.org
Subject: RE: Stemming - limited index expansion

Thanks for the reply.

-----Original Message-----
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: Tuesday, June 12, 2012 1:14 PM
To: java-user@lucene.apache.org
Subject: Re: Stemming - limited index expansion

I don't completely follow precisely what you want to do, but the WordDelimiterFilter is an example of a token filter that outputs an extra token at the same position, such as with its
CATENATE_ALL/WORDS/NUMBERS options.

Thanks for directing me to that. I'm currently using 3.4, and it doesn't appear in the code base of 3.6.
If it doesn't show up until 4.0+ (your link is actually 5.0!), I know that
"Terms are no longer required to be character based. Lucene views a term as an arbitrary byte[]" -- https://builds.apache.org/job/Lucene-trunk/javadoc/changes/Changes.html#4.0.0-alpha.api_changes
But hopefully it is at the right level to suggest how it would be done using the old CharRef instead of whatever the new stuff uses (ByteRef?).
I'll take a look.

Maybe you simply want to internally call some existing stemmer filter and output both the original and
stemmed term at the same location?

Yes, that is very close to what I want to do, possibly with the addition of stemming only a limited set of words (but more than just plurals).
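
The idea discussed above (keep the original term, and for a chosen set of words also emit the stemmed form at the same position) can be sketched in plain Java with no Lucene dependency. Everything in this sketch is hypothetical and only for illustration: the `Token` class, the `toyStem` suffix stripper, and the `stemmable` word set are assumptions, not Lucene APIs. In a real Lucene TokenFilter the "same position" part would correspond to giving the extra token a position increment of 0 through PositionIncrementAttribute.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StemExpansionSketch {

    /** A term plus its position increment; 0 means "same position as the previous token". */
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return term + "(+" + positionIncrement + ")";
        }
    }

    /** Toy stemmer (hypothetical stand-in for a real one): strips a trailing "ing" or "s". */
    static String toyStem(String term) {
        if (term.length() > 5 && term.endsWith("ing")) {
            return term.substring(0, term.length() - 3);
        }
        if (term.length() > 3 && term.endsWith("s")) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    /**
     * Emits every original term with increment 1. For terms in the
     * stemmable set whose stem differs, also emits the stem with
     * increment 0, so it is stacked on the original's position.
     */
    static List<Token> expand(List<String> terms, Set<String> stemmable) {
        List<Token> out = new ArrayList<>();
        for (String term : terms) {
            out.add(new Token(term, 1));
            if (stemmable.contains(term)) {
                String stem = toyStem(term);
                if (!stem.equals(term)) {
                    out.add(new Token(stem, 0));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Only these words are expanded; "always" is left alone even though it ends in "s".
        Set<String> stemmable = new HashSet<>(Arrays.asList("shoes", "walking"));
        List<Token> tokens = expand(
                Arrays.asList("walking", "in", "new", "shoes", "always"), stemmable);
        System.out.println(tokens);
    }
}
```

Because the stem shares a position with the original, phrase and proximity queries should still line up, and the index only grows for words in the limited set.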

-Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
