> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> > -Original Message-
> > From: Weiwei Wang [mailto:ww.wang...@gmail.com]
> > Sent: Sunday, December 13, 2009 12:51 PM
> > To: java-user@lucene.apache.org
> > Subject: Re: Recover special terms from StandardTokenizer
> >
> > LowercaseCharFilter is necessary, as in the MappingCharFilter we need to
> > provid
> -Original Message-
> From: Weiwei Wang [mailto:ww.wang...@gmail.com]
> Sent: Sunday, December 13, 2009 12:23 PM
> To: java-user@lucene.apache.org
> Subject: Re: Recover special terms from StandardTokenizer
>
> thanks, Uwe.
> Maybe i was not very clear. My situation is like this:
> Analyzer:
> -Original Message-
> From: Weiwei Wang [mailto:ww.wang...@gmail.com]
> Sent: Sunday, December 13, 2009 11:43 AM
> To: java-user@lucene.apache.org
> Subject: Re: Recover special terms from St
Problem solved. Now another problem has come up.
I want to use Highlighter in my system, but the token offsets are incorrect
once the MappingCharFilter is applied.
Koji, do you know how to fix the offset problem?
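To illustrate the offset problem described above: once "c++" is replaced by a longer string, positions in the filtered text no longer line up with positions in the original text, so a highlighter working on the original marks the wrong characters. The following is a minimal stand-alone sketch in plain Java, not Lucene code. The class name OffsetMappingSketch and its methods are hypothetical and exist only for this illustration. Lucene's char filters are understood to offer a comparable offset-correction hook (correctOffset), but check that against the documentation for the version in use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: replaces every occurrence of one string
// and records, for each position in the output, the input position it
// originated from.
final class OffsetMappingSketch {
    private final String output;
    private final int[] originalOffsets;

    OffsetMappingSketch(String input, String from, String to) {
        if (from.isEmpty()) {
            throw new IllegalArgumentException("from must not be empty");
        }
        StringBuilder out = new StringBuilder();
        List<Integer> offsets = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            if (input.startsWith(from, i)) {
                // Every character of the replacement points back to the
                // start of the matched text in the input.
                for (int k = 0; k < to.length(); k++) {
                    out.append(to.charAt(k));
                    offsets.add(i);
                }
                i += from.length();
            } else {
                out.append(input.charAt(i));
                offsets.add(i);
                i++;
            }
        }
        // One extra entry so that an exclusive end offset at the very end
        // of the output can be mapped as well.
        offsets.add(input.length());
        this.output = out.toString();
        this.originalOffsets = new int[offsets.size()];
        for (int k = 0; k < originalOffsets.length; k++) {
            originalOffsets[k] = offsets.get(k);
        }
    }

    String output() {
        return output;
    }

    // Maps an offset in the filtered text back to the original text.
    int correctOffset(int outputOffset) {
        return originalOffsets[outputOffset];
    }
}
```

With the input "c++ rocks", the token cplusplus occupies offsets 0 to 9 in the filtered text; passing both ends through correctOffset yields 0 and 3, which is the span of "c++" in the original. An end offset that falls inside a replacement maps to the start of the match, which is a simplification of this sketch.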
On Sun, Dec 13, 2009 at 11:12 AM, Weiwei Wang wrote:
I used Luke to check the result and found that only c exists as a term;
no cplusplus was found in the index.
On Sun, Dec 13, 2009 at 10:34 AM, Weiwei Wang wrote:
Thanks, Koji. I followed your advice and changed my analyzer as shown below:
NormalizeCharMap RECOVERY_MAP = new NormalizeCharMap();
RECOVERY_MAP.add("c++","cplusplus$");
CharFilter filter = new LowercaseCharFilter(reader);
filter = new MappingCharFilter(RECOVERY_MAP,filter);
StandardTokenizer token
Thanks, Koji
On Fri, Dec 11, 2009 at 7:59 PM, Koji Sekiguchi wrote:
MappingCharFilter can be used to convert c++ to cplusplus.
Koji
--
http://www.rondhuit.com/en/
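As a rough illustration of the suggestion above, the sketch below shows why the mapping has to happen before tokenization. It is plain Java rather than Lucene code: the class MappingBeforeTokenizingSketch, its simple split-on-punctuation tokenizer, and the replacement string "cplusplus" are assumptions made for this example and only approximate what StandardTokenizer and MappingCharFilter do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only, not the Lucene API.
final class MappingBeforeTokenizingSketch {

    // Naive tokenizer: splits on anything that is not a letter or digit,
    // so the "++" in "C++" is discarded.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Lowercases first so that "C++" and "c++" both match the mapping key,
    // then rewrites the special term into a form the tokenizer keeps whole.
    static String normalize(String text) {
        return text.toLowerCase(Locale.ROOT).replace("c++", "cplusplus");
    }
}
```

Calling tokenize("I like C++") directly loses the plus signs and leaves the token C, while tokenize(normalize("I like C++")) keeps cplusplus as a single term. Lowercasing before the replacement plays the same role as the LowercaseCharFilter mentioned elsewhere in this thread.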
How about getting the original token stream and then converting c++ to
cplusplus, or applying any other such transform? Or perhaps you might look at
using/extending (in the non-Java sense) some other tokenizer!
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everyb
Hi, all,
I designed an FTP search engine based on Lucene. I made a few
modifications to the StandardTokenizer.
My problem is:
C++ is tokenized as c by StandardTokenizer, and I want to recover it from
the TokenStream that StandardTokenizer produces.
What should I do?
--
Weiwei Wang
Alex Wang
王巍巍
Room