You can easily answer your questions about what WhitespaceTokenizer produces by getting a copy of Luke and looking at your index, or by writing a really simple test program that prints out the tokens.
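Something along these lines would do it (untested, written against the Lucene 2.x-style TokenStream API, so adjust the method names for your version; the field name "f" and the sample text are just placeholders):

    import java.io.StringReader;

    import org.apache.lucene.analysis.Token;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.WhitespaceAnalyzer;

    // Prints the tokens WhitespaceAnalyzer produces for a sample string,
    // so you can see it only splits on whitespace and leaves characters
    // like ( ) : - inside the tokens untouched.
    public class PrintTokens {
        public static void main(String[] args) throws Exception {
            String text = "jason dartling (e-mail) re: fyi.dat";
            TokenStream ts = new WhitespaceAnalyzer()
                    .tokenStream("f", new StringReader(text));
            Token token;
            while ((token = ts.next()) != null) {
                System.out.println("[" + token.termText() + "]");
            }
        }
    }

Run that on your actual input and you'll see exactly what ends up in the index.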
At the bottom of this page is a list of special characters for escaping:
http://lucene.apache.org/java/docs/queryparsersyntax.html
(See also the QueryParser.escape sketch after your quoted message below.)

Best
Erick

On Tue, Sep 16, 2008 at 9:05 AM, miztaken <[EMAIL PROTECTED]> wrote:
>
> Hi there,
> I am using WhitespaceAnalyzer to index documents. I chose it because I
> need to split tokens on spaces only, and the fields are Tokenized=true.
> While indexing, what does it do with special characters like + - && || ! ( )
> { } [ ] ^ " ~ * ? : \ ? Will these characters be indexed or chopped off?
> I am confused about this.
>
> Now I am having a problem while searching as well.
> For query strings like "jason dartling (e-mail)" and "re: fyi.dat" I don't
> have to escape the special characters ( , ) and :, but for input such as
> "re:" the QueryParser throws an error, so I have escaped the characters there.
> So it seems like I have two cases to deal with.
> Can anyone suggest one generic way to handle both cases?
>
> Basically, my generalized question is: how do I index and search strings
> that contain characters which need escaping?
>
> Please help me
> miztaken
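PS: for the searching half of your question, QueryParser has a static escape() helper you can run raw user input through before parsing. A rough, untested sketch (again Lucene 2.x-style; the field name "contents" is just a placeholder):

    import org.apache.lucene.analysis.WhitespaceAnalyzer;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Query;

    // Escapes the raw user input before handing it to QueryParser, so
    // strings like "re:" no longer cause parse errors.
    public class EscapeExample {
        public static void main(String[] args) throws Exception {
            String raw = "re: fyi.dat";
            String escaped = QueryParser.escape(raw);   // -> re\: fyi.dat
            QueryParser parser =
                    new QueryParser("contents", new WhitespaceAnalyzer());
            Query q = parser.parse(escaped);
            System.out.println(q);
        }
    }

Be aware that escape() escapes every QueryParser syntax character, so wildcards, phrases, etc. typed by the user are treated as literal text; if you need to preserve some query syntax you'll have to escape more selectively.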