Re: Indexing punctuation

Peter Pimley Wed, 29 Jun 2005 04:25:03 -0700


I'm not sure how useful this reply is, but hey ;)


<aol>me too!</aol>

I do a vaguely similar thing; I have to strip accents from characters such as e-acute out of both my input data and my incoming search queries to put them into a standard form. I do this with a custom TokenFilter subclass. I have an analyzer that includes this filter along with some of the standard ones (LowerCaseFilter, etc). I run the same analyzer on indexing and searching, which has been discussed in other posts.
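
For illustration, here is a rough sketch of the per-term normalization such a filter might apply. It is plain Java using only java.text.Normalizer, not actual Lucene code; the class and method names (AccentFolding, stripAccents, normalize) are made up for this example, and in a real analyzer the same logic would sit inside the TokenFilter subclass.

```java
import java.text.Normalizer;
import java.util.Locale;

public class AccentFolding {

    // Decompose each character (e-acute becomes 'e' plus a combining
    // acute accent), then drop the combining marks.
    public static String stripAccents(String term) {
        String decomposed = Normalizer.normalize(term, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Hypothetical helper: the single normalization applied to both
    // indexed terms and query terms so that they end up in the same form.
    public static String normalize(String term) {
        return stripAccents(term).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // "Caf\u00e9" is "Cafe" with an e-acute; prints "cafe"
        System.out.println(normalize("Caf\u00e9"));
    }
}
```

The important part is that normalize() is the only path a term takes, whether it comes from a document or from a query.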

My point is that I'm happy with this approach and I'd recommend you do a similar thing, at least as a first attempt.
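
Applied to the product codes in the question, "a similar thing" could look roughly like the sketch below. This is only an assumption about how one might do it: it collapses each code to its letters and digits, it is plain Java rather than Lucene code, and CodeNormalizer / normalizeCode are hypothetical names. It also assumes the codes are handled as single terms, for example in their own field.

```java
import java.util.Locale;

public class CodeNormalizer {

    // Remove everything that is not a letter or a digit (hyphens,
    // underscores, spaces, other punctuation) and upper-case the rest.
    public static String normalizeCode(String raw) {
        return raw.replaceAll("[^\\p{L}\\p{N}]+", "").toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String[] variants = { "21-MA-GAB", "21 MA GAB", "21-MA_GAB", "21MAGAB" };
        for (String v : variants) {
            // Every variant prints as 21MAGAB
            System.out.println(v + " -> " + normalizeCode(v));
        }
    }
}
```

With the same normalization run at index time and at query time, all of the spellings in the question match the same indexed term, so no synonyms are needed for this case.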


Cheers,
Peter Pimley



Aigner, Thomas wrote:

Hello all,

        I am VERY new to Lucene and we are trying out Lucene to see if
it will accomplish the vast majority of our search functions.

        I have a question about a good way to index some of our product
description codes.  We have description codes like 21-MA-GAB and other
punctuation. Our users need to be able to search for "21 MA GAB" or
"21-MA_GAB" or "21MAGAB". Is the best way to accomplish this by
creating synonyms for the 3 different ways when punctuation is in parts
to search for? I know I can stop punctuation in the index but what about
grouping the information together or with spaces?

Thanks all in advance,
Tom


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]