On 22/01/2011 15:43, Koji Sekiguchi wrote:
(11/01/20 22:19), Paul Taylor wrote:
Trying to extend MappingCharFilter so that it only changes a token if
the length of the token
matches the length of singleMatch in NormalizeCharMap (currently the
singleMatch just has to be
found in the token I want ut to match the whole token). Can this be
done it sounds simple enough but
I cannot make any headway understanding the MappingCharFilter source
code
thanks Paul
Paul,
Can you give us a concrete input/output (you wanted) with mapping table
so that I can understand what you want?
Thanks,
Koji
Sure
charConvertMap.add("!!!","ApostropheApostropheApostrophe");
charConvertMap.add("*** ***","StarStarStar");
charConvertMap.add("!","Apostrophe");
Normally, punctuation gets removed during index and searching which is
what I want for good search results but when the token only contains
specific punctuation strings I don't want to remove the punctuation
because it would make it impossible to match, so I convert it to a
textual representation.
As it stands in the 3rd case '!' will be preserved wherever it is found,
so to get a good match on 'Wow!' you would have to search for 'Wow!. But
I want you to be able to search for 'Wow' and it return 'Wow!' which is
the case if "!" isn't in the char convert map, but if you searched for
'!' I want it to return the token which is just '!' which is only the
case if the value is added to the map.
I need to do this because the text we are indexing and searching are
short strings representing an music artist name (there is an artist
called !!!)
thanks Paul
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org