Hello,

Are there any suggestions or best practices for using Lucene to search non-linguistic text? By non-linguistic I mean that it's not English or any other language, but rather product codes. This presents some interesting challenges. Among them is the need for pretty lax wildcard searches. For example, ABC should match ABCD, but so should BCD. It also needs to be agnostic to special characters, so ABC/D should match ABCD as well as ABC-D or "ABC D".
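To make the matching rules concrete, here is a minimal sketch in plain Java. It is not Lucene code, and the class and method names are made up for illustration. It assumes that special characters can simply be dropped before comparing, and that a query matches when it appears anywhere inside the code.

```java
// Hypothetical helper, for illustration only: it states the desired
// matching behaviour, not how Lucene would implement it.
class ProductCodeMatch {

    // Assumption: "special character" means anything that is not a letter
    // or digit. Dropping those makes "ABC/D", "ABC-D" and "ABC D" all
    // normalize to "ABCD".
    static String normalize(String code) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < code.length(); i++) {
            char c = code.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // A query matches a code when the normalized query is a substring of
    // the normalized code. An empty query matches nothing.
    static boolean matches(String query, String code) {
        String q = normalize(query);
        return !q.isEmpty() && normalize(code).contains(q);
    }

    public static void main(String[] args) {
        String[][] pairs = {
            {"ABC", "ABCD"}, {"BCD", "ABCD"}, {"ABC/D", "ABCD"},
            {"ABC/D", "ABC-D"}, {"ABC/D", "ABC D"}, {"ABD", "ABCD"}
        };
        for (String[] p : pairs) {
            System.out.println(p[0] + " vs " + p[1] + ": " + matches(p[0], p[1]));
        }
    }
}
```

Under these assumptions every pair from the examples above matches, while something like ABD against ABCD does not. A linear scan like this is exactly the "contains" behaviour described, so it says nothing about how to make the search fast on a real index.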
As I write an analyzer to handle these cases, I seem to be pretty quickly degrading into a "like '%blah%'" search, with rules to treat all special characters as single-character, optional wildcards. I'm concerned that the performance of this will be disappointing, though. Any help would be much appreciated.

Thanks!
- Jes