I'm working with Lucene 4.0 and I'm not using Lucene's QueryParser, so setAllowLeadingWildcard() is irrelevant. I also realised the issue isn't with querying but with indexing, which leaves out the terms that start with a special character.

My goal is to do fuzzy matching by creating a trigram index. The idea is to tokenize the documents into trigrams rather than words, both when indexing and when searching, so Lucene can match part of a word or phrase.

Say the original text in the document is:

"Sample text with special characters :) and such"

It's tokenized into:

'sam', 'amp', 'mpl', 'ple', 'let', 'ete', 'tex', 'ext', 'xtw', 'twi', 'wit', 'ith', 'ths', 'hsp', 'spe', 'pec', 'eci', 'cia', 'ial', 'alc', 'lch', 'cha', 'har', 'ara', 'rac', 'act', 'cte', 'ter', 'ers', 'rs:', 's:)', ':)a', ')an', 'and', 'nds', 'dsu', 'suc', 'uch'.

The above is the output from my tokenizer, so there's nothing wrong with creating the trigrams. However, when I check the index with lukeall, every trigram is indexed correctly except for the terms ':)a' and ')an'.

Since the missing terms involve Lucene's special characters, I don't think it has to do with my custom code. I only changed the analyser in IndexFiles.java from the demo to index the file. Honestly, I can't even locate the exact class that causes the problem. I'm only guessing that IndexWriterConfig or IndexWriter is discarding the terms with leading special characters.

I hope the above information helps.

2013/1/11 Ian Lea <ian....@gmail.com>

> QueryParser has a setAllowLeadingWildcard() method. Could that be
> relevant?
>
> What version of lucene? Can you post some simple examples of what
> does/doesn't work? Post the smallest possible, but complete, code that
> demonstrates the problem?
>
> With any question that mentions a custom version of something, that
> custom version has to be the prime suspect for any problems.
>
> --
> Ian.
>
> On Thu, Jan 10, 2013 at 12:08 PM, Hankyu Kim <gksr...@gmail.com> wrote:
> > Hi.
> >
> > I've created a custom analyzer that treats special characters just like
> > any other. The index works fine all the time even when the query
> > includes special characters, except when the special characters come at
> > the beginning of the query.
> >
> > I'm using SpanTermQuery and WildcardQuery, and they both seem to suffer
> > the same issue with queries beginning with special characters. Is it a
> > limitation of Lucene or am I missing something?
> >
> > Thanks
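
For reference, here is a minimal sketch of the trigram splitting described above, in plain Java with no Lucene dependency. It is not my actual tokenizer: the names TrigramSketch and trigrams are made up for illustration, and it assumes the text is lowercased and all whitespace is dropped before a 3-character window slides over it, which is what reproduces the list above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not the tokenizer used in this thread.
class TrigramSketch {

    // Assumption: lowercase the input and drop all whitespace, then emit
    // every run of 3 consecutive characters. Punctuation is kept as-is.
    static List<String> trigrams(String text) {
        String s = text.toLowerCase(Locale.ROOT).replaceAll("\\s+", "");
        List<String> out = new ArrayList<String>();
        for (int i = 0; i + 3 <= s.length(); i++) {
            out.add(s.substring(i, i + 3));
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints the trigram list for the sample sentence.
        List<String> grams = trigrams("Sample text with special characters :) and such");
        System.out.println(grams);
    }
}
```

On the sample sentence this yields 38 trigrams, including ':)a' and ')an', so under these assumptions the two missing terms are produced by the splitting step and are lost at some later stage.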