Re: QueryParserUtil, big query with wildcards -> runs endlessly and produces heavy load

2014-06-26 Thread Michael McCandless
The test case is "only" parsing this query, not trying to run it, right? So it doesn't involve automaton/FST ... just the flexible query parser code? It seems bad that flexible QP would take so long, even if the query is "strange". Can you open an issue, and maybe attach a thread dump so we can

Re: QueryParserUtil, big query with wildcards -> runs endlessly and produces heavy load

2014-06-26 Thread Erick Erickson
I suspect you're getting leading wildcard searches as well, which must do entire term scans unless you're doing the reverse trick. Replacing all successive whitespace gives you: Lorem*ipsum*dolor*sit*amet,*consetetur*sadipscing*elitr,*sed*diam*nonumy*eirmod*tempor*invidunt*ut*labore*et*dolore*magn
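
To illustrate the "reverse trick" mentioned above, here is a minimal sketch in plain Java. It is not Lucene code: a sorted `TreeSet` stands in for the term dictionary, and the class and method names (`ReversedTermSketch`, `scanAll`, `seekReversed`) are made up for this example. The idea is that a leading wildcard such as `*or` forces a look at every term, whereas the same question asked of reversed terms becomes a prefix lookup (`ro*`) that can seek straight to the first candidate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Illustrative only: a sorted set of strings stands in for a term dictionary.
class ReversedTermSketch {

    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    // Leading wildcard "*suffix": every term has to be inspected.
    static List<String> scanAll(TreeSet<String> terms, String suffix) {
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (term.endsWith(suffix)) {
                hits.add(term);
            }
        }
        return hits;
    }

    // Build the reversed copy of the dictionary once, at "index time".
    static TreeSet<String> reverseAll(Iterable<String> terms) {
        TreeSet<String> reversed = new TreeSet<>();
        for (String term : terms) {
            reversed.add(reverse(term));
        }
        return reversed;
    }

    // Same query against reversed terms: seek to the reversed suffix,
    // then read until the prefix no longer matches.
    static List<String> seekReversed(TreeSet<String> reversedTerms, String suffix) {
        String prefix = reverse(suffix);
        List<String> hits = new ArrayList<>();
        for (String candidate : reversedTerms.tailSet(prefix, true)) {
            if (!candidate.startsWith(prefix)) {
                break; // sorted order: no later entry can match
            }
            hits.add(reverse(candidate));
        }
        return hits;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(
                Arrays.asList("amet", "dolor", "dolore", "labore", "tempor"));
        System.out.println(scanAll(terms, "or"));
        System.out.println(seekReversed(reverseAll(terms), "or"));
    }
}
```

Both calls find the same terms; the difference is that `seekReversed` stops as soon as the sorted order rules out further matches. The cost is a second, reversed copy of every term.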

Re: QueryParserUtil, big query with wildcards -> runs endlessly and produces heavy load

2014-06-26 Thread Jack Krupansky
I'll defer to the hard-core Lucene committers for the technical details, but I would suggest that a very large term with dozens of wildcards is a "known limitation" (albeit not well documented). IOW, to use wildcards in Lucene in a performant manner, they need to be "brief". -- Jack Krupansky
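
One way to act on the "keep wildcards brief" advice is to check a query string before it ever reaches the parser. The sketch below is an assumption-laden illustration, not part of Lucene: the class `WildcardGuardSketch`, its methods, and the idea of a per-term wildcard limit are all hypothetical, and the right threshold would depend on the application. It counts unescaped `*` and `?` characters in each whitespace-separated term and treats a backslash as escaping the next character.

```java
// Hypothetical pre-parse guard; names and the limit are illustrative.
class WildcardGuardSketch {

    // Count '*' and '?' in a term, skipping characters escaped with '\'.
    static int countWildcards(String term) {
        int count = 0;
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character
                continue;
            }
            if (c == '*' || c == '?') {
                count++;
            }
        }
        return count;
    }

    // True if no whitespace-separated term exceeds the given wildcard limit.
    static boolean isBriefEnough(String query, int maxWildcardsPerTerm) {
        for (String term : query.trim().split("\\s+")) {
            if (countWildcards(term) > maxWildcardsPerTerm) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String query = "Lorem*ipsum*dolor*sit*amet";
        if (!isBriefEnough(query, 2)) {
            System.out.println("Rejecting query: too many wildcards in one term");
        }
    }
}
```

A caller could reject or rewrite such a query up front instead of handing it to the parser.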