So... I know none of this work is possible to contribute back to Lucene because the API I've ended up with is too different, but I thought I would share anyway.
For a query with 10,000 terms:

Before any changes: ~7s

Change 1: Change QueryNodeImpl to hold an immutable list of children and only copy the children when changes are made.
New time: ~4s

Change 2: Change QueryNode/QueryNodeImpl to get rid of getParent() so that it doesn't have to update every time you change the hierarchy.
New time: ~100ms

Change 3: Change QueryNode itself to be mostly-immutable (tags are tricky and not done yet), so that trees of nodes don't have to be cloned.
New time: ~80ms

The next ones on the list...

QueryParserLexer$DFA24.specialStateTransition()  31.157015  11,736 ms (31.2%)  11,736 ms  11,736 ms  11,736 ms
TokenStream.assertFinal()  17.665968  6,654 ms (17.7%)  6,654 ms  6,654 ms  6,654 ms
QueryNodeProcessorImpl.processIteration()  12.397131  4,669 ms (12.4%)  4,669 ms  19,552 ms  19,552 ms

The parser one I probably can't do much about unless a newer version of ANTLR is significantly faster, but that assertFinal() is interesting. I guess this method is fairly expensive, and AnalyzerQueryNodeProcessor is creating a new one over and over again?

TX
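
To illustrate the idea behind changes 1 to 3, here is a minimal sketch of an immutable node with no parent pointer, plus a bottom-up rewrite that only copies a child list when a child actually changed. This is not Lucene's API and not the actual modified code: QueryNodeSketch, Node and rewrite are hypothetical names made up for the example, and tags are left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; none of these types exist in Lucene.
class QueryNodeSketch {

    // Immutable node: no parent reference, so a subtree can be shared
    // between an old tree and a new tree without cloning it.
    static final class Node {
        private final String label;
        private final List<Node> children;

        Node(String label, List<Node> children) {
            this.label = label;
            // Defensive copy wrapped as unmodifiable; never mutated afterwards.
            this.children = Collections.unmodifiableList(new ArrayList<>(children));
        }

        String label() { return label; }
        List<Node> children() { return children; }
    }

    // Bottom-up rewrite. The child list is copied lazily, on the first child
    // that comes back as a different instance. If nothing changes, the
    // original node is passed to 'step' and no new list is allocated.
    static Node rewrite(Node node, UnaryOperator<Node> step) {
        List<Node> old = node.children();
        List<Node> copy = null;
        for (int i = 0; i < old.size(); i++) {
            Node child = old.get(i);
            Node rewritten = rewrite(child, step);
            if (copy == null && rewritten != child) {
                // First change: keep the untouched prefix as-is.
                copy = new ArrayList<>(old.subList(0, i));
            }
            if (copy != null) {
                copy.add(rewritten);
            }
        }
        Node result = (copy == null) ? node : new Node(node.label(), copy);
        return step.apply(result);
    }

    public static void main(String[] args) {
        Node a = new Node("a", Collections.emptyList());
        Node b = new Node("b", Collections.emptyList());
        Node root = new Node("root", Arrays.asList(a, b));

        // A processor that changes nothing hands back the very same tree.
        Node same = rewrite(root, n -> n);
        System.out.println(same == root);

        // A processor that replaces "b" rebuilds only the path to the change;
        // the untouched sibling "a" is shared, not cloned.
        Node changed = rewrite(root,
                n -> n.label().equals("b") ? new Node("B", n.children()) : n);
        System.out.println(changed.children().get(0) == a);
    }
}
```

Because a node never knows its parent, replacing one child never forces an update anywhere else in the tree, which is roughly why dropping getParent() and the cloning would be expected to matter so much for wide queries.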