Re: Crazy increase of MultiPhraseQuery memory usage in Lucene 5 (compared with 3)

2015-08-23 Thread Trejkaz
I spent some time carving out a quick test of the bits that matter and put them up here:
https://gist.github.com/trejkaz/a72b87277b1aec800c2e
The tests index 1,000,000 docs with just one instance of the field/sub-field trick we're using, plus one unique value. So it's a bit of an artificial test,

Crazy increase of MultiPhraseQuery memory usage in Lucene 5 (compared with 3)

2015-08-23 Thread Trejkaz
There is a MultiPhraseQuery we use which looks a bit like:

    MultiPhraseQuery query = new MultiPhraseQuery();
    query.add(new Term[] { "first" });
    query.add(new Term[] { "second1", "second2", ... });

The actual number of terms in this particular case is 207087. The size of the index itsel