Index optimisations for shingles + ngram language modelling

2013-11-11 Thread Matthew Willson
Hi all, So after attending Lucene Revolution (thanks all for some really interesting talks!) I've gotten a renewed interest in using Lucene to do clever things with shingles. The main problems with shingles seem to be that they swell the index size quite a bit and that a lot of time can be spent

Re: Long query optimisation: using some terms for scoring only

2012-12-11 Thread Matthew Willson
improve the performance a lot. For your reference: http://dl.acm.org/citation.cfm?id=956944 But it needs to change the index a little bit. Thanks, On Tue, Dec 11, 2012 at 6:19 AM, Matthew Willson wrote: Hi all, I'm currently benchmarking Lucene to get an understanding of what optimisations ar

Long query optimisation: using some terms for scoring only

2012-12-11 Thread Matthew Willson
Hi all, I'm currently benchmarking Lucene to get an understanding of what optimisations are available for long queries, and wanted to check what the recommended approach is. Unsurprisingly, a naive approach to long queries (just keep adding SHOULD clauses to a big BooleanQuery) scales close to