Which version of Lucene are you using? Newer versions have optimized the "primary key" use case somewhat...
Mike McCandless

http://blog.mikemccandless.com

On Sat, Aug 8, 2015 at 8:32 AM, jamie <ja...@stimulussoft.com> wrote:
> Greetings
>
> Our app primarily uses Lucene for its intended purpose, i.e. to search across
> large amounts of unstructured text. However, recently our requirement
> expanded to perform look-ups on specific documents in the index based on
> associated custom-defined unique keys. For our purposes, a unique key is the
> string representation of a 128-bit murmur hash, stored in a Lucene field
> named uid. We are currently using the TermsFilter to look up Documents in
> the Lucene index as follows:
>
> List<Term> terms = new LinkedList<>();
> for (String id : ids) {
>     terms.add(new Term("uid", id));
> }
> TermsFilter idFilter = new TermsFilter(terms);
> ... search logic...
>
> At any time we may need to look up, say, a couple of thousand documents. Our
> problem is one of performance. On very large indexes with 30 million records
> or more, the lookup can be excruciatingly slow. At this stage, it's not
> practical for us to move the data over to a fit-for-purpose database, nor
> to change the uid field to a numeric type. I fully appreciate the fact that
> Lucene is not designed to be a database; however, is there anything we can
> do to improve the performance of these look-ups?
>
> Much appreciated
>
> Jamie