Mike

Thank you kindly for the reply. I am using Lucene v4.10.4. Are the optimizations you refer to available in this version?

We haven't yet upgraded to Lucene 5 as there appear to be many API changes.

Jamie

On 2015/08/08 5:13 PM, Michael McCandless wrote:
Which version of Lucene are you using?  Newer versions have optimized
the "primary key" use case somewhat...

Mike McCandless

http://blog.mikemccandless.com


On Sat, Aug 8, 2015 at 8:32 AM, jamie <ja...@stimulussoft.com> wrote:
Greetings

Our app primarily uses Lucene for its intended purpose, i.e. to search across
large amounts of unstructured text. However, our requirements recently
expanded to include look-ups of specific documents in the index based on
associated custom-defined unique keys. For our purposes, a unique key is the
string representation of a 128-bit murmur hash, stored in a Lucene field
named uid. We are currently using the TermsFilter to look up documents in
the Lucene index as follows:

List<Term> terms = new LinkedList<>();
for (String id : ids) {
    terms.add(new Term("uid", id));
}
TermsFilter idFilter = new TermsFilter(terms);
... search logic...
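
As a rough illustration of the uid values passed in above: the exact string
encoding of the hash is not stated here, so fixed-width lowercase hex is an
assumption, and UidFormat is a hypothetical helper, not a Lucene class. A
128-bit hash held as two 64-bit halves could be rendered like this:

```java
// Hypothetical helper (not part of Lucene): renders a 128-bit hash, supplied
// as two 64-bit halves, as a fixed-width 32-character lowercase hex string.
// The hex encoding is an assumption; the real uid format may differ.
final class UidFormat {

    private UidFormat() {
    }

    static String toUid(long high, long low) {
        // %016x zero-pads each half to 16 hex digits and formats the long
        // as an unsigned value, so negative halves do not produce a sign.
        return String.format("%016x%016x", high, low);
    }

    public static void main(String[] args) {
        // Example halves chosen arbitrarily for illustration.
        String uid = UidFormat.toUid(-1L, 0x1234L);
        System.out.println(uid + " (" + uid.length() + " chars)");
    }
}
```

With this encoding every uid is a 32-character string, so each Term built in
the loop above carries a value of the same length.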

At any time we may need to look up, say, a couple of thousand documents. Our
problem is one of performance. On very large indexes with 30 million records
or more, the look-up can be excruciatingly slow. At this stage, it's not
practical for us to move the data over to a fit-for-purpose database, nor to
change the uid field to a numeric type. I fully appreciate the fact that
Lucene is not designed to be a database; however, is there anything we can
do to improve the performance of these look-ups?

Much appreciated

Jamie



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
