potential query performance issue

Lin Ma Fri, 15 Mar 2013 10:09:34 -0700

Hello guys,

Suppose I have one million documents, each with hundreds of features. A
query also has hundreds of features. I want to fetch the 1000 most relevant
documents, ranked by the dot product of the query's features and each
document's features (query and document features are in the same feature
space).


I am not sure how Lucene implements this internally. If it has to go through
all one million documents and compute the dot product with the query for
each one, then I am concerned about performance. I would appreciate it if
anyone could (1) confirm how Lucene works internally for this use case, and
(2) suggest any smart ideas for improving query efficiency when selecting
the top 1000 documents.
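
To make the question concrete, here is a minimal Python sketch (not Lucene's
actual code) of the kind of shortcut I have in mind: an inverted index from
feature to documents, so that only documents sharing at least one feature
with the query are ever scored. The names build_index and top_k, and the toy
data, are hypothetical and exist only for illustration; it assumes sparse
features stored as {feature: weight} maps.

```python
import heapq
from collections import defaultdict


def build_index(docs):
    """Map each feature to a postings list of (doc_id, weight) pairs.

    docs is assumed to look like {doc_id: {feature: weight}}.
    """
    index = defaultdict(list)
    for doc_id, features in docs.items():
        for feature, weight in features.items():
            index[feature].append((doc_id, weight))
    return index


def top_k(index, query, k):
    """Return the k (doc_id, score) pairs with the largest dot product.

    Only postings lists for features present in the query are visited,
    so documents with no overlapping feature are never scored.
    """
    scores = defaultdict(float)
    for feature, query_weight in query.items():
        for doc_id, doc_weight in index.get(feature, ()):
            scores[doc_id] += query_weight * doc_weight
    # Select the k best without sorting every scored document.
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Toy data (hypothetical): three documents, one query.
docs = {
    "d1": {"a": 1.0, "b": 2.0},
    "d2": {"b": 1.0, "c": 3.0},
    "d3": {"d": 5.0},
}
index = build_index(docs)
query = {"b": 2.0, "c": 1.0}
print(top_k(index, query, 2))
```

In this sketch d3 is never touched because it shares no feature with the
query. With hundreds of dense features per query, though, most documents
would probably still be visited, which is exactly my concern.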

Thanks in advance,
Lin
