Hello guys, Suppose I have one million documents, each with hundreds of features. A given query also has hundreds of features. I want to fetch the 1000 most relevant documents, ranked by the dot product of the query's features with each document's features (query and document features are in the same feature space).
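To make the setup concrete, here is a minimal sketch of the brute-force baseline I have in mind. It is plain Python, not Lucene, and says nothing about what Lucene actually does. The names `dot` and `top_k_brute_force` are made up for illustration, and I am assuming each feature vector is a sparse map from feature name to weight.

```python
import heapq
from typing import Dict, List, Tuple

# Assumption: a feature vector is a sparse map of feature name -> weight.
SparseVector = Dict[str, float]


def dot(query: SparseVector, doc: SparseVector) -> float:
    """Dot product over the features the two vectors share."""
    # Iterate over the smaller vector and look up in the larger one.
    if len(query) > len(doc):
        query, doc = doc, query
    return sum(w * doc[f] for f, w in query.items() if f in doc)


def top_k_brute_force(
    query: SparseVector, docs: Dict[str, SparseVector], k: int = 1000
) -> List[Tuple[float, str]]:
    """Score every document against the query and keep the k best.

    Returns (score, doc_id) pairs, highest score first.
    """
    scored = ((dot(query, vec), doc_id) for doc_id, vec in docs.items())
    # heapq.nlargest keeps only k candidates in memory at a time,
    # but every document is still scored once.
    return heapq.nlargest(k, scored)
```

With one million documents this computes one million dot products per query, which is the cost I would like to avoid.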
I am not sure how Lucene implements this internally. If it has to go through all one million documents and compute the dot product with the query for each one, then I am concerned about performance. I would appreciate it if anyone could (1) confirm how Lucene works internally for this use case, and (2) suggest any smart ideas for improving query efficiency when selecting the top 1000 documents. Thanks in advance, Lin