Slowdown during the search for similar documents

Valery Khamenya Sat, 27 Mar 2010 10:23:23 -0700

Hi,

there is a strange slowdown during the search for similar documents.
For some reason the PyLucene version is much slower than the pure Lucene one.
The test document collection contains 200K docs.


Here is the PyLucene version:

content = ref_doc.getField('content').stringValue()
similarity_query = SimilarityQueries.formSimilarQuery(
    content, default_analyzer, 'content', None)
search = index.search(similarity_query, 200)
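
To narrow down where the time goes, the query construction and the search itself could be timed separately. This is only a minimal sketch: the `timed` helper is hypothetical (it is not part of PyLucene), and `form_query` and `run_search` are stand-ins for the real calls so that the snippet runs on its own.

```python
import time

def timed(label, fn, *args):
    """Call fn(*args), print the elapsed wall-clock time, return (result, seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    print('%s: %.3f s' % (label, elapsed))
    return result, elapsed

# Hypothetical stand-ins for the two PyLucene steps, used here only
# so the sketch is self-contained.
def form_query(text):
    return text.split()

def run_search(query, limit):
    return query[:limit]

query, t_query = timed('form query', form_query, 'some document content')
hits, t_search = timed('search', run_search, query, 200)
```

In the real code, `form_query` would wrap the `SimilarityQueries.formSimilarQuery(...)` call and `run_search` would wrap `index.search(...)`, which would show whether the slowdown sits in building the query or in running it.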

Any comments on why this happens, and any ideas on how to fix it, are welcome.

best regards
--
Valery A.Khamenya
