Mh, now I found out that latests sdks provide a priority queue too, although
unbounded. But, supposing the queue has already top-n elements, I could
compare the current heap with the head (O(1)), if it is smaller or equal
just discard it, and another way remove the head ((O(log(top-n))) and insert
the current element ((O(log(top-n)) again). In overall, the entire algorithm
will be O(n log(top-n)) as wished. Do you think this is a good idea? Do you
know of a better way to achieve the same recurring to lucene arcana?

Cheers,
Carlos

http://java.sun.com/j2se/1.5.0/docs/api/java/util/PriorityQueue.html

On 6/1/07, Carlos Pita <[EMAIL PROTECTED]> wrote:

Hi all,

I need to find a couple of result sets at the same time from the same
search-criteria. The two sets are sorted according different sort-criteria.
From both I need just the few top N results, but anyway, because of business
rules, I need to process the entire hit set for the search. Till now I have
implemented this search for just one result set on top of custom
TopFieldDocCollector, Filter and Sort (w/ custom ScoreDocComparator). But
now, having to generate the two different sets, I see two choices:

1) Keep the same implementation, but search twice, switching the Sort on
each search. Here not only the search is done twice but the HitCollector
iterates along all documentIds twice also.

2) Instead of a TopFieldDocCollector use some kind of bounded priority
queue optimized for top-N results. With a HitCollector, a Filter and two of
these queues it's easy and efficient to find both result sets on one simple
pass. Do you know of a good implementation of this data structure? This
seems nice
http://www.alias-i.com/lingpipe/docs/api/com/aliasi/util/BoundedPriorityQueue.htmlbut
 it has some license issues. Is there something similar that I can borrow
from lucene sources?

Any other suggestion?

Thank you very much in advance.
Regards,
Carlos


Reply via email to