Your request seems to be a two steps query. First step, you select image, and then collection Second step, you sort collection.
BitVector can help you? M. Antoine Baudoux a écrit : > Hi, > > I'm developping an image database. Each lucene document > representing an image contains (among other fields ): > > - a date field > - a collection field containing the ID of the collection the image > belongs to. > > I want to be able to give a score to each collection. Collections > with a higher score appear first in the results. I want to avoid > re-indexing all the documents each time i change my collection scores. > > For example on day 1 I decide to give collection #1 a 5 score and > collection #3 a 10 score --> images belonging to collection #3 appear > first in search results. > One day 2 i give collection #3 a 2 score --> images belonging to > collection #1 appear first in search results. > > I have read the lucene docs, and from what i understand there are > many ways to achieve what I want : > > > - I can use a Very big Boolean query (OR query in fact) containing > one TermQuery per collection ID, setting the correct boost factor for > each termquery. The problem with this is that i have 300 collections, > so i have a boolean query with 300 terms that i append to each query i > make. I am afraid that it will be slow. > > - I can use a ValueSourceQuery, where for each document i compute > a custom score based on the value of the collection field. Will it be > faster than the first solution? > > - I can do advanced things such as writing a custom HitCollector, > or a custom Query. > > - I can add another field to each document, containing a computed > custom score, then i could sort on that field. But i want to avoid > this solution at all costs, since it would mean re-indexing all the > documents each time the collection scores change. > > What solution do you suggest? Is there another solution that i > didnt mention? > > More recent documents should also come first : In fact the final > sorting should be a ponderated sum between the collection score of an > image and the date of an image : most recent images from the > best-scored collections come first, then most recent from less-scrored > collections, then less recent from best scored, and so on. I would > also like to be able to adjust the balance between date/collection score. > > What solution do you suggest? > > > I would also like to implement random-sorting. My solution is : i > create 12 new fields R1 -> R12 for each document, each containing a > random number between 1 and 12. To get a random sort, i sort each day > with a different combination of R1 .. R12. For example : > > Day 1 : i sort by R1 then R4 then R5.. > Day 2 : i sort by R10 then R9 then R2.... > etc... > > Is it a good solution? Is there another way to do it? > > > Very big thx in advance for your answers. > > Antoine > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]