Not sure if I fully get it, but bear with me...
Inline below.
On Oct 6, 2008, at 11:37 PM, Alexander Devine wrote:
Hi Luceners,
I have a particular sorting problem and I wanted some advice on what the
best implementation approach would be. We currently use Lucene as the
searching engine for our vacation rental website. Each vacation rental
property is represented by a single document in Lucene. We want to add a
feature that allows users to sort results by price. The problem is that each
rental property can potentially have a different price for each day. For
example, many rental properties charge more on weekends, or higher rates for
on-season vs. off-season. When a user performs a search, they can specify
the start and end dates of their desired travel, so we should be able to
calculate the total price for that time period for each property by adding
up the price for each day.
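To make the per-day summation concrete, here is a minimal sketch in plain Java, independent of Lucene. The class name StayPricing, the method totalCents, and the choice to store rates as cents keyed by date are all assumptions for illustration, not anything from the original post. It also assumes the end date is the checkout day and is not charged.

```java
import java.time.LocalDate;
import java.util.Map;

// Hypothetical helper: not part of Lucene or the poster's code.
final class StayPricing {
    /**
     * Sums the nightly rate for every date in [checkIn, checkOut).
     * Assumption: the checkout day itself is not charged.
     */
    static long totalCents(Map<LocalDate, Long> nightlyCents,
                           LocalDate checkIn, LocalDate checkOut) {
        long total = 0;
        for (LocalDate d = checkIn; d.isBefore(checkOut); d = d.plusDays(1)) {
            Long rate = nightlyCents.get(d);
            if (rate == null) {
                // A missing rate means the property cannot be priced for this stay.
                throw new IllegalArgumentException("no rate for " + d);
            }
            total += rate;
        }
        return total;
    }
}
```

Whatever mechanism ends up doing this inside the search (payloads, a function query, or something else), this is the value it has to produce per property.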
I was thinking of implementing this using payloads. Each document would have
a "priceByDate" field and there would be one term stored for each day for
the next 2 years (which is as far out as we support booking a property).

Don't you then have to update every document every day?

The payload associated with each term would be the price, and then when
searching I could use those payloads as the basis for scoring the docs using a
BoostingTermQuery. For example, suppose someone was searching for travel
dates Dec 1 - Dec 5, I would create a query like so:
    BooleanQuery query = new BooleanQuery();
    query.add(new PriceBoostingTermQuery(new Term("priceByDate", "20081201")),
              BooleanClause.Occur.MUST);
    query.add(new PriceBoostingTermQuery(new Term("priceByDate", "20081202")),
              BooleanClause.Occur.MUST);
    query.add(new PriceBoostingTermQuery(new Term("priceByDate", "20081203")),
              BooleanClause.Occur.MUST);
    query.add(new PriceBoostingTermQuery(new Term("priceByDate", "20081204")),
              BooleanClause.Occur.MUST);
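The per-day term strings in a query like this could be generated from the travel dates rather than written out by hand. A minimal sketch in plain Java follows; the class PriceDateTerms and method termsFor are hypothetical names, and the half-open range (checkout day excluded) is an assumption chosen to match the four terms shown for Dec 1 - Dec 5.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds the "yyyyMMdd" term text for each night of a stay.
final class PriceDateTerms {
    static List<String> termsFor(LocalDate checkIn, LocalDate checkOut) {
        List<String> terms = new ArrayList<>();
        // Assumption: the checkout day is excluded, so Dec 1 - Dec 5 yields four terms.
        for (LocalDate d = checkIn; d.isBefore(checkOut); d = d.plusDays(1)) {
            terms.add(d.format(DateTimeFormatter.BASIC_ISO_DATE));
        }
        return terms;
    }
}
```

Each returned string would then be wrapped in a Term on the "priceByDate" field and added to the BooleanQuery as a MUST clause.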
PriceBoostingTermQuery would be a subclass of BoostingTermQuery that
overrides getSimilarity() to return a Similarity with a custom scorePayload
method that scores based on the price.
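A custom scorePayload needs some agreed byte layout for the price. The sketch below shows one possible layout in plain Java, without touching Lucene classes: the price as whole cents in a 4-byte big-endian int. The class PricePayload, both method names, and the layout itself are assumptions for illustration; the offset and length parameters are there because a payload is usually handed over as a slice of a larger byte array.

```java
import java.nio.ByteBuffer;

// Hypothetical codec for a price payload; the layout is an assumption, not a Lucene convention.
final class PricePayload {
    /** Encodes a price in whole cents as 4 big-endian bytes. */
    static byte[] encode(int cents) {
        return ByteBuffer.allocate(4).putInt(cents).array();
    }

    /** Decodes a price from the given slice of a payload buffer. */
    static int decode(byte[] payload, int offset, int length) {
        if (length != 4) {
            throw new IllegalArgumentException("expected 4 payload bytes, got " + length);
        }
        return ByteBuffer.wrap(payload, offset, length).getInt();
    }
}
```

One caveat with returning the decoded price as a score: Lucene scores are floats, so integer values are only exact up to 2^24, and other scoring factors may be multiplied in before the caller sees the result.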
Does this approach sound reasonable? Can anyone think of a better approach?
One thing I don't understand is that the score needs to be the SUM (or the
average) of all the payloads - how does the BooleanQuery handle that? Also,
I need to get the calculated total price back to the caller, and I'm worried
that making priceByDate a stored field will have negative performance
implications. Perhaps there is some way I could just return the calculated
price as the score and then get it from the ScoreDoc?
Thanks for any and all help, and a huge thank you to all the Lucene devs for
a great product. The reason I'm trying to solve this problem in Lucene
instead of a database is because Lucene is so much faster for our queries
and I don't want to add a DB into the mix.
I think you would be better off with a Function Query (see the
org.apache.lucene.search.function package), but I am not sure.
How do you calculate the cost of the rental? Is there some way to
just factor that into the scoring process? I think if you could do
this, then you could implement a Function Query to do so.
You might look at Solr's function query capabilities as well.
Alex
--------------------------
Grant Ingersoll
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ