I think the right solution for this would use "payloads", where extra data can be added for each index token. However, Lucene currently does not support this. Without this, I can think of two options, each with its own disadvantage:

1) More tokens at indexing time: decide on the resolution of the percentage (say it is 5%) and add one copy of the token per 5% step. For example, the attributes field for product A in your example would look like:

   "2 2 2 2 2 2 2 2 2 2 5 5 5 5 5 5 3 3 13 13 13 13 13 13"

2) More tokens at search time: at indexing, include the percentage in the token. So for product A you would have:

   "2x50 5x30 3x10 13x30"

   At search time, expand the query accordingly. So the query for attribute 5 would be expanded to:

   "5x5^5 5x10^10 5x15^15 5x20^20 ... 5x95^95 5x100^100"

The first approach would enlarge the index, so if you have lots of data that could eventually be a problem. The second approach would end up with a large query, so, again, if you have lots of data that could eventually be a problem at search time.

Also, depending on how strict you want the scoring to be, you may want to omit norms for this field.

Hope this helps,
Doron

mmoser <[EMAIL PROTECTED]> wrote on 15/12/2006 13:17:05:

>
> So, I am still new to Lucene, so please take this into consideration when
> reading this. Up until now, a novice like myself has been able to finagle
> Lucene into doing what we want. But now we have a problem that I have been
> searching for the answer to. We allow users to profile our products with a
> predetermined profile attribute id. We then want to take all the users
> profiles on a product and take a particular number of times that this
> particular profile attribute id has been chosen and come out with a
> percentage for it. This is no problem. Where the problem comes into play is
> that we want the user to be able to search for products that match that
> particular profile attribute id. We want the higher percentages to come up
> on top. To add to the complexity, we want to be able to allow for the user
> to select multiple profile attribute ids and still have a combination of the
> score to come up higher.
> Keep in mind, we would like to somehow keep these
> in one field, because we are trying to use the same algorithm for something
> that could potentially become very large. Any suggestions. The more detail,
> the better.
>
> Example:
>
> Product A
> Attribute ID = 2 Percentage Chosen = 50%
> Attribute ID = 5 Percentage Chosen = 30%
> Attribute ID = 3 Percentage Chosen = 10%
> Attribute ID = 13 Percentage Chosen = 30%
>
> Product B
> Attribute ID = 1 Percentage Chosen = 50%
> Attribute ID = 2 Percentage Chosen = 20%
> Attribute ID = 3 Percentage Chosen = 75%
>
> So if a user selected the attributes that correspond to 2 and 3, then
> Product B should show up before Product A because it has a combined score of
> 95% and A has a combined score of 40%.
>
> Thanks for any help.
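
The two encodings suggested in the reply (repeated tokens at indexing time, or percentage-tagged tokens plus an expanded query at search time) can be sketched as plain string builders. This is only an illustration in Python and does not touch the Lucene API: the function names are hypothetical, the 5% resolution is the one used in the reply, and rounding a percentage to the nearest step is an assumption, since the reply does not say how to handle values that fall between steps.

```python
def to_steps(pct, resolution=5):
    """Round an integer percentage to the nearest number of steps (assumed rule)."""
    return (pct + resolution // 2) // resolution


def index_tokens(percentages, resolution=5):
    """Option 1: repeat each attribute id once per `resolution` percent."""
    tokens = []
    for attr_id, pct in percentages:
        tokens.extend([str(attr_id)] * to_steps(pct, resolution))
    return " ".join(tokens)


def index_tokens_with_pct(percentages, resolution=5):
    """Option 2, index side: one '<id>x<pct>' token per attribute.

    The percentage is snapped to the step grid so that it can match one
    of the terms produced by expand_query(); attributes that round to
    zero steps are skipped.
    """
    tokens = []
    for attr_id, pct in percentages:
        steps = to_steps(pct, resolution)
        if steps > 0:
            tokens.append(f"{attr_id}x{steps * resolution}")
    return " ".join(tokens)


def expand_query(attr_id, resolution=5):
    """Option 2, search side: one term per step, boosted by its percentage."""
    return " ".join(
        f"{attr_id}x{pct}^{pct}" for pct in range(resolution, 101, resolution)
    )


# Product A from the question: (attribute id, percentage chosen)
product_a = [(2, 50), (5, 30), (3, 10), (13, 30)]

print(index_tokens(product_a))
print(index_tokens_with_pct(product_a))
print(expand_query(5))
```

With option 2, the percentage written into the index has to land on the same grid that the query expansion enumerates; a token such as "5x33" would never match any of the expanded terms, which is why the sketch snaps values to the step before building the token.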