Hi Rusty,

Thanks for the answer.

We have indexed the following JSON object:

    {
      "@class": "com.starsite.data.Answer",
      "answer_text": "momo is the best nepalese food",
      "keywords": null,
      "metaDescription": null,
      "post_date": null,
      "id": "202ba4ac-0fd3-4709-ba84-463e0caa413c",
      "version": 1,
      "scope": [ "type|com.starsite.data.Answer" ]
    }

We issued the following query:

    answer_text: "food"

and the data we got in keydata was as follows:

    [{"p":[4,0],"score":[4.855199135883779,1.8398742574541822]}]

What does 0-indexing mean? If the scoring in riak-search is based on a vector-space model like in Lucene, I was expecting the scores to be normalized between 0 and 1.

As for the position information, I assume the words 'is' and 'the' are removed as part of stopword removal. If they're not removed, the position should have been 5. If they are removed, the position should have been 3. The word "food" occurs only once. Shouldn't we be getting just one position?

Thanks,
Archana

On Aug 5, 2011, at 11:08 AM, Rusty Klophaus wrote:

Hi Archana,

Yes, the 'p' attribute is positional information. That list is indicating that the term occurs on the 0th and 43rd positions in the document, and is 0-indexed. Not sure why you are getting two positions if the word only occurred once. What was the original query?

The scoring information that you see is a bug. For now, as a workaround, you can add the scores together. This will give you a *relative* score, allowing you to rank results for the current query. To fix this issue, some processing needs to happen within Riak to combine and normalize the scores into a final score that can be used for correct ranking against other queries as well. (This is being done for the Solr interface, but not the Map/Reduce interface.)
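The workaround above (adding the scores together) can be sketched in Python roughly as follows. This is only an illustration, not part of Riak: it assumes the keydata has already been decoded into a list of objects shaped like the ones quoted in this thread, and the function name `relative_score` and the labels "doc_a" and "doc_b" are hypothetical. It also assumes both results came from the same query, since the summed score is only meaningful for ranking within one query.

```python
import json

def relative_score(keydata):
    """Sum every value in each entry's 'score' list into one relative score."""
    return sum(sum(entry.get("score", [])) for entry in keydata)

# keydata values copied from this thread; the labels are hypothetical and
# we assume, for illustration only, that both came from the same query.
results = {
    "doc_a": json.loads('[{"p":[4,0],"score":[4.855199135883779,1.8398742574541822]}]'),
    "doc_b": json.loads('[{"p":[43,0],"score":[5.3669048584479,1.7201627119528418]}]'),
}

# Rank by summed score, highest first. The values are not normalized, so
# they should not be compared against results of a different query.
ranked = sorted(results, key=lambda k: relative_score(results[k]), reverse=True)

for key in ranked:
    print(key, relative_score(results[key]))
```

The same summation could be done inside the first map phase instead of on the client; the client-side version is shown only because it is easy to run on its own.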
Riak Search models scoring after Lucene as much as possible, so you can read this for more information about scoring, especially the final normalization step:

http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html

This issue is tracked in https://issues.basho.com/show_bug.cgi?id=1154

Best,
Rusty

On Thu, Aug 4, 2011 at 3:27 PM, Archana Bhattarai <abhatta...@sharecare.com> wrote:

Hi Rusty,

Thanks a lot for the answer. We could get some data in the keydata as follows:

    [{"p":[43,0],"score":[5.3669048584479,1.7201627119528418]}

But we couldn't interpret exactly what it represents. I believe p is giving positional information. But why does it have two entries when the word we searched for occurred only once in the document? Does the position ignore stopword positions and just count other words? Also, why are there two scores? Isn't the score normalized? Or am I doing something wrong to get these scores?

Thanks a lot in advance,
Archana

On Jul 22, 2011, at 11:09 AM, Rusty Klophaus wrote:

Hi Archana,

Yes. When you use a search query to initiate a map/reduce job, the scores are fed into the first phase as keydata, along with other metadata about the search result, including positional information and any inline fields. More information in the links below:

* http://wiki.basho.com/Riak-Search---Querying.html#Querying-Integrated-with-Map-Reduce
* http://wiki.basho.com/MapReduce.html (search for "keydata")

Best,
Rusty

On Fri, Jul 22, 2011 at 10:53 AM, Archana Bhattarai <abhatta...@sharecare.com> wrote:

Hi,

Is there a way to get back the score while querying via the Solr interface or, ideally, MapReduce over search? It looks like the Solr interface only supports sorting.
Thanks in advance,
Archana

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

--
Rusty Klophaus
Basho Technologies, Inc.
11921 Freedom Drive, Suite 550
Reston, VA 20190
www.basho.com