Re: Help Relevance Feedback (Rocchio) with lucene

2016-06-28 Thread Ahmet Arslan
Hi Andres, While there can be other ways, in general term vectors are used to extract "important terms" from top-k documents returned by the initial query. Please see getTopTerms() method in http://www.cortecostituzionale.it/documenti/news/advancedluceneeu_69.pdf Ahmet On Tuesday, June 28, 20
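For readers following this thread, here is a minimal sketch of the Rocchio update itself, assuming the "important terms" have already been pulled out of the top-k documents as term-to-weight maps. The class name RocchioSketch, the method expand, and the alpha/beta/gamma parameters are hypothetical names chosen for illustration; this is plain Java and not a Lucene API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative Rocchio query expansion over sparse term-weight vectors. */
final class RocchioSketch {

    /**
     * Computes alpha*query + beta*centroid(relevant) - gamma*centroid(nonRelevant)
     * and drops terms whose final weight is not positive.
     */
    static Map<String, Double> expand(Map<String, Double> query,
                                      List<Map<String, Double>> relevant,
                                      List<Map<String, Double>> nonRelevant,
                                      double alpha, double beta, double gamma) {
        Map<String, Double> result = new HashMap<>();
        query.forEach((term, w) -> result.merge(term, alpha * w, Double::sum));
        addCentroid(result, relevant, beta);
        addCentroid(result, nonRelevant, -gamma);
        result.values().removeIf(w -> w <= 0.0);
        return result;
    }

    /** Adds factor * (average of the document vectors) into target. */
    private static void addCentroid(Map<String, Double> target,
                                    List<Map<String, Double>> docs, double factor) {
        if (docs.isEmpty()) {
            return;
        }
        for (Map<String, Double> doc : docs) {
            doc.forEach((term, w) -> target.merge(term, factor * w / docs.size(), Double::sum));
        }
    }
}
```

In a Lucene setting the document maps would presumably be built from term vectors, and the expanded map would be turned back into a weighted query; both steps are left out here.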

Help Relevance Feedback (Rocchio) with lucene

2016-06-28 Thread Andres Fernando Wilches Riano
Hello, I want to implement Rocchio with Lucene. Does anybody have an idea how to do it? Thanks. -- Sincerely, *Andrés Fernando Wilches Riaño* Systems and Computing Engineer, Master's student in Systems and Computing Engineering, Teaching Assistant, Universidad Nacional de Colombia

Re: Top terms relevance from specific documents ?

2016-01-27 Thread Ahmet Arslan
Hi Yannick, More like this (mlt) stuff does this already. It extracts "interesting terms" from top N documents. Don't remember but this feature may require "term vectors" to be stored. Ahmet On Wednesday, January 27, 2016 10:41 AM, Yannick Martel wrote: Le Tue, 15 Dec 2015 17:56:05 +0100, Ya

Re: Top terms relevance from specific documents ?

2016-01-27 Thread Yannick Martel
On Tue, 15 Dec 2015 17:56:05 +0100, Yannick Martel wrote: > Hi! > > I am using (Java) Lucene for indexing data, and I want to produce > a kind of tag cloud for specific data. > > I've found HighFreqTerms to get a top list of terms from *all > documents* (if I have understood correctly) (by the

Top terms relevance from specific documents ?

2015-12-15 Thread Yannick Martel
Hi! I am using (Java) Lucene for indexing data, and I want to produce a kind of tag cloud for specific data. I've found HighFreqTerms to get a top list of terms from *all documents* (if I have understood correctly) (by the way, I overrode it to be able to filter on several fields instead of only on

Help for Implementing Most relevance Search algorithm in lucene for my project

2014-12-23 Thread Nitin Chauhan
Hi, I wanted to implement "most relevant search" in Lucene for my project. I am currently using the Lucene provided by Hybris 5.3 i.e. Lucene 4.6.1. The scenario is that I have type ahead functionality (autosuggest) implemented already in the project so when user starts typing in the input box,

RE: How to properly correlate relevance in a search across multiple collections

2014-09-09 Thread Baldwin, David
o includes proper ranking during the merge. I would have normally assumed that, but given the discussions we are having here, I am doubting that the merged results are actually merged in any reasonable way as to provided relevance merging and relationships as well. I hope I am wron

RE: How to properly correlate relevance in a search across multiple collections

2014-09-09 Thread Vincent Sevel
Hi, Does someone know if the source of the jira issues search example is available: http://jirasearch.mikemccandless.com/ thanks, vince

RE: How to properly correlate relevance in a search across multiple collections

2014-09-08 Thread atawfik
for almost three years. However, probably the simple workarounds suggested above might do the job. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-properly-correlate-relevance-in-a-search-across-multiple-collections-tp4157240p4157555.html Sent from the Lucene - Java Us

RE: How to properly correlate relevance in a search across multiple collections

2014-09-08 Thread Baldwin, David
, September 08, 2014 10:31 AM To: java-user Subject: Re: How to properly correlate relevance in a search across multiple collections I think the point got lost in the discussion. Raw scores are simply _not_ comparable from different collections. They aren't even comparable for different queri

RE: How to properly correlate relevance in a search across multiple collections

2014-09-08 Thread Baldwin, David
java-user@lucene.apache.org Subject: Re: How to properly correlate relevance in a search across multiple collections An observation: df and IDF (document frequency) is a key driver of the whole relevancy framework on which stock Lucene is based. There is no question about its significant value. But... that mea

Re: How to properly correlate relevance in a search across multiple collections

2014-09-08 Thread Erick Erickson
n using the raw > score from each separate collection to order and then after a merge come up > with relevancy? > > -Original Message- > From: atawfik [mailto:contact.txl...@gmail.com] > Sent: Sunday, September 07, 2014 9:50 AM > To: java-user@lucene.apache.org >

RE: How to properly correlate relevance in a search across multiple collections

2014-09-08 Thread Baldwin, David
-user@lucene.apache.org Subject: Re: How to properly correlate relevance in a search across multiple collections Hi, if you have documents that might exist in multiple collections, then you can use techniques from meta search. That is combining multiple search results from different collections

Re: How to properly correlate relevance in a search across multiple collections

2014-09-07 Thread atawfik
documents by using some aggregation methods. It is known that using the sum of relevance scores produces good results. If there are no shared documents between collections, you still can use the same approach but using different aggregation methods. One method is round robin. You start by selecting
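The description of round robin is cut off in this snippet, so what follows is only a sketch of the usual textbook form: take the rank-1 hit of every collection, then every rank-2 hit, and so on. The class RoundRobinMerge and its parameters are hypothetical, and skipping repeated document ids is an added assumption, not something the message states.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative round-robin interleaving of per-collection ranked result lists. */
final class RoundRobinMerge {

    /**
     * Takes rank 1 from every list, then rank 2, and so on, until limit
     * results are collected. A document id that appears twice is kept once.
     */
    static List<String> merge(List<List<String>> rankedLists, int limit) {
        Set<String> merged = new LinkedHashSet<>();
        int maxLength = 0;
        for (List<String> list : rankedLists) {
            maxLength = Math.max(maxLength, list.size());
        }
        for (int rank = 0; rank < maxLength && merged.size() < limit; rank++) {
            for (List<String> list : rankedLists) {
                if (rank < list.size() && merged.size() < limit) {
                    merged.add(list.get(rank));
                }
            }
        }
        return new ArrayList<>(merged);
    }
}
```

Because only ranks are used, this sidesteps the problem that raw scores from different collections are not comparable.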

Re: How to properly correlate relevance in a search across multiple collections

2014-09-06 Thread Jack Krupansky
, but it can include other factors, but simply limited to the contents of the document itself) to sidestep these corpus-dependent scores. In other words, the score of the document could depend on only the contents of the document itself, not the corpus. Yes, that's a major loss of relevanc

How to properly correlate relevance in a search across multiple collections

2014-09-05 Thread Baldwin, David
I have a project with multiple collections (dozens at times) where a single result set needs to be generated by applying the same search criteria to each collection directory and then correlating all the sub-searches into a single result set with correlated relevance. Does

Re: Relevance ranking calculation based on filtered document count

2013-07-01 Thread Nigel V Thomas
measurements showed quite significant change in MAP values, I would have expected there to be no change if relevance scores were calculated based on filtered document count, instead of system wide term stats. See results here : http://goo.gl/BI4fv Of course, this bug/feature leads to some

Re: Relevance ranking calculation based on filtered document count

2013-07-01 Thread Jack Krupansky
The very definition of a "filter" in Lucene is that it doesn't influence relevance/scoring in any way, so your question is a contradiction in terms. If you are finding that the use of a filter is affecting the scores of documents, then that is clearly a bug. -- Jack Krupansky

Relevance ranking calculation based on filtered document count

2013-07-01 Thread Nigel V Thomas
Hi, I would like to know whether it is possible to calculate the relevance ranks of documents based on the filtered document count. The current filter implementations, as far as I know, seem to be applied after the query is processed and ranked against the full set of documents. Since system wide IDF

Open Relevance Project.

2012-08-08 Thread Sachin Kulkarni
Dear All, I was wondering if the Open Relevance Project(ORP) is currently active and available for users. I just installed Lucene and was hoping to use the ORP to do some relevance testing and work with their dataset. When I search on google I see that the ORP website and wiki have not been

Re: Federated relevance ranking

2011-06-06 Thread Toke Eskildsen
On Thu, 2011-06-02 at 21:51 +0200, Clint Gilbert wrote: > We're also considering a home-grown scheme involving normalizing the > denominators of all the index components in all our indices, based on > the sums of counts obtained from all the indices. This feels like > re-inventing the wheel, and i

Re: Federated relevance ranking

2011-06-02 Thread Erick Erickson
My gut feel is there isn't really a good solution to intermingling the results, since they come from different sources, index different kinds of data etc. The irreducible problem is that a hit in one index is not comparable to a hit in another, either from a Lucene scoring perspective or from the u

Re: Federated relevance ranking

2011-06-02 Thread Clint Gilbert
Thank you very much for your reply. Yeah, our indexes (indices?) contain different types and amounts of data. :( The data being indexed is all the same format - RDF - but it describes different numbers and kinds of things. What is your gut feeling on

Re: Federated relevance ranking

2011-06-02 Thread Erick Erickson
As you've found out, raw scores certainly aren't comparable across different indexes #unless# the documents are fairly distributed. You're talking large indexes here, so if the documents are balanced across all your indexes, the results should be pretty comparable. This pre-supposes that the indexe

Federated relevance ranking

2011-06-02 Thread Clint Gilbert
Hi everyone, I searched the list archives, but couldn't find a question that closely matches mine. The project I'm working on is designed to allow searching a distributed collection of data repositories. Currently, we index each repository to build

Re: How do I sort lucene search results by relevance and time?

2011-05-11 Thread Otis Gospodnetic
a-user@lucene.apache.org > Sent: Sun, May 8, 2011 11:59:11 PM > Subject: How do I sort lucene search results by relevance and time? > > What do I want to do is just like Google search results. The results in the > first page is the most relevant and also recent documents, but not &

Re: How do I sort lucene search results by relevance and time?

2011-05-10 Thread Johnbin Wang
ne default score and time desc. The sorting results seem good. It meet my requirement. On Mon, May 9, 2011 at 6:31 PM, Ian Lea wrote: > Well, you can use one of the sorting search methods and pass multiple > sort keys including relevance and a timestamp. But I suspect the > Googl

Re: How do I sort lucene search results by relevance and time?

2011-05-09 Thread Ian Lea
Well, you can use one of the sorting search methods and pass multiple sort keys including relevance and a timestamp. But I suspect the Google algorithm may be a bit more complex than that. One technique is boosting: set an index time document boost on recent documents. Of course what is recent

How do I sort lucene search results by relevance and time?

2011-05-08 Thread Johnbin Wang
What I want to do is just like Google search results: the results on the first page are the most relevant and also recent documents, but not strictly sorted by time descending. -- cheers, Johnbin Wang

Re: customizable relevance engine

2010-08-03 Thread d' Ani
:14 PM Subject: Re: customizable relevance engine Both Solr and Lucene allow extensive customization of relevance calculations. Examples include boosting matches in the title field vs. the body, or boosting recent documents more than older documents. On 8/3/10, d' Ani wrote: > Hi all, > I

Re: customizable relevance engine

2010-08-03 Thread dc tech
Both Solr and Lucene allow extensive customization of relevance calculations. Examples include boosting matches in the title field vs. the body, or boosting recent documents more than older documents. On 8/3/10, d' Ani wrote: > Hi all, > Is there any relevance engine that is built in lucenc

customizable relevance engine

2010-08-03 Thread d' Ani
Hi all, Is there any relevance engine built on Lucene that can be customized? Regards, Anirban De Yahoo: anirbande Skype: anirbande Gtalk : ade.sxc

Re: IR meetup in Michigan - lucene's scaling performance and relevance tuning

2010-07-21 Thread Ivan Provalov
cene's scaling performance and  > relevance tuning > To: java-user@lucene.apache.org > Date: Tuesday, July 20, 2010, 2:16 PM > are there such events in Russia? > > Best Regards > Alexander Aristov > > > On 20 July 2010 17:59, Ivan Provalov > wrote: > &

Re: IR meetup in Michigan - lucene's scaling performance and relevance tuning

2010-07-20 Thread Alexander Aristov
are there such events in Russia? Best Regards Alexander Aristov On 20 July 2010 17:59, Ivan Provalov wrote: > We are organizing a meetup in michigan on IR. The first meeting is on > august 19. We will be talking about lucene's scalability and relevance > tuning followed b

IR meetup in Michigan - lucene's scaling performance and relevance tuning

2010-07-20 Thread Ivan Provalov
We are organizing a meetup in Michigan on IR. The first meeting is on August 19. We will be talking about Lucene's scalability and relevance tuning, followed by a discussion. Feel free to sign up: http://www.meetup.com/Michigan-Information-Retrieval-Enthusiasts-Group Thanks, Ivan pro

Re: Question about relevance

2010-01-08 Thread Erik Hatcher
esults for the search. My problem is : the two results have the same relevance. I thought that the document containing "Wallis" would have better relevance because I search for the word "wallis" and not "wall". Relevance is calculated from the searched word (wall

Question about relevance

2010-01-08 Thread Yannick Caillaux
yzer and becomes "wall". So the two documents are results for the search. My problem is : the two results have the same relevance. I thought that the document containing "Wallis" would have better relevance because I search for the word "wallis" and not "wall".

Re: Filtering query results based on relevance/acuracy

2009-09-29 Thread Alex
Can anybody help? On Sat, Sep 26, 2009 at 11:22 PM, Alex wrote: > Hi Otis and thank you for helping me out. > > Sorry for the late reply. > > > > Although a Phrase query or TermQuery would be perfectly suited in some > cases, this will not work in my case. > > Basically my application's searc

Re: Filtering query results based on relevance/acuracy

2009-09-26 Thread Alex
Hi Otis and thank you for helping me out. Sorry for the late reply. Although a Phrase query or TermQuery would be perfectly suited in some cases, this will not work in my case. Basically my application's search feature is a single field "à la Google" and the user can be looking for a lot of

Re: Filtering query results based on relevance/acuracy

2009-09-22 Thread Otis Gospodnetic
> Subject: Filtering query results based on relevance/acuracy > > Hi, > > I'm, a total newbie with lucene and trying to understand how to achieve my > (complicated) goals. So what I'm doing is yet totally experimental for me > but is probably extremely trivial for the e

Filtering query results based on relevance/acuracy

2009-09-21 Thread Alex
Hi, I'm a total newbie with Lucene and trying to understand how to achieve my (complicated) goals. So what I'm doing is still totally experimental for me but is probably extremely trivial for the experts on this list :) I use Lucene and Hibernate Search to index locations by their name, type, etc

Re: relevance function for scores

2009-05-27 Thread kenny kim
pared to a vanilla search. -Original Message- From: kenny kim Reply-To: java-user@lucene.apache.org To: java-user@lucene.apache.org Subject: Re: relevance function for scores Date: Wed, 27 May 2009 19:18:39 +0900 It seems to be a good solution. However, I think it may take some processing t

Re: relevance function for scores

2009-05-27 Thread Joel Halbert
-user@lucene.apache.org To: java-user@lucene.apache.org Subject: Re: relevance function for scores Date: Wed, 27 May 2009 19:18:39 +0900 It seems to be a good solution. However, I think it may take some processing time to get the distribution of all matching documents before scoring each doc. Would you h

Re: relevance function for scores

2009-05-27 Thread kenny kim
- From: Babak Farhang Reply-To: java-user@lucene.apache.org To: java-user@lucene.apache.org Subject: Re: relevance function for scores Date: Mon, 25 May 2009 16:11:32 -0600 Woops. Got that backwards.. should read if (score[n] / score[n-1]) < c / (boost_factor) On Mon, May 25, 2009 a

Re: relevance function for scores

2009-05-26 Thread Joel Halbert
- From: Babak Farhang Reply-To: java-user@lucene.apache.org To: java-user@lucene.apache.org Subject: Re: relevance function for scores Date: Mon, 25 May 2009 16:11:32 -0600 Woops. Got that backwards.. should read > if (score[n] / score[n-1]) < c / (boost_factor) On Mon, May 25, 2009 at 4

Re: relevance function for scores

2009-05-25 Thread kenny kim
'document collector/result filter' that uses relative score information to filter out documents where any score is less than some magnitude of the best score, but I'm sure this could be more elegantly generalised into some mathematical "relevance/significance" model/function

Re: relevance function for scores

2009-05-25 Thread Babak Farhang
It is an easy thing to write a basic 'document collector/result filter' >> that uses relative score information to filter out documents where any >> score is less than some magnitude of the best score, but I'm sure this >> could be more elegantly generalised into

Re: relevance function for scores

2009-05-25 Thread Babak Farhang
ic 'document collector/result filter' > that uses relative score information to filter out documents where any > score is less than some magnitude of the best score, but I'm sure this > could be more elegantly generalised into some mathematical > "relevance/significance"

Re: relevance function for scores

2009-05-18 Thread Joel Halbert
It's not really a Lucene code question, as such, but it's certainly something that Lucene users may have implemented before... I'm hoping ;) -Original Message- From: Erick Erickson Reply-To: java-user@lucene.apache.org To: java-user@lucene.apache.org Subject: Re: relevan

Re: relevance function for scores

2009-05-18 Thread Erick Erickson
> > J > > -Original Message- > From: Erick Erickson > Reply-To: java-user@lucene.apache.org > To: java-user@lucene.apache.org > Subject: Re: relevance function for scores > Date: Mon, 18 May 2009 09:13:27 -0400 > > Have you looked at TopDocCollector? Basically,

Re: relevance function for scores

2009-05-18 Thread Joel Halbert
solve this - since ideally I'd like a cutoff point optimised to the resultant score values. J -Original Message- From: Erick Erickson Reply-To: java-user@lucene.apache.org To: java-user@lucene.apache.org Subject: Re: relevance function for scores Date: Mon, 18 May 2009 09:13:27 -0400 Hav

Re: relevance function for scores

2009-05-18 Thread Erick Erickson
relevant documents. > > > It is an easy thing to write a basic 'document collector/result filter' > that uses relative score information to filter out documents where any > score is less than some magnitude of the best score, but I'm sure this > could be more elegantly

relevance function for scores

2009-05-18 Thread Joel Halbert
y thing to write a basic 'document collector/result filter' that uses relative score information to filter out documents where any score is less than some magnitude of the best score, but I'm sure this could be more elegantly generalised into some mathematical "relevance/signi
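The basic filter described in this thread can be sketched as follows, assuming the scores arrive already sorted in descending order (as a top-N collector would return them). RelativeScoreCutoff, keep, and the fraction parameter are hypothetical names for illustration; the right fraction is something to tune per application.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative cutoff: keep only hits scoring within a fraction of the best hit. */
final class RelativeScoreCutoff {

    /**
     * Returns the leading scores that are at least fraction * bestScore.
     * The input must be sorted in descending order.
     */
    static List<Double> keep(List<Double> scoresDescending, double fraction) {
        List<Double> kept = new ArrayList<>();
        if (scoresDescending.isEmpty()) {
            return kept;
        }
        double threshold = scoresDescending.get(0) * fraction;
        for (double score : scoresDescending) {
            if (score < threshold) {
                break; // everything after this point scores lower still
            }
            kept.add(score);
        }
        return kept;
    }
}
```

The comparison is relative to the best score of the same query, so it does not rely on raw scores being comparable between queries.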

Re: Google's search Appliance relevance ranking

2009-04-17 Thread John Wang
p may be inappropriate and I >> want to apologize for that. >> > > I wouldn't say it's inappropriate, but I don't know if anyone here could > say with certainty b/c the last time I checked GSA was not an open > platform... > > >> >> My question is, w

Re: Google's search Appliance relevance ranking

2009-04-17 Thread Grant Ingersoll
e I checked GSA was not an open platform... My question is, what is the relevance ranking algorithm which is used in Google Search Appliance (GSA) because the search is predominantly on documents rather than web pages. AFAIK, they use a Vector Space Model much as Lucene does, but you&

Google's search Appliance relevance ranking

2009-04-16 Thread Vasudevan Comandur
Hi, The question that I am posting in this group may be inappropriate and I want to apologize for that. My question is, what is the relevance ranking algorithm which is used in Google Search Appliance (GSA) because the search is predominantly on documents rather than web pages. I

RE: relevance vs. score

2009-03-04 Thread spring
> It's the similarity scoring formula. EG see here: > >http://lucene.apache.org/java/2_4_0/scoring.html > > and here: > > > http://lucene.apache.org/java/2_4_0/api/core/org/apache/lucene > /search/Similarity.html OK; thank you -

Re: relevance vs. score

2009-03-04 Thread Michael McCandless
It's the similarity scoring formula. EG see here: http://lucene.apache.org/java/2_4_0/scoring.html and here: http://lucene.apache.org/java/2_4_0/api/core/org/apache/lucene/search/Similarity.html Mike wrote: I think for "ordinary" Lucene queries, "score"

RE: relevance vs. score

2009-03-04 Thread spring
> I think for "ordinary" Lucene queries, "score" and "relevance" mean > the same thing. > > But if you do eg function queries, or you "mixin" recency into your > scoring, etc., then "score" could be anything you computed,

Re: relevance vs. score

2009-03-04 Thread Michael McCandless
I think for "ordinary" Lucene queries, "score" and "relevance" mean the same thing. But if you do eg function queries, or you "mixin" recency into your scoring, etc., then "score" could be anything you computed, a value from a field,

relevance vs. score

2009-03-04 Thread spring
Hi, When I say "sorted by relevance" or "sorted by score", are relevance and score synonyms for each other, or what is the difference in relation to sorting? Thank you

Re: Sorting by relevance and a field

2009-02-13 Thread Michael McCandless
SortField.FIELD_SCORE lets you sort by relevance. So then make a Sort that contains an array of two SortFields, eg: new Sort(new SortField[] {SortField.FIELD_SCORE, new SortField(myField)}) and pass that when searching. Lucene will then sort first by score, and when there are ties
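To make the ordering concrete without depending on a particular Lucene version, here is a plain-Java sketch of the same rule: score descending first, then a field value to break ties. The Hit class and the choice of title as the tie-break field are hypothetical; inside Lucene the Sort shown above does this work.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative two-level ordering: by score (descending), then by title. */
final class ScoreThenTitle {

    /** Hypothetical search hit: a score plus the field used to break ties. */
    static final class Hit {
        final String title;
        final float score;

        Hit(String title, float score) {
            this.title = title;
            this.score = score;
        }
    }

    /** Returns a sorted copy; the input list is left untouched. */
    static List<Hit> sort(List<Hit> hits) {
        Comparator<Hit> byScoreDescending =
                Comparator.comparingDouble((Hit h) -> h.score).reversed();
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(byScoreDescending.thenComparing((Hit h) -> h.title));
        return sorted;
    }
}
```

Exact score ties are what trigger the second key, so the title order only shows up among hits with identical scores.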

Sorting by relevance and a field

2009-02-13 Thread Yannick Caillaux
Hi all, Lucene sorts by decreasing relevance by default. The SortField class is used for sorting by lucene field(s). First I must sort by relevance, then (for the results which have the same relevance) I must sort by a lucene field (title for example). I don't know how to do that. So

RE: boosting relevance of certain documents

2008-04-26 Thread Daniel Freudenberger
Hello, thanks for your detailed response. I didn't know there was a method called setBoost for adjusting the relevance of a certain document. Now I simply calculate the boosting factor for the document, based on its newness, the sales rank and some other values. Thank you very much.

Re: boosting relevance of certain documents

2008-04-25 Thread Otis Gospodnetic
ssage > From: Anshum <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Sent: Saturday, April 26, 2008 12:32:56 AM > Subject: Re: boosting relevance of certain documents > > Hi Daniel, > > Just a suggestion, how bout storing an extra field while indexing that

Re: boosting relevance of certain documents

2008-04-25 Thread Anshum
lization is often less than optimal > for certain types of documents (see the IBM Haifa's assessment for the > "Million Query" track of TREC on the Lucene Wiki). > > Cheers, > Grant > > > On Apr 25, 2008, at 3:50 PM, Daniel Freudenberger wrote: > > Thanks for

Re: boosting relevance of certain documents

2008-04-25 Thread Grant Ingersoll
Wiki). Cheers, Grant On Apr 25, 2008, at 3:50 PM, Daniel Freudenberger wrote: Thanks for your response. I already knew that the relevance is based on the term frequency but in some cases it's just not what the user expects. As I already mentioned, "fifa 2003 fifa 03" vs. "f

RE: boosting relevance of certain documents

2008-04-25 Thread Daniel Freudenberger
Thanks for your response. I already knew that the relevance is based on the term frequency but in some cases it's just not what the user expects. As I already mentioned, "fifa 2003 fifa 03" vs. "fifa 08" is such a case - searching for "fifa" would return the "

Re: boosting relevance of certain documents

2008-04-25 Thread Jonathan Ariel
> Sent: Friday, April 25, 2008 6:59 PM > To: java-user@lucene.apache.org > Subject: Re: boosting relevance of certain documents > > How are you analyzing the searchable field? > > On Fri, Apr 25, 2008 at 12:49 PM, Daniel Freudenberger < > [EMAIL PROTECTED]> wrote: >

RE: boosting relevance of certain documents

2008-04-25 Thread Daniel Freudenberger
I'm using the StandardAnalyzer - hope this answers your question (I'm quite new to the lucene thing) -Original Message- From: Jonathan Ariel [mailto:[EMAIL PROTECTED] Sent: Friday, April 25, 2008 6:59 PM To: java-user@lucene.apache.org Subject: Re: boosting relevance

Re: boosting relevance of certain documents

2008-04-25 Thread Jonathan Ariel
ifa 08") would be the much more relevant result (from the > user side of view). the same problem arises when searching for > "playstation" > - the customer expects products having "playstation" in their names at > first, ideally the console itself. in reality howe

boosting relevance of certain documents

2008-04-25 Thread Daniel Freudenberger
le itself. in reality however, he gets all possible products which are in the "playstation" category as well. my idea was to introduce another attribute relevance, which may increase the relevance of an entry. the actual relevance shouldn't be suppressed completely though, b

Re: Relevance

2008-03-19 Thread Karl Wettin
luceneuser wrote: Hi All, I need help on retrieving results based on relevance + freshness. As of now, i get based on either of the fields, either on relevance or freshness. how can i achieve this. Lucene retrieves results on relevance but also fetches old results too. i need more relevant

Re: Relevance

2008-03-19 Thread Grant Ingersoll
Have a look at the FunctionQuery capabilities in Lucene in org.apache.lucene.search.function You can use this to have field values factor into the score. -Grant On Mar 19, 2008, at 3:43 AM, luceneuser wrote: Hi All, I need help on retrieving results based on relevance + freshness. As
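As a rough illustration of letting a field value factor into the score, the sketch below multiplies a relevance score by a freshness decay computed from the document's age. The half-life formula and the names FreshnessBoost, combined, ageDays and halfLifeDays are assumptions made for this example; they are not taken from the FunctionQuery package.

```java
/** Illustrative blend of relevance and freshness via exponential decay. */
final class FreshnessBoost {

    /**
     * Scales the relevance score so that a document loses half of its
     * weight every halfLifeDays days of age.
     */
    static double combined(double relevance, double ageDays, double halfLifeDays) {
        return relevance * Math.pow(0.5, ageDays / halfLifeDays);
    }
}
```

With a 30-day half-life, a 30-day-old document with relevance 2.0 ends up level with a brand-new document of relevance 1.0, which gives relevant-and-recent results the edge without sorting purely by date.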

Re: Relevance

2008-03-19 Thread Mathieu Lecarme
luceneuser wrote: Hi All, I need help on retrieving results based on relevance + freshness. As of now, i get based on either of the fields, either on relevance or freshness. how can i achieve this. Lucene retrieves results on relevance but also fetches old results too. i need more

Relevance

2008-03-19 Thread luceneuser
Hi All, I need help retrieving results based on relevance + freshness. As of now, I can sort on only one of the two, either relevance or freshness. How can I achieve this? Lucene retrieves results by relevance but also fetches old results. I need more relevant results with

Re: Ideas for a relevance score that could be considered stable across multiple searches with the same query structure?

2007-05-30 Thread Yonik Seeley
On 5/30/07, Daniel Einspanjer <[EMAIL PROTECTED]> wrote: What I quickly found I could do without though was the HTTP overhead. I implemented the EmbeddedSolr class found on the Solr wiki that let me interact with the Solr engine directly. This is important since I'm doing thousands of queries in

Re: Ideas for a relevance score that could be considered stable across multiple searches with the same query structure?

2007-05-30 Thread Daniel Einspanjer
On 4/11/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: : Not really. The explain scores aren't normalized and I also couldn't : find a way to get the explain data as anything other than a whitespace : formatted text blob from Solr. Keep in mind that they need confidence the default way Solr du

Re: Ideas for a relevance score that could be considered stable across multiple searches with the same query structure?

2007-05-05 Thread Daniel Einspanjer
r}"~2^10 director_name_mv:${Director}^5 director_name_mv:${Director}~.7 For each item in the source feed, the variables are interpolated (the query term is transformed into a grouped term if there are multiple values for a variable). That query is then made to find the overall best match. I

Re: Ideas for a relevance score that could be considered stable across multiple searches with the same query structure?

2007-04-11 Thread Chris Hostetter
: Not really. The explain scores aren't normalized and I also couldn't : find a way to get the explain data as anything other than a whitespace : formatted text blob from Solr. Keep in mind that they need confidence the default way Solr dumps score explanations is just as plain text, but the Ex

Re: Ideas for a relevance score that could be considered stable across multiple searches with the same query structure?

2007-04-11 Thread Daniel Einspanjer
Oh geeze. Gmail ripped my pretty table to shreds. Let me try again: A -- id title title score director director score year year score overall score B -

Re: Ideas for a relevance score that could be considered stable across multiple searches with the same query structure?

2007-04-11 Thread Daniel Einspanjer
Not really. The explain scores aren't normalized and I also couldn't find a way to get the explain data as anything other than a whitespace formatted text blob from Solr. Keep in mind that they need confidence factors from one query to the next. With the explain scores, they can have wildly dif

Re: Ideas for a relevance score that could be considered stable across multiple searches with the same query structure?

2007-04-10 Thread Grant Ingersoll
On Apr 10, 2007, at 8:03 PM, Daniel Einspanjer wrote: The people reviewing this matching process need some way of determining why a particular match was made other than the overall score. Was it because the title was a perfect match or was it because the title wasn't that close, but the direct

Ideas for a relevance score that could be considered stable across multiple searches with the same query structure?

2007-04-10 Thread Daniel Einspanjer
eas. Thank you very much for your time, Daniel -- Forwarded message -- From: Daniel Einspanjer <[EMAIL PROTECTED]> Date: Apr 10, 2007 8:04 AM Subject: Ideas for a relevance score that could be considered stable across multiple searches with the same query structure? To: solr-

RE: Scoring Technique based on Relevance Feeback & other Parameters

2006-08-31 Thread sachin
scorers are non-final classes. Because my interest lies with changes in scoring strategy which is based on Relevance Feedback? One observation : Lucene is designed with inflexible scoring mechanism based on TF-IDF. It would be really nice if much simpler scoring mechanisms should have given chance

Re: Scoring Technique based on Relevance Feeback & other Parameters

2006-08-23 Thread Chris Hostetter
: package. By implementing new type of tuple (Query,Weight,Scorer) I can : easily implement new Scoring technique. Unfortunately Lucene index shows that : it stores only TF / Position vectors for each term within document. : I am interested in investigating new scoring technique where I w

RE: Scoring Technique based on Relevance Feeback & other Parameters

2006-08-23 Thread Dejan Nenov
Indeed - you bring up interesting questions. You may want to take a look at NUTCH first, however - I am not sure if they have done some of the Google-like ranking you mention. However - collaborative relevance enhancement, based on user feedback, would be a nice Web-2.0-ish feature to bake into

RE: Scoring Technique based on Relevance Feeback & other Parameters

2006-08-23 Thread Russell M. Allen
Technique based on Relevance Feeback & other Parameters Hello Great/smart guys This is my first question for this group as I started working on the Lucene last month. Lucene provide the scoring of documents based on TF-IDF vector analysis. Lucene also provides the Sco

Re: Scoring Technique based on Relevance Feeback & other Parameters

2006-08-23 Thread Grant Ingersoll
of the first to test it!) Another parameter is relevance feedback from the User. Ranking should get affected by relevance feedback from the user. Take a look at Term Vectors. Search the list. Read about them at http://www.cnlp.org/apachecon2005 or in "Lucene In Action"

Scoring Technique based on Relevance Feeback & other Parameters

2006-08-23 Thread sachin
investigate Another parameter is relevance feedback from the User. Ranking should get affected by relevance feedback from the user. Would someone interested in helping out or thinking about the same problem.

Re: get results by relevance, limiting results and then sort the results by some criterion

2006-02-21 Thread Mufaddal Khumri
user lets say 300. I can do that by only extracting the first 300 hits (sorted by decreasing relevance by default) and displaying those to the user. If you are only talking about ordering the number of items that you are going to show to the user, that seems to imply that the number wil

Re: get results by relevance, limiting results and then sort the results by some criterion

2006-02-21 Thread Mufaddal Khumri
e. After all they may search for "bolt" maybe they want an ancillary product. -Original Message- From: Mufaddal Khumri [mailto:[EMAIL PROTECTED] Sent: Tuesday, February 21, 2006 12:06 PM To: java-user@lucene.apache.org Subject: Re: get results by relevance, limit

RE: get results by relevance, limiting results and then sort the results by some criterion

2006-02-21 Thread John Powers
" products, at least you'll get some of these. After all they may search for "bolt" maybe they want an ancillary product. -Original Message- From: Mufaddal Khumri [mailto:[EMAIL PROTECTED] Sent: Tuesday, February 21, 2006 12:06 PM To: java-user@lucene.apache.o

Re: get results by relevance, limiting results and then sort the results by some criterion

2006-02-21 Thread Dan Armbrust
Mufaddal Khumri wrote: When I do a search for example on "batteries" i get 1200+ results. I would like to show the user lets say 300. I can do that by only extracting the first 300 hits (sorted by decreasing relevance by default) and displaying those to the user. If you are on

Re: get results by relevance, limiting results and then sort the results by some criterion

2006-02-21 Thread Otis Gospodnetic
It sounds like this is a webapp. I'd consider playing with HTML DOM a little bit - come up with a system where I get top N matches by relevance, store them somewhere, and then just re-sort them using users' criteria, without going back to the Lucene index. For instance, you could
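
A minimal sketch of the idea in the message above: take the top N matches by relevance once, keep them, and re-sort that cached list by the user's criterion without going back to the index. Everything here is an assumption for illustration: `Hit`, `topByRelevance`, and `resort` are hypothetical names, the hits are plain Java objects rather than Lucene results, and price stands in for whatever criterion the user picks.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TopNResort {
    // Hypothetical stand-in for one search hit; not a Lucene class.
    static final class Hit {
        final String name;
        final double score;  // relevance score from the initial search
        final double price;  // example of a user-selectable sort field
        Hit(String name, double score, double price) {
            this.name = name;
            this.score = score;
            this.price = price;
        }
    }

    // Keep the n most relevant hits, highest score first.
    static List<Hit> topByRelevance(List<Hit> hits, int n) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort((a, b) -> Double.compare(b.score, a.score));
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    // Re-sort the cached top-n list by a user-chosen order, with no new search.
    static List<Hit> resort(List<Hit> cached, Comparator<Hit> userOrder) {
        List<Hit> copy = new ArrayList<>(cached);
        copy.sort(userOrder);
        return copy;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("a", 0.9, 30.0));
        hits.add(new Hit("b", 0.5, 10.0));
        hits.add(new Hit("c", 0.7, 20.0));

        List<Hit> cached = topByRelevance(hits, 2);
        List<Hit> byPrice = resort(cached, (x, y) -> Double.compare(x.price, y.price));
        for (Hit h : byPrice) {
            System.out.println(h.name + " " + h.price);
        }
    }
}
```

Note the trade-off raised elsewhere in the thread: a hit that falls outside the relevance cut-off never appears in the re-sorted list, even if it would rank first under the user's criterion.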

RE: get results by relevance, limiting results and then sort the results by some criterion

2006-02-21 Thread John Powers
ll the primary pens and pencils, which makes sense. -Original Message- From: Mufaddal Khumri [mailto:[EMAIL PROTECTED] Sent: Tuesday, February 21, 2006 12:02 PM To: java-user@lucene.apache.org Subject: Re: get results by relevance, limiting results and then sort the results by some criteri

Re: get results by relevance, limiting results and then sort the results by some criterion

2006-02-21 Thread Mufaddal Khumri
to see. I think I'm just curious why getting rid of some that could (in a new sort) be of higher relevance is a good thing. -Original Message- From: Mufaddal Khumri [mailto:[EMAIL PROTECTED] Sent: Tuesday, February 21, 2006 10:33 AM To: java-user@lucene.apache.org Subject: g

Re: get results by relevance, limiting results and then sort the results by some criterion

2006-02-21 Thread Mufaddal Khumri
to sort on the full document list, and then return on the 300 top that you want the user to see. I think I'm just curious why getting rid of some that could (in a new sort) be of higher relevance is a good thing. -Original Message- From: Mufaddal Khumri [mailto:[EMAIL PROTECTE
