Re: Disjunctively scoring non-matching conjunctive clauses

2023-07-21 Thread Uwe Schindler
Hi, this is the normal way to do this: use a filter or constant score query to do the matching and use disjunctive scoring as a long chain of "should" clauses. Uwe On 21.07.2023 at 02:35, Marc D'Mello wrote: Hi all, I'm an engineer on Amazon Product Search and I
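The split described in this reply (one part decides which documents match, a separate chain of optional clauses decides the score) can be sketched as a toy model. This is plain Java over in-memory sets, not Lucene API; the class name, the `title:`/`desc:` term strings, and the weights are all hypothetical and chosen only for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.OptionalDouble;
import java.util.Set;

// Toy model only: plain collections stand in for an index; none of this is Lucene API.
final class ConjunctiveMatchDisjunctiveScore {

    // Matching step (the "filter" role): a document matches if it contains every
    // term of at least one required group. It contributes nothing to the score.
    static boolean matches(Set<String> docTerms, List<Set<String>> requiredGroups) {
        for (Set<String> group : requiredGroups) {
            if (docTerms.containsAll(group)) {
                return true;
            }
        }
        return false;
    }

    // Scoring step (the "should" chain): sum the weight of every optional term that
    // is present, including terms from groups that did not fully match.
    static double score(Set<String> docTerms, Map<String, Double> shouldWeights) {
        double total = 0.0;
        for (Map.Entry<String, Double> clause : shouldWeights.entrySet()) {
            if (docTerms.contains(clause.getKey())) {
                total += clause.getValue();
            }
        }
        return total;
    }

    // Empty result means "not a hit"; otherwise the disjunctive score.
    static OptionalDouble search(Set<String> docTerms,
                                 List<Set<String>> requiredGroups,
                                 Map<String, Double> shouldWeights) {
        return matches(docTerms, requiredGroups)
                ? OptionalDouble.of(score(docTerms, shouldWeights))
                : OptionalDouble.empty();
    }

    public static void main(String[] args) {
        List<Set<String>> groups = List.of(
                Set.of("title:a", "title:b"),
                Set.of("desc:a", "desc:b"));
        Map<String, Double> weights = Map.of(
                "title:a", 2.0, "title:b", 1.0, "desc:a", 0.5, "desc:b", 0.25);
        Set<String> doc = Set.of("title:a", "title:b", "desc:a");
        System.out.println(search(doc, groups, weights));
    }
}
```

In this model the document matches through the title group alone, yet `desc:a` still adds to its score even though the desc group as a whole did not match.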

Disjunctively scoring non-matching conjunctive clauses

2023-07-20 Thread Marc D'Mello
Hi all, I'm an engineer on Amazon Product Search and I've recently come upon a situation where I've required conjunctive matching but disjunctive scoring. As a concrete example, let's say I have a query like this: (+title:"a" +title:"b" +title:"

Re: Relative cpu cost of fetching term frequency during scoring

2023-06-26 Thread Adrien Grand
> > You're saying that you're storing the type of token as part of the > > > term > > > > > > frequency. This doesn't sound like something that would play well > > > with > > > > > > dynamic pruning, so I wonder if this is the reason wh

Re: Relative cpu cost of fetching term frequency during scoring

2023-06-21 Thread Vimal Jain
> > frequency. This doesn't sound like something that would play well > > with > > > > > dynamic pruning, so I wonder if this is the reason why you are > seeing > > > > > slower queries. But since you mentioned custom term queries, maybe > > you &

Re: Relative cpu cost of fetching term frequency during scoring

2023-06-21 Thread Adrien Grand
if this is the reason why you are seeing > > > > slower queries. But since you mentioned custom term queries, maybe > you > > > > never actually took advantage of dynamic pruning? > > > > > > > > On Tue, Jun 20, 2023 at 10:30 AM Vimal Jain > wr

Re: Relative cpu cost of fetching term frequency during scoring

2023-06-20 Thread Vimal Jain
ok advantage of dynamic pruning? > > > > > > On Tue, Jun 20, 2023 at 10:30 AM Vimal Jain wrote: > > > > > > > Ok , sorry , I realized that I need to provide more context. > > > > So we used to create a lucene query which consisted of custom term &

Re: Relative cpu cost of fetching term frequency during scoring

2023-06-20 Thread Adrien Grand
context. > > > So we used to create a lucene query which consisted of custom term > > queries > > > for different fields and based on the type of field , we used to > assign a > > > boost that would be used in scoring. > > > Now we want to get rid off dif

Re: Relative cpu cost of fetching term frequency during scoring

2023-06-20 Thread Vimal Jain
ed that I need to provide more context. > > So we used to create a lucene query which consisted of custom term > queries > > for different fields and based on the type of field , we used to assign a > > boost that would be used in scoring. > > Now we want to get rid off diff

Re: Relative cpu cost of fetching term frequency during scoring

2023-06-20 Thread Adrien Grand
create a lucene query which consisted of custom term queries > for different fields and based on the type of field , we used to assign a > boost that would be used in scoring. > Now we want to get rid off different fields and instead of creating > multiple term queries , we create only

Re: Relative cpu cost of fetching term frequency during scoring

2023-06-20 Thread Vimal Jain
OK, sorry, I realized that I need to provide more context. We used to create a Lucene query that consisted of custom term queries for different fields and, based on the type of field, we assigned a boost that would be used in scoring. Now we want to get rid of the different fields and

Re: Relative cpu cost of fetching term frequency during scoring

2023-06-20 Thread Adrien Grand
i, > > I want to understand if fetching the term frequency of a term during > > scoring is relatively cpu bound operation ? > > Context - I am storing custom term frequency during indexing and later > > using it for scoring during query execution time ( in Scorer'

Re: Relative cpu cost of fetching term frequency during scoring

2023-06-19 Thread Vimal Jain
Note - i am using lucene 7.7.3 *Thanks and Regards,* *Vimal Jain* On Tue, Jun 20, 2023 at 12:26 PM Vimal Jain wrote: > Hi, > I want to understand if fetching the term frequency of a term during > scoring is relatively cpu bound operation ? > Context - I am storing custom term freq

Relative cpu cost of fetching term frequency during scoring

2023-06-19 Thread Vimal Jain
Hi, I want to understand whether fetching the term frequency of a term during scoring is a relatively CPU-bound operation. Context: I am storing a custom term frequency during indexing and later using it for scoring at query execution time (in the Scorer's score() method). I noticed a performance

Re: Lucene Disable scoring

2022-07-11 Thread Adrien Grand
Note that Lucene automatically disables scoring already when scores are not needed. E.g. queries that compute the top-k hits by score will definitely compute scores, but if you are just counting the number of matches of a query or aggregations, then Lucene skips scoring entirely already. Is there

Re: Lucene Disable scoring

2022-07-11 Thread Mikhail Khludnev
I'd rather agree with Uwe, but you can plug BooleanSimilarity just to check it out. On Mon, Jul 11, 2022 at 6:01 PM Mohammad Kasaei wrote: > Hello > > I have a question. Is it possible to completely disable scoring in lucene? > > Detailed description: > I have an index

Re: Lucene Disable scoring

2022-07-11 Thread Uwe Schindler
No, that's the only way to do it. The function call does not add overhead because it is optimized away by the runtime. Uwe On 10.07.2022 at 11:34, Mohammad Kasaei wrote: Hello I have a question. Is it possible to completely disable scoring in lucene? Detailed description: I have an

Lucene Disable scoring

2022-07-11 Thread Mohammad Kasaei
Hello, I have a question. Is it possible to completely disable scoring in Lucene? Detailed description: I have an index in Elasticsearch and it contains big shards (every shard about 500m docs), so a nanosecond of time spent on scoring every document in any shard causes a delay of a few seconds in the

Re: Lucene cpu utilization & scoring

2021-08-20 Thread Varun Sharma
commits, and if you are indexing across multiple threads. We > found this can help reduce the number of segments, and the variability > in the number of segments. I don't know if that is truly a root cause > of your performance problems here though. > > Regarding scoring cos

Re: Lucene cpu utilization & scoring

2021-08-20 Thread Michael Sokolov
here though. Regarding scoring costs - I don't think creating a dummy Weight and Scorer will do what you think - Scorers are doing matching in fact as well as scoring. You won't get any results if you don't have any real Scorer. I *think* that setting needsScores() to false should

Lucene cpu utilization & scoring

2021-08-20 Thread Varun Sharma
, performance is significantly better. When we turn on realtime updates, due to accumulation of segments - CPU utilization by lucene goes up by at least *3X* [based on profiling]. b) A profile shows that the vast majority of time is being spent in scoring methods even though we are setting *needsScores() to

Re: Tuning MoreLikeThis scoring algorithm

2021-06-01 Thread TK Solr
same number of count each? That would basically be a cosine similarity between the two documents, I think. TK On 5/28/21 6:27 PM, Robert Muir wrote: See https://cwiki.apache.org/confluence/display/LUCENE/ScoresAsPercentages which has some broken nabble links, but is still valid. TLDR: Scoring

Re: Tuning MoreLikeThis scoring algorithm

2021-05-28 Thread Robert Muir
See https://cwiki.apache.org/confluence/display/LUCENE/ScoresAsPercentages which has some broken nabble links, but is still valid. TLDR: Scoring just doesn't work the way you think. Don't try to interpret it as an absolute value, it is a relative one. On Fri, May 28, 2021 at 1:36

Tuning MoreLikeThis scoring algorithm

2021-05-28 Thread TK Solr
I'd like to have suggestions on changing the scoring algorithm of MoreLikeThis. When I feed the identical string as the content of a document in the index to MoreLikeThis.like("field", new StringReader(docContent)), I get a score less than 1.0 (0.944 in one of my test cases) that

Re: Lucene custom scoring / analyzer

2021-03-17 Thread Charlie Hull
I think you'll need a SpanQuery with the inOrder flag set: https://lucene.apache.org/core/8_8_1/core/org/apache/lucene/search/spans/SpanNearQuery.html Charlie On 17/03/2021 10:30, Vlad Smirnovskiy wrote: Hello! I`d like to do something like that: When I add a document and some text is going wi

Lucene custom scoring / analyzer

2021-03-17 Thread Vlad Smirnovskiy
Hello! I'd like to do something like this: when I add a document and some text is in (e.g.) quotes, it should mean that this text has to appear exactly in the query. Better with an example - text: green "blue apple" juice query: blue apple - result: hit. query: blue apple juice - result: h

Re: Fuzzy Search Scoring Adjustment

2020-09-23 Thread Uwe Schindler
just the most >frequent unique fuzzy match in each document. > >Ideally I'd like to use a built in mechanism for achieving this, but if >it's not available, a way to extend the BooleanQuery, BooleanWeight, >and/or >BooleanScorer classes to have slightly different scor

Fuzzy Search Scoring Adjustment

2020-09-23 Thread Eastlack, Kainoa
but if it's not available, a way to extend the BooleanQuery, BooleanWeight, and/or BooleanScorer classes to have slightly different scoring logic but otherwise function exactly the same would also work, but all of those are either final classes or have no public constructor, effectively making it imp

Re: Scoring Across Multiple Fields

2020-01-27 Thread Michael Froh
, you're likely just inflating those title matches even more (since a title match is probably highly correlated with a body match). (The DisjunctionMaxQuery also has an optional "tieBreakerMultiplier" property that you can use to weight the scoring somewhere between pure max and pure sum -
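The "somewhere between pure max and pure sum" behaviour mentioned here can be sketched in a few lines of plain Java. This is a minimal illustration of the arithmetic only (best sub-score plus the tie-breaker times the remaining sub-scores), assuming non-negative sub-scores; the class and method names are hypothetical and this is not Lucene's own code.

```java
// Arithmetic sketch only; not Lucene code. Assumes non-negative per-field scores.
final class DisMaxSketch {

    // tieBreaker = 0.0 gives pure max, tieBreaker = 1.0 gives pure sum.
    static double score(double[] subScores, double tieBreaker) {
        double max = 0.0;
        double sum = 0.0;
        for (double s : subScores) {
            sum += s;
            max = Math.max(max, s);
        }
        // Best field counts in full; the others are scaled by the tie-breaker.
        return max + tieBreaker * (sum - max);
    }

    public static void main(String[] args) {
        double[] fieldScores = {1.0, 0.5, 0.25}; // e.g. title, body, tags (made-up values)
        System.out.println(score(fieldScores, 0.0));
        System.out.println(score(fieldScores, 0.1));
        System.out.println(score(fieldScores, 1.0));
    }
}
```

A small tie-breaker keeps the best-matching field dominant while still letting documents that match in several fields rank slightly ahead of those that match in only one.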

Scoring Across Multiple Fields

2020-01-27 Thread John Brown
Hi, I have a question regarding how Lucene computes document similarities from field similarities. Lucene's scoring documentation mentions that scoring works on fields and combines the results to return documents. I'm assuming fields are given scores, and those scores are simply a

Re: Scoring in Lucene 6.6.0, 7.7.2, 8.1

2019-06-26 Thread baris . kazar
^0.56] Thanks On 6/26/19 10:44 AM, baris.ka...@oracle.com wrote: Yes, i know that feature but so far it did not help me much but i am still looking into that. Thanks On 6/26/19 2:41 AM, Adrien Grand wrote: You can use IndexSearcher#explain to see how scores are computed. On Wed, Jun 26, 201

Re: Scoring in Lucene 6.6.0, 7.7.2, 8.1

2019-06-26 Thread baris . kazar
AM, Adrien Grand wrote: You can use IndexSearcher#explain to see how scores are computed. On Wed, Jun 26, 2019 at 12:48 AM wrote: Hi,-    i really want to know why the scoring works this way: search String is either MAINO or MAINS: MAIN appears as the 276th entry in the results. NEW HAMPS

Re: Scoring in Lucene 6.6.0, 7.7.2, 8.1

2019-06-26 Thread baris . kazar
scoring works this way: search String is either MAINO or MAINS: MAIN appears as the 276th entry in the results. NEW HAMPSHIRE in results: city="NASHUA" municipality="HILLSBOROUGH" region="NEW HAMPSHIRE" country="UNITED STATES" in the 0 th result NEW HAMPSHIR

Re: Scoring in Lucene 6.6.0, 7.7.2, 8.1

2019-06-25 Thread Adrien Grand
You can use IndexSearcher#explain to see how scores are computed. On Wed, Jun 26, 2019 at 12:48 AM wrote: > > Hi,- > > i really want to know why the scoring works this way: search String is > either MAINO or MAINS: MAIN appears as the 276th entry in the results. > > NEW

Scoring in Lucene 6.6.0, 7.7.2, 8.1

2019-06-25 Thread baris . kazar
Hi,-  i really want to know why the scoring works this way: search String is either MAINO or MAINS: MAIN appears as the 276th entry in the results. NEW HAMPSHIRE in results: city="NASHUA" municipality="HILLSBOROUGH" region="NEW HAMPSHIRE" country="UN

Re: Lucene scoring components

2018-07-17 Thread Adrien Grand
g/core/6_0_1/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html > > Best regards > > > On 7/17/18 1:01 PM, baris.ka...@oracle.com wrote: > > Hi,- > > > > is there a way to diminish the tf(t in d) component to 1? i dont want > > the number of ti

Re: Lucene scoring overall score

2018-07-17 Thread Adrien Grand
You could use IndexSearcher#explain, which tells you how the score of a document is computed. On Tue, Jul 17, 2018 at 19:06, wrote: > Hi,- > > how can i check the contributions from different fields indexed in the > hits doc's score? > > Best regards > > > --

Lucene scoring overall score

2018-07-17 Thread baris . kazar
Hi,- how can i check the contributions from different fields indexed in the hits doc's score? Best regards - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lu

Re: Lucene scoring components

2018-07-17 Thread baris . kazar
the number of times a word appears to affect the scoring for my app. Best regards

Lucene scoring components

2018-07-17 Thread baris . kazar
Hi, is there a way to diminish the tf(t in d) component to 1? I don't want the number of times a word appears to affect the scoring for my app. Best regards

Lucene Scoring

2018-07-15 Thread Baris Kazar
modified but the order of results is pretty much the same. what happens is that when part of the search string is found on both fields then those entries are hit first since Lucene scoring takes number of occurrences as dominant in scoring. But i want the search string to be fully-matched with the

Re: Custom scoring algorithm and Explanation extending.

2017-11-22 Thread Vadim Gindin
Thanks a lot! On Mon, Nov 20, 2017 at 11:22 PM, Adrien Grand wrote: > Hi Vadim, > > On Thu, Nov 16, 2017 at 18:09, Vadim Gindin wrote: > > > 1. I would like to use my custom scoring algorithm. Does it make sense to > use > > Lucene with other scoring algor

Re: Custom scoring algorithm and Explanation extending.

2017-11-20 Thread Adrien Grand
Hi Vadim, On Thu, Nov 16, 2017 at 18:09, Vadim Gindin wrote: > 1. I would like to use my custom scoring algorithm. Does it make sense to use > Lucene with another scoring algorithm? What is the best way to do that - > implement Similarity and my own Queries? > It really depends what y

Custom scoring algorithm and Explanation extending.

2017-11-16 Thread Vadim Gindin
Hello 1. I would like to use my custom scoring algorithm. Does it make sense to use Lucene with another scoring algorithm? What is the best way to do that - implement Similarity and my own Queries? 2. I'm researching Elasticsearch/Lucene capabilities. Elasticsearch contains a request parameter

Unexpected scoring results

2017-07-18 Thread Jeff Wallace
duplicate documents can sometimes report score values that differ considerably for the supposedly duplicate content? Searching through some of the older Lucene mail archives I did notice what I believe to be discussions concerning development test failures having to do with unexpected scoring

Re: Get values in custom scoring during document retrieval

2017-01-17 Thread Uwe Schindler
distance for filtering purposes. And then again i >need >the distance for scoring purposes. I also need the distance for display >purposes and i display some 100 results. So are you sayings its still >okay >to compute the distance twice here once for scoring and once for >displa

Re: Get values in custom scoring during document retrieval

2017-01-17 Thread sidhant92
Okay, say I need the distance for filtering purposes, and then again I need the distance for scoring purposes. I also need the distance for display purposes and I display some 100 results. So are you saying it's still okay to compute the distance twice here, once for scoring and once for display

Re: Get values in custom scoring during document retrieval

2017-01-17 Thread Adrien Grand
Sorry I just saw your other message that has a bit more information. Actually you do not need the distance for displaying purposes but both for filtering and custom scoring. That said, I think recomputing the distances is still the way to go. Geo-distance filters have optimizations that allow them

Re: Get values in custom scoring during document retrieval

2017-01-17 Thread Adrien Grand
am using custom score provider for scoring lucene documents manually. I > am > doing many calculations in custom score provider to calculate the score. > For > example on of them is distance. So now once the scoring is done i would > like > to know that distance as well. Instead of co

Get values in custom scoring during document retrieval

2017-01-13 Thread sidhant92
I am using a custom score provider for scoring Lucene documents manually. I am doing many calculations in the custom score provider to calculate the score. For example, one of them is distance. So now, once the scoring is done, I would like to know that distance as well. Instead of computing it again, can't I

Re: Disabling Lucene Scoring/Ranking

2017-01-09 Thread Mikhail Khludnev
fwiw https://issues.apache.org/jira/browse/LUCENE-5867 is going to be released soon. On Mon, Jan 9, 2017 at 2:17 PM, Rajnish kamboj wrote: > My application does not require scoring/ranking. All data is equally > important for me. > > Search query can return any documents mat

Re: Disabling Lucene Scoring/Ranking

2017-01-09 Thread Michael McCandless
In most cases, it should. Test it and find out and report back :) Mike McCandless http://blog.mikemccandless.com On Mon, Jan 9, 2017 at 10:07 AM, Rajnish kamboj wrote: > Thanks for quick responses.. > I will try the approach.. > > Does bypassing scoring increases search perf

Re: Disabling Lucene Scoring/Ranking

2017-01-09 Thread Rajnish kamboj
Thanks for the quick responses. I will try the approach. Does bypassing scoring also increase search performance? Regards Rajnish On Monday, January 9, 2017, Ian Lea wrote: > oal.search.ConstantScoreQuery? > > "A query that wraps another query and simply returns a constant sc

Re: Disabling Lucene Scoring/Ranking

2017-01-09 Thread Ian Lea
aher Galal wrote: > Hi, > > What about writing your own scoring that just give a value of 1 to all the > documents that are hits? > > On Mon, Jan 9, 2017 at 12:17 PM, Rajnish kamboj > wrote: > > > My application does not require scoring/ranking. All data is equally >

Re: Disabling Lucene Scoring/Ranking

2017-01-09 Thread Michael McCandless
ire scoring/ranking. All data is equally > important for me. > > Search query can return any documents matching search criteria. > > So, Is there a way to completely disable scoring/ranking altogether? > OR Is there a better solution to it.

Re: Disabling Lucene Scoring/Ranking

2017-01-09 Thread Taher Galal
Hi, What about writing your own scoring that just gives a value of 1 to all the documents that are hits? On Mon, Jan 9, 2017 at 12:17 PM, Rajnish kamboj wrote: > My application does not require scoring/ranking. All data is equally > important for me. > > Search query can return a

Disabling Lucene Scoring/Ranking

2017-01-09 Thread Rajnish kamboj
My application does not require scoring/ranking. All data is equally important for me. Search query can return any documents matching search criteria. So, Is there a way to completely disable scoring/ranking altogether? OR Is there a better solution to it. Regards Rajnish

Re: Explain Scoring function in LMJelinekMercerSimilarity Class

2016-12-20 Thread Dwaipayan Roy
Waiting for an explanation for my query. Thank you very much. On Tue, Dec 20, 2016 at 10:51 PM, Dwaipayan Roy wrote: > Hello, > > Can anyone help me understand the scoring function in the > LMJelinekMercerSimilarity class? > > The scoring function in LMJelinekMercerSimilar

Re: Explain Scoring function in LMJelinekMercerSimilarity Class

2016-12-20 Thread Will Martin
https://doi.org/10.3115/981574.981579 On 12/20/2016 12:21 PM, Dwaipayan Roy wrote: Hello, Can anyone help me understand the scoring function in the LMJelinekMercerSimilarity class? The scoring function in LMJelinekMercerSimilarity is shown below

Explain Scoring function in LMJelinekMercerSimilarity Class

2016-12-20 Thread Dwaipayan Roy
Hello, Can anyone help me understand the scoring function in the LMJelinekMercerSimilarity class? The scoring function in LMJelinekMercerSimilarity is shown below: float score = stats.getTotalBoost() * (float)Math.log(1 + ((1 - lambda
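The snippet above is cut off mid-expression. For orientation, the usual Jelinek-Mercer smoothed score can be written as a standalone function. This is a sketch under the assumption that the formula is boost * log(1 + ((1 - lambda) * freq / docLen) / (lambda * collectionProbability)), which matches the visible start of the quoted expression; the parameter names are illustrative and the code does not depend on Lucene.

```java
// Standalone sketch of Jelinek-Mercer smoothing; formula and names are assumptions
// for illustration, not a copy of the Lucene class.
final class JelinekMercerSketch {

    // boost:                 query-time boost
    // freq:                  term frequency in the document
    // docLen:                document length in tokens
    // lambda:                smoothing weight in (0, 1]; higher means more smoothing
    // collectionProbability: probability of the term in the whole collection (> 0)
    static double score(double boost, double freq, double docLen,
                        double lambda, double collectionProbability) {
        double documentModel = (1.0 - lambda) * freq / docLen;
        double collectionModel = lambda * collectionProbability;
        // log(1 + ratio) is zero when the term is absent (freq = 0).
        return boost * Math.log(1.0 + documentModel / collectionModel);
    }

    public static void main(String[] args) {
        System.out.println(score(1.0, 2.0, 100.0, 0.5, 0.01));
    }
}
```

With freq = 2, docLen = 100, lambda = 0.5 and a collection probability of 0.01, the ratio inside the logarithm is 0.01 / 0.005 = 2, so the score is log(3).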

Lucene 5.4 - scoring divided by number of search terms?

2016-03-13 Thread Martin Krämer
I have a simple setup with IndexSearcher, QueryParser, SimpleAnalyzer. Running some queries I recognised that a query with more than one term returns a different ScoreDoc[i].score than shown in explain query statement. Apparently it is the score shown in explain divided by the number of search term

Re: Changing the lucene scoring function

2015-11-21 Thread Doug Turnbull
ers Doug On Saturday, November 21, 2015, Victor Makarenkov wrote: > Hi everybody! > > I would appreciate if you can refer me to some *example *or explanation of > how to change the scoring function of lucene. > > I would expect 2 options: > > 1. changing some configuration

Changing the lucene scoring function

2015-11-21 Thread Victor Makarenkov
Hi everybody! I would appreciate if you can refer me to some *example *or explanation of how to change the scoring function of lucene. I would expect 2 options: 1. changing some configuration, so the ranking function becomes , say Okapi BM 25 instead of standard similarity 2. Is there any

Lucene Scoring in Exact and Phrase Matching

2015-11-18 Thread JayJones11
I'm fairly new to Elasticsearch and Lucene. I quickly went through the Elasticsearch definitive guide and was able to understand how the scoring is calculated for boolean, term and multi term queries. The basic weighting is TF-IDF and scoring is based on custom VSM. Depending on query constru

Re: Scoring over Multiple Indexes

2015-10-22 Thread McKinley, James T
ering from the low statistics problem Erick described. We use an FST (see org.apache.lucene.util.fst.Builder) to hold the stats in memory so that the lookups are fast. Jim From: Erick Erickson Sent: 22 October 2015 15:15 To: java-user Subject: Re: Scoring

Re: Scoring over Multiple Indexes

2015-10-22 Thread Erick Erickson
>> We have a test case that boosts a set of terms. Something along the >>>lines of "term1^2 AND term2^3 AND term3^4" and this query runs over a two >>>content distinct indexes. Our expectation is that the terms would be >>>returned to us as term3, term2 and term1. I

Re: Scoring over Multiple Indexes

2015-10-22 Thread Bauer, Herbert S. (Scott)
terms would be >>returned to us as term3, term2 and term1. Instead we get something >>along the lines of term3, term1 and term2. I realize from a number of >>postings that this is the result of the scoring methods action taking >>place within an individual index rathe

Re: Scoring over Multiple Indexes

2015-10-22 Thread Erick Erickson
long the lines of > term3, term1 and term2. I realize from a number of postings that this is the > result of the scoring methods action taking place within an individual index > rather than against several indexes. At the same time I don’t see a lot of > solutions offered. Is there an ou

Scoring over Multiple Indexes

2015-10-22 Thread Bauer, Herbert S. (Scott)
of term3, term1 and term2. I realize from a number of postings that this is the result of the scoring methods action taking place within an individual index rather than against several indexes. At the same time I don’t see a lot of solutions offered. Is there an out of the box solution to

Use absolute term position for scoring

2015-08-31 Thread aurelien . mazoyer
Hi all, I want to take into account the absolute position of the term for the score calculation. I found many threads that deal with this issue, and the answer is often: "use SpanFirstQuery". The problem with this approach is that it is too "boolean" for me (the document matches the spanfirstq

LMDirichletSimilarity Scoring Function

2015-04-11 Thread Ronan Cummins
really be removed and the following document-specific score should be added to the document score after the term-scoring part (unless I am missing some background scoring that is going on in Lucene): + queryLen * Math.log(mu / (docLen + mu)) Therefore, my question is as follows: Where in lucene

Lucene fuzzy and wildcard search, and scoring in AutomatonQuery

2015-02-18 Thread Yossi Vainshtein
tin query of this sort in Lucene, I've searched for solutions, this issue has been asked about. I used the approach suggested here http://stackoverflow.com/questions/28565090/scoring-results-of-automatonquery <http://stackoverflow.com/questions/2631206/lucene-query-bla-match-words-that-start-wi

Re: disabling all scoring?

2015-02-05 Thread Ahmet Arslan
ed to retrieve them by a query (so using search), but I don't need any scoring nor keeping the documents in any order. When profiling the application, I saw that for my tests, my entire search takes about 2.4 seconds, and BulkScorer takes 0.4 seconds. So I figured that without scoring, I would

disabling all scoring?

2015-02-04 Thread Rob Audenaerde
Hi all, I'm doing some analytics with a custom Collector on a fairly large number of searchresults (+-100.000, all the hits that return from a query). I need to retrieve them by a query (so using search), but I don't need any scoring nor keeping the documents in any order. When pro

Re: Absolute term position in scoring

2015-01-26 Thread Michael McCandless
ent: Monday, January 26, 2015 11:49 AM >> To: java-user@lucene.apache.org >> Subject: Re: Absolute term position in scoring >> >> Hello! >> >> I'd like to ask if this approach: construct a complex query consisting of a >> boosted "specialized"

RE: Absolute term position in scoring

2015-01-26 Thread Uwe Schindler
o: java-user@lucene.apache.org > Subject: Re: Absolute term position in scoring > > Hello! > > I'd like to ask if this approach: construct a complex query consisting of a > boosted "specialized" part and an "ordinary" part with no boost, - doesn't

Re: Absolute term position in scoring

2015-01-26 Thread Alexey Morozov
t of the document. > > > Mike McCandless > > http://blog.mikemccandless.com > > On Sun, Jan 25, 2015 at 5:44 PM, Luis A Lastras wrote: > >> Thanks I didn't know about SpanFirstQuery. I can likely get something >> going with that. I was still hoping that we co

Re: Absolute term position in scoring

2015-01-26 Thread Michael McCandless
wrote: > Thanks I didn't know about SpanFirstQuery. I can likely get something > going with that. I was still hoping that we could affect the scoring > formula with the position itself, but maybe this is not feasible. > > Luis > > > >

Re: Absolute term position in scoring

2015-01-25 Thread Luis A Lastras
Thanks I didn't know about SpanFirstQuery. I can likely get something going with that. I was still hoping that we could affect the scoring formula with the position itself, but maybe this is not feasible.

Re: Absolute term position in scoring

2015-01-25 Thread Michael McCandless
Maybe SpanFirstQuery? Mike McCandless http://blog.mikemccandless.com On Sat, Jan 24, 2015 at 9:34 PM, Luis A Lastras wrote: > Is it possible to incorporate in Lucene's scoring function the position of > a matching term (say as measured from the top of the document). The > scena

Absolute term position in scoring

2015-01-24 Thread Luis A Lastras
Is it possible to incorporate in Lucene's scoring function the position of a matching term (say, as measured from the top of the document)? The scenario is: if the documents tend to talk about the most important stuff at the beginning of the document, then we would like to give preferen

MultiReader scoring

2014-05-12 Thread Tamer Gur
Dear lucene users, we are using lucene(4.6) MultiReader for different indexes and for performance reasons i am going to replace it with normal Reader. But we need to keep the scoring similar with MultiReader. and as expected when we switch to normal Reader scoring for each result is not

grouped scoring

2014-04-07 Thread Michael Sokolov
I have an idea for something I'm calling grouped scoring, and I want to know if anybody has already done anything like this. The idea comes from the problem that in your search results you'd like to show only one or a small number of items from each group: for example on google.com

How scorePayload is used as a Scoring Factor?

2013-11-30 Thread Furkan KAMACI
Hi; the TFIDFSimilarity class documentation says this about the return value of scorePayload(): *An implementation dependent float to be used as a scoring factor* However, when I read here: http://lucene.apache.org/core/4_6_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html I don'

Re: Performance/scoring impacts with multiple occurrences of a field

2013-10-11 Thread Ian Lea
With multiple fields of the same name vs a single field I doubt you'd be able to tell the difference in performance or matching or scoring in normal use. There may be some matching/ranking effect if you are looking at, say, span queries across the multiple fields. Try it out and see what ha

Performance/scoring impacts with multiple occurrences of a field

2013-10-07 Thread Earl Hood
the same name? The other question is whether scoring of results differs between the use of a single field vs multiple fields of the same name? For results ranking, I am guessing there is an effect based on <https://wiki.apache.org/lucene-java/LuceneFAQ#How_can_I_search_over_multiple_fields.3F>

Coordination factor disabled for BM25 and other new scoring models

2013-08-22 Thread Markus Jelsma
Hi, I know it is recommended to disable the coordination factor when using models other than default TFIDFSimilarity. And out of curiosity i'd like to know the motivation behind it but it is not explained anywhere, not even in LUCENE-2959, the patches, wiki, PDF's or whatever. So, anyone here

Re: Question on wildcard queries, filters, scoring and TooManyClauses exception

2013-08-21 Thread Duke DAI
", "t")); > > indexSearcher.search(prefixQuery, prefixFilter, collector); > > > > This returns about 5000 hits on my index. > > > > But then I discovered that it works just as well without the filter: > > > > QueryParser queryParser = new QueryParser(Ver

RE: Question on wildcard queries, filters, scoring and TooManyClauses exception

2013-08-16 Thread Bill Chesky
); > > Why, I don't know. Seems like this would get expanded out into 5000 > BooleanQueries and since my max clause count is still set to the default 1024 > I should get the exception. But I didn't. So maybe I don't need the filter > after all? > > Next, I need s

Re: Question on wildcard queries, filters, scoring and TooManyClauses exception

2013-08-16 Thread Ian Lea
refixQuery, collector); > > Why, I don't know. Seems like this would get expanded out into 5000 > BooleanQueries and since my max clause count is still set to the default 1024 > I should get the exception. But I didn't. So maybe I don't need the filter > after all

Question on wildcard queries, filters, scoring and TooManyClauses exception

2013-08-15 Thread Bill Chesky
count is still set to the default 1024 I should get the exception. But I didn't. So maybe I don't need the filter after all? Next, I need scoring to work. I read that with wildcard queries all scores are set to 1.0 by default. But I read you can use the QueryParser.setMultiTe

RE: Lucene VSM scoring

2013-07-09 Thread Uwe Schindler
Hi, TF-IDF is just the default (and fast) scoring scheme. You can modify that (the "Similarity") as you want (since Lucene 4.0): http://lucene.apache.org/core/4_3_1/core/org/apache/lucene/search/similarities/package-summary.html There are already various other ones available, like

Lucene VSM scoring

2013-07-09 Thread Jason Z.
Hi, In the Lucene docs it mentions that Lucene implements a tf-idf weighting scheme for scoring. Is there any way to modify Lucene to implement a custom weighting scheme for the VSM? Thank you.

Re: Scoring function in LMDirichletSimilarity Class

2013-04-04 Thread Peter Organisciak
ZP > > P.S: Instead of creating a new question, I used your question because I > believe that the reason should be the same. > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Scoring-function-in-LMDiri

RE: Document scoring order?

2013-04-04 Thread Uwe Schindler
is the order in which > > documents are processed/scored and can that be changed? I'm guessing > > it scores matches in whichever order they are stored in the index/on > > disk, which means by increasing docIDs? > > > > I do see some out of order scoring

Re: Document scoring order?

2013-04-04 Thread Alan Woodward
in which > documents are processed/scored and can that be changed? I'm guessing > it scores matches in whichever order they are stored in the index/on > disk, which means by increasing docIDs? > > I do see some out of order scoring is possible but can one visit > docs to sco

RE: Document scoring order?

2013-04-03 Thread Uwe Schindler
Hi Otis, they are generally processed in docId order. The special case "out-of-order" processing is only used for BooleanScorer1, in which the document IDs can be reported to the Collector out-of-order (because BooleanScorer scores documents in buckets). If you don’t allow out-of-ord

Document scoring order?

2013-04-03 Thread Otis Gospodnetic
Hi, When Lucene scores matching documents, what is the order in which documents are processed/scored and can that be changed? I'm guessing it scores matches in whichever order they are stored in the index/on disk, which means by increasing docIDs? I do see some out of order scoring is pos

Re: Scoring function in LMDirichletSimilarity Class

2013-04-02 Thread Zeynep P.
, I used your question because I believe that the reason should be the same. -- View this message in context: http://lucene.472066.n3.nabble.com/Scoring-function-in-LMDirichletSimilarity-Class-tp4052488p4053267.html Sent from the Lucene - Java Users mailing list archive at Nabble.com

Scoring function in LMDirichletSimilarity Class

2013-03-29 Thread python2020
Hi,   Can anyone help me understand the scoring function in the LMDirichletSimilarity class?   The scoring function in LMDirichletSimilarity is shown below: --- float score = stats.getTotalBoost() * (float
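The quoted expression is truncated here as well. As a hedged reference point, the Dirichlet-smoothed score is commonly written as boost * (log(1 + freq / (mu * collectionProbability)) + log(mu / (docLen + mu))), with negative results clamped to zero. The sketch below assumes that form; the names and the example value of mu are illustrative, and nothing here calls into Lucene.

```java
// Standalone sketch of Dirichlet-prior smoothing; formula and names are assumptions
// for illustration, not a copy of the Lucene class.
final class DirichletSketch {

    // mu: smoothing parameter (pseudo-count of prior observations)
    // collectionProbability: probability of the term in the whole collection (> 0)
    static double score(double boost, double freq, double docLen,
                        double mu, double collectionProbability) {
        // Term-dependent part: grows with the term frequency in the document.
        double termPart = Math.log(1.0 + freq / (mu * collectionProbability));
        // Document-dependent part: always <= 0, penalizes longer documents.
        double lengthPart = Math.log(mu / (docLen + mu));
        double score = boost * (termPart + lengthPart);
        // Negative scores are clamped to zero in this sketch.
        return score > 0.0 ? score : 0.0;
    }

    public static void main(String[] args) {
        System.out.println(score(1.0, 6.0, 2000.0, 2000.0, 0.001));
    }
}
```

For freq = 6, docLen = 2000, mu = 2000 and a collection probability of 0.001, the two parts are log(4) and log(0.5), which add up to log(2); with freq = 0 the sum is negative and the clamp returns 0.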

Re: Lucene scoring

2013-03-12 Thread Ian Lea
AM, lucas van overberghe wrote: > Hi, > > We are currently using Hibernate Search but had some questions > regarding scoring. We are implementing a quicksearchengine in our > webapp but want to customize the scoring a bit. > > Let's say, you have a User named Peter, and
