date:20120419

RE: Two questions on RussianAnalyzer

2012-04-19 Thread Uwe Schindler

> My questions are: 1) is this change by design (not a mistake) and
> 2) is the only option to achieve the old behaviour to use
> Version.LUCENE_30 for creating the analyzer?

This is why this option is there!

DisjunctionMaxQuery and scoring

2012-04-19 Thread Benson Margulies

I am trying to solve a problem using DisjunctionMaxQuery.

Consider a query like:

a:b OR c:d OR e:f OR ...
name:richard OR name:dick OR name:dickie OR name:rich ...

At most, one of the richard names matches. So the match score gets dragged down by the long list of things that don't match, as the
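The dragging-down effect described above can be illustrated with a toy calculation. This is a minimal sketch, assuming the classic coord factor (overlap / maxOverlap, as in the default similarity of Lucene 3.x) and made-up clause scores of 1.0. `CoordSketch` and its methods are hypothetical names for illustration, not Lucene API, and query normalization is ignored.

```java
// Toy model of a flat OR query scored with a coord factor; this is not Lucene code.
final class CoordSketch {

    // Classic coord: the fraction of query clauses that matched the document.
    static double coord(int overlap, int maxOverlap) {
        return (double) overlap / maxOverlap;
    }

    // Sum of the matching clause scores, scaled by coord over all clauses.
    static double flatOrScore(double[] matchedClauseScores, int totalClauses) {
        double sum = 0.0;
        for (double s : matchedClauseScores) {
            sum += s;
        }
        return sum * coord(matchedClauseScores.length, totalClauses);
    }

    public static void main(String[] args) {
        // Hypothetical: three other clauses match, plus exactly one name variant.
        double[] matched = {1.0, 1.0, 1.0, 1.0};

        // One name clause in the query: 4 of 4 clauses match.
        System.out.println(flatOrScore(matched, 4));

        // Four name variants in the query: still 4 matches, but out of 7 clauses.
        System.out.println(flatOrScore(matched, 7));
    }
}
```

In this model the document matches the same clauses in both cases, yet the second score is lower only because three name variants that could never match together were counted in maxOverlap.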

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Robert Muir

On Thu, Apr 19, 2012 at 1:26 PM, Benson Margulies wrote:
> I am trying to solve a problem using DisjunctionMaxQuery.
>
> Consider a query like:
>
> a:b OR c:d OR e:f OR ...
> name:richard OR name:dick OR name:dickie OR name:rich ...
>
> At most, one of the richard names matches. So the match sco

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Benson Margulies

On Thu, Apr 19, 2012 at 1:34 PM, Robert Muir wrote:
> On Thu, Apr 19, 2012 at 1:26 PM, Benson Margulies wrote:
>> I am trying to solve a problem using DisjunctionMaxQuery.
>>
>> Consider a query like:
>>
>> a:b OR c:d OR e:f OR ...
>> name:richard OR name:dick OR name:dickie OR name:rich ..

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Benson Margulies

Turning on disableCoord for a nested boolean query does not seem to change the overall maxCoord term as displayed in explain.

Re: Two questions on RussianAnalyzer

2012-04-19 Thread Vladimir Gubarkov

On Thu, Apr 19, 2012 at 7:57 PM, Uwe Schindler wrote:
>> My questions are: 1) is this change by design (not a mistake) and
>> 2) is the only option to achieve the old behaviour to use
>> Version.LUCENE_30 for creating the analyzer?
>
> This is why this option is there!

Right and it's great, but thi

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Robert Muir

On Thu, Apr 19, 2012 at 3:49 PM, Benson Margulies wrote:
> On Thu, Apr 19, 2012 at 1:34 PM, Robert Muir wrote:
>> On Thu, Apr 19, 2012 at 1:26 PM, Benson Margulies wrote:
>>> I am trying to solve a problem using DisjunctionMaxQuery.
>>>
>>> Consider a query like:
>>>
>>> a:b OR c:d OR e:

Re: Two questions on RussianAnalyzer

2012-04-19 Thread Robert Muir

On Thu, Apr 19, 2012 at 7:26 AM, Vladimir Gubarkov wrote:
> New analyzer:
> [aaa.bbb.com, , a, b, c, d'e, f, g, h, i, j, k, l_m, n, o, p, q,
> r, s, t, u, v, z, y, z]
> Old analyzer:
> [aaa, bbb, com, , a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p,
> q, r, s, t, u, v, z, y, z]
>
> Please

RE: Two questions on RussianAnalyzer

2012-04-19 Thread Steven A Rowe

Hi Vladimir,

> The most uncomfortable in new behaviour to me is that in past I used
> to search by subdomain like bbb.com: and have displayed results
> with www.bbb.com:, aaa.bbb.com: and so on. Now I have 0
> results.

About domain names, see my response to a similar question today on

Re: Two questions on RussianAnalyzer

2012-04-19 Thread Vladimir Gubarkov

Thank you Robert for detailed reply

On Fri, Apr 20, 2012 at 12:37 AM, Robert Muir wrote:
> On Thu, Apr 19, 2012 at 7:26 AM, Vladimir Gubarkov wrote:
>> New analyzer:
>> [aaa.bbb.com, , a, b, c, d'e, f, g, h, i, j, k, l_m, n, o, p, q,
>> r, s, t, u, v, z, y, z]
>> Old analyzer:
>> [aaa, bbb,

Re: Two questions on RussianAnalyzer

2012-04-19 Thread Robert Muir

On Thu, Apr 19, 2012 at 4:51 PM, Vladimir Gubarkov wrote:
> Hmmm... I know this and I reindexed!
> I'll try to explain the problem (fortunately, already solved by using
> LUCENE_30) once again:
> When indexing with new analyzer the whole lexeme "some.cool.site.com"
> goes to index, not 4 lexemes "
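The effect described in this message can be sketched with two toy tokenizers. This is only an illustration under stated assumptions: `TokenSketch`, `splitOnPunctuation`, and `keepDottedWords` are hypothetical helpers written for this example. They are not Lucene's tokenizers and only mimic the two behaviours reported in the thread (splitting a host name at dots versus keeping it as one token).

```java
import java.util.ArrayList;
import java.util.List;

// Toy tokenizers; these only imitate the behaviours reported in the thread.
final class TokenSketch {

    // "Old" behaviour in this toy: every run of non-letter, non-digit characters is a break.
    static List<String> splitOnPunctuation(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // "New" behaviour in this toy: only whitespace is a break, so dotted names stay whole.
    static List<String> keepDottedWords(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // A document matches when every query token is among the document tokens.
    static boolean matches(List<String> docTokens, List<String> queryTokens) {
        return docTokens.containsAll(queryTokens);
    }

    public static void main(String[] args) {
        String doc = "some.cool.site.com";
        String query = "site.com";
        System.out.println(matches(splitOnPunctuation(doc), splitOnPunctuation(query)));
        System.out.println(matches(keepDottedWords(doc), keepDottedWords(query)));
    }
}
```

With the splitting tokenizer the query tokens "site" and "com" are both present in the document. With the whole-word tokenizer the only indexed token is the full host name, so the shorter query finds nothing.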

Re: Two questions on RussianAnalyzer

2012-04-19 Thread Robert Muir

On Thu, Apr 19, 2012 at 4:51 PM, Vladimir Gubarkov wrote:
> So it's now impossible to find this document with query: "site.com".
> I'm having an RSS subscription for that search, and now it's broken.
>

Just to point out, it's not impossible: as I suggested before, if you were happy with the old tok

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Benson Margulies

On Thu, Apr 19, 2012 at 4:21 PM, Robert Muir wrote:
> On Thu, Apr 19, 2012 at 3:49 PM, Benson Margulies wrote:
>> On Thu, Apr 19, 2012 at 1:34 PM, Robert Muir wrote:
>>> On Thu, Apr 19, 2012 at 1:26 PM, Benson Margulies wrote:
>>>> I am trying to solve a problem using DisjunctionMaxQuer

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Robert Muir

On Thu, Apr 19, 2012 at 5:05 PM, Benson Margulies wrote:
> On Thu, Apr 19, 2012 at 4:21 PM, Robert Muir wrote:
>> On Thu, Apr 19, 2012 at 3:49 PM, Benson Margulies wrote:
>>> On Thu, Apr 19, 2012 at 1:34 PM, Robert Muir wrote:
>>>> On Thu, Apr 19, 2012 at 1:26 PM, Benson Margulies wr

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Benson Margulies

On Thu, Apr 19, 2012 at 5:10 PM, Robert Muir wrote:
> On Thu, Apr 19, 2012 at 5:05 PM, Benson Margulies wrote:
>> On Thu, Apr 19, 2012 at 4:21 PM, Robert Muir wrote:
>>> On Thu, Apr 19, 2012 at 3:49 PM, Benson Margulies wrote:
>>>> On Thu, Apr 19, 2012 at 1:34 PM, Robert Muir wrote:

Re: Two questions on RussianAnalyzer

2012-04-19 Thread Vladimir Gubarkov

Thank you Steven, I'll look into this

On Fri, Apr 20, 2012 at 12:43 AM, Steven A Rowe wrote:
> Hi Vladimir,
>
>> The most uncomfortable in new behaviour to me is that in past I used
>> to search by subdomain like bbb.com: and have displayed results
>> with www.bbb.com:, aaa.bbb.com:

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Benson Margulies

I see why I'm so confused, but I think I need to construct a simpler test case.

My top-level BooleanQuery, which has disableCoord=false, has 22 clauses. All but three are ordinary SHOULD TermQueries. The remainder are a spanNear, a nested BooleanQuery, and an empty PhraseQuery (that's a bug).

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread David Murgatroyd

On Apr 19, 2012, at 6:36 PM, Benson Margulies wrote:
> I see why I'm so confused, but I think I need to construct a simpler test
> case.
>
> My top-level BooleanQuery, which has disableCoord=false, has 22
> clauses. All but three are ordinary SHOULD TermQueries. The remainder
> are a spanNe

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Robert Muir

On Thu, Apr 19, 2012 at 6:36 PM, Benson Margulies wrote:
> I see why I'm so confused, but I think I need to construct a simpler test
> case.
>
> My top-level BooleanQuery, which has disableCoord=false, has 22
> clauses. All but three are ordinary SHOULD TermQueries. The remainder
> are a spanNear

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread David Murgatroyd

[apologies for the earlier errant send]

I think
BooleanQuery bq = new BooleanQuery(false);
doesn't quite accomplish the desired "name IN (dick, rich)" scoring behavior. This is because (name:dick | name:rich) with coord=false would score the 'document' "Dick Rich" higher than "Rich" because the f

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Benson Margulies

FWIW, there seems to be an explain bug in 2.9.1 that is fixed in 3.6.0, so I'm no longer confused about the actual behavior.

On Thu, Apr 19, 2012 at 8:32 PM, David Murgatroyd wrote:
> [apologies for the earlier errant send]
>
> I think
> BooleanQuery bq = new BooleanQuery(false);
> doesn't quit

RE: DisjunctionMaxQuery and scoring

2012-04-19 Thread Uwe Schindler

Hi,

> I think
> BooleanQuery bq = new BooleanQuery(false); doesn't quite accomplish the
> desired "name IN (dick, rich)" scoring behavior. This is because (name:dick |
> name:rich) with coord=false would score the 'document' "Dick Rich" higher
> than "Rich" because the former has two term matches

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Robert Muir

On Thu, Apr 19, 2012 at 8:32 PM, David Murgatroyd wrote:
> In contrast, I think the desire
> is that one and only one of the terms in the document match those in the
> BooleanQuery so that "Rich" would score higher than "Dick Rich", given
> document length normalization. It's almost like a desire
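The difference between summing clause scores and taking their maximum can be sketched with a small calculation. This is a toy model under stated assumptions: per-term scores consist only of a length norm of 1/sqrt(number of terms in the field), the summing combiner stands in for a BooleanQuery with coord disabled, and the max combiner follows the DisjunctionMaxQuery formula of maximum plus tie-breaker times the remaining scores. `DisMaxSketch` and its methods are hypothetical names, not Lucene API.

```java
// Toy comparison of sum-style and max-style score combination; this is not Lucene code.
final class DisMaxSketch {

    // Assumed length normalization: shorter fields give higher per-term scores.
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // Sum of all matching clause scores (BooleanQuery-like, coord disabled).
    static double sumScore(double[] clauseScores) {
        double sum = 0.0;
        for (double s : clauseScores) {
            sum += s;
        }
        return sum;
    }

    // Best clause score plus tieBreaker times the sum of the other clause scores.
    static double disMaxScore(double[] clauseScores, double tieBreaker) {
        double sum = 0.0;
        double max = 0.0;
        for (double s : clauseScores) {
            sum += s;
            max = Math.max(max, s);
        }
        return max + tieBreaker * (sum - max);
    }

    public static void main(String[] args) {
        // Document "Dick Rich": two terms, both match the name variants.
        double[] dickRich = {lengthNorm(2), lengthNorm(2)};
        // Document "Rich": one term, one match.
        double[] rich = {lengthNorm(1)};

        System.out.println("sum:    Dick Rich=" + sumScore(dickRich) + " Rich=" + sumScore(rich));
        System.out.println("dismax: Dick Rich=" + disMaxScore(dickRich, 0.0)
                + " Rich=" + disMaxScore(rich, 0.0));
    }
}
```

In this model, summing ranks "Dick Rich" above "Rich" because it collects two matches, while the max combiner with a tie-breaker of 0 ranks "Rich" first because only the length norm separates the two documents.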

RE: DisjunctionMaxQuery and scoring

2012-04-19 Thread Uwe Schindler

Hi,

Ah sorry, I misunderstood, you wanted to score the duplicate match lower! To achieve this, you have to change the coord function in your similarity/BooleanWeight used for this query.

Either way: If you want a group of terms that gets only one score if at least one of the terms matches (SQL IN),
