Re: Proximity query

2015-02-12 Thread Maisnam Ns
Hi, I googled it but could not find the jars for these classes. Can someone help me find where to get the jars? import org.apache.lucene.corpus.stats.IDFCalc; import org.apache.lucene.corpus.stats.TFIDFPriorityQueue; import org.apache.lucene.corpus.stats.TermIDF; Thanks On Thu, Feb 12, 2015 at 11:01 PM, M

Re: Proximity query

2015-02-12 Thread Maisnam Ns
Hi Allison and Sujit, Thanks so much for your links. I am so happy; these are exactly the links that almost cover my use case. Allison, I will be sure to get back to you if I have more questions. Regards NS On Thu, Feb 12, 2015 at 10:49 PM, Sujit Pal wrote: > I did something like this

Re: Proximity query

2015-02-12 Thread Sujit Pal
I did something like this sometime back. The objective was to find patterns surrounding some keywords of interest so I could find keywords similar to the ones I was looking for, sort of like a poor man's word2vec. It uses SpanQuery as Jigar said, and you can find the code here (I believe it was wri

RE: Proximity query

2015-02-12 Thread Allison, Timothy B.
-user@lucene.apache.org Subject: Re: Proximity query Hi Shah, Thanks for your reply. Will try to google SpanQuery meanwhile if you have some links can you please share Thanks On Thu, Feb 12, 2015 at 10:17 PM, Jigar Shah wrote: > This concept is called Proximity Search in general. > >

Re: Proximity query

2015-02-12 Thread Maisnam Ns
Hi Shah, Thanks for your reply. I will try to google SpanQuery; meanwhile, if you have some links, can you please share them? Thanks On Thu, Feb 12, 2015 at 10:17 PM, Jigar Shah wrote: > This concept is called Proximity Search in general. > > In Lucene they are achieved using SpanQuery. > > On Thu, Feb 1

Re: Proximity query

2015-02-12 Thread Jigar Shah
This concept is called Proximity Search in general. In Lucene they are achieved using SpanQuery. On Thu, Feb 12, 2015 at 10:10 PM, Maisnam Ns wrote: > Hi, > > Can someone help me if this use case is possible or not with lucene > > Use case: I have a string say 'Japan' appearing in 10 documents

Proximity query

2015-02-12 Thread Maisnam Ns
Hi, Can someone tell me whether this use case is possible with Lucene? Use case: I have a string, say 'Japan', appearing in 10 documents, and I want to get back results that contain the two words before 'Japan' and the two words after 'Japan', maybe something like this ' Economy of Japan is gro
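
To make the use case above concrete, here is a minimal sketch in plain Python of pulling a window of words around each hit. It is not Lucene code and does not use SpanQuery; the function name `context_windows`, the regex tokenizer, and the sample sentence are all assumptions made for illustration only.

```python
import re


def context_windows(text, keyword, before=2, after=2):
    """Return (left_words, hit, right_words) for each keyword occurrence.

    Hypothetical helper: tokenizes on word characters and matches the
    keyword case-insensitively. A real index would use stored positions.
    """
    tokens = re.findall(r"\w+", text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = tokens[max(0, i - before):i]    # up to `before` words
            right = tokens[i + 1:i + 1 + after]    # up to `after` words
            hits.append((left, tok, right))
    return hits


if __name__ == "__main__":
    # Made-up sample sentence, not taken from the thread.
    for left, hit, right in context_windows(
            "The economy of Japan is growing fast", "Japan"):
        print(" ".join(left), "[" + hit + "]", " ".join(right))
```

In Lucene itself the replies in this thread point to SpanQuery for the matching step; the sketch only shows what the desired output looks like once the hit positions are known.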

Any CommonGrams-inspired tricks to speed up other proximity query types?

2012-06-21 Thread Chris Harris
CommonGrams provides a neat trick for optimizing slow phrase queries that contain common words. (E.g. Hathi Trust has some data showing how effective this can be.) Unfortunately, it does nothing for other positi

Re: Generalized proximity query performance

2007-10-07 Thread Chris Hostetter
: If I could intelligently rewrite queries, this would be better formulated : as: : title:"harry potter"~5 genre:books : : Instead, since I don't have that knowledge, I should perhaps rewrite several : guesses, and take the dismax. These guesses are equivalent to passing the right. okay. the b

Re: Generalized proximity query performance

2007-10-05 Thread Kyle Maxwell
> > Hmmm.. can you give some more concrete examples of what you mean by this? > both in terms of the use case you are trying to satisfy, and in terms of > how your current code works ... you don't have to post code or give away > trade secrets, just describe it as a black box (ie: what is the input

Re: Generalized proximity query performance

2007-10-05 Thread Mike Klaas
On 5-Oct-07, at 11:27 AM, Chris Hostetter wrote: that's what i thought first too, and it is a problem i'd eventually like to tackle ... it was the part about "c" being in a different field from "a" and "b" that confused me ... i don't know what exactly is being suggested here. I'm

Re: Generalized proximity query performance

2007-10-05 Thread Chris Hostetter
: > : would like to allow for the possibility that a and b are near each other : > in : > : one field, while c is in another field. : I understand the OP to want a PhraseQuery that has an intention (rather than : side-effect) of doing proximity-based scoring. : : "phrase query here"~1000 is the

Re: Generalized proximity query performance

2007-10-05 Thread Mike Klaas
On 5-Oct-07, at 10:54 AM, Chris Hostetter wrote: : I am using a hand rolled query of the following form (implemented with : SpanNearQuery, not a sloppy PhraseQuery): : a b c => +(a AND b AND c) OR "a b"~5 OR "b c"~5 : : The obvious solution, "a b c"~5, is not applicable for my issues, becaus

Re: Generalized proximity query performance

2007-10-05 Thread Chris Hostetter
: I am using a hand rolled query of the following form (implemented with : SpanNearQuery, not a sloppy PhraseQuery): : a b c => +(a AND b AND c) OR "a b"~5 OR "b c"~5 : : The obvious solution, "a b c"~5, is not applicable for my issues, because I : would like to allow for the possibility that a an

Generalized proximity query performance

2007-10-03 Thread Kyle Maxwell
Hi again, As the subject would suggest, I'm trying to implement a layer of proximity weighting over Lucene. This has greatly increased search relevance, but at the same time has knocked down performance by a substantial amount (see footer). I am using a hand-rolled query of the following form (impl
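
The rewrite quoted in the replies to this message (a b c => +(a AND b AND c) OR "a b"~5 OR "b c"~5) can be sketched as simple string building. This is only an illustration in Python: the poster says the real implementation uses SpanNearQuery, and the function name `proximity_rewrite` and its `slop` parameter are hypothetical.

```python
def proximity_rewrite(terms, slop=5):
    """Build the query form quoted in the thread for a list of terms.

    Hypothetical helper: requires all terms, then adds one sloppy phrase
    clause per adjacent pair of terms.
    """
    required = "+(" + " AND ".join(terms) + ")"
    pairs = ['"%s %s"~%d' % (a, b, slop)
             for a, b in zip(terms, terms[1:])]
    return " OR ".join([required] + pairs)


if __name__ == "__main__":
    print(proximity_rewrite(["a", "b", "c"]))
```

The number of pair clauses grows linearly with the number of terms, which may be one reason the poster sees a performance cost on longer queries.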

Re: Number Proximity Query

2006-10-04 Thread Chris Hostetter
: Another quick question on the score. If my custom Query is returning a score : that can be any value, and this custom Query is being used together with : other standard Query in a BooleanQuery. How do I ensure the value returned by : the custom Query doesn't 'overshadow' the values returned by other

Re: Number Proximity Query

2006-10-04 Thread KEGan
Chris, thanks again for your reply. Really appreciate your help. Another quick question on the score. If my custom Query is returning a score that can be any value, and this custom Query is being used together with other standard Query in a BooleanQuery. How do I ensure the value returned by the cu

Re: Number Proximity Query

2006-10-04 Thread Chris Hostetter
: (1) Must values returned by DocValues (returned from ValueSource) : always be between 1.0 and 0.0? How does this value affect the overall document : scores, assuming there are other Query clauses as well that are performed on : the document (on other fields). The "values" returned by the various

Re: Number Proximity Query

2006-10-04 Thread KEGan
Erick, thanks for your reply. I have the LIA. But the sorting is not the solution I am looking for. As if I sort, I will lose out the relevancy from searches of other fields. I want the number proximity to be one in many of the fields that is searched. So the "num" field will contribute to the ov

Re: Number Proximity Query

2006-10-04 Thread Erick Erickson
Sorry if this is a re-post, but I got an "undeliverable" error last time I tried to post it, something about SPAM. The nerve of that filter! I don't have my book handy, but you might want to check out "Lucene In Action". There's an example of how to create an index of restaurants

Re: Number Proximity Query

2006-10-04 Thread KEGan
Thanks Chris. After spending half a day to "really" look into FunctionQuery (and related classes), and re-reading about Weight and Scorer, I think I am beginning to understand a bit. But more questions. (1) Must values returned by DocValues (returned from ValueSource) always be between 1.0 and 0

Re: Number Proximity Query

2006-10-03 Thread Chris Hostetter
: From my searches, there seems to be a FunctionQuery in Solr that can do this : type of query. But I am using pure Lucene, and trying to port Solr code over : (to create my own version of FunctionQuery) looks too complicated because of : code dependency on other Solr code such as ValueSource, et

Number Proximity Query

2006-10-03 Thread KEGan
Hi, Is there a way to query all numbers that are close to a particular number (query), and score by how close they are to that number (query)? To illustrate further, assume a document with a single field "num", and the value for this field can only be an integer. Now, let's say there are 3 docum
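
As a rough illustration of scoring by numeric closeness, here is a minimal Python sketch. It is not the FunctionQuery approach discussed in the replies; the scoring formula 1 / (1 + |value - target|) is an assumption chosen only because it stays in (0, 1] and peaks at an exact match, and the document ids and values are made up.

```python
def proximity_score(value, target):
    """Score in (0, 1]; 1.0 for an exact match, lower as distance grows.

    Assumed formula for illustration, not one taken from the thread.
    """
    return 1.0 / (1.0 + abs(value - target))


def rank_by_proximity(docs, target):
    """Return (doc_id, score) pairs, closest value first.

    `docs` maps a doc id to the integer value of its "num" field.
    """
    scored = [(doc_id, proximity_score(v, target))
              for doc_id, v in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical documents and query value.
    for doc_id, score in rank_by_proximity({"A": 10, "B": 20, "C": 35}, 22):
        print(doc_id, round(score, 3))
```

Keeping the score bounded like this is one way to stop the numeric clause from dominating other clauses when scores are combined, which is the concern raised later in the thread.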

Re: Proximity Query Parser

2006-09-01 Thread Paul Elschot
On Friday 01 September 2006 19:46, Mark Miller wrote: > Eric also gave me the idea of using a SpanNear with maximum slop as a > boolean to connect spans. Using this and SpanOr seems to make my time spent > on the distribution of proximity clauses a little foolish :) Is that true? There is practice

Re: Proximity Query Parser

2006-09-01 Thread Mark Miller
Eric also gave me the idea of using a SpanNear with maximum slop as a boolean to connect spans. Using this and SpanOr seems to make my time spent on the distribution of proximity clauses a little foolish :) Is that true? Is there any disadvantage to the max slop Spannear, SpanOr solution? Any adva

Re: Proximity Query Parser

2006-09-01 Thread Mark Miller
Thanks for the tip Paul. It is embarrassing, but I only realized how OrSpan queries worked a day or two ago based on a tip from Eric. The way I assumed it would create the spans before was just wrong and I never had researched further. Now I see that it would be a nice optimization for what I have

Re: Proximity Query Parser

2006-09-01 Thread Paul Elschot
On Friday 01 September 2006 12:54, Mark Miller wrote: > Hi Paul, > > I also have to treat things differently depending on if I am in a > proximity clause or boolean clause. A wildcard in a boolean is mapped to > a wildcard query. A wildcard in a proximity is mapped to a regex span > that has b

Re: Proximity Query Parser

2006-09-01 Thread Mark Miller
Paul Elschot wrote: Mark, On Thursday 31 August 2006 23:18, Mark Miller wrote: I am not a huge fan of the queryparser's syntax so I have started an open source project to create a viable alternative. I could really use some help testing it out. The more I can get it tested the better ch

Re: Proximity Query Parser

2006-09-01 Thread Paul Elschot
Mark, On Thursday 31 August 2006 23:18, Mark Miller wrote: > I am not a huge fan of the queryparser's syntax so I have started an > open source project to create a viable alternative. I could really use > some help testing it out. The more I can get it tested the better > chance it has of se

Proximity Query Parser

2006-08-31 Thread Mark Miller
I am not a huge fan of the queryparser's syntax so I have started an open source project to create a viable alternative. I could really use some help testing it out. The more I can get it tested the better chance it has of serving the community. The parser is called Qsol. I am right up again