Re: PhraseQuery

2020-01-24 Thread baris . kazar
Right, I am now getting the expected behavior. I wish the docs continued the New York example, for clarity and consistency. Best regards On 1/24/20 12:55 PM, Atri Sharma wrote: PhraseQuery enforces the order of terms specified and needs an exact match of order of terms unless slop is

Re: PhraseQuery

2020-01-24 Thread baris . kazar
Thanks for the quick responses. I had a bug in my code: I was building multiple PhraseQuery instances in a loop instead of a single PhraseQuery, so I was losing the order of terms. Best regards On 1/24/20 12:54 PM, Michael Froh wrote: Did you check the Javadoc for PhraseQuery.Builder? https

Re: PhraseQuery

2020-01-24 Thread Atri Sharma
PhraseQuery enforces the order of the terms specified and requires an exact match of that order unless slop is specified. When appending terms, the term position numbers need to increase as terms are added to the builder. On Fri, Jan 24, 2020 at 11:15 PM wrote: > > Hi,- > > how do i enforce the order of sequence of ter
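
The ordering rule described in this reply can be illustrated with a small plain-Java model. This is only a sketch of what "exact match of order with no slop" means; it does not use Lucene, and the class and method names (PhraseOrderSketch, matchesExactPhrase) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class PhraseOrderSketch {
    // Hypothetical helper, not a Lucene API: returns true if the phrase terms
    // occur in the document at consecutive positions and in the given order,
    // which models a phrase match with slop 0.
    public static boolean matchesExactPhrase(List<String> docTokens, List<String> phrase) {
        if (phrase.isEmpty()) {
            return false;
        }
        for (int start = 0; start + phrase.size() <= docTokens.size(); start++) {
            if (docTokens.subList(start, start + phrase.size()).equals(phrase)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("i", "love", "new", "york", "city");
        // Terms in document order match.
        System.out.println(matchesExactPhrase(doc, Arrays.asList("new", "york")));
        // The same terms in reversed order do not match.
        System.out.println(matchesExactPhrase(doc, Arrays.asList("york", "new")));
    }
}
```

In this model, adding terms in the wrong order changes the phrase itself, which is consistent with the bug reported in the thread: building the query incorrectly loses the intended term order.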

Re: PhraseQuery

2020-01-24 Thread Michael Froh
Did you check the Javadoc for PhraseQuery.Builder? https://lucene.apache.org/core/6_5_0/core/org/apache/lucene/search/PhraseQuery.Builder.html Checking the source code, I see that the add method that takes a position argument will throw an IllegalArgumentException if you try to add a Term in a lo

Re: PhraseQuery

2016-10-11 Thread lukes
Thanks Mike. I discovered that earlier. Regards.

Re: PhraseQuery

2016-10-10 Thread Michael McCandless
PhraseQuery will not work against StringField: that field indexes the entire string as a single token. Try running it against the TextField instead? Mike McCandless http://blog.mikemccandless.com On Wed, Oct 5, 2016 at 6:52 PM, lukes wrote: > Hi all, > > I am trying to do phrase query searc
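
A rough way to picture the difference Mike describes is to compare the tokens each kind of field would produce. The sketch below is plain Java, not Lucene code; the method names singleToken and tokenize are hypothetical, and the whitespace/lowercase splitting is an assumed stand-in for a real analyzer.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class FieldTokenSketch {
    // Models an untokenized field: the entire value is kept as one token.
    public static List<String> singleToken(String value) {
        return Collections.singletonList(value);
    }

    // Models a tokenized field using an assumed, very simple analyzer:
    // lowercase the value and split it on whitespace.
    public static List<String> tokenize(String value) {
        return Arrays.asList(value.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    public static void main(String[] args) {
        String value = "New York City";
        // One token: there are no separate "new" and "york" terms to line up.
        System.out.println(singleToken(value));
        // Three tokens at consecutive positions, so a phrase can be matched.
        System.out.println(tokenize(value));
    }
}
```

A phrase query needs at least two separately indexed terms with positions, which the single-token form cannot provide.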

Re: PhraseQuery boost doesn't affect ScoreDoc.score

2013-10-17 Thread Ian Lea
Boosting query clauses means more "this clause is more important than that clause" rather than "make the score for this search higher". I use it for biblio searching when I want to search across multiple fields and want matches in titles to be more important than matches in blurbs. Amended version

RE: PhraseQuery Search

2013-08-05 Thread Allison, Timothy B.
Try: http://lucene.apache.org/core/4_4_0/queryparser/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.html -Original Message- From: raghavendra.k@barclays.com [mailto:raghavendra.k@barclays.com] Sent: Friday, August 02, 2013 3:17 PM To: java-user@lucene.apach

Re: [PhraseQuery] Can "jakarta apache"~10 be searched by offset ?

2013-05-13 Thread Jack Krupansky
est. Or, you can file a Jira for a new Lucene Query for phrase and or span queries that measures distance by offsets rather than positions. -- Jack Krupansky -Original Message- From: wgggfiy Sent: Monday, May 13, 2013 3:47 AM To: java-user@lucene.apache.org Subject: Re: [PhraseQuery

Re: [PhraseQuery] Can "jakarta apache"~10 be searched by offset ?

2013-05-13 Thread wgggfiy
Jack, according to you, how can I implement this requirement? Could you give me a clue? Thank you very much. The regex query did not seem to work. I set up the field like this: FieldType fieldType = new FieldType(); FieldInfo.IndexOptions indexOptions = FieldInfo.IndexOptions.DOCS

RE: [PhraseQuery] Can "jakarta apache"~10 be searched by offset ?

2013-05-07 Thread wgggfiy
OK, thanks, but how can I implement this requirement now? Jack gave me a clue, but I failed: no docs are returned when I use a regex query like "jakarta.{1,10}apache". Are there limitations on regex queries, such as the field not being indexed, and so on? - -- Email: wuqiu.m...

RE: [PhraseQuery] Can "jakarta apache"~10 be searched by offset ?

2013-05-07 Thread Uwe Schindler
, 2013 8:37 AM > To: java-user@lucene.apache.org > Subject: Re: [PhraseQuery] Can "jakarta apache"~10 be searched by offset ? > > That's the question.When I get the doc by QueryParser("jakarta > apache"~10), which means it hits the query syntax, but it depe

Re: [PhraseQuery] Can "jakarta apache"~10 be searched by offset ?

2013-05-06 Thread wgggfiy
That's the question. When I get the doc by QueryParser("jakarta apache"~10), it matches the query syntax, but the match depends on word position and not on offset, which is not my intent. There are some docs which satisfy ("jakarta apache"~10) but do not satisfy the regex "jakarta.{1,1
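
The gap between the two notions of distance in this message can be shown with a plain-Java sketch. It does not use Lucene: tokensBetween counts whitespace-separated words between the two terms (a stand-in for position distance), while withinCharGap applies the regex from the thread to the raw text (character distance). Both method names and the sample sentences are assumptions made for illustration.

```java
import java.util.regex.Pattern;

public class PositionVsOffsetSketch {
    // Number of whitespace-separated tokens strictly between the first
    // occurrence of a and the next occurrence of b; -1 if no such pair exists.
    public static int tokensBetween(String text, String a, String b) {
        String[] tokens = text.trim().split("\\s+");
        int posA = -1;
        for (int i = 0; i < tokens.length; i++) {
            if (posA < 0 && tokens[i].equals(a)) {
                posA = i;
            } else if (posA >= 0 && tokens[i].equals(b)) {
                return i - posA - 1;
            }
        }
        return -1;
    }

    // True if b begins between 1 and maxChars characters after a ends,
    // i.e. the raw text matches a.{1,maxChars}b.
    public static boolean withinCharGap(String text, String a, String b, int maxChars) {
        String regex = Pattern.quote(a) + ".{1," + maxChars + "}" + Pattern.quote(b);
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String longWord = "jakarta supercalifragilistic apache";
        String shortWord = "jakarta and apache";
        // Both texts have exactly one word between the two terms...
        System.out.println(tokensBetween(longWord, "jakarta", "apache"));
        System.out.println(tokensBetween(shortWord, "jakarta", "apache"));
        // ...but only the second keeps them within 10 characters of each other.
        System.out.println(withinCharGap(longWord, "jakarta", "apache", 10));
        System.out.println(withinCharGap(shortWord, "jakarta", "apache", 10));
    }
}
```

Two documents can therefore look identical by word position and still differ by character offset, which is the mismatch being described here.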

Re: [PhraseQuery] Can "jakarta apache"~10 be searched by offset ?

2013-05-06 Thread Jack Krupansky
Do you mean the raw character offsets of the starting and ending characters of the terms? No. Although, if you index the text as a raw string, you might be able to come up with a regex query like "jakarta.{1,10}apache" -- Jack Krupansky -Original Message- From: wgggfiy Sent: Monda

Re: PhraseQuery vs MultiPhraseQuery

2010-05-28 Thread Ahmet Arslan
> Is there a fundamental difference between > > PhraseQuery query = new PhraseQuery(); > query.add(term1, 0); > query.add(term2, 0); > > and > > MultiPhraseQuery query = new MultiPhraseQuery(); > query.add( new Term[] { term1, term2 } ); > > The only difference I could think of is that MPQ som

Re: PhraseQuery Performance Issues [Lucene 2.9.0]

2010-03-29 Thread Rafael Turk
unsubscribe

Re: PhraseQuery Performance Issues [Lucene 2.9.0]

2010-03-22 Thread Daniel Shane
java-user@lucene.apache.org Sent: Friday, March 19, 2010 7:14:06 PM Subject: Re: PhraseQuery Performance Issues [Lucene 2.9.0] Nutch/Solr's CommonGrams is the right way to solve this. It combines frequent terms (eg stopwords) with adjacent terms. So "the wizard of oz" will be indexed e

Re: PhraseQuery Performance Issues [Lucene 2.9.0]

2010-03-19 Thread Michael McCandless
Nutch/Solr's CommonGrams is the right way to solve this. It combines frequent terms (e.g. stopwords) with adjacent terms. So "the wizard of oz" will be indexed e.g. as the_wizard wizard_of of_oz. It'll require a full re-index though, and you have to fix up searching so that the same term expansion wo
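
The term combination described in this message can be sketched in plain Java. This is a simplified model and not the Nutch/Solr implementation: the class and method names are hypothetical, the set of common words is assumed, and the real filter's output may differ in detail (for example in whether single terms are also kept).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class CommonGramsSketch {
    // Joins each pair of adjacent tokens with "_" when at least one token of
    // the pair is in the set of common (frequent) words.
    public static List<String> commonGrams(String text, Set<String> commonWords) {
        String[] tokens = text.toLowerCase(Locale.ROOT).trim().split("\\s+");
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.length; i++) {
            if (commonWords.contains(tokens[i]) || commonWords.contains(tokens[i + 1])) {
                grams.add(tokens[i] + "_" + tokens[i + 1]);
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // Assumed stopword list for the example.
        Set<String> common = new HashSet<>(Arrays.asList("the", "of"));
        // prints [the_wizard, wizard_of, of_oz]
        System.out.println(commonGrams("the wizard of oz", common));
    }
}
```

Because each combined term is much rarer than the stopword on its own, a phrase lookup has far fewer postings to walk, which is the performance gain the thread is after.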

Re: PhraseQuery with term positions

2010-01-19 Thread Avi Rosenschein
Index is pretty large (50GB, divided into 8 shards). I'm afraid I would start running into memory issues by adding the stop words (though it is definitely something I would like to test at some point). My question was more to try to understand if this was known behavior in lucene, since I can't re

Re: PhraseQuery with term positions

2010-01-19 Thread Erick Erickson
How big is your index? Because the simplest thing would be to just not remove stopwords at index or query time. Perhaps in a duplicate field depending upon your needs. Erick On Tue, Jan 19, 2010 at 6:50 AM, Avi Rosenschein wrote: > Hi, > > I am using PhraseQuery with explicitly set term position

Re: PhraseQuery in BooleanQuery not working properly in 2.9.0

2009-10-14 Thread Ion Barcan
Yes, the fix in src/java/org/apache/lucene/search/Scorer.java solves my problem, i.e. the queries return the correct number of results. On Wed, Oct 14, 2009 at 12:29 PM, Michael McCandless wrote: > It sounds likely that this is > https://issues.apache.org/jira/browse/LUCENE-1974 > > Is it possib

Re: PhraseQuery in BooleanQuery not working properly in 2.9.0

2009-10-14 Thread Michael McCandless
It sounds likely that this is https://issues.apache.org/jira/browse/LUCENE-1974 Is it possible for you to test that patch and verify it resolves your problem? Mike On Tue, Oct 13, 2009 at 9:23 AM, Ion Barcan wrote: > Hello, > > With the new Lucene 2.9.0 (on a newly built index of approx. 30 > m

Re: PhraseQuery in BooleanQuery not working properly in 2.9.0

2009-10-13 Thread Chris Hostetter
: With the new Lucene 2.9.0 (on a newly built index of approx. 30 : million documents) running BooleanQueries containing PhraseQuery does : not work properly. I've verified this on both optimized and : unoptimized index versions. I suspect that this is the same problem as identified in LUCENE-197

Re: PhraseQuery and non-letter characters

2008-12-02 Thread Ng Vinny
Hi Ian, thanks for the suggestion. I was able to write the custom analyzer to return non-letters as tokens, as well as to keep the numeric characters instead of skipping them. This is probably not the best solution, but at least I can have a demo without bugs :-) To save time for others who may ha

Re: PhraseQuery and non-letter characters

2008-11-28 Thread Ian Lea
I suggest you write your own analyzer that doesn't remove non-letter characters at index time. There might be one out there already, but not that I can think of off hand. Instead of leaving the non-letters in place you might consider doing something with position increments. I think that would pr

Re: PhraseQuery issues - differences with SpanNearQuery

2008-09-05 Thread Paul Elschot
On Friday 05 September 2008 16:57:34 Mark Miller wrote: > Paul Elschot wrote: > > On Thursday 04 September 2008 20:39:13 Mark Miller wrote: > >> Sounds like its more in line with what you are looking for. If I > >> remember correctly, the phrase query factors in the edit distance > >> in scorin

Re: PhraseQuery issues - differences with SpanNearQuery

2008-09-05 Thread Mark Miller
SpanScorer will use the similarity slop factor for each matching span size to adjust the effective frequency. Regards, Paul Elschot You have pointed this out to me before. One day I will remember. Every time I look things over again I miss it, and I couldn't find that email in the archive

Re: PhraseQuery issues - differences with SpanNearQuery

2008-09-05 Thread Mark Miller
Paul Elschot wrote: On Thursday 04 September 2008 20:39:13 Mark Miller wrote: Sounds like its more in line with what you are looking for. If I remember correctly, the phrase query factors in the edit distance in scoring, but the NearSpanQuery will just use the combined idf for each of the t

Re: PhraseQuery issues - differences with SpanNearQuery

2008-09-04 Thread Paul Elschot
On Thursday 04 September 2008 20:39:13 Mark Miller wrote: > Sounds like its more in line with what you are looking for. If I > remember correctly, the phrase query factors in the edit distance in > scoring, but the NearSpanQuery will just use the combined idf for > each of the terms in it, so dis

Re: PhraseQuery issues - differences with SpanNearQuery

2008-09-04 Thread Mark Miller
Sounds like it's more in line with what you are looking for. If I remember correctly, the phrase query factors in the edit distance in scoring, but the SpanNearQuery will just use the combined idf for each of the terms in it, so distance shouldn't matter with spans (I'm sure Paul will correct me

Re: PhraseQuery little bug?

2008-04-04 Thread Ivan Vasilev
In Lucene syntax, ~5 in this case means slop=5 - this is for Span queries. I think the problem is that in the class PhraseQuery the slop that we set is sometimes interpreted as inclusive and other times as exclusive. When it is considered inclusive then the distance between "apple" and "pear" is 5, bec

Re: PhraseQuery little bug?

2008-04-03 Thread Darren Govoni
One interpretation of the query with ~5 is that your text has 5 words and ~5 would imply a word in any position can match. Could it be this? - Original Message - From: "Ivan Vasilev" <[EMAIL PROTECTED]> To: "LUCENE MAIL LIST" Sent: Thursday, April 03, 2008 6:03 AM Subject: PhraseQuery

Re: PhraseQuery little bug?

2008-04-03 Thread Ivan Vasilev
Of course, in our system I can use SpanNearQuery instead of PhraseQuery. My question is: are there known performance differences between the two classes? Ivan Vasilev wrote: Hi Guys, I make the following test – I create 2 files. File1.txt with content: “apple 2 3 4 pear” And File2.txt with cont

Re: PhraseQuery with synonyms or having n tokens at the same tokenposition.

2006-03-27 Thread Daniel Naber
On Monday 27 March 2006 11:17, Ramana Jelda wrote: > I have indexed name: "sony dsc-d cybershot" as following tokens provided > token positions. > 1: [sony:0->4] > > 2: [dsc:5->10] > > 3: [dscd:5->10] > > 4: [d:5->10] > > 5: [cybershot:11->20] If the first number is the token position, the tokens

Re: PhraseQuery

2006-03-16 Thread Erik Hatcher
On Mar 16, 2006, at 5:10 AM, Waleed Tayea wrote: I'm using the QueryParser to parse and return a query of a search string of a single word. But the analyzer I use emits additional morphological tokens from that single word. How can I prevent the QueryParser from considering the search query as a P

Re: PhraseQuery and edit distance slightly confusing.

2006-03-15 Thread Dawid Weiss
Hi Doug, Yes, it should probably be called "edit-distance-like" or something. It should definitely say so in the JavaDoc because I've seen this propagate to people's articles (it was Erik Hatcher's I think, but I'm not sure). But what then would the criteria for matching at all be? Right

Re: PhraseQuery and edit distance slightly confusing.

2006-03-15 Thread Doug Cutting
Dawid Weiss wrote: I get the concept implemented in PhraseQuery but isn't calling it an edit distance a little bit far-fetched? Yes, it should probably be called "edit-distance-like" or something. Only the marginal elements (minimum and maximum distance from their respective query positions)