Right, I am now getting the expected behavior.
In the docs, I wish the New York example were continued for clarity
and consistency.
Best regards
On 1/24/20 12:55 PM, Atri Sharma wrote:
PhraseQuery enforces the order of terms specified and needs an exact
match of order of terms unless slop is
Thanks for the quick responses. I had a bug in my code: I was building
multiple PhraseQuery instances in a loop instead of one PhraseQuery,
so I was losing the order of terms.
Best regards
On 1/24/20 12:54 PM, Michael Froh wrote:
Did you check the Javadoc for PhraseQuery.Builder?
https
PhraseQuery enforces the order of terms specified and needs an exact
match of order of terms unless slop is specified.
When appending terms, term position numbers need to be incremental in the builder
On Fri, Jan 24, 2020 at 11:15 PM wrote:
>
> Hi,-
>
> how do i enforce the order of sequence of ter
Did you check the Javadoc for PhraseQuery.Builder?
https://lucene.apache.org/core/6_5_0/core/org/apache/lucene/search/PhraseQuery.Builder.html
Checking the source code, I see that the add method that takes a position
argument will throw an IllegalArgumentException if you try to add a Term in
a lo
Thanks Mike. I discovered that earlier.
Regards.
PhraseQuery will not work against StringField: that field indexes the
entire string as a single token.
Try running it against the TextField instead?
Mike McCandless
http://blog.mikemccandless.com
On Wed, Oct 5, 2016 at 6:52 PM, lukes wrote:
> Hi all,
>
> I am trying to do phrase query searc
Boosting query clauses means more "this clause is more important than
that clause" rather than "make the score for this search higher". I
use it for biblio searching when I want to search across multiple fields
and want matches in titles to be more important than matches in
blurbs. Amended version
Try:
http://lucene.apache.org/core/4_4_0/queryparser/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.html
-Original Message-
From: raghavendra.k@barclays.com [mailto:raghavendra.k@barclays.com]
Sent: Friday, August 02, 2013 3:17 PM
To: java-user@lucene.apach
est.
Or, you can file a Jira for a new Lucene Query for phrase and or span
queries that measures distance by offsets rather than positions.
-- Jack Krupansky
-Original Message-
From: wgggfiy
Sent: Monday, May 13, 2013 3:47 AM
To: java-user@lucene.apache.org
Subject: Re: [PhraseQuery
Jack, according to you, how can I implement this requirement? Could you give me
a clue? Thank you very much. The regex query did not seem to work. I set up the
field like this: FieldType fieldType = new FieldType();
FieldInfo.IndexOptions indexOptions =
FieldInfo.IndexOptions.DOCS
OK, thanks, but how can I implement this requirement? Jack gave me a clue, but I
failed: it returns no docs when I came up with a regex query like
"jakarta.{1,10}apache". Are there some limitations when using a regex query, such as
fields that are not indexed, and so on?
-
--
Email: wuqiu.m...
, 2013 8:37 AM
> To: java-user@lucene.apache.org
> Subject: Re: [PhraseQuery] Can "jakarta apache"~10 be searched by offset ?
>
> That's the question. When I get the doc by QueryParser("jakarta
> apache"~10), which means it hits the query syntax, but it depe
That's the question. When I get the doc by QueryParser("jakarta apache"~10),
which means it hits the query syntax, but it depends on the word position
and not on offset, and that is not my intent. There are some docs which
satisfy ("jakarta apache"~10) but do not satisfy the regex
"jakarta.{1,1
Do you mean the raw character offsets of the starting and ending characters
of the terms?
No.
Although, if you index the text as a raw string, you might be able to come
up with a regex query like "jakarta.{1,10}apache"
-- Jack Krupansky
-Original Message-
From: wgggfiy
Sent: Monda
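To make the position-versus-offset distinction in this thread concrete, here is a minimal sketch in plain Java. It assumes java.util.regex semantics rather than Lucene's own regexp query syntax, and the class and method names (OffsetVersusPosition, matchesByChars, wordsBetween) are made up for illustration. It only shows that a text can be close in word positions while being too far apart in characters for "jakarta.{1,10}apache".

```java
import java.util.regex.Pattern;

// Hypothetical illustration class, not part of Lucene.
final class OffsetVersusPosition {
    // Character-distance check: 1 to 10 arbitrary characters between the two words.
    private static final Pattern BY_CHARS = Pattern.compile("jakarta.{1,10}apache");

    // True if the raw string contains "jakarta", then 1-10 characters, then "apache".
    static boolean matchesByChars(String text) {
        return BY_CHARS.matcher(text).find();
    }

    // Number of whitespace-separated tokens between an occurrence of `first`
    // and the next later occurrence of `second`; -1 if no such pair exists.
    static int wordsBetween(String text, String first, String second) {
        String[] tokens = text.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (!tokens[i].equals(first)) {
                continue;
            }
            for (int j = i + 1; j < tokens.length; j++) {
                if (tokens[j].equals(second)) {
                    return j - i - 1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String near = "jakarta apache";
        String far = "jakarta internationalization apache";
        System.out.println(matchesByChars(near) + " " + wordsBetween(near, "jakarta", "apache"));
        System.out.println(matchesByChars(far) + " " + wordsBetween(far, "jakarta", "apache"));
    }
}
```

The second string has only one word between the two terms, so a position-based slop of 10 would easily cover it, yet the character gap is longer than 10, so the regex does not match.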
> Is there a fundamental difference between
>
> PhraseQuery query = new PhraseQuery();
> query.add(term1, 0);
> query.add(term2, 0);
>
> and
>
> MultiPhraseQuery query = new MultiPhraseQuery();
> query.add( new Term[] { term1, term2 } );
>
> The only difference I could think of is that MPQ som
unsubscribe
java-user@lucene.apache.org
Sent: Friday, March 19, 2010 7:14:06 PM
Subject: Re: PhraseQuery Performance Issues [Lucene 2.9.0]
Nutch/Solr's CommonGrams is the right way to solve this. It combines
frequent terms (eg stopwords) with adjacent terms. So "the wizard of
oz" will be indexed e
Nutch/Solr's CommonGrams is the right way to solve this. It combines
frequent terms (eg stopwords) with adjacent terms. So "the wizard of
oz" will be indexed eg as the_wizard wizard_of of_oz. It'll require a
full re-index though, and you have to fixup searching so that the same
term expansion wo
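As a rough illustration of the term combining described above, here is a small plain-Java sketch. It is not the Nutch/Solr CommonGrams implementation; the class name CommonGramsSketch, the method commonGrams, and the choice of "the" and "of" as the common-word set are assumptions for the example. It only emits the combined tokens and joins an adjacent pair when at least one of the two terms is in the common set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class, not the real CommonGrams filter.
final class CommonGramsSketch {
    // Joins each adjacent pair with "_" when either term is a common word.
    static List<String> commonGrams(List<String> tokens, Set<String> common) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            String left = tokens.get(i);
            String right = tokens.get(i + 1);
            if (common.contains(left) || common.contains(right)) {
                out.add(left + "_" + right);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> common = new HashSet<>(Arrays.asList("the", "of"));
        List<String> tokens = Arrays.asList("the", "wizard", "of", "oz");
        // Prints the combined tokens for "the wizard of oz".
        System.out.println(commonGrams(tokens, common));
    }
}
```

The same expansion has to be applied to the query terms, which is why the message notes that searching needs fixing up as well as a full re-index.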
Index is pretty large (50GB, divided into 8 shards). I'm afraid I would
start running into memory issues by adding the stop words (though it is
definitely something I would like to test at some point).
My question was more to try to understand if this was known behavior in
lucene, since I can't re
How big is your index? Because the simplest thing would be
to just not remove stopwords at index or query time. Perhaps
in a duplicate field depending upon your needs.
Erick
On Tue, Jan 19, 2010 at 6:50 AM, Avi Rosenschein wrote:
> Hi,
>
> I am using PhraseQuery with explicitly set term position
Yes, the fix in src/java/org/apache/lucene/search/Scorer.java solves
my problem, i.e. the queries return the correct number of results.
On Wed, Oct 14, 2009 at 12:29 PM, Michael McCandless
wrote:
> It sounds likely that this is
> https://issues.apache.org/jira/browse/LUCENE-1974
>
> Is it possib
It sounds likely that this is https://issues.apache.org/jira/browse/LUCENE-1974
Is it possible for you to test that patch and verify it resolves your problem?
Mike
On Tue, Oct 13, 2009 at 9:23 AM, Ion Barcan wrote:
> Hello,
>
> With the new Lucene 2.9.0 (on a newly built index of approx. 30
> m
: With the new Lucene 2.9.0 (on a newly built index of approx. 30
: million documents) running BooleanQueries containing PhraseQuery does
: not work properly. I've verified this on both optimized and
: unoptimized index versions.
I suspect that this is the same problem as identified in LUCENE-197
Hi Ian
Thanks for the suggestion. I was able to write the custom analyzer to return
non-letters as tokens, as well as to keep the numeric characters instead of
skipping them.
This is probably not the best solution, but at least I can have a demo
without bugs :-)
To save time for others who may ha
I suggest you write your own analyzer that doesn't remove non-letter
characters at index time. There might be one out there already, but
not that I can think of off hand.
Instead of leaving the non-letters in place you might consider doing
something with position increments. I think that would pr
On Friday 05 September 2008 16:57:34, Mark Miller wrote:
> Paul Elschot wrote:
> > On Thursday 04 September 2008 20:39:13, Mark Miller wrote:
> >> Sounds like it's more in line with what you are looking for. If I
> >> remember correctly, the phrase query factors in the edit distance
> >> in scorin
SpanScorer will use the similarity slop factor for each matching
span size to adjust the effective frequency.
Regards,
Paul Elschot
You have pointed this out to me before. One day I will remember.
Every time I look things over again I miss it, and I couldn't find that
email in the archive
Paul Elschot wrote:
On Thursday 04 September 2008 20:39:13, Mark Miller wrote:
Sounds like it's more in line with what you are looking for. If I
remember correctly, the phrase query factors in the edit distance in
scoring, but the NearSpanQuery will just use the combined idf for
each of the t
On Thursday 04 September 2008 20:39:13, Mark Miller wrote:
> Sounds like it's more in line with what you are looking for. If I
> remember correctly, the phrase query factors in the edit distance in
> scoring, but the NearSpanQuery will just use the combined idf for
> each of the terms in it, so dis
Sounds like it's more in line with what you are looking for. If I
remember correctly, the phrase query factors in the edit distance in
scoring, but the NearSpanQuery will just use the combined idf for each
of the terms in it, so distance shouldn't matter with spans (I'm sure
Paul will correct me
In Lucene syntax, ~5 in this case means slop=5; this is for Span queries.
I think the problem is that in the class PhraseQuery the slop that we
set is sometimes interpreted as inclusive and other times as exclusive. When
it is considered inclusive then the distance between "apple" and "pear"
is 5, bec
One interpretation of the query with ~5 is that your text has 5 words
and ~5 would imply a word in any position can match. Could it be this?
- Original Message -
From: "Ivan Vasilev" <[EMAIL PROTECTED]>
To: "LUCENE MAIL LIST"
Sent: Thursday, April 03, 2008 6:03 AM
Subject: PhraseQuery
Of course, in our system I can use SpanNearQuery instead of PhraseQuery.
My question is: are there known performance differences between the two
classes?
Ivan Vasilev wrote:
Hi Guys,
I make the following test – I create 2 files. File1.txt with content:
“apple 2 3 4 pear”
And File2.txt with cont
On Monday 27 March 2006 11:17, Ramana Jelda wrote:
> I have indexed name: "sony dsc-d cybershot" as following tokens provided
> token positions.
> 1: [sony:0->4]
>
> 2: [dsc:5->10]
>
> 3: [dscd:5->10]
>
> 4: [d:5->10]
>
> 5: [cybershot:11->20]
If the first number is the token position, the tokens
On Mar 16, 2006, at 5:10 AM, Waleed Tayea wrote:
I'm using the QueryParser to parse and return a query for a search string
of a single word. But the analyzer I use emits additional morphological
tokens from that single word. How can I prevent the QueryParser from
considering the search query as a P
Hi Doug,
Yes, it should probably be called "edit-distance-like" or something.
It should definitely say so in the JavaDoc because I've seen this
propagate to people's articles (it was Eric Hatcher's I think, but I'm
not sure).
But what then would the criteria for matching at all be? Right
Dawid Weiss wrote:
I get the concept implemented in PhraseQuery but isn't calling it an
edit distance a little bit far fetched?
Yes, it should probably be called "edit-distance-like" or something.
Only the marginal elements
(minimum and maximum distance from their respective query positions)
37 matches