chmark is using a lot of memory (~40-50%) and according to the
javadoc the benchmark script I run is single threaded and the cpu usage
reflect that (~100%). Are there some other parameters I should check?
Thank you very much.
On 21 January 2016 at 21:14, Michael McCandless
wrote:
> Shingles sh
Shingles should make a huge difference on phrase query performance if
1) the phrase queries involve high frequency terms and 2) you have a
substantial number of documents in the index (so that
time-to-visit-postings dominates over time-to-lookup-terms).
118 rec/sec is already very fast for a long
Be sure to check and see if your app is compute or I/O bound during this
process - whether too little of your index is cached in system memory and
each query requires I/O, lots of it.
-- Jack Krupansky
On Thu, Jan 21, 2016 at 1:52 PM, Doug Turnbull <
dturnb...@opensourceconnections.com> wrote:
>
In my experience, shingles can hurt query performance because the term
dictionary grows quite a bit. There are far more unique bigrams than there
are words. While the lookup time doesn't grow linearly with the number of
terms, it still grows.
I haven't specifically compared performance numbers shing
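To make the term-dictionary point above concrete, here is a minimal plain-Java sketch. It does not use Lucene; the class name ShingleSketch, the helper bigrams and the sample sentence are made up for illustration. It joins adjacent tokens into two-word shingles and compares the number of unique words with the number of unique bigrams.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of Lucene.
final class ShingleSketch {
    /** Joins each pair of adjacent tokens into one two-word shingle. */
    static List<String> bigrams(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            out.add(tokens.get(i) + " " + tokens.get(i + 1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample text, already tokenized.
        List<String> tokens = Arrays.asList(
                "the", "cat", "and", "the", "dog", "and", "the", "bird");
        Set<String> uniqueWords = new HashSet<>(tokens);
        Set<String> uniqueBigrams = new HashSet<>(bigrams(tokens));
        System.out.println(uniqueWords.size() + " unique words, "
                + uniqueBigrams.size() + " unique bigrams");
    }
}
```

Even on this tiny sample the bigram set is already larger than the word set. On a real corpus the gap is usually much wider, which is the cost to weigh against the cheaper phrase matching that shingles allow.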
Hello,
I'm trying to improve the speed of an index when searching for long phrases. I
performed some tests with the benchmark module. With a simple analyser and
PhraseQueries I get a throughput of 118 rec/sec. My test dataset is the
latest dump of Wikipedia. Here are the filters I use at indexing time
com]
> > Sent: Wednesday, November 25, 2015 9:13 AM
> > To: java-user@lucene.apache.org
> > Subject: Using phrase query in Termfilters
> >
> > Hi All,
> >
> >Am using lucene 4.10.4. Is it right to add analyzed multi valued
> fields
> > & phrase quer
nal Message-
> From: Kumaran Ramasubramanian [mailto:kums@gmail.com]
> Sent: Wednesday, November 25, 2015 9:13 AM
> To: java-user@lucene.apache.org
> Subject: Using phrase query in Termfilters
>
> Hi All,
>
>Am using lucene 4.10.4. Is it right to add a
Hi All,
I am using Lucene 4.10.4. Is it right to add analyzed multi-valued fields
and a phrase query for the same field in a boolean filter? I believe we cannot
apply analyzers to values in filters, so I am not getting results for those
filters.
String phraseTerm = "hello worl
Hi,
I am facing an issue with phrase queries and position increments. I have tried
the following queries and, although there was matching data, 0 results were returned.
2) Search Query --> name:"at&t inc" Parsed Query --> +name:"at&t inc"
Result return
Hi,
I am facing an issue with phrase queries containing special characters (such as
&, dot, comma, : etc.). I have tried the following queries and, although there
was matching data, 0 results were returned.
1) Search Query --> name:"Pep:Trans vaccines, GSK" Parsed Query -->
+name:"pep:trans v
Hi,
Maybe LUCENE-5317 is relevant?
Ahmet
On Thursday, April 23, 2015 8:33 PM, Shashidhar Rao
wrote:
Hi,
I have a large text and from that I need to calculated the top frequencies
of words ,
say 'Driving' occurs the most.
Now , I need to find phrase containing 'Driving' in the given text and th
Hi,
I have a large text, and from it I need to calculate the top word
frequencies; say 'Driving' occurs the most.
Now I need to find phrases containing 'Driving' in the given text and the
frequency count of each phrase. The phrase could be three words where
driving could be in the middle
Hi,
I just noticed that a search like "rooms to go" is failing to highlight. (I
use FastVectorHighlighter). I know it's caused by the stop word (to). Is there
a recommended way to fix this? I may just re-index without stop words, and
see if that causes any problems.
thanks,
Rob
`getPositionIncrementGap`
Sent from my BlackBerry® smartphone
-Original Message-
From: Rob Nikander
Date: Thu, 28 Aug 2014 10:26:00
To:
Reply-To: java-user@lucene.apache.org
Subject: Re: How to not span fields with phrase query?
Thank you for the explanation. I subclassed Analyzer
Thank you for the explanation. I subclassed Analyzer and overrode
`getPositionIncrementGap` for this field. It appears to have worked.
Rob
On Thu, Aug 28, 2014 at 10:21 AM, Michael Sokolov <
msoko...@safaribooksonline.com> wrote:
> Usually that's referred to as multiple "values" for the same f
Usually that's referred to as multiple "values" for the same field; in
the index there is no distinction between title:C and title:X as far as
which field they are in -- they're in the same field.
If you want to prevent phrase queries from matching B C X, insert a
position gap between C and X;
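The effect of a position gap can be sketched without Lucene at all. The plain-Java model below is an illustration under assumptions: the names PositionGapSketch, Posting, index and matchesExactPhrase are invented here, and the gap parameter only stands in for whatever value the analyzer's `getPositionIncrementGap` returns. It numbers the tokens of each field value in sequence, leaves `gap` unused positions between values, and treats an exact phrase as terms at consecutive positions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of term positions; not Lucene code.
final class PositionGapSketch {
    /** One indexed token: the term text and its position in the field. */
    static final class Posting {
        final String term;
        final int position;

        Posting(String term, int position) {
            this.term = term;
            this.position = position;
        }
    }

    /** Numbers tokens across all values of a field, skipping `gap` positions between values. */
    static List<Posting> index(List<String> values, int gap) {
        List<Posting> postings = new ArrayList<>();
        int next = 0;
        for (String value : values) {
            for (String token : value.trim().split("\\s+")) {
                postings.add(new Posting(token, next++));
            }
            next += gap; // stand-in for the analyzer's position increment gap
        }
        return postings;
    }

    static boolean hasTermAt(List<Posting> postings, String term, int position) {
        for (Posting p : postings) {
            if (p.position == position && p.term.equals(term)) {
                return true;
            }
        }
        return false;
    }

    /** Exact (slop 0) phrase match: the terms must sit at consecutive positions. */
    static boolean matchesExactPhrase(List<Posting> postings, String phrase) {
        String[] terms = phrase.trim().split("\\s+");
        for (Posting start : postings) {
            boolean all = start.term.equals(terms[0]);
            for (int i = 1; all && i < terms.length; i++) {
                all = hasTermAt(postings, terms[i], start.position + i);
            }
            if (all) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("A B C", "X Y Z");
        System.out.println(matchesExactPhrase(index(titles, 0), "B C X"));   // true: values run together
        System.out.println(matchesExactPhrase(index(titles, 100), "B C X")); // false: C and X are far apart
        System.out.println(matchesExactPhrase(index(titles, 100), "A B C")); // true: still matches inside one value
    }
}
```

The value 100 is arbitrary. If sloppy phrase queries are also used, the gap would presumably need to be larger than the largest slop, otherwise a sloppy phrase could still bridge two values.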
Hi,
If I have document with multiple fields "title"
title: A B C
title: X Y Z
A phrase search for title:"B C X" matches this document. Can I prevent
that?
thanks,
Rob
rect_hits.3F
> .
>
>
> --
> Ian.
>
>
> On Tue, Mar 12, 2013 at 12:45 AM, Arlei Ferreira Farnetani Junior
> wrote:
> > Hello, could someone give me an example of how to conduct a search in an
> > already built index with Lucene 4 mode phrase query using a specific
--
Ian.
On Tue, Mar 12, 2013 at 12:45 AM, Arlei Ferreira Farnetani Junior
wrote:
> Hello, could someone give me an example of how to conduct a search in an
> already built index with Lucene 4 mode phrase query using a specific
> analyzer. I tested here with the phrase the search qu
> I'm trying to create a phrase query with wildcard, from the
> forums it seems that the solution is not trivial.
> I'm trying to create the following queries: "this is a
> phrase*" OR "*This is a phrase" and
> Get hits on every possibility where the
, 2012 4:51 AM
To: java-user@lucene.apache.org
Subject: RE: using phrase query with wildcard
It can be both.
-Original Message-
From: Doron Yaacoby [mailto:dor...@gingersoftware.com]
Sent: Sunday, 22 July 2012 11:48
To: java-user@lucene.apache.org
Subject: RE: using phrase query with wildcard
Is
It can be both.
-Original Message-
From: Doron Yaacoby [mailto:dor...@gingersoftware.com]
Sent: Sunday, 22 July 2012 11:48
To: java-user@lucene.apache.org
Subject: RE: using phrase query with wildcard
Is * a placeholder for a term or a part of a term?
-Original Message-
From
Is * a placeholder for a term or a part of a term?
-Original Message-
From: Levin, Ilya [mailto:ilya.le...@hp.com]
Sent: 22 July 2012 11:29
To: java-user@lucene.apache.org
Subject: using phrase query with wildcard
Hi,
I'm trying to create a phrase query with wildcard, from the f
Hi,
I'm trying to create a phrase query with wildcard, from the forums it seems
that the solution is not trivial.
I'm trying to create the following queries: "this is a phrase*" OR "*This is
a phrase" and
Get hits on every possibility where the * resides.
What i
How to use it ? Example please ?
Regards
- Shrinath. M
--
View this message in context:
http://lucene.472066.n3.nabble.com/phrase-query-highlighter-spans-matching-tp828257p2653941.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
nts in the index. I know that
"and" is a stop word, but I'm curious why it's translated into ? instead
of a * during this parsing (or just left alone because it's a phrase
query)...
Can I escape boolean keywords somehow?
Here
bruary 10, 2011 10:41 PM
To: java-user-h...@lucene.apache.org; java-user@lucene.apache.org
Subject: lucene 3.0.3 | phrase query problem
Hi Anshum,
Thanks for your reply.
Yes, I agree with you.
Right now I am using StandardAnalyzer; it removes stop words, puts text in
lowercase and does not crea
Hi Anshum,
Thanks for your reply.
Yes, I agree with you.
Right now I am using StandardAnalyzer; it removes stop words, puts text in
lowercase and does not index the most common words in English.
Searching an index created by StandardAnalyzer gives the result as
discussed
10, 2011 at 7:59 PM, Ranjit Kumar wrote:
> searchString = "i am using sql. server setting is easy task.";
>
>
>
> while i am searching for phrase query "Sql Server" in above string it gives
> result which is not correct. As In the above string sql and server
searchString = "i am using sql. server setting is easy task.";
When I search for the phrase query "Sql Server" in the above string, it gives a
result, which is not correct, because in the above string sql and server are
separated by a dot (.)
using both PhraseQuery and SpanQuery giv
"sql. server" we should not get result?
Best regards, Lisheng
-Original Message-
From: Ranjit Kumar [mailto:ranjit.ku...@otssolutions.com]
Sent: Wednesday, February 09, 2011 9:39 PM
To: java-user-h...@lucene.apache.org; java-user@lucene.apache.org
Subject: lucene 3.0.3 | phrase que
Hi,
I am using SpanQuery and SpanNearQuery to get phrase query like "Sql Server".
In my text file in which I am searching, it is present like (sql. server) mean
'sql dot server' which is not a span like "Sql Server".
While searching for phrase query "Sq
query
> yourself. The latter is quite straightforward:
>
> BooleanQuery bq = new BooleanQuery();
> PhraseQuery pq1 = ...;
> PhraseQuery pq2 = ...;
> bq.add(pq1, ...);
> bq.add(pq2, ...);
> etc.
>
>
> --
> Ian.
>
>
> On Thu, Jan 20, 2011 at 3:13 AM, amg
bq.add(pq1, ...);
bq.add(pq2, ...);
etc.
--
Ian.
On Thu, Jan 20, 2011 at 3:13 AM, amg qas wrote:
> Hi,
>
> I have two question regarding phrase query :
>
> 1) How can I execute a phrase query over multiple fields ? I can only
> get PhraseQuery to work over a single field -
Hi,
I have two questions regarding phrase queries:
1) How can I execute a phrase query over multiple fields ? I can only
get PhraseQuery to work over a single field - For eg something like
this :
PhraseQuery query = new PhraseQuery();
query.setSlop
(10/05/19 13:58), Li Li wrote:
hi all,
I read lucene in action 2nd Ed. It says SimpleSpanFragmenter will
"make fragments that always include the spans matching each document".
And also a SpanScorer existed for this use. But I can't find any class
named SpanScorer in lucene 3.0.1. And the res
hi all,
I read lucene in action 2nd Ed. It says SimpleSpanFragmenter will
"make fragments that always include the spans matching each document".
And also a SpanScorer existed for this use. But I can't find any class
named SpanScorer in lucene 3.0.1. And the result of HighlighterTest
class in c
Hmmm, you're beyond what I've tried to do, so all I can do is speculate. But
I don't
believe that two terms on top of each other are considered when calculating
slop. But I really don't know for sure, so I'd create a couple of unit tests
to verify.
You're right, the combinatorial explosion with pu
Thanks again for this.
I would like to be able to do several things with this data if possible.
As per Mark's post, I'd like to be able to query for phrases like "He _v"~1
(where _v is my verb part-of-speech token) to recover strings like: "He later
apologized".
This already in fact seems to be worki
Ahhh, I should have followed the link. I was interpreting your first note as
emitting two tokens NOT at the same offset. My mistake, ignore my nonsense
about unexpected consequences. Your original assumption is correct, zero
offsets are pretty transparent.
What do you really want to do here? Mark'
Thanks, Erick -
Indeed every word will have a part-of-speech token, but is this how the slop
actually works? My understanding was that if I have two tokens in the same
location then each will not affect searches involving the other in terms of the
slop as slop indicates the number of words *between* s
If I'm reading this right, your tokenizer creates two tokens. One
"report" and one "_n"... I suspect if so that this will create some
"interesting"
behaviors. For instance, if you put two tokens in place, are you going
to double the slop when you don't care about part of speech? Is every
word going
Hello,
I have indexed words in my documents with part of speech tags at the same
location as these words using a custom Tokenizer as described, very
helpfully, here:
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200607.mbox/%3c20060712115026.38897.qm...@web26002.mail.ukl.yahoo.com%3e
, but I don't know how to
search for basically two tokens at the same term position using the
queryparser syntax.
I don't think this is available from the QueryParser. You could make a
subclass that does this for the phrase query syntax. So if you have
something like "term1 term
Is there a syntax to set the term position in a query built with
queryparser? For example, I would like something like:
PhraseQuery q = new PhraseQuery();
q.add(t1, 0);
q.add(t2, 0);
q.setSlop(0);
As I understand it, the slop defaults to 0, but I don't know how to
search for basically two tokens
On Fri, Nov 14, 2008 at 12:05 PM, Teruhiko Kurosaka <[EMAIL PROTECTED]> wrote:
> My problem with Phrase Query is that it requires
> existence of all the terms in documents. I want them more
> permissible. I want it to match with lower score.
> Does dismax also requires a
Yonik,
Thank you for your reply.
My problem with Phrase Query is that it requires
existence of all the terms in documents. I want them more
permissible. I want it to match with lower score.
Does dismax also requires all the terms?
> Solr's dismax parser can generate queries that do
Solr's dismax parser can generate queries that do most of this... it's
a combination of term queries and sloppy phrase queries.
Simplest example:
+(DEF GHI) "DEF GHI"~10^5
The only thing that it doesn't work for is the terms out of order
(they will still be matched). You could use span queries i
PhraseQuery requires all the terms in the phrase
exists in the field being searched. I am looking
for a more permissible version of PhraseQuery which
is sensitive to the order of the terms but
allows missing terms, which would lower the score
but still matches.
For example, query "DEF GHI" would
g
forward to your response.)
--- On Sun, 11/2/08, Erick Erickson <[EMAIL PROTECTED]> wrote:
> From: Erick Erickson <[EMAIL PROTECTED]>
> Subject: Re: Exact Phrase Query
> To: java-user@lucene.apache.org, [EMAIL PROTECTED]
> Date: Sunday, November 2, 2008, 12:11 PM
&
: score and words
> and that no tokenization is needed, what would be the most efficient way for
> implementing this index using Lucene ?
>
>
> --- On Sun, 11/2/08, semelak ss <[EMAIL PROTECTED]> wrote:
>
> > From: semelak ss <[EMAIL PROTECTED]>
> > Subject
ting this index using Lucene ?
--- On Sun, 11/2/08, semelak ss <[EMAIL PROTECTED]> wrote:
> From: semelak ss <[EMAIL PROTECTED]>
> Subject: Re: Exact Phrase Query
> To: java-user@lucene.apache.org
> Date: Sunday, November 2, 2008, 7:26 AM
> I was in a hurry when copy
t way of
handling things.
I would appreciate any input in this regard on how to improve the efficiency.
--- On Sat, 11/1/08, Erick Erickson <[EMAIL PROTECTED]> wrote:
> From: Erick Erickson <[EMAIL PROTECTED]>
> Subject: Re: Exact Phrase Query
> To: java-user@lucene.ap
th the
> indexWriter, but I can not pinpoint the exact cause of the problem.
>
>
> --- On Sat, 11/1/08, semelak ss <[EMAIL PROTECTED]> wrote:
>
> > From: semelak ss <[EMAIL PROTECTED]>
> > Subject: Re: Exact Phrase Query
> > To: java-user@lucene.apache.org
thing to do with the indexWriter, but I
can not pinpoint the exact cause of the problem.
--- On Sat, 11/1/08, semelak ss <[EMAIL PROTECTED]> wrote:
> From: semelak ss <[EMAIL PROTECTED]>
> Subject: Re: Exact Phrase Query
> To: java-user@lucene.apache.org
> Date: Saturda
s (their size is 0)
--- On Fri, 10/31/08, semelak ss <[EMAIL PROTECTED]> wrote:
> From: semelak ss <[EMAIL PROTECTED]>
> Subject: Re: Exact Phrase Query
> To: java-user@lucene.apache.org
> Date: Friday, October 31, 2008, 9:41 AM
> For indexing, I use
08, Erick Erickson <[EMAIL PROTECTED]> wrote:
> From: Erick Erickson <[EMAIL PROTECTED]>
> Subject: Re: Exact Phrase Query
> To: java-user@lucene.apache.org, [EMAIL PROTECTED]
> Date: Friday, October 31, 2008, 5:57 AM
> You need to give us more information for meaningful repli
You need to give us more information for meaningful replies, like
the analyzers you use when indexing and searching, the exact
query you use, perhaps the snippets of the code, etc.
That said, things to check:
Get a copy of Luke and examine your index. You can even
run queries through that tool and
semelak ss <[EMAIL PROTECTED]> wrote:
> From: semelak ss <[EMAIL PROTECTED]>
> Subject: Exact Phrase Query
> To: java-user@lucene.apache.org
> Date: Friday, October 31, 2008, 5:44 AM
> I have documents containing multiple words in the the field
> "word"
> f
I have documents containing multiple words in the field "word".
For example, one of the documents contains the following in the field "word":
homeowners work
When searching for single words (e.g. homeowners) I get hits.
However, searching for the exact phrase "homeowners work" gives me no hits
" - so I want a tokenizer
take those two, and make two tokens as keyword analyzer would do.
This way one can search bidirectionally edges of a graph with one
phrase query of slop 2.
Is it possible to do such a manipulation?
Best.
On Tue, Sep 16, 2008 at 7:15 PM, Otis Gospodnetic
<[EMAIL PROTE
Are the terms stopwords?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Cam Bazz <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Tuesday, September 16, 2008 1:33:48 AM
> Subject: Phrase Query
>
>
Is it possible to write a document with different analyzers in different fields?
PerFieldAnalyzerWrapper
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Hello,
Lets say I have two documents, both containing field F.
document 0 has the string "a b" as F
document 1 has the string "b a" as F
I am trying to make a phrasequery like:
PhraseQuery pq = new PhraseQuery();
pq.add(new Term("F", "a"));
pq.add(new Term("F", "b"));
I noticed this was because I was using a KeywordAnalyzer.
Is it possible to write a document with different analyzers in different fields?
Best.
On Tue, Sep 16, 2008 at 8:33 AM, Cam Bazz <[EMAIL PROTECTED]> wrote:
> Hello,
>
> Lets say I have two documents, both containing field F.
>
> document
: No, as far as I know you can't combine wildcards in phrases. This would
The QueryParser doesn't support it, and there is no native query type
for it, but if you are willing to do the query expansion yourself, you can
build a MultiPhraseQuery (where you generate the terms using a
WildcardTerm
201
> signature: LA A 202
> signature: LA B 200
> signature: LC B 300
> Now i use getFields and search them.
> Let's assume i'm searching for a signature like "LA B 200". If i use a
> phrase query, no problem. I search all the fields and the only if the
> field
assume i'm searching for a signature like "LA B 200". If i use a
phrase query, no problem. I search all the fields, and only if the field
value and query exactly match do I get a hit.
But what if you want to use wildcards and search for something like LA A
20*. Now all the LA signatures w
has several signature numbers i want to save them in a field
>> signature and when i search for such a number i want the search hit every
>> single field and not all fields together.
>> Right now i separate the string using an unique separator (in this case
>> just
>> $
> Right now i separate the string using an unique separator (in this case
> just
> $$$) so i can split the string into the numbers but i think this is kinda
> the worst form doing it.
>
>
>
>
> JensBurkhardt wrote:
> >
> > hey everybody,
> >
> >
:
>
> hey everybody,
>
> I'm wondering if it's possible to combine wildcards and phrase query.
>
> For example "term1 term*"
>
> I know that the documentation says "Lucene supports single and multiple
> character wildcard searches within single
hey everybody,
I'm wondering if it's possible to combine wildcards and phrase query.
For example "term1 term*"
I know that the documentation says "Lucene supports single and multiple
character wildcard searches within single terms (not within phrase queries)"
ing is to make sure the exact query requirement,
> then picking up analyzer.
>
> Best regards, Lisheng
>
> --
> View this message in context:
> http://www.nabble.com/Phrase-Query-Problem-tp14373945p14404143.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
>
>
hing is to make sure the exact query requirement,
then picking up analyzer.
Best regards, Lisheng
--
View this message in context:
http://www.nabble.com/Phrase-Query-Problem-tp14373945p14404143.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---
ish Vadala [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 18, 2007 10:26 AM
To: java-user@lucene.apache.org
Subject: RE: Phrase Query Problem
ok, thnx... I will implement using the WhiteSpaceAnalyzer... Let me check
the
indexing speed... I mean time taken to index my data set... If that takes
t
-Original Message-
> From: mark harwood [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, December 18, 2007 9:42 AM
> To: java-user@lucene.apache.org
> Subject: Re: Phrase Query Problem
>
>
> You could write a custom analyzer that drops stopwords but adds an extra 1
> to the &
-
From: mark harwood [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 18, 2007 9:42 AM
To: java-user@lucene.apache.org
Subject: Re: Phrase Query Problem
You could write a custom analyzer that drops stopwords but adds an extra 1
to the "positionIncrement" property for the next v
ecause the remaining words are not
recorded as being directly next to each other)
Cheers
Mark
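Mark's suggestion can be modelled in a few lines of plain Java. This is a sketch under assumptions, not Lucene code: StopwordGapSketch, analyze and matchesExactPhrase are hypothetical names, the stop list is only the three words mentioned in this thread, and a null entry stands in for the extra position increment left behind by a removed stop word.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical model of stop-word removal with and without position holes.
final class StopwordGapSketch {
    // Illustrative stop list; real analyzers ship a longer one.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("and", "or", "in"));

    /**
     * Lowercases and splits on whitespace. A removed stop word either vanishes
     * (keepHoles == false) or leaves a null placeholder that keeps the next
     * token one position further away (keepHoles == true).
     */
    static List<String> analyze(String text, boolean keepHoles) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!STOP_WORDS.contains(raw)) {
                tokens.add(raw);
            } else if (keepHoles) {
                tokens.add(null);
            }
        }
        return tokens;
    }

    /** Exact phrase match: the query tokens must appear as one contiguous run. */
    static boolean matchesExactPhrase(List<String> doc, List<String> query) {
        return Collections.indexOfSubList(doc, query) >= 0;
    }

    public static void main(String[] args) {
        String doc = "Health and Safety";
        // Stop word dropped without a hole: "health" and "safety" become adjacent.
        System.out.println(matchesExactPhrase(analyze(doc, false), analyze("Health Safety", false))); // true
        // Stop word leaves a hole: the two words are no longer adjacent.
        System.out.println(matchesExactPhrase(analyze(doc, true), analyze("Health Safety", true)));   // false
        // The original phrase still matches, because the query gets the same hole.
        System.out.println(matchesExactPhrase(analyze(doc, true), analyze("Health and Safety", true))); // true
    }
}
```

In this model a hole matches any removed stop word, so "Health or Safety" would also match the document when holes are kept. That seems an acceptable trade-off for the problem described in this thread, but it is worth checking against the real analyzer.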
- Original Message
From: Sirish Vadala <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tuesday, 18 December, 2007 5:10:19 PM
Subject: RE: Phrase Query Problem
Yes... If my
ers out "and" (also "or", "in" and others) as stop
> words during indexing, and the QueryParser filters those
> words out also.
>
> Best regards, Lisheng
>
> -Original Message-
> From: Sirish Vadala [mailto:[EMAIL PROTECTED]
> Sent: Monday,
Hi Sirish,
A few hours ago I sent a reply to your message, if my
understanding is correct, you indexed a doc with text
as
Health and Safety
and you used phrase
Health Safety
to create a phrase query. If that is the case, this is
normal since you used StandardAnalyzer to tokenize the
input
eryParser filters those
words out also.
Best regards, Lisheng
-Original Message-
From: Sirish Vadala [mailto:[EMAIL PROTECTED]
Sent: Monday, December 17, 2007 9:49 AM
To: java-user@lucene.apache.org
Subject: Phrase Query Problem
I have the following code for search:
BooleanQuery b
archer.search(bQuery, sort);
Now My problem here is: If I do a search on a phrase with text Health
Safety, it is fetching me all the records where in the text is Health
and/or/in Safety. It is fetching me these records even after setting the
slop of the phrase query to zero for exact match. I am u
not that I know of
Erick
On 7/30/07, Max Metral <[EMAIL PROTECTED]> wrote:
>
> I have a set of tags associated with content in my corpus. I also have
> normal text. Our system tries to figure out which "words" are tags and
> which are text, and falls back on text when tags fail. I'm wonder
I have a set of tags associated with content in my corpus. I also have
normal text. Our system tries to figure out which "words" are tags and
which are text, and falls back on text when tags fail. I'm wondering,
is there anything in Lucene which might help disambiguate multi-word
tags from text?
Has anyone used LUCENE-794? How stable is it? Is it widely used in
industry?
I have used it extensively and I would say it is extremely stable. As I
said, much of the code from it is literally the same compiled code from
Contrib Highlighter (It is really just a new Scorer class for the
to
my manager. :)
--Renaud
-Original Message-
From: Mark Miller [mailto:[EMAIL PROTECTED]
Sent: Monday, July 02, 2007 2:11 PM
To: java-user@lucene.apache.org
Subject: Re: highlighting phrase query
There has been a lot of Highlighter discussion lately, but just to try and
sum up the st
Mark:
Thanks a million for this comprehensive analysis. This is going straight to
my manager. :)
--Renaud
-Original Message-
From: Mark Miller [mailto:[EMAIL PROTECTED]
Sent: Monday, July 02, 2007 2:11 PM
To: java-user@lucene.apache.org
Subject: Re: highlighting phrase query
There
There has been a lot of Highlighter discussion lately, but just to try
and sum up the state of Highlighting in the Lucene world:
There are four Highlighter implementations that I know of. From what I
can tell, only the original Contrib Highlighter has received sustained
active development by m
Hi All,
I am developing a search tool using Lucene 2.1.
I have a requirement to highlight query words in the results.
Lucene-highlighter 2.1 doesn't work well when highlighting a phrase query.
For example, if I have a query string "lucene Java", it highlights
not only occurrence
is a stop word, so matching "find an answer" is as expected, but
> > there
> > is no stop word between "you" and "find" in the original input string. I
> > do
> > not see why "you find an answer" matches.
> >
> > What am
As I understand it, there really is no "space indicator". I think of it
as replacing the stop word with a space, which is then discarded.
so, you're indexing 'you find answer', and both your searches are
looking for 'you find answer', the stop words are just gone as though
they never were. So bo
I found some discussions of this question from back in 2003, but that was
many updates ago.
I have built an index using the standard stop analyser which uses the
standard list of stop words. "will" and "the" are stop words.
As I understand analyzers and phrase queries, when I search for
you wi
Sorry about that. I think I found the diagram you're talking about on page
89.
It even addresses the exact problem I'm talking about.
It's not the first time I've looked like a fool, you'd think I'd be getting
used to it by now.
So, it seems like the most reasonable solution to this issue woul
: I think that's "working as designed". Although I could understand
: someone wanting it to work differently. The slop is sort of like the
: edit distance from the current given phrase, hence the order of terms
: in the phrase matters.
correct ... LIA has a great diagram explaining this ... th
On 3/8/07, Erick Erickson <[EMAIL PROTECTED]> wrote:
In a nutshell, reversing the order of the terms in a phrase query can
result in different hit counts. That is, "person place"~3 may return
different results from "place person"~3, depending on the number
of interveni
In a nutshell, reversing the order of the terms in a phrase query can
result in different hit counts. That is, "person place"~3 may return
different results from "place person"~3, depending on the number
of intervening terms.
There's a self-contained program below
Assuming it is, I subclassed the QueryParser to handle phrase prefixes,
replacing getFieldQuery by a version that calls the super, and if the
result is a phrase query and the text contains an asterisk tries to
create a MultiPhraseQuery. This seems to work ok, returning a
MultiPhraseQuery "micr
Hi Otis
Thanks for the information. I'm actually writing something to search files
containing code (such as JSP files) so I do expect there will be a few
problems like this because I guess Lucene's out-of-the box analyzers are
really suited to natural languages. But, I was wondering if you could
!Character.isWhitespace(c);
}
Otis
- Original Message
From: Richard Gunderson <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Monday, March 27, 2006 10:56:18 AM
Subject: Phrase Query query
Hi
I'm using PhraseQuery in conjunction with WhiteSpaceAnalyzer but it's
giving me