...the word and then apply/expand the wildcard query. Is this a bug with
QueryParser, or will the Analyzer not stem the word if it finds a wildcard
operator in it?

Regards
Ganesh

- Original Message -
From: "Anshum"
To:
Sent: Monday, August 31, 2009 3:01 PM
Subject: Re: Wildcard query
Hi Ganesh,
It's the snowball analyzer, which uses the English stemmer, that is causing this
behavior. When you search for 'attention', the query gets parsed to 'attent',
whereas when you search for 'attenti' it stays as it is because the
stemmer is not able to fit it anywhere.
Could you tell me what
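A minimal sketch of the behavior Anshum describes, assuming a Lucene 2.4-era
setup with the contrib SnowballAnalyzer (the field name "contents" is made up,
and constructor signatures shift slightly between releases):

import org.apache.lucene.analysis.snowball.SnowballAnalyzer;
import org.apache.lucene.queryParser.QueryParser;

public class StemVsWildcardDemo {
    public static void main(String[] args) throws Exception {
        SnowballAnalyzer analyzer = new SnowballAnalyzer("English");
        QueryParser parser = new QueryParser("contents", analyzer);

        // Plain term: run through the analyzer, so the English stemmer reduces it.
        System.out.println(parser.parse("attention"));   // contents:attent

        // Wildcard term: QueryParser skips analysis, so the text is kept verbatim.
        System.out.println(parser.parse("attenti*"));    // contents:attenti*
    }
}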
BooleanQuery picks a Scorer based on the number of clauses and what their
options are ... all of the scorers it might pick from are smart enough to
continuously reorder the clauses, having them "skip ahead" to the next
document they match, beyond whatever docIds it already knows can't match
(ba
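The snippet above is cut off, but the "skip ahead" idea can be sketched as a
hand-rolled leapfrog intersection over DocIdSetIterator (older releases call
this skipTo). This is only an illustration of the concept, not BooleanQuery's
actual scorer code:

import java.io.IOException;
import org.apache.lucene.search.DocIdSetIterator;

public class LeapfrogSketch {
    // Walk two posting iterators in tandem; each advance() lets one side jump
    // straight past every docId the other side has already ruled out.
    static void printIntersection(DocIdSetIterator a, DocIdSetIterator b) throws IOException {
        int doc = a.nextDoc();
        while (doc != DocIdSetIterator.NO_MORE_DOCS) {
            int other = b.advance(doc);               // skip b to doc or beyond
            if (other == DocIdSetIterator.NO_MORE_DOCS) {
                return;                               // b is exhausted, no more matches
            }
            if (other == doc) {
                System.out.println("match: " + doc);  // both clauses hit this document
                doc = a.nextDoc();
            } else {
                doc = a.advance(other);               // skip a past the gap b exposed
            }
        }
    }
}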
> The query <<smith,ann>> is parsed to
> PhraseQuery("smith ann").
> And that seems right, from a user standpoint.
>
> In fact, considering this, I realize <<smith,ann*>> should be parsed
> to MultiPhraseQuery("smith", "ann*"), not <<+smith +ann*>> as I said earlier.
this issue: how to get QueryParser to generate
MultiPhraseQueries. Got some good ideas from it, but unfortunately no
complete solution. I'll keep on hacking.
--Renaud
-Original Message-
From: Mark Miller [mailto:[EMAIL PROTECTED]
Sent: Thursday, June 14, 2007 12:07 PM
To: java-user@lucene.apache.org
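Until QueryParser generates it, the MultiPhraseQuery Renaud is after can be
assembled by hand. A sketch against the Lucene 2.x API, with a made-up field
name "author"; real code would also want to handle a prefix that matches
nothing:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermEnum;
import org.apache.lucene.search.MultiPhraseQuery;

public class SmithAnnQueryBuilder {
    public static MultiPhraseQuery build(IndexReader reader) throws IOException {
        MultiPhraseQuery mpq = new MultiPhraseQuery();
        mpq.add(new Term("author", "smith"));   // first position: exact term

        // Second position: every indexed term matching the prefix ann*.
        List terms = new ArrayList();
        TermEnum te = reader.terms(new Term("author", "ann"));
        try {
            do {
                Term t = te.term();
                if (t == null || !"author".equals(t.field()) || !t.text().startsWith("ann")) {
                    break;                      // walked past the ann* range
                }
                terms.add(t);
            } while (te.next());
        } finally {
            te.close();
        }
        mpq.add((Term[]) terms.toArray(new Term[terms.size()]));
        return mpq;
    }
}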
uot;, "ann*"), not <<+smith +ann*>> as I said earlier.
B. Getting hairy. Any hope?
--Renaud
-Original Message-
From: Mark Miller [mailto:[EMAIL PROTECTED]
Sent: Thursday, June 14, 2007 6:43 AM
To: java-user@lucene.apache.org
Subject: Re: Wildcard query with untokenized punctuation (again)
Gotta agree with Erick here... best idea is just to preprocess the query
before sending it to the QueryParser.
My first thought is always to get out the sledgehammer...
- Mark
if you don't use the same tokenizer for indexing and searching, you will
have troubles like this.
Mixing exact match (with ") and wildcard (*) is a strange idea.
Typographical rules say that you have a space after a comma, no?
Is your field tokenized?
M.
Renaud Waldura wrote:
> My very simple
Well, perhaps the simplest thing would be to pre-process the query and
make the comma into a whitespace before sending anything to the
query parser. I don't know how generalizable that sort of solution is in
your problem space, though.
Best
Erick
On 6/13/07, Renaud Waldura <[EMAIL PROTECTED]>
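A minimal sketch of the pre-processing Erick and Mark suggest, using Lucene
2.x-era classes; the field name "name" and the analyzer choice are
illustrative only:

import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;

public class PreprocessQueryDemo {
    public static void main(String[] args) throws Exception {
        String userInput = "smith,ann*";
        // Turn the comma into whitespace so QueryParser sees two separate terms.
        String cleaned = userInput.replace(',', ' ');
        QueryParser parser = new QueryParser("name", new WhitespaceAnalyzer());
        Query q = parser.parse(cleaned);
        System.out.println(q);   // name:smith name:ann*
    }
}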
After taking a quick look, I don't see how you can do this without
modifying the QueryParser. In QueryParser.jj you will find the conflict
of interest at line 891. This line will cause a match on smith,ann* and
trigger a wildcard term match on the whole piece.
This is again caused by the fact
: You're entirely correct about the analyzer (I'm using one that breaks on
: non-alphanumeric characters, so all punctuation is ignored). To be
: honest, I hadn't thought about altering this, but I guess I could; just
: reticent that there might be unforeseen consequences.
this is where the PerFieldAnalyzerWrapper comes in
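Presumably the class meant here is PerFieldAnalyzerWrapper. A sketch of how it
might be wired up with the Lucene 2.x API, assuming a "filename" field that
should stay a single token while other fields keep a normal analyzer; the same
wrapper would be used at index and query time:

import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

public class FilenameAnalyzerSetup {
    public static PerFieldAnalyzerWrapper build() {
        // Default analyzer for every field not listed explicitly.
        PerFieldAnalyzerWrapper wrapper =
                new PerFieldAnalyzerWrapper(new StandardAnalyzer());
        // "filename" stays one token, so pagefile.sys is indexed intact.
        wrapper.addAnalyzer("filename", new KeywordAnalyzer());
        return wrapper;
    }
}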
The reasoning behind not analyzing wildcard queries is also explained in
the FAQ: "Are Wildcard, Prefix, and Fuzzy queries case sensitive?"
Regards,
Doron
>
> --Colin McGuigan
>
> -Original Message-
> From: Doron Cohen [mailto:[EMAIL PROTECTED]
> Sent: Saturday, March 10, 2007 2:08 AM
> To: java-user@lucene.apache.org
> Subject: Re: Wildcard query with untokenized punctuation
Hi Colin,
Is it possible that you are using an analyzer that breaks words on non
letters? For instance SimpleAnalyzer? if so, the doc text:
pagefile.sys
is indexed as two words:
pagefile sys
At search time, the query text:
pagefile.sys
is also parsed and tokenized into a two-word query:
prof
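Doron's point is easy to reproduce against the Lucene 2.x API (the field name
"filename" is illustrative): SimpleAnalyzer splits on non-letters, so the same
two tokens come out at index time and at query-parsing time:

import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.queryParser.QueryParser;

public class PagefileTokenizationDemo {
    public static void main(String[] args) throws Exception {
        QueryParser parser = new QueryParser("filename", new SimpleAnalyzer());
        // The single query token "pagefile.sys" is analyzed into two tokens,
        // which QueryParser turns into a phrase query.
        System.out.println(parser.parse("pagefile.sys"));   // filename:"pagefile sys"
    }
}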
-Original Message-
From: Steffen Heinrich [mailto:[EMAIL PROTECTED]
Sent: Fri 3/9/2007 4:31 PM
To: java-user@lucene.apache.org
Subject: Re: Wildcard query with untokenized punctuation
On 9 Mar 2007 at 15:10, McGuigan, Colin wrote:
> I have a "filename" field in Lucene that holds a value, like this:
> pagefile.sys
>
Hi Colin,
I'm still _very_ new to lucene, but isn't that what the un-tokenized
indexing is for?
Like in 1.9.1
doc.add(Field.Keyword("filename", "pagefile.sys"));
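Field.Keyword is the Lucene 1.x shortcut; here is a sketch of the same idea
with the 2.x Field API, assuming the filename should be indexed as one
untokenized term so a wildcard query such as filename:pagefile.* can match the
whole value:

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class UntokenizedFilenameField {
    public static Document makeDoc() {
        Document doc = new Document();
        // Stored and indexed as a single term, punctuation intact.
        doc.add(new Field("filename", "pagefile.sys",
                Field.Store.YES, Field.Index.UN_TOKENIZED));
        return doc;
    }
}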