> The query <> is parsed to
> PhraseQuery("smith ann").
> And that seems right, from a user standpoint.
>
> In fact, considering this, I realize <> should be parsed
> to MultiPhraseQuery("smith", "ann*"), not <<+smith +ann*>> as I said earlier.
this issue: how to get QueryParser to generate
MultiPhraseQueries. Got some good ideas from it, but unfortunately no
complete solution. I'll keep on hacking.
--Renaud
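For reference, a minimal sketch of what building that MultiPhraseQuery by hand could look like against the 2.x-era API. The field name "name" and the already-open IndexReader are assumptions for illustration, not something stated in the thread:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermEnum;
import org.apache.lucene.search.MultiPhraseQuery;

public class NamePhraseBuilder {
    // Builds: exact "smith" in position 1, any indexed term starting with "ann" in position 2
    public static MultiPhraseQuery build(IndexReader reader) throws IOException {
        MultiPhraseQuery query = new MultiPhraseQuery();
        query.add(new Term("name", "smith"));

        List expansions = new ArrayList();                       // all terms matching ann*
        TermEnum terms = reader.terms(new Term("name", "ann"));  // enum starts at "ann"
        try {
            do {
                Term t = terms.term();
                if (t == null || !"name".equals(t.field()) || !t.text().startsWith("ann")) {
                    break;
                }
                expansions.add(t);
            } while (terms.next());
        } finally {
            terms.close();
        }
        query.add((Term[]) expansions.toArray(new Term[expansions.size()]));
        return query;
    }
}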
-Original Message-
From: Mark Miller [mailto:[EMAIL PROTECTED]
Sent: Thursday, June 14, 2007 12:07 PM
To: java-user@lucene.apache.org
to MultiPhraseQuery("smith", "ann*"), not <<+smith +ann*>> as I said earlier.
B. Getting hairy. Any hope?
--Renaud
-Original Message-
From: Mark Miller [mailto:[EMAIL PROTECTED]
Sent: Thursday, June 14, 2007 6:43 AM
To: java-user@lucene.apache.org
Subject: Re: Wildcard query with untokenized punctuation (again)
Gotta agree with Erick here...best idea is just to preprocess the query
before sending it to the QueryParser.
My first thought is always to get out the sledgehammer...
- Mark
Erick Erickson wrote:
Well, perhaps the simplest thing would be to pre-process the query and
make the comma into a whitespace before sending anything to the query parser.
If you don't use the same tokenizer for indexing and searching, you will
have troubles like this.
Mixing exact match (with ") and wildcard (*) is a strange idea.
Typographical rules say that you have a space after a comma, no?
Your field is tokenized?
M.
Renaud Waldura wrote:
> My very simple
Well, perhaps the simplest thing would be to pre-process the query and
make the comma into a whitespace before sending anything to the
query parser. I don't know how generalizable that sort of solution is in
your problem space, though.
Best
Erick
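A minimal sketch of that pre-processing, assuming a 2.x-era QueryParser, a "name" field and WhitespaceAnalyzer (all illustrative assumptions):

import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;

public class CommaStrippingSearch {
    // "smith,ann*" becomes "smith ann*" before QueryParser ever sees it
    public static Query parse(String userInput) throws ParseException {
        String cleaned = userInput.replace(',', ' ');
        QueryParser parser = new QueryParser("name", new WhitespaceAnalyzer());
        return parser.parse(cleaned);   // e.g. a TermQuery on "smith" plus a PrefixQuery on "ann"
    }
}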
On 6/13/07, Renaud Waldura <[EMAIL PROTECTED]> wrote:
After taking a quick look, I don't see how you can do this without
modifying the QueryParser. In QueryParser.jj you will find the conflict
of interest at line 891. This line will cause a match on smith,ann* and
trigger a wildcard term match on the whole piece.
This is again caused by the fact
: You're entirely correct about the analyzer (I'm using one that breaks on
: non-alphanumeric characters, so all punctuation is ignored). To be
: honest, I hadn't thought about altering this, but I guess I could; just
: reticent that there might be unforeseen consequences.
This is where the PerFieldAnalyzerWrapper comes in.
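A sketch of that PerFieldAnalyzerWrapper idea, assuming the punctuation-sensitive field is "filename" and SimpleAnalyzer is the current default (both assumptions for illustration):

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.SimpleAnalyzer;

public class AnalyzerSetup {
    // Keep the current analyzer for every field except "filename", which is
    // left as a single untokenized term; pass the wrapper to both IndexWriter
    // and QueryParser so indexing and search agree.
    public static Analyzer make() {
        PerFieldAnalyzerWrapper wrapper = new PerFieldAnalyzerWrapper(new SimpleAnalyzer());
        wrapper.addAnalyzer("filename", new KeywordAnalyzer());
        return wrapper;
    }
}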
The reasoning behind not analyzing wildcard queries is also explained in
the FAQ: "Are Wildcard, Prefix, and Fuzzy queries case sensitive?"
Regards,
Doron
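One practical consequence of wildcard terms bypassing the analyzer, as a small sketch (the "filename" field and the manual lowercasing are assumptions, matching an index whose terms were lowercased at index time):

import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.WildcardQuery;

public class WildcardHelper {
    // The analyzer never touches wildcard terms, so normalize case by hand
    public static Query filenameWildcard(String userTerm) {
        return new WildcardQuery(new Term("filename", userTerm.toLowerCase()));
    }
}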
>
> --Colin McGuigan
>
> -Original Message-
> From: Doron Cohen [mailto:[EMAIL PROTECTED]
> Sent: Saturday, March 10, 2007 2:08 AM
To: java-user@lucene.apache.org
Subject: Re: Wildcard query with untokenized punctuation
Hi Colin,
Is it possible that you are using an analyzer that breaks words on
non-letters? For instance SimpleAnalyzer? If so, the doc text:
pagefile.sys
is indexed as two words:
pagefile sys
At search time, the query text:
pagefile.sys
is also parsed-tokenized into a two-word query:
pagefile sys
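A quick way to watch that splitting happen, sketched against the 2.x-era analysis API (SimpleAnalyzer and the "filename" field are just the ones from the example above):

import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;

public class TokenDump {
    // Prints "pagefile" and then "sys" -- the dot never makes it into the index
    public static void main(String[] args) throws Exception {
        Analyzer analyzer = new SimpleAnalyzer();
        TokenStream stream = analyzer.tokenStream("filename", new StringReader("pagefile.sys"));
        for (Token token = stream.next(); token != null; token = stream.next()) {
            System.out.println(token.termText());
        }
    }
}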
-Original Message-
From: Steffen Heinrich [mailto:[EMAIL PROTECTED]
Sent: Fri 3/9/2007 4:31 PM
To: java-user@lucene.apache.org
Subject: Re: Wildcard query with untokenized punctuation
On 9 Mar 2007 at 15:10, McGuigan, Colin wrote:
> I have a "filename" field in Lucene that holds a value, like this:
> pagefile.sys
>
Hi Colin,
I'm still _very_ new to lucene, but isn't that what the un-tokenized
indexing is for?
Like in 1.9.1
doc.add(Field.Keyword("filename", "pagefile.sys"));
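For what it's worth, a sketch of the same idea with the 2.x Field API (Field.Keyword was the 1.x shortcut); the WildcardQuery part shows the kind of query that then matches the intact term:

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.WildcardQuery;

public class UntokenizedFilename {
    // Store and index the filename as one untokenized term, dot and all
    public static Document makeDoc() {
        Document doc = new Document();
        doc.add(new Field("filename", "pagefile.sys",
                Field.Store.YES, Field.Index.UN_TOKENIZED));
        return doc;
    }

    // A wildcard query against that single term then matches as expected
    public static Query makeQuery() {
        return new WildcardQuery(new Term("filename", "pagefile.*"));
    }
}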