Re: Wildcard query

2009-08-31 Thread Anshum
e word and then apply /expand the > wild card query. Is this a bug with queryparser? OR Analyzer will not stem > the word if it finds a wild card operator in it? > > Regards > Ganesh > > > - Original Message - > From: "Anshum" > To: > Sent: Mon

Re: Wildcard query

2009-08-31 Thread Ganesh
parser? OR Analyzer will not stem the word if it finds a wild card operator in it? Regards Ganesh - Original Message - From: "Anshum" To: Sent: Monday, August 31, 2009 3:01 PM Subject: Re: Wildcard query > Hi Ganesh, > > Its the snowball analyzer that uses English

Re: Wildcard query

2009-08-31 Thread Anshum
Hi Ganesh, It's the Snowball analyzer, which uses the English stemmer, that is causing this behavior. When you search for 'attention', the query gets parsed to 'attent', whereas when you search for 'attenti' it stays as it is because the stemmer is not able to fit it anywhere. Could you tell me what

Re: Wildcard query ...

2008-10-13 Thread Chris Hostetter
BooleanQuery picks a Scorer based on the number of clauses and what their options are ... all of the scorers it might pick from are smart enough to continuously reorder the clauses, having them "skip ahead" to the next document they match, beyond whatever docIds it already knows can't match (ba

Re: Wildcard query with untokenized punctuation (again)

2007-06-15 Thread Erick Erickson
he query <> is parsed to > PhraseQuery("smith ann"). > And that seems right, from a user standpoint. > > In fact, considering this, I realize <> should be parsed > to MultiPhraseQuery("smith", "ann*"), not <<+smith +ann*>> as I sa

RE: Wildcard query with untokenized punctuation (again)

2007-06-14 Thread Renaud Waldura
his issue: how to get QueryParser to generate MultiPhraseQueries. Got some good ideas from it, but unfortunately no complete solution. I'll keep on hacking. --Renaud -Original Message- From: Mark Miller [mailto:[EMAIL PROTECTED] Sent: Thursday, June 14, 2007 12:07 PM To: java-user@

Re: Wildcard query with untokenized punctuation (again)

2007-06-14 Thread Mark Miller
", "ann*"), not <<+smith +ann*>> as I said earlier. B. Getting hairy. Any hope? --Renaud -Original Message- From: Mark Miller [mailto:[EMAIL PROTECTED] Sent: Thursday, June 14, 2007 6:43 AM To: java-user@lucene.apache.org Subject: Re: Wildcard query with unt

RE: Wildcard query with untokenized punctuation (again)

2007-06-14 Thread Renaud Waldura
r [mailto:[EMAIL PROTECTED] Sent: Thursday, June 14, 2007 6:43 AM To: java-user@lucene.apache.org Subject: Re: Wildcard query with untokenized punctuation (again) Gotta agree with Erick here...best idea is just to preprocess the query before sending it to the QueryParser. My first thought i

Re: Wildcard query with untokenized punctuation (again)

2007-06-14 Thread Mark Miller
Gotta agree with Erick here...best idea is just to preprocess the query before sending it to the QueryParser. My first thought is always to get out the sledgehammer... - Mark Erick Erickson wrote: Well, perhaps the simplest thing would be to pre-process the query and make the comma into a whi

Re: Wildcard query with untokenized punctuation (again)

2007-06-14 Thread Mathieu Lecarme
If you don't use the same tokenizer for indexing and searching, you will have trouble like this. Mixing exact match (with ") and wildcard (*) is a strange idea. Typographical rules say that you have a space after a comma, no? Is your field tokenized? M. Renaud Waldura wrote: > My very simple

Re: Wildcard query with untokenized punctuation (again)

2007-06-14 Thread Erick Erickson
Well, perhaps the simplest thing would be to pre-process the query and make the comma into a whitespace before sending anything to the query parser. I don't know how generalizable that sort of solution is in your problem space though Best Erick On 6/13/07, Renaud Waldura <[EMAIL PROTECTED]>
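
The pre-processing step suggested above can be sketched in plain Java, with no Lucene dependency. This is only an illustration under stated assumptions: the class name QueryPreprocessor and the method commasToSpaces are hypothetical, and the sketch replaces every comma, including commas inside quoted phrases, which may or may not suit a given query syntax.

```java
public class QueryPreprocessor {

    // Hypothetical helper (not part of Lucene): turns each comma, together
    // with any whitespace around it, into a single space so that the query
    // parser sees two separate terms instead of one wildcard term.
    public static String commasToSpaces(String rawQuery) {
        return rawQuery.replaceAll("\\s*,\\s*", " ").trim();
    }

    public static void main(String[] args) {
        // The rewritten string is what would then be handed to the query parser.
        System.out.println(commasToSpaces("smith,ann*"));   // prints: smith ann*
        System.out.println(commasToSpaces("smith, ann*"));  // prints: smith ann*
    }
}
```

Whether the resulting two terms are then combined as required clauses or as a phrase still depends on how the query parser is configured.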

Re: Wildcard query with untokenized punctuation (again)

2007-06-13 Thread Mark Miller
After taking a quick look, I don't see how you can do this without modifying the QueryParser. In QueryParser.jj you will find the conflict of interest at line 891. This line will cause a match on smith,ann* and trigger a wildcard term match on the whole piece. This is again caused by the fact

RE: Wildcard query with untokenized punctuation

2007-03-12 Thread Chris Hostetter
: You're entirely correct about the analyzer (I'm using one that breaks on : non-alphanumeric characters, so all punctuation is ignored). To be : honest, I hadn't thought about altering this, but I guess I could; just : reticent that there might be unforeseen consequences. this is where the PerF

RE: Wildcard query with untokenized punctuation

2007-03-10 Thread Doron Cohen
reasoning behind not analyzing wildcard queries is also explained in the FAQ: "Are Wildcard, Prefix, and Fuzzy queries case sensitive?" Regards, Doron > > --Colin McGuigan > > -Original Message- > From: Doron Cohen [mailto:[EMAIL PROTECTED] > Sent: Saturday, Mar

RE: Wildcard query with untokenized punctuation

2007-03-10 Thread McGuigan, Colin
arch 10, 2007 2:08 AM To: java-user@lucene.apache.org Subject: Re: Wildcard query with untokenized punctuation Hi Colin, Is it possible that you are using an analyzer that breaks words on non-letters? For instance SimpleAnalyzer? If so, the doc text: pagefile.sys is indexed as two words: pagefile

Re: Wildcard query with untokenized punctuation

2007-03-10 Thread Doron Cohen
Hi Colin, Is it possible that you are using an analyzer that breaks words on non-letters? For instance SimpleAnalyzer? If so, the doc text: pagefile.sys is indexed as two words: pagefile sys At search time, the query text: pagefile.sys is also parsed-tokenized into a two-word query: prof
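
The splitting behaviour described above can be imitated in plain Java to see why pagefile.sys ends up as two terms. This is a rough sketch and not Lucene's own code: the class LetterSplitSketch and its tokenize method are hypothetical, and they only mimic the documented effect of an analyzer that breaks on non-letters and lowercases the result.

```java
import java.util.ArrayList;
import java.util.List;

public class LetterSplitSketch {

    // Hypothetical stand-in for a letter-based analyzer: every run of
    // letters becomes one lowercased token, and any other character
    // (including '.') acts as a separator and is dropped.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("pagefile.sys")); // prints: [pagefile, sys]
    }
}
```

Because the index then holds the terms pagefile and sys rather than pagefile.sys, a wildcard term that is not analyzed and still contains the dot has nothing to match against.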

RE: Wildcard query with untokenized punctuation

2007-03-09 Thread McGuigan, Colin
-Original Message- From: Steffen Heinrich [mailto:[EMAIL PROTECTED] Sent: Fri 3/9/2007 4:31 PM To: java-user@lucene.apache.org Subject: Re: Wildcard query with untokenized punctuation On 9 Mar 2007 at 15:10, McGuigan, Colin wrote: >> I have a "filename" field in Luc

Re: Wildcard query with untokenized punctuation

2007-03-09 Thread Steffen Heinrich
On 9 Mar 2007 at 15:10, McGuigan, Colin wrote: > I have a "filename" field in Lucene that holds a value, like this: > pagefile.sys > Hi Colin, I'm still _very_ new to lucene, but isn't that what the un-tokenized indexing is for? Like in 1.9.1 doc.add(Field.Keyword("filename", "pagefile.sys"));