OK, thanks. I will implement this using WhitespaceAnalyzer and check the indexing speed, i.e. the time taken to index my data set. If that takes too long, I will probably look into implementing a custom analyzer.
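
For anyone following the thread, here is a minimal sketch in plain Java of the two suggestions quoted below. It is a toy model of token positions, not Lucene code: the names PositionGapDemo, Tok, analyze, matchesPhrase and STOP_WORDS are made up for illustration, and the three-word stop list is an assumption. It only shows why "Health and Safety" matches the phrase "Health Safety" when stop words are dropped and positions are collapsed, and why it stops matching when either all words are kept or each dropped stop word still consumes a position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Toy model of token positions. Hypothetical names; no Lucene classes are used. */
final class PositionGapDemo {

    /** A term plus the position it would occupy in the index. */
    static final class Tok {
        final String term;
        final int position;
        Tok(String term, int position) { this.term = term; this.position = position; }
    }

    // Assumed stop list, for this demo only.
    static final Set<String> STOP_WORDS =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList("and", "or", "in")));

    /**
     * Lowercases the text, splits on whitespace and drops the given stop words.
     * keepGaps == false: surviving tokens are numbered consecutively.
     * keepGaps == true:  each dropped stop word adds 1 to the next token's increment.
     */
    static List<Tok> analyze(String text, Set<String> stopWords, boolean keepGaps) {
        List<Tok> tokens = new ArrayList<>();
        int position = -1;
        int skipped = 0;
        for (String word : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (word.isEmpty()) continue;
            if (stopWords.contains(word)) {
                if (keepGaps) skipped++;
                continue;
            }
            position += 1 + skipped; // the "position increment" of this token
            skipped = 0;
            tokens.add(new Tok(word, position));
        }
        return tokens;
    }

    /** Exact phrase match (slop 0): the phrase terms must sit at consecutive positions. */
    static boolean matchesPhrase(List<Tok> doc, String... phrase) {
        if (phrase.length == 0) return false;
        for (Tok start : doc) {
            if (!start.term.equals(phrase[0])) continue;
            boolean allFound = true;
            for (int i = 1; i < phrase.length && allFound; i++) {
                allFound = hasTermAt(doc, phrase[i], start.position + i);
            }
            if (allFound) return true;
        }
        return false;
    }

    private static boolean hasTermAt(List<Tok> doc, String term, int position) {
        for (Tok t : doc) {
            if (t.position == position && t.term.equals(term)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        String doc = "Health and Safety";
        Set<String> none = Collections.<String>emptySet();
        // Stop words dropped, positions collapsed: the unwanted match.
        System.out.println(matchesPhrase(analyze(doc, STOP_WORDS, false), "health", "safety")); // true
        // Stop words dropped, gap kept: no match.
        System.out.println(matchesPhrase(analyze(doc, STOP_WORDS, true), "health", "safety"));  // false
        // No stop words removed at all (whitespace-style, except this toy lowercases): no match.
        System.out.println(matchesPhrase(analyze(doc, none, false), "health", "safety"));       // false
    }
}
```

In this model the index still omits the stop words when keepGaps is true, so the size benefit is kept; only the recorded positions change. In Lucene itself the same effect would presumably be achieved through the token's position increment, as described in the quoted reply.
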

Zhang, Lisheng wrote:
>
> Hi,
>
> In case you do not want to toss away any stop words and even want to
> preserve case, WhitespaceAnalyzer can be used. Using
> WhitespaceTokenizer would also serve as a test (but you need to reindex
> the whole data set first) to make sure there are no other problems.
>
> Best regards, Lisheng
>
> -----Original Message-----
> From: mark harwood [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, December 18, 2007 9:42 AM
> To: java-user@lucene.apache.org
> Subject: Re: Phrase Query Problem
>
> You could write a custom analyzer that drops stopwords but adds an extra 1
> to the "positionIncrement" property for the next valid Token after each
> omitted stop word.
>
> This would retain the benefit of removing stopwords from your index and yet
> prevent your example phrases matching (because the remaining words are not
> recorded as being directly next to each other).
>
> Cheers
> Mark
>
> ----- Original Message ----
> From: Sirish Vadala <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Tuesday, 18 December, 2007 5:10:19 PM
> Subject: RE: Phrase Query Problem
>
> Yes. If my query phrase is "Health Safety", docs with "Health and Safety"
> and "Health or Safety" are being returned.
>
> So is there any other way to handle this situation? In the case mentioned
> above, the user is expecting around 5 records and the query is fetching
> more than 550 records. 8-O
>
> Thanks.
>
> Zhang, Lisheng wrote:
>>
>> Hi,
>>
>> Do you mean that your query phrase is "Health Safety",
>> but docs with "Health and Safety" are returned?
>>
>> If that is the case, the reason is that StandardAnalyzer
>> filters out "and" (also "or", "in" and others) as stop
>> words during indexing, and the QueryParser filters those
>> words out also.
>>
>> Best regards, Lisheng
>>
>> -----Original Message-----
>> From: Sirish Vadala [mailto:[EMAIL PROTECTED]
>> Sent: Monday, December 17, 2007 9:49 AM
>> To: java-user@lucene.apache.org
>> Subject: Phrase Query Problem
>>
>> I have the following code for search:
>>
>> BooleanQuery bQuery = new BooleanQuery();
>> Query queryAuthor;
>> queryAuthor = new TermQuery(new Term(IFIELD_LEAD_AUTHOR,
>>     author.trim().toLowerCase()));
>> bQuery.add(queryAuthor, BooleanClause.Occur.MUST);
>> ....................................................................
>> ....................................................................
>>
>> PhraseQuery pQuery = new PhraseQuery();
>> // Split on runs of whitespace so repeated spaces do not yield empty terms.
>> String[] phrase = txtWithPhrase.trim().toLowerCase().split("\\s+");
>> for (int i = 0; i < phrase.length; i++) {
>>     pQuery.add(new Term(IFIELD_TEXT, phrase[i]));
>> }
>> pQuery.setSlop(0); // slop 0: terms must be at adjacent positions
>> bQuery.add(pQuery, BooleanClause.Occur.MUST);
>> ....................................................................
>> ....................................................................
>>
>> String[] sortOrder = {IFIELD_LEAD_AUTHOR, IFIELD_TEXT};
>> Sort sort = new Sort(sortOrder);
>> hits = indexSearcher.search(bQuery, sort);
>>
>> My problem is this: if I search on the phrase Health Safety, it fetches
>> all the records whose text contains Health and/or/in Safety. It fetches
>> these records even after I set the slop of the phrase query to zero for
>> an exact match. I am using StandardAnalyzer while indexing my records.
>>
>> Any help on this is greatly appreciated.
>>
>> Sirish Vadala
>> --
>> View this message in context:
>> http://www.nabble.com/Phrase-Query-Problem-tp14373945p14373945.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.

-- 
View this message in context: http://www.nabble.com/Phrase-Query-Problem-tp14373945p14402820.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]