Moreover, a search for "Mr. Arun Kumar" also matches other names, because the "Mr." term on its own matches them. I am ready to add "Mr." as a stop word in the analyzer.
Rohit Banga

On Tue, Feb 9, 2010 at 2:42 PM, Rohit Banga <iamrohitba...@gmail.com> wrote:

> i'll try using Luke.
>
> how i want to use Lucene?
>
> there is a sentence that may contain the names of some people from among
> those in a list. the names may be incomplete or may have spelling mistakes.
>
> so i created a lucene index, with each person as a document.
>
> eg.
>
> Mr. Arun Kumar
>
> with a hit highlighter i get
>
> <B>Mr</B>. <B>Arun</B> <B>Kumar</B>
>
> what i want is
> <B>Mr. Arun Kumar</B>
>
> even when there are spelling mistakes.
>
> Rohit Banga
>
> On Tue, Feb 9, 2010 at 1:57 PM, Uwe Schindler <u...@thetaphi.de> wrote:
>
>> If you don't get it working that way, then you have to ask yourself the
>> question: Why do you want it indexed that way? Is it because you don't want
>> to find all people in that field when you add only "Mr." to a search query?
>> It looks like you use StandardAnalyzer, and in this case, I would add "mr",
>> not "mr!", to the stop word list and index the name field as any other
>> field. Before doing this, it would be good to explain what you are
>> intending to do/prevent by indexing with NOT_ANALYZED, which is the source
>> of your problem.
>>
>> Uwe
>>
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: u...@thetaphi.de
>>
>> > -----Original Message-----
>> > From: Rohit Banga [mailto:iamrohitba...@gmail.com]
>> > Sent: Tuesday, February 09, 2010 9:03 AM
>> > To: java-user@lucene.apache.org
>> > Subject: Re: Lucene fields not analyzed
>> >
>> > let us assume this is the only field that is relevant (others are
>> > stored and not indexed).
>> > i tried termquery and it does not work.
>> > i also tried keyword analyzer and still could not make it work.
>> >
>> > @Mark
>> > i cannot escape the spaces in my query as i am using Lucene to identify
>> > occurrences of names among other things in the unstructured sentence.
>> > so while adding names to the index, i used keyword analyzer and changed
>> > the name to be added to the index to "Mr.\\ Kumar"
>> > but still couldn't get it to work.
>> >
>> > Rohit Banga
>> >
>> > On Tue, Feb 9, 2010 at 1:06 PM, Mark Harwood
>> > <markharw...@yahoo.co.uk> wrote:
>> >
>> > > I suspect it is because QueryParser uses space characters to separate
>> > > different clauses in a query string while you want the space to
>> > > represent some content in your "name" field. Try escaping the space
>> > > character.
>> > >
>> > > Cheers
>> > > Mark
>> > >
>> > > On 9 Feb 2010, at 07:26, Rohit Banga wrote:
>> > >
>> > > > Hello
>> > > >
>> > > > i have a field that stores names of people. i have used the
>> > > > NOT_ANALYZED parameter to index the names.
>> > > >
>> > > > this is what happens during indexing
>> > > >
>> > > > doc.add(new Field("name", "\"" + name + "\"", Field.Store.YES,
>> > > >     Field.Index.NOT_ANALYZED));
>> > > >
>> > > > when i search it, i create a query parser using standardanalyzer
>> > > > and append ~0.5 to the search query.
>> > > >
>> > > > the problem is that if the indexed name is "Mr. Kumar", my search
>> > > > does not work for "Mr. Kumar" while it does work for "Mr.Kumar"
>> > > > (without the space).
>> > > >
>> > > > // searching code
>> > > > File index_directory = new File(INDEX_DIR_PATH);
>> > > > IndexReader reader =
>> > > >     IndexReader.open(FSDirectory.open(index_directory), true);
>> > > > Searcher searcher = new IndexSearcher(reader);
>> > > >
>> > > > Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
>> > > >
>> > > > QueryParser parser = new QueryParser(Version.LUCENE_CURRENT,
>> > > >     "name", analyzer);
>> > > >
>> > > > Query query;
>> > > > query = parser.parse(text + "~0.5");
>> > > >
>> > > > how to make it work?
>> > > >
>> > > > Rohit Banga
>> > >
>> > > ---------------------------------------------------------------------
>> > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> > > For additional commands, e-mail: java-user-h...@lucene.apache.org
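
The highlighting goal stated in the thread (getting <B>Mr. Arun Kumar</B> rather than three separate <B> spans) can be approximated by post-processing the highlighter output. What follows is only a minimal sketch in plain Java and does not use any Lucene API. The class name HighlightMerger and the method mergeAdjacent are hypothetical. It assumes the highlighter wraps each hit in <B>...</B> and that only whitespace, periods, and commas between two hits should be absorbed into one span.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: merges adjacent <B>...</B> spans
// that are separated only by whitespace, periods, or commas.
class HighlightMerger {

    // Matches a closing tag, a run of separator characters, and the next
    // opening tag. Group 1 captures the separators so they can be kept.
    private static final Pattern GAP = Pattern.compile("</B>([\\s.,]*)<B>");

    static String mergeAdjacent(String highlighted) {
        // Drop the inner "</B>" and "<B>" but keep the separators between them.
        return GAP.matcher(highlighted).replaceAll("$1");
    }

    public static void main(String[] args) {
        String perToken = "<B>Mr</B>. <B>Arun</B> <B>Kumar</B>";
        System.out.println(mergeAdjacent(perToken));
        // prints: <B>Mr. Arun Kumar</B>
    }
}
```

Spans separated by an ordinary word (for example "<B>Arun</B> met <B>Kumar</B>") are left untouched, because the word is not in the separator set. Whether this is enough depends on how the names are indexed and queried, which the thread leaves open.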