i'll try using Luke. how i want to use Lucene?
there is a sentence that may contain the names of some people from among those in a list. the names may be incomplete or may have spelling mistakes. so i created a lucene index, with each person as a document. eg. Mr. Arun Kumar with a hit highlighter i get <B>Mr</B>. <B>Arun</B> <B>Kumar</B> what i want is <B>Mr. Arun Kumar</B> even when there are spelling mistakes. Rohit Banga On Tue, Feb 9, 2010 at 1:57 PM, Uwe Schindler <u...@thetaphi.de> wrote: > If you don't get it working that way, then you have to ask you the > question: Why do you want it indexed that way? Is it because you don't want > to find all people in that field when you add ony "Mr." to a search query? > It looks like you use StandardAnalyzer, and in this case, I would add "mr", > not "mr!", to the stop word list and index the name field as any other > field. Before doing this, it would be good to explain, what you are > intending to do/prevent by indexing with NOT_ANALYZED, which is the source > of your problem. > > Uwe > > ----- > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > > > -----Original Message----- > > From: Rohit Banga [mailto:iamrohitba...@gmail.com] > > Sent: Tuesday, February 09, 2010 9:03 AM > > To: java-user@lucene.apache.org > > Subject: Re: Lucene fields not analyzed > > > > let us assume this is the only field that is relevant (others are > > stored and > > not indexed). > > i tried termquery and it does not work. > > i also tried keyword analyzer and still could not make it work. > > > > @Mark > > i cannot escape the spaces in my query as i am using Lucene to identify > > occurences of names among other things in the unstructured sentence. > > so while adding names to the index, i used keyword analyzer and changed > > the > > name to be added to the index to "Mr.\\ Kumar" > > but still couldn't get it to work. > > > > > > > > > > > > > > Rohit Banga > > > > > > On Tue, Feb 9, 2010 at 1:06 PM, Mark Harwood > > <markharw...@yahoo.co.uk>wrote: > > > > > I suspect it is because QueryParser uses space characters to separate > > > different clauses in a query string while you want the space to > > represent > > > some content in your "name" field. Try escaping the space character. > > > > > > Cheers > > > Mark > > > > > > > > > > > > On 9 Feb 2010, at 07:26, Rohit Banga wrote: > > > > > > > Hello > > > > > > > > i have a field that stores names of people. i have used the > > NOT_ANALYZED > > > > parameter to index the names. > > > > > > > > this is what happens during indexing > > > > > > > > doc.add(new Field("name", "\"" + name + "\"", Field.Store.YES, > > > > Field.Index.NOT_ANALYZED)); > > > > > > > > > > > > > > > > when i search it, i create a query parser using standardanalyzer > > and > > > append > > > > ~0.5 to the search query. > > > > > > > > the problem is that if the indexed name is "Mr. Kumar", my search > > does > > > not > > > > work for "Mr. Kumar" while it does work for "Mr.Kumar" (without the > > > space). > > > > > > > > // searching code > > > > File index_directory = new File(INDEX_DIR_PATH); > > > > IndexReader reader = > > > > IndexReader.open(FSDirectory.open(index_directory), true); > > > > Searcher searcher = new IndexSearcher(reader); > > > > > > > > Analyzer analyzer = new > > StandardAnalyzer(Version.LUCENE_CURRENT); > > > > > > > > QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, > > > "name", > > > > analyzer); > > > > > > > > Query query; > > > > query = parser.parse(text + "~0.5"); > > > > > > > > how to make it work? > > > > > > > > Rohit Banga > > > > > > > > > --------------------------------------------------------------------- > > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > > > For additional commands, e-mail: java-user-h...@lucene.apache.org > > > > > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > >