Maybe you should try posting to a Lucene mailing list?

Nils-H

On Thu, Jan 1, 2009 at 9:28 PM, Amin Mohammed-Coleman <ami...@gmail.com> wrote:
> Hi
>
> I have created an RTFHandler which takes an RTF file and creates a Lucene
> Document which is indexed. The RTFHandler looks something like this:
>
>     if (bodyText != null) {
>         Document document = new Document();
>         Field field = new Field(MetaDataEnum.BODY.getDescription(),
>                 bodyText.trim(), Field.Store.YES, Field.Index.ANALYZED);
>         document.add(field);
>     }
>
> I am using Java's built-in RTF text extraction. When I run my test to verify
> that the document contains the text I expect, it works fine. I get the
> following when I print the document:
>
>     Document<stored/uncompressed,indexed,tokenized<body:This is a test rtf
>     document that will be indexed.
>
>     Amin Mohammed-Coleman>
>     stored/uncompressed,indexed<path:rtfDocumentToIndex.rtf>
>     stored/uncompressed,indexed<name:rtfDocumentToIndex.rtf>
>     stored/uncompressed,indexed<type:RTF_INDEXER>
>     stored/uncompressed,indexed<summary:This is a >>
>
> The problem is that when I use the following to search, I get no results:
>
>     MultiSearcher multiSearcher = new MultiSearcher(new Searchable[]
>             {rtfIndexSearcher});
>     Term t = new Term("body", "Amin");
>     TermQuery termQuery = new TermQuery(t);
>     TopDocs topDocs = multiSearcher.search(termQuery, 1);
>     System.out.println(topDocs.totalHits);
>     multiSearcher.close();
>
> rtfIndexSearcher is configured with the directory that holds the RTF
> documents. I have used Luke to look at the document, and the Overview tab
> shows the following for it:
>
>     1 body test
>     1 id 1234
>     1 name rtfDocumentToIndex.rtf
>     1 path rtfDocumentToIndex.rtf
>     1 summary This is a
>     1 type RTF_INDEXER
>     1 body rtf
>
> However, on the Document tab I am getting (in the body field):
>
>     This is a test rtf document that will be indexed.
>
>     Amin Mohammed-Coleman
>
> I would expect to get a hit using "Amin" or even "document". I am not sure
> whether the line:
>
>     TopDocs topDocs = multiSearcher.search(termQuery, 1);
>
> is incorrect, as I am not too sure of the meaning of "Finds the top n hits
> for query." for search(Query query, int n) according to the Javadoc.
>
> I would be grateful if someone could advise on what I may be doing
> wrong. I am using Lucene 2.4.0.
>
> Cheers
> Amin

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@struts.apache.org
For additional commands, e-mail: user-h...@struts.apache.org
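
One possible cause of the miss on "Amin", offered only as a guess: a TermQuery looks up its term exactly as given, while a field indexed with Field.Index.ANALYZED stores whatever tokens the analyzer produced. The quoted message does not say which analyzer was used, so the sketch below assumes one that lower-cases tokens. It is a toy inverted index in plain Java, not Lucene code, and the names TermLookupSketch, analyze, add, exactHits and sample are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy inverted index (NOT Lucene) showing how an un-analyzed term can miss. */
final class TermLookupSketch {
    // Maps "field:term" to the ids of the documents that contain the term.
    private final Map<String, List<Integer>> postings = new HashMap<>();

    /** Assumed index-time analysis: split on non-alphanumerics, then lower-case. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (!raw.isEmpty()) {
                tokens.add(raw.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    /** Indexes the analyzed tokens of one field of one document. */
    void add(int docId, String field, String text) {
        for (String token : analyze(text)) {
            List<Integer> ids =
                    postings.computeIfAbsent(field + ":" + token, k -> new ArrayList<>());
            if (!ids.contains(docId)) {
                ids.add(docId);
            }
        }
    }

    /** Exact lookup in the spirit of a term query: the term is not analyzed. */
    int exactHits(String field, String term) {
        List<Integer> ids = postings.get(field + ":" + term);
        return ids == null ? 0 : ids.size();
    }

    /** Builds an index holding the body text from the quoted message. */
    static TermLookupSketch sample() {
        TermLookupSketch index = new TermLookupSketch();
        index.add(1, "body",
                "This is a test rtf document that will be indexed.\n\nAmin Mohammed-Coleman");
        return index;
    }

    public static void main(String[] args) {
        TermLookupSketch index = sample();
        for (String term : new String[] {"Amin", "amin", "document"}) {
            System.out.println(term + " -> " + index.exactHits("body", term));
        }
    }
}
```

Under that assumption the capitalised "Amin" finds nothing, while "amin" and "document" each match the one document. Lower-casing alone would therefore not explain a miss on "document", so it is also worth checking that the searcher is opened on the same index directory the handler wrote to.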