What Shashi said.....

On Sun, Mar 8, 2009 at 10:22 AM, Shashi Kant <sk...@sloan.mit.edu> wrote:
> Liat, I think what Erick suggested was to use the TOKENIZED setting instead
> of UN_TOKENIZED. For example, your code should read something like:
>
> Document doc = new Document();
> doc.add(new Field(WordIndex.FIELD_WORLDS, "111 222 333", Field.Store.YES,
>     Field.Index.TOKENIZED));
>
> Unless I am missing something, you do not need to modify Luke's code base.
> You could, however, use it to open your index.
>
> Shashi
>
> On Sun, Mar 8, 2009 at 10:09 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>
> > In Luke, you can select from among most of the analyzers provided by
> > Lucene; there's a drop-down on the search tab. And I'm sure you can
> > add your own custom ones, but I confess I've never tried.
> >
> > Best
> > Erick
> >
> > On Sun, Mar 8, 2009 at 4:14 AM, liat oren <oren.l...@gmail.com> wrote:
> >
> > > Ok, thanks.
> > >
> > > I will have to edit the code of Luke in order to add another analyzer,
> > > right?
> > >
> > > If I need to query a long text - for example, to search for articles
> > > that are textually close in their content - I need to pass the text of
> > > the article to the query parser. Then I get an error that it is too
> > > long. Is there any way to solve it?
> > >
> > > Thanks a lot,
> > > Liat
> > >
> > > 2009/3/5 Erick Erickson <erickerick...@gmail.com>
> > >
> > > > I think your root problem is that you're indexing UN_TOKENIZED, which
> > > > means that the tokens you're adding to your index are NOT run through
> > > > the analyzer.
> > > >
> > > > So your terms are exactly "111", "222 333" and "111 222 333", none of
> > > > which match "222". I expect you wanted your tokens to be "111", "222",
> > > > and "333", each appearing twice in your index.
> > > >
> > > > Try indexing them tokenized. Although note that I don't remember what
> > > > StandardAnalyzer does with numbers. WhitespaceAnalyzer does the
> > > > more intuitive thing, but beware that it doesn't fold case.
> > > > But it might be an easier place for you to start until you get more
> > > > comfortable with what various analyzers do.
> > > >
> > > > Also, I *strongly* advise that you get a copy of Luke. It is a
> > > > wonderful tool that allows you to examine your index, analyze
> > > > queries, test queries, etc.
> > > >
> > > > But be aware that the site that maintains Luke was having problems
> > > > yesterday; look over the user-list messages from yesterday if you
> > > > have problems.
> > > >
> > > > Best
> > > > Erick
> > > >
> > > > On Thu, Mar 5, 2009 at 8:40 AM, liat oren <oren.l...@gmail.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I would like to do a search that will return documents that contain
> > > > > a given word. For example, I created the following index:
> > > > >
> > > > > IndexWriter writer = new IndexWriter("C:/TryIndex", new
> > > > >     StandardAnalyzer());
> > > > > Document doc = new Document();
> > > > > doc.add(new Field(WordIndex.FIELD_WORLDS, "111 222 333",
> > > > >     Field.Store.YES, Field.Index.UN_TOKENIZED));
> > > > > writer.addDocument(doc);
> > > > > doc = new Document();
> > > > > doc.add(new Field(WordIndex.FIELD_WORLDS, "111", Field.Store.YES,
> > > > >     Field.Index.UN_TOKENIZED));
> > > > > writer.addDocument(doc);
> > > > > doc = new Document();
> > > > > doc.add(new Field(WordIndex.FIELD_WORLDS, "222 333",
> > > > >     Field.Store.YES, Field.Index.UN_TOKENIZED));
> > > > > writer.addDocument(doc);
> > > > > writer.optimize();
> > > > > writer.close();
> > > > >
> > > > > Now I want to get all the documents that contain the word "222".
> > > > > I tried to run the following code, but it doesn't return any doc:
> > > > >
> > > > > IndexSearcher searcher = new IndexSearcher(indexPath);
> > > > >
> > > > > // TermQuery mapQuery = new TermQuery(new Term(FIELD_WORLDS,
> > > > > //     worldNum)); - this one also didn't work
> > > > > Analyzer analyzer = new StandardAnalyzer();
> > > > > QueryParser parser = new QueryParser(FIELD_WORLDS, analyzer);
> > > > > Query query = parser.parse(worldNum);
> > > > > Hits mapHits = searcher.search(query);
> > > > >
> > > > > Thanks a lot,
> > > > > Liat
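To see the point Erick is making without spinning up Lucene at all: an UN_TOKENIZED field stores the entire field value as one term, so the index contains the term "111 222 333" and a search for "222" matches nothing; a TOKENIZED field is run through the analyzer and split into one term per token. A minimal plain-Java sketch of that difference (no Lucene on the classpath; whitespace splitting stands in for WhitespaceAnalyzer, and the method names here are illustrative, not Lucene API):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class TokenizeDemo {

    // UN_TOKENIZED: the whole field value becomes a single indexed term.
    static Set<String> untokenized(String value) {
        return Set.of(value);
    }

    // TOKENIZED (whitespace analyzer): each word becomes its own term.
    static Set<String> tokenized(String value) {
        return new HashSet<>(Arrays.asList(value.split("\\s+")));
    }

    public static void main(String[] args) {
        String field = "111 222 333";
        // The single term "111 222 333" does not equal "222".
        System.out.println(untokenized(field).contains("222")); // false
        // The terms {"111", "222", "333"} do contain "222".
        System.out.println(tokenized(field).contains("222"));   // true
    }
}
```

This is why switching the field to Field.Index.TOKENIZED (as Shashi shows above) makes the query for "222" start matching: the term dictionary then holds the individual tokens rather than the raw string.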