RE: LineDocMaker usage

2008-08-08 Thread Brittany Jacobs
10:23 AM
To: java-user@lucene.apache.org
Subject: Re: LineDocMaker usage
In the example code, two separate fields are added to each document: the file name and the contents of one line. The fields can be queried or retrieved separately. There is a typo in the second Field line: should read "

Re: LineDocMaker usage

2008-08-08 Thread Ian Lea
essage-
> From: Anshum [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, August 06, 2008 10:30 PM
> To: java-user@lucene.apache.org
> Subject: Re: LineDocMaker usage
>
> Hi,
> How about just opening a file and parsing through it while doing a
> doc.add on each newline? Tha

RE: LineDocMaker usage

2008-08-08 Thread Brittany Jacobs
Why do you add to the doc twice, once with the file path and once with the string?
-Brittany
-Original Message-
From: Anshum [mailto:[EMAIL PROTECTED]
Sent: Wednesday, August 06, 2008 10:30 PM
To: java-user@lucene.apache.org
Subject: Re: LineDocMaker usage
Hi,
How about just opening a

RE: LineDocMaker usage

2008-08-08 Thread Brittany Jacobs
To: java-user@lucene.apache.org
Subject: Re: LineDocMaker usage
Hi,
How about just opening a file and parsing through it while doing a doc.add on each newline? That should be pretty straightforward and simple. Just writing the snippet here, though this might have issues as I didn't try to compile it

Re: LineDocMaker usage

2008-08-07 Thread Michael McCandless
Also, since the docs are so regular from a line file, to gain indexing throughput you can re-use the Document & Field instances, and inside the while loop just use Field.setValue(...) to change the value per document.
Mike
Anshum wrote:
Hi, How about just opening a file and parsing thr

Re: LineDocMaker usage

2008-08-06 Thread Anshum
Hi,
How about just opening a file and parsing through it while doing a doc.add on each newline? That should be pretty straightforward and simple. Just writing the snippet here, though this might have issues as I didn't try to compile it.
IndexWriter writer = new IndexWriter(indexDir, new Standard
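The one-document-per-line idea in this message can be sketched without Lucene at all. The version below is only an illustration, not the snippet from the message: a plain Map stands in for a Lucene Document, and the names LineRecords, "path" and "contents" are assumptions chosen to mirror the two fields discussed in the thread (the file name and the text of one line).

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the indexing loop: builds one record per line.
// A Map plays the role of a Lucene Document; nothing here touches Lucene.
class LineRecords {
    static List<Map<String, String>> read(String fileName, Reader in) {
        List<Map<String, String>> records = new ArrayList<>();
        // lines() yields each line without its terminator.
        new BufferedReader(in).lines().forEach(line -> {
            Map<String, String> record = new LinkedHashMap<>();
            record.put("path", fileName);   // same value for every line of the file
            record.put("contents", line);   // the text of this one line
            records.add(record);
        });
        return records;
    }

    public static void main(String[] args) {
        Reader in = new StringReader("first line\nsecond line\n");
        for (Map<String, String> record : read("example.txt", in)) {
            System.out.println(record);
        }
    }
}
```

In a real indexer each record would become a document with two fields, which is why both can be queried or retrieved separately.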

Re: LineDocMaker usage

2008-08-06 Thread Michael McCandless
The format is:
title date body
But this is normally only used to create documents as part of an algorithm that you run under contrib/benchmark.
Mike
On Aug 6, 2008, at 4:12 PM, Brittany Jacobs wrote:
Hello, I am new to all this. I need to read in a text file and have each line
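A minimal sketch of splitting one such line into its three fields. The separator is not visible in the message as archived, so the tab used here is an assumption, and the class and method names are hypothetical rather than part of Lucene or contrib/benchmark.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Lucene: splits one line of the
// "title, date, body" format into its three fields.
class LineDocFormat {
    // Assumption: fields are separated by a single tab character.
    static final String SEP = "\t";

    static String[] parse(String line) {
        // A limit of 3 keeps any further separators inside the body field.
        String[] parts = line.split(SEP, 3);
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected title, date and body: " + line);
        }
        return parts;
    }

    public static void main(String[] args) {
        String[] fields = parse("My title\t2008-08-06\tSome body text");
        System.out.println(Arrays.toString(fields));
    }
}
```

A plain text file with one free-form sentence per line would not match this layout, which fits the remark that the format is mainly meant for files used under contrib/benchmark.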