Re: Most used words

2006-04-23 Thread Kapil Chhabra
It's a good idea. src/org/getopt/luke/HighFreqTerms.java should definitely help. You may also control the number of terms [tags in your case] required as the output. Do you have a demo URL for your application? Regards, kapilChhabra Daniel Cortes wrote: Thanks for the reply, perhaps to use something li
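The idea of listing the most frequent terms with a configurable limit can be sketched in plain Java. This is only an illustration of the counting step, not the code of HighFreqTerms.java and not a Lucene API; the class `TopTerms` and its method `top` are hypothetical names, and the terms are assumed to be already extracted into a list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Lucene or Luke.
class TopTerms {
    // Returns up to n terms, most frequent first; ties are broken alphabetically.
    static List<String> top(List<String> terms, int n) {
        // Count how often each term occurs.
        Map<String, Integer> counts = new HashMap<>();
        for (String term : terms) {
            counts.merge(term, 1, Integer::sum);
        }
        // Sort by descending count, then by term.
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        // Keep only the first n terms.
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }
}
```

Sorting every distinct term is the simplest approach; for a very large vocabulary, a bounded priority queue of size n would avoid the full sort.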

Re: Segments creation

2006-04-23 Thread John Paige
Thanks for the responses. For fault tolerance, we have decided to set the "mergeFactor" to 1, i.e., we want to update the on-disk index every time a document is added via addDocument(). My question is: will this create a new segment every time? Or is there a separate "segmentFactor", such that a

Re: Segments creation

2006-04-23 Thread Erik Hatcher
If you use the compound format, all "files" are kept inside a single filesystem file. Erik On Apr 23, 2006, at 2:13 PM, John Paige wrote: So, if I use one IndexWriter instance to index one document, will it create a segment per document? How many files per segment get added if I u

Re: demo example

2006-04-23 Thread Grant Ingersoll
Did you follow the directions at http://lucene.apache.org/java/docs/gettingstarted.html? There is an Ant task that creates a WAR file for the Demo (called luceneweb.war), so I don't think you should have to do any copying of jar files. You should be able to copy this to Tomcat or whatever ser

Re: How to search in sentence and display the whole sentence

2006-04-23 Thread Grant Ingersoll
Anton, I think there are at least a couple of ways of doing this. I assume you have a program that does sentence detection already, as Lucene does not provide this. If not, I am sure a search of the web will find one that has high accuracy. You can: 1. Index each sentence as a separate Docu
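As a rough sketch of the "one unit per sentence" idea, the JDK's java.text.BreakIterator can serve as a simple stand-in for a sentence detector. It is a rule-based splitter and makes no claim to high accuracy, and the example below does not use Lucene at all; the class `SentenceSearch` and method `sentencesContaining` are hypothetical names. It splits the text into sentences and returns each whole sentence that contains the search word.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of Lucene.
class SentenceSearch {
    // Returns every sentence of text that contains word as a whole token (case-insensitive).
    static List<String> sentencesContaining(String text, String word) {
        List<String> hits = new ArrayList<>();
        String needle = word.toLowerCase(Locale.ROOT);
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            // Split on anything that is not a letter or digit and compare tokens.
            for (String token : sentence.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
                if (token.equals(needle)) {
                    hits.add(sentence);
                    break;
                }
            }
        }
        return hits;
    }
}
```

In a Lucene-based version, each sentence produced by the splitter would presumably be stored in its own Document so that a hit can return the sentence text directly.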

How to search in sentence and display the whole sentence

2006-04-23 Thread anton feldmann
I intend to build a search that finds a word or a word pair in a sentence or a paragraph. The sentence should then be displayed as a whole. My question is how to extend Lucene so that this is possible. But where do I start, because I have no idea, ho

Re: Segments creation

2006-04-23 Thread John Paige
So, if I use one IndexWriter instance to index one document, will it create a segment per document? How many files per segment get added if I use compound index file format? Thanks, John On 4/23/06, Erik Hatcher <[EMAIL PROTECTED]> wrote: > > If you use the compound index file format (the default

Re: Segments creation

2006-04-23 Thread Erik Hatcher
If you use the compound index file format (the default since Lucene 1.4) you'll avoid the file descriptors issue. If you add 10 documents at one time with a single IndexWriter, you will not create 10 segments, only one segment (generally speaking, based on the default segment factors).
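The effect described above, where documents added through a single writer end up in one segment, can be illustrated with a toy model. This is a deliberately simplified sketch and not Lucene's actual merge code: it assumes documents are buffered until `maxBufferedDocs` is reached, flushed as one segment, and that the last `mergeFactor` segments are merged whenever they all have the same size. The class `ToySegmentModel` and the parameter values used with it are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical class, not Lucene code.
class ToySegmentModel {
    // Returns the sizes (in documents) of the segments left after adding docs documents
    // and closing the writer.
    static List<Integer> segmentSizes(int docs, int maxBufferedDocs, int mergeFactor) {
        // In this model, merging a single segment with itself would never terminate.
        if (maxBufferedDocs < 1 || mergeFactor < 2) {
            throw new IllegalArgumentException("need maxBufferedDocs >= 1 and mergeFactor >= 2");
        }
        List<Integer> segments = new ArrayList<>();
        int buffered = 0;
        for (int i = 0; i < docs; i++) {
            buffered++;
            if (buffered == maxBufferedDocs) {
                segments.add(buffered);   // flush the buffer as one new segment
                buffered = 0;
                mergeEqualTail(segments, mergeFactor);
            }
        }
        if (buffered > 0) {
            segments.add(buffered);       // closing the writer flushes the remainder
        }
        return segments;
    }

    // Merges the last mergeFactor segments while they all have the same size.
    private static void mergeEqualTail(List<Integer> segments, int mergeFactor) {
        while (segments.size() >= mergeFactor) {
            int from = segments.size() - mergeFactor;
            int size = segments.get(segments.size() - 1);
            int total = 0;
            for (int i = from; i < segments.size(); i++) {
                if (segments.get(i) != size) {
                    return;               // tail is not uniform, nothing to merge
                }
                total += segments.get(i);
            }
            segments.subList(from, segments.size()).clear();
            segments.add(total);
        }
    }
}
```

With both parameters set to 10, adding 10 documents leaves a single segment in this model, while adding 25 leaves three segments of 10, 10, and 5 documents.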

Segments creation

2006-04-23 Thread John Paige
Hello all, In my application we need to build an index for each user. We need to add documents to the existing index frequently. We cannot use RAMDirectory to create a RAM index and merge it with the FSDirectory index later on based on the mergeFactor. We need to add each document in the