It's a good idea.
src/org/getopt/luke/HighFreqTerms.java should definitely help.
You can also control the number of terms (tags, in your case) returned in
the output.
Do you have a demo URL for your application?
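
For illustration, here is a minimal plain-Java sketch of the general idea behind picking the N most frequent terms with a bounded heap. It is not Luke's HighFreqTerms code and does not use the Lucene API; the class name TopTerms, the method top, and the sample counts are all hypothetical, and the term counts are assumed to have been collected already.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper, not part of Luke or Lucene.
class TopTerms {
    // Returns the n terms with the highest counts, most frequent first.
    static List<String> top(Map<String, Integer> counts, int n) {
        List<String> result = new ArrayList<>();
        if (n <= 0) {
            return result;
        }
        // Min-heap on count: the weakest term kept so far sits at the head.
        PriorityQueue<Map.Entry<String, Integer>> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a.getValue(), b.getValue()));
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            heap.offer(e);
            if (heap.size() > n) {
                heap.poll(); // drop the least frequent term to stay at size n
            }
        }
        // The heap drains in ascending order, so reverse at the end.
        while (!heap.isEmpty()) {
            result.add(heap.poll().getKey());
        }
        Collections.reverse(result);
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for tag frequencies.
        Map<String, Integer> counts = Map.of("lucene", 42, "index", 17, "tomcat", 3);
        System.out.println(top(counts, 2));
    }
}
```

The n argument plays the role of the "number of terms" setting mentioned above: the heap never holds more than n entries, so memory stays small even when there are many distinct terms.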
Regards,
kapilChhabra
Daniel Cortes wrote:
Thanks for the reply, perhaps to use something li
Thanks for the responses. For fault tolerance, we have decided to set the
"mergeFactor" to 1, i.e., we want to update the on-disk index every time a
document is added via addDocument(). My question is: will this create a
new segment every time? Or is there a separate "segmentFactor", such that a
If you use the compound format, all "files" are kept inside a single
filesystem file.
Erik
On Apr 23, 2006, at 2:13 PM, John Paige wrote:
So, if I use one IndexWriter instance to index one document, will it create
a segment per document?
How many files per segment get added if I u
Did you follow the directions at
http://lucene.apache.org/java/docs/gettingstarted.html
There is an Ant task that creates a WAR file for the Demo (called
luceneweb.war), so I don't think you should have to do any copying of
jar files. You should be able to copy this to Tomcat or whatever
ser
Anton,
I think there are at least a couple of ways of doing this. I assume you
have a program that does sentence detection already, as Lucene does not
provide this. If not, I am sure a search of the web will find one that
has high accuracy.
You can:
1. Index each sentence as a separate Docu
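
As a rough illustration of the sentence-level approach in item 1, the sketch below splits text into sentences with the JDK's java.text.BreakIterator and returns every whole sentence that contains all the given words. It is plain Java with no Lucene calls; the class name SentenceSearch and its methods are hypothetical, BreakIterator is only a simple stand-in for a real sentence detector, and matching is by lowercase substring rather than by analyzed tokens.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: treats each sentence as its own searchable unit.
class SentenceSearch {
    // Splits text into trimmed sentences using the JDK sentence iterator.
    static List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String s = text.substring(start, end).trim();
            if (!s.isEmpty()) {
                sentences.add(s);
            }
        }
        return sentences;
    }

    // Returns the whole sentences that contain every given word
    // (case-insensitive substring match, an assumption made for brevity).
    static List<String> find(String text, String... words) {
        List<String> hits = new ArrayList<>();
        for (String sentence : split(text)) {
            String lower = sentence.toLowerCase(Locale.ENGLISH);
            boolean all = true;
            for (String w : words) {
                if (!lower.contains(w.toLowerCase(Locale.ENGLISH))) {
                    all = false;
                    break;
                }
            }
            if (all) {
                hits.add(sentence);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "Lucene builds an index. Tomcat serves the demo. The index lives on disk.";
        System.out.println(find(text, "index", "disk"));
    }
}
```

In a Lucene setup, each string returned by split would presumably become one indexed document, so that a hit maps directly back to the sentence to display.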
I want to search for a word or a word pair within a sentence or a
paragraph, and have the matching sentence returned as a whole. My
question is how to extend Lucene so that this is possible. But where
do I start? I have no idea ho
So, if I use one IndexWriter instance to index one document, will it create
a segment per document?
How many files per segment get added if I use compound index file format?
Thanks,
John
On 4/23/06, Erik Hatcher <[EMAIL PROTECTED]> wrote:
>
> If you use the compound index file format (the default
If you use the compound index file format (the default since Lucene
1.4) you'll avoid the file descriptors issue. If you add 10
documents at one time with a single IndexWriter, you will not create
10 segments, only one segment (generally speaking, based on the
default segment factors).
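
To make the point above concrete, here is a deliberately simplified toy model in plain Java. It is not Lucene code and it ignores segment merging entirely: it only assumes that a writer buffers documents in memory and writes one segment when a buffer limit is reached or when the writer is closed. The class ToyWriter, its methods, and the limit of 10 are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: buffering and flushing, no merging, no real index files.
class ToyWriter {
    private final List<Integer> segments;  // doc count of each "on-disk" segment
    private final int maxBufferedDocs;     // hypothetical buffer limit
    private int buffered = 0;

    ToyWriter(List<Integer> segments, int maxBufferedDocs) {
        this.segments = segments;
        this.maxBufferedDocs = maxBufferedDocs;
    }

    void addDocument() {
        buffered++;
        if (buffered >= maxBufferedDocs) {
            flush();
        }
    }

    void close() {
        flush(); // closing always writes out whatever is still buffered
    }

    private void flush() {
        if (buffered > 0) {
            segments.add(buffered);
            buffered = 0;
        }
    }

    // One writer for all documents, closed once at the end.
    static List<Integer> oneWriter(int docs) {
        List<Integer> segments = new ArrayList<>();
        ToyWriter w = new ToyWriter(segments, 10);
        for (int i = 0; i < docs; i++) {
            w.addDocument();
        }
        w.close();
        return segments;
    }

    // A fresh writer opened and closed around every single document.
    static List<Integer> writerPerDoc(int docs) {
        List<Integer> segments = new ArrayList<>();
        for (int i = 0; i < docs; i++) {
            ToyWriter w = new ToyWriter(segments, 10);
            w.addDocument();
            w.close();
        }
        return segments;
    }

    public static void main(String[] args) {
        System.out.println("one writer:     " + oneWriter(10).size() + " segment(s)");
        System.out.println("writer per doc: " + writerPerDoc(10).size() + " segment(s)");
    }
}
```

In this model, ten documents added through a single writer end up in one segment, while opening and closing a writer around each document leaves ten single-document segments. A real index would also merge small segments over time, which the model leaves out.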
Hello all,
My application needs to build an index for each user, and we need
to add documents to the existing index frequently.
We cannot use RAMDirectory to create a RAM index and merge it with the
FSDirectory index later on based on the mergeFactor. We need to add each
document in the