When I call IndexWriter.addIndexes, is there anything I can do to make it
filter out duplicates based on a certain field (or group of fields)? If I
know that the id field of the document is unique, can I make addIndexes know
that if it finds a new document with the same id, the new one is valid and
I want to shard my index, which is currently a Lucene 3.0.0 solution with my
own service wrapper around it. Looking at Katta and Solr, I can see how to
do this pretty easily, but Katta and Solr don't seem to offer any help for
managing the index (write, update, delete). With Solr, it seems I'm
I am using org.apache.lucene.analysis.snowball.SnowballAnalyzer.
Looking through luke, I see that www.fubar.com was indexed, not fubar. So,
clearly, I'm not stripping out the stop words of www and com. Any ideas?
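To make the goal concrete, here is a minimal plain-Java sketch of the token stream I am after. It does not use Lucene at all: the class name, the stop list and the split rule are made up for illustration. The point it shows is that a stop list only works on whole tokens, so www and com can only be dropped if the text is first split at the dots. If the tokenizer emits www.fubar.com as a single token, the stop filter never sees www or com on their own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class HostTokenSketch {
    // Hypothetical stop list, for this illustration only.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("www", "com"));

    // Lower-case the text, split on every run of non-alphanumeric characters
    // (so the dots in a host name become token boundaries), then drop stop words.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!part.isEmpty() && !STOP_WORDS.contains(part)) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("www.fubar.com")); // prints [fubar]
        System.out.println(tokenize("fubar.com"));     // prints [fubar]
    }
}
```

With this behaviour at both index time and query time, a query on fubar or on fubar.com reduces to the same single token as the indexed text.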
I want to be able to store a doc with a field with this as a substring:
www.fubar.com
And then I want this document to get returned when I query on
fubar or
fubar.com
I assume what I should do is make www and com stop words, and make sure the
field is tokenized, so it will break it up along
n't explain the difference).
I don't want to overwrite my index every time I start up, but I do want
to be able to start up with a new, clean index dir. What do I do?
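One way to frame it is to decide at startup whether the directory already holds an index, and only ask for a fresh one when it does not. Below is a rough plain-Java sketch of that check. The class and method names are hypothetical, and treating "a file whose name starts with segments" as the sign of an existing index is an assumption based on the "no segments* file found" error message. The returned boolean is what I would pass as the create argument when constructing the IndexWriter. If I remember correctly, Lucene 3.0 also has IndexReader.indexExists(Directory) for this purpose, which would be preferable to hand-rolled file checks.

```java
import java.io.File;

class IndexDirSketch {
    // Returns true when the directory holds no "segments*" file, meaning a new
    // index has to be created. Returns false when one is present, meaning the
    // existing index should be opened and appended to rather than overwritten.
    static boolean needsCreate(File indexDir) {
        String[] names = indexDir.list(); // null if missing or not a directory
        if (names == null) {
            return true;
        }
        for (String name : names) {
            if (name.startsWith("segments")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "index" is a placeholder path, not the real index location.
        File dir = new File(args.length > 0 ? args[0] : "index");
        System.out.println("create = " + needsCreate(dir));
    }
}
```

A directory that contains only write.lock counts as "no index yet" here, so the first startup creates the index and later startups reuse it.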
jchang wrote:
>
>
> When I try to start my service and construct an IndexWriter, I get this:
>
When I try to start my service and construct an IndexWriter, I get this:
java.io.FileNotFoundException: no segments* file found in
org.apache.lucene.store.NIOFSDirectory@/home/jchang/IdeaProjects/index-service_trunk/target/testindexA/index/indexablemaildata:
files: [write.lock]
It is odd. The