can IndexWriter.addIndexes de-dupe documents?

2010-02-22 Thread jchang
When I call IndexWriter.addIndexes, is there anything I can do to make it filter out duplicates based on a certain field (or group of fields)? If I know that the id field of the document is unique, can I make addIndexes know that if it finds a new document with the same id, the new one is valid and

Can Zoie (or something else) help me to write/update/delete to sharded index?

2010-02-19 Thread jchang
I want to shard my index, which is currently a Lucene 3.0.0 solution with my own service wrapper around it. Looking at Katta and Solr, I can see how to do this pretty easily, but Katta and Solr don't seem to offer any help for managing the index (write, update, delete). With Solr, it seems I'm

Re: Can't get tokenization/stop works working

2010-02-02 Thread jchang
I am using org.apache.lucene.analysis.snowball.SnowballAnalyzer. Looking through Luke, I see that www.fubar.com was indexed, not fubar. So, clearly, I'm not stripping out the stop words www and com. Any ideas?

Can't get tokenization/stop works working

2010-01-31 Thread jchang
I want to be able to store a doc with a field that has this as a substring: www.fubar.com. I then want this document to be returned when I query on fubar or fubar.com. I assume what I should do is make www and com stop words, and make sure the field is tokenized, so it will break it up along
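The approach described in this message (split the field into tokens, then drop www and com as stop words) can be sketched in plain Java. This is only an illustration of the idea and does not use Lucene's analyzer classes; the class name HostTokenSketch, the tokenize helper, and the rule of splitting on every non-alphanumeric character are assumptions made for the example, not behavior taken from the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class HostTokenSketch {
    // Stop list taken from the question: "www" and "com" should not be indexed.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("www", "com"));

    // Assumed tokenization rule: lowercase the text, split on any run of
    // characters that are not ASCII letters or digits, then drop stop words.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!part.isEmpty() && !STOP_WORDS.contains(part)) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The indexed value and both query forms are run through the same rule.
        System.out.println(tokenize("www.fubar.com"));
        System.out.println(tokenize("fubar.com"));
        System.out.println(tokenize("fubar"));
    }
}
```

Under these assumptions the indexed text and both queries reduce to the same single token, which is what would let the queries match. It only works if the tokenizer actually splits on the dots and the same rule is applied at index time and at query time.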

Re: Can't start Lucene App: java.io.FileNotFoundException with brand new directory

2010-01-23 Thread jchang
n't explain the difference). I don't want to overwrite my index every time I start up, but I do want to be able to start up with a new, clean index dir. What do I do?

Can't start Lucene App: java.io.FileNotFoundException with brand new directory

2010-01-23 Thread jchang
When I try to start my service and construct an IndexWriter, I get this: java.io.FileNotFoundException: no segments* file found in org.apache.lucene.store.NIOFSDirectory@/home/jchang/IdeaProjects/index-service_trunk/target/testindexA/index/indexablemaildata: files: [write.lock] It is odd. The