> I've never used Lucene on Windows, but if I recall correctly from past
> discussions on this topic, the IndexWriter will try to delete any file
> listed in deletable whenever it does any segment merging (i.e., after adding
> some number of documents, when you call .optimize(), or when you call
> .c
Hi all,
I'm using Lucene, Digester, etc. for my MSc, and I'm quite new to these APIs.
I'm trying to get advice, but it's hard to say whether the problem lies with
Lucene or Digester.
Firstly:
I am trying to index the INEX collection, but when I try to index repeated
elements, only the last one is indexed. F
How does Lucene handle very large queries? I have 6 million documents, each
of which has a "docID" field. There is a total of 2 distinct docIDs, so
many documents share the same docID, which consists of a filename (name only,
not path).
Sometimes I must get all documents that have one of 10 docIDs,
Hi, Trond,
It should be no problem for Lucene to handle 6 million documents.
For your query, it seems you want to run a disjunctive (OR'ed) query over
multiple terms, 10 terms or 1 terms for example. In the worst case I can
think of, you can quite easily write your own query class to handle this
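
As a rough illustration of the OR'ed lookup described above, here is a minimal
sketch in plain Java. It does not use Lucene at all: ToyDocIdIndex, add() and
matchAny() are hypothetical names made up for this example, and the in-memory
map only stands in for a real index. In Lucene itself one would more likely
combine optional TermQuery clauses in a BooleanQuery, but the idea is the same:
a document matches if it carries any one of the requested docID values.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical toy index, NOT Lucene's API: maps each docID field value to
// the internal numbers of the documents that carry it.
class ToyDocIdIndex {
    private final Map<String, List<Integer>> postings = new HashMap<>();

    // Record that document number docNum has the given docID field value.
    void add(int docNum, String docId) {
        postings.computeIfAbsent(docId, k -> new ArrayList<>()).add(docNum);
    }

    // Disjunctive (OR) lookup: a document matches if it has ANY of the docIDs.
    List<Integer> matchAny(List<String> docIds) {
        TreeSet<Integer> hits = new TreeSet<>(); // sorted and de-duplicated
        for (String docId : docIds) {
            List<Integer> list = postings.get(docId);
            if (list != null) {
                hits.addAll(list); // unknown docIDs simply contribute nothing
            }
        }
        return new ArrayList<>(hits);
    }
}
```

The cost of such a lookup grows with the number of requested docIDs and the
length of their posting lists, not with the total size of the collection.
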
Hi, Trond,
By the way, it appears to me that Lucene uses the iterator pattern a lot,
as in SegmentTermEnum, TermDocs, TermPositions, etc. Each iterator uses an
underlying fixed-size buffer to load a chunk of data at a time. So even if you
have millions of documents, you shouldn't run into memory pro
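
To make the buffered-iterator idea concrete, here is a small sketch in plain
Java. It is not one of the Lucene classes named above: ChunkedIntEnum, its
next()/value() methods and the deliberately tiny BUFFER_SIZE are assumptions
made for illustration only. It shows how an iterator can walk an arbitrarily
long stream of values while holding only one fixed-size chunk in memory.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration, NOT a Lucene class: iterates over a stream of
// ints while keeping at most BUFFER_SIZE of them in memory at once.
class ChunkedIntEnum {
    private static final int BUFFER_SIZE = 4; // tiny on purpose, for the demo

    private final DataInputStream in;
    private final int[] buffer = new int[BUFFER_SIZE];
    private int remaining;  // values not yet read from the stream
    private int filled = 0; // valid entries currently in the buffer
    private int pos = 0;    // index of the next buffer entry to hand out
    private int current;    // value exposed by value()

    ChunkedIntEnum(InputStream source, int count) {
        this.in = new DataInputStream(source);
        this.remaining = count;
    }

    // Advance to the next value; returns false once everything is consumed.
    boolean next() throws IOException {
        if (pos == filled) {
            if (remaining == 0) {
                return false;
            }
            refill();
        }
        current = buffer[pos++];
        return true;
    }

    // The value at the current position; only valid after next() returned true.
    int value() {
        return current;
    }

    // Load the next chunk, overwriting the previous one in the same buffer.
    private void refill() throws IOException {
        filled = Math.min(BUFFER_SIZE, remaining);
        for (int i = 0; i < filled; i++) {
            buffer[i] = in.readInt();
        }
        remaining -= filled;
        pos = 0;
    }
}
```

Memory use here depends only on BUFFER_SIZE, not on how many values the
stream contains, which is the property the message above relies on.
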