On Tue, Mar 16, 2010 at 20:45, Rene Hackl-Sommer wrote:
> Hi Daniel,
>
> Unless you have only a few documents and a small index, I don't think never
> calling optimize is going to be a means you should rely upon.
>
> What about if you reindexed the documents you are deleting, adding a field
> wit
I cannot comment on the "marked-as-deleted" documents, but for the
approach I outlined: this might impact the scores. I prefer to say
'impact' instead of 'skew', because to me 'skew' would imply that the
original scores are some kind of ideal state which is distorted. I don't
think this is nece
Wouldn't these excluded/filtered documents skew the scores even though they
are supposed to be marked as deleted? Don't the idf values used in scoring
depend on the entire document set and not just the matching hits for a
query?
Thanks,
TCK
On Tue, Mar 16, 2010 at 5:45 AM, Rene Hackl-Sommer wr
Hi Daniel,
Unless you have only a few documents and a small index, I don't think
never calling optimize is going to be a means you should rely upon.
What about if you reindexed the documents you are deleting, adding a
field with the value "true"? This would imply that
either
1) all fields
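
A rough illustration of the idf question raised above, not taken from any of the posts: the sketch assumes an idf of the form 1 + ln(numDocs / (docFreq + 1)), which is what Lucene's default similarity of that period used as far as I know. The class name and the document counts are made up for the example. It only shows that old versions that stay in the index still count towards numDocs and docFreq, so the idf of a term can differ from what it would be over the visible documents alone.

```java
/** Hypothetical sketch: how retained old versions can change a term's idf. */
final class IdfSketch {

    /** Assumed idf formula: 1 + ln(numDocs / (docFreq + 1)). */
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    public static void main(String[] args) {
        // Made-up numbers: 1000 documents in the index, 90 of them old versions.
        // The term occurs in 100 documents, 90 of which are old versions.
        double withOldVersions = idf(100, 1000);
        // The same term counted over the 910 visible documents only.
        double visibleOnly = idf(10, 910);

        System.out.println("idf including old versions: " + withOldVersions);
        System.out.println("idf over visible docs only: " + visibleOnly);
    }
}
```

Whether the difference matters depends on how unevenly the old versions are spread across terms; if every document has about the same number of old versions, the relative ordering of hits may change very little.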
An incidental merge will delete them.
I think you'll have to maintain your own filter... but it shouldn't be
that large? I.e., it's as large as the deleted docs BitVector would be
anyway... except that the docs never go away.
Mike
On Mon, Mar 15, 2010 at 11:20 PM, Daniel Noll wrote:
> Hi all.
>
> I'm
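
A minimal sketch of the "maintain your own filter" idea from Mike's reply, written without any Lucene classes so that it stands on its own. The class name, the method names and the use of plain integer ids are all assumptions for illustration; in a real index you would have to decide which id to key on (internal document numbers are not guaranteed to stay the same across merges) and how to persist the bits.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical helper: remembers which document ids are superseded versions. */
final class SupersededFilter {
    // One bit per document id, about the size a deleted-docs bit vector would have.
    private final BitSet superseded = new BitSet();

    /** Marks a document as an old version instead of deleting it from the index. */
    void markSuperseded(int docId) {
        superseded.set(docId);
    }

    /** A document is visible in search results unless it has been marked. */
    boolean isVisible(int docId) {
        return !superseded.get(docId);
    }

    /** Drops superseded documents from a list of hits, keeping the hit order. */
    List<Integer> filter(int[] hits) {
        List<Integer> visible = new ArrayList<>();
        for (int docId : hits) {
            if (isVisible(docId)) {
                visible.add(docId);
            }
        }
        return visible;
    }

    /** Number of documents currently marked as superseded. */
    int supersededCount() {
        return superseded.cardinality();
    }
}
```

Because nothing is ever removed from the index, the bit set only grows, which matches the remark above that the documents never go away.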
Hi all.
I'm trying to implement a form of document deletion where the previous
versions are kept around forever (a primitive form of versioning) but
excluded from the search results.
I notice that after calling IndexWriter.deleteDocuments, even if you
close and reopen the index, the documents ar