java_is_everything wrote:


Hi all.

This may seem a longish and informal mail, but do correct me if my
assumptions are wrong anywhere, otherwise my actual doubt will make no
sense.

Say I opened an IndexWriter on an initially empty directory, using
autocommit = true. Now, what I do is add and delete documents randomly. I
set "x" as maxBufferedDocs and "y" as maxBufferedDeleteTerms (x < y).

IndexWriter starts its work. Now, I perform the following sequence :

STAGE 1 :
Add "x-2" documents one after another.   Total docs in memory = x-2   (1)
Delete 3 docs from memory.               Total docs in memory = x-5   (2)
Add 5 docs one after another.            Total docs in memory = x     (3)

STAGE 2 :
A flush happens, since maxBufferedDocs is reached.   Total docs in memory = 0   (4)

One correction here: the added-doc count that triggers a flush does *not* take deletions into account. So, in your step (3) above, IW will flush after you have added 2 of the 5 docs, which leaves the remaining 3 added docs buffered in RAM.
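
To make the counting concrete, here is a toy model in plain Java. It is only a sketch of the rule described above, not Lucene code: the class FlushTriggerSketch, its ToyWriter, and the choice x = 10 are all made up for illustration, and the toy never flushes on deletes (the question assumes the delete limit y is larger than x).

```java
class FlushTriggerSketch {
    /** Hypothetical toy writer: the flush trigger counts added docs only. */
    static class ToyWriter {
        final int maxBufferedDocs;
        int addedSinceFlush = 0;  // the only counter the trigger looks at
        int bufferedDeletes = 0;  // tracked separately, never consulted by the trigger
        int flushes = 0;

        ToyWriter(int maxBufferedDocs) {
            this.maxBufferedDocs = maxBufferedDocs;
        }

        void addDocument() {
            addedSinceFlush++;
            if (addedSinceFlush >= maxBufferedDocs) {
                flushes++;
                addedSinceFlush = 0;
            }
        }

        void deleteDocument() {
            bufferedDeletes++;  // does NOT decrement addedSinceFlush
        }
    }

    /** Replays Stage 1 and returns {number of flushes, added docs still buffered}. */
    static int[] run(int x) {
        ToyWriter w = new ToyWriter(x);
        for (int i = 0; i < x - 2; i++) w.addDocument();  // step (1)
        for (int i = 0; i < 3; i++) w.deleteDocument();   // step (2)
        for (int i = 0; i < 5; i++) w.addDocument();      // step (3)
        return new int[] { w.flushes, w.addedSinceFlush };
    }

    public static void main(String[] args) {
        int[] r = run(10);
        System.out.println("flushes=" + r[0] + " bufferedAdds=" + r[1]);
        // prints: flushes=1 bufferedAdds=3
    }
}
```

The flush fires on the second add of step (3), so 3 added docs are left in the buffer whatever value of x you pick (as long as x is at least 4).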

Thus, it is also a commit.

Assuming you're talking about trunk at this point (not 2.3), because only trunk distinguishes commit() vs flush(): there's no guarantee exactly when IW does a commit() when autoCommit is true. Also, autoCommit is deprecated, meaning in 3.0 it will be hardwired to false, so your application must commit() or close() when it's necessary.

STAGE 3 :
Add x-10 docs one after another.   Total docs in memory = x-10   (5)

Actually x-7 in memory now: the 3 added docs still buffered after the flush, plus these x-10.

NOW ... I call deleteDocuments(Term term), which has potential matches at
two places :                                                          (6)
a) x-15 (out of x-10) documents currently residing in memory.
b) x-20 (out of x) documents currently in the index directory.

IndexWriter.close() is called.                                        (7)


Now, my question is, will the index contain
(i) a total of (x) + (x-10) - (x-15) documents
(ii) a total of (x) + (x-10) - (x-20) documents
(iii) a total of (x) + (x-10) - (x-15) - (x-20) documents

The index will contain (iii), corrected to (x) + (x-7) - (x-15) - (x-20). I.e., the deletion always applies to all matching documents, whether already flushed or still buffered in RAM. The deletion is fully independent of what buffering IW is doing.
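
As a sanity check on that arithmetic, here is a small plain-Java sketch. It is a toy model, not Lucene code: DeleteScopeSketch and finalDocCount are made-up names, each document is reduced to a single boolean ("does it match the delete term?"), and x must be at least 20 for the counts to make sense. It follows the arithmetic in this thread and does not model the 3 deletions from Stage 1.

```java
import java.util.ArrayList;
import java.util.List;

class DeleteScopeSketch {
    /** Returns the number of docs left after the delete-by-term and close(). */
    static int finalDocCount(int x) {
        // true = this doc matches the term passed to the delete
        List<Boolean> flushed = new ArrayList<>();
        List<Boolean> buffered = new ArrayList<>();

        // x docs already in the index directory, x-20 of them match
        for (int i = 0; i < x; i++) flushed.add(i < x - 20);
        // x-7 docs still buffered in RAM, x-15 of them match
        for (int i = 0; i < x - 7; i++) buffered.add(i < x - 15);

        // The delete applies to every matching doc, wherever it lives
        flushed.removeIf(matches -> matches);
        buffered.removeIf(matches -> matches);

        // close() flushes whatever is still buffered
        return flushed.size() + buffered.size();
    }

    public static void main(String[] args) {
        int x = 30;
        int expected = x + (x - 7) - (x - 15) - (x - 20);
        System.out.println(finalDocCount(x) + " (formula gives " + expected + ")");
        // prints: 28 (formula gives 28)
    }
}
```

Note that the x terms cancel in (x) + (x-7) - (x-15) - (x-20), so with these particular numbers the total comes out to 28 for any valid x.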

Secondly, will the answer change had I opened the IndexWriter in autocommit
= false mode?

No, you get the same result. autoCommit simply affects *when* the changes become visible/durable to an external reader, not what changes occur. Any series of changes using an IW will produce the same final result regardless of autoCommit, assuming we're not talking about the JRE or the machine crashing, etc.

Several other permutations of (autocommit mode), (points of flush call), (points of close call) exist, but I guess I will be fine if I get the answer to the first question itself. A little explanation would be much appreciated.

autoCommit, points of flushing, points of committing, points of merging, etc., should be fully independent of what changes (adds/deletes) you are doing.
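
A rough way to picture this independence is the plain-Java toy below. Again, this is a sketch and not Lucene code: BufferingIndependenceSketch, the integer doc ids, and the "delete all even ids" rule are assumptions made up for illustration. It models only the flush threshold, not autoCommit or merging. The same sequence of adds and one delete is replayed with different buffer sizes, and the surviving docs are compared.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class BufferingIndependenceSketch {
    /** Replays a fixed add/delete sequence and returns the surviving doc ids. */
    static Set<Integer> run(int maxBufferedDocs) {
        List<Integer> flushed = new ArrayList<>();
        List<Integer> buffered = new ArrayList<>();

        for (int id = 0; id < 20; id++) {
            buffered.add(id);
            if (buffered.size() >= maxBufferedDocs) {  // flush point depends on the setting
                flushed.addAll(buffered);
                buffered.clear();
            }
            if (id == 11) {
                // One delete, issued right after doc 11 is added: it removes every
                // even id added so far, whether already flushed or still buffered
                flushed.removeIf(d -> d % 2 == 0);
                buffered.removeIf(d -> d % 2 == 0);
            }
        }

        flushed.addAll(buffered);  // close() flushes the remainder
        return new TreeSet<>(flushed);
    }

    public static void main(String[] args) {
        System.out.println(run(3).equals(run(7)));  // prints: true
        System.out.println(run(3).size());          // prints: 14
    }
}
```

Changing the threshold moves the flush points around, but the set of surviving docs stays the same.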

Mike
