java_is_everything wrote:
Hi all.
This may seem a longish and informal mail, but do correct me if my
assumptions are wrong anywhere, otherwise my actual doubt will make no
sense.
Say I opened an IndexWriter on an initially empty directory, using
autocommit = true. Now, what I do is add and delete documents
randomly. I
set "x" as maxBufferedDocs and "y" as maxBufferedDeleteTerms (x < y).
IndexWritrer starts its work. Now, I perfom the following sequences :
STAGE 1 :
Add "x-2" documents one after the another. Total docs
in
memory = x-2 (1)
Delete 3 docs from memory
Total docs
in memory = x-5 (2)
Add 5 docs one after another
Total docs
in memory = x (3)
STAGE 2 :
A flush happens, sice maxBufferedDocs reached. Total docs in
memory = 0 (4)
One correction here: the added doc count that triggers a flush does
*not* measure deletions. So, in your step (3) above, after having
added 2 of the 5 docs, IW will flush. Then it has 3 added docs
buffered in RAM.
Thus, it is also a commit.
Assuming you're talking about trunk at this point (not 2.3), because
only trunk distinguishes commit() vs flush(): there's no guarantee
exactly when IW does a commit() when autoCommit is true. Also,
autoCommit is deprecated, meaning in 3.0 it will be hardwired to
false, so your application must commit() or close() when it's necessary.
STAGE 3 :
Add x-10 docs one after other
Total docs
in memory = x-10 (5)
Actually x-7 in memory now.
NOW ... I call deleteDocuments(Term term), which has potential
matches at
two places :
a) x-15 (out of x-10) documents currently residing in memory.
b) x-20 (out of x) documents currently in the index directory.
(6)
IndexWriter.close() is called
(7)
Now, my question is, will the index contain
(i) a total of (x) + (x-10) - (x-15) documents
(ii) a total of (x) + (x-10) - (x-20) documents
(iii) a total of (x) + (x-10) - (x-15) - (x-20) documents
Index will contain (iii), corrected to (x) + (x-7) - (x-15) - (x-20).
Ie the deletion always applies to any documents, flushed or buffered
in RAM. The deletion is fully independent of what buffering IW is
doing.
Secondly, will the answer change had I opened the IndexWriter in
autocommit
= false mode?
No, you get the same result. autoCommit simply affects *when* the
changes become visible/durable to an external reader, not what changes
occur. Any series of changes using an IW will produce the same final
result regardless of autoCommit, assuming we're not talking about JRE/
machine's crashing, etc.
Several other permutations of (autocommit mode), (points of flush
call),
(points of close call) exist, but I guess I will be fine if I get
the answer
to the first question itself. A little explanation will be highly
obliged.
autoCommit, points of flushing, points of committing, points of
merging, etc, should be fully independent of what changes (add/
deletes) you are doing.
Mike
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]