When you write to a file, modern operating systems by default buffer the writes in memory rather than writing them to disk immediately. Modern hard drives do the same: after the OS flushes to the drive, the drive itself may just hold the writes in its own cache. Later, at a convenient time, the buffered writes are written to disk in the background. Both layers do this to get better write performance.

The fsync() call, which is an OS-level call, requests that all buffered bytes be flushed to the real underlying storage ("stable storage"). It is not supposed to return until all written bytes are on stable storage. Lucene relies on this by fsync'ing all referenced files in the index before deleting the files referenced by previous commits. So, as of 2.4, the index will remain consistent even if the OS or computer crashes, or power is cut.
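The ordering described above (sync the new files first, delete the old ones only afterwards) can be sketched in plain Java. This is only an illustration, not Lucene's actual code: the class name, method name and file names are made up for the example, and it assumes that FileChannel.force(true) is how the JVM asks the OS for an fsync-style flush on your platform.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SyncThenDelete {

    /**
     * Writes data to newFile, asks the OS to flush it to stable storage,
     * and only then deletes oldFile. Hypothetical helper for illustration.
     */
    public static void writeSyncDelete(Path newFile, byte[] data, Path oldFile)
            throws IOException {
        try (FileChannel ch = FileChannel.open(newFile,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf); // may only reach the OS buffer cache
            }
            // Request that content and metadata be forced to the storage device.
            // If this throws, we never reach the delete below.
            ch.force(true);
        }
        // The old file is removed only after the new one has been synced.
        Files.deleteIfExists(oldFile);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sync-demo");
        Path oldFile = dir.resolve("commit_1"); // made-up file names
        Path newFile = dir.resolve("commit_2");
        Files.write(oldFile, "old commit".getBytes(StandardCharsets.UTF_8));

        writeSyncDelete(newFile, "new commit".getBytes(StandardCharsets.UTF_8), oldFile);

        System.out.println("old file still exists: " + Files.exists(oldFile));
        System.out.println("new file exists: " + Files.exists(newFile));
    }
}
```

If the machine crashes before force() returns, the old file is still there, so a reader can fall back to it. If it crashes afterwards, the new file is supposed to be safely on disk.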

Unfortunately, there are apparently some devices that return immediately when fsync() is called, even though the bytes have not actually been written to stable storage. If you have such a device that "lies", then Lucene 2.4 won't be able to guarantee index consistency after a crash or power outage.

Mike

Cam Bazz wrote:

Hello,

What do you mean by IO system lying on fsync?

Best.

On Mon, Mar 17, 2008 at 10:40 AM, Michael McCandless <[EMAIL PROTECTED]> wrote:


Yes that's already been committed to trunk as well.

IndexWriter now has a commit() method which syncs all referenced
files in the index to stable storage (assuming your IO system doesn't
"lie" on fsync).

Mike

On Mar 17, 2008, at 4:33 AM, Cam Bazz wrote:

Nice. Thanks.

Will 2.4 have the commit improvements that we previously talked about?

best regards.

-C.B.

On Mon, Mar 17, 2008 at 10:31 AM, Michael McCandless <[EMAIL PROTECTED]> wrote:


The trunk version of Lucene (eventually 2.4) now has deletion by
query, in IndexWriter.

Mike

Cam Bazz wrote:

Hello Erick,

Has anyone found a way to delete a document with a query? I understand
it can be deleted via terms, but I need to delete a document by two
terms; that is, the only way I can identify my document is by looking
at two terms, not one.

best.

On Fri, Mar 14, 2008 at 4:58 PM, Erick Erickson <[EMAIL PROTECTED]> wrote:

Doc IDs are assigned at index time and can change over time. That is,
deleting a document and optimizing (and other operations) can and will
change document IDs. So, yes, you have to do a search (either use a
Hits object or one of the HitCollectors) in order to delete by doc ID.

You can also delete by terms, see the API.

There are other options, but you haven't explained what you're
trying to accomplish well enough for me to offer any more suggestions.

Best
Erick

On Wed, Mar 12, 2008 at 5:44 PM, varun sood <[EMAIL PROTECTED]> wrote:

No, I haven't, but I will, even though I would like to make my own
implementation. So, any idea how to get the "doc num"?

Thanks for replying.
Varun

On Wed, Mar 12, 2008 at 5:15 PM, Mark Miller <[EMAIL PROTECTED]> wrote:

Have you seen the work that Mark Harwood has done making a GWT version
of Luke? I think it's in the latest release.

varun sood wrote:
Hi,
I am trying to delete a document without using the Hits object.
What is the unique field in the index that I can use to delete the
document?

I am trying to make a web interface where the index can be modified, a
smaller subset of what Luke does, but using JSPs and Servlets.

To use deleteDocument(int docNum) I need docNum. How can I get this?
Or does it have to come only via Hits?

Thanks,
Varun



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: java-user- [EMAIL PROTECTED]





