[
https://issues.apache.org/jira/browse/LUCENE-3237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13965214#comment-13965214
]
Michael McCandless commented on LUCENE-3237:
--------------------------------------------
Thanks Simon.
bq. Hey mike, thanks for reopening this.
I actually didn't reopen yet ... because I do think this really is
paranoia. The OS man pages make the semantics clear, and what we are
doing today (reopen the file for syncing) is correct.
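For readers who have not looked at the code, the reopen-then-sync approach can be sketched roughly as below. This is a minimal illustration, not the actual FSDirectory code; the class name `ReopenSync` and its method names are made up for the example.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper illustrating "reopen the file for syncing".
final class ReopenSync {

    /** Writes the bytes through one descriptor and closes it without syncing. */
    static void write(File file, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file)) {
            out.write(data);
        }
    }

    /** Reopens the file by name and calls sync() on the new descriptor. */
    static void fsync(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.getFD().sync();
        }
    }
}
```

The sync goes through a different descriptor than the one that did the writing, which is exactly the behavior the issue questions.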
bq. I like the fact that we get rid of the general unsynced files stuff in
Directory.
bq. given the last point we move it in the right place inside IW that is where
it should be
Yeah I really like that.
But, we could do that separately, i.e. add private tracking inside IW
of which newly written file names haven't been sync'd.
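Something like the following is what I have in mind for the private tracking; it is only a sketch, and `UnsyncedFileTracker` and its methods are hypothetical names, not existing IndexWriter code.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical bookkeeping for file names written but not yet fsynced.
final class UnsyncedFileTracker {
    private final Set<String> unsynced = new HashSet<>();

    /** Records that a file was newly written and still needs an fsync. */
    synchronized void fileWritten(String name) {
        unsynced.add(name);
    }

    /** Returns the subset of the given names that still need an fsync. */
    synchronized List<String> pending(Collection<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (unsynced.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }

    /** Forgets names once the caller has successfully fsynced them. */
    synchronized void markSynced(Collection<String> names) {
        unsynced.removeAll(names);
    }
}
```

On commit, the writer would ask for `pending(filesInCommit)`, sync just those, and then call `markSynced`.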
bq. the problem that the current patch has is that it holds on to the buffers
in BufferedIndexOutput. I think we need to work around this; here are a couple
of ideas:
bq. introduce a SyncHandle class that we can pull from IndexOutput that allows
to close the IndexOutput but lets you fsync after the fact
I think that's a good idea. For FSDir impls this is just a thin
wrapper around FileDescriptor.
bq. this handle can be refcounted internally and we just decrement the count on
IndexOutput#close() as well as on SyncHandle#close()
bq. we can just hold on to the SyncHandle until we need to sync in IW
Ref counting may be overkill? Who else will be pulling/sharing this
sync handle? Maybe we can add an "IndexOutput.closeToSyncHandle": the
IndexOutput flushes and is unusable from then on, but returns the sync
handle, which the caller must later close.
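A rough sketch of the closeToSyncHandle shape, under assumptions: `SketchOutput` stands in for an FS-backed IndexOutput, neither class exists in Lucene, and the handle wraps a FileChannel here rather than a raw FileDescriptor purely to keep the example short.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical handle: keeps the descriptor open so it can be fsynced later.
final class SyncHandle implements Closeable {
    private final FileChannel channel;

    SyncHandle(FileChannel channel) {
        this.channel = channel;
    }

    /** Forces file content and metadata written via this descriptor to storage. */
    void sync() throws IOException {
        channel.force(true);
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}

// Hypothetical stand-in for an FS-backed IndexOutput.
final class SketchOutput {
    private final FileChannel channel;
    private boolean closed;

    SketchOutput(Path path) throws IOException {
        channel = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING);
    }

    void writeBytes(byte[] bytes) throws IOException {
        if (closed) {
            throw new IllegalStateException("output is closed");
        }
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        while (buffer.hasRemaining()) {
            channel.write(buffer);
        }
    }

    /** Makes this output unusable and hands the open descriptor to the caller. */
    SyncHandle closeToSyncHandle() {
        if (closed) {
            throw new IllegalStateException("output is closed");
        }
        // A buffered output would flush here; this sketch writes straight through.
        closed = true;
        return new SyncHandle(channel);
    }
}
```

The caller owns the returned handle: it calls `sync()` when the commit needs it and must `close()` it afterwards, so no ref counting is involved.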
One downside of moving to this API is ... it rules out writing some
bytes, fsyncing, writing some more, fsyncing, e.g. if we wanted to add
a transaction log impl on top of Lucene. But I think that's OK
(design for today). There are other limitations in IndexOutput for
an xlog impl...
bq. since this will basically close the underlying FD later we might want to
think about size-bounding the number of unsynced files and maybe let indexing
threads fsync them concurrently? maybe something we can do later.
bq. if we know we flush for commit we can already fsync directly which might
save resources / time since it might be concurrent
Yeah we can pursue this in "phase 2". The OS will generally move
dirty buffers to stable storage over time anyway, so the cost of
fsyncing files written (relatively) long ago (tens of seconds; on Linux
I think the default is usually 30 seconds) will usually be low. The
problem is that on some filesystems fsync can be unexpectedly costly
(there was a "famous" case in ext3,
https://bugzilla.mozilla.org/show_bug.cgi?id=421482, but this has been
fixed), so we need to be careful about this.
> FSDirectory.fsync() may not work properly
> -----------------------------------------
>
> Key: LUCENE-3237
> URL: https://issues.apache.org/jira/browse/LUCENE-3237
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/store
> Reporter: Shai Erera
> Attachments: LUCENE-3237.patch
>
>
> Spinoff from LUCENE-3230. FSDirectory.fsync() opens a new RAF, sync() its
> FileDescriptor and closes RAF. It is not clear that this syncs whatever was
> written to the file by other FileDescriptors. It would be better if we do
> this operation on the actual RAF/FileOS which wrote the data. We can add
> sync() to IndexOutput and FSIndexOutput will do that.
> Directory-wise, we should stop syncing on file names, and instead sync on the
> IOs that performed the write operations.