Let me try to consolidate the discussion. 1. We need to flush the metadata whenever we cause metadata changes due to data changes. Mostly Append/Filesize change. 2. I don't know any operations where we perform "metadata operation" only and we needed it to be "persisted" Even if we can think/come up with some obscure cases of #2, majority of of our use-case falls into #1. So why not put force(true) all the time? What cases we may incur penalty? If we pass true, then there is no debate on implementation/source.
What do you guys say? JV On Wed, May 10, 2017 at 4:42 PM, Sijie Guo <guosi...@gmail.com> wrote: > > > On Thu, May 11, 2017 at 2:28 AM, Charan Reddy G <reddychara...@gmail.com> > wrote: > >> @JV and @Sijie I think Android java source is not the appropriate one to >> look into. For FileChannelImpl we need to look into jdk code >> >> http://grepcode.com/file/repository.grepcode.com/java/root/ >> jdk/openjdk/8u40-b25/sun/nio/ch/FileChannelImpl.java#FileCh >> annelImpl.force%28boolean%29 >> >> >> public void force(boolean metaData) throws IOException { >> ensureOpen(); >> int rv = -1; >> int ti = -1; >> try { >> begin(); >> ti = threads.add(); >> if (!isOpen()) >> return; >> do { >> rv = nd.force(fd, metaData); >> } while ((rv == IOStatus.INTERRUPTED) && isOpen()); >> } finally { >> threads.remove(ti); >> end(rv > -1); >> assert IOStatus.check(rv); >> } >> } >> >> here 'nd' is of type FileDispatcherImpl. I tried to look into its code >> but since it is native code, we wont be able to read that >> >> http://grepcode.com/file/repository.grepcode.com/java/root/ >> jdk/openjdk/8u40-b25/sun/nio/ch/FileDispatcherImpl.java# >> FileDispatcherImpl >> >> So it is not clear if it is just difference of fdatasync() and fsync(). >> Even if we assume that it is the difference of fdatasync() and fsync(), >> man pages of fdatasync() says >> >> "fdatasync() is similar to fsync(), but does not flush modified metadata >> unless that metadata is needed in order to allow a subsequent data >> retrieval to be correctly handled. For example, changes to st_atime >> or st_mtime (respectively, time of last access and time of last >> modification; see stat(2)) do not require flushing because they are not >> necessary for a subsequent data read to be handled correctly. *On the >> other hand, a change to the file size (st_size, as made by say >> ftruncate(2)), would require a metadata flush.**"* >> >> > fdatasync - "does not flush modified metadata unless that metadata is > needed in order to allow a subsequent data retrieval to be correctly > handled" > > In practice, this means that unless the file size changed, the file > metadata will not be synced when calling fdatasync. > > >> Also, when we look into java API for force method - >> https://docs.oracle.com/javase/7/docs/api/java/nio/channel >> s/FileChannel.html#force(boolean), it says the implementation is >> unspecified and system-dependent. So I don't think it is safe to make any >> assumptions here. >> >> "The metaData parameter can be used to limit the number of I/O >> operations that this method is required to perform. Passing false for >> this parameter indicates that only updates to the file's content need be >> written to storage; passing true indicates that updates to both the >> file's content and metadata must be written, which generally requires at >> least one more I/O operation. Whether this parameter actually has any >> effect is dependent upon the underlying operating system and is therefore >> unspecified. Invoking this method may cause an I/O operation to occur >> even if the channel was only opened for reading. Some operating systems, >> for example, maintain a last-access time as part of a file's metadata, and >> this time is updated whenever the file is read. Whether or not this is >> actually done is system-dependent and is therefore unspecified." >> >> Thanks, >> Charan >> >> On Wed, May 10, 2017 at 4:07 AM, Sijie Guo <guosi...@gmail.com> wrote: >> >>> >>> >>> On May 10, 2017 12:42 PM, "Venkateswara Rao Jujjuri" <jujj...@gmail.com> >>> wrote: >>> >>> Looked at the source and it is fdatasync() >>> https://android.googlesource.com/platform/libcore/+/2496a68/ >>> luni/src/main/java/java/nio/FileChannelImpl.java >>> if (metadata) { >>> Libcore.os.fsync(fd); >>> } else { >>> Libcore.os.fdatasync(fd); >>> } >>> >>> So, agreed. fdatasync() fluesh metadata when the file size etc changes, >>> and avoids metadata flush if no data related changes but >>> but that tells me there may be no situation for us to call force(true) >>> in the code. >>> But our code calls it with true in some code path, should we change >>> everything to false for consistency >>> >>> >>> I don't remember that we called force(true) on file channel. Do you mean >>> the buffered channel? >>> >>> >>> JV >>> >>> On Tue, May 9, 2017 at 4:40 PM, Sijie Guo <guosi...@gmail.com> wrote: >>> >>>> We don't force flush metadata in bookies. Because: >>>> >>>> - flush metadata requires more I/O operations. >>>> - flush metadata usually updates file metadata information (e.g. access >>>> permissions, last modification/access time). >>>> >>>> The difference of force(false) and force(true) is the difference >>>> between fdatasync() and fsync(). The former one update last modification >>>> time, while the latter one doesn't. We don't care about the metadata >>>> information like access/modification time, that's the reason we only called >>>> force(false). >>>> >>>> - Sijie >>>> >>>> >>>> On Sat, May 6, 2017 at 9:10 AM, Charan Reddy G <reddychara...@gmail.com >>>> > wrote: >>>> >>>>> By metadata I mean metadata of file >>>>> >>>>> in BufferedChannel.java >>>>> >>>>> public void flush(boolean shouldForceWrite) throws IOException { >>>>> synchronized(this) { >>>>> flushInternal(); >>>>> } >>>>> if (shouldForceWrite) { >>>>> forceWrite(false); <------------- false here >>>>> } >>>>> } >>>>> >>>>> public long forceWrite(boolean forceMetadata) throws IOException { >>>>> // This is the point up to which we had flushed to the file >>>>> system page cache >>>>> // before issuing this force write hence is guaranteed to be >>>>> made durable by >>>>> // the force write, any flush that happens after this may or >>>>> may >>>>> // not be flushed >>>>> long positionForceWrite = writeBufferStartPosition.get(); >>>>> fileChannel.force(forceMetadata); <--------------- >>>>> forcemetadata false >>>>> return positionForceWrite; >>>>> } >>>>> >>>>> On Fri, May 5, 2017 at 5:56 PM, Charan Reddy G < >>>>> reddychara...@gmail.com> wrote: >>>>> >>>>>> Hey, >>>>>> >>>>>> In GarbageCollectorThread.doCompactEntryLogs { >>>>>> >>>>>> ... >>>>>> >>>>>> try { >>>>>> compactEntryLog(scannerFactory, meta); >>>>>> scannerFactory.flush(); <--------------- >>>>>> this will eventually call entrylogger.flushCurrentLog and it force writes >>>>>> the content of the BufferedChannel but not the metadata of the file >>>>>> >>>>>> LOG.info("Removing entry log {} after compaction", >>>>>> meta.getEntryLogId()); >>>>>> removeEntryLog(meta.getEntryLogId()); <----------- >>>>>> this will delete the compacted entrylog >>>>>> >>>>>> } >>>>>> ... >>>>>> >>>>>> >>>>>> in doCompactEntryLogs, we first write the non-deleted entries of the >>>>>> entryLog which is getting compacted to the currentLog in entryLogger, >>>>>> then >>>>>> we flush the entrylogger before deleting the compacted entrylog. But when >>>>>> currentLog is flushed by the entrylogger, it flushes only the content of >>>>>> the file but not the metadata. After flush is completed the compacted >>>>>> entrylog is deleted. It is not ok to not to force flush the metadata of >>>>>> the >>>>>> currentLog for the persisted (checkpointed) data. The filesystem >>>>>> behaviour >>>>>> is unexpected in this case and there is possibility of data loss if the >>>>>> Bookie crashes before closing that logchannel. >>>>>> >>>>>> Thanks, >>>>>> Charan >>>>>> >>>>> >>>>> >>>> >>> >>> >>> -- >>> Jvrao >>> --- >>> First they ignore you, then they laugh at you, then they fight you, then >>> you win. - Mahatma Gandhi >>> >>> >>> >>> >> > -- Jvrao --- First they ignore you, then they laugh at you, then they fight you, then you win. - Mahatma Gandhi