Yes; My question is what is the impact we are taking by doing force(true) all the time. Just to avoid any concern caused by Charan's statement about "java API for force method - https://docs.oracle.com/javase/7/docs/api/java/nio/ channels/FileChannel.html#force(boolean), it says the implementation is unspecified and system-dependent. "
On Wed, May 10, 2017 at 5:09 PM, Sijie Guo <guosi...@gmail.com> wrote: > > > On Thu, May 11, 2017 at 8:02 AM, Venkateswara Rao Jujjuri < > jujj...@gmail.com> wrote: > >> That is correct. But we are talking about active entrylog file(s) and >> journal file(s). >> Do you see lot of active journal files that won't get updated but read >> for long time? >> > > I am a bit confused about this question here. When talking about > durability, entry log files and journal files should be same, right? > > > >> Anyway just trying broaden the discussion and if we have any doubt if >> java does fdatasync all the time. >> >> On Wed, May 10, 2017 at 4:57 PM, Matteo Merli <matteo.me...@gmail.com> >> wrote: >> >>> My understanding is that fsync() forces the IO on the metadata even if >>> the file size has not changed, for example to update the access timestamp. >>> >>> On Wed, May 10, 2017 at 4:47 PM Venkateswara Rao Jujjuri < >>> jujj...@gmail.com> wrote: >>> >>>> Let me try to consolidate the discussion. >>>> >>>> 1. We need to flush the metadata whenever we cause metadata changes due >>>> to data changes. Mostly Append/Filesize change. >>>> 2. I don't know any operations where we perform "metadata operation" >>>> only and we needed it to be "persisted" >>>> Even if we can think/come up with some obscure cases of #2, majority of >>>> of our use-case falls into #1. >>>> So why not put force(true) all the time? What cases we may incur >>>> penalty? If we pass true, then there is no debate on implementation/source. >>>> >>>> What do you guys say? >>>> JV >>>> >>>> On Wed, May 10, 2017 at 4:42 PM, Sijie Guo <guosi...@gmail.com> wrote: >>>> >>>>> >>>>> >>>>> On Thu, May 11, 2017 at 2:28 AM, Charan Reddy G < >>>>> reddychara...@gmail.com> wrote: >>>>> >>>>>> @JV and @Sijie I think Android java source is not the appropriate >>>>>> one to look into. For FileChannelImpl we need to look into jdk code >>>>>> >>>>>> http://grepcode.com/file/repository.grepcode.com/java/root/j >>>>>> dk/openjdk/8u40-b25/sun/nio/ch/FileChannelImpl.java#FileChan >>>>>> nelImpl.force%28boolean%29 >>>>>> >>>>>> >>>>>> public void force(boolean metaData) throws IOException { >>>>>> ensureOpen(); >>>>>> int rv = -1; >>>>>> int ti = -1; >>>>>> try { >>>>>> begin(); >>>>>> ti = threads.add(); >>>>>> if (!isOpen()) >>>>>> return; >>>>>> do { >>>>>> rv = nd.force(fd, metaData); >>>>>> } while ((rv == IOStatus.INTERRUPTED) && isOpen()); >>>>>> } finally { >>>>>> threads.remove(ti); >>>>>> end(rv > -1); >>>>>> assert IOStatus.check(rv); >>>>>> } >>>>>> } >>>>>> >>>>>> here 'nd' is of type FileDispatcherImpl. I tried to look into its >>>>>> code but since it is native code, we wont be able to read that >>>>>> >>>>>> http://grepcode.com/file/repository.grepcode.com/java/root/j >>>>>> dk/openjdk/8u40-b25/sun/nio/ch/FileDispatcherImpl.java#FileD >>>>>> ispatcherImpl >>>>>> >>>>>> So it is not clear if it is just difference of fdatasync() and >>>>>> fsync(). Even if we assume that it is the difference of fdatasync() >>>>>> and fsync(), man pages of fdatasync() says >>>>>> >>>>>> "fdatasync() is similar to fsync(), but does not flush modified >>>>>> metadata unless that metadata is needed in order to allow a subsequent >>>>>> data >>>>>> retrieval to be correctly handled. For example, changes to >>>>>> st_atime or st_mtime (respectively, time of last access and time of last >>>>>> modification; see stat(2)) do not require flushing because they are not >>>>>> necessary for a subsequent data read to be handled correctly. *On >>>>>> the other hand, a change to the file size (st_size, as made by say >>>>>> ftruncate(2)), would require a metadata flush.**"* >>>>>> >>>>>> >>>>> fdatasync - "does not flush modified metadata unless that metadata is >>>>> needed in order to allow a subsequent data retrieval to be correctly >>>>> handled" >>>>> >>>>> In practice, this means that unless the file size changed, the file >>>>> metadata will not be synced when calling fdatasync. >>>>> >>>>> >>>>>> Also, when we look into java API for force method - >>>>>> https://docs.oracle.com/javase/7/docs/api/java/nio/channel >>>>>> s/FileChannel.html#force(boolean), it says the implementation is >>>>>> unspecified and system-dependent. So I don't think it is safe to make any >>>>>> assumptions here. >>>>>> >>>>>> "The metaData parameter can be used to limit the number of I/O >>>>>> operations that this method is required to perform. Passing false for >>>>>> this parameter indicates that only updates to the file's content need be >>>>>> written to storage; passing true indicates that updates to both the >>>>>> file's content and metadata must be written, which generally requires at >>>>>> least one more I/O operation. Whether this parameter actually has any >>>>>> effect is dependent upon the underlying operating system and is therefore >>>>>> unspecified. Invoking this method may cause an I/O operation to >>>>>> occur even if the channel was only opened for reading. Some operating >>>>>> systems, for example, maintain a last-access time as part of a file's >>>>>> metadata, and this time is updated whenever the file is read. Whether or >>>>>> not this is actually done is system-dependent and is therefore >>>>>> unspecified. >>>>>> " >>>>>> >>>>>> Thanks, >>>>>> Charan >>>>>> >>>>>> On Wed, May 10, 2017 at 4:07 AM, Sijie Guo <guosi...@gmail.com> >>>>>> wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> On May 10, 2017 12:42 PM, "Venkateswara Rao Jujjuri" < >>>>>>> jujj...@gmail.com> wrote: >>>>>>> >>>>>>> Looked at the source and it is fdatasync() >>>>>>> https://android.googlesource.com/platform/libcore/+/2496a68/ >>>>>>> luni/src/main/java/java/nio/FileChannelImpl.java >>>>>>> if (metadata) { >>>>>>> Libcore.os.fsync(fd); >>>>>>> } else { >>>>>>> Libcore.os.fdatasync(fd); >>>>>>> } >>>>>>> >>>>>>> So, agreed. fdatasync() fluesh metadata when the file size etc >>>>>>> changes, and avoids metadata flush if no data related changes but >>>>>>> but that tells me there may be no situation for us to call >>>>>>> force(true) in the code. >>>>>>> But our code calls it with true in some code path, should we change >>>>>>> everything to false for consistency >>>>>>> >>>>>>> >>>>>>> I don't remember that we called force(true) on file channel. Do you >>>>>>> mean the buffered channel? >>>>>>> >>>>>>> >>>>>>> JV >>>>>>> >>>>>>> On Tue, May 9, 2017 at 4:40 PM, Sijie Guo <guosi...@gmail.com> >>>>>>> wrote: >>>>>>> >>>>>>>> We don't force flush metadata in bookies. Because: >>>>>>>> >>>>>>>> - flush metadata requires more I/O operations. >>>>>>>> - flush metadata usually updates file metadata information (e.g. >>>>>>>> access permissions, last modification/access time). >>>>>>>> >>>>>>>> The difference of force(false) and force(true) is the difference >>>>>>>> between fdatasync() and fsync(). The former one update last >>>>>>>> modification >>>>>>>> time, while the latter one doesn't. We don't care about the metadata >>>>>>>> information like access/modification time, that's the reason we only >>>>>>>> called >>>>>>>> force(false). >>>>>>>> >>>>>>>> - Sijie >>>>>>>> >>>>>>>> >>>>>>>> On Sat, May 6, 2017 at 9:10 AM, Charan Reddy G < >>>>>>>> reddychara...@gmail.com> wrote: >>>>>>>> >>>>>>>>> By metadata I mean metadata of file >>>>>>>>> >>>>>>>>> in BufferedChannel.java >>>>>>>>> >>>>>>>>> public void flush(boolean shouldForceWrite) throws IOException >>>>>>>>> { >>>>>>>>> synchronized(this) { >>>>>>>>> flushInternal(); >>>>>>>>> } >>>>>>>>> if (shouldForceWrite) { >>>>>>>>> forceWrite(false); <------------- false here >>>>>>>>> } >>>>>>>>> } >>>>>>>>> >>>>>>>>> public long forceWrite(boolean forceMetadata) throws >>>>>>>>> IOException { >>>>>>>>> // This is the point up to which we had flushed to the >>>>>>>>> file system page cache >>>>>>>>> // before issuing this force write hence is guaranteed to >>>>>>>>> be made durable by >>>>>>>>> // the force write, any flush that happens after this may >>>>>>>>> or may >>>>>>>>> // not be flushed >>>>>>>>> long positionForceWrite = writeBufferStartPosition.get(); >>>>>>>>> fileChannel.force(forceMetadata); <--------------- >>>>>>>>> forcemetadata false >>>>>>>>> return positionForceWrite; >>>>>>>>> } >>>>>>>>> >>>>>>>>> On Fri, May 5, 2017 at 5:56 PM, Charan Reddy G < >>>>>>>>> reddychara...@gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Hey, >>>>>>>>>> >>>>>>>>>> In GarbageCollectorThread.doCompactEntryLogs { >>>>>>>>>> >>>>>>>>>> ... >>>>>>>>>> >>>>>>>>>> try { >>>>>>>>>> compactEntryLog(scannerFactory, meta); >>>>>>>>>> scannerFactory.flush(); >>>>>>>>>> <--------------- this will eventually call >>>>>>>>>> entrylogger.flushCurrentLog and >>>>>>>>>> it force writes the content of the BufferedChannel but not the >>>>>>>>>> metadata of >>>>>>>>>> the file >>>>>>>>>> >>>>>>>>>> LOG.info("Removing entry log {} after >>>>>>>>>> compaction", meta.getEntryLogId()); >>>>>>>>>> removeEntryLog(meta.getEntryLogId()); >>>>>>>>>> <----------- this will delete the compacted entrylog >>>>>>>>>> >>>>>>>>>> } >>>>>>>>>> ... >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> in doCompactEntryLogs, we first write the non-deleted entries of >>>>>>>>>> the entryLog which is getting compacted to the currentLog in >>>>>>>>>> entryLogger, >>>>>>>>>> then we flush the entrylogger before deleting the compacted >>>>>>>>>> entrylog. But >>>>>>>>>> when currentLog is flushed by the entrylogger, it flushes only the >>>>>>>>>> content >>>>>>>>>> of the file but not the metadata. After flush is completed the >>>>>>>>>> compacted >>>>>>>>>> entrylog is deleted. It is not ok to not to force flush the metadata >>>>>>>>>> of the >>>>>>>>>> currentLog for the persisted (checkpointed) data. The filesystem >>>>>>>>>> behaviour >>>>>>>>>> is unexpected in this case and there is possibility of data loss if >>>>>>>>>> the >>>>>>>>>> Bookie crashes before closing that logchannel. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Charan >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Jvrao >>>>>>> --- >>>>>>> First they ignore you, then they laugh at you, then they fight you, >>>>>>> then you win. - Mahatma Gandhi >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>>> >>>> -- >>>> Jvrao >>>> --- >>>> First they ignore you, then they laugh at you, then they fight you, >>>> then you win. - Mahatma Gandhi >>>> >>>> >>>> >> >> >> -- >> Jvrao >> --- >> First they ignore you, then they laugh at you, then they fight you, then >> you win. - Mahatma Gandhi >> >> >> > -- Jvrao --- First they ignore you, then they laugh at you, then they fight you, then you win. - Mahatma Gandhi