Let me try to consolidate the discussion.

1. We need to flush the metadata whenever we cause metadata changes due to
data changes. Mostly Append/Filesize change.
2. I don't know any operations where we perform "metadata operation" only
and we needed it to be "persisted"
Even if we can think/come up with some obscure cases of #2, majority of of
our use-case falls into #1.
So why not put force(true) all the time? What cases we may incur penalty?
If we pass true, then there is no debate on implementation/source.

What do you guys say?
JV

On Wed, May 10, 2017 at 4:42 PM, Sijie Guo <guosi...@gmail.com> wrote:

>
>
> On Thu, May 11, 2017 at 2:28 AM, Charan Reddy G <reddychara...@gmail.com>
> wrote:
>
>>  @JV and @Sijie I think Android java source is not the appropriate one to
>> look into. For FileChannelImpl we need to look into jdk code
>>
>> http://grepcode.com/file/repository.grepcode.com/java/root/
>> jdk/openjdk/8u40-b25/sun/nio/ch/FileChannelImpl.java#FileCh
>> annelImpl.force%28boolean%29
>>
>>
>>     public void force(boolean metaData) throws IOException {
>>         ensureOpen();
>>         int rv = -1;
>>         int ti = -1;
>>         try {
>>             begin();
>>             ti = threads.add();
>>             if (!isOpen())
>>                 return;
>>             do {
>>                 rv = nd.force(fd, metaData);
>>             } while ((rv == IOStatus.INTERRUPTED) && isOpen());
>>         } finally {
>>             threads.remove(ti);
>>             end(rv > -1);
>>             assert IOStatus.check(rv);
>>         }
>>     }
>>
>> here 'nd' is of type FileDispatcherImpl. I tried to look into its code
>> but since it is native code, we wont be able to read that
>>
>> http://grepcode.com/file/repository.grepcode.com/java/root/
>> jdk/openjdk/8u40-b25/sun/nio/ch/FileDispatcherImpl.java#
>> FileDispatcherImpl
>>
>> So it is not clear if it is just difference of fdatasync() and fsync().
>> Even if we assume that it is the difference of fdatasync() and fsync(),
>> man pages of fdatasync() says
>>
>> "fdatasync() is similar to fsync(), but does not flush modified metadata
>> unless that metadata is needed in order to allow a subsequent data
>> retrieval to be correctly handled.   For  example, changes  to  st_atime
>> or st_mtime (respectively, time of last access and time of last
>> modification; see stat(2)) do not require flushing because they are not
>> necessary for a subsequent data read to be handled correctly.  *On the
>> other hand, a change to the file size (st_size, as made by say
>> ftruncate(2)), would require a metadata flush.**"*
>>
>>
> fdatasync - "does not flush modified metadata unless that metadata is
> needed in order to allow a subsequent data retrieval to be correctly
> handled"
>
> In practice, this means that unless the file size changed, the file
> metadata will not be synced when calling fdatasync.
>
>
>> Also, when we look into java API for force method -
>> https://docs.oracle.com/javase/7/docs/api/java/nio/channel
>> s/FileChannel.html#force(boolean), it says the implementation is
>> unspecified and system-dependent. So I don't think it is safe to make any
>> assumptions here.
>>
>> "The metaData parameter can be used to limit the number of I/O
>> operations that this method is required to perform. Passing false for
>> this parameter indicates that only updates to the file's content need be
>> written to storage; passing true indicates that updates to both the
>> file's content and metadata must be written, which generally requires at
>> least one more I/O operation. Whether this parameter actually has any
>> effect is dependent upon the underlying operating system and is therefore
>> unspecified. Invoking this method may cause an I/O operation to occur
>> even if the channel was only opened for reading. Some operating systems,
>> for example, maintain a last-access time as part of a file's metadata, and
>> this time is updated whenever the file is read. Whether or not this is
>> actually done is system-dependent and is therefore unspecified."
>>
>> Thanks,
>> Charan
>>
>> On Wed, May 10, 2017 at 4:07 AM, Sijie Guo <guosi...@gmail.com> wrote:
>>
>>>
>>>
>>> On May 10, 2017 12:42 PM, "Venkateswara Rao Jujjuri" <jujj...@gmail.com>
>>> wrote:
>>>
>>> Looked at the source and it is fdatasync()
>>> https://android.googlesource.com/platform/libcore/+/2496a68/
>>> luni/src/main/java/java/nio/FileChannelImpl.java
>>> if (metadata) {
>>>     Libcore.os.fsync(fd);
>>> } else {
>>>      Libcore.os.fdatasync(fd);
>>> }
>>>
>>> So, agreed. fdatasync() fluesh metadata when the file size etc changes,
>>> and avoids metadata flush if no data related changes but
>>> but that tells me there may be no situation for us to call force(true)
>>> in the code.
>>> But our code calls it with true in some code path, should we change
>>> everything to false for consistency
>>>
>>>
>>> I don't remember that we called force(true) on file channel. Do you mean
>>> the buffered channel?
>>>
>>>
>>> JV
>>>
>>> On Tue, May 9, 2017 at 4:40 PM, Sijie Guo <guosi...@gmail.com> wrote:
>>>
>>>> We don't force flush metadata in bookies. Because:
>>>>
>>>> - flush metadata requires more I/O operations.
>>>> - flush metadata usually updates file metadata information (e.g. access
>>>> permissions, last modification/access time).
>>>>
>>>> The difference of force(false) and force(true) is the difference
>>>> between fdatasync() and fsync(). The former one update last modification
>>>> time, while the latter one doesn't. We don't care about the metadata
>>>> information like access/modification time, that's the reason we only called
>>>> force(false).
>>>>
>>>> - Sijie
>>>>
>>>>
>>>> On Sat, May 6, 2017 at 9:10 AM, Charan Reddy G <reddychara...@gmail.com
>>>> > wrote:
>>>>
>>>>> By metadata I mean metadata of file
>>>>>
>>>>> in BufferedChannel.java
>>>>>
>>>>>     public void flush(boolean shouldForceWrite) throws IOException {
>>>>>         synchronized(this) {
>>>>>             flushInternal();
>>>>>         }
>>>>>         if (shouldForceWrite) {
>>>>>             forceWrite(false);          <------------- false here
>>>>>         }
>>>>>     }
>>>>>
>>>>>     public long forceWrite(boolean forceMetadata) throws IOException {
>>>>>         // This is the point up to which we had flushed to the file
>>>>> system page cache
>>>>>         // before issuing this force write hence is guaranteed to be
>>>>> made durable by
>>>>>         // the force write, any flush that happens after this may or
>>>>> may
>>>>>         // not be flushed
>>>>>         long positionForceWrite = writeBufferStartPosition.get();
>>>>>         fileChannel.force(forceMetadata);    <---------------
>>>>> forcemetadata false
>>>>>         return positionForceWrite;
>>>>>     }
>>>>>
>>>>> On Fri, May 5, 2017 at 5:56 PM, Charan Reddy G <
>>>>> reddychara...@gmail.com> wrote:
>>>>>
>>>>>> Hey,
>>>>>>
>>>>>> In GarbageCollectorThread.doCompactEntryLogs {
>>>>>>
>>>>>> ...
>>>>>>
>>>>>>             try {
>>>>>>                 compactEntryLog(scannerFactory, meta);
>>>>>>                 scannerFactory.flush();            <---------------
>>>>>> this will eventually call entrylogger.flushCurrentLog and it force writes
>>>>>> the content of the BufferedChannel but not the metadata of the file
>>>>>>
>>>>>>                 LOG.info("Removing entry log {} after compaction",
>>>>>> meta.getEntryLogId());
>>>>>>                 removeEntryLog(meta.getEntryLogId());  <-----------
>>>>>> this will delete the compacted entrylog
>>>>>>
>>>>>>             }
>>>>>> ...
>>>>>>
>>>>>>
>>>>>> in doCompactEntryLogs, we first write the non-deleted entries of the
>>>>>> entryLog which is getting compacted to the currentLog in entryLogger, 
>>>>>> then
>>>>>> we flush the entrylogger before deleting the compacted entrylog. But when
>>>>>> currentLog is flushed by the entrylogger, it flushes only the content of
>>>>>> the file but not the metadata. After flush is completed the compacted
>>>>>> entrylog is deleted. It is not ok to not to force flush the metadata of 
>>>>>> the
>>>>>> currentLog for the persisted (checkpointed) data. The filesystem 
>>>>>> behaviour
>>>>>> is unexpected in this case and there is possibility of data loss if the
>>>>>> Bookie crashes before closing that logchannel.
>>>>>>
>>>>>> Thanks,
>>>>>> Charan
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Jvrao
>>> ---
>>> First they ignore you, then they laugh at you, then they fight you, then
>>> you win. - Mahatma Gandhi
>>>
>>>
>>>
>>>
>>
>


-- 
Jvrao
---
First they ignore you, then they laugh at you, then they fight you, then
you win. - Mahatma Gandhi

Reply via email to