On Fri, Jan 17, 2014 at 1:11 AM, Trond Myklebust
<trond.mykleb...@primarydata.com> wrote:
>
> On Jan 16, 2014, at 10:49, Peng Tao <bergw...@gmail.com> wrote:
>> On Tue, Jan 14, 2014 at 2:45 AM, Trond Myklebust
>> <trond.mykleb...@primarydata.com> wrote:
>>> void pnfs_set_lo_fail(struct pnfs_layout_segment *lseg)
>>> @@ -1881,43 +1887,37 @@ pnfs_layoutcommit_inode(struct inode *inode, bool 
>>> sync)
>>>        struct nfs4_layoutcommit_data *data;
>>>        struct nfs_inode *nfsi = NFS_I(inode);
>>>        loff_t end_pos;
>>> -       int status = 0;
>>> +       int status;
>>>
>>> -       dprintk("--> %s inode %lu\n", __func__, inode->i_ino);
>>> -
>>> -       if (!test_bit(NFS_INO_LAYOUTCOMMIT, &nfsi->flags))
>>> +       if (!pnfs_layoutcommit_outstanding(inode))
>> This might be a problem. If nfsi->flags has !NFS_INO_LAYOUTCOMMIT and
>> NFS_INO_LAYOUTCOMMITTING, client cannot issue a new layoutcommit after
>> the inflight one finishes. It might not be an issue for file layout as
>> long as we only use layoutcommit to update time, but it can cause data
>> corruption for block layout.
>
> I don’t understand.
>
> With the new patch, if _either_ NFS_INO_LAYOUTCOMMIT or 
> NFS_INO_LAYOUTCOMMITTING are set, then the client will wait until 
> NFS_INO_LAYOUTCOMMITTING can be locked, it will test for 
> NFS_INO_LAYOUTCOMMIT, and then either issue a new layout commit or exit. How 
> can that cause new breakage for blocks?
>
ah, sorry, I read the old code as your patch... yeah, so you actually
fixed the issue instead of introducing it.

> The only issues that I’m aware of with the blocks layout and LAYOUTCOMMIT 
> today are:
> 1. encode_pnfs_block_layoutupdate() runs out of XDR buffer space after 4-5 
> iterations in the list_for_each_entry_safe() loop. That is because nobody has 
> yet added support for preallocating a page buffer to store the (potentially 
> very large) array of extents. BTW: that array looks like a perfect candidate 
> for xdr_encode_array2() if we could teach the latter about xdr_stream...
This can also be fixed by limiting the number of extents allowed to be
sent in one single layoutcommit.

> 2. the blocks layout also needs to be able handle the case where the list of 
> extents is so large that a single LAYOUTCOMMIT is not sufficient. There is no 
> reason why it should not be able to send multiple LAYOUTCOMMIT rpc calls when 
> the size exceeds the session forward channel's negotiated max_rqst_sz.
>
You are absolutely right.

Thanks,
Tao
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to