On 27.09.19 at 20:10, Josef Bacik wrote:
> On Fri, Sep 27, 2019 at 01:23:16PM +0300, Nikolay Borisov wrote:
>> Modifying the file position is done on a per-file basis. This renders
>> holding the inode lock for writing useless and makes the performance of
>> concurrent llseek's abysmal.
>>
>> Fix this by holding the inode for read. This provides protection against
>> concurrent truncates and find_desired_extent already includes proper
>> extent locking for the range which ensures proper locking against
>> concurrent writes. SEEK_CUR and SEEK_END can be done locklessly.
>> The former is synchronized by file::f_lock spinlock. SEEK_END is not
>> synchronized but atomic, but that's OK since there is no guarantee
>> that SEEK_END will always be at the end of the file in the face of
>> tail modifications.
>>
>> This change brings ~82% performance improvement when doing a lot of
>> parallel fseeks. The workload essentially does:
>>
>>                     for (d=0; d<num_seek_read; d++)
>>                       {
>>                         /* offset %= 16777216; */
>>                         fseek (f, 256 * d % 16777216, SEEK_SET);
>>                         fread (buffer, 64, 1, f);
>>                       }
>>
>> Without patch:
>>
>> num workprocesses = 16
>> num fseek/fread = 8000000
>> step = 256
>> fork 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
>>
>> real 0m41.412s
>> user 0m28.777s
>> sys  2m16.510s
>>
>> With patch:
>>
>> num workprocesses = 16
>> num fseek/fread = 8000000
>> step = 256
>> fork 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
>>
>> real 0m11.479s
>> user 0m27.629s
>> sys  0m21.040s
>>
>> Signed-off-by: Nikolay Borisov <[email protected]>
>> ---
>>  fs/btrfs/file.c | 26 ++++++++++----------------
>>  1 file changed, 10 insertions(+), 16 deletions(-)
>>
>> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
>> index 12688ae6e6f2..000b7bd89bf0 100644
>> --- a/fs/btrfs/file.c
>> +++ b/fs/btrfs/file.c
>> @@ -3347,13 +3347,14 @@ static int find_desired_extent(struct inode *inode, loff_t *offset, int whence)
>>      struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>>      struct extent_map *em = NULL;
>>      struct extent_state *cached_state = NULL;
>> +    loff_t i_size = inode->i_size;
> 
> We don't actually need to do all this now that we're holding the inode_lock
> right?  Also I've gone through and looked at stuff and we're good with just a
> shared lock here, the only thing that adjusts i_size outside of the extent lock
> is truncate, so we're safe.  Thanks,

Yeah, holding the shared inode lock means we can just read inode->i_size
directly, but I don't know whether the multiple dereferences get
optimised. Though at this point we are entering micro-optimisation
territory. For the sake of completeness I will check on Monday what the
difference in the generated assembly is, and if there is none I'll revert
the code back to accessing inode->i_size.

> 
> Josef
> 
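
For anyone who wants to try the workload quoted in the commit message,
the per-process loop can be wrapped roughly like this. It is only a
sketch: seek_read_loop() and its parameters are made-up names, the
forking of the 16 worker processes is left out, and the success counter
is an addition that the quoted loop does not have.

```c
#include <stdio.h>

/*
 * Hypothetical helper: runs the quoted fseek/fread loop on an already
 * open stream and returns how many 64-byte reads succeeded.
 * In the quoted run, step is 256 and wrap is 16777216.
 */
static long seek_read_loop(FILE *f, long num_seek_read, long step, long wrap)
{
    char buffer[64];
    long ok = 0;

    for (long d = 0; d < num_seek_read; d++) {
        /* Same offset pattern as the quoted loop: step * d, wrapped. */
        if (fseek(f, (step * d) % wrap, SEEK_SET) != 0)
            break;
        if (fread(buffer, sizeof(buffer), 1, f) == 1)
            ok++;
    }
    return ok;
}
```

Each worker would call this on its own FILE handle for the same
underlying file.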
