On 02/21/2019 07:30 PM, Jan Kara wrote:
> On Thu 21-02-19 12:17:35, Dongli Zhang wrote:
>> Commit 0da03cab87e6
>> ("loop: Fix deadlock when calling blkdev_reread_part()") moves
>> blkdev_reread_part() out of the loop_ctl_mutex. However,
>> GENHD_FL_NO_PART_SCAN is set before __blkdev_reread_part(). As a result,
>> __blkdev_reread_part() will fail the check of GENHD_FL_NO_PART_SCAN and
>> will not rescan the loop device to delete all partitions.
>>
>> Below are steps to reproduce the issue:
>>
>> step1 # dd if=/dev/zero of=tmp.raw bs=1M count=100
>> step2 # losetup -P /dev/loop0 tmp.raw
>> step3 # parted /dev/loop0 mklabel gpt
>> step4 # parted -a none -s /dev/loop0 mkpart primary 64s 1
>> step5 # losetup -d /dev/loop0
>
> Can you perhaps write a blktest for this? Thanks!
I will write a blktest for the above case. Thanks for the suggestion.
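
As a starting point, the check could look roughly like the sketch below. It is not the final blktest and does not use the blktests helpers. The function names run and repro, the DRY_RUN switch, and the sysfs path used to detect a leftover partition are assumptions made for illustration; the commands are the five steps quoted above.

```shell
#!/bin/bash
# Sketch of a reproducer for the leftover-partition issue in this thread.
# Hypothetical helpers: run(), repro(). Must be run as root unless DRY_RUN is set.

# Run a command, or only print it when DRY_RUN is non-empty.
run() {
	if [ -n "${DRY_RUN:-}" ]; then
		echo "$*"
	else
		"$@"
	fi
}

# repro [loop device] [image file]
repro() {
	local dev="${1:-/dev/loop0}"
	local img="${2:-tmp.raw}"
	local name="${dev##*/}"

	run dd if=/dev/zero of="$img" bs=1M count=100 || return 1
	run losetup -P "$dev" "$img" || return 1
	run parted "$dev" mklabel gpt || return 1
	run parted -a none -s "$dev" mkpart primary 64s 1 || return 1
	run losetup -d "$dev" || return 1

	# Assumption: a partition that survived the detach is still visible
	# in sysfs as /sys/block/<disk>/<disk>p1.
	if [ -e "/sys/block/${name}/${name}p1" ]; then
		echo "FAIL: ${name}p1 still present after detach"
		return 1
	fi
	echo "PASS: ${name}p1 removed"
}
```

Sourcing the file and calling repro as root runs the steps against /dev/loop0; with DRY_RUN=1 it only prints the commands. On an affected kernel the expectation is the FAIL line together with the __loop_clr_fd warning in dmesg.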
>
>> Step5 fails to delete /dev/loop0p1 (created by step4), and the kernel
>> prints the following warning:
>>
>> [ 464.414043] __loop_clr_fd: partition scan of loop0 failed (rc=-22)
>>
>> This patch sets GENHD_FL_NO_PART_SCAN after blkdev_reread_part().
>>
>> Fixes: 0da03cab87e6 ("loop: Fix deadlock when calling blkdev_reread_part()")
>> Signed-off-by: Dongli Zhang <dongli.zh...@oracle.com>
>> ---
>> drivers/block/loop.c | 15 ++++++++++++---
>> 1 file changed, 12 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
>> index 7908673..736e55b 100644
>> --- a/drivers/block/loop.c
>> +++ b/drivers/block/loop.c
>> @@ -1034,6 +1034,15 @@ loop_init_xfer(struct loop_device *lo, struct loop_func_table *xfer,
>>  	return err;
>>  }
>>
>> +static void loop_disable_partscan(struct loop_device *lo)
>> +{
>> +	mutex_lock(&loop_ctl_mutex);
>> +	lo->lo_flags = 0;
>> +	if (!part_shift)
>> +		lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN;
>> +	mutex_unlock(&loop_ctl_mutex);
>> +}
>> +
>>  static int __loop_clr_fd(struct loop_device *lo, bool release)
>>  {
>>  	struct file *filp = NULL;
>> @@ -1096,9 +1105,6 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
>>
>>  	partscan = lo->lo_flags & LO_FLAGS_PARTSCAN && bdev;
>>  	lo_number = lo->lo_number;
>> -	lo->lo_flags = 0;
>> -	if (!part_shift)
>> -		lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN;
>>  	loop_unprepare_queue(lo);
>>  out_unlock:
>>  	mutex_unlock(&loop_ctl_mutex);
>> @@ -1121,6 +1127,9 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
>>  		/* Device is gone, no point in returning error */
>>  		err = 0;
>>  	}
>> +
>> +	loop_disable_partscan(lo);
>> +
>>  	/*
>>  	 * Need not hold loop_ctl_mutex to fput backing file.
>>  	 * Calling fput holding loop_ctl_mutex triggers a circular
>
> So I don't think this change is actually correct. The problem is that once
> lo->lo_state is set to Lo_unbound and loop_ctl_mutex is unlocked, the loop
> device structure can be reused for a new device (bound to a new file). So
> you cannot safely manipulate flags on lo->lo_disk anymore. But I think we
> can just move the setting of lo->lo_state to Lo_unbound after partscan has
> finished as well. There cannot be anybody else entering __loop_clr_fd() as
> lo->lo_backing_file is already cleared and Lo_rundown state protects us
> from all the other places trying to change the 'lo' device (please make
> this last sentence into a comment in the code explaining why setting
> lo->lo_state so late is fine). Thanks!
I will set lo->lo_state to Lo_unbound after partscan in v2.
Thank you very much!
Dongli Zhang
>
> Honza
>