On Wed, Nov 26, 2014 at 09:36:30PM +0800, zhangxingcai wrote:
> __get_mtd_device() is called to increment mtd->usecount when we access
> mtd via /dev/mtd1 or /dev/mtdblock1, but mtd_table_mutex lock is used in
> the former via get_mtd_device(), while &dev->lock lock is used in the
> latter.  Therefore mtd->usecount is not properly protected if we access
> /dev/mtd1 and /dev/mtdblock1 at the same time.
> 
> call graph as follows:
> /dev/mtd1 --> mtdchar_open() --> get_mtd_device() --> <hold mtd_table_mutex>
>   --> __get_mtd_device() --> <increment mtd->usecount>
> /dev/mtdblock1 --> blktrans_open() --> <hold &dev->lock>
>   --> __get_mtd_device() --> <increment mtd->usecount>
> 
> Actually we triggered the BUG_ON in put_mtd_device() on a 2.6.34 kernel
> due to this race.

Have you retested and seen this on any more recent kernel? The locking
schemes here have changed a bit since then.

> To fix this convert mtd->usecount from int to atomic_t.

Is mtd->usecount the *only* important race in __get_mtd_device() and
__put_mtd_device()? Sometimes a race on a counter just shows that there
are other concurrency issues nearby that would be better served by a
lock. But your fix may be sufficient in this case.
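
For what it's worth, here is a rough userspace sketch of the counter part
of the problem only, using C11 atomics and pthreads as stand-ins for the
kernel primitives. Every name in it (fake_usecount, fake_opener(),
fake_run_two_openers(), FAKE_ITERS) is made up for illustration and none of
it is MTD code. Two threads play the role of the /dev/mtd1 and
/dev/mtdblock1 openers, each doing a get followed by a put with no common
lock:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define FAKE_ITERS 100000	/* hypothetical iteration count */

/* Hypothetical stand-in for mtd->usecount. */
static atomic_int fake_usecount;

/* One "opener": get, then put, repeatedly, without any shared lock. */
static void *fake_opener(void *unused)
{
	(void)unused;
	for (int i = 0; i < FAKE_ITERS; i++) {
		int after;

		atomic_fetch_add(&fake_usecount, 1);		/* "get" */
		/* atomic_fetch_sub() returns the old value, hence the - 1 */
		after = atomic_fetch_sub(&fake_usecount, 1) - 1;	/* "put" */
		assert(after >= 0);	/* plays the role of BUG_ON() */
	}
	return NULL;
}

/* Run two openers concurrently; returns the final count, or -1 on error. */
static int fake_run_two_openers(void)
{
	pthread_t a, b;

	atomic_store(&fake_usecount, 0);
	if (pthread_create(&a, NULL, fake_opener, NULL))
		return -1;
	if (pthread_create(&b, NULL, fake_opener, NULL)) {
		pthread_join(a, NULL);
		return -1;
	}
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return atomic_load(&fake_usecount);
}
```

With atomic read-modify-write operations the count should come back to
zero and never be observed negative. Note that this only covers the
counter itself; it says nothing about whether other state touched around
__get_mtd_device() and __put_mtd_device() also needs a lock.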

> <0>------------[ cut here ]------------
> <2>kernel BUG at drivers/mtd/mtdcore.c:565!
> Oops: Exception in kernel mode, sig: 5 [#1]
> PREEMPT SMP NR_CPUS=4 LTT NESTING LEVEL : 0
> P2041 RDB
> <0>last sysfs file: /sys/mbe_detect/ecc_mbe_detect
> ...
> NIP [c037a808] put_mtd_device+0x58/0x80
> LR [c037a808] put_mtd_device+0x58/0x80
> Call Trace:
> [ca453e90] [c037a808] put_mtd_device+0x58/0x80 (unreliable)
> [ca453eb0] [c037ced8] mtd_close+0x48/0x70
> [ca453ed0] [c0119078] __fput+0xe8/0x220
> [ca453ef0] [c01144fc] filp_close+0x6c/0xb0
> [ca453f10] [c01145fc] sys_close+0xbc/0x180
> [ca453f40] [c0010ae8] ret_from_syscall+0x0/0x4
> 
> Cc: <sta...@vger.kernel.org>
> Signed-off-by: Zhang Xingcai <zhangxing...@huawei.com>
> ---
>  drivers/mtd/maps/vmu-flash.c |  2 +-
>  drivers/mtd/mtdcore.c        | 12 ++++++------
>  include/linux/mtd/mtd.h      |  2 +-
>  3 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/mtd/maps/vmu-flash.c b/drivers/mtd/maps/vmu-flash.c
> index 6b223cf..0a10779 100644
> --- a/drivers/mtd/maps/vmu-flash.c
> +++ b/drivers/mtd/maps/vmu-flash.c
> @@ -721,7 +721,7 @@ static int vmu_can_unload(struct maple_device *mdev)
>       card = maple_get_drvdata(mdev);
>       for (x = 0; x < card->partitions; x++) {
>               mtd = &((card->mtd)[x]);
> -             if (mtd->usecount > 0)
> +             if (atomic_read(&mtd->usecount) > 0)

Hmm, the use of mtd->usecount here seems kinda wrong. I think this
driver should be implementing mtd->_get_device() and mtd->_put_device()
instead.

>                       return 0;
>       }
>       return 1;
> diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c
> index 4c61187..95e7cfc 100644
> --- a/drivers/mtd/mtdcore.c
> +++ b/drivers/mtd/mtdcore.c
> @@ -402,7 +402,7 @@ int add_mtd_device(struct mtd_info *mtd)
>               goto fail_locked;
>  
>       mtd->index = i;
> -     mtd->usecount = 0;
> +     atomic_set(&mtd->usecount, 0);
>  
>       /* default value if not set by driver */
>       if (mtd->bitflip_threshold == 0)
> @@ -492,9 +492,9 @@ int del_mtd_device(struct mtd_info *mtd)
>       list_for_each_entry(not, &mtd_notifiers, list)
>               not->remove(mtd);
>  
> -     if (mtd->usecount) {
> +     if (atomic_read(&mtd->usecount)) {

If we're using atomic_read(), wouldn't it make more sense just to read
once and save the result?
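
Something along these lines is what I have in mind, shown as a userspace
sketch with C11 atomics rather than the kernel API. fake_del_device() is a
hypothetical name, and atomic_load() stands in for atomic_read(). The point
is only that the value tested and the value printed come from the same
snapshot:

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical analogue of the busy check in del_mtd_device(). */
static int fake_del_device(atomic_int *usecount)
{
	/* Read once; the test and the message below use the same value. */
	int count = atomic_load(usecount);

	assert(count >= 0);	/* a negative count would already be a bug */

	if (count) {
		fprintf(stderr, "Removing device with use count %d\n", count);
		return -EBUSY;
	}
	return 0;
}
```

With two separate reads, the count could change in between, so the message
could report a different value from the one that made the check fail.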

>               printk(KERN_NOTICE "Removing MTD device #%d (%s) with use count %d\n",
> -                    mtd->index, mtd->name, mtd->usecount);
> +                    mtd->index, mtd->name, atomic_read(&mtd->usecount));
>               ret = -EBUSY;
>       } else {
>               device_unregister(&mtd->dev);
> @@ -702,7 +702,7 @@ int __get_mtd_device(struct mtd_info *mtd)
>                       return err;
>               }
>       }
> -     mtd->usecount++;
> +     atomic_inc(&mtd->usecount);
>       return 0;
>  }
>  EXPORT_SYMBOL_GPL(__get_mtd_device);
> @@ -756,8 +756,8 @@ EXPORT_SYMBOL_GPL(put_mtd_device);
>  
>  void __put_mtd_device(struct mtd_info *mtd)
>  {
> -     --mtd->usecount;
> -     BUG_ON(mtd->usecount < 0);
> +     atomic_dec(&mtd->usecount);
> +     BUG_ON(atomic_read(&mtd->usecount) < 0);

Again, two atomic operations in a row don't make a lot of sense. Try
using atomic_dec_return():

        int count = atomic_dec_return(&mtd->usecount);
        
        BUG_ON(count < 0);
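
If you want to try the pattern outside the kernel, a userspace
approximation with C11 atomics could look like the sketch below. The names
fake_mtd, fake_dec_return() and fake_put() are invented for this example.
C11 has no direct atomic_dec_return(); atomic_fetch_sub() returns the value
from before the subtraction, so the sketch subtracts one to get the new
value:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for struct mtd_info; only the counter is modelled. */
struct fake_mtd {
	atomic_int usecount;
};

/*
 * Userspace analogue of atomic_dec_return(): atomic_fetch_sub() yields
 * the pre-decrement value, so subtract one for the post-decrement value.
 */
static int fake_dec_return(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) - 1;
}

/* Decrement and check using a single atomic operation. */
static int fake_put(struct fake_mtd *mtd)
{
	int count = fake_dec_return(&mtd->usecount);

	assert(count >= 0);	/* plays the role of BUG_ON(count < 0) */
	return count;
}
```

The value checked is the one produced by this caller's own decrement, not
whatever another thread left in the counter a moment later.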

>  
>       if (mtd->_put_device)
>               mtd->_put_device(mtd);
> diff --git a/include/linux/mtd/mtd.h b/include/linux/mtd/mtd.h
> index 031ff3a..af98132 100644
> --- a/include/linux/mtd/mtd.h
> +++ b/include/linux/mtd/mtd.h
> @@ -250,7 +250,7 @@ struct mtd_info {
>  
>       struct module *owner;
>       struct device dev;
> -     int usecount;
> +     atomic_t usecount;
>  };
>  
>  int mtd_erase(struct mtd_info *mtd, struct erase_info *instr);

Brian