On Wed, Jul 01 2015 at 12:57pm -0400,
Mike Snitzer <snit...@redhat.com> wrote:

> bio_integrity_alloc() and bio_integrity_free() assume that if a bio was
> allocated from a bioset that that bioset also had its bio_integrity_pool
> allocated using bioset_integrity_create().  This is a very bad
> assumption given that bioset_create() and bioset_integrity_create() are
> completely disjoint.  Not all callers of bioset_create() have been
> trained to also call bioset_integrity_create() -- and they may not care
> to be.
> 
> Fix this by falling back to kmalloc'ing 'struct bio_integrity_payload'
> rather than force all bioset consumers to (wastefully) preallocate a
> bio_integrity_pool that they very likely won't actually need (given the
> niche nature of the current block integrity support).
> 
> Otherwise, a NULL pointer "Kernel BUG" with a trace like the following
> will be observed (as seen on s390x using zfcp storage) because dm-io
> doesn't use bioset_integrity_create() when creating its bioset:
> 
>     [  791.643338] Call Trace:
>     [  791.643339] ([<00000003df98b848>] 0x3df98b848)
>     [  791.643341]  [<00000000002c5de8>] bio_integrity_alloc+0x48/0xf8
>     [  791.643348]  [<00000000002c6486>] bio_integrity_prep+0xae/0x2f0
>     [  791.643349]  [<0000000000371e38>] blk_queue_bio+0x1c8/0x3d8
>     [  791.643355]  [<000000000036f8d0>] generic_make_request+0xc0/0x100
>     [  791.643357]  [<000000000036f9b2>] submit_bio+0xa2/0x198
>     [  791.643406]  [<000003ff801f9774>] dispatch_io+0x15c/0x3b0 [dm_mod]
>     [  791.643419]  [<000003ff801f9b3e>] dm_io+0x176/0x2f0 [dm_mod]
>     [  791.643423]  [<000003ff8074b28a>] do_reads+0x13a/0x1a8 [dm_mirror]
>     [  791.643425]  [<000003ff8074b43a>] do_mirror+0x142/0x298 [dm_mirror]
>     [  791.643428]  [<0000000000154fca>] process_one_work+0x18a/0x3f8
>     [  791.643432]  [<000000000015598a>] worker_thread+0x132/0x3b0
>     [  791.643435]  [<000000000015d49a>] kthread+0xd2/0xd8
>     [  791.643438]  [<00000000005bc0ca>] kernel_thread_starter+0x6/0xc
>     [  791.643446]  [<00000000005bc0c4>] kernel_thread_starter+0x0/0xc
> 
> Signed-off-by: Mike Snitzer <snit...@redhat.com>
> Cc: sta...@vger.kernel.org
> ---
>  block/bio-integrity.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> NOTE: this serves as a more generic fix in the block layer rather than
> the dm-io specific fix which isn't ideal (due to potential for memory
> waste), see:
> https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=17dbe96d4f8a6f87004e6cfb5944872dfe2edb9f
> 
> diff --git a/block/bio-integrity.c b/block/bio-integrity.c
> index 0436c21..719b715 100644
> --- a/block/bio-integrity.c
> +++ b/block/bio-integrity.c
> @@ -51,7 +51,7 @@ struct bio_integrity_payload *bio_integrity_alloc(struct bio *bio,
>       unsigned long idx = BIO_POOL_NONE;
>       unsigned inline_vecs;
>  
> -     if (!bs) {
> +     if (!bs || !bs->bio_integrity_pool) {
>               bip = kmalloc(sizeof(struct bio_integrity_payload) +
>                             sizeof(struct bio_vec) * nr_vecs, gfp_mask);
>               inline_vecs = nr_vecs;
> @@ -104,7 +104,7 @@ void bio_integrity_free(struct bio *bio)
>               kfree(page_address(bip->bip_vec->bv_page) +
>                     bip->bip_vec->bv_offset);
>  
> -     if (bs) {
> +     if (bs && bs->bio_integrity_pool) {
>               if (bip->bip_slab != BIO_POOL_NONE)
>                       bvec_free(bs->bvec_integrity_pool, bip->bip_vec,
>                                 bip->bip_slab);
> -- 
> 2.3.2 (Apple Git-55)
> 
> 

Both the block patch above and the referenced dm-io patch fix the NULL
pointer dereference shown below.  I strongly prefer the block fix over
the dm-io one, so I'd like to see it go upstream during 4.2-rc.
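
For anyone who wants the gist of the block fix without the kernel
context, here is a minimal userspace sketch of the fallback.  It is not
the kernel code: struct pool, struct payload, pool_alloc(), pool_free(),
payload_alloc() and payload_free() are hypothetical stand-ins, and
malloc()/free() stand in for kmalloc()/kfree().  Only the two
conditionals mirror the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel definitions. */
struct pool {
	int outstanding;		/* payloads currently handed out by the pool */
};

struct bioset {
	struct pool *bio_integrity_pool;	/* NULL unless the integrity pool was created */
};

struct payload {
	int from_pool;			/* 1 if the pool produced it, 0 if plain malloc() did */
};

/* Stand-in for a pool-backed allocation. */
static struct payload *pool_alloc(struct pool *p)
{
	struct payload *bip = malloc(sizeof(*bip));

	if (bip) {
		p->outstanding++;
		bip->from_pool = 1;
	}
	return bip;
}

/* Stand-in for returning a payload to the pool. */
static void pool_free(struct pool *p, struct payload *bip)
{
	assert(p->outstanding > 0);
	p->outstanding--;
	free(bip);
}

/* Same shape as the patched check: no bioset OR no integrity pool. */
static struct payload *payload_alloc(struct bioset *bs)
{
	struct payload *bip;

	if (!bs || !bs->bio_integrity_pool) {
		bip = malloc(sizeof(*bip));
		if (bip)
			bip->from_pool = 0;
		return bip;
	}
	return pool_alloc(bs->bio_integrity_pool);
}

/* The free path has to make the same decision as the alloc path. */
static void payload_free(struct bioset *bs, struct payload *bip)
{
	if (bs && bs->bio_integrity_pool)
		pool_free(bs->bio_integrity_pool, bip);
	else
		free(bip);
}
```

The point of the sketch is that the two checks must stay symmetric: a
payload that came from the fallback allocator must never be handed back
to a pool, and a bioset without an integrity pool must never reach the
pool allocator at all.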

Jens, what do you think?

Here is an updated NULL pointer trace using 4.2-rc1 (which is easily hit
using the attached reproducer script):

[  239.425111] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[  239.426010] IP: [<ffffffff811b2bb0>] mempool_alloc+0x60/0x180
[  239.426010] PGD 0
[  239.426010] Oops: 0000 [#1] SMP
[  239.426010] Modules linked in: dm_mirror dm_region_hash dm_log scsi_debug sg nfsv3 nfs fscache crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel glue_helper lrw gf128mul ablk_helper dm_mod cryptd pcspkr serio_raw virtio_balloon 8139too i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ext4 mbcache jbd2 ata_generic sd_mod pata_acpi cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper virtio_scsi ttm virtio_blk drm ata_piix libata virtio_pci virtio_ring 8139cp virtio mii i2c_core floppy
[  239.426010] CPU: 2 PID: 134 Comm: kworker/2:2 Not tainted 4.2.0-rc1+ #59
[  239.426010] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  239.426010] Workqueue: kcopyd do_work [dm_mod]
[  239.426010] task: ffff88011a701bc0 ti: ffff880036a10000 task.ti: ffff880036a10000
[  239.426010] RIP: 0010:[<ffffffff811b2bb0>]  [<ffffffff811b2bb0>] mempool_alloc+0x60/0x180
[  239.426010] RSP: 0018:ffff880036a13878  EFLAGS: 00010206
[  239.426010] RAX: ffff880036a13888 RBX: ffff88011a701bc0 RCX: 0000000000000000
[  239.426010] RDX: 0000000000000086 RSI: ffffffff81a88940 RDI: 0000000000000246
[  239.426010] RBP: ffff880036a138e8 R08: 000000000001a9b0 R09: 0000000000000080
[  239.426010] R10: ffff88011b001500 R11: 0000000000000000 R12: 0000000000000060
[  239.426010] R13: ffff880036a138a0 R14: 0000000000011200 R15: 0000000000000000
[  239.426010] FS:  0000000000000000(0000) GS:ffff88011fd00000(0000) knlGS:0000000000000000
[  239.426010] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  239.426010] CR2: 0000000000000048 CR3: 00000000d8797000 CR4: 00000000000407e0
[  239.426010] Stack:
[  239.426010]  0000000000000001 0001121000000010 ffffffff81a88940 ffff88011b001500
[  239.426010]  ffff880036a138c8 ffffffff810e01cd 00000000000004f2 00000000627d6696
[  239.426010]  0000000000000246 0000000000000400 ffff8800da8cfd00 0000000000000010
[  239.426010] Call Trace:
[  239.426010]  [<ffffffff810e01cd>] ? __lock_is_held+0x4d/0x70
[  239.426010]  [<ffffffff8136b00f>] bio_integrity_alloc+0x4f/0x1d0
[  239.426010]  [<ffffffff8136b803>] bio_integrity_prep+0xc3/0x220
[  239.426010]  [<ffffffff8134c56f>] blk_sq_make_request+0x10f/0x570
[  239.426010]  [<ffffffff810e01cd>] ? __lock_is_held+0x4d/0x70
[  239.426010]  [<ffffffff8133b236>] generic_make_request+0xd6/0x110
[  239.426010]  [<ffffffff8133b2e7>] submit_bio+0x77/0x150
[  239.426010]  [<ffffffff81335a3f>] ? bio_alloc_bioset+0x1df/0x2e0
[  239.426010]  [<ffffffffa036cd48>] dispatch_io+0x1a8/0x3a0 [dm_mod]
[  239.426010]  [<ffffffffa036c840>] ? dm_copy_name_and_uuid+0xc0/0xc0 [dm_mod]
[  239.426010]  [<ffffffffa036c870>] ? list_get_page+0x30/0x30 [dm_mod]
[  239.426010]  [<ffffffffa036d7eb>] ? run_io_job+0x9b/0x190 [dm_mod]
[  239.426010]  [<ffffffffa036d4c0>] ? dm_kcopyd_do_callback+0x50/0x50 [dm_mod]
[  239.426010]  [<ffffffffa036d183>] dm_io+0x103/0x210 [dm_mod]
[  239.426010]  [<ffffffffa036c840>] ? dm_copy_name_and_uuid+0xc0/0xc0 [dm_mod]
[  239.426010]  [<ffffffffa036c870>] ? list_get_page+0x30/0x30 [dm_mod]
[  239.426010]  [<ffffffffa036d834>] run_io_job+0xe4/0x190 [dm_mod]
[  239.426010]  [<ffffffffa036d4c0>] ? dm_kcopyd_do_callback+0x50/0x50 [dm_mod]
[  239.426010]  [<ffffffffa036da7a>] process_jobs+0x5a/0x100 [dm_mod]
[  239.426010]  [<ffffffffa036d750>] ? segment_complete+0x170/0x170 [dm_mod]
[  239.426010]  [<ffffffffa036db91>] do_work+0x71/0xa0 [dm_mod]
[  239.426010]  [<ffffffff810a79d7>] process_one_work+0x1e7/0x7e0
[  239.426010]  [<ffffffff810a794a>] ? process_one_work+0x15a/0x7e0
[  239.426010]  [<ffffffff810a80e4>] worker_thread+0x114/0x460
[  239.426010]  [<ffffffff810a7fd0>] ? process_one_work+0x7e0/0x7e0
[  239.426010]  [<ffffffff810aebd7>] kthread+0x107/0x120
[  239.426010]  [<ffffffff817065f0>] ? _raw_spin_unlock_irq+0x30/0x50
[  239.426010]  [<ffffffff810aead0>] ? kthread_create_on_node+0x240/0x240
[  239.426010]  [<ffffffff8170731f>] ret_from_fork+0x3f/0x70
[  239.426010]  [<ffffffff810aead0>] ? kthread_create_on_node+0x240/0x240
[  239.426010] Code: d8 83 e3 af 4d 8d 67 60 0d 00 12 01 00 41 89 de 89 45 9c 48 8d 45 a0 41 81 ce 00 12 01 00 65 48 8b 1c 25 00 ba 00 00 4c 8d 68 18 <49> 8b 77 48 44 89 f7 41 ff 57 50 48 85 c0 74 45 48 8b 4d c8 65
[  239.426010] RIP  [<ffffffff811b2bb0>] mempool_alloc+0x60/0x180
[  239.426010]  RSP <ffff880036a13878>
[  239.426010] CR2: 0000000000000048
[  239.426010] ---[ end trace a4a18906f4031b09 ]---

Attachment: bip_bioset_NULL_ptr.sh
Description: Bourne shell script
