Hi Chris,

This is Arvind Kumar from VMware. The issue discussed in this bug was
recently brought to VMware's attention. We looked at the patch
(https://lkml.org/lkml/2014/9/23/509) that was made to address it.
Because the patch is in the mptsas driver, it addresses the issue only
on the lsilogic controller; if the user uses another controller, e.g.
pvscsi or buslogic, the issue remains. Moreover, the patch disables
WRITE SAME completely on lsilogic, which means that VMware will never
be able to support WRITE SAME on lsilogic. As I understand from the
bug, the conclusion was that WRITE SAME is not properly implemented by
VMware. Actually, we don't support WRITE SAME at all.

We investigated the issue internally. As far as we understand, the
issue is not VMware-specific but lies in the kernel, and it could very
well happen on real hardware too if the disk does not support the
WRITE SAME command. Below are the details of the investigation by Petr
Vandrovec.

--

In blk-lib.c on line 294 it checks whether bdev supports write_same.
With LVM, bdev here is dm-0. It says yes, it is supported, and so
write_same is invoked (note that the check is racy in case the device
loses write_same capability between the test and the moment the bio is
issued):

    291  int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
    292                           sector_t nr_sects, gfp_t gfp_mask)
    293  {
    294          if (bdev_write_same(bdev)) {
    295                  unsigned char bdn[BDEVNAME_SIZE];
    296
    297                  if (!blkdev_issue_write_same(bdev, sector, nr_sects, gfp_mask,
    298                                               ZERO_PAGE(0)))
    299                          return 0;
    300
    301                  bdevname(bdev, bdn);
    302                  pr_err("%s: WRITE SAME failed. Manually zeroing.\n", bdn);
    303          }
    304
    305          return __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask);
    306  }
    307  EXPORT_SYMBOL(blkdev_issue_zeroout);

The request then gets to LVM, and LVM forwards it to sda. When it
fails, the kernel clears bdev_write_same() on sda and returns -121
(EREMOTEIO).
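
The resulting state can be illustrated with a small user-space model.
This is only a sketch, not kernel code: struct toy_bdev, the toy_*
functions and TOY_EREMOTEIO are hypothetical names invented for the
illustration, and the value 121 is taken from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_EREMOTEIO 121  /* value quoted above; defined locally for the model */

/* Hypothetical stand-in for a block device; not a kernel structure. */
struct toy_bdev {
    const char *name;
    bool write_same;         /* advertised WRITE SAME capability */
    struct toy_bdev *lower;  /* device that requests are forwarded to, or NULL */
};

/* Models a disk that rejects WRITE SAME: the flag is cleared on this device only. */
static int toy_disk_write_same(struct toy_bdev *disk)
{
    assert(disk != NULL && disk->lower == NULL);
    disk->write_same = false;
    return -TOY_EREMOTEIO;
}

/* Models the stacked device forwarding the request; its own flag is untouched. */
static int toy_stacked_write_same(struct toy_bdev *dm)
{
    assert(dm != NULL && dm->lower != NULL);
    return toy_disk_write_same(dm->lower);
}
```

After one failed request through a model dm-0 stacked on a model sda,
sda no longer advertises the capability while dm-0 still does.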

Now the next request comes. Nobody cleared bdev_write_same() on dm-0;
it got cleared only on sda. So the request gets to LVM, which forwards
it to sda, where it hits a snag in blk-core.c:

   1824          if (bio->bi_rw & REQ_WRITE_SAME && !bdev_write_same(bio->bi_bdev)) {
   1825                  err = -EOPNOTSUPP;
   1826                  goto end_io;
   1827          }

bi_bdev here is sda, and the I/O fails with EOPNOTSUPP without
WRITE_SAME ever being issued. It then hits the completion code, which
treats EOPNOTSUPP as success:

     18  static void bio_batch_end_io(struct bio *bio, int err)
     19  {
     20          struct bio_batch *bb = bio->bi_private;
     21
     22          if (err && (err != -EOPNOTSUPP))
     23                  clear_bit(BIO_UPTODATE, &bb->flags);
     24          if (atomic_dec_and_test(&bb->done))
     25                  complete(bb->wait);
     26          bio_put(bio);
     27  }

So everybody outside of blkdev_issue_write_same() thinks that the I/O
succeeded, while in reality the kernel did not even issue the request!
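
The masking effect can be reproduced in isolation with a minimal
user-space sketch. It keeps only the error filter from the quoted
bio_batch_end_io(); struct toy_batch and the toy_* functions are
hypothetical names, and returning -EIO for a failed batch is an
assumption made for the model, not the kernel's actual return value.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the batch state; not the kernel's struct bio_batch. */
struct toy_batch {
    bool uptodate;  /* plays the role of the BIO_UPTODATE bit */
    int pending;    /* requests not yet completed */
};

/* Same filter as the quoted completion handler: -EOPNOTSUPP is not recorded. */
static void toy_batch_end_io(struct toy_batch *bb, int err)
{
    assert(bb != NULL && bb->pending > 0);
    if (err && err != -EOPNOTSUPP)
        bb->uptodate = false;
    bb->pending--;
}

/* Returns 0 when the batch looks successful, -EIO otherwise (model assumption). */
static int toy_batch_result(const struct toy_batch *bb)
{
    return bb->uptodate ? 0 : -EIO;
}
```

A request completed with -EOPNOTSUPP leaves the batch looking
successful, while any other error marks it as failed.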

The fix should:

1. Use a different error code if a WRITE_SAME request is thrown away,
or remove the special EOPNOTSUPP handling from end_io. I assume the
EOPNOTSUPP handling is supposed to ignore failures from discarded
commands, but then nobody else should be using EOPNOTSUPP.

2. Propagate the WRITE_SAME failure from sda to dm-0.
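
The intent of the two points can be sketched in a user-space model.
This is not the attached patch and not kernel code; it takes the second
option from point 1 (no special case for EOPNOTSUPP), and every name in
it (struct toy_bdev, hw_write_same, the toy_* functions,
TOY_EREMOTEIO) is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_EREMOTEIO 121  /* value quoted in the report; defined locally */

enum toy_path { TOY_WRITE_SAME_USED, TOY_MANUAL_ZERO };

/* Hypothetical stand-in for a block device; not a kernel structure. */
struct toy_bdev {
    bool write_same;         /* advertised WRITE SAME capability */
    bool hw_write_same;      /* whether a leaf disk really implements it */
    struct toy_bdev *lower;  /* next device in the stack, NULL for a disk */
};

static int toy_submit_write_same(struct toy_bdev *dev)
{
    int err;

    assert(dev != NULL);
    if (!dev->write_same)
        return -EOPNOTSUPP;       /* rejected before being issued */
    if (dev->lower)
        err = toy_submit_write_same(dev->lower);
    else
        err = dev->hw_write_same ? 0 : -TOY_EREMOTEIO;
    if (err)
        dev->write_same = false;  /* point 2: the failure propagates upward */
    return err;
}

static enum toy_path toy_zeroout(struct toy_bdev *dev)
{
    /* Point 1: every error, including -EOPNOTSUPP, counts as a failure. */
    if (dev->write_same && toy_submit_write_same(dev) == 0)
        return TOY_WRITE_SAME_USED;
    return TOY_MANUAL_ZERO;       /* fall back to writing zeroes */
}
```

In this model a request that is rejected without being issued always
ends in the manual zeroing path, and the stacked device stops
advertising WRITE SAME once the device below it has failed.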

--

Our understanding is that we should revert the fix in the mptsas driver
and implement the proper fix as described above. I am attaching the
patch from Petr, who did the investigation, and CC'ing everyone
involved from VMware. Could you please evaluate the patch and suggest
further steps?

Thanks!
Arvind

** Patch added: "0001-Do-not-silently-discard-WRITE_SAME-requests.patch"
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1371591/+attachment/4230953/+files/0001-Do-not-silently-discard-WRITE_SAME-requests.patch

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1371591

Title:
  file not initialized to 0s under some conditions on VMWare
