Your patch only fixed it for sg devices. S_ISBLK() will never be true when sg_get_max_segments() is only called in raw_refresh_limits() when bs->sg is true.
With that said, it's not like we cannot ditch the bs->sg check in raw_refresh_limits() and switch to S_ISBLK()/S_ISCHR(). (Although not strictly necessary, I would also do S_ISCHR() for SG_GET_SG_TABLESIZE.) I don't know if it's a better approach though. Seems repetitive to me (and bs->sg is not only used/checked in file-posix/raw_refresh_limits): https://github.com/qemu/qemu/blob/v5.1.0/block/file-posix.c#L3390 https://github.com/qemu/qemu/blob/v5.1.0/block/file-posix.c#L3504 On Mon, 7 Sep 2020 at 03:50, Dmitry Fomichev <dmitry.fomic...@wdc.com> wrote: > > > > > -----Original Message----- > > From: Qemu-block <qemu-block- > > bounces+dmitry.fomichev=wdc....@nongnu.org> On Behalf Of Maxim > > Levitsky > > Sent: Sunday, September 6, 2020 1:06 PM > > To: Tom Yan <tom.t...@gmail.com>; ebl...@redhat.com; > > pbonz...@redhat.com; f...@euphon.net; anie...@linux.vnet.ibm.com; > > kw...@redhat.com; mre...@redhat.com > > Cc: qemu-devel@nongnu.org; qemu-bl...@nongnu.org > > Subject: Re: [PATCH v4 2/2] file-posix: add sg_get_max_segments that > > actually works with sg > > > > On Sun, 2020-09-06 at 23:26 +0800, Tom Yan wrote: > > > I don't disagree with your proposal, but the thing is, do we even need > > > another field/limit for case 1? For example, do we *really* need to > > > clamp sizes[2] (NBD_INFO_BLOCK_SIZE) in nbd/server.c and > > > max_io_sectors (SCSIBlockLimits) in hw/scsi/scsi-disk.c to to any kind > > > of "dynamic" limit? > > > > It depends on if we have other block drivers that have IO segment/max > > transfer sizes, > > that we need to enforce, other than the SG_IO case. > > > > Knowing that the commit that added these limits to file-posix, which I was > > fixing is relatively > > new, I guess there are other cases for these limits. > > > > I'll check this tomorrow. > > > > > > > > Either way I can add another field (`max_pwrite`, maybe?) to > > > BlockLimits, as an infrastructure/hint, but what should be switched to > > > it, and what value should each block driver set it to, is probably > > > beyond what I can take. > > > > Implementation wise, I think I can do this, but I'll wait few days for the > > feedback on my proposal. > > > > Thanks for finding this bug! > > There was also > https://lore.kernel.org/qemu-devel/20200811225122.17342-2-dmitry.fomic...@wdc.com/ > posted three weeks ago, but, seemingly, it was ignored... sorry if this > bug is already being fixed in a different way, but I think the fix presented > in that patch is worth considering. > > Very best, > Dmitry > > > > > Best regards, > > Maxim Levitsky > > > > > > > > IMHO my patches should be merged first (they are really just fixing a > > > regression and some bugs). If anyone finds it necessary and is capable > > > and willing to fix the insufficiency of BlockLimits, they can do it > > > afterwards with another patch series as an improvement. I can add a > > > patch to remove the clamping in nbd/server.c as a perhaps-temporary > > > measure to address the original issue, if desired. > > > > > > On Sun, 6 Sep 2020 at 20:53, Maxim Levitsky <mlevi...@redhat.com> > > wrote: > > > > On Sun, 2020-09-06 at 19:04 +0800, Tom Yan wrote: > > > > > Maybe you want to add some condition for this: > > > > > https://github.com/qemu/qemu/blob/v5.1.0/nbd/server.c#L659 > > > > > Or not clamp it at all. > > > > > > > > > > On Sun, 6 Sep 2020 at 18:58, Tom Yan <tom.t...@gmail.com> wrote: > > > > > > In commit 867eccfed84f96b54f4a432c510a02c2ce03b430, Levitsky > > appears > > > > > > to have assumed that the only "SCSI Passthrough" is `-device > > > > > > scsi-generic`, while the fact is there's also `-device scsi-block` > > > > > > (passthrough without the sg driver). Unlike `-device scsi-hd`, > > > > > > getting > > > > > > max_sectors is necessary to it (more precisely, hw_max_sectors > > might > > > > > > what matters, but BLKSECTGET reports max_sectors, so). > > > > > > > > > > > > I'm unsure about how qemu-nbd works, but the commit clearly > > wasn't the > > > > > > right approach to fix the original issue it addresses. (It should > > > > > > for > > > > > > example ignore the max_transfer if it will never matter in to it, or > > > > > > overrides it in certain cases; when I glimpsed over > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1647104, I don't see > > how > > > > > > it could be file-posix problem when it is reporting the right thing, > > > > > > regardless of whether "removing" the code helps.) > > > > > > > > > > > > I don't think we want to "mark" `-device scsi-block` as sg either. > > > > > > It > > > > > > will probably bring even more unexpected problems, because they > > are > > > > > > literally different sets of things behind the scene / in the kernel. > > > > > > > > Yes, I did overlook the fact that scsi-block is kind of hybrid > > > > passthough > > device, > > > > doing SG_IO for some things and regular IO for others. > > > > > > > > I don't think that my approach was wrong way to fix the problem, but as > > you found > > > > out, abusing 'bs->sg' hack (which I would be very happy to remove > > completely) > > > > wasn't a good idea. > > > > I actualy was aware of scsi-block and that it does SG_IO but it > > > > was forgotten some where on the way. > > > > > > > > So in summary what the problem is and what I think is the right > > > > solution: > > > > > > > > > > > > Each qemu block driver exposes block limits and assumes that they are > > the same > > > > for two IO interfaces the block driver can expose: > > > > > > > > 1. Regular qemu blk_pread/pwrite alike functions > > > > 2. blk_ioctl (tiny wrapper against SG_IO), supported by posix-raw on > > > > host block devices/sg char devices, and by iscsi > > > > > > > > The problem is that these two interfaces can have different block > > > > limits. > > > > > > > > I don't know about iscsi, but for files, doing regular IO is always > > > > unlimited, > > > > since it passes through the kernel block layer and segemented there on > > > > demand which is faster that doing it in userspace, while SG_IO is passed > > as is > > > > to the underlying SCSI device and lacks this segmentation. > > > > > > > > Regardless of how NBD uses these limits, I think that these limits > > > > should > > be correctly > > > > exposed by the block drivers, and therefore I propose is that each qemu > > block driver > > > > would expose a pair of block limits. > > > > One for the regular block IO, and other for SG_IO. > > > > > > > > Then block driver clients (like scsi devices that you mention, nbd, etc) > > > > can choose which limit to use, depending on which IO api they use. > > > > The scsi-generic, and scsi-block can use the SG_IO limits, > > > > while the rest can use the normal (unlimited for file I/O) limits, > > > > including > > the NBD. > > > > > > > > Best regards, > > > > Maxim Levitsky > > > > > > > > > > On Fri, 4 Sep 2020 at 10:09, Tom Yan <tom.t...@gmail.com> wrote: > > > > > > > sg devices have different major/minor than their corresponding > > > > > > > block devices. Using sysfs to get max segments never really worked > > > > > > > for them. > > > > > > > > > > > > > > Fortunately the sg driver provides an ioctl to get sg_tablesize, > > > > > > > which is apparently equivalent to max segments. > > > > > > > > > > > > > > Signed-off-by: Tom Yan <tom.t...@gmail.com> > > > > > > > --- > > > > > > > block/file-posix.c | 17 ++++++++++++++++- > > > > > > > 1 file changed, 16 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > diff --git a/block/file-posix.c b/block/file-posix.c > > > > > > > index 411a23cf99..9e37594145 100644 > > > > > > > --- a/block/file-posix.c > > > > > > > +++ b/block/file-posix.c > > > > > > > @@ -1178,6 +1178,21 @@ static int > > sg_get_max_transfer_length(int fd) > > > > > > > #endif > > > > > > > } > > > > > > > > > > > > > > +static int sg_get_max_segments(int fd) > > > > > > > +{ > > > > > > > +#ifdef SG_GET_SG_TABLESIZE > > > > > > > + long max_segments = 0; > > > > > > > + > > > > > > > + if (ioctl(fd, SG_GET_SG_TABLESIZE, &max_segments) == 0) { > > > > > > > + return max_segments; > > > > > > > + } else { > > > > > > > + return -errno; > > > > > > > + } > > > > > > > +#else > > > > > > > + return -ENOSYS; > > > > > > > +#endif > > > > > > > +} > > > > > > > + > > > > > > > static int get_max_transfer_length(int fd) > > > > > > > { > > > > > > > #if defined(BLKSECTGET) && defined(BLKSSZGET) > > > > > > > @@ -1268,7 +1283,7 @@ static void > > hdev_refresh_limits(BlockDriverState *bs, Error **errp) > > > > > > > bs->bl.max_transfer = pow2floor(ret); > > > > > > > } > > > > > > > > > > > > > > - ret = get_max_segments(s->fd); > > > > > > > + ret = bs->sg ? sg_get_max_segments(s->fd) : > > get_max_segments(s->fd); > > > > > > > if (ret > 0) { > > > > > > > bs->bl.max_transfer = MIN_NON_ZERO(bs->bl.max_transfer, > > > > > > > ret * > > > > > > > qemu_real_host_page_size); > > > > > > > -- > > > > > > > 2.28.0 > > > > > > > > > > > >