On Thu, 2012-04-19 at 19:20 -0500, Anthony Liguori wrote:
> Hi Nicholas,
>
> On 04/19/2012 06:53 PM, Nicholas A. Bellinger wrote:
> > On Thu, 2012-04-19 at 07:30 -0500, Anthony Liguori wrote:
> >> However, for storage, be it scsi or direct access, the same problem
> >> really doesn't exist.  There isn't an obvious benefit to being in the
> >> kernel.
> >>
> >
> > In the modern Linux v3.x tree, it was decided there is an obvious
> > benefit to fabric drivers developers for going ahead and putting proper
> > SCSI target logic directly into the kernel..  ;)
>
> I'm sure there are obvious benefits to having the kernel have SCSI target
> logic.  I'm not claiming that there isn't.
>
> But there is not an obvious benefit to doing SCSI emulation *for virtual
> machine* guests in the kernel.
>
> Guests are unconditionally hostile.  There is no qualification here.
> Public clouds are the obvious example of this.
>
> TCM runs in the absolute most privileged context possible.  When you're
> dealing with extremely hostile input, it's pretty obvious that you want
> to run it in the lowest privileged context as humanly possible.
>
The argument that a SCSI target for virtual machines is so complex that it
can't possibly be implemented properly in the kernel is a bunch of non-sense.

> Good or bad, QEMU runs as a very unprivileged user confined by SELinux and
> very soon, sandboxed with seccomp.  There's an obvious benefit to putting
> complex code into an environment like this.
>

Being able to identify which virtio-scsi guests can actually connect via
vhost-scsi into individual tcm_vhost endpoints is step one here.  tcm_vhost
(as well as its older sibling tcm_loop) currently both use a virtual
initiator WWPN that is set via configfs before the virtual machine is
attached to the tcm_vhost fabric endpoint + LUNs.

Using vhost-scsi initiator WWPNs to enforce which clients can connect to
individual tcm_vhost endpoints is one option for restricting access.  We are
already doing something similar with iscsi-target and tcm_fc(FCoE) endpoints
to restrict fabric login access from remote SCSI initiator ports..

<SNIP>

> >
> >> So before we get too deep in patches, we need some solid justification
> >> first.
> >>
> >
> > So the potential performance benefit is one thing that will be in favor
> > of vhost-scsi,
>
> Why?  Why would vhost-scsi be any faster than doing target emulation in
> userspace and then doing O_DIRECT with linux-aio to a raw device?
>

Well, using a raw device from userspace there is still going to be an SG-IO
memcpy between user <-> kernel in the current code, yes..?  Being able to
deliver interrupts and SGL memory directly into tcm_vhost cmwq kernel
context for backend device execution, without QEMU userspace involvement or
an extra SGL memcpy, is the perceived performance benefit here.  (A rough
sketch of the userspace submission path in question follows below.)

How much benefit will this actually provide across single port and multi
port tcm_vhost LUNs into a single guest..?  That still remains to be
demonstrated with performance + throughput benchmarks..

> > I think the ability to utilize the underlying TCM fabric
> > and run concurrent ALUA multipath using multiple virtio-scsi LUNs to the
> > same /sys/kernel/config/target/core/$HBA/$DEV/ backend can potentially
> > give us some nice flexibility when dynamically managing paths into the
> > virtio-scsi guest.
>
> The thing is, you can always setup this kind of infrastructure and expose
> a raw block device to userspace and then have QEMU emulate a target and
> turn that into O_DIRECT + linux-aio.
>
> We can also use SG_IO to inject SCSI commands if we really need to.  I'd
> rather we optimize this path.  If nothing else, QEMU should be filtering
> SCSI requests before the kernel sees them.  If something is going to SEGV,
> it's far better that it's QEMU than the kernel.
>

QEMU SG-IO and BSG drivers are fine for tcm_loop SCSI LUNs with QEMU HBA
emulation, but they still aren't tied directly to an individual guest
instance.  That is, the raw devices being passed into SG-IO / BSG are still
locally accessible on the host as SCSI devices without guest access
restrictions, while a tcm_vhost endpoint does not expose any host-accessible
block device and can also restrict access to an authorized list of
virtio-scsi clients.  (See the second sketch below for what the SG_IO path
looks like from userspace.)

> We cannot avoid doing SCSI emulation in QEMU.  SCSI is too fundamental to
> far too many devices.  So the prospect of not having good SCSI emulation
> in QEMU is not realistic.
>

I'm certainly not advocating for a lack of decent SCSI emulation in QEMU.
Being able to support this across all host platforms is something QEMU
certainly needs to take seriously.
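For reference, the userspace path being suggested above looks roughly like
the following: open the raw device O_DIRECT and drive it with linux-aio
(libaio).  This is only a minimal sketch to make the comparison concrete --
the device path and block size are placeholders and error handling is
abbreviated -- and it is not meant to reflect what QEMU's block layer
actually does internally:

/* build with: gcc -o aio-read aio-read.c -laio */
#define _GNU_SOURCE             /* for O_DIRECT */
#include <libaio.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BLK_SZ 4096             /* placeholder I/O size */

int main(void)
{
    io_context_t ctx = 0;
    struct iocb cb, *cbs[1] = { &cb };
    struct io_event ev;
    void *buf;
    int fd;

    /* O_DIRECT bypasses the page cache; the buffer must be aligned. */
    fd = open("/dev/sdX", O_RDONLY | O_DIRECT);    /* placeholder device */
    if (fd < 0) { perror("open"); return 1; }
    if (posix_memalign(&buf, BLK_SZ, BLK_SZ)) return 1;

    if (io_setup(1, &ctx) < 0) { fprintf(stderr, "io_setup failed\n"); return 1; }

    /* Queue one read at offset 0 and wait for its completion. */
    io_prep_pread(&cb, fd, buf, BLK_SZ, 0);
    if (io_submit(ctx, 1, cbs) != 1) { fprintf(stderr, "io_submit failed\n"); return 1; }
    if (io_getevents(ctx, 1, 1, &ev, NULL) != 1) { fprintf(stderr, "io_getevents failed\n"); return 1; }

    printf("completed with res=%ld\n", (long)ev.res);

    io_destroy(ctx);
    free(buf);
    close(fd);
    return 0;
}

The vhost-scsi argument above is that the virtqueue SGLs can instead be
handed to tcm_vhost cmwq context directly, without a round trip through
QEMU userspace for every request.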
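And here is roughly what the SG_IO injection mentioned above looks like
from userspace -- a standard INQUIRY issued through the SG_IO ioctl against
a host-visible device node (the /dev/sg0 path is just a placeholder).  The
point being made above is that this node is the host's own view of the LUN;
the ioctl itself carries no notion of which guest is allowed to reach it:

#include <fcntl.h>
#include <scsi/sg.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    unsigned char cdb[6]    = { 0x12, 0, 0, 0, 96, 0 };  /* INQUIRY, 96 bytes */
    unsigned char data[96]  = { 0 };
    unsigned char sense[32] = { 0 };
    struct sg_io_hdr hdr;
    int fd;

    fd = open("/dev/sg0", O_RDWR);      /* placeholder: host-visible SG node */
    if (fd < 0) { perror("open"); return 1; }

    memset(&hdr, 0, sizeof(hdr));
    hdr.interface_id    = 'S';
    hdr.dxfer_direction = SG_DXFER_FROM_DEV;
    hdr.cmdp            = cdb;
    hdr.cmd_len         = sizeof(cdb);
    hdr.dxferp          = data;
    hdr.dxfer_len       = sizeof(data);
    hdr.sbp             = sense;
    hdr.mx_sb_len       = sizeof(sense);
    hdr.timeout         = 5000;         /* milliseconds */

    /* The raw CDB is handed to the SCSI midlayer for this host device. */
    if (ioctl(fd, SG_IO, &hdr) < 0) { perror("SG_IO"); return 1; }

    printf("vendor: %.8s  product: %.16s\n", data + 8, data + 16);
    close(fd);
    return 0;
}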
Quite the opposite, I think virtio-scsi <-> vhost-scsi is a mechanism by
which it will (eventually) be possible to support T10 DIF protection for
storage blocks directly between the Linux KVM guest <-> host.

In order for QEMU userspace to support this, Linux would need to expose a
method to userspace for issuing DIF-protected CDBs.  This userspace API
currently does not exist AFAIK, so a kernel-level approach is currently the
only option when it comes to supporting end-to-end block protection
information originating from within Linux guests.  (Note this is going to
involve a virtio-scsi spec rev as well.)

--nab