Hello Ben, hello Christoph,

On Wed, 2025-05-14 at 12:23 -0400, Benjamin Marzinski wrote:
> On Tue, May 13, 2025 at 09:57:51PM -0700, Christoph Hellwig wrote:
> >
> > SG_IO is fine and the only way for SCSI passthrough.  But doing
> > SCSI passthrough through md-multipath just doesn't work.  SCSI
> > isn't built for layering, and ALUA and it's vendor-specific
> > variants and alternatives certainly isn't.  If you try that you're
> > playing with fire and is not chance of ever moving properly.
>
> Could you be a bit more specific. All multipath is doing here is
> forwarding the ioctls to an underlying scsi device, and passing back
> up the result. Admittedly, it doesn't always make sense to pass the
> ioctl on from the multipath device to just one scsi device.
> Persistent Reservations are perfect example of this, and that's why
> QEMU doesn't use DMs ioctl passthrough code to handle them.

I'd go one step further. Christoph is right to say that what we're
currently doing in qemu (passing through every command except
PRIN/PROUT to a multipath device) is a dangerous thing to do.

Passthrough from a dm-multipath device to a SCSI device makes sense
only for a small subset of the SCSI command set, namely the regular IO
commands such as the various READ and WRITE variants and the occasional
UNMAP. In practice, however, these commands account for 99.y% of the
commands actually sent to devices. The fact that customers have been
running these setups in large deployments over many years suggests
that, if other commands ever get passed through to member devices, it
has rarely had fatal consequences.

Nobody would seriously consider sending ALUA commands to the multipath
devices. TUR and REQUEST SENSE are other examples of commands that
can't reasonably be passed through to random member devices of a
multipath map. There are certainly many more examples. I guess it would
make sense to review the command set and add some filtering in the qemu
passthrough code.

AFAIK the only commands that we really need to pass through (besides
the standard ones) are the reservation commands, which get special
handling by qemu anyway. @Ben, @Kevin, are you aware of anything else?

So: admittedly we're using a framework for passing through any command,
where we actually need to pass through only a tiny subset of commands.
Thinking about it this way, it really doesn't look like the perfect
tool for the job, and we may want to look into a different approach for
the future.

> Also, when you have ALUA setups, not all the scsi devices are equal.
> But multipath isn't naievely assuming that they are. It's only
> passing ioctls to the highest priority activated paths, just like it
> does for IO, and multipath is in charge of handling explicit alua
> devices. This hasn't proved to be problematic in practice.
>
> The reality of the situation is that customers have been using this
> for a while, and the only issue that they run into is that multipath
> can't tell when a SG_IO has failed due to a retryable error.
> Currently, they're left with waiting for multipathd's preemptive path
> checking to fail the path so they can retry down a new one. The
> purpose of this patchset and Martin's previous one is to handle this
> problem. If there are unavoidable critical problems that you see with
> this setup, it would be really helpful to know what they are.

I'd also be interested in understanding this better. As noted above,
I'm aware that passing through everything is dangerous and wrong in
principle. But in practice, we haven't observed anything serious except
(as Ben already said) the failure to do path failover in the SG_IO code
path, which both this patch set and my earlier set are intended to fix.

While I am open to looking for better alternatives, I still hope that
we can agree on a short/mid-term solution that would allow us to serve
our customers who currently use SCSI passthrough setups. That would
benefit not just us (the enterprise distros); it would also help us
fund upstream contributions.

Regards
Martin