On Mon, May 05, 2025 at 01:05:54PM -0300, Jason Gunthorpe wrote: > On Mon, May 05, 2025 at 09:03:28AM -0700, Nicolin Chen wrote: > > On Mon, May 05, 2025 at 12:55:05PM -0300, Jason Gunthorpe wrote: > > > On Mon, May 05, 2025 at 08:44:22AM -0700, Nicolin Chen wrote: > > > > On Mon, May 05, 2025 at 12:01:09PM -0300, Jason Gunthorpe wrote: > > > > > On Mon, Apr 28, 2025 at 10:41:45AM -0700, Nicolin Chen wrote: > > > > > > > I'm uncertain, but perhaps pr_warn_ratelimited() would be a better > > > > > > > alternative to WARN_ON() here? WARN_ON_ONCE() generates warning > > > > > > > messages > > > > > > > with kernel call traces in the kernel messages, which might lead > > > > > > > users > > > > > > > to believe that something serious has happened in the kernel. > > > > > > > > > > > > We already have similar practice, e.g. iommufd_hwpt_nested_alloc. > > > > > > > > > > > > In my review, a WARN_ON/WARN_ON_ONCE means there is a kernel bug, > > > > > > which shouldn't occur in the first place and isn't something that > > > > > > > > > > Right, so it should never happen from any ioctl path and syzkaller > > > > > should never trigger it based on system call randomization > > > > > > > > > > Is that what this achieves? > > > > > > > > The functions would be still used in the kernel path. So, I think > > > > we need to retain these warnings for that. But given that an ioctl > > > > could trigger a series of WARN_ONs, WARN_ON_ONCE is something that > > > > wouldn't bother user space a lot while it provides the kernel path > > > > enough info to debug. > > > > > > No, it does bother userspace, we must not have ioctl triggerable > > > WARN_ON at all. > > > > You mean we have to eliminate any WARN_ON in a call path of an > > ioctl? > > Yes, not one that derives from user information. You can WARN_ON if > internal kernel structures are corrupted but not if user ioctl > arguments are bad. > > > We can drop them, just that would mute any kernel bug. > > That maybe the right answer > > > Btw, IIRC, the destroy ioctl could trigger some WARN_ON in the > > remove() when the object's refcount isn't correctly decreased. > > Should that be a problem too? > > That is kernel data structures being corrupted.
Oh, then these two WARN_ONs in the unpin() will be only triggered in the destroy path when the vcmdq kernel structure is corrupted, because the two inputs are not given directly by a destroy ioctl. In this case, I think it's similar, so we should retain them, as they were? Thanks Nicolin