On Wed, May 30, 2018 at 05:15:19PM +0200, Halil Pasic wrote:
>
>
> On 05/30/2018 06:47 AM, Michael S. Tsirkin wrote:
> > On Thu, May 24, 2018 at 12:44:53PM +0800, Peter Xu wrote:
> > > There are many error_report()s that can be used in frequently called
> > > functions, especially on IO paths. That can be unideal in that
> > > malicious guest can try to trigger the error tons of times which might
> > > use up the log space on the host (e.g., libvirt can capture the stderr
> > > of QEMU and put it persistently onto disk).
> >
> > I think the problem is real enough but I think the API
> > isn't great as it stresses the mechanism. Which fundamentally does
> > not matter - we can print once or 10 times, or whatever.
> >
> > What happens here is a guest bug as opposed to hypervisor
> > bug. So I think a better name would be guest_error.
>
> I don't agree with your argument against the name report_once
> Michael. In my reading the commit message describes one of the use
> cases for which the infrastructure introduced by this patch is
> supposed to be a good fit. But report_once is not restricted
> to this example.

All I'm saying is that we should distinguish between guest and host
errors at code level.

> In my previous life in the userspace I had to debug problems
> where the original error message got log-rotated away because of an
> onslaught of error messages that were a consequence of the original
> one, and not very helpful.
>
> IMHO raising the issue of guest_error is a very sane thing to do,
> but it is a different problem. I think guest_error is about how and
> to whom the error is to be reported. IMHO report the error to the
> ones that are affected by it and to the ones that can do something
> about it (e.g. fix it) is a good rule of thumb. The latter may be
> different for hypervisor and for guest bugs.
>
> In my understanding this is really about spamming the log problem.
> Of course one can try to solve/mitigate the problem at different
> levels. It could be declared
> 1) a problem to be solved in the logging library more or less
> transparently
> 2) a problem to be solved by the environment and its admin (e.g.
> log aggregation, filtering, and rotation)
> 3) a problem that the client code of the logging library has to
> explicitly deal with
>
> The once and rate_limited are 3).
>
> To sum it up guest error or not and once or not are orthogonal
> problems in my view.
>
> Regards,
> Halil

Right. But as long as we are changing this code, I'd like to see
guest errors reported in a way that makes it easy to distinguish
them from host errors.

> >
> > Internally we can still have something similar to this
> > mechanism.
> >
> > Another idea is to reset these guest error counters on guest reset.
> > Device reset too? I'm not 100% sure as guest can trigger device resets.
> >
> > > [..]
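
To make the two orthogonal concerns above concrete (who caused the
error, and how often it gets printed), here is a minimal sketch of how
they might be combined. It is not the API from the patch under review:
GuestErrorLimiter, guest_error_report() and guest_error_reset() are
hypothetical names invented for illustration, and the "guest error: "
prefix is an assumption about how guest errors could be tagged.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-call-site state; not an existing QEMU type. */
typedef struct {
    unsigned count;   /* messages emitted since the last reset */
    unsigned limit;   /* maximum messages allowed between resets */
} GuestErrorLimiter;

/*
 * Print a guest-attributed error unless the limit is exhausted.
 * The prefix keeps guest errors distinguishable from host errors.
 * Returns true if the message was actually printed.
 */
static bool guest_error_report(GuestErrorLimiter *l, FILE *out,
                               const char *fmt, ...)
{
    va_list ap;

    assert(l != NULL && out != NULL);
    if (l->count >= l->limit) {
        return false;           /* suppressed: budget used up */
    }
    l->count++;

    fputs("guest error: ", out);
    va_start(ap, fmt);
    vfprintf(out, fmt, ap);
    va_end(ap);
    fputc('\n', out);
    return true;
}

/* Would be called on guest reset so that a fresh boot can report again. */
static void guest_error_reset(GuestErrorLimiter *l)
{
    assert(l != NULL);
    l->count = 0;
}
```

With limit set to 1 this behaves like "report once"; a larger limit
gives "print 10 times, or whatever". Whether guest_error_reset() should
also run on device reset is left open here, for the reason given above:
the guest can trigger device resets itself and so refill its own budget.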