On Fri, May 02, 2025 at 05:48:02PM +0100, Peter Maydell wrote:
> On Wed, 2 Apr 2025 at 14:28, Daniel P. Berrangé <berra...@redhat.com> wrote:
> > On Wed, Apr 02, 2025 at 09:33:16AM +0000, Bernhard Beschow wrote:
> > > On 31 March 2025 09:18:05 UTC, "Daniel P. Berrangé"
> > > <berra...@redhat.com> wrote:
> > > >General conceptual question ..... I've never understood what the
> > > >dividing line is between use of 'qemu_log_mask' and trace points.
> > >
> > > I *think* it's the perspective: If you want to see any issues, regardless
> > > of which device, use the -l option, i.e. qemu_log_mask(). If, however,
> > > you want to see what a particular device does, use tracepoints.
> >
> > I guess I'd say that the latter ought to be capable of satisfying the
> > former use case too, given a suitable trace point selection. If it
> > can't, then perhaps that's telling us the way we select trace points
> > is insufficiently expressive ?
>
> Yeah; you can turn on and off a tracepoint, and you can select
> them by wildcard, but there's no categorization of them
> (into eg "this is basically the equivalent of a debug printf"
> vs "this is something that is a guest error you probably
> want to know about").
I wonder if there's any value in enhancing the trace support to let us
tag certain probes with some kind of "severity"/"level" concept, such
that when the 'log' trace backend is enabled we can wire them through
to the logging backend with useful categorization ?

> There's also no way to say "turn on
> this logging with one switch, and it will print multiple lines
> or more than one thing" (at least not in the spirit of what
> the tracepoint API expects; you could have a trace_in_asm
> tracepoint that took a "%s" and output whatever you liked as
> the string, of course). And debug-logging is more documented:
> '-d help' shows what you can turn on and off and has at least
> a brief description of what it is you're getting.

IMHO the documentation benefit of '-d help' is somewhat inconsistent.
I tried a crude grep for different usage of logging

      2 CPU_LOG_EXEC
    122 CPU_LOG_INT
    103 CPU_LOG_MMU
      6 CPU_LOG_PAGE
      1 CPU_LOG_PLUGIN
      8 CPU_LOG_RESET
     10 CPU_LOG_TB_IN_ASM
      4 CPU_LOG_TB_OP
   1715 LOG_GUEST_ERROR
      4 LOG_INVALID_MEM
    753 LOG_UNIMP

So the overwhelming majority of usage is accumulated under two
"catch all" categories - "guest error" and "unimplemented" with
no ability to filter - it's all or nothing.

We ought to be able to do a better job at documenting the trace
events than we do today, given we have them in an easily extractable
format and can associate them with particular files easily.

The 'qemu-trace-stap list' command can list all available probes
in a binary, but it only works for the systemtap backend. We ought
to do better with other backends.

> For tracepoints
> you're hoping that the name is vaguely descriptive and also
> hoping that the device/subsystem/etc named its tracepoints in
> a way that lets you usefully wildcard them.

Yep, we're somewhat inconsistent in our prefix naming conventions.
It would be nice to try to enforce some greater standard there, but
it's hard to do programmatically.
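
A per-category count like the one in the table above can be approximated
with a short script. This is only a rough sketch, not the command used
for those numbers: the function name count_log_masks is made up for
illustration, it assumes it is pointed at a QEMU source checkout, and it
counts every mention of a mask constant in *.c files rather than only
qemu_log_mask() call sites, so the totals will not match exactly.

```python
import re
import sys
from collections import Counter
from pathlib import Path

# Matches the log mask constants from the table above, e.g. CPU_LOG_INT
# or LOG_UNIMP. The \b anchors avoid matching inside longer identifiers.
MASK_RE = re.compile(
    r"\b(?:CPU_LOG_[A-Z_]+|LOG_GUEST_ERROR|LOG_INVALID_MEM|LOG_UNIMP)\b"
)


def count_log_masks(text: str) -> Counter:
    """Count occurrences of each log mask constant in a blob of source text."""
    return Counter(MASK_RE.findall(text))


if __name__ == "__main__":
    # Hypothetical usage: python count_masks.py /path/to/qemu
    root = Path(sys.argv[1] if len(sys.argv) > 1 else ".")
    totals = Counter()
    for path in root.rglob("*.c"):
        totals += count_log_masks(path.read_text(errors="replace"))
    for name, n in sorted(totals.items()):
        print(f"{n:7d} {name}")
```

Splitting the same totals by directory would show which subsystems lean
most heavily on the two catch-all categories.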
> Also, the qemu_log() logging assumes "we're sending text to
> a logfile, we can format e.g. register dumps and disassembly
> as arbitrary laid out plaintext". That's fine if your tracepoint
> backend is also "we just send the text to a logfile/etc", but
> I don't know if all of the tracepoint backends would be so happy
> with that.

I think emitting multi-line blocks of text probably ought to be
considered out of scope for tracing. The 'log' backend is the only
one where that would be a reasonable semantic match.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|