On Thu, Sep 11, 2025 at 02:10:15PM +0200, Markus Armbruster wrote:
> I fell into another rabbit hole: logging.  Here's my trip report.
> 
> We have:
> 
> * util/error-report.c
> 
>   - To stderr (or current monitor, but that's not relevant here)
> 
>   - Format:
>         [TIME ][GUEST-NAME ]PROGNAME:LOCATION: (|warning: |info: )MESSAGE
>     where the optional parts are controlled by -msg
> 
>   - Maintained: see MAINTAINERS "Error reporting"
> 
> * util/log.c
> 
>   - Control:
> 
>     · Some logging is always on
> 
>     · Some logging is off by default, enabled per "log item" with -d
> 
>     · Some can additionally be limited to address ranges with -dfilter
> 
>   - To the file given by -D (default stderr), or with -d tid to files
>     with names derived from the pattern given by -D (%d is replaced by
>     the thread ID)
> 
>   - Format:
>         [TIME ]MESSAGE
>     where the optional part is controlled by -msg timestamp (-msg
>     guest-name has no effect here)
> 
>   - Not covered by MAINTAINERS
> 
> * Tracing
> 
>   - Control: off by default, enabled with -trace per trace point (and
>     also -d trace: --- I find that duplication bizarre)
> 
>   - Run time control with HMP commands trace-event, trace-file, QMP
>     commands trace-event-get-state, trace-event-set-state
> 
>   - To whatever backends are compiled in:
> 
>     · "log" wraps around util/log.c's always-on logging
> 
>     · "simple" logs to a binary file given by -trace file=
> 
>     · "syslog" logs to POSIX syslog
> 
>     · "ftrace", "dtrace", "ust" integrate with Linux ftrace, DTrace /
>       SystemTap, and LTTng userspace tracing, respectively
> 
>   - Maintained: see MAINTAINERS "Traicing"
> 
> * Unstructured ad hoc printing to stdout or stderr
> 
> I think that's it.
> 
> Observations:
> 
> * Mixing logs with different formats in one destination makes them
>   needlessly hard to read and parse.
> 
>   Daniel's "util: sync error_report & qemu_log output more closely"
>   series makes util/error-report.c and util/log.c use a common message
>   prefix
> 
>       [TIME ][GUEST-NAME ]PROGNAME\(THREAD-NAME\):...
> 
>   where the optional parts are controlled by -msg.  Continues prior work
>   in that direction: commit 012842c0755 (log: make '-msg timestamp=on'
>   apply to all qemu_log usage).
> 
> * Spreading logs over multiple destinations can make the order hard to
>   determine.
> 
>   -D splits util/log.c from util/error-report.c.
> 
>   -d tid further splits per thread.
> 
>   Splitting per thread feels fairly redundant once the prefix includes
>   the thread name.  Thread names is easier to make sense of than numeric
>   thread IDs in the split filenames.

Also gives you a chronological record of the log across all threads
if we have it in one file avoiding cross-correlation needs.

> 
>   To keep the complete log in one place, avoid -D, and enable trace
>   backend "log".
> 
> * In my view, util/log.c consists of three layers.  Form bottom to top:
> 
>   * Logging messages: qemu_log()
> 
>   * Control of what to log ("log items"): qemu_log_mask()
> 
>   * Address filtering: qemu_log_in_addr_range()
> 
>   The bottom layer is also used by trace backend "log".
> 
> * Tracing vs. "log items"
> 
>   Tracing and "log items" both provide opt-in logging.
> 
>   Why do we have two?  Feels like an accident to me.
> 
>   When to use which?  I guess people pick whatever they're familiar
>   with.
> 
>   Would we be better off with just one?  I wish...
> 
>   Which one?  Clearly tracing, because tracing backends provide plenty
>   of useful functionality "log items" lack.
> 
>   "Log items" support per-thread log files.  With tracing, you could
>   instead split by thread names once Daniel's series is in.
> 
>   Address range filtering is only used with "log items".  It could just
>   as easily be used with tracing.

There is another trap door (or perhaps better described as back door).

qemu_log_trylock() returns the raw FILE * handle, which the callers
will directly write to. This is used instead of qemu_log() in a number
of places, and often has large data dumps written to it. This bypasses
any prefixes like timestamps, thread id, etc.  This direct FILE usage
would not map well over to tracing.

Also tracing wouldn't  allow qemu-user to provide its '-strace' CLI
arg, because we can't guarantee that QEMU is built with a trace
backend that will print to the console.

This makes it challenging (impossible) to entirely replace qemu_log
with tracing.


The other thing I find slightly wierd is 'info_report'. error vs
warning is clear - one is fatal, one is not, but the warning vs
info distinction is far from clearcut.

So warn_report vs info_report, also involves a choice of qemu_log(),
error_[v]printf(), fprintf(stderr, ...), g_printerr,  g_message,
g_critical, g_warning, g_info, and g_debug

... all of them are used somewhere in QEMU :-)

I feel like qemu_log ought to be obsoleted by glib's logging APIs,
but again we have the problem with the the raw FILE* usage for
large data dumps. Some glib APIs we use will already be using
glib's logging APIs, and we wire upa  custom backend for glib
to call into  info_report/warn_report/error_report, with glib's
DEBUG level squashed into info_report

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Reply via email to