On 2025/08/09 17:30, Markus Armbruster wrote:
Akihiko Odaki <od...@rsg.ci.i.u-tokyo.ac.jp> writes:
On 2025/08/08 17:08, Markus Armbruster wrote:
We added @error_warn some two years ago in commit 3ffef1a55ca (error:
add global &error_warn destination). It has multiple issues:
* error.h's big comment was not updated for it.
* Function contracts were not updated for it.
* ERRP_GUARD() is unaware of @error_warn, and fails to mask it from
error_prepend() and such. These crash on @error_warn, as pointed
out by Akihiko Odaki.
All fixable. However, after more than two years, we had just of 15
uses, of which the last few patches removed eight as unclean or
otherwise undesirable. I didn't look closely enough at the remaining
seven to decide whether they are desirable or not.
I don't think this feature earns its keep. Drop it.
I want to note that the following patch series temporarily use &error_warn
during its conversion to add errp parameters:
https://lore.kernel.org/qemu-devel/20250808-propagate_tpm_error-v10-0-3e81a1d41...@redhat.com/
("[PATCH v10 00/27] migration: propagate vTPM errors using Error objects")
I think this series needs to be rebased on top of the migration change.
Thanks for the heads-up.
I'm not sure if seven uses are insufficient to keep it.
I also have a general impression that perhaps a special error destination for
error_report_err() is more useful. Today, there are many functions use Error,
but there are also functions that still follow old error handling patterns.
When legacy functions call functions with Error, a common pattern is to use
error_report_err() and return -1.
You mean like &error_fatal less the exit(1)?
Yes, that's what I meant.
"[PATCH 01/12] monitor: Clean up HMP gdbserver error reporting" and "[PATCH 09/12]
ui/pixman: Consistent error handling in qemu_pixman_shareable_free()" are examples that will
benefit from error_report_err() as an error destination. The migration patch series I mentioned
earlier can also use one.
Perhaps it is nicer if there is an infrastructure shared by the special
destinations. In particular, we can have common solutions for the three
problems you pointed out:
* error.h's big comment was not updated for it.
* Function contracts were not updated for it.
For these two problems, they can refer to "special error destinations" instead
of listing them, and delegate explanations of them to corresponding ones.
* ERRP_GUARD() is unaware of @error_warn, and fails to mask it from
error_prepend() and such. These crash on @error_warn, as pointed
out by Akihiko Odaki.
For this problem, Error can tell that it is a special destination by leaving msg
NULL, for example. ERRP_GUARD() then rewrites errp it is not &error_abort and
msg is NULL.
As I wrote, the defects are all fixable. However, there has been so
little use of &error_warn, and so much of it has been unclean or
otherwise undesirable. It's obviously prone to misuse. I think we're
better off without it.
See also the memo "Abuse of warnings for unhandled errors and
programming errors" I posted yesterday.
It is insightful. The cover letter can have a reference to it and this
patch can have similar description.
However I still have a few counterarguments:
A reason of the &error_warn usage may be caused by the complexity
to deal with unhandled errors and programming errors. If so, adding
mechanisms to ease that may naturally sufficiently reduce the wrong
usage added in the future.
The "&error_fatal without exit(1)" may be good enough for unhandled
errors. For example, "[PATCH v10 00/27] migration: propagate vTPM errors
using Error objects" would not have used &error_warn if there is the
"&error_fatal without exit(1)".
There are also warn_report*() functions which also have the same
problem. A comprehensive solution needs to deal with them all. Removing
all of them will do but may not be practical. Another possibility is
that to write a documentation telling warning should be avoided for
unhandled/programming errors and let all refer to it.
I agree that there is little valid usage of &error_warn and removing it
may not cause a problem at all, but it is still nice if there is a
reasoning for the removal, ideally addressing these counterarguments as
well.