On 8/14/25 19:13, Peter Maydell wrote:
Earlier this year, the Linux kernel's kernel-doc script was rewritten
from the old Perl version into a shiny and hopefully more maintainable
Python version. This commit series updates our copy of this script
to the latest kernel version. I have tested it by comparing the
generated HTML documentation and checking that there are no
unexpected changes.
Luckily we are carrying very few local modifications to the Perl
script, so this is fairly straightforward. The structure of the
patchset is:
 * a minor update to the kerneldoc.py Sphinx extension so it
   will work with both old and new kernel-doc script output
 * a fix to a doc comment markup error that I noticed while comparing
   the HTML output from the two versions of the script
 * import the new Python script, unmodified from the kernel's version
   (conveniently the kernel calls it kernel-doc.py, so it doesn't
   clash with the existing script)
 * make the changes to the new script's library code that correspond
   to the two local QEMU-specific changes we carry
 * tell Sphinx to use the Python version
 * delete the Perl script (I have put a diff of our local mods
   to the Perl script in the commit message of this commit, for
   posterity)
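As an illustration of the first item in the list above, here is a minimal sketch of how a Sphinx extension could accept line-number markers from either script. It assumes (the cover letter does not say this) that the Perl script emits "#define LINENO <n>" and the Python script emits ".. LINENO <n>". The names LINE_REGEX and parse_lineno are made up for this example and are not necessarily what docs/sphinx/kerneldoc.py uses.

```python
import re

# Assumption: the old Perl script writes "#define LINENO 42" and the new
# Python script writes ".. LINENO 42". One pattern accepts either form.
LINE_REGEX = re.compile(r"^(?:\.\. |#define )LINENO ([0-9]+)$")


def parse_lineno(line):
    """Return the number carried by a LINENO marker line, else None."""
    match = LINE_REGEX.match(line)
    return int(match.group(1)) if match else None
```

Accepting both spellings in one pattern would let the extension be updated before the script itself is switched over, which fits the ordering of the series.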
The diffstat looks big, but almost all of it is "import the
kernel's new script that we trust and don't need to review in
detail" and "delete the old script".
My immediate motivation for doing this update is that I noticed
that the submitter of https://gitlab.com/qemu-project/qemu/-/issues/3077
is using a Perl that complains about a construct in the Perl script.
That prompted me to check whether the kernel folks had already fixed
it, and it turned out that they had, by rewriting the whole thing :-)
More generally, if we don't do this update, then we're effectively
going to drift down the same path we did with checkpatch.pl, where
we have our own version that diverges from the kernel's version
and we have to maintain it ourselves.
Yep - for checkpatch.pl that makes sense, since we have more differences
in what we test and we have backported the most pressing parser fixes,
but kerneldoc has no reason to diverge.
Thanks for doing this! For the whole series...
Reviewed-by: Paolo Bonzini <pbonz...@redhat.com>
Paolo
We should also update the Sphinx plugin itself (i.e.
docs/sphinx/kerneldoc.py), but because I did not need to do
that to update the main kernel-doc script, I have left that as
a separate todo item.
Testing
-------
I looked at the HTML output of the old kernel-doc script versus the
new one, using the following diff command, which mechanically excludes
a couple of "same minor change everywhere" diffs, and eyeballing the
resulting ~150 lines of diff.
  diff -w -I '^<div class="kernelindent docutils container">$' \
       -I '^</div>$' -I '^<p><strong>Definition</strong>' \
       -r -u -x searchindex.js \
       build/x86/docs-old-kerneldoc/manual build/x86/docs/manual
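For readers who want the same filtering in a script, here is a rough Python sketch of what the command above does for a single pair of files. It is a simplification: real "diff -I" only suppresses a hunk when every changed line in it matches, whereas this version drops matching lines unconditionally, and it imitates "-w" by deleting all whitespace. The names IGNORED, normalise and filtered_diff are invented for the example.

```python
import difflib
import re

# Patterns copied from the -I options of the diff command.
IGNORED = [
    re.compile(r'^<div class="kernelindent docutils container">$'),
    re.compile(r'^</div>$'),
    re.compile(r'^<p><strong>Definition</strong>'),
]


def normalise(lines):
    """Drop ignorable lines, then remove all whitespace (roughly -w)."""
    kept = []
    for line in lines:
        text = line.rstrip("\n")
        if any(pattern.search(text) for pattern in IGNORED):
            continue
        kept.append("".join(text.split()))
    return kept


def filtered_diff(old_lines, new_lines):
    """Return unified-diff lines for the two inputs after normalisation."""
    return list(difflib.unified_diff(normalise(old_lines),
                                     normalise(new_lines),
                                     "old", "new", lineterm=""))
```

An empty result means the two files differ only in the ways being deliberately ignored. Because the whitespace has been removed, the output is only good for spotting which lines changed, not for reading them.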
The HTML changes are:
(1) some paras now have id attributes, eg:
-<p><strong>Functions operating on arrays of bits</strong></p>
+<p id="functions-operating-on-arrays-of-bits"><strong>Functions operating on arrays of bits</strong></p>
(2) Some extra named <div>s, eg:
+<div class="kernelindent docutils container">
 <p><strong>Parameters</strong></p>
 <dl class="simple">
 <dt><code class="docutils literal notranslate"><span class="pre">long</span> <span class="pre">nr</span></code></dt><dd><p>the bit to set</p>
@@ -144,12 +145,14 @@
 <dt><code class="docutils literal notranslate"><span class="pre">unsigned</span> <span class="pre">long</span> <span class="pre">*addr</span></code></dt><dd><p>the address to start counting from</p>
 </dd>
 </dl>
+</div>
(3) The new version correctly parses the multi-line Return: block for
the memory_translate_iotlb() doc comment. You can see that the
old HTML here had dt/dd markup, and it mis-renders in the HTML at
https://www.qemu.org/docs/master/devel/memory.html#c.memory_translate_iotlb
 <p><strong>Return</strong></p>
-<dl class="simple">
-<dt>On success, return the MemoryRegion containing the <strong>iotlb</strong> translated</dt><dd><p>addr. The MemoryRegion must not be accessed after rcu_read_unlock.
+<p>On success, return the MemoryRegion containing the <strong>iotlb</strong> translated
+addr. The MemoryRegion must not be accessed after rcu_read_unlock.
 On failure, return NULL, setting <strong>errp</strong> with error.</p>
-</dd>
-</dl>
+</div>
(4) "Definition" sections now get output with a trailing colon:
-<p><strong>Definition</strong></p>
+<div class="kernelindent docutils container">
+<p><strong>Definition</strong>:</p>
This seems like it might be a bug in kernel-doc since the Parameters,
Return, etc sections don't get the trailing colon. I don't think it's
important enough to worry about.
thanks
-- PMM
Peter Maydell (8):
  docs/sphinx/kerneldoc.py: Handle new LINENO syntax
  tests/qtest/libqtest.h: Remove stray space from doc comment
  scripts: Import Python kerneldoc from Linux kernel
  scripts/kernel-doc: strip QEMU_ from function definitions
  scripts/kernel-doc: tweak for QEMU coding standards
  scripts/kerneldoc: Switch to the Python kernel-doc script
  scripts/kernel-doc: Delete the old Perl kernel-doc script
  MAINTAINERS: Put kernel-doc under the "docs build machinery" section
 MAINTAINERS                     |    2 +
 docs/conf.py                    |    4 +-
 docs/sphinx/kerneldoc.py        |    7 +-
 tests/qtest/libqtest.h          |    2 +-
 .editorconfig                   |    2 +-
 scripts/kernel-doc              | 2442 -------------------------------
 scripts/kernel-doc.py           |  325 ++++
 scripts/lib/kdoc/kdoc_files.py  |  291 ++++
 scripts/lib/kdoc/kdoc_item.py   |   42 +
 scripts/lib/kdoc/kdoc_output.py |  749 ++++++++++
 scripts/lib/kdoc/kdoc_parser.py | 1670 +++++++++++++++++++++
 scripts/lib/kdoc/kdoc_re.py     |  270 ++++
 12 files changed, 3355 insertions(+), 2451 deletions(-)
 delete mode 100755 scripts/kernel-doc
 create mode 100755 scripts/kernel-doc.py
 create mode 100644 scripts/lib/kdoc/kdoc_files.py
 create mode 100644 scripts/lib/kdoc/kdoc_item.py
 create mode 100644 scripts/lib/kdoc/kdoc_output.py
 create mode 100644 scripts/lib/kdoc/kdoc_parser.py
 create mode 100644 scripts/lib/kdoc/kdoc_re.py