The heading elements in our website contain "id" attributes, but currently to find them you have to look at the page source, whereas in the generated HTML for the manual we have e.g.:
  <a class="copiable-link" href="#index-mabi-1"> ¶</a>

which shows up nicely in the browser in e.g.
  https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
as a pilcrow character when you hover over the link, which you can then
use to copy the URL to the clipboard.  It's *very* helpful to have
easily shareable links to locations within pages.

The attached patch adds a postprocessing step to "bin" that turns e.g.

  <h1 id="ID">TEXT</h1>

into:

  <h1 id="ID"><a href="#ID">TEXT</a></h1>

which makes it very easy to copy links in the generated website.
I didn't bother adding any CSS.

I've never managed to build MetaHTML and have always just crossed my
fingers and hoped when making edits to the GCC website; bin/preprocess
just errors out for me immediately due to not finding mhc.

So this patch as written replaces the invocation of mhc with an
invocation of the Python script, which of course drops the features
that mhc provides.

I've uploaded a build of the website with this to:
  https://dmalcolm.fedorapeople.org/gcc/2025-01-15/htdocs/

You can see e.g. the easily clickable heading ids here:
  https://dmalcolm.fedorapeople.org/gcc/2025-01-15/htdocs/gcc-15/changes.html
compared to:
  https://gcc.gnu.org/gcc-15/porting_to.html
and, for now, the loss of the mhc stuff here:
  https://dmalcolm.fedorapeople.org/gcc/2025-01-15/htdocs/
compared to:
  https://gcc.gnu.org/

Gerald: if you have mhc working, can you please try adjusting the
script in bin/ so that it runs *both* mhc and the Python script?

Thoughts?
Dave

---
 bin/preprocess      | 13 +++----------
 bin/process_html.py | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+), 10 deletions(-)
 create mode 100644 bin/process_html.py

diff --git a/bin/preprocess b/bin/preprocess
index 2e474b0c..c64bc97b 100755
--- a/bin/preprocess
+++ b/bin/preprocess
@@ -33,8 +33,6 @@
 #
 # By Gerald Pfeifer <pfei...@dbai.tuwien.ac.at> 1999-12-29.
 
-MHC=${MHC-/usr/local/bin/mhc}
-
 SOURCETREE=${SOURCETREE-/www/gcc/htdocs-preformatted}
 DESTTREE=${DESTTREE-/www/gcc/htdocs}
 
@@ -114,9 +112,9 @@ process_html_file()
         printf '<set-var MHTML::INCLUDE-PREFIX="%s">\n' `pwd` >> $TMPDIR/input
         cat $f >> $TMPDIR/input
 
-        if ! ${MHC} $TMPDIR/input > $TMPDIR/output.raw; then
-            echo "${MHC} failed; aborting."
-            exit 1
+        if ! python3 $SOURCETREE/bin/process_html.py $TMPDIR/input $TMPDIR/output.raw; then
+            echo "bin/process_html.py failed; aborting."
+            exit 1
         fi
 
         # Use sed to work around makeinfo 4.7 brokenness.
@@ -227,11 +225,6 @@ shift `expr ${OPTIND} - 1`
 
 ## Various safety checks.
 
-if ! ${MHC} --version >/dev/null; then
-    echo "Something does not look right with \"${MHC}\"; aborting."
-    exit 1
-fi
-
 if [ ! -d $SOURCETREE ]; then
     echo "Source tree \"$SOURCETREE\" does not exist."
     exit 1
diff --git a/bin/process_html.py b/bin/process_html.py
new file mode 100644
index 00000000..8a36a587
--- /dev/null
+++ b/bin/process_html.py
@@ -0,0 +1,32 @@
+#! /usr/bin/python3
+#
+# Python 3 script to preprocess .html files below htdocs
+
+import re
+import sys
+
+input_path = sys.argv[1]
+output_path = sys.argv[2]
+
+with open(input_path) as f_in:
+    with open(output_path, 'w') as f_out:
+        for line in f_in:
+            # Convert from e.g.
+            #  <h1 id="ID">TEXT</h1>
+            # to:
+            #  <h1 id="ID"><a href="#ID">TEXT</a></h1>
+            for element_name in {'h1', 'h2', 'h3', 'h4'}:
+                pattern = \
+                    (r'<'
+                     + element_name
+                     + r' id="(.+)">(.+)</'
+                     + element_name
+                     + '>')
+                replacement = \
+                    (r'<'
+                     + element_name
+                     + r' id="\1"><a href="#\1">\2</a></'
+                     + element_name
+                     + '>')
+                line = re.sub(pattern, replacement, line)
+            f_out.write(line)
-- 
2.46.0
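
A note on the pattern in bin/process_html.py: both groups use a greedy
`(.+)`, so a line holding two headings of the same level would likely be
matched as one oversized heading. The following is only a sketch of a
tighter variant, not part of the patch; the names `HEADING_RE` and
`add_heading_links` are made up for illustration. It assumes, as the
patch does, that each heading sits on a single line and that "id" is the
only attribute on the opening tag.

```python
import re

# Hypothetical alternative to the per-element loop in the patch:
# one pattern for h1..h4, with the element name captured and reused
# via a backreference, a negated character class for the id value,
# and a non-greedy match for the heading text.
HEADING_RE = re.compile(r'<(h[1-4]) id="([^"]+)">(.+?)</\1>')


def add_heading_links(line):
    """Wrap the text of each <hN id="ID">TEXT</hN> on the line in a
    self-link; lines without such a heading are returned unchanged."""
    return HEADING_RE.sub(r'<\1 id="\2"><a href="#\2">\3</a></\1>', line)


if __name__ == '__main__':
    print(add_heading_links('<h1 id="ID">TEXT</h1>'))
```

Headings that already contain a link, carry extra attributes, or span
several lines are not handled by this sketch any more than by the
original loop.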