[
https://issues.apache.org/jira/browse/SOLR-10934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated SOLR-10934:
----------------------------
Attachment: SOLR-10934.patch
bq. What we might want to consider, is refactoring our build.xml, so that the
same <asciidoctor:convert/> task options used to generate the PDF could also be
used to generate a bare bones version of the html-site – ie: not using jekyll,
just using raw asciidoctor with the "html5" output option. Then we could (in
theory) run the same HTML link checking code we currently have against that
output dir – just for the purpose of checking the links, not with any plan to
ever publish it.
I'm attaching a patch that takes this approach -- i think it works pretty well.
Unfortunately refactoring just the build.xml file proved to be insufficient to
be able to re-use the existing {{<asciidoctor:convert/>}} in a macro because of
how the underlying Task class works -- it has some hard assumptions that XML
element attributes like "sourceDocumentName" are not specified at all, and a
macro ends up setting them (to the empty string) because of ant property
expansion -- but i was able to deal with that by adding our own little AntTask
subclass into the tools jar.
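To make the empty-string problem concrete, here is a minimal, self-contained sketch of the general idea. It does not use the real Ant or asciidoctor classes and is not the code in the patch: {{ConvertTask}} and {{LenientConvertTask}} are hypothetical stand-ins, and the assumption that the subclass simply treats an empty attribute value as "not set" is mine, for illustration only.

```java
/**
 * Hypothetical stand-in for a third-party task class. It assumes its setter
 * is only ever called when the build file really specifies the attribute.
 */
class ConvertTask {
    // null means "no single document was requested"
    protected String sourceDocumentName;

    public void setSourceDocumentName(String name) {
        this.sourceDocumentName = name;
    }

    public String describe() {
        return sourceDocumentName == null
            ? "convert all documents"
            : "convert only '" + sourceDocumentName + "'";
    }
}

/**
 * Hypothetical subclass: a macro that always passes the attribute (expanding
 * to "" when the caller did not supply it) no longer confuses the task,
 * because the empty string is mapped back to "not set".
 */
class LenientConvertTask extends ConvertTask {
    @Override
    public void setSourceDocumentName(String name) {
        super.setSourceDocumentName(name == null || name.isEmpty() ? null : name);
    }
}
```

With the stand-in base class, setting the attribute to "" is taken as a request for a single document with an empty name; with the subclass, the same call falls back to converting everything.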
i also did a little more refactoring of the build.xml file so building both
the PDF & jekyll site via {{ant}} wouldn't waste time redundantly also
building & validating the bare-bones HTML version. (unfortunately if you
explicitly run {{ant build-pdf build-site}} this still happens, but hey: baby
steps)
like the previous patch, this includes some "nocommit" annotated intentional
anchor/link errors in the {{*.adoc}} files. If you apply the patch as is, and
run {{ant}} or {{ant build-pdf}} or {{ant build-site}} you'll get all the same
validation errors that we want to see happen with this kind of bad content. If
you revert the {{solr/solr-ref-guide/src}} changes then everything will start
building happily.
what do folks think of this approach?
> create a link+anchor checker for the ref-guide PDF using PDFBox
> ---------------------------------------------------------------
>
> Key: SOLR-10934
> URL: https://issues.apache.org/jira/browse/SOLR-10934
> Project: Solr
> Issue Type: Sub-task
> Security Level: Public(Default Security Level. Issues are Public)
> Components: documentation
> Reporter: Hoss Man
> Priority: Major
> Attachments: SOLR-10934.patch, SOLR-10934.patch
>
>
> We currently have CheckLinksAndAnchors.java which is automatically run
> against the ref-guide HTML as part of the build to use JSoup to find bad
> links/anchors that asciidoctor doesn't complain about -- but not everyone
> does/can build the HTML version of the ref-guide since it requires
> manually installing jekyll.
> The PDF build only requires things installed by ivy (via JRuby) and we
> already have some PDFBox based code in ReducePDFSize.java that operates on
> this PDF every time it's run -- so if we can find a way to do similar checks
> using the PDFBox API we could catch these broken links faster.
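For anyone who hasn't looked at this kind of check before, below is a rough sketch of what "finding bad links/anchors" in a directory of generated HTML amounts to. It is not CheckLinksAndAnchors.java: the real tool uses JSoup, while this sketch uses only JDK regexes so it stays self-contained, and the class name {{LinkAnchorSketch}} and its {{check}} method are hypothetical. The regexes are deliberately naive (for example, they would also count a {{data-id}} attribute as an id), so treat it as an illustration only.

```java
import java.util.*;
import java.util.regex.*;

final class LinkAnchorSketch {
    // Naive attribute patterns; a real checker should use an HTML parser.
    private static final Pattern ID = Pattern.compile("\\bid\\s*=\\s*\"([^\"]+)\"");
    private static final Pattern HREF = Pattern.compile("\\bhref\\s*=\\s*\"([^\"]+)\"");

    /** Takes page name -> HTML source, returns one message per problem found. */
    static List<String> check(Map<String, String> pages) {
        Map<String, String> sorted = new TreeMap<>(pages); // stable report order
        Map<String, Set<String>> idsByPage = new HashMap<>();
        List<String> problems = new ArrayList<>();

        // Pass 1: collect every id on every page, flagging duplicates.
        for (Map.Entry<String, String> e : sorted.entrySet()) {
            Set<String> ids = new HashSet<>();
            Matcher m = ID.matcher(e.getValue());
            while (m.find()) {
                if (!ids.add(m.group(1))) {
                    problems.add(e.getKey() + ": duplicate id '" + m.group(1) + "'");
                }
            }
            idsByPage.put(e.getKey(), ids);
        }

        // Pass 2: every relative link must hit a known page and, if it has
        // a fragment, an id that exists on that page.
        for (Map.Entry<String, String> e : sorted.entrySet()) {
            String page = e.getKey();
            Matcher m = HREF.matcher(e.getValue());
            while (m.find()) {
                String href = m.group(1);
                if (href.contains("://") || href.startsWith("mailto:")) {
                    continue; // external links are out of scope here
                }
                int hash = href.indexOf('#');
                String target = hash == 0 ? page : (hash < 0 ? href : href.substring(0, hash));
                String fragment = hash < 0 ? "" : href.substring(hash + 1);
                Set<String> ids = idsByPage.get(target);
                if (ids == null) {
                    problems.add(page + ": link to missing page '" + target + "'");
                } else if (!fragment.isEmpty() && !ids.contains(fragment)) {
                    problems.add(page + ": link to missing anchor '" + href + "'");
                }
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, String> pages = new TreeMap<>();
        pages.put("a.html", "<h2 id=\"intro\">Intro</h2><a href=\"b.html#nope\">broken</a>");
        pages.put("b.html", "<h2 id=\"top\">Top</h2><a href=\"a.html#intro\">fine</a>");
        check(pages).forEach(System.out::println);
    }
}
```

The point of the two passes is that a link can only be judged once the ids of every page are known, which is why the check has to run against the complete output dir rather than page by page.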