Hello,

Martin Schroeder wrote on Mon, Jul 21, 2025 at 09:32:02PM +0200:
> On Fri, Jul 18, 2025 at 18:07, Ingo Schwarze wrote:

>> If anyone has an idea how to regress/ test PostScript or PDF
>> output, i'd be interested in hearing about it.  Diffing complete

> On Linux I use

Thanks for the suggestions!

I'm replying in some detail mainly to make it clearer what kind
of problem i'm actually trying to solve.

> diffpdf: https://mark-summerfield.github.io/diffpdf.html

My impression is that maintained versions of that software are
unusable because they are proprietary, closed-source, and
Windows-only software.  (I may be misunderstanding, though.)

> (older versions are available with GPL)

Using outdated, unmaintained software from people who have gone
commercial sounds terrible, certainly not good enough for official
OpenBSD regression tests.  Apart from that, if i understand
correctly, what this software does does not help at all with
regression testing.

The problem that needs to be solved is to distinguish relevant
changes from irrelevant changes.  Yes, i know that distinction
is fuzzy - that is precisely the problem.  As i said, the job
is to avoid over-testing for irrelevant details.  The problem
is that i somehow need a way to specify which changes are
relevant and which are not, and i'm lacking ideas how to
specify that, so i'm looking for software supporting me in
writing such specifications of relevance, and then checking
whether the files are similar or different with respect to
these specifications, and if they are different, reporting the
result in the form of a (preferably concise) UTF-8 *.txt file.
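
To make the request more concrete, here is a minimal sketch of what
such a specification of relevance could look like for PostScript
output.  It is only an illustration under stated assumptions, not an
existing tool: the names IGNORE_PATTERNS, normalize() and
relevant_diff() are hypothetical, and the two ignored DSC comment
lines are merely assumed examples of irrelevant detail.  It only
handles whole lines that can be ignored, so it does not solve the
harder, fuzzy cases.

```python
import difflib
import re

# Hypothetical relevance specification: lines matching any of these
# patterns are assumed to be irrelevant and are dropped before comparing.
IGNORE_PATTERNS = [
    re.compile(r"^%%CreationDate:"),
    re.compile(r"^%%Creator:"),
]


def normalize(text, ignore=IGNORE_PATTERNS):
    """Return the lines of text that the specification deems relevant."""
    return [
        line
        for line in text.splitlines()
        if not any(pattern.search(line) for pattern in ignore)
    ]


def relevant_diff(expected, actual, ignore=IGNORE_PATTERNS):
    """Return a unified diff of the relevant lines, or "" if they match."""
    diff = difflib.unified_diff(
        normalize(expected, ignore),
        normalize(actual, ignore),
        fromfile="expected",
        tofile="actual",
        lineterm="",
    )
    return "\n".join(diff)
```

An empty return value means the two files are similar with respect
to the specification; anything else is a plain-text report that can
be written to a UTF-8 *.txt file.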

Even if the diffpdf software addressed the actual problem,
i.e. separating relevant and irrelevant changes, the output format
is completely useless.  If i understand correctly, if it finds
differences, it produces a report in the form of a PDF file -
and a PDF report is obviously unusable for regression testing.

> Here are some tools:
> https://www.baeldung.com/linux/pdf-cli-gui-compare

That writeup suggests three options:

 * diffpdf
   That's the same GUI program for visually highlighting
   differences, and as such obviously not suitable for
   regression testing.

 * pdftotext + (meld or diff)
   I'm using pdftotext and (of course) diff(1) on a regular basis,
   but obviously, pdftotext cannot be used for testing mandoc(1)
   -T pdf output because *some* sufficiently severe formatting
   changes would be relevant, but are lost in pdftotext.
   Essentially, using pdftotext for the tests would imply that
   what the PDF tests would actually test would be a (small!)
   subset of what the -T ascii and -T utf8 tests test anyway.

 * Draftable
   Let me quote: "Draftable is a Web-based platform".
   Certainly not usable for regression testing.

So no, unfortunately, none of this is viable, and none of it
addresses the actual problem of how to prevent over-testing.

Yours,
  Ingo
