On 7/3/25 18:11, Russ Allbery wrote:
> Karl Berry <k...@freefriends.org> writes:
[...]
>> I'm far from familiar with the ins and outs of TAP and its conventional
>> usage, but it seems like a matter of semantics to me. It can be
>> convenient and intuitive to write a test such that it "fails", like
>> Sohan's example of a syscall with invalid arguments. The failure is
>> expected. Thus it's really a success, and the test could/should simply
>> be written to succeed? Syscall fails -> test succeeds.
> Oh, I see. The expected *behavior* is identical: the test is expected to
> fail, and if the test passes, that's an error that the harness should
> report. But the human-directed *meaning* is different: the test does not
> represent some known-to-not-work bug or missing feature that will
> eventually be fixed, but rather is just a more convenient way to write the
> test.
This confusion comes from not clearly thinking through what the test
*is*. A proper test verifies conformance to a specification. The
example from the DejaGnu manual is a specification "The sun shall
shine." and a test "The sun is shining." which passes or fails depending
on whether it is day or night.
In this case, the specification is "The syscall shall return an error if
given an invalid argument." and the test provides an invalid argument
and expects an error return code. The test *passes* ("ok" in TAP) *if*
*and* *only* *if* the syscall reports failure when given an invalid
argument.
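
As a minimal sketch of such a test (Python is used purely for
illustration; the helper names `check_close_invalid_fd` and `tap_line`
are hypothetical and not part of any TAP library), the script below
hands `close()` an invalid descriptor and emits "ok" only when the
call reports EBADF:

```python
import errno
import os


def check_close_invalid_fd():
    """Return True if and only if close() on an invalid fd fails with EBADF."""
    try:
        os.close(-1)  # -1 is never a valid file descriptor
    except OSError as exc:
        # The syscall reported failure; the test passes only for the
        # error code that the specification calls for.
        return exc.errno == errno.EBADF
    return False  # the syscall unexpectedly succeeded, so the test fails


def tap_line(number, passed, description):
    """Format a single TAP result line (hypothetical helper)."""
    status = "ok" if passed else "not ok"
    return f"{status} {number} - {description}"


if __name__ == "__main__":
    print("1..1")  # TAP plan: one test follows
    print(tap_line(1, check_close_invalid_fd(), "close(-1) reports EBADF"))
```

The inversion (syscall fails, test succeeds) lives entirely inside the
test; the TAP output keeps its ordinary meaning.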
> I haven't encountered that scenario before, and indeed I don't believe TAP
> includes any semantics for that.
TAP *should* *not* have semantics for that. TAP scripts are supposed to
be runnable directly or through trivial harnesses like piping the output
through `grep 'not ok'`. If the meaning of "ok" and "not ok" could be
inverted on a test-by-test basis, there would be room for endless confusion.
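
To illustrate how little such a harness needs to know, here is a rough
Python equivalent of that `grep` filter. This is a sketch under
assumptions: the name `failed_lines` and the sample output are made up
for the example, and unlike plain `grep` it anchors the match at the
start of the line.

```python
import re

# A failing TAP result line begins with "not ok"; plan lines ("1..N"),
# passing results, and diagnostics do not.
NOT_OK = re.compile(r"^not ok\b")


def failed_lines(tap_output):
    """Return the result lines that report failure (hypothetical helper)."""
    return [line for line in tap_output.splitlines() if NOT_OK.match(line)]


if __name__ == "__main__":
    sample = (
        "1..3\n"
        "ok 1 - close(-1) reports EBADF\n"
        "not ok 2 - open() on a missing file reports ENOENT\n"
        "ok 3 - read() returns data\n"
    )
    for line in failed_lines(sample):
        print(line)
```

A filter like this is only correct as long as "not ok" always means
failure; with per-test inversion it would have to parse and track each
test's polarity as well.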
-- Jacob