Re: [computer-go] regression testing

Gunnar Farnebäck Wed, 03 Dec 2008 14:14:06 -0800

Mark Boon wrote:
> Yes, I asked that question.
>
> So GnuGo already uses a comment-node to store the information in the SGF
> tree. But twogtp uses information from the .tst file. So why the
> difference?


No, GNU Go does not put the tests in the sgf files. We did so for a
short while long ago, but it wasn't manageable. Instead we started to
write tests as GTP commands (which was one of the reasons for
introducing GTP in the first place) and soon also to include the
correct answers as GTP comments in .tst files. This format has served
us well.

For those who need more context, a test can look like this one from
endgame.tst:

# E5 is two points gote, J3 one point sente, and J2 three points gote.
# J2 wins the game by 0.5, all other moves lose.
loadsgf games/endgame14.sgf
1000 reg_genmove black
#? [J2]

The first two lines are comments about the test, the third line loads
the position, the fourth line specifies the test, and the fifth line
is a specially formatted comment defining the correct answer. There
may also be multiple correct answers like in this test:

# See also 9x9:250.
loadsgf games/nngs/evand-gnugo-3.5.2gf1-200312161910.sgf 52
207 attack A6
#? [3 (B4|C4|C1)]*

This is a tactical reading test which is expected to give the result
code 3 (win by ko where the attacker must make the first ko threat)
for any of the moves B4, C4, or C1. There is also an asterisk at
the end indicating that this test is expected to fail.
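
For anyone who wants to process this format with their own scripts,
here is a minimal Python sketch of a reader for it. It is not taken
from GNU Go's regression scripts. The names RegressionTest, parse_tst
and answer_matches are made up for illustration, and treating the
bracketed answer as a regular expression that must match the whole
response is an assumption based only on the (B4|C4|C1) syntax above.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for the two line types described above.
NUMBERED_CMD = re.compile(r"^(\d+)\s+(.+)$")
EXPECTED = re.compile(r"^#\?\s*\[(.*)\](\*?)\s*$")


@dataclass
class RegressionTest:
    number: int             # test id, e.g. 1000
    command: str            # GTP command, e.g. "reg_genmove black"
    expected: str           # text between the brackets of the "#?" line
    expected_to_fail: bool  # True if the "#?" line ends with "*"


def parse_tst(text):
    """Pair each numbered command with the "#?" line that follows it."""
    tests = []
    pending = None  # last numbered command still waiting for its answer
    for raw in text.splitlines():
        line = raw.strip()
        answer = EXPECTED.match(line)
        if answer and pending is not None:
            number, command = pending
            tests.append(RegressionTest(number, command, answer.group(1),
                                        answer.group(2) == "*"))
            pending = None
            continue
        numbered = NUMBERED_CMD.match(line)
        if numbered:
            pending = (int(numbered.group(1)), numbered.group(2))
    return tests


def answer_matches(test, response):
    """Assumption: the expected answer is a regex for the full response.

    `response` is the payload only, with any GTP "=id" prefix removed.
    """
    return re.fullmatch(test.expected, response.strip()) is not None
```

Fed the two examples above, parse_tst yields one entry per numbered
command, and answer_matches accepts "3 B4", "3 C4" or "3 C1" for test
207. Unnumbered setup commands such as loadsgf are skipped here, so a
real runner would still have to send them to the engine in order.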

Tests can also be more complex like this:

# See also connection:119.
loadsgf games/kgs/llk-GNU.sgf 150
trymove W N10
trymove B M10
trymove W M9
trymove B L10
trymove W M11
trymove B L14
219 defend K12
#? [1 M14]*
popgo
popgo
popgo
popgo
popgo
popgo

This is derived from a failed connection test where it turns out that
the problem is a tactical misreading starting six moves deep in the
connection reading, which is set up with the six trymove commands and
cleared up with the corresponding popgo commands.
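
Since every trymove has to be undone by a popgo, a small check can
catch tests that leave the move stack unbalanced. The following Python
sketch is only an illustration. The function name stack_depth_after is
hypothetical, and the check is not something the text above says GNU Go
performs.

```python
def stack_depth_after(lines):
    """Return the trymove depth left over after running `lines`.

    Raises ValueError if a popgo appears with nothing to pop.
    """
    depth = 0
    for raw in lines:
        words = raw.split()
        if not words or words[0].startswith("#"):
            continue  # blank line or comment (including "#?" answers)
        if words[0].isdigit():
            words = words[1:]  # drop an optional numeric test id
        if not words:
            continue
        if words[0] == "trymove":
            depth += 1
        elif words[0] == "popgo":
            if depth == 0:
                raise ValueError("popgo without a matching trymove")
            depth -= 1
    return depth
```

For the test above the result is 0: six trymove commands followed by
six popgo commands.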


There are a number of reasons why it's useful to separate the test
definitions from the sgf files. I'll list a few of them:

1. When working in a distributed group you need a succinct way of
referring to specific tests, such as endgame:1000 for the first
example above (the test numbered 1000 in the file
endgame.tst). Talking about "the fifth test at move number 150 in the
file games/kgs/llk-GNU.sgf" is way too inconvenient. Short names are
also good in status reports like
http://trac.gnugo.org/gnugo/wiki/RegressionResults
or
http://trac.gnugo.org/gnugo/ticket/199 (click one of the "details" links)

2. The marks for expected failures need to be updated periodically
(e.g. at the time of a release) when the state changes and it's nicer
to have your revision control showing repeated changes in a few tens
of tst files than in thousands of sgf files. Yes, you really want to
keep track of the expected status. For the record GNU Go 3.7.12
contains 86 tst files with a total of 5445 tests from 1760 sgf files.

3. It's practical to be able to say "run the tests in semeai.tst" for
a quick sanity check of a change to the semeai reading.
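
The short names from point 1 are also easy to resolve mechanically.
Here is a minimal Python sketch, assuming only the convention described
above that endgame:1000 means test 1000 in endgame.tst. The function
name split_test_ref is hypothetical.

```python
def split_test_ref(ref):
    """Split a reference like "endgame:1000" into file name and test id."""
    name, _, number = ref.partition(":")
    if not name or not number.isdigit():
        raise ValueError(f"not a test reference: {ref!r}")
    return f"{name}.tst", int(number)
```

With this, "endgame:1000" maps to the pair ("endgame.tst", 1000), and
the cross references in the test comments, such as connection:119, can
be resolved the same way.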

Btw, the sgf file with a test defined in a comment, that Terry dug up,
was kindly contributed by Tristan Cazenave in a collection of test
cases from his Golois program. That sgf comment has never been used by
GNU Go.

/Gunnar
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
