This is a nice idea, and it is to a degree what the GnuGo regression
tests do. But since there is more than one way to skin a cat, it will
not capture the true strength of the programs. If you set MoGo to solve
the tests from a book, for example 501 opening problems, it will
probably fail at least 75% of them, because its style is completely
different from what is expected of humans. The same applies to
connecting and capturing: an MC program may reach the conclusion that
its best option is to ignore the whole connection issue. They often do
exactly that.
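For what it's worth, the book-style scoring scheme Zhiheng describes below (four candidate moves per problem scored 10/6/3/0, with the total mapped to a rank) could be sketched roughly like this. All move names and rank thresholds here are made-up illustrations, not values from any actual book:

```python
# Sketch of a scored test suite: each problem maps candidate moves to
# points; an unlisted move scores 0; the total maps to a rank label.

def score_answers(problems, answers):
    """Sum the points of the chosen candidate move for each problem."""
    return sum(p.get(move, 0) for p, move in zip(problems, answers))

def rank_for(total, max_total):
    """Map a score fraction to a (hypothetical) rank label."""
    frac = total / max_total
    if frac >= 0.9:
        return "3d"
    if frac >= 0.7:
        return "1d"
    if frac >= 0.5:
        return "1k"
    return "5k"

# Example: two problems, four scored candidate moves each.
problems = [
    {"C3": 10, "D4": 6, "E5": 3, "F6": 0},
    {"Q16": 10, "R14": 6, "P17": 3, "S4": 0},
]
total = score_answers(problems, ["D4", "Q16"])  # 6 + 10 = 16
print(total, rank_for(total, 10 * len(problems)))
```

The weakness is exactly the one above: a program that ignores the listed candidates entirely scores 0 on a problem even when its move wins the game.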

Petri



2009/4/8 Zhiheng Zheng <zhiheng.zh...@gmail.com>:
> I have some ideas.
>
> When I learned Go, I saw some small Go books with many test suites for life
> and death, connection, etc. Each test has four candidate moves, and each
> move has a score: the best is 10, the 2nd is 6, the 3rd is 3, and the worst
> is 0. After finishing a test set, I get a score, and a range of scores is
> mapped to a level such as 1d or 3d. Maybe we could gather these test suites
> from existing Go books and run the tests on our programs; then we would know
> how strong they are.
>
> Another idea is a more automatic one. Every computer Go program has its own
> regression test suite. We could gather them together and let various
> existing programs run the tests. Then we could run the same tests with our
> own program and compare its results against the existing programs' results,
> which would tell us whether our program is stronger or weaker than they are.
>
> Zhiheng Zheng
>
>
>
> 2009/4/8 terry mcintyre <terrymcint...@yahoo.com>
>>
>> Amen to that. When using positions to judge the strength of a program, one
>> would need to test not just one "pro move", but a sequence of plays --
>> including some which don't appear in pro games. A pro knows how to deal
>> decisively not only with the optimal plays of other pros, but also with
>> suboptimal plays from the rest of us. Programs are often even stranger than
>> human players.
>> If I were designing a test set, I'd ask pros to defeat the program, and
>> would convert the blunders into a test set. To improve, the program would
>> have to generalize the lessons learned from those test cases.
>>
>> Terry McIntyre <terrymcint...@yahoo.com>
>>
>> -- "People never lie so much as after a hunt, during a war or before an
>> election." -- Otto von Bismarck
>>
>> ________________________________
>> From: steve uurtamo <uurt...@gmail.com>
>> To: computer-go <computer-go@computer-go.org>
>> Sent: Tuesday, April 7, 2009 5:12:27 PM
>> Subject: Re: [computer-go] Fast ways to evaluate program strength.
>>
>> otherwise pair-go wouldn't be as funny to watch.
>>
>> s.
>>
>> On Tue, Apr 7, 2009 at 8:05 PM, Michael Williams
>> <michaelwilliam...@gmail.com> wrote:
>> > Łukasz Lew wrote:
>> >>
>> >> I would like to rephrase my question:
>> >> Let's measure prediction of pro moves of a whole engine while
>> >> modifying heavy playouts / MCTS in the engine.
>> >> How well might it work?
>> >
>> > Probably not well, because what matters is not how often you play strong
>> > moves, but how often you avoid blunders.
>> >
>> > _______________________________________________
>> > computer-go mailing list
>> > computer-go@computer-go.org
>> > http://www.computer-go.org/mailman/listinfo/computer-go/
>> >
>>
>>
>
>
>



-- 
Petri Pitkänen
e-mail: petri.t.pitka...@gmail.com
