Late-move reductions could be used instead of the hard pruning. Given the accuracy of the policy network, I would guess that even the second move should be reduced.
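To make the suggestion concrete, here is a minimal sketch of late-move reductions over policy-ordered moves: instead of discarding everything past the top-k, late moves get a reduced-depth probe and are re-searched at full depth only if the probe raises alpha. This is an illustrative sketch, not code from the thread; `ToyPos` is a stand-in for a real position with policy-sorted moves and a value-network evaluation.

```python
import math

class ToyPos:
    """Stand-in position: a game tree of nested tuples. Leaves are scores
    from the perspective of the side to move (as a value net would report).
    Children are assumed to be sorted by policy probability already."""

    def __init__(self, node):
        self.node = node

    def is_terminal(self):
        return not isinstance(self.node, tuple)

    def value(self):
        # leaf score; 0 as a neutral stand-in when depth runs out mid-tree
        return self.node if self.is_terminal() else 0

    def policy_moves(self):
        return range(len(self.node))

    def play(self, move):
        return ToyPos(self.node[move])


def negamax_lmr(pos, depth, alpha, beta, full_width=2, reduction=1):
    """Negamax with late-move reductions: the first `full_width` policy moves
    are searched at full depth; later moves get a reduced-depth probe and are
    re-searched at full depth only if the probe beats alpha."""
    if depth <= 0 or pos.is_terminal():
        return pos.value()
    best = -math.inf
    for i, move in enumerate(pos.policy_moves()):
        child = pos.play(move)
        if i < full_width or depth < 1 + reduction:
            score = -negamax_lmr(child, depth - 1, -beta, -alpha,
                                 full_width, reduction)
        else:
            # reduced-depth probe for a late move
            score = -negamax_lmr(child, depth - 1 - reduction, -beta, -alpha,
                                 full_width, reduction)
            if score > alpha:  # probe looked promising: re-search fully
                score = -negamax_lmr(child, depth - 1, -beta, -alpha,
                                     full_width, reduction)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # beta cutoff
    return best
```

Unlike the hard top-4/top-8 pruning below, no move is permanently discarded, so a move the policy network ranks low can still be found if its reduced search scores well.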
-----Original Message-----
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Hiroshi Yamashita
Sent: Saturday, May 20, 2017 3:42 PM
To: computer-go@computer-go.org
Subject: [Computer-go] mini-max with Policy and Value network

Hi,

The HiraBot author reported on a mini-max search with Policy and Value networks. It does not use Monte Carlo. Only the top 8 moves from the Policy network are searched at the root node; at other depths, the top 4 moves are searched.

Game results against the Policy network's best move (without search):

              Win-Loss   Winrate
  MaxDepth=1  (558-442)   0.558   +40 Elo
  MaxDepth=2  (351-150)   0.701  +148 Elo
  MaxDepth=3  (406-116)   0.778  +218 Elo
  MaxDepth=4  (670- 78)   0.896  +374 Elo
  MaxDepth=5  (490- 57)   0.896  +374 Elo
  MaxDepth=6  (520- 20)   0.963  +556 Elo

The search is simple alpha-beta, with a modification so that high-probability Policy network moves tend to be selected. MaxDepth=6 takes one second/move on an i7-4790K + GTX 1060.

His nega-max code:
http://kiyoshifk.dip.jp/kiyoshifk/apk/negamax.zip
CGOS result, MaxDepth=6:
http://www.yss-aya.com/cgos/19x19/cross/minimax-depth6.html
His Policy network (without search) is maybe:
http://www.yss-aya.com/cgos/19x19/cross/DCNN-No336-tygem.html
His Policy and Value network (MCTS) is maybe:
http://www.yss-aya.com/cgos/19x19/cross/Hiratuka10_38B100.html

Thanks,
Hiroshi Yamashita
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
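For readers who want the shape of the reported search, here is a minimal sketch of the scheme as described: plain negamax (alpha-beta) that expands only the top-k policy moves, 8 at the root and 4 at other depths, with the value network scoring the leaves. This is an illustrative reconstruction, not the actual code from negamax.zip; `ToyPos` is an assumed stand-in interface.

```python
ROOT_WIDTH, INNER_WIDTH = 8, 4  # top-k policy moves searched, per the message

class ToyPos:
    """Stand-in position: a game tree of nested tuples. Leaves are scores
    from the perspective of the side to move (as a value net would report).
    Children are assumed to be sorted by policy probability already."""

    def __init__(self, node):
        self.node = node

    def is_terminal(self):
        return not isinstance(self.node, tuple)

    def value(self):
        # leaf score; 0 as a neutral stand-in when depth runs out mid-tree
        return self.node if self.is_terminal() else 0

    def policy_top(self, k):
        # hard pruning: only the k highest-policy moves are ever considered
        return range(min(k, len(self.node)))

    def play(self, move):
        return ToyPos(self.node[move])


def search(pos, depth, alpha=float("-inf"), beta=float("inf"), is_root=True):
    """Fixed-width alpha-beta: top 8 policy moves at the root, top 4 below."""
    if depth <= 0 or pos.is_terminal():
        return pos.value()
    width = ROOT_WIDTH if is_root else INNER_WIDTH
    best = float("-inf")
    for move in pos.policy_top(width):
        score = -search(pos.play(move), depth - 1, -beta, -alpha, is_root=False)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # beta cutoff
    return best
```

With widths of 8 and 4, a MaxDepth=6 search visits at most 8 * 4^5 = 8192 leaves, which is consistent with the reported one second/move when each leaf costs a value-network evaluation.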