Before trying to bias or weight the MC playouts, it would be worth trying a pure-MC approach. As you describe it below, this would be "give all players random cards, then play the game out randomly". If you have access to the rules-based bot, that is ideal, since it gives you a fixed-strength opponent to test against. Although pure MC in Go has been superseded by MCTS, it should be a good starting point for validating the approach. The fact that the players will respond with bad moves most of the time doesn't invalidate the approach (at least in Go). I wouldn't go down the route of playing out with deterministic rules, as the choice of rules could have a major influence on the validity of the playout results.
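A minimal sketch of what I mean by pure MC, in Python. Tichu's real rules are too involved for a short example, so this uses a made-up toy trick game (every player plays one card per trick, the highest card wins the trick) and assumes we are on lead. The names play_out, deal_unseen and choose_move are hypothetical, not taken from any existing bot.

```python
import random


def deal_unseen(unseen, n_opponents, hand_size, rng):
    """Randomly distribute the cards we cannot see among the opponents."""
    cards = list(unseen)
    rng.shuffle(cards)
    return [cards[i * hand_size:(i + 1) * hand_size] for i in range(n_opponents)]


def play_out(hands, first_card, rng):
    """Play the toy game to the end with uniformly random moves.

    hands[0] is our hand and we lead the first trick with first_card.
    Toy rule (an assumption, not Tichu): the highest card wins the trick
    and the winner leads the next one. Returns the tricks won by player 0.
    """
    hands = [list(h) for h in hands]  # copy so the caller's hands survive
    n = len(hands)
    leader = 0
    forced = first_card  # the candidate move being evaluated
    tricks_won = 0
    while hands[0]:
        trick = []
        for i in range(n):
            p = (leader + i) % n
            if p == 0 and forced is not None:
                card, forced = forced, None
            else:
                card = rng.choice(hands[p])
            hands[p].remove(card)
            trick.append((card, p))
        winner = max(trick)[1]
        tricks_won += winner == 0
        leader = winner
    return tricks_won


def choose_move(my_hand, unseen, n_opponents, playouts_per_move=200, seed=None):
    """1-ply pure MC: average random playouts for every legal first card."""
    rng = random.Random(seed)
    scores = {}
    for card in my_hand:
        total = 0
        for _ in range(playouts_per_move):
            # New random deal of the hidden cards for every playout.
            others = deal_unseen(unseen, n_opponents, len(my_hand), rng)
            total += play_out([my_hand] + others, card, rng)
        scores[card] = total / playouts_per_move
    return max(scores, key=scores.get), scores
```

Calling choose_move(my_hand, unseen, 3) returns the card with the best average playout result together with the per-card averages. To test against the rules-based bot, the toy game would be swapped for the real Tichu move generator and scoring.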
The problem of imperfect information reminds me a bit of backgammon, where there is perfect information, but the dice rolls mean a huge branching factor and similar uncertainty about possible future moves. A neural network that did TD learning through self-play was a successful approach in that domain (google TD Gammon to find the paper by Tesauro).

Oliver

On Fri, May 14, 2010 at 4:30 PM, Isaac Deutsch <[email protected]> wrote:

> Hi all,
>
> I'm thinking about creating a computer player for Tichu (
> http://en.wikipedia.org/wiki/Tichu), a game that is rather widespread here
> in Switzerland. However, to my knowledge there exists no official bot that
> plays it. Someone I know has created a bot that plays using rules only (if
> X, play cards YZ), but his findings were that the bot plays pretty weak.
>
> Of course, the game is solvable when there are only 2 players left because
> then, the distribution of cards is clear and the best strategy can be
> calculated.
>
> With 3 or even all 4 players still in the game, it is clear that it is not
> clear which player has which cards. :) It is a game of imperfect
> information. I was wondering if Monte Carlo (MC) can and should be used at
> this stage of the game. It seems that there has been some success of MC in
> poker.
> If yes, how would MC be used? When a move is to be made, should all legal
> moves be generated, and should a number of playouts be simulated for each
> (1-ply MC)? What do the playouts look like? Give all players (weighted)
> random cards, then play the game out with deterministic rules? Give all
> players (weighted) random cards, then play the game out (weighted) randomly?
> I'm pretty much just trying to think of something based on what 'works' in
> Go. :) Because of the uncertainty in the distribution of cards gives so many
> possible combinations, it seems impracticable to create a tree with all
> possible distributions.
>
> If no, what alternatives are there? Neural networks?
>
> Greetings,
> Isaac
> _______________________________________________
> Computer-go mailing list
> [email protected]
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
