[computer-go] Mogo Opening, Building Strategy ?
Hi. Are there any theoretical reasons for the Mogo opening book being built out of self-play, rather than by spending the time increasing the number of simulations at the root and, after a while, keeping what seems to be the best move?

Obviously one could object that increasing the number of simulations would lead to running out of memory, but it seems that there are ways of pruning the less-explored branches. Another consideration is that it may be far easier to massively parallelize self-play.

So I really wonder which would be best, both for time efficiency and for strength: spending a lot of computing power refining the tree, or using self-play from a position to assess its value? What was the reason behind the Mogo team's choice?
Re: [computer-go] Mogo Opening, Building Strategy ?
> Are there any theoretical reasons for the Mogo opening book being built
> out of self-play, rather than by spending the time increasing the number
> of simulations at the root and, after a while, keeping what seems to be
> the best move?

There are practical reasons: our approach can be used with humans or other programs as opponents as well; we can benefit from games launched for purposes other than opening-book building; and we can easily parallelize the algorithm on grids. No other reason, at least for me - but these reasons are enough, I guess.

The alternate approach is nice, but it is difficult to use for tens of years of CPU time - whereas using preemptable mode on grids, we have access to huge computational power.

From a more technical point of view, I think that the idea of using the results of games between strong versions of Mogo is better for avoiding biases in the Monte-Carlo part. But it's only a conjecture.

Olivier
RE: [computer-go] Mogo Opening, Building Strategy ?
Thanks a lot for your quick answer.

By conjecture, I suppose you mean that no experiments have yet been run to assess this hypothesis?

I think Sylvain (and maybe just about everyone else) has tried at some point to use a UCT decision bot as a way to get the simulations done, then using those high-level simulations in another UCT decision tree (or AMAF, or first-move winning stats). From what I recall, the results were disappointing. Also, Don Dailey tried to build an AMAF bot on top of a bot that would itself use AMAF for its decisions, with very little expectation that this would lead to anything worthwhile. I don't know how hard Sylvain tried at the time.

Yet you have this feeling that using high-level Mogo games as a way to get the simulations done could lead to interesting results. I also have this feeling. For example, it is well known that Mogo-style decisions (or Crazy Stone's) lead to very poor understanding of seki (and/or semeai?). Wouldn't the use of high-level games as simulations lead to a better understanding of those really nasty situations?

Then, I guess that if nobody has ever run any experiments to measure the efficiency of increasing the UCT tree against using high-level simulations, there must be a reason... Is it that it is known it would consume too much time and resources? Is it that knowing the result of this measurement would prove of little value? If there is a point where high-level simulations really give a stronger evaluation function, wouldn't it be good to know about it?
Re: [computer-go] Mogo Opening, Building Strategy ?
> By conjecture, I suppose you mean that no experiments have yet been run
> to assess this hypothesis?

Yes. The other reasons were sufficient :-)

> I think Sylvain (and maybe just about everyone else) has tried at some
> point to use a UCT decision bot as a way to get the simulations done,
> then using those high-level simulations in another UCT decision tree
> (or AMAF, or first-move winning stats). From what I recall, the results
> were disappointing.

At least, it has been clearly established that "replacing the random player by a stronger player" does not imply "the Monte-Carlo program built on top of the random player becomes stronger" (even with a fixed number of simulations).

But it is also clearly established that building the opening book by self-play works, even though it is roughly the same idea. I guess the reason is the difference in the strength of the player - an MCTS (Monte-Carlo Tree Search - I don't write UCT here, as it is not UCT in Mogo) built on top of a perfect player should be a perfect player (this is formally obvious). So perhaps for huge computational power, this approach (building MCTS on top of MCTS) is consistent.

> For example, it is well known that Mogo-style decisions (or Crazy
> Stone's) lead to very poor understanding of seki (and/or semeai?).
> Wouldn't the use of high-level games as simulations lead to a better
> understanding of those really nasty situations?

I hope so, at least for 9x9 and with really huge computational power. (By the way, I'm afraid we have to patch semeai manually.)

> Is it that it is known it would consume too much time and resources?

I think it is really difficult to do - you have to dump your results to disk unless you can keep a machine for a very long time without any reboot, and if you want a huge computational effort you have to parallelize it, which I think is really hard. But perhaps trying Mogo against a Mogo built on top of a Mogo with e.g. 1s/move would be fine... this is easy to organize. Well, it requires some time, and time is always expensive :-)

Best regards,
Olivier
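To make the "MCTS on top of MCTS" idea concrete, here is a minimal sketch of a tree search whose playouts are complete games played by a stronger engine at a short time limit (e.g. 1s/move) instead of light random rollouts. The Position and Engine interfaces are assumptions made for illustration, not Mogo's actual API:

    import math

    # Assumed interfaces (hypothetical, for illustration only):
    #   position.copy() -> independent copy of the position
    #   position.legal_moves() -> list of legal moves
    #   position.play(move), position.undo()
    #   position.game_over() -> bool, position.winner() -> +1 or -1
    #   engine.genmove(position, time_limit) -> move from the strong engine

    class Node:
        def __init__(self):
            self.wins, self.visits, self.children = 0.0, 0, {}

    def ucb_move(node, moves, c=1.0):
        # UCB1 over the legal moves; unvisited moves are tried first.
        def score(m):
            ch = node.children.get(m)
            if ch is None or ch.visits == 0:
                return float("inf")
            return ch.wins / ch.visits + c * math.sqrt(math.log(node.visits) / ch.visits)
        return max(moves, key=score)

    def strong_playout(engine, position):
        # The "playout" is a whole self-play game of the strong engine
        # at a short time limit, instead of a random rollout.
        pos = position.copy()
        while not pos.game_over():
            pos.play(engine.genmove(pos, time_limit=1.0))
        return pos.winner()

    def simulate(node, position, player, engine):
        # One MCTS iteration; `player` is the side to move at `node`.
        # Each child's statistics are kept from the point of view of the
        # player who moved into it, so UCB1 maximizes the mover's win rate.
        node.visits += 1
        if position.game_over():
            return position.winner()
        move = ucb_move(node, position.legal_moves())
        child = node.children.setdefault(move, Node())
        position.play(move)
        if child.visits == 0:
            child.visits = 1
            winner = strong_playout(engine, position)
        else:
            winner = simulate(child, position, -player, engine)
        position.undo()
        if winner == player:
            child.wins += 1.0
        return winner

    # Usage sketch: run N iterations from the root, then pick the most
    # visited move:
    #   root = Node()
    #   for _ in range(N): simulate(root, position, 1, engine)
    #   best = max(root.children, key=lambda m: root.children[m].visits)

With a perfect engine supplying the playouts, every leaf evaluation is exact - the sense in which MCTS built on top of a perfect player is itself perfect; the open question is what happens at intermediate strengths.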
Re: [computer-go] Mogo Opening, Building Strategy ?
On Sun, 2008-11-30 at 13:38 +0100, Olivier Teytaud wrote:
> But it is also clearly established that building the opening book by
> self-play works, even though it is roughly the same idea. [...] So
> perhaps for huge computational power, this approach (building MCTS on
> top of MCTS) is consistent.

I've always had this idea that the best way to build an opening book is the best way to build a general playing engine. You are trying to solve the same exact problem - what is the best move in this position?

- Don
Re: [computer-go] RAVE formula of David Silver (reposted)
Indeed, the scaling question is very important. Even though I think I have AMAF/RAVE working now, it's still not so clear-cut what it's worth. With just 2,000 playouts I'm seeing an 88% win rate against plain old UCT tree search without RAVE. At 10,000 playouts this win rate drops to 75%, and with 50,000 to 69%. All of these results have a margin of error of a few points, but the trend is obvious: UCT plays weaker than UCT+RAVE, but it scales a little better.

This doesn't necessarily mean they converge. From the few data points that I have, it looks like UCT+RAVE might converge to a winning rate of 66% against plain UCT search with playouts in the hundred-thousands or millions. Is that about 100 ELO points? That in itself would be justification enough to keep it. But there's a computation cost as well. Plus, as soon as you start to introduce other move-selection procedures, it may eat into the gain RAVE provides even further.

Anyhow, the way I have it set up now I can easily switch between using AMAF information to compute RAVE or not. There are also still some parameters to tune, so this is not the last word on it by far. It's more like a good starting point. Also, even if it's not something to use in a final playing engine, it's good to have a 'base-line' that provides the best possible time/strength combination to run quick tests against.

Is there actually a well-understood basis for the diminishing return of UCT+RAVE vs. UCT? I have given it a little thought, but it's not entirely obvious to me why UCT+RAVE wouldn't scale better than what I'm seeing.

I've also run into a few 'fluke' results. Winning streaks of a dozen games in a row (or more) happen between equally strong programs. So to be reasonably sure I'd like to get about 1,000 games. If you want to make sure two implementations are equivalent, as in the case of the ref-bots, I'd recommend 10,000 games. If all I want to know is whether something is an improvement or not, then I usually settle for fewer games. If after a (few) hundred games I see a win rate of 50% or less, I decide it's not an improvement (not one worth anything anyway); if I see a winning rate of around 60% or more, I keep it. Anything in between, I might let it run a bit more. The improvements that I keep, I run with longer thinking times overnight to see if they scale. After all, the only real test worth anything is under realistic playing circumstances.

Mark

On 29-nov-08, at 11:32, Don Dailey wrote:
> On Sat, 2008-11-29 at 11:58 +0100, Denis fidaali wrote:
>> From my own experience, an important testing case whenever trying to
>> get AMAF to work, is the scaling study.
>
> No truer words ever spoken. This is one of the secrets to strong
> programs: if they scale, they are probably soundly designed. I do that
> with Chess. I find that some program changes scale up, in particular
> sound algorithms that reduce the branching factor. I have to run tests
> pretty fast in order to get results I can interpret, but I also plot
> the result visually with gnuplot.
>
> As many here will recall, my own Fatman study vs Leela showed that
> Leela scaled better with increasing depth than Fatman. Nothing like a
> graph to reveal this very clearly, although you can also look at the
> numbers if you are careful.
>
> It's important to point out that you will be completely misled if you
> don't get enough samples. It's very rare that 100 or 200 games are
> enough to draw any conclusions (unless it's really lopsided). I
> remember once thinking I had found a clear scalable improvement but
> decided that it must run longer - but I was hopeful. When the
> improvement started to decline, I discovered that I had by accident
> been running the same exact program against itself.
>
> The point is that it is not uncommon to get really "lucky", and have
> an equal program look substantially superior - for a while.
>
> - Don
>
>> My prototype was quite strong considering that it used only 1000
>> light playouts (and scored 25-30% wins against GNU Go level 0), but
>> it seemed to not get much over that as the number of playouts grew...
>> (Also, it had a serious exponential complexity problem, which I never
>> went to the trouble of investigating :) ) I know that Zoe was about
>> 2000 Elo with, I think, 50k simulations, and never got any real
>> better as the number of simulations increased. Both prototypes were
>> toying with AMAF, so I really think you need a bit of a scalability
>> study whenever trying to assess an engine employing it. (Although it
>> could very well be that the scalability trouble came out of some
>> nasty bugs; both aforementioned prototypes were quite messy...)
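Mark's "about 100 ELO points?" guess can be checked directly: under the logistic Elo model, a win rate p corresponds to a rating difference of 400 * log10(p / (1 - p)). A quick check in Python (the function name is just for illustration):

    import math

    def elo_diff(win_rate):
        # Elo difference implied by a win rate, logistic model.
        return 400.0 * math.log10(win_rate / (1.0 - win_rate))

    print(elo_diff(0.66))   # ~115 Elo
    print(elo_diff(0.69))   # ~139 Elo

So a 66% winning rate is worth roughly 115 Elo - a little more than the 100 Mark estimates.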
Re: [computer-go] Mogo Opening, Building Strategy ?
> From: Don Dailey <[EMAIL PROTECTED]>
>
> I've always had this idea that the best way to build an opening book is
> the best way to build a general playing engine. You are trying to
> solve the same exact problem - what is the best move in this position?

When building an opening book, you have the advantage of not playing against the clock. In fact, a good opening book (one which your game-playing engine knows how to use) can shave time during the game itself.

"Knows how to use" can be subtle; many joseki depend on knowing the exact line of play, and the move choices depend on knowing the exact (not approximate) results of ladders and semeai. Against better players, approximate answers tend toward disaster. A book move might work sometimes and not others, and the program won't know the difference.

I think the "opening book" and the "general playing engine" solve similar problems: what is the best move which can be discovered with finite resources? The opening problem must also satisfy an additional side constraint: it must suggest moves which can be correctly exploited by the playing engine, which may have less time and computational power available during the game. A sufficiently broad and deep book can make up for a lack of computational resources during the game; such a book needs to know the best refutations for each of many possible plays. Joseki and fuseki books for humans are full of annotations: "move a is used when the ladder works; otherwise, b is recommended; c is a mistake, which can be exploited by ..." A book for computer programs might need similar annotations.

Some programs may need different books, depending on whether they have a fast or slow clock, one or many processors.
Re: [computer-go] RAVE formula of David Silver (reposted)
On Nov 30, 2008, at 11:49 AM, Mark Boon <[EMAIL PROTECTED]> wrote:
> I've also run into a few 'fluke' results. Winning streaks of a dozen
> games in a row (or more) happen between equally strong programs. So to
> be reasonably sure I'd like to get about 1,000 games. [...] If all I
> want to know is whether something is an improvement or not, then I
> usually settle for fewer games.

You've claimed to be non-statistical, so I'm hoping the following is useful...

You can compute the likelihood that you made an improvement as Phi(z), the one-tailed confidence from the standard normal CDF, where

  z = (win rate - 0.5) / (0.5 / sqrt(#games))

since under the null hypothesis of no improvement (p = 0.5) the observed win rate has a standard deviation of 0.5/sqrt(#games). Phi has no closed-form expression in elementary functions, so in practice people use lookup tables (or erf) to translate between standard deviations and confidence levels. Typically you set a goal confidence and directly translate it to a number of standard deviations (3.0 for about 99.87%, one-tailed). After about 20 or 30 games, this normal approximation is accurate and can be used for early termination of your test.
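A minimal sketch of that early-termination test in Python (math.erf has been in the standard library since Python 3.2; the function name is illustrative):

    import math

    def improvement_confidence(wins, games):
        # One-tailed confidence that the true win rate exceeds 50%,
        # using the normal approximation to the binomial. Under the
        # null hypothesis p = 0.5, the observed win rate has standard
        # deviation 0.5 / sqrt(games).
        z = (wins / games - 0.5) / (0.5 / math.sqrt(games))
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    print(improvement_confidence(60, 100))   # z = 2.0 -> ~0.977

As Jason notes, the approximation is only trustworthy after a few dozen games; also, stopping the moment the confidence first crosses a threshold inflates the false-positive rate somewhat, which is one more reason a 3-sigma goal is a sensible safety margin.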
Re: [computer-go] RAVE formula of David Silver (reposted)
On Sun, 2008-11-30 at 14:49 -0200, Mark Boon wrote:
> All of these results have a margin of error of a few points, but the
> trend is obvious: UCT plays weaker than UCT+RAVE, but it scales a
> little better.

Not necessarily. Start with the UCT and UCT+RAVE versions and find a level where they are equivalent in strength. Whatever point this is, the UCT version will take much longer to search. Now test a version of each with some constant factor more playouts (or time) - for instance, a version of each that does 3X more work. Which is stronger now? That is how you tell whether it scales better or not. You must start from the same point ELO-wise.

My idea is that if it doesn't scale, there may be some fundamental improvement still waiting to be found. In the worst case you can gradually phase it out. Of course, as you say, they may converge, perhaps in some asymptotic way, so that at any given level of effort the RAVE enhancement always helps, but less and less so.

- Don
Re: [computer-go] Mogo Opening, Building Strategy ?
It's true that building an opening book in an automated way can be done off-line, which gives us more resources. That's really the basis for this thought that we are trying to solve the same problem.

As a thought experiment, imagine some day in the future when computers are a thousand times faster. What would you do with that extra "time" to make your program more scalable? You might be able to use your book-building algorithm, whatever it might be, to find moves for you. If there is a better way, then you would use it instead. But if there is a better way, why are you not using it for creating book moves? Of course the details of how to do it may not be exactly the same - we may be forced to simulate in-memory hash tables with disk space, for instance, or to deal with other real-world constraints. However, I think that at least in principle we should be doing the same thing.

MCTS really feels to me like a superb book-building algorithm. Computer chess books (at least the automated part) are built essentially by taking millions of games from master play and picking out the ones that seem to work best. Those games are like playouts: the moves that score the best are played the most. We have a kind of MCTS here.

- Don

On Sun, 2008-11-30 at 09:15 -0800, terry mcintyre wrote:
> When building an opening book, you have the advantage of not playing
> against the clock. [...] Some programs may need different books,
> depending on whether they have a fast or slow clock, one or many
> processors.
Re: [computer-go] Mogo Opening, Building Strategy ?
- Original Message
> From: Don Dailey <[EMAIL PROTECTED]>
>
> MCTS really feels to me like a superb book-building algorithm.
> Computer chess books (at least the automated part) are built
> essentially by taking millions of games from master play and picking
> out the ones that seem to work best. Those games are like playouts:
> the moves that score the best are played the most. We have a kind of
> MCTS here.

Interesting - the book moves are not generated by random playouts, but by professional (or highly skilled) players. In the chess world, what is meant by "picking out the ones that seem to work best"?

My impression is that computer go programs do not, at this stage in their evolution, make good use of professional book moves; too much professional knowledge is actually in the part of the tree which is almost never played in pro-pro games - the "how to beat up on mistakes" and "what not to do when playing against a pro" parts of the tree.
Re: [computer-go] Mogo Opening, Building Strategy ?
On Sun, 2008-11-30 at 14:33 -0800, terry mcintyre wrote:
> Interesting - the book moves are not generated by random playouts, but
> by professional (or highly skilled) players.

Many chess opening books are created by a statistical analysis of lots of human games. Some books are created by hand, very tediously. Even the ones created with the aid of human games are sometimes modified by hand, at least the top-notch books. But in the context of this discussion, we note that books can be and are created solely from databases of top-quality games. You can get a reasonable book that way.

> In the chess world, what is meant by "picking out the ones that seem
> to work best"?

What I mean is that you look at the statistics of the moves and base your opening book on the moves that gave the best results. You can also go by the moves that are played the most, on the assumption that if they are played a lot they must be good. It is typical to do a combination of both: if it's played a lot and also scores well, use it. I think some have tried mini-max too. It's possible that a move seems to have great success in general, but not if it's responded to in a certain way. It could be that in recent months or years a refutation has been found, and a move that used to work really well has turned out to be bad.

> My impression is that computer go programs do not, at this stage in
> their evolution, make good use of professional book moves; too much
> professional knowledge is actually in the part of the tree which is
> almost never played in pro-pro games.

Even in chess, despite the awesome strength of the programs, human knowledge of the openings still reigns supreme, although it's now the case that computers are helping to build opening theory by finding new moves - in chess these are called "theoretical novelties", and computers have produced many of them, from what I understand.

- Don
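A minimal sketch of that "played a lot and also scores well" combination - the record format, the min_count threshold, and the log weighting are illustrative assumptions, not how any particular chess book tool actually works:

    import math
    from collections import defaultdict

    def build_book(records, min_count=50):
        # records: iterable of (position_key, move, won) triples taken
        # from a database of strong games, with won being 0 or 1 for
        # the player who made the move.
        stats = defaultdict(lambda: [0, 0])          # (pos, move) -> [plays, wins]
        for pos, move, won in records:
            stats[(pos, move)][0] += 1
            stats[(pos, move)][1] += won
        book = {}
        for (pos, move), (plays, wins) in stats.items():
            if plays < min_count:
                continue                             # too rare to trust
            score = (wins / plays) * math.log(plays) # popular AND successful
            if pos not in book or score > book[pos][1]:
                book[pos] = (move, score)
        return {pos: move for pos, (move, _) in book.items()}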
Re: [computer-go] Mogo Opening, Building Strategy ?
You can read about some such novelties found using Rybka here:
http://www.rybkachess.com/index.php?auswahl=Rybka+3+book

Don Dailey wrote:
> Even in chess, despite the awesome strength of the programs, human
> knowledge of the openings still reigns supreme, although it's now the
> case that computers are helping to build opening theory by finding new
> moves - in chess these are called "theoretical novelties".
Re: [computer-go] RAVE formula of David Silver (reposted)
On 30-nov-08, at 16:51, Jason House wrote:
> You can compute the likelihood that you made an improvement as Phi(z)
> [...] After about 20 or 30 games, this approximation is accurate and
> can be used for early termination of your test.

Lately I use twogtp for my test runs. It computes the winning percentage and puts a ± value after it in parentheses. Is that the value of one standard deviation? (I had always assumed so.) Even after 1,000 games it stays in the 1.5% neighbourhood.

Maybe 20-30 games usually gives an accurate approximation. But if you perform tests often, you'll occasionally bump into that unlikely event where what you thought was a big improvement turns out to be no improvement at all, or the other way around. Only when I see 20+ games with a zero winning percentage do I stop a run, assuming I made a mistake.

Mark
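For what it's worth, the standard error of an observed win rate is sqrt(p * (1 - p) / N), and at p near 0.5 over 1,000 games that is about 1.6 points - consistent with the ~1.5% Mark reports, so reading twogtp's ± as one standard deviation seems plausible (that reading is an assumption here, not a statement about twogtp's implementation):

    import math

    def win_rate_stderr(p, games):
        # Standard error of a binomial win-rate estimate.
        return math.sqrt(p * (1.0 - p) / games)

    print(win_rate_stderr(0.5, 1000))   # ~0.0158, i.e. about 1.6 points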
[computer-go] Test set for reference bots
I think I've seen some discussion a while ago about a set of tests used to test playing engines. I suppose it would make sense to compile such a set for the reference bots.

Matching a win rate and game length after 1,000,000 playouts is a quick first milestone for verification. A winning rate of something very close to 50% after thousands of games takes hours, if not a whole day. A set of situations with their expected results can provide something in between. Maybe the original set posted by Yamato can be used as a starting point, but I'd also include some very simple situations that check whether the engines properly implement things like ko, super-ko, seki, end of game, etc.

So far I've been using unit tests for things like this, but that's not very easy to transfer to other engine implementations. I haven't used the GoGui regression-test application yet, but I'm planning to look into it. The one thing I did notice is that the problem diagram and the problem's solution are separated. Wouldn't it have been easier to include the latter in the SGF? A simple format like "#c? [result-set]", where c is the color to play, similar to the format already used. Putting that as a comment in the first or last SGF node should be easy enough to parse. Or you could even define a special SGF property for this information. Maybe there's a well-thought-out reason why it is the way it is?

Mark
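As an illustration of the proposed annotation (this is the format Mark suggests, not an existing SGF property), a test position's last node could carry a comment like C[#b? C3 D5] - black to play, with C3 and D5 both accepted - and parsing it takes only a few lines:

    import re

    _TEST = re.compile(r"#([bw])\?\s+(.+)")

    def parse_test_comment(comment):
        # Parse the proposed '#c? result-set' annotation from an SGF
        # comment; returns (color, [accepted moves]) or None.
        m = _TEST.match(comment.strip())
        return (m.group(1), m.group(2).split()) if m else None

    print(parse_test_comment("#b? C3 D5"))   # ('b', ['C3', 'D5'])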