Re: [computer-go] Re: Should 9x9 komi be 8.0 ?]
> # One question: where does _aya_ come from, or what does it stand for?
> If my guess is correct, you are confusing Hiroshi, author of Aya, and
> me, Hideki, author of GGMC :). I'm sorry if I'm wrong.

I did. Sorry for the confusion. :(

Jonas
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] March KGS bot tournament
Reminder - it's later today.

In message <[EMAIL PROTECTED]>, Nick Wedd <[EMAIL PROTECTED]> writes:

Registration is now open for this Sunday's bot tournament. This will use full-sized boards for both divisions. It will start at 16:00 GMT, and take place in the Asian night, European evening, and American daytime. Time limits will be 45 minutes each, sudden death. It will end around half an hour before midnight GMT.

Registration is as usual, and is as described at http://www.weddslist.com/kgs/how/index.html

The tournaments themselves are on the KGS site at http://www.gokgs.com/tournInfo.jsp?id=366 and http://www.gokgs.com/tournInfo.jsp?id=367.

Nick
--
Nick Wedd [EMAIL PROTECTED]
Re: [computer-go] Re: Should 9x9 komi be 8.0 ?]
> From my observation, MC chooses good moves if and only if the winning
> rate is near 50%. Once it starts losing, it plays bad moves. Surely
> it's an illusion but it helps to prevent them.

If it's more important to avoid being too pessimistic (i.e. low estimated winning rates), there are two ways to bias the formulas below: either use a bigger lambda when the estimated true winning rate is less than .5, or replace .5 by .55 everywhere.

> > new_komi_aya =
> >   truncation[(.5 - old_winning_rate) +
> >   C * old_komi_aya - lambda / (2 * C)]^{+}
> >
> > if the estimated winning rate with true komi is more than .5, and
> >
> > new_komi_aya =
> >   truncation[(.5 - old_winning_rate) +
> >   C * old_komi_aya + lambda / (2 * C)]^{-}
> >
> > if the estimated winning rate with true komi is less than .5.

Oops! I forgot to divide by C... The equations here are not even homogeneous! They should read:

new_komi_ggmc =
  truncation[(.5 - old_winning_rate) / C + old_komi_ggmc - lambda / (2 C^2)]^{+}

if .5 - old_winning_rate < 0, and

new_komi_ggmc =
  truncation[(.5 - old_winning_rate) / C + old_komi_ggmc + lambda / (2 C^2)]^{-}

if .5 - old_winning_rate > 0.

Don't forget that C < 0, so the equations make sense.

Jonas
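The corrected update can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name update_komi is hypothetical, [x]^{+} and [x]^{-} are read as the positive part max(x, 0) and the negative part min(x, 0), "truncation" is taken to be rounding toward zero, and leaving komi unchanged when the winning rate is exactly .5 is a choice the message does not specify.

```python
import math


def update_komi(old_komi, old_winning_rate, C, lam, truncate=math.trunc):
    """One step of the corrected komi update (sketch, see assumptions above).

    C is the (negative) slope of winning rate with respect to komi,
    lam is the penalty parameter lambda from the message.
    """
    if C >= 0:
        raise ValueError("the formulas assume C < 0")

    gap = 0.5 - old_winning_rate
    base = gap / C + old_komi
    shift = lam / (2.0 * C * C)

    if gap < 0:
        # Winning rate above .5: subtract the lambda term, keep the positive part.
        return float(max(truncate(base - shift), 0.0))
    if gap > 0:
        # Winning rate below .5: add the lambda term, keep the negative part.
        return float(min(truncate(base + shift), 0.0))
    # Assumption: exactly .5 leaves the komi untouched.
    return float(old_komi)
```

With C = -0.1 and lambda = 0.02, a winning rate of .55 gives back 0 from an old komi of 0: under these assumptions the lambda term behaves like a dead zone around .5, and only larger deviations move the komi.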
Re: [EMAIL PROTECTED]: Re: [computer-go] Should 9x9 komi be 8.0 ?]
Petr Baudis wrote:
> The point here is to prevent the program from playing the "MC-hamete"
> moves that in most cases have no hope of working, but instead still aim
> at a close game and wait for some opponent's yose mistake. This closely
> matches human approach to the game as well - if you are confident in
> yose and see you are only little behind, you frequently don't start
> overplaying hopelessly, but instead still play the best you can and hope
> for an overturn in yose.

That's right. This isn't just a theoretical case: it happens in practice when I play the publicly available mogo on 9x9.

If I reach an endgame position where it thinks it's unlikely to win, it goes into self-destruct mode and loses. But it is better than me at the 9x9 endgame, and if it were to lower its sights and wait for me to go wrong it could win (I can verify this after the end of the game by going back to an earlier position and fiddling the komi).

One way to think of this is that the computer's opponent model has broken down (i.e., its effective assumption that it's playing against another instance of mogo). This also suggests that trying to investigate such a rule using self-play is unlikely to work well.

-M-
Re: [computer-go] Re: Should 9x9 komi be 8.0 ?]
Hideki Kato wrote:
> [EMAIL PROTECTED]: <[EMAIL PROTECTED]>:
>
>>> delta_komi = 10^(K * (number_of_empty_points / 400 - 1)),
>>> where K is 1 if winning and 2 if losing. Also, if the expected
>>> winning rate is between 45% and 65%, komi is unmodified.
>>
>> There's one thing I don't like at all, there: you could get a positive
>> evaluation when losing, and hence play conservatively when you should
>> not. That could especially be true near the end, when the situation
>> drifted from bad to less bad. I think that if this happens you're
>> doomed... The other way around would probably be painful too.
>
> From my observation, MC chooses good moves if and only if the winning
> rate is near 50%. Once it starts losing, it plays bad moves. Surely
> it's an illusion but it helps to prevent them.

I don't see that, but then again I am not a very strong player myself. What I notice is that it plays very "normal" until it's pretty obvious that it's losing, not just when the winning rate varies slightly from 50% but when it doesn't vary much from zero. However, it does play more desperately once the rate varies significantly from 50%, but certainly not "meaninglessly."

I don't like using the words "good" and "bad" when describing the quality of the moves because I try to use terminology that's more descriptive (although I fail miserably many times). In a lost position how do you distinguish one move from another when they all lose? It sounds funny to me when you say (in so many words) that once the program is losing it starts playing "bad moves."

Since this is a subjective quality, can we use a subjective term such as "normal" to describe moves that are cosmetically appealing to us? And perhaps "ugly" to describe moves that are not?

My feeling is that in lost positions, the only thing we are trying to accomplish is to make the moves more cosmetically appealing (normal) and at best improve the program's chances of winning against weak players.
After all, if the program is in bad shape, then to be completely realistic it's probably going to lose to the player that put it in this bad shape. Unless of course the program is being upset by a much weaker player, which can occasionally happen too. If a program is quite sure that it is losing, we can't reasonably expect the program that is beating it not to be aware of this too.

It's also a bad mistake in my opinion to try to coerce it into playing moves that are "normal" when an increasing amount of "desperation" is indeed called for. I have presented anecdotes before about how chess players have won games based on not playing as if things are normal when they are losing, but instead suddenly playing differently, which usually consists of violating general principles and "normal" play.

Again, I feel that this effect of moves that are not normal kicks in mostly when the position is very close to 0 or 1. So what we are looking for is AT BEST a very minor improvement and we are wasting a lot of energy on this. If the goal is to make the moves more cosmetically appealing I can respect that more - that is realistic and probably even easy to accomplish (and then the goal is to do it without weakening the program too much).

It's also being considered to use this to cover over some other weakness such as nakade, where the program doesn't understand the actual end of the game and is thinking it has lost by 2 or 3 stones when in fact it has a win. Aside from the fact that this is a fairly rare occurrence, I believe it should be addressed directly, not with a superficial treatment of the symptoms.

So if you can make it win slightly more lost games by playing as if nothing is wrong, then more power to you. It doesn't seem reasonable to me that you should be able to do this by feeding the program false information. You are effectively saying, "you are losing, be happy with that."
By the way, if this is to work (for instance for cosmetic reasons) I don't think you can apply this gradually or based on previous information. What if you are losing and the opponent plays a blunder? After all, this is what has to happen since the program is losing. You have to apply this based on information learned from the current move you are searching. You can't gradually fold it in as the game progresses and expect anything useful.

>> After all, the aim of tinkering with komi is to avoid that the computer
>> plays nonsensical moves, but it should know whether he must fight or
>> calm down.
>
> Agree. So, it's important _when_ to adjust komi or apply my method. My
> object is to keep winning rate around 50%, which yields good moves.

First of all, you won't keep the rate at 50% no matter what you do. At some point the programs are able to completely resolve the position and this happens surprisingly early in many cases with good programs. If it's actually winning, then if you deduct a
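Hideki Kato's rule quoted at the top of this message can be written down directly. What follows is a minimal sketch, assuming that "winning" means an expected winning rate above 65% and "losing" means one below 45% (the quoted text only says komi is unmodified between those bounds). The function name is hypothetical, and the sign with which the step is applied to the komi is left out because the quoted text does not state it.

```python
def delta_komi(number_of_empty_points, expected_winning_rate):
    """Size of the komi step from the quoted rule (sketch, see assumptions above)."""
    # Between 45% and 65% the komi is left unmodified.
    if 0.45 <= expected_winning_rate <= 0.65:
        return 0.0

    # K is 1 if winning and 2 if losing.
    k = 1 if expected_winning_rate > 0.65 else 2
    return 10.0 ** (k * (number_of_empty_points / 400.0 - 1.0))
```

Since the exponent is zero at 400 empty points and negative below that, the step is at most 1 point on any board with fewer than 400 empty points, and it shrinks as the board fills up, faster when losing (K = 2) than when winning (K = 1).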
Re: [computer-go] Re: Should 9x9 komi be 8.0 ?]
> I don't see that, but then again I am not a very strong player
> myself. What I notice is that it plays very "normal" until it's
> pretty obvious that it's losing, not just when it varies slightly from
> 50% but when it doesn't vary much from zero. However, it does play
> more desperately once it varies significantly from 50% but certainly not
> "meaninglessly."
>
> I don't like using the words "good" and "bad" when describing the
> quality of the moves because I try to use terminology that's more
> descriptive (although I fail miserably many times). In a lost
> position how do you distinguish one move from another when they all
> lose? It sounds funny to me when you say (in so many words) that
> once the program is losing it starts playing "bad moves."

At the first move, the position is lost or won for one of the two players. Yet I am sure you would consider that some of the moves are good or bad. The only thing that matters is:
Does this move increase my probability of winning against this opponent?
If two moves have the same result, which one is more beautiful?

I do not expect a computer to play the Lasker way any time soon, so we might have to change the first criterion into "against a generic opponent".

Now when MC goes on the rampage, it's usually LESS efficient than if it was a little less desperate. The post of Baudis explains that well. I think the problem is that MC is overestimating the probability that an opponent does not answer an easy threat and, if playing against a human, underestimating the probability of small imprecisions.

When losing, look for an overplay. But a reasonable overplay...

> Since this is a subjective quality can we use a subjective term such as
> "normal" to describe moves that are cosmetically appealing to us?
> And perhaps "ugly" to describe moves that are not?
>
> My feeling is that in lost positions, the only thing we are trying to
> accomplish is to make the moves more cosmetically appealing (normal) and
> at best improve the program's chances of winning against weak players.
> After all, if the program is in bad shape, then to be completely
> realistic it's probably going to lose to the player that put it in this
> bad shape. Unless of course the program is being upset by a much
> weaker player which can occasionally happen too. We can't reasonably
> expect that if a program is quite sure that it is losing that the
> program that is beating it is not going to be aware of this too.
>
> It's also a bad mistake in my opinion to try to coerce it into playing
> moves that are "normal" when an increasing amount of "desperation" is
> indeed called for. I have presented anecdotes before about how chess
> players have won games based on not playing as if things are normal when
> they are losing, but instead suddenly playing differently which usually
> consists of violating general principles and "normal" play.
>
> Again, I feel that this effect of moves that are not normal kicks in
> mostly when the position is very close to 0 or 1. So what we are
> looking for is AT BEST a very minor improvement and we are wasting a lot
> of energy on this.

Agreed.

> If the goal is to make the moves more cosmetically
> appealing I can respect that more - that is realistic and probably even
> easy to accomplish (and then the goal is to do it without weakening the
> program too much.)

Here I don't agree. Why should that noticeably weaken the program? It is the same situation as before: it happens only in positions that are already won or lost, so the change in performance is minor.
> It's also being considered to use this to cover over some other weakness
> such as nakade where the program doesn't understand the actual end of
> the game and is thinking it has lost by 2 or 3 stones when in fact it
> has a win. Aside from the fact that this is a fairly rare
> occurrence, I believe it should be addressed directly, not with a
> superficial treatment of the symptoms.
>
> So if you can make it win slightly more lost games by playing as if
> nothing is wrong, then more power to you. It doesn't seem reasonable
> to me that you should be able to do this by feeding the program false
> information. You are effectively saying, "you are losing, be happy
> with that."
>
> By the way, if this is to work (for instance for cosmetic reasons) I
> don't think you can apply this gradually or based on previous
> information. What if you are losing and the opponent plays a
> blunder? After all, this is what has to happen since the program is
> losing. You have to apply this based on information learned from the
> current move you are searching. You can't gradually fold it in as the
> game progresses and expect anything useful.

I also agree about that. In fact, I think that in my previous suggestion to Hideki, the winning rate after say 1000 simulations should be checked to be about what was expected, and if not, komi should be adjusted again.

> >> After all, the aim of tinkering with komi
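The check suggested above (probe with about 1000 simulations, compare the winning rate with what was expected, adjust komi again if it is off) could look like the loop below. It is a sketch only: run_simulations and adjust_komi are hypothetical callables standing in for whatever the engine provides, and the tolerance and the cap on the number of rounds are assumptions, not values from the thread.

```python
def settle_komi(run_simulations, adjust_komi, komi,
                expected_rate=0.5, tolerance=0.05,
                probe_size=1000, max_rounds=5):
    """Re-adjust komi until a short probe gives roughly the expected rate.

    run_simulations(komi, n) -> observed winning rate over n playouts
    adjust_komi(komi, rate)  -> new komi given the observed rate
    """
    for _ in range(max_rounds):
        rate = run_simulations(komi, probe_size)
        if abs(rate - expected_rate) <= tolerance:
            break  # the probe agrees with expectations, keep this komi
        komi = adjust_komi(komi, rate)
    return komi
```

Because the probe is run on the position currently being searched, this also fits the point quoted above that the adjustment should rely on information from the current move rather than be folded in gradually from earlier ones.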
RE: [computer-go] Re: Should 9x9 komi be 8.0 ?]
> I don't like using the words "good" and "bad" when describing the
> quality of the moves because I try to use terminology that's more
> descriptive (although I fail miserably many times). In a lost
> position how do you distinguish one move from another when they all
> lose? It sounds funny to me when you say (in so many words) that
> once the program is losing it starts playing "bad moves."
>
> Since this is a subjective quality can we use a subjective term such as
> "normal" to describe moves that are cosmetically appealing to us?
> And perhaps "ugly" to describe moves that are not?
>
> My feeling is that in lost positions, the only thing we are trying to
> accomplish is to make the moves more cosmetically appealing (normal)
> and at best improve the program's chances of winning against weak
> players. After all, if the program is in bad shape, then to be completely
> realistic it's probably going to lose to the player that put it in this
> bad shape.

This is chess thinking and it is not true for go. In chess if you have a clearly lost position (like down a piece without compensation), you can only hope for a miracle.

But go is a game of accumulating points. Every player, even a professional, makes mistakes in the endgame and plays moves that don't optimize the score. I'm talking about endgame positions, which by definition have no unsettled groups, so we aren't talking about moves that have different probabilities of causing an opponent mistake. I'm talking about making a move that gains 2 points when there is another one that gains 3 points.

If you are objectively 3 points behind with perfect play with 100 endgame moves to go, it is quite likely that you can catch up against a high Dan player. Against a low Dan player you can likely catch up 5 to 10 points in the endgame. Of course you can only catch up if you play the moves that gain the most points every time you move.
If you make a move that costs a point while making an obvious threat, you are falling further behind.

Good moves are the moves that gain the most points in the local situation. Often there are several good local moves, for example the one that gains the most points but lets the opponent move first elsewhere, and the one that gains fewer points but lets me move first elsewhere. If I can gain 5 points locally, but play a move that only gains 4 points and otherwise has no difference, it's clear that the 4 point move is a bad move. The 5 point move might not be the best move on the whole board (if there is a 6 point move somewhere else), but we can still say that the 5 point move is good and the 4 point move is bad.

This has nothing to do with "cosmetically appealing". Once the endgame starts and groups are solid and endgame regions become independent, then it is all about making the move that gains the most points.

This kind of endgame play is not obvious on 9x9 since that board is so small there isn't much endgame.

David
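The arithmetic behind this argument can be illustrated with a toy model. The sketch assumes fully independent endgame regions, each worth a fixed number of points to whoever plays there first, and it ignores the sente/gote distinction mentioned above. The names play_out, best and second_best are hypothetical, and the point values are made up for illustration.

```python
def play_out(values, pick_first, pick_second):
    """Alternate moves over independent regions; return first player's margin."""
    remaining = sorted(values, reverse=True)  # largest region first
    totals = [0, 0]
    pickers = (pick_first, pick_second)
    turn = 0
    while remaining:
        index = pickers[turn](remaining)
        totals[turn] += remaining.pop(index)
        turn = 1 - turn
    return totals[0] - totals[1]


def best(remaining):
    # Always take the move that gains the most points.
    return 0


def second_best(remaining):
    # Take the second-largest move whenever there is a choice.
    return 1 if len(remaining) > 1 else 0


regions = [6, 5, 4, 3, 2, 1]
print(play_out(regions, best, best))         # 3
print(play_out(regions, best, second_best))  # 7
```

In this toy position, an opponent who repeatedly settles for the second-biggest move gives up 4 points over only six moves, which is the kind of accumulation the message describes over a much longer endgame.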
[computer-go] Tactical information within simulations
There is much high-level data to be found within the MC runs, such as whether a group is alive or not, etc. Now, I don't know if it is easy to inject it back within the simulations.

Another approach (not excluding the first one) would be to gather much lower-level data. It's especially sad that the playouts have to discover over and over again that particular sequences are good or bad; we certainly could incorporate this knowledge in the following playouts.

To know a sequence, it is sufficient to know which move follows each move. We could store that in a way similar to "All moves as first" (even if I thought about that before knowing AMAF). We could record the answers to each move on the board, with the results, in the simulations.

Example: on 1 simulations, Black plays B2 6000 times. White answers
  A2 2000 times (1500 wins,  500 losses)
  C2 3000 times ( 500 wins, 2500 losses)
  B3 1000 times ( 500 wins,  500 losses)

Now, there could be many variants. The easiest way, and one of the least memory-hungry (already very bad from that point of view, though), would be to keep after each move how many times each possible following move has been played, and with what overall result.

We could also keep tables of the previous move by the same player. A combination of black's previous move + (white's) last move should be enough to distinguish many situations.

A more ambitious variant would be to build a tree from each possible move, similarly to the main tree. In such a case, I gather that nodes should be expanded after being OFTEN visited.

In both cases, we could/should keep the information from the previous moves, while letting it dim out slowly. A simple idea would be to multiply by say .9 or .95 the number of games each move has been recorded as playing at each stage (or, if storage is wins-losses, multiply the numbers of losses and wins).

There are many difficulties, especially since we must not use too much computing power.
Notably:
- Globally, how do we blend this with the choice of moves? It depends on the implementation.
- Not all moves are simultaneously available: some might already have been played. I don't know if this would really be a handicap.
- A move could be an only move in one situation, while another is the only move in another situation, with the same previous moves. I gather this should not be important. This rule would encourage using the move corresponding to the more frequent situation, but the winning rate would decrease because of the other situation, and encouragement should be kept at a reasonable level.
- In the case of trees, how do we take into account the sequences in the other trees leading to the same place? I mean, suppose we have a tree N4-M2-O5-M6 and a tree M2-O5-M6. It might be heavy to update everything and keep coherence.
- It's probably much too heavy to check whether a win or a loss occurred when the move had low or high probability to be chosen.
- We have frequencies and winning rates associated to moves. Either we can compute a probability distribution from this data, and blend it in some way with the existing one, or we can compute a bonus/malus for each of the moves appearing, and add that to the log_prob of the original distribution. The problem of the former is: how do we deal with moves that seldom/never appear after the previous moves? We cannot really get useful information on them. The problem with the latter is that we might end up giving a bonus to already much favoured moves.

I think the way we use this information should satisfy the following properties:
- If there is an only move, the recommendation should be strong enough to mainly overcome all other suggestions. Say chosen 90% of the time.
- Symmetrically, a move that always loses should be banned.
- If a move gets a new victory, it should get slightly more probable. Mutatis mutandis, the same for a loss.
- Quick to compute!
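The second option in the list above (a bonus/malus added to the log_prob of the original distribution) might be sketched like this. The particular bonus used here, a weighted log-ratio of smoothed win and loss counts, is an assumption chosen for illustration and is not a formula from this message; the function names and the weight parameter are hypothetical.

```python
import math


def reply_bonus(wins, losses, weight=1.0):
    """Bonus (positive) or malus (negative) from recorded results of a reply."""
    # Add-one smoothing keeps the bonus finite for unseen or one-sided replies.
    return weight * (math.log(wins + 1.0) - math.log(losses + 1.0))


def blend_reply_stats(log_prior, reply_stats, weight=1.0):
    """Add the bonus to each move's log-probability and renormalise.

    log_prior:   {move: log-probability from the existing playout policy}
    reply_stats: {move: (wins, losses)} recorded after the previous move
    """
    scores = {}
    for move, log_p in log_prior.items():
        wins, losses = reply_stats.get(move, (0.0, 0.0))
        scores[move] = log_p + reply_bonus(wins, losses, weight)

    # Softmax, shifted by the maximum for numerical stability.
    top = max(scores.values())
    exp_scores = {move: math.exp(s - top) for move, s in scores.items()}
    total = sum(exp_scores.values())
    return {move: e / total for move, e in exp_scores.items()}
```

With this choice an extra win always raises a move's probability a little, and a move that only ever loses is pushed toward zero without being strictly banned. Whether an only move reaches something like 90% depends on the weight, which would have to be tuned. Moves never seen after the previous move keep their prior score, which sidesteps the missing-data problem of the first option but does nothing against the risk of boosting already favoured moves.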
Ideally, using a Bradley-Terry model or something like that would be a good idea, but it is unusable during a game.

In any case, the "coefficients" we use should not be updated often. Maybe once in 1000/1 simulations if it's quick enough, otherwise once at the beginning of each move.

If anyone is interested, I can try to go down to specifics, matching as best I can the existing implementation. As finding a good mathematical procedure is not easy, I shall not work on that if it does not interest anyone.

Jonas
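The simplest variant described earlier in this message (for each move, count how often each following move was played and with what result, and let old data dim out by a factor such as .9 or .95) could be stored as below. This is a minimal sketch: the class and method names are hypothetical, wins are assumed to be counted from the replying player's point of view, and nothing here reflects an existing engine.

```python
from collections import defaultdict


class ReplyTable:
    """Win/loss counts of each reply to each move, with slow fading."""

    def __init__(self, decay=0.95):
        self.decay = decay
        # (previous_move, reply) -> [wins, losses]
        self.stats = defaultdict(lambda: [0.0, 0.0])

    def record(self, previous_move, reply, won, count=1.0):
        """Store the result of playouts where `reply` followed `previous_move`."""
        entry = self.stats[(previous_move, reply)]
        entry[0 if won else 1] += count

    def fade(self):
        """Dim out old information, e.g. once per real move in the game."""
        for entry in self.stats.values():
            entry[0] *= self.decay
            entry[1] *= self.decay

    def win_rate(self, previous_move, reply):
        """Winning rate of a reply, or None if it was never recorded."""
        wins, losses = self.stats.get((previous_move, reply), (0.0, 0.0))
        total = wins + losses
        return wins / total if total else None


table = ReplyTable(decay=0.9)
table.record("B2", "A2", won=True, count=1500)
table.record("B2", "A2", won=False, count=500)
print(table.win_rate("B2", "A2"))  # 0.75
```

Fading multiplies wins and losses by the same factor, so the winning rate of a reply is unchanged while its weight relative to fresh playouts shrinks. The table grows with the number of (move, reply) pairs, which is the memory cost the message already warns about.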
Re: [computer-go] Re: Should 9x9 komi be 8.0 ?]
David Fotland wrote:
>> I don't like using the words "good" and "bad" when describing the
>> quality of the moves because I try to use terminology that's more
>> descriptive (although I fail miserably many times). In a lost
>> position how do you distinguish one move from another when they all
>> lose? It sounds funny to me when you say (in so many words) that
>> once the program is losing it starts playing "bad moves."
>>
>> Since this is a subjective quality can we use a subjective term such as
>> "normal" to describe moves that are cosmetically appealing to us?
>> And perhaps "ugly" to describe moves that are not?
>>
>> My feeling is that in lost positions, the only thing we are trying to
>> accomplish is to make the moves more cosmetically appealing (normal)
>> and at best improve the program's chances of winning against weak
>> players. After all, if the program is in bad shape, then to be completely
>> realistic it's probably going to lose to the player that put it in this
>> bad shape.
>
> This is chess thinking and it is not true for go. In chess if you have a
> clearly lost position (like down a piece without compensation), you can only
> hope for a miracle.

This is true in go too. I'm talking about the kinds of position where go programs start to play "aimlessly", and they only do that when the result is like being down a queen in chess. Even being down a piece in chess is playable if there is some compensation.

> But go is a game of accumulating points. Every player, even professionals,
> make mistakes in the endgame and play moves that don't optimize the score.
> I'm talking about endgame positions, which by definition have no unsettled
> groups, so we aren't talking about moves that have different probabilities
> of causing an opponent mistake. I'm talking about making a move that gains
> 2 points when there is another one that gains 3 points.
I still say that go programs are not that smart, so when one of these positions arises it isn't a position that is difficult for pros to win. If an MC program thinks it has a 5% chance to win, it's probably still playing for that 5% chance and will still play reasonable moves (although perhaps what we might consider "foolishly risky" ones). At the point that it is pretty sure it is losing, perhaps 5% or less, it's like being a queen down in chess - it's going to lose and the nice nuances you are talking about don't exist. The positions you are talking about are still playable.

I still keep seeing this misconception that MC programs are really sloppy and do not care about the last few points - that could not be farther from the truth. If the game is hanging on a critical move where 1 or 2 stones make a difference, the program is going to be entirely focused on that. We forget the program is trying to be absolutely SAFE and will do anything it can to increase the winning probability. It is NOT playing foolishly risky and "squandering away" a win because it doesn't care about a few unimportant stones. THAT is precisely what a human player might do because he loses track of the exact score, or the game is close and he misses an extra critical stone here and there.

> If you are objectively 3 points behind with perfect play with 100 endgame
> moves to go, it is quite likely that you can catch up against a high Dan
> player. Against a low Dan player you can likely catch up 5 to 10 points in
> the endgame. Of course you can only catch up if you play the moves that
> gain the most points every time you move. If you make a move that costs a
> point while making an obvious threat, you are falling further behind.

My point is that if you need to catch up, you probably got outplayed ANYWAY early on and the odds are good that you are losing if you think you are losing with 95+ percent probability.
95% probability isn't about how many stones ahead or behind you are; it is an estimate of the probability of winning, and the margin could be 1/2 point or 50 points if those points are contested. If there is a fight over just 3 or 4 points that could go either way, the program is keenly aware of this and doesn't waste moves (we are judging things relatively; of course a much stronger player might think it's wasting moves, but from the point of view of the program it is "fully engaged" and playing up to its ability).

> Good moves are the moves that gain the most points in the local situation.
> Often there are several good local moves, for example the one that gains
> the most points but lets the opponent move first elsewhere, and the one that
> gains fewer points but lets me move first elsewhere. If I can gain 5 points
> locally, but play a move that only gains 4 points and otherwise has no
> difference, it's clear that the 4 point move is a bad move. The 5 point
> move might not be the best move on the whole board (if there is a 6 point
> move somewhere else), but we can still say that the 5 point move i
Re: [computer-go] Re: Should 9x9 komi be 8.0 ?]
Actually, I think the solution to all of this is relatively simple.

When the programs go into the state where the moves are no longer "cosmetically appealing", it's because all the moves lead to the same result, whether it be wins or losses.

That being so, one solution is to impose a different move generator, one that plays cosmetically appealing moves. As long as we can get UCT to "believe" that the pretty moves are ever so slightly better, it will play those when all else is equal.

In fact, my simple programs ogo and anchorman behave that way a little, since I have "imposed" some logic on the moves outside of the simulations. I have a few incentives and disincentives to encourage and discourage certain moves. If I took this a little farther you might not even realize it was a Monte Carlo program.

- Don

David Fotland wrote:
>> I don't like using the words "good" and "bad" when describing the
>> quality of the moves because I try to use terminology that's more
>> descriptive (although I fail miserably many times). In a lost
>> position how do you distinguish one move from another when they all
>> lose? It sounds funny to me when you say (in so many words) that
>> once the program is losing it starts playing "bad moves."
>>
>> Since this is a subjective quality can we use a subjective term such as
>> "normal" to describe moves that are cosmetically appealing to us?
>> And perhaps "ugly" to describe moves that are not?
>>
>> My feeling is that in lost positions, the only thing we are trying to
>> accomplish is to make the moves more cosmetically appealing (normal)
>> and at best improve the program's chances of winning against weak
>> players. After all, if the program is in bad shape, then to be completely
>> realistic it's probably going to lose to the player that put it in this
>> bad shape.
>
> This is chess thinking and it is not true for go.
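The idea of getting the search to "believe" that pretty moves are ever so slightly better can be sketched as a tie-break at the moment the final move is chosen. This is an illustration only, not how ogo or anchorman actually work: the function name, the style_bonus table and the value of epsilon are all assumptions.

```python
def choose_move(stats, style_bonus, epsilon=1e-3):
    """Pick the move with the best winning rate, nudged by a tiny style bonus.

    stats:       {move: (wins, visits)} from the search
    style_bonus: {move: score in [-1, 1]}, positive for "normal"-looking moves
    epsilon:     kept small so the bonus only matters between near-equal moves
    """
    def score(move):
        wins, visits = stats[move]
        rate = wins / visits if visits else 0.0
        return rate + epsilon * style_bonus.get(move, 0.0)

    return max(stats, key=score)


# All moves lose: the bonus decides.
print(choose_move({"A1": (0, 100), "E5": (0, 100)}, {"E5": 1.0}))    # E5
# A real difference in winning rate still dominates.
print(choose_move({"A1": (60, 100), "E5": (50, 100)}, {"E5": 1.0}))  # A1
```

Keeping epsilon far below any meaningful difference in winning rate is what limits the risk of weakening the program: the bonus can only change the choice among moves the search already considers equivalent.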
Re: [computer-go] Re: Should 9x9 komi be 8.0 ?]
Jonas Kahn wrote:
>> I don't see that, but then again I am not a very strong player
>> myself. What I notice is that it plays very "normal" until it's
>> pretty obvious that it's losing, not just when it varies slightly from
>> 50% but when it doesn't vary much from zero. However, it does play
>> more desperately once it varies significantly from 50% but certainly not
>> "meaninglessly."
>>
>> I don't like using the words "good" and "bad" when describing the
>> quality of the moves because I try to use terminology that's more
>> descriptive (although I fail miserably many times). In a lost
>> position how do you distinguish one move from another when they all
>> lose? It sounds funny to me when you say (in so many words) that
>> once the program is losing it starts playing "bad moves."
>
> At first move, the position is lost or won for one of the two players.
> Yet I am sure you could consider that some of the moves are good or bad.
> The only thing that matters is:
> Does this move increase my probability of winning against this opponent ?
> If two moves have the same result, which one is more beautiful ?

MC programs are not that smart. I'm talking about positions where it's so clear that even MC programs know it's a win or loss (at least "know" with high probability). This is really the only case where there is a problem. If there are reasonable chances (in practical terms) then MC doesn't have this ugly behavior.

- Don

> I do not expect a computer to play Lasker way any time soon, so we might
> have to change the first criterion into ``against a generic opponent".
>
> Now when MC goes to the rampage, it's usually LESS efficient than if it
> was a little less desperate. The post of Baudis explains that well. I
> think the problem is that MC is overevaluating the probability that an
> opponent does not answer an easy threat, and, if playing against a
> human, lessen the probability of small imprecisions.
>
> When losing, look for an overplay.
But a reasonable overplay... > > > >> Since this is a subjective quality can we use a subjective term such as >> "normal" to describe moves that are cosmetically appealing to us? >> And perhaps "ugly" to describe moves that are not? >> >> My feeling is that in lost positions, the only thing we are trying to >> accomplish is to make the moves more cosmetically appealing (normal) and >> at best improve the programs chances of winning against weak players. >> After all, if the program is in bad shape, then to be completely >> realistic it's probably going to lose to the player that put it in this >> bad shape. Unless of course the program is being upset by a much >> weaker player which can occasionally happen too.We can't reasonably >> expect that if a program is quite sure that it is losing that the >> program that it is beating it is not going to be aware of this too. >> >> It's also a bad mistake in my opinion to try to coerce it into playing >> moves that are "normal" when an increasing amount of "desperation" is >> indeed called for.I have presented anecdotes before about how chess >> players have won games based on not playing as if things are normal when >> they are losing, but instead suddenly playing differently which usually >> consists of violating general principles and "normal" play. >> >> Again, I feel that this effect of moves that are not normal kick in >> mostly when the position is very close to 0 or 1. So what we are >> looking for is AT BEST a very minor improvement and we are wasting a lot >> of energy on this. >> > > Agreed. > > >> If the goal is to make the moves more cosmetically >> appealing I can respect that more - that is realistic and probably even >> easy to accomplish (and then the goal is to do it without weakening the >> program too much.) >> > > Here I don't agree. > Why should that weaken noticeably the program ? Same situation as > before, it happens for won-lost positions: minor change in performance. 
> > >> It's also being considered to use this to cover over some other weakness >> such as nakade where the program doesn't understand the actual end of >> the game and is thinking it has lost by 2 or 3 stones when in fact it >> has a win.Aside from the fact that this is a fairly rare >> occurrence, I believe it should be addressed directly, not with a >> superficial treatment of the symptoms. >> >> So if you can make it win slightly more lost games by playing as if >> nothing is wrong, then more power to you. It doesn't seem reasonable >> to me that you should be able to do this by feeding the program false >> information. You are effectively saying, "you are losing, be happy >> with that." >> >> By the way, if this is to work (for instance for cosmetic reasons) I >> don't think you can apply this gradually or based on previous >> information. What if you are losing and the opponent plays a >> blunder? After all, this is w
Re: [computer-go] Re: Should 9x9 komi be 8.0 ?]
a few subtleties --

it's possible for a machine to play a perfect endgame, and my guess is that machines will play perfect endgames before people do, although most pros are excellent at the endgame.

counting ko threats and utilizing kos effectively is tricky in playouts -- kos can naturally extend a playout very, very far beyond where the actual advantage would be taken in a non-ko situation, and the likelihood of getting this far often enough in playouts to see the advantage is going to be difficult for machines without a lot of domain-specific knowledge.

different humans are often good at different stages of the game, and making up a few points in the endgame, or getting a massive lead in the beginning of the game may be possible, convincing a computer player of something that isn't true -- either that it's nearly guaranteed to win, or nearly guaranteed to lose.

all that having been said, i'm quite impressed with how well these programs are doing.

s.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
endgame (Was [computer-go] Re: Should 9x9 komi be 8.0 ?])
Mogo is already very strong in the endgame and certainly plays perfectly near the end of the game. The more advanced the program, the sooner it can play a perfect endgame.

But correct ko-threat play has nothing to do with the playout part: since it is a strategic concept that involves global understanding, it is handled by the UCT tree part.

----- Original message -----
From: steve uurtamo <[EMAIL PROTECTED]>
To: computer-go
Sent: Sunday, 2 March 2008, 20:25:33
Subject: Re: [computer-go] Re: Should 9x9 komi be 8.0 ?]

it's possible for a machine to play a perfect endgame, and my guess is that machines will play perfect endgames before people do, although most pros are excellent at the endgame.

counting ko threats and utilizing kos effectively is tricky in playouts -- kos can naturally extend a playout very, very far beyond where the actual advantage would be taken in a non-ko situation, and the likelihood of getting this far often enough in playouts to see the advantage is going to be difficult for machines without a lot of domain-specific knowledge.
Re : [computer-go] Tactical information within simulations
I think it is a very good and natural idea. I guess in the future, all MC programs will have some kind of dynamic playout policies.

----- Original message -----
From: Jonas Kahn <[EMAIL PROTECTED]>
To: computer-go
Sent: Sunday, 2 March 2008, 19:43:29
Subject: [computer-go] Tactical information within simulations

There is much high-level data to be found within the MC runs, such as whether a group is alive or not, etc. Now, I don't know if it is easy to inject it back within the simulations.

Another approach (not excluding the first one) would be to gather much lower-level data. It's especially sad that the playouts have to discover over and over again that particular sequences are good or bad; we certainly could incorporate this knowledge in the following playouts.

To know a sequence, it is sufficient to know which move follows each move. We could store that in a way similar to "All moves as first" (even if I thought about that before knowing AMAF). We could record the answers to each move on the board, with the results, in the simulations.

Example: on 1 simulations, Black plays B2 6000 times. White answers
A2 2000 times (1500 wins, 500 losses)
C2 3000 times (500 wins, 2500 losses)
B3 1000 times (500 wins, 500 losses)

Now, there could be many variants.

The easiest way, and one of the least memory-hungry (already very bad from that point of view, though), would be to keep after each move how many times each possible following move has been played, and with what overall result. We could also keep tables of the previous move by the same player. A combination of black's previous move + (white's) last move should be enough to distinguish many situations.

A more ambitious variant would be to build a tree from each possible move, similarly to the main tree. In such a case, I gather that nodes should be expanded after being OFTEN visited.

In both cases, we could/should keep the information from the previous moves, while letting it dim out slowly.
A simple idea would be to multiply by, say, .9 or .95 the number of games recorded for each move at each stage (or, if storage is wins-losses, to multiply both the number of wins and the number of losses).

There are many difficulties, especially since we must not use too much computing power. Notably:

- Globally, how to blend this with the choice of moves? It depends on the implementation.
- Not all moves are simultaneously available: some might already have been played. I don't know if this would really be a handicap.
- A move could be the only move in one situation, while another is the only move in another situation, with the same previous moves. I gather this should not be important. This rule would encourage using the move corresponding to the more frequent situation, but the winning rate would decrease because of the other situation, and encouragement should be kept at a reasonable level.
- In the case of trees, how do we take into account the sequences in the other trees leading to the same place? I mean, suppose we have a tree N4-M2-O5-M6 and a tree M2-O5-M6. It might be heavy to update everything and keep coherence.
- It's probably much too heavy to check whether a win or a loss occurred when the move had a low or high probability of being chosen.
- We have frequencies and winning rates associated with moves. Either we can compute a probability distribution from this data, and blend it in some way with the existing one, or we can compute a bonus/malus for each of the moves appearing, and add that to log_prob of the original distribution. The problem with the former is: how do we deal with moves that seldom/never appear after the previous moves? We cannot really get useful information on them. The problem with the latter is that we might end up giving a bonus to already much favoured moves.
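The bookkeeping proposed above (for each move, count the answers that followed it and their results, then let old counts fade) could look roughly like the Python sketch below. It is only an illustration: the class name `ReplyTable`, its methods, and the default decay of 0.9 are assumptions made for the example, and wins are assumed to be counted from the point of view of the player who answers.

```python
from collections import defaultdict


class ReplyTable:
    """Win/loss counts for each (previous move, answer) pair seen in playouts.

    Hypothetical helper written for this example, not taken from any program.
    """

    def __init__(self, decay=0.9):
        self.decay = decay  # assumed fading factor, e.g. .9 or .95
        # stats[previous_move][answer] = [wins, losses] of the answering player
        self.stats = defaultdict(lambda: defaultdict(lambda: [0.0, 0.0]))

    def record(self, previous_move, answer, won, count=1):
        """Add `count` playouts in which `answer` followed `previous_move`."""
        entry = self.stats[previous_move][answer]
        entry[0 if won else 1] += count

    def fade(self):
        """Let old information dim out: scale every count by the decay factor."""
        for answers in self.stats.values():
            for entry in answers.values():
                entry[0] *= self.decay
                entry[1] *= self.decay

    def winning_rate(self, previous_move, answer):
        """Return wins / games for this answer, or None if it was never seen."""
        wins, losses = self.stats.get(previous_move, {}).get(answer, (0.0, 0.0))
        total = wins + losses
        return wins / total if total else None


# The figures from the example: White's answers to Black B2.
table = ReplyTable()
for answer, wins, losses in [("A2", 1500, 500), ("C2", 500, 2500), ("B3", 500, 500)]:
    table.record("B2", answer, won=True, count=wins)
    table.record("B2", answer, won=False, count=losses)
table.fade()  # e.g. once per real move played on the board
```

With these figures A2 keeps a winning rate of 0.75 after fading; only the weight of the old games shrinks, so fresh playouts count for more. Memory grows with the square of the number of board points, which is the cost the message already warns about.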
I think the way we use this information should satisfy the following properties:

- If there is an only move, the recommendation should be strong enough to mainly overcome all other suggestions. Say chosen 90% of the time.
- Symmetrically, a move that always loses should be banned.
- If a move gets a new victory, it should get slightly more probable. Mutatis mutandis, the same for a loss.
- Quick to compute! Ideally, using a Bradley-Terry model or something like that would be a good idea, but it is unusable during a game.

In any case, the "coefficients" we use should not be updated often. Maybe once in 1000/1 simulations if it's quick enough, otherwise once at the beginning of each move.

If anyone is interested, I can try to go down to specifics, matching as best I can the existing implementation. As finding a good mathematical procedure is not easy, I shall not work on that if it does not interest anyone.

Jonas
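One way to approach the properties listed above is the bonus/malus option mentioned earlier in the message: turn the recorded wins and losses into a term added to the log-probability of the existing playout policy. The Python sketch below uses smoothed log-odds as that term. The function names, the `prior` smoothing constant and the choice of log-odds are all assumptions made for illustration, not something specified in this thread; a Bradley-Terry fit would be a different and heavier choice.

```python
import math


def log_bonus(wins, losses, prior=10.0):
    """Smoothed log-odds of winning: 0 with no data, positive if wins dominate.

    `prior` is an assumed smoothing constant; a larger value means weaker bonuses.
    """
    return math.log((wins + prior) / (losses + prior))


def blended_distribution(base_log_prob, reply_stats, prior=10.0):
    """Add the bonus to each move's base log-probability and renormalise.

    base_log_prob: {move: log-probability from the existing playout policy}
    reply_stats:   {move: (wins, losses)} recorded after the previous move
    """
    scores = {}
    for move, log_p in base_log_prob.items():
        wins, losses = reply_stats.get(move, (0.0, 0.0))
        scores[move] = log_p + log_bonus(wins, losses, prior)
    top = max(scores.values())  # subtract the maximum for numerical stability
    weights = {move: math.exp(score - top) for move, score in scores.items()}
    total = sum(weights.values())
    return {move: weight / total for move, weight in weights.items()}


# Uniform base policy over three answers, with the counts from the B2 example.
base = {move: math.log(1.0 / 3.0) for move in ("A2", "C2", "B3")}
stats = {"A2": (1500, 500), "C2": (500, 2500), "B3": (500, 500)}
policy = blended_distribution(base, stats)
```

Each extra win raises the bonus a little and each extra loss lowers it, and the computation is a single pass over the candidate moves. A move that has only ever lost is pushed towards zero probability rather than strictly banned, and how close a lone winning move gets to the 90% mark depends on `prior` and on the number of recorded games, so both would need tuning. A move never seen after the previous move gets a bonus of zero and simply keeps its base probability.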
Re: endgame (Was [computer-go] Re: Should 9x9 komi be 8.0 ?])
i'm just saying (and perhaps i'm misunderstanding something here) that lots of playout depth, and therefore lots of simulations, are required to see *any* advantage to playing out a ko.

s.

On Sun, Mar 2, 2008 at 3:17 PM, ivan dubois <[EMAIL PROTECTED]> wrote:
> Mogo is already very strong in the endgame and certainly plays perfectly near the end of the game. The more advanced the program, the sooner it can play a perfect endgame.
> But correct ko-threat play has nothing to do with the playout part: since it is a strategic concept that involves global understanding, it is handled by the UCT tree part.
Re: endgame (Was [computer-go] Re: Should 9x9 komi be 8.0 ?])
> But correct ko-threat play has nothing to do with the playout part: since it is a strategic concept that involves global understanding, it is handled by the UCT tree part.

Yes and no. Theoretically, that's the work of the UCT part. But, as Steve pointed out, kos can go on for a long time. I don't know what depth is attained in the tree (by the way, I would really like to know), but I doubt it is that deep. Moreover, some kos must be kept for later.

Hence, some basic understanding of kos in the playouts might be useful.

That's merely a variation of the horizon effect. We could even imagine a situation where the UCT makes a threat that loses points with the sole aim of pushing the ko past the horizon, where it would be 50-50 (for example) in the playout.

Jonas
Re : endgame (Was [computer-go] Re: Should 9x9 komi be 8.0 ?])
Ok, I think I see what you mean, but I am not sure I really agree. As you say, this is related to the horizon effect. I think current MC programs can play ko quite well because they are trying to delay the outcome of losing the ko; therefore they tend to play threats to gain time, just like human players do. I don't think it is essential that the ko be resolved inside the tree part. And I don't believe there exists an efficient way to handle ko in the playout other than just forbidding simple ko recapture.

Ivan

----- Original message -----
From: Jonas Kahn <[EMAIL PROTECTED]>
To: computer-go
Sent: Sunday, 2 March 2008, 21:32:43
Subject: Re: endgame (Was [computer-go] Re: Should 9x9 komi be 8.0 ?])

Yes and no. Theoretically, that's the work of the UCT part. But, as Steve pointed out, kos can go on for a long time. I don't know what depth is attained in the tree (by the way, I would really like to know), but I doubt it is that deep. Moreover, some kos must be kept for later.

Hence, some basic understanding of kos in the playouts might be useful.

That's merely a variation of the horizon effect. We could even imagine a situation where the UCT makes a threat that loses points with the sole aim of pushing the ko past the horizon, where it would be 50-50 (for example) in the playout.

Jonas
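Forbidding the simple ko recapture inside a playout needs very little state: one remembered point. The Python sketch below assumes a hypothetical playout engine that already knows, after each move, which stones were captured and the size and liberty count of the group just played; the function names and arguments are invented for this example and do not come from any program discussed here.

```python
def ko_point_after_move(captured, group_size, group_liberties):
    """Return the point the opponent may not retake next move, or None.

    captured:        list of points captured by the move just played
    group_size:      number of stones in the group of the stone just played
    group_liberties: liberties of that group once the captures are removed

    A simple ko arises when a lone stone captures exactly one stone and is
    itself left with a single liberty (the point it has just emptied).
    """
    if len(captured) == 1 and group_size == 1 and group_liberties == 1:
        return captured[0]
    return None


def playout_candidates(empty_points, ko_point):
    """Filter the candidate moves of a playout, excluding the ko point."""
    return [point for point in empty_points if point != ko_point]


# A lone stone captures one stone at B2 and is left in atari: B2 is the ko point.
ko = ko_point_after_move(captured=["B2"], group_size=1, group_liberties=1)
moves = playout_candidates(["A1", "B2", "C3"], ko)
```

The check costs a comparison per candidate move, which is why it fits in a fast playout. It only prevents the immediate retake; it knows nothing about ko threats or about which kos should be kept for later, which is the part of the problem the thread is arguing over.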
Re: Re : endgame (Was [computer-go] Re: Should 9x9 komi be 8.0 ?])
the issue with ko is the order in which the ko threats are played, which can only be successfully evaluated if the average playout finishes the ko correctly.

s.

On Sun, Mar 2, 2008 at 4:56 PM, ivan dubois <[EMAIL PROTECTED]> wrote:
> Ok, I think I see what you mean, but I am not sure I really agree.
> As you say, this is related to the horizon effect. I think current MC programs can play ko quite well because they are trying to delay the outcome of losing the ko; therefore they tend to play threats to gain time, just like human players do. I don't think it is essential that the ko be resolved inside the tree part. And I don't believe there exists an efficient way to handle ko in the playout other than just forbidding simple ko recapture.
>
> Ivan