Re: [computer-go] State of the art of pattern matching
On 31-mrt-08, at 20:28, Don Dailey wrote:
> You could be blind-siding the program.

I think this is the crux of the matter, not just in MC but in Go programming in general. If you add 'strong' knowledge you can create blind spots. For example, I guess a ko rarely gets played during playouts because strong priority is given to connecting the stone in atari. It's only during the actual search that ko is discovered. But if you have a case where finding the blind spot of a piece of knowledge is almost always out of the scope of the search tree, then it can lead to very poor play.

Mark

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] State of the art of pattern matching
Hi Lukasz,

In Rémi's paper about Bradley-Terry models he found a way to give a comparable gamma score to things that are different, for instance capturing a stone vs. a given 3x3 pattern. His model is much more general, but has fewer patterns (nowhere near the ~200K patterns of my system). Additionally, my mask-pattern-only system has a reasonable a priori value: times_played / (times_played + times_postponed).

But there is an additional issue handled by Rémi's system that is ignored by the naive system: the fact that some competitions are much harder than others. If the sum of competing scores in the average competition is, say, 100, the value of winning a competition whose sum is 40 is obviously much less than winning one whose sum is 300. That is not included in the simple formula. Therefore, a simple intuitive idea is adding more for the hard competition (300/100) and less for the easy one (40/100). Because there is feedback (when you change the scores you are also changing the sums of competing scores), there are many variables to adjust (~200K), and the a priori values are already reasonable, I prefer to adjust in small steps. Since this is an offline process, speed is irrelevant. The hits of each move don't change that much.

At the beginning of the adjustment:

Hits of move  1 = 0.382405  acc = 0.382405
Hits of move  2 = 0.113530  acc = 0.495935
Hits of move  3 = 0.067430  acc = 0.563365
Hits of move  4 = 0.048055  acc = 0.611420
Hits of move  5 = 0.036304  acc = 0.647725
Hits of move  6 = 0.028978  acc = 0.676703
Hits of move  7 = 0.023769  acc = 0.700472
Hits of move  8 = 0.019793  acc = 0.720264
Hits of move  9 = 0.016693  acc = 0.736957
Hits of move 10 = 0.014164  acc = 0.751121

At the end of the adjustment (additional steps don't improve further; some values may even decrease):
Hits of move  1 = 0.409278  acc = 0.409278
Hits of move  2 = 0.115598  acc = 0.524877
Hits of move  3 = 0.062659  acc = 0.587536
Hits of move  4 = 0.044014  acc = 0.631550
Hits of move  5 = 0.033681  acc = 0.665230
Hits of move  6 = 0.027696  acc = 0.692926
Hits of move  7 = 0.022885  acc = 0.715811
Hits of move  8 = 0.019029  acc = 0.734840
Hits of move  9 = 0.015758  acc = 0.750598
Hits of move 10 = 0.012886  acc = 0.763484

What I call the distribution is this: the gamma value of each legal move defines a probability distribution function over the legal moves. If we had no information (uniform random distribution), we would expect that with 20% of the legal moves we hit 20% of the time, etc. So for 0.1, 0.2, ..., 0.9 of the legal moves we hit around 0.1, 0.2, ..., 0.9. Now, what happens when we use our pdf over the legal moves? We can get 10%, 20%, 30%, etc. with very few move candidates, but the boundary move has to be counted fractionally. Say the first move has probability 0.17, the first two together have 0.21, and I want to get 0.2: if the first is a match, count 1; if the second is a match, count 3/4 = (0.2 - 0.17)/(0.21 - 0.17); if any other move is a match, count 0. Do I really get 0.1, 0.2, ..., 0.9, only with far fewer moves? No. For a reason I don't understand, I get something like:

Distribution fit expected 0.1 found 0.153164
Distribution fit expected 0.2 found 0.298602
Distribution fit expected 0.3 found 0.433074
Distribution fit expected 0.4 found 0.551575
Distribution fit expected 0.5 found 0.651135
Distribution fit expected 0.6 found 0.727419
Distribution fit expected 0.7 found 0.776876
Distribution fit expected 0.8 found 0.804008
Distribution fit expected 0.9 found 0.823292

So my distribution is distorted: when I try to get 30% of the "guessing chance" I get 43.3%, but when I try to get 90% I get 82.3%. I don't know why.

Jacques.

Łukasz Lew wrote:
On Sat, Mar 29, 2008 at 3:47 PM, Jacques Basaldúa <[EMAIL PROTECTED]> wrote:
Mark Boon wrote:
> I suppose there's no reason why it has to be assembler,
> you could just as well generate C-code.
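Jacques's fractional counting rule can be sketched in code. This is my reading of the scheme he describes, not his actual implementation, and the names are hypothetical: `probs` is the move distribution sorted in descending order, `played_index` the rank of the move actually played, and `target` the fraction of probability mass we allow ourselves.

```python
def distribution_fit_credit(probs, played_index, target):
    """Credit the played move: 1 if it lies fully inside the first
    `target` of probability mass, a fraction if it straddles the
    boundary, 0 if it lies outside."""
    cum = 0.0
    for i, p in enumerate(probs):
        prev, cum = cum, cum + p
        if i == played_index:
            if cum <= target:
                return 1.0                  # fully inside the kept mass
            if prev < target:
                return (target - prev) / p  # boundary move: partial credit
            return 0.0                      # outside the kept mass
    return 0.0

# Jacques's example: first move 0.17, first two 0.21, target 0.2
# -> a match on the second move counts (0.2 - 0.17) / (0.21 - 0.17) = 3/4
```

Averaging this credit over many positions gives the "found" column; with perfectly calibrated gammas it should match "expected".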
I don't think so. But I don't write as much asm as it may appear. I see what the compiler writes when I debug, and write critical parts only, plus automatically generated code.

> I don't see yet how you go from there to finding patterns.

Depending on the mode, it computes either the hash or the mask. That mask has to be translated into urgency by a table search. (Btw. urgency is not just times_played / (times_played + times_postponed); that value is used as an a priori value for Bradley-Terry models as Rémi introduced them. My adjustment is not as advanced as what Rémi describes. I just give a weight to a competition based on the sum of the competing gamma values, and that gives me another point:

SUM weights_of_played / (SUM weights_of_played + SUM weights_of_postponed)

Can you tell how you came up with such an equation? Does it improve much?

Thanks,
Lukasz

And I advance a step (like 1/4 of the distance) in that direction and use the new point as the current value. Iterating this way improves the wins of the best move but distorts the distribution (I extract lots of statistics at each step), so I stop rather early.) The macro for searching in an ordered
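The weighted adjustment Jacques describes (weight each competition by its sum of competing gammas relative to the average competition, form the weighted ratio, then advance a quarter of the distance toward it) might look like the following sketch; the function names and data layout are mine, not his:

```python
def weighted_urgency(competitions, avg_sum):
    """competitions: (played, comp_sum) pairs for one pattern, where
    comp_sum is the sum of gammas competing in that position.
    Returns SUM weights_played / (SUM weights_played + SUM weights_postponed),
    with each competition weighted by comp_sum / avg_sum."""
    w_played = sum(s / avg_sum for played, s in competitions if played)
    w_postponed = sum(s / avg_sum for played, s in competitions if not played)
    return w_played / (w_played + w_postponed)

def small_step(current, new_point, step=0.25):
    """Advance 1/4 of the distance toward the new point, as described."""
    return current + step * (new_point - current)
```

A pattern that won one hard competition (sum 300) and lost two easy ones (sum 100 each), against an average of 100, gets 3 / (3 + 2) = 0.6 rather than the unweighted 1/3.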
Re: [computer-go] State of the art of pattern matching
On Mon, Mar 31, 2008 at 03:12:39PM -0700, Christoph Birk wrote:
> On Mar 31, 2008, at 1:05 PM, Don Dailey wrote:
>> Christoph Birk wrote:
>>> On Mar 31, 2008, at 10:48 AM, Mark Boon wrote:
>>>> I don't know about this. I'm pretty sure MoGo checks if the stone
>>>> can make at least two liberties (ladder problem) in which case it
>>>> can still be horrible but very seldom worse than random.
>>> I would expect playing a "not-working" ladder to be worse than
>>> random most of the time.
>> Of course this is true, but presumably a move that answers a direct
>> atari threat would classify as being better than random.
> Not if it's a working ladder.

I think this is not obvious, since there is still about a 50% chance of who gets the second move there. The original MoGo pattern describes only a capture-possibility check, not an atari-possibility check, so even if you play out the ladder, the opponent will most likely tenuki. So I think not playing out working ladders as white is actually better because it saves _black_ from getting a bad result!

I implemented a naive ladder check (that does not catch a lot of cases, especially when you have to decide whether to "slide" the stones in the ladder along the border or some of your friendly stones) and tested its efficiency in UCT-driven playouts with only a 50%-probability capture-possibility check heuristic. Against GNU Go level 8, I get 37% +-5.7% wins with the ladder check, 28% +-4.6% without it. Unfortunately, the numbers are not very precise and the difference is probably much smaller in reality - 37% seems overinflated given later measurements; I will redo the test with a newer codebase and more games. (I plan to post detailed measurements of the effects of various heuristics and parameters anyway; so far it seems my code is still too buggy though: AMAF, prior knowledge and FPU all make my program weaker :-(.)

-- Petr "Pasky" Baudis
Whatever you can do, or dream you can, begin it. 
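A naive ladder reader in the spirit of what Petr describes, with the same kind of blind spots he mentions (it ignores counter-ataris, ko, and stones that could "slide" along friends or the border), could look like this sketch. The board representation and names are mine, purely illustrative:

```python
def neighbors(p, n):
    x, y = p
    return [(a, b) for a, b in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1))
            if 0 <= a < n and 0 <= b < n]

def group_libs(p, own, other, n):
    """Flood-fill the group containing p; return its set of liberties."""
    seen, frontier, libs = {p}, [p], set()
    while frontier:
        q = frontier.pop()
        for r in neighbors(q, n):
            if r in own:
                if r not in seen:
                    seen.add(r)
                    frontier.append(r)
            elif r not in other:
                libs.add(r)
    return libs

def ladder_captures(black, white, stone, n=19):
    """True if white can capture the black `stone` with a simple ladder.
    Black always extends to its single liberty; white then tries each of
    the two resulting liberties as the next atari."""
    libs = group_libs(stone, black, white, n)
    if len(libs) == 0:
        return True
    if len(libs) >= 2:
        return False                 # not in atari: no ladder to read
    black = black | libs             # black extends to the single liberty
    libs = group_libs(stone, black, white, n)
    if len(libs) <= 1:
        return True                  # still capturable on the next move
    if len(libs) >= 3:
        return False                 # black escaped
    for atari in libs:               # two liberties: white picks one
        white2 = white | {atari}
        if len(group_libs(atari, white2, black, n)) < 2:
            continue                 # don't put the chasing stone in self-atari
        if ladder_captures(black, white2, stone, n):
            return True
    return False
```

On an otherwise empty board, a ladder driven from the corner runs diagonally to the edge and captures; remove a driving stone and the target simply escapes with two liberties.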
Boldness has genius, power, and magic in it. -- J. W. von Goethe
[computer-go] Some ideas how to make strong heavy playouts
A recurrent concept popping up in discussions on how to improve playouts is "balance". So I would like to try to share the philosophy behind the playouts of Valkyria: how I define "balance" and how it relates to the evaluation of go positions.

*Background

In an old-school program the evaluation function would try to see which stones are safely connected to other stones of the same colour. Connected stones are called groups, and the program would probably also try to evaluate the safety of the groups, looking at the eyespace at hand, weak neighbouring groups and so on. This quickly gets very complicated. My old program Viking had several thousand handmade patterns for evaluating connectivity alone. This worked like a dream as long as the position consisted of patterns in the database... but in each and every game there were new situations and new patterns had to be added. A more robust method would be to use tactical search in the program to evaluate connectivity. The problem there is to ensure accuracy efficiently. Any tactical search tends to either become too time consuming, or resort to guessing.

*MC-evaluation

Programs using uniformly random MC-eval favor very solid but inefficient shape, often building blocks of adjacent stones in the center of the board. The reason is that if stones are played more loosely, they often get cut off and killed in the simulations. What we rather want is a program that can play efficient moves where stones are safely connected but occupy as much territory/eyespace as possible.

The tree search (UCT) cannot solve this problem alone. Shapes created in a 19x19 game may exist untouched until the late endgame, and it is not possible to read out all shapes on the board. It is much better if secure shapes stay stable in the simulation. The way I implemented that in Valkyria is that the playout part is basically reactive to random moves that attack shapes on the board.
It does not in any sense attempt to play the best move on the board. If it does not need to defend a shape, it plays uniformly at random somewhere. [Note that Valkyria also prunes really ugly moves; thus it plays uniformly the first move that is not pruned.] This is also how the pattern system works in MoGo, as I understand it. If I remember correctly, all MoGo patterns are very basic and common-sense defenses against attacks on otherwise stable shapes.

But there also has to be balance. Valkyria also punishes bad shape. That is, if a weak shape is already on the board, or a random move attacked two shapes simultaneously in the simulation, then the program may attack the weakness (or, in a way, it also reacts to the situation, defending against "the weak shape becoming stronger"). Often the same move that would have been the proper defence is played.

*Eliminating noise rather than predicting the best move

Nothing I wrote above is original or new to the readers of this list. But I want to make a distinction between systems that try to predict the best move and a system that only tries to eliminate noise from the otherwise very random simulations. Noise is eliminated when strong stones live and weak stones die almost always in the simulations. This way the random evaluations will mostly react to moves that matter in urgent fighting with shapes that are not yet stable. An MC program that does this should stop defending and attacking strong shapes and would require many fewer simulations to discriminate between good and bad moves. Valkyria2 and Valkyria3 have slightly different tree search algorithms but use the same playouts. Both versions need only 512 playouts per move to win 50% against GNU Go 3.7.10. Still, I think predicting the best moves is very important in the tree part, but this may be much less important in the playouts, and perhaps even detrimental, as some people have experienced.
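The reactive policy Magnus describes ("defend if a shape is attacked, otherwise uniformly random over non-pruned moves") can be sketched abstractly. The callables here stand in for a real engine's tactical code and vetoes; they are illustrative placeholders, not Valkyria's actual interfaces:

```python
import random

def reactive_playout_move(urgent_defense, candidates, is_vetoed, rng=None):
    """Pick one playout move: answer an attack on our shapes if there is
    one, otherwise play uniformly among candidates that are not pruned
    as 'really ugly' (shuffle, then take the first survivor)."""
    rng = rng or random.Random(0)
    defense = urgent_defense()
    if defense is not None:
        return defense                 # reactive part: defend the shape
    moves = list(candidates)
    rng.shuffle(moves)
    for move in moves:
        if not is_vetoed(move):
            return move                # uniform over the non-vetoed moves
    return "pass"                      # nothing playable left
```

The point of the structure is that the policy never ranks moves globally; it only reacts, which is what keeps stable shapes stable without biasing the rest of the playout.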
*19x19

The best programs on 19x19 seem to focus the UCT search on local fighting. This temporarily overcomes the biases of the simulations locally. But the information gained locally about good shape in the tree is forgotten when the fighting moves elsewhere, though it can then be rediscovered later if the fighting comes back. Could a future improvement for 19x19 go be to use locally narrow searches that seed the playouts with strong patterns for the current shapes on the board? Maybe someone is already doing this? A really strong approach would be to eliminate the need for hard-coded patterns or offline pattern harvesting and let the program learn during the game.

-Magnus
Re: [computer-go] Some ideas how to make strong heavy playouts
don,

> But I also discovered that there seems to be no benefit whatsoever in
> removing them from the play-outs. I have no real explanation for
> this. But it does tell me that the play-outs are very different in
> nature from the tree - you cannot just use the same algorithms for
> selection and prioritizing moves.

did you use the same heuristic in the playouts when pruning them? i.e. that no other stones are anywhere nearby? in any particular playout, they may be the killing move for a group that ends up getting built near to the edge of the board, or a successful monkey jump. so removing them from somewhere deep in the tree as a rule would be bad, but if nothing is anywhere nearby, removing them as a first move choice is pretty reasonable. they make up a fairly small fraction of possible moves, and this is magnified the deeper you go into a search (so the net effect would be diminished, even with the heuristic). anything you can say about what not to do on the very first move of a search, if it applies to every board situation (ha ha ha) is great, though, if it's fast enough to check for. in fact, if *nothing* is *anywhere* nearby, a move on even the second line is bad. third or fourth is much better.

s.
Re: [computer-go] Some ideas how to make strong heavy playouts
Hi Magnus,

Your post makes a great deal of sense. I agree with all the points you have stated. I don't think you have ever made an illogical post like most of us have (including myself), and they are always well thought out and worded. I have a response to this comment:

> Still I think predicting the best moves is very important in the tree
> part, but this may be much less important in the playouts, and perhaps
> even detrimental as some people have experienced.

A class of "bad" moves is moves on the edge of the board when nothing else is nearby. In other words, the candidate edge placement doesn't "touch" any other stones of either color. In my experiments I found that it strengthens the program significantly to avoid trying these moves in the tree search (I call them veto patterns even though they are not necessarily vetoed permanently.) But I also discovered that there seems to be no benefit whatsoever in removing them from the play-outs. I have no real explanation for this. But it does tell me that the play-outs are very different in nature from the tree - you cannot just use the same algorithms for selection and prioritizing moves.

I think I like your theory that eliminating noise is a good thing and probably the primary thing. Your last comment is what I am interested in the most, and what I have spent quite some time looking at. I call it "learning on the fly." The use of patterns is acceptable in this setting if those patterns are learned during the game. The basic important observation is that most of the things learned during the game are relearned over and over and over and over. All-moves-as-first is a form of this learning, but very brittle of course. And I don't have a problem with imposing knowledge from the outside if it's used only as a "kick start" - a way to overcome the initial learning inertia, but not imposed in any absolute way.

- Don

Magnus Persson wrote:
> A recurrent concept popping up in discussions on how to improve
> playouts is "balance".
> [...]
Re: [computer-go] Some ideas how to make strong heavy playouts
Quoting Don Dailey <[EMAIL PROTECTED]>:

> I have a response to this comment:
>> Still I think predicting the best moves is very important in the tree
>> part, but this may be much less important in the playouts, and perhaps
>> even detrimental as some people have experienced.
> A class of "bad" moves is moves on the edge of the board when nothing
> else is nearby. In other words, the candidate edge placement doesn't
> "touch" any other stones of either color. In my experiments I found
> that it strengthens the program significantly to avoid trying these
> moves in the tree search (I call them veto patterns even though they
> are not necessarily vetoed permanently.)

So far I have not tried removing those moves in the tree only. I need to try that.

> But I also discovered that there seems to be no benefit whatsoever in
> removing them from the play-outs. I have no real explanation for
> this. But it does tell me that the play-outs are very different in
> nature from the tree - you cannot just use the same algorithms for
> selection and prioritizing moves.

My experience is that sometimes I get really bad interactions between special patterns in my playouts and plays on the first and second line. For example, if ladders are implemented, then a move on the first line will be able to capture a contact move to the side of it. Thus I had situations where I think removing those moves made a positive difference, because the program liked those bad moves too much. But an alternative solution is also to prune bad moves; for example, playing next to a stone on the first line, allowing a ladder, is perhaps even worse than the first-line move itself. So my program solves these kinds of problems in very ad hoc, multiple ways, so it is hard to tell whether my experiences and your experiences here are universal or just specific to certain programs as a whole.

Still, I recently had the idea that perhaps AMAF works best if all moves are played, because then each move played in the playouts is tested under a larger variety of situations.
But I do not remember now if I tested it properly. What I did do recently was to stop pruning moves in the tree and allow all legal moves. This dropped the winrate of 1500 playouts vs GNU Go by 10%; it seemed to have a lot of temporary hallucinations early in the search, where a lot of really bad moves were searched perhaps 50-200 times before finding more decent moves to search. I planned to make a version that does some soft pruning, allowing those moves to be searched later, so that the tree will have full width near the root and be strongly pruned below. I would be happy if I could do that and keep the playing strength, because then my program would at least in principle be able to play perfectly. Right now the pruning is too aggressive for that to be true.

> I think I like your theory that eliminating noise is a good thing and
> probably the primary thing. Your last comment is what I am interested
> in the most, and what I have spent quite some time looking at. I call
> it "learning on the fly." The use of patterns is acceptable in this
> setting if those patterns are learned during the game. The basic
> important observation is that most of the things learned during the
> game are relearned over and over and over and over. All-moves-as-first
> is a form of this learning, but very brittle of course. And I don't
> have a problem with imposing knowledge from the outside if it's used
> only as a "kick start" - a way to overcome the initial learning
> inertia, but not imposed in any absolute way.

I agree also here, but I have no really good practical ideas, just loose feelings on what it should be like. But every time I try to think in more concrete terms I never get anywhere...

-Magnus
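All-moves-as-first, mentioned in the exchange above, just credits every move a player made anywhere in a playout as if it had been the first move from the root. A minimal bookkeeping sketch (the data layout is hypothetical, not from any program discussed here):

```python
def amaf_update(stats, playout_moves, winner):
    """stats maps (colour, move) -> (wins, visits). Every (colour, move)
    pair that occurred in the playout gets one visit, and a win if that
    colour won, regardless of when the move was actually played."""
    for colour, move in set(playout_moves):      # count each pair once
        wins, visits = stats.get((colour, move), (0, 0))
        stats[(colour, move)] = (wins + (colour == winner), visits + 1)
    return stats
```

The brittleness Don and Magnus mention follows directly from the pooling: a move's credit ignores the position it was played in, which is cheap information but often wrong information.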
Re: [computer-go] Some ideas how to make strong heavy playouts
steve uurtamo wrote:
> don,
>
>> But I also discovered that there seems to be no benefit whatsoever in
>> removing them from the play-outs. I have no real explanation for
>> this. But it does tell me that the play-outs are very different in
>> nature from the tree - you cannot just use the same algorithms for
>> selection and prioritizing moves.
>
> did you use the same heuristic in the playouts when pruning them?
> i.e. that no other stones are anywhere nearby?

Yes, of course. Nearby means "touching" in this case, because this is based on 3x3 patterns where the center point is the point in question. I am of course aware that there are exceptions to this pattern; in some hopefully rare situations the move I am throwing out could be the best move. But my intuition tells me it should be more dangerous in the tree, and should make the play-outs far more relevant on average - but it doesn't seem to work that way!

> in any particular playout, they may be the killing move for a group
> that ends up getting built near to the edge of the board, or a successful
> monkey jump. so removing them from somewhere deep in the tree
> as a rule would be bad, but if nothing is anywhere nearby, removing
> them as a first move choice is pretty reasonable.

I don't actually "remove" them permanently. They will still get tried when the simulation count gets high enough - I don't believe in absolute selectivity unless you can prove it's admissible.

> they make up a fairly small fraction of possible moves, and this is
> magnified the deeper you go into a search (so the net effect would
> be diminished, even with the heuristic).

But the effect on the tree is pretty large. It's very useful to prune even a few moves and get a big total benefit if the pruning is sound. In practice you get MORE pruning as the game progresses and the center fills with stones, until the edge starts getting touched.
> anything you can say about what not to do on the very first move
> of a search, if it applies to every board situation (ha ha ha) is
> great, though, if it's fast enough to check for.

Pruning should be progressive. If anything, you should look at more moves near the root of the tree, not fewer. Of course this is subject to how hard you have to work at it, and how sure you are of the rule you use. It's no good to see that an edge move is actually important but then not be able to see it when the position actually occurs on the game board.

> in fact, if *nothing* is *anywhere* nearby, a move on even the
> second line is bad. third or fourth is much better.

My patterns were collected statistically, and moves to the edge were never played by my strongest program playing at long time controls, and thus the 3x3 edge pattern was generated by the automated pattern harvester. I think when I did 5x5 patterns I got some patterns just like you describe; moves to the 2nd line were rare if nothing was within the 5x5 pattern space. But if I remember correctly, they were not considered as horrible as edge moves. I don't remember if they made my veto threshold or not.

- Don
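Don's edge-move veto, as I read his description (a first-line point whose full 3x3 neighbourhood contains no stone of either colour), is cheap to test. The board representation here is a hypothetical dict from point to colour, not his actual pattern machinery:

```python
def is_lonely_edge_move(board, move, n):
    """True for a first-line move that touches nothing: the kind of move
    the veto patterns throw out of the tree search."""
    x, y = move
    if not (x == 0 or y == 0 or x == n - 1 or y == n - 1):
        return False                    # not on the edge at all
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if (dx or dy) and board.get((x + dx, y + dy)) is not None:
                return False            # a stone touches: don't veto
    return True
```

Note the check is symmetric in colour, matching "any other stones of either color" above.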
Re: [computer-go] Some ideas how to make strong heavy playouts
I think there was some confusion in Don's post on ``out of atari'' in play-outs. For one thing, I do not agree with the maximal-information argument. Testing ``out of atari'' moves is worthwhile not because they might be good or might be bad, but merely because they might be good. By contrast, you should test (in the tree) a kind of move that is either good or average before one that is either average or bad, even if it carries the same amount of information. In the tree, you look for the best move. Near the root at least; when going deeper, the evaluation being less precise, you merely look for good moves that keep a trustworthy evaluation of the position, and try to avoid brittle ones.

In the playouts, that's another matter. I would say that (almost) always playing 'out of atari' would add stability, much in the way Magnus Persson very well explained. What do we want of playout policies? As static evaluation functions, we would want them to give the right ordering of move values, with differences as wide as possible. More precisely, we would want that among the best moves; it's not important if the evaluation is imprecise for bad moves. Now we build a tree while computing the evaluation function, so we can allow for false good moves if they are quickly seen as such in the tree, that is after a one- or three-ply search, provided the false good moves are not too numerous. False bad moves are much worse, since we might never explore the branch long enough to correct this impression. The latter paragraph applies also to pre-ordering (I keep a model like that of Crazystone in mind, with a priori probabilities for each move in the tree, and also a distribution in the playouts).

Conclusions: it doesn't matter if there is a bias in the playout policy, as long as it's the same for all moves. Bias toward solid moves is therefore a nuisance...
Playing (not too numerous) nonsensical moves would only be a nuisance if there are positions whose associated playouts call for many more urgent moves than others. What matters is distinguishing two moves having very different values. Here the reduction of noise comes into play: if there is a 50% chance in all playouts that an alive group dies, and that decides the game, then the difference in evaluation between the other positions is divided by two... Taking out all the common noise (that is, the mistakes that appear in all playouts) makes the distinction easier. On the other hand, concentrating along a wrong evaluation (after this particular move, the group is dead) would be a catastrophe. If this comes from one particular move, it should be noticed and played/avoided systematically.

About learning on the fly: I agree completely; that was one of my first posts. However, I really think we should have learnt patterns (and other properties) first: you cannot learn the whole of go, or even your own game, in one game. And learning is a good thing, but you must find the good move first, and should do so as quickly as possible. For one thing, if we learn patterns, here they should obviously be localized (not translation invariant). We could also learn short move sequences. In a setting with a probability distribution on moves, taking that into account is merely changing the probability distribution. Question is: how? By the way, my guess is that learning on the fly would be more important in the playouts than in the tree: it would contribute to stabilizing the playouts. The tree should anyhow end up with the good moves. This learning should probably also come from the playouts (very much information, and we could stay with information already calculated for the playouts, allowing easy re-use), automatically building a status for groups that are otherwise solved only near the end...

Jonas, who always rereads thinking ``How do I manage to be that unclear!''
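Jonas's halving argument is plain arithmetic; a tiny illustration with invented numbers: if half of all playouts are settled by the same unresolved life-and-death coin, the gap between a good and a bad move shrinks by the same factor.

```python
def observed_winrate(move_value, common_noise, noise_outcome=0.5):
    """With probability `common_noise` every playout is decided by the
    same event (here a fair coin for the big group's life); only
    otherwise does the move's own value show through."""
    return common_noise * noise_outcome + (1 - common_noise) * move_value

good = observed_winrate(0.60, 0.5)   # a move whose true value is 60%
bad = observed_winrate(0.40, 0.5)    # a move whose true value is 40%
# the 0.20 gap between the two moves is halved to 0.10
```

Halving the gap means roughly four times as many playouts to separate the two moves at the same confidence, which is why removing common noise pays off directly.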
Re: [computer-go] State of the art of pattern matching
Hi Jacques,

> No. for a reason I don't understand, I get something like:
>
> Distribution fit expected 0.1 found 0.153164
> Distribution fit expected 0.2 found 0.298602
> Distribution fit expected 0.3 found 0.433074
> Distribution fit expected 0.4 found 0.551575
> Distribution fit expected 0.5 found 0.651135
> Distribution fit expected 0.6 found 0.727419
> Distribution fit expected 0.7 found 0.776876
> Distribution fit expected 0.8 found 0.804008
> Distribution fit expected 0.9 found 0.823292
>
> So my distribution is distorted, when I try to get 30% of
> the "guessing chance" I get 43.3%, but when I try to get
> 90% I get 82.3%. I don't know why.

I guess you have checked that, with your rules for getting probability distributions out of gammas, the mean of the probability of your move 1 was the one you observed (about 40%)? And the same for the following ones? For the tails (many moves), I guess that a proportion of available moves is a better reference. In fact, doing that for the following moves too would be a way to calibrate your function (gamma_1, gamma_2) -> Prob(gamma_1); all the logits and so on are somewhat arbitrary and can be wrong, especially in the tails.

For a distribution fit of 0.1, I guess you merely always keep the first move (except in very special cases). Then, if instead of a homogeneous 40% probability that the first choice is the right one, you decide it's half the time 30% and half the time 50%, you see that you get (.1/.3 + .1/.5)/2 > .1/.4. Hence if your fluctuations in the estimation of the first choice are not well adapted, you will get a bigger fit. On the other hand, when you are not in the first moves, shape is not a factor for really deciding the move.
Suppose that for 50% of moves, shape counts and the right move is one of the first 5 candidates, and that for the other 50% it does not count and the good move is chosen uniformly at random with respect to your pattern criterion. Then achieving a 0.9 fit merely requires always taking 80% of the possible moves (including, of course, the 'good pattern' ones). If you have a wrong estimation of your tail probabilities, you can easily select far fewer moves.

Jonas
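The arithmetic behind that 0.9 figure, under the 50/50 mixture model the post assumes:

```python
def fit(kept_fraction):
    """Expected fit under the post's toy model: for half the moves the
    right answer is among the top pattern candidates (always kept);
    for the other half it is uniform over the kept fraction of moves."""
    return 0.5 * 1.0 + 0.5 * kept_fraction

print(fit(0.8))   # keeping 80% of the moves already yields a 0.9 fit
```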
Re: [computer-go] State of the art of pattern matching
Do you have a link to those papers?

-Josh

> My go-programming efforts are very much concentrated on patterns.
> (maybe I have been influenced by the Kojima-papers)
Re: [computer-go] Some ideas how to make strong heavy playouts
Jonas Kahn wrote:
> I think there was some confusion in Don's post on ``out of atari'' in
> play-outs.
> For one thing, I do not agree with the maximal information argument.
> This is more a theory than an argument.

Maybe I didn't express it very well either. It's a pretty solid principle in tree search; for instance, in computer chess the "checking" move and the responses to it are considered highly important moves, and almost every program even extends the search on these moves. Is that because checking moves are usually great moves? Of course not: the majority of them are blunders. But they are super critical in importance because you really "just gotta know"; you cannot afford to be wrong about these.

Atari moves in Go are the closest thing to checks in chess. You cannot thumb your nose at them, because there is more game-critical information in them than in a typical random placement. I would say that they win and lose games more than most other moves, and that their resolution is likely to decide the game either way. You may not believe that, or perhaps you think other moves are more important and tell you more about who will win and lose. I'm not a Go player, so I could be wrong about these moves being important, but I noticed that the first Mogo paper addresses this issue as the very first order of business.

So I don't know how you define information content, but if I were writing a classic program I would probably have, as my first order of business, a routine to tell me whether groups with limited liberties could be captured. And I would consider a ladder search as super important. Maybe David Fotland can tell us whether ladder search is very important or not, but I believe it would be. It is "information rich" in that it doesn't matter whether it wins or loses: you need to know in either case, or you will lose the game.
What is often not understood in game tree search is that you usually have to look at certain classes of bad moves pretty hard to understand that they are bad. The knee-jerk reaction is to throw out moves that are "on average" bad, but you really should be more focused on moves that have high information content. In chess these are the so-called "high compulsion" moves: checks, captures, threats, etc. Most checks, captures, and threats are stupid moves, but it's an error to immediately prune them away. So I could be wrong about this, but I'm pretty convinced I am right: you must look at high-information-content moves. These moves are not "noisy", because they tend to tell you right away if you are going to win or lose.

> Testing ``out of atari'' moves is not good because they might be good,
> or might be bad, but merely because they might be good.

That's what I mean by information content. You are semantically challenged here; what you are saying agrees with what I am saying. If I run a marathon just to "see if I can finish", you can also rephrase this as "to see if I can finish or not". The only important issue is that I would run the marathon in order to make the distinction.

> By contrast, you
> should test (in the tree) a kind of move that is either good or average,
> but not either average or bad, even if it's the same amount of
> information. In the tree, you look for the best move. Near the root at
> least; when going deeper and the evaluation being less precise, you
> merely look for good moves, that keep a trustworthy evaluation of the
> position, and try to avoid brittle ones.

Again, you are semantically challenged but basically correct. A move that you can statically evaluate as being on the lower end of the scale does not have much information content; in other words, evaluating it in the tree has very little chance of changing the score.
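That last notion, a move with "very little chance of changing the score", is exactly what alpha-beta pruning formalizes. A minimal negamax sketch on a toy tree (nested lists are internal nodes, integers are leaf scores; names and values here are illustrative only):

```python
def alphabeta(node, alpha=float("-inf"), beta=float("inf")):
    """Negamax alpha-beta: returns the score of `node` for the side
    to move. A branch is cut as soon as it provably cannot change
    the backed-up score -- its remaining siblings carry no information."""
    if isinstance(node, int):            # leaf: static score
        return node
    best = float("-inf")
    for child in node:
        best = max(best, -alphabeta(child, -beta, -alpha))
        alpha = max(alpha, best)
        if alpha >= beta:                # cutoff: nothing left to learn here
            break
    return best

print(alphabeta([[3, 5], [6, 9], [1, 2]]))   # 6; the last leaf is pruned
```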
The alpha-beta procedure in general is about information content in a big way. Many of the branches cut off in alpha-beta spring from great moves that have no chance of changing the score, so there is no useful information there. You would not look at those moves just because they happen to be "really great" moves.

> In the playouts, that's another matter. I would say that (almost) always
> playing 'out of atari' would add stability, much in the way Magnus
> Persson very well explained.
>
> What do we want of playout policies ?
> As static evaluation functions, we would want them to give the right
> ordering of move values, with differences as wide as possible.
> More precisely, we would want that among the best moves, that's not
> important if the evaluation is not precise for bad moves.

Maybe I'm semantically challenged now, but the only correct ordering of move values is win or lose. I suppose you are saying that there should be a variety of scores that somehow reflect the difficulty of winning? This is hard to talk about without quantifying exactly what should be returned in an ideal world. If the evaluation can be perfect of course you
Re: [computer-go] Some ideas how to make strong heavy playouts
Don,

I'd strongly agree. You must know whether ladders work or not, whether a nakade play works or not, whether various monkey jumps and hanes and so forth succeed or not. In and of themselves, few moves are objectively good or bad in any sense; one has to try them and see what happens. Some form of search or playout is needed to determine this. Even patterns which are completely understood must be evaluated in the context of a game.

To take a trivial example, three liberties in a row: should the middle point be played, to reduce the group to one eye or to create two eyes, depending on whose move it is? Usually the answer is "yes, if the life of a group depends on that play and there is nothing bigger to do; otherwise, it's a bad play." It's important to try that play and see what happens. It might be a good play or a bad play. Static patterns can't make that decision without more information (is the group isolated? how big is it? what else is on the board?).

Terry McIntyre <[EMAIL PROTECTED]>

"Wherever is found what is called a paternal government, there is found state education. It has been discovered that the best way to insure implicit obedience is to commence tyranny in the nursery."

Benjamin Disraeli, Speech in the House of Commons [June 15, 1874]
Re: [computer-go] Some ideas how to make strong heavy playouts
terry mcintyre wrote:
> Don,
>
> I'd strongly agree. You must know whether ladders work
> or not, whether a nakade play works or not, whether
> various monkey jumps and hanes and so forth succeed or
> not. In and of themselves, few moves are objectively
> good or bad in any sense - one has to try them and see
> what happens.
>
> Some form of search or playout is needed to determine
> this. Even patterns which are completely understood
> must be evaluated in the context of a game.

Yes, and even the strongest patterns may not be relevant, especially in our MC programs, which lose interest once the game is won or lost. That's partly why I'm interested in exploring "on the fly" learning. Learning outside the context of the position being played may not have much relevance.

- Don

> To take a trivial example, three liberties in a row -
> should the middle point be played to reduce it to one
> eye, or to create two eyes, depending on whose move it
> is? Usually the answer is "yes, if the life of a group
> depends on that play, and there is nothing bigger to
> do - otherwise, it's a bad play."
>
> It's important to try that play and see what happens.
> It might be a good play or a bad play. Static patterns
> can't make that decision without more information. (
> is the group isolated? how big is it? what else is on
> the board? )