Re: [computer-go] State of the art of pattern matching

2008-04-01 Thread Mark Boon


On 31-mrt-08, at 20:28, Don Dailey wrote:


You could be
blind-siding the program.


I think this is the crux of the matter. Not just in MC but in Go  
programming in general. If you add 'strong' knowledge you can create  
blind-spots. For example, I guess a ko rarely gets played during  
playouts because strong priority is given to connecting the stone in  
atari. It's only during the actual search that ko is discovered. But  
if you have a case where it's almost always out of the scope of the  
search-tree to find the blind-spot of a piece of knowledge then it  
can lead to very poor play.


Mark


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] State of the art of pattern matching

2008-04-01 Thread Jacques Basaldúa

Hi Lukasz

In Rémi's paper about Bradley-Terry models he found a way to give a
comparable gamma score to things that were different, for instance
capturing a stone vs. a given 3x3 pattern.
His model is much more general, but has fewer patterns (nowhere near
the ~200K patterns of my system). Additionally, my mask-pattern-only
system has a reasonable a priori value:

times_played / (times_played + times_postponed).

But there is an additional issue handled by Rémi's system that is
ignored by the naive system: the fact that some competitions are
much harder than others. If the sum of competing scores in the
average competition is, say, 100, the value of winning a competition
whose sum is 40 is obviously much less than winning one whose sum is
300. That is not included in the simple formula. Therefore, a simple
intuitive idea is adding more for the hard competition (300/100) and
less for the easy one (40/100). Because there is feedback (when you
change the scores you also change the sums of competing scores),
because there are so many variables to adjust (~200K), and because the
a priori values are already reasonable, I prefer to adjust in small
steps; this is an offline process and speed is irrelevant.
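The a priori value and the competition-weighted adjustment Jacques describes might be sketched roughly as below. This is a hypothetical reconstruction in Python; the data layout, the helper names, and the step size are my assumptions, not his actual code.

```python
def a_priori(played, postponed):
    # times_played / (times_played + times_postponed)
    return played / (played + postponed)

def adjust(scores, competitions, step=0.25):
    """One small offline adjustment step.

    scores: dict pattern_id -> current urgency value.
    competitions: list of (winner, participants) tuples, where
    participants is the list of pattern ids that competed.
    """
    played_w = {p: 0.0 for p in scores}
    postponed_w = {p: 0.0 for p in scores}
    # average "hardness" (sum of competing scores) over all competitions
    avg = sum(sum(scores[p] for p in parts)
              for _, parts in competitions) / len(competitions)
    for winner, parts in competitions:
        # harder competitions (bigger sum of scores) weigh more
        weight = sum(scores[p] for p in parts) / avg
        for p in parts:
            if p == winner:
                played_w[p] += weight
            else:
                postponed_w[p] += weight
    new = {}
    for p in scores:
        total = played_w[p] + postponed_w[p]
        if total == 0.0:
            new[p] = scores[p]          # pattern never competed: unchanged
        else:
            target = played_w[p] / total
            # advance a fraction of the distance toward the weighted value
            new[p] = scores[p] + step * (target - scores[p])
    return new
```

Iterating `adjust` corresponds to the small offline steps in the text; because the weights depend on the scores, each step changes the competition hardnesses as well.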

The hits of each move don't change that much.

At the beginning of the adjustment:

Hits of move 1 = 0.382405 acc = 0.382405
Hits of move 2 = 0.113530 acc = 0.495935
Hits of move 3 = 0.067430 acc = 0.563365
Hits of move 4 = 0.048055 acc = 0.611420
Hits of move 5 = 0.036304 acc = 0.647725
Hits of move 6 = 0.028978 acc = 0.676703
Hits of move 7 = 0.023769 acc = 0.700472
Hits of move 8 = 0.019793 acc = 0.720264
Hits of move 9 = 0.016693 acc = 0.736957
Hits of move 10 = 0.014164 acc = 0.751121


At the end of the adjustment: (Additional steps don't
improve further, some value may even decrease.)

Hits of move 1 = 0.409278 acc = 0.409278
Hits of move 2 = 0.115598 acc = 0.524877
Hits of move 3 = 0.062659 acc = 0.587536
Hits of move 4 = 0.044014 acc = 0.631550
Hits of move 5 = 0.033681 acc = 0.665230
Hits of move 6 = 0.027696 acc = 0.692926
Hits of move 7 = 0.022885 acc = 0.715811
Hits of move 8 = 0.019029 acc = 0.734840
Hits of move 9 = 0.015758 acc = 0.750598
Hits of move 10 = 0.012886 acc = 0.763484

What I call the distribution is this: the gamma value of each legal
move defines a probability distribution over the legal moves.
If we had no information (a uniform random distribution) we would expect
that with 20% of the legal moves we hit 20% of the time, etc. So
for 0.1, 0.2 .. 0.9 of the legal moves we hit around 0.1, 0.2 .. 0.9.
Now, what happens when we use our pdf over the legal moves? We can
reach 10%, 20%, 30%, etc. with very few candidate moves. But suppose the
first move has, say, 0.17, the first two sum to 0.21, and I want to
measure at 0.2: if the first move is a match, count 1; if the second is
a match, count 3/4 = (0.2 - 0.17)/(0.21 - 0.17); if any other move is a
match, count 0. Do I really get 0.1, 0.2, .. 0.9, only with far fewer
moves?
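The fractional counting described above might look like this (an illustrative Python sketch; the function and argument names are mine, not Jacques's):

```python
def fit_score(sorted_probs, match_index, target):
    """Fractional hit count for one position.

    sorted_probs: move probabilities in descending order.
    match_index: 0-based rank of the move actually played.
    target: the probability mass we try to cover (0.1, 0.2, ...).
    """
    cum = 0.0
    for i, p in enumerate(sorted_probs):
        if cum + p >= target:
            if match_index < i:
                return 1.0                    # fully inside the target mass
            if match_index == i:
                # e.g. (0.2 - 0.17) / (0.21 - 0.17) = 3/4
                return (target - cum) / p
            return 0.0                        # outside the target mass
        cum += p
    # target mass never reached: every legal move counts fully
    return 1.0 if match_index < len(sorted_probs) else 0.0
```

Averaging `fit_score` over many positions gives the "distribution fit" numbers in the table that follows.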

No. For a reason I don't understand, I get something like:

Distribution fit expected 0.1 found 0.153164
Distribution fit expected 0.2 found 0.298602
Distribution fit expected 0.3 found 0.433074
Distribution fit expected 0.4 found 0.551575
Distribution fit expected 0.5 found 0.651135
Distribution fit expected 0.6 found 0.727419
Distribution fit expected 0.7 found 0.776876
Distribution fit expected 0.8 found 0.804008
Distribution fit expected 0.9 found 0.823292

So my distribution is distorted: when I try to get 30% of
the "guessing chance" I get 43.3%, but when I try to get
90% I get only 82.3%. I don't know why.


Jacques.

Łukasz Lew wrote:


On Sat, Mar 29, 2008 at 3:47 PM, Jacques Basaldúa <[EMAIL PROTECTED]> wrote:
 


Mark Boon wrote:

> I suppose there's no reason why  it has to be assembler,
> you could just as well generate C-code.

I don't think so. But I don't write as much asm as it may appear.
I look at what the compiler emits when I debug, and hand-write only
the critical parts, plus automatically generated code.


> I don't see yet how you go from there to finding patterns.

Depending on the mode, it computes either the hash or the mask.
That mask has to be translated into urgency by a table search.

(Btw. urgency is not just:

  times_played / (times_played + times_postponed)

but that value is used as an a priori value for Bradley Terry
models as Rémi introduced. My adjustment is not as advanced
as what Rémi describes. I just give a weight to a competition
based on the sum of the competing gamma values, and that gives
me another point:

SUM weights_of_played / (SUM weights_of_played + SUM weights_of_postponed)
   



Can you tell how you came up with such an equation?
Does it improve much?

Thanks
Lukasz

 


And I advance a step (like 1/4 of the distance) in that direction
and use the new point as the current value. Iterating this way
improves the wins of the best move but distorts the distribution
(I extract lots of statistics at each step), so I stop rather early.)

The macro for searching in an ordered 

Re: [computer-go] State of the art of pattern matching

2008-04-01 Thread Petr Baudis
On Mon, Mar 31, 2008 at 03:12:39PM -0700, Christoph Birk wrote:
>
> On Mar 31, 2008, at 1:05 PM, Don Dailey wrote:
>>
>>
>> Christoph Birk wrote:
>>>
>>> On Mar 31, 2008, at 10:48 AM, Mark Boon wrote:
 I don't know about this. I'm pretty sure MoGo checks if the stone can
 make at least two liberties (ladder problem) in which case it can
 still be horrible but very seldom worse than random.
>>>
>>> I would expect playing a "not-working" ladder to be worse than random
>>> most
>>> of the time.
>> Of course this is true, but presumably a move that answers a direct
>> atari threat would classify as being better than random.
>
> Not if it's a working ladder.

I think this is not obvious, since there is still about 50% chance on
who gets the second move there. The original MoGo pattern describes only
capture-possibility check, not atari-possibility check, so even if you
play out the ladder, the opponent will most likely tenuki. So I think
not playing out working ladders as white is actually better because it
saves _black_ from getting a bad result!

I implemented a naive ladder check (it does not catch a lot of cases,
especially when you have to decide whether to "slide" the stones in the
ladder along the border or along some of your friendly stones) and
tested its efficiency in UCT-driven playouts with only a
50%-probability capture-possibility check heuristic.

Against GNUGo Level8, I get 37% +-5.7% wins with ladder check, 28%
+-4.6% without ladder check. Unfortunately, the numbers are not very
precise and the difference is probably much smaller in reality - 37%
seems overinflated given later measurements; I will redo the test with
newer codebase and more games.
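Error bars like 37% +-5.7% can be computed with the usual normal approximation for a binomial proportion. This is a generic sketch, not Petr's code, and the actual game counts behind his numbers are unknown, so none are assumed here.

```python
import math

def winrate_error(wins, games, z=1.0):
    """Normal-approximation error bar for a measured win rate.

    z=1.0 gives one standard error; z=1.96 a ~95% interval.
    Returns (win rate, half-width of the interval).
    """
    p = wins / games
    se = math.sqrt(p * (1 - p) / games)  # binomial standard error
    return p, z * se
```

For example, 37 wins in 100 games gives roughly 37% +-4.8% at one standard error, which is why a few hundred games are needed before differences of a few percent become meaningful.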

(I plan to post detailed measurements of effects of various heuristics
and parameters anyway; so far it seems my code is still too buggy
though: AMAF, prior knowledge and FPU all make my program weaker :-(.)

-- 
Petr "Pasky" Baudis
Whatever you can do, or dream you can, begin it.
Boldness has genius, power, and magic in it.-- J. W. von Goethe


[computer-go] Some ideas how to make strong heavy playouts

2008-04-01 Thread Magnus Persson
A recurrent concept popping up in discussions on how to improve
playouts is "balance". So I would like to share the philosophy
behind the playouts of Valkyria: how I define "balance" and how it
relates to the evaluation of go positions.


*Background

In an old-school program the evaluation function would try to see
which stones are safely connected to other stones of the same colour.
Connected stones are called groups, and the program would probably
also try to evaluate the safety of the groups, looking at the eyespace
at hand, weak neighbouring groups and so on. This quickly gets very
complicated. My old program Viking had several thousand handmade
patterns for evaluating connectivity alone. This worked like a dream as
long as the position consisted of patterns in the database... but in
each and every game there were new situations, and new patterns had to
be added. A more robust method would be to use tactical search in the
program to evaluate connectivity. The problem there is to ensure
accuracy efficiently. Any tactical search tends to either become too
time consuming or resort to guessing.


*MC-evaluation

Programs using uniformly random MC-eval favor very solid but
inefficient shape, often building blocks of adjacent stones in the
center of the board. The reason is that if stones are played more
loosely, they often get cut off and killed in the simulations.


What we want instead is a program that can play efficient moves, where
stones are safely connected but occupy as much territory/eyespace as
possible.


The tree search (UCT) cannot alone solve this problem. Shapes created  
in a 19x19 game may exist untouched to the late endgame and it is not  
possible to read out all shapes on the board. It is much better if  
secure shapes stay stable in the simulation.


The way I implemented that in Valkyria is that the playout part is
basically reactive to random moves that attack shapes on the board.
It does not in any sense attempt to play the best move on the board.
If it does not need to defend a shape, it plays uniformly at random
somewhere. [Note that Valkyria also prunes really ugly moves, thus it
plays uniformly among the moves that are not pruned.]
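A policy of this reactive kind might be sketched as below. This is purely illustrative, not Valkyria's code: the board is just a dict here, and the two helpers are stand-in stubs for whatever shape knowledge the real program uses.

```python
import random

def local_defence(board, last_move):
    # Stub: return a move that defends a shape the last move attacked,
    # or None if no shape needs defending.
    return board.get("defence")

def is_ugly(board, move):
    # Stub: True for "really ugly" moves that the playout prunes.
    return move in board.get("ugly", ())

def reactive_policy(board, last_move, legal_moves, rng=random):
    """Answer an attack locally if one exists; otherwise play uniformly
    at random among non-pruned legal moves."""
    defence = local_defence(board, last_move)
    if defence is not None and defence in legal_moves:
        return defence                        # react to the attack
    candidates = [m for m in legal_moves if not is_ugly(board, m)]
    return rng.choice(candidates or legal_moves)
```

The point of the structure is that the policy never tries to find the globally best move; it only keeps stable shapes stable and falls back to uniform noise elsewhere.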


This is also how the pattern system works in Mogo as I understand it.  
If I remember correctly I would say that all Mogo patterns are very  
basic and common sense defenses against attacks on otherwise stable  
shapes.


But there also has to be balance. Valkyria also punishes bad shape.
That is, if a weak shape is already on the board, or a random move
attacked two shapes simultaneously in the simulation, then the program
may attack the weakness (or, in a way, it also reacts to the situation,
defending against "the weak shape becoming stronger"). Often the same
move that would have been the proper defence is played.



*Eliminating noise rather than predicting  the best move

Nothing I wrote above is original or new to the readers of this list.
But I want to make a distinction between systems that try to predict
the best move and a system that only tries to eliminate noise from the
otherwise very random simulations.


Noise is eliminated when strong stones live and weak stones die almost
always in the simulations. That way the random evaluations will mostly
react to moves that matter in urgent fighting, with shapes that are not
yet stable. An MC program that does this should stop defending and
attacking strong shapes, and would require far fewer simulations to
discriminate between good and bad moves. Valkyria2 and Valkyria3 have
slightly different tree-search algorithms but use the same playouts.
Both versions need only 512 playouts per move to win 50% against
GNU Go 3.7.10.



Still I think predicting the best moves is very important in the tree  
part, but this may be much less important in the playouts, and perhaps  
even detrimental as some people have experienced.


*19x19

The best programs on 19x19 seem to focus the UCT search on local
fighting. This temporarily overcomes the biases of the simulations
locally. But the information gained locally about good shape in the
tree is forgotten when the fighting moves elsewhere, though it can be
rediscovered later if the fighting comes back. Could a future
improvement to 19x19 go be to use locally narrow searches that seed
the playouts with strong patterns for the current shapes on the
board? Maybe someone is already doing this? A really strong approach
would be to eliminate the need for hard-coded patterns or offline
pattern harvesting, and let the program learn during the game.


-Magnus


Re: [computer-go] Some ideas how to make strong heavy playouts

2008-04-01 Thread steve uurtamo
don,

>  But I also discovered that there seems to be no benefit whatsoever in
>  removing them from the play-outs.I have no real explanation for
>  this.   But it does tell me that the play-outs are very different in
>  nature from the tree - you cannot just use the same algorithms for
>  selection and prioritizing moves.

did you use the same heuristic in the playouts when pruning them?
i.e. that no other stones are anywhere nearby?

in any particular playout, they may be the killing move for a group
that ends up getting built near to the edge of the board, or a successful
monkey jump.  so removing them from somewhere deep in the tree
as a rule would be bad, but if nothing is anywhere nearby, removing
them as a first move choice is pretty reasonable.

they make up a fairly small fraction of possible moves, and this is
magnified the deeper you go into a search (so the net effect would
be diminished, even with the heuristic).

anything you can say about what not to do on the very first move
of a search, if it applies to every board situation (ha ha ha) is
great, though, if it's fast enough to check for.

in fact, if *nothing* is *anywhere* nearby, a move on even the
second line is bad.  third or fourth is much better.

s.


Re: [computer-go] Some ideas how to make strong heavy playouts

2008-04-01 Thread Don Dailey
Hi Magnus,

Your post makes a great deal of sense.   I agree with all the points you
have stated.   I don't think you have ever made an illogical post, unlike
most of us (including myself), and your posts are always well thought out
and worded.

I have a response to this comment:

Still I think predicting the best moves is very important in the
tree part, but this may be much less important in the playouts, and
perhaps even detrimental as some people have experienced.

A class of "bad" moves is moves on the edge of the board when nothing
else is nearby.  In other words, the candidate edge placement doesn't
"touch" any other stones of either color. In my experiments I found
that it strengthens the program significantly to avoid trying these
moves in the tree search (I call them veto patterns even though they are
not necessarily veto'd permanently.)   
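The 3x3 edge veto described above could look roughly like this. It is a sketch under an assumed board representation (a set of occupied points), not Don's actual pattern code.

```python
def on_edge(x, y, size):
    # True for points on the first line
    return x in (0, size - 1) or y in (0, size - 1)

def veto_edge_move(x, y, stones, size=19):
    """Veto an edge placement whose 3x3 neighbourhood touches no stone
    of either colour.  stones: set of occupied (x, y) points."""
    if not on_edge(x, y, size):
        return False
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            nx, ny = x + dx, y + dy
            if 0 <= nx < size and 0 <= ny < size and (nx, ny) in stones:
                return False        # something nearby: not vetoed
    return True                     # lonely edge move: veto it (for now)
```

As the text notes, the veto need not be permanent; the move can re-enter consideration once the node has been simulated enough times.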

But I also discovered that there seems to be no benefit whatsoever in
removing them from the play-outs.I have no real explanation for
this.   But it does tell me that the play-outs are very different in
nature from the tree - you cannot just use the same algorithms for
selection and prioritizing moves. 

I think I like your theory that eliminating noise is a good thing and
probably the primary thing.

Your last comment is what I am interested in the most,  and what I have
spent quite some time looking at.  I call it  "learning on the
fly."   The use of patterns is acceptable in this setting if those
patterns are learned during the game. The basic important
observation is that most of the things learned during the game are
relearned over and over and over and over ...

All-moves-as-first is a form of this learning, but very brittle of
course. And I don't have a problem with imposing knowledge from the
outside if it's used only as a "kick start" - a way to overcome the
initial learning inertia,  but not imposed in any absolute way.
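All-moves-as-first, mentioned above as a brittle form of this learning, can be sketched as follows. This is a generic illustration of the AMAF bookkeeping; the class and method names are mine.

```python
from collections import defaultdict

class AmafStats:
    """Credit every move played by the winning colour during a playout
    as if it had been played first (all-moves-as-first)."""

    def __init__(self):
        self.wins = defaultdict(int)
        self.tries = defaultdict(int)

    def record(self, playout_moves, winner):
        """playout_moves: list of (colour, move); winner: colour."""
        seen = set()
        for colour, move in playout_moves:
            if (colour, move) in seen:     # count each (colour, move) once
                continue
            seen.add((colour, move))
            self.tries[(colour, move)] += 1
            if colour == winner:
                self.wins[(colour, move)] += 1

    def value(self, colour, move):
        t = self.tries[(colour, move)]
        return self.wins[(colour, move)] / t if t else 0.5
```

The brittleness is visible in the design: the statistic ignores move order entirely, which is exactly why it accumulates information fast and why it can be badly wrong for order-sensitive moves.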

- Don
 




Re: [computer-go] Some ideas how to make strong heavy playouts

2008-04-01 Thread Magnus Persson

Quoting Don Dailey <[EMAIL PROTECTED]>:


I have a response to this comment:

Still I think predicting the best moves is very important in the
tree part, but this may be much less important in the playouts, and
perhaps even detrimental as some people have experienced.

A class of "bad" moves is moves on the edge of the board when nothing
else is nearby.  In other words, the candidate edge placement doesn't
"touch" any other stones of either color. In my experiments I found
that it strengthens the program significantly to avoid trying these
moves in the tree search (I call them veto patterns even though they are
not necessarily veto'd permanently.)


So far I have not tried removing those moves in the tree only. I need
to try that.



But I also discovered that there seems to be no benefit whatsoever in
removing them from the play-outs.I have no real explanation for
this.   But it does tell me that the play-outs are very different in
nature from the tree - you cannot just use the same algorithms for
selection and prioritizing moves.


My experience is that sometimes I get really bad interactions between
special patterns in my playouts and plays on the first and second
line. For example, if ladders are implemented, then a move on the first
line will be able to capture a contact move to the side of it. Thus I
have had situations where I think removing those moves made a positive
difference, because the program liked those bad moves too much. But an
alternative solution is to prune the bad moves themselves; for example,
playing next to a stone on the first line while allowing a ladder is
perhaps even worse than the first move. My program solves these kinds
of problems in multiple ad hoc ways, so it is hard to tell whether my
experiences and your experiences here are universal or just specific to
certain programs as a whole.


Still, I recently had the idea that perhaps AMAF works best if all
moves are allowed, because then each move played in the playouts is
tested under a larger variety of situations. But I do not remember
whether I tested it properly.


What I did do recently was stop pruning moves in the tree and allow
all legal moves. This dropped the winrate at 1500 playouts vs. gnugo
by 10%; the program seemed to have a lot of temporary hallucinations
early in the search, where a lot of really bad moves were searched
perhaps 50-200 times before it found more decent moves to search.


I planned to make a version that does some soft pruning, allowing
pruned moves to be searched later, so that the tree will have full
width near the root and be strongly pruned below. I would be happy if
I could do that and keep the playing strength, because then my program
would at least in principle be able to play perfectly. Right now the
pruning is too aggressive for that to be true.






I think I like your theory that eliminating noise is a good thing and
probably the primary thing.

Your last comment is what I am interested in the most,  and what I have
spent quite some time looking at.  I call it  "learning on the
fly."   The use of patterns is acceptable in this setting if those
patterns are learned during the game. The basic important
observation is that most of the things learned during the game is
relearned over and over and over and over 

All-moves-as-first is a form of this learning, but very brittle of
course. And I don't have a problem with imposing knowledge from the
outside if it's used only as a "kick start" - a way to overcome the
initial learning inertia,  but not imposed in any absolute way.


I agree here too, but I have no really good practical ideas, just
loose feelings about what it should be like. And every time I try to
think in more concrete terms I never get anywhere...


-Magnus


Re: [computer-go] Some ideas how to make strong heavy playouts

2008-04-01 Thread Don Dailey


steve uurtamo wrote:
> don,
>
>   
>>  But I also discovered that there seems to be no benefit whatsoever in
>>  removing them from the play-outs.I have no real explanation for
>>  this.   But it does tell me that the play-outs are very different in
>>  nature from the tree - you cannot just use the same algorithms for
>>  selection and prioritizing moves.
>> 
>
> did you use the same heuristic in the playouts when pruning them?
> i.e. that no other stones are anywhere nearby?
>
>   
Yes, of course.    Nearby means "touching" in this case, because this is
based on 3x3 patterns where the center point is the point in question.

I am of course aware that there are exceptions to this pattern; the
move I am throwing out could, in some hopefully rare situations, be the
best move.   My intuition tells me it should be more dangerous in
the tree, and should make the play-outs far more relevant on average -
but it doesn't seem to work that way!


> in any particular playout, they may be the killing move for a group
> that ends up getting built near to the edge of the board, or a successful
> monkey jump.  so removing them from somewhere deep in the tree
> as a rule would be bad, but if nothing is anywhere nearby, removing
> them as a first move choice is pretty reasonable.
>   
I don't actually "remove" them permanently.   They will still get tried
when the simulation count gets high enough - I don't believe in absolute
selectivity unless you can prove it's admissible.
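The "not permanently removed" idea - pruned moves re-entering the candidate set once a node has enough simulations - might be sketched as below. The thresholds and growth factor are illustrative assumptions, not Don's values.

```python
def candidate_moves(moves, node_visits, base=64, growth=2):
    """Progressive unpruning: a vetoed move becomes a candidate once the
    node's visit count passes a threshold that grows with its prune rank.

    moves: list of (move, prune_rank), rank 0 = never pruned.
    """
    out = []
    for move, rank in moves:
        threshold = 0 if rank == 0 else base * growth ** (rank - 1)
        if node_visits >= threshold:
            out.append(move)
    return out
```

Under this scheme no move is excluded forever: any fixed-rank move is eventually searched once the node is visited often enough, so the selectivity stays admissible in the limit.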


> they make up a fairly small fraction of possible moves, and this is
> magnified the deeper you go into a search (so the net effect would
> be diminished, even with the heuristic).
>   
But the effect on the tree is pretty large.   It's very useful to prune
even a few moves and get a big total benefit if the pruning is
sound. In practice you get MORE pruning as the game progresses and
the center fills with stones until the edge starts getting touched. 
> anything you can say about what not to do on the very first move
> of a search, if it applies to every board situation (ha ha ha) is
> great, though, if it's fast enough to check for.
>   
Pruning should be progressive.   If anything, you should look at more
moves near the root of the tree, not fewer.    Of course this is subject
to how hard you have to work at it,  and how sure you are of the rule
you use.  It's no good to see that an edge move is actually
important but then not be able to see it when the position actually
occurs on the game board.

> in fact, if *nothing* is *anywhere* nearby, a move on even the
> second line is bad.  third or fourth is much better.
>   
My patterns were collected statistically, and moves to the edge were
never played by my strongest program playing at long time controls;
thus the 3x3 edge pattern was generated by the automated pattern
harvester.

I think when I did 5x5 patterns I got some patterns just like you
describe:  moves to the 2nd line were rare if nothing was within the 5x5
pattern space.    But if I remember correctly, they were not considered
as horrible as edge moves.   I don't remember whether they made my veto
threshold or not.

- Don


> s.


Re: [computer-go] Some ideas how to make strong heavy playouts

2008-04-01 Thread Jonas Kahn
I think there was some confusion in Don's post on ``out of atari'' in
play-outs.
For one thing, I do not agree with the maximal-information argument.
Testing ``out of atari'' moves is worthwhile not because they might be
either good or bad, but merely because they might be good. By contrast,
you should test (in the tree) a kind of move that is either good or
average, but not one that is either average or bad, even if it carries
the same amount of information. In the tree, you look for the best
move. Near the root at least; when going deeper, where the evaluation
is less precise, you merely look for good moves that keep the
evaluation of the position trustworthy, and try to avoid brittle ones.

In the playouts, that's another matter. I would say that (almost) always
playing 'out of atari' would add stability, much in the way Magnus
Persson very well explained.

What do we want from playout policies?
As static evaluation functions, we would want them to give the right
ordering of move values, with differences as wide as possible.
More precisely, we want this among the best moves; it is not important
if the evaluation is imprecise for bad moves.

Now, we build a tree while computing the evaluation function, so we
can tolerate falsely good moves if they are quickly seen as such in
the tree - that is, after a search of one to three plies - and if the
falsely good moves are not too numerous.
Falsely bad moves are much worse, since we might never explore the
branch long enough to correct the impression.

The latter paragraph also applies to pre-ordering (I keep a model like
the one of Crazystone in mind, with a priori probabilities for each
move in the tree, and also a distribution in the playouts).


Conclusions:
It does not matter if there is a bias in the playout policy, as long as
it is the same for all moves.
A bias toward solid moves is therefore a nuisance...
Playing (not too numerous) nonsensical moves would only be a nuisance
if there are positions whose associated playouts call for many more
urgent moves than others.

What matters is separating two moves that have very different values.
Here the reduction of noise comes into play: if there is a 50% chance
in all playouts that an alive group dies, and this decides the game,
then the difference in evaluation between the other positions is
divided by two... Taking out all the common noise (that is, the
mistakes that appear in all playouts) makes the distinction easier.
On the other hand, concentrating on a wrong evaluation (after this
particular move, the group is dead) would be a catastrophe.
If this comes from one particular move, it should be noticed and
played/avoided systematically.
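The halving effect is easy to check numerically. This is a toy simulation; the 0.8/0.4 "true" winrates are made-up values for illustration only.

```python
import random

def noisy_winrate(true_p, n, noise=0.5, rng=random.Random(42)):
    """Measured winrate when, with probability `noise`, a coin flip
    (the alive group randomly dying) decides the game instead of the
    move's true quality."""
    wins = 0
    for _ in range(n):
        if rng.random() < noise:
            wins += rng.random() < 0.5   # common noise decides the game
        else:
            wins += rng.random() < true_p  # move quality decides
    return wins / n

a = noisy_winrate(0.8, 200_000)
b = noisy_winrate(0.4, 200_000)
# the true difference is 0.4; with 50% common noise the measured
# difference shrinks toward 0.5 * 0.4 = 0.2
```

Removing the common noise (stabilizing the group's status in every playout) restores the full 0.4 gap, which is exactly why fewer simulations then suffice to separate the moves.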


About learning on the fly:
I agree completely; that was one of my first posts.
However, I really think we should have learnt patterns (and other
properties) first: you cannot learn the whole of go, or even your own
game, in one game.
And learning is a good thing, but you must find the good move first,
and should do so as quickly as possible.
For one thing, if we learn patterns, here they should obviously be
localized (not translation invariant). We could also learn short move
sequences.
In a setting with a probability distribution over moves, taking that
into account merely means changing the probability distribution. The
question is: how?

By the way, my guess is that learning on the fly would be more important
in the playouts than in the tree: it would contribute to stabilizing the
playouts. The tree should anyhow end up with the good moves.

This learning should probably also come from the playouts (very much
information, and we could stay with information already calculated for
the playouts, allowing easy re-use), automatically building a status for
groups that are solved only near the end...

Jonas, who always reread thinking ``How do I manage to be that unclear
!"


Re: [computer-go] State of the art of pattern matching

2008-04-01 Thread Jonas Kahn
Hi Jacques

>
> No. for a reason I don't understand, I get something like:
>
> Distribution fit expected 0.1 found 0.153164
> Distribution fit expected 0.2 found 0.298602
> Distribution fit expected 0.3 found 0.433074
> Distribution fit expected 0.4 found 0.551575
> Distribution fit expected 0.5 found 0.651135
> Distribution fit expected 0.6 found 0.727419
> Distribution fit expected 0.7 found 0.776876
> Distribution fit expected 0.8 found 0.804008
> Distribution fit expected 0.9 found 0.823292
>
> So my distribution is distorted, when I try to get 30% of
> the "guessing chance" I get 43.3%, but when I try to get
> 90% I get 82.3%. I don't know why.

I guess you have checked that with your rules for getting probability
distributions out of gammas, the mean of the probability of your move 1
was that that you observed (about 40 %) ? And the same for the following
ones ?
For the tails (many moves), I guess that a proportion of available moves
is a better reference.
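The check suggested above can be stated in a few lines: over many positions, compare the mean predicted probability of the top-ranked move with how often it actually was the move played. A minimal sketch (the data and function name here are invented for illustration):

```python
def calibration(records):
    """Compare mean predicted probability with observed frequency.

    records: list of (predicted probability that the top-ranked move is
    the one played, 1 if it actually was, else 0)."""
    n = len(records)
    mean_pred = sum(p for p, _ in records) / n
    hit_rate = sum(h for _, h in records) / n
    return mean_pred, hit_rate

# Invented data: model says ~40% on average, reality is 50%.
records = [(0.4, 1), (0.4, 0), (0.5, 1), (0.3, 0)]
mean_pred, hit_rate = calibration(records)
print(mean_pred, hit_rate)
```

If `mean_pred` and `hit_rate` disagree for rank 1 (or for later ranks), the mapping from gammas to probabilities is miscalibrated, which would produce exactly the kind of distortion in the fit table quoted above.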

In fact, doing that for the following moves too would be a way to
calibrate your function (gamma_1, gamma_2) -> Prob(gamma_1).
All the logits and so on are somewhat arbitrary and can be wrong,
especially in the tails.

For a distribution fit of .1, I guess you merely always keep the first
move (except in very special cases). Then, if the first choice is the
right one with a homogeneous 40% probability, but your estimate of it is
30% half the time and 50% half the time, you see that you get
(.1/.3 + .1/.5) / 2 > .1/.4.
Hence if the fluctuations in your estimate of the first choice are not
well adapted, you will get a bigger fit than expected.
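The distortion is just convexity (Jensen's inequality applied to x -> 1/x), which a two-line check with the numbers above confirms:

```python
# A homogeneous 40% first-move probability versus an estimate that
# fluctuates between 30% and 50% with the same mean.
homogeneous = 0.1 / 0.4
fluctuating = (0.1 / 0.3 + 0.1 / 0.5) / 2
assert fluctuating > homogeneous   # Jensen's inequality for x -> 1/x
print(homogeneous, fluctuating)    # 0.25 versus about 0.267
```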

On the other hand, when you are past the first few moves, shape is not a
factor that really decides the move.
Suppose that for 50% of moves shape counts and the right move is one of
the first 5 candidates, while for the other 50% it does not count and
the good move is effectively uniform at random with respect to your
pattern criterion.
Then achieving a .9 fit merely requires always keeping 80% of the
possible moves (including, of course, the 'good pattern' ones).
If your estimates of the tail probabilities are wrong, you can easily
select far fewer moves.
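That mixture arithmetic can be checked directly: if a share s of moves is shape-driven (and always inside the kept set) and the rest are uniform over the legal moves, then keeping a fraction f of moves gives an overall fit of s + (1 - s) * f. A sketch, with the numbers above (function names are illustrative):

```python
def fit_when_keeping(shape_share, kept_fraction):
    """Overall fit: shape-driven moves always hit the kept set; the
    rest hit with probability equal to the fraction of moves kept."""
    return shape_share * 1.0 + (1.0 - shape_share) * kept_fraction

def fraction_needed(shape_share, target_fit):
    """Invert the formula above: how many moves must be kept."""
    return (target_fit - shape_share) / (1.0 - shape_share)

print(fit_when_keeping(0.5, 0.8))   # 0.9, as in the text
print(fraction_needed(0.5, 0.9))    # 0.8
```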

Jonas


Re: [computer-go] State of the art of pattern matching

2008-04-01 Thread Joshua Shriver
Do you have a link to those papers?
-Josh



> My go-programming efforts are very much concentrated on patterns.
> (maybe I have been influenced by the Kojima-papers)

Re: [computer-go] Some ideas how to make strong heavy playouts

2008-04-01 Thread Don Dailey


Jonas Kahn wrote:
> I think there was some confusion in Don's post on ``out of atari'' in
> play-outs.
> For one thing, I do not agree with the maximal information argument.
>   
This is more a theory than an argument.   Maybe I didn't express it very
well either.

It's a pretty solid principle in tree search. For instance, in computer
chess, the "checking" move and the responses to it are considered highly
important moves, and almost every program even extends the search on
these moves. Is that because checking moves are usually great
moves? Of course not - the majority of them are blunders.

But they are super critical in importance because you really "just gotta
know", you cannot afford to be wrong about these.

Atari moves in Go are the closest thing to checks in chess. You cannot
thumb your nose at them, because they carry more game-critical
information than a typical random placement. I would say that they win
and lose games more than most other moves, and that their resolution is
likely to decide the game either way.

You may not believe that, or perhaps you think other moves are more
important and tell you more about who will win and lose.   I'm not a go
player so I could be wrong about these moves being important,  but I
noticed that the first Mogo paper addresses this issue as the very first
order of business.  

So I don't know how you define information content,  but if I were
writing a classic program I would probably have as my first order of
business a routine to tell me if groups that have limited liberties
could be captured.   And I would consider a ladder search as super
important.  Maybe David Fotland can tell us if ladder search is very
important or not, but I believe it is. It is "information rich"
in that it doesn't matter whether the ladder wins or loses - you need to
know either way, or you will lose the game.

What is often not understood in game tree search is that you usually
have to look at certain classes of bad moves pretty hard to understand
that they are bad.   The knee jerk reaction is to throw out moves that
are "on average" bad but you really should be more focused on moves that
have high information content.   In chess these are the so called "high
compulsion" moves.   Checks, captures,  threats, etc.Most checks,
captures, threats are stupid moves but it'a an error to immediately
prune them away.  

So I could be wrong about this,  but I'm pretty convinced I am right -
you must look at high information content moves. These moves are not
"noisy" because they tend to tell you right away if you are going to win
or lose.

> Testing ``out of atari'' moves is not good because they might be good,
> or might be bad, but merely because they might be good. 
That's what I mean by information content. You are semantically
challenged here - what you are saying agrees with what I am saying.
If I run a marathon just to "see if I can finish", you can also rephrase
this as "to see if I can finish or not." The only important
point is that I would run the marathon in order to make the
distinction.

> By contrast, you
> should test (in the tree) a kind of move that is either good or average,
> but not either average or bad, even if it's the same amount of
> information. In the tree, you look for the best move. Near the root at
> least; when going deeper and the evaluation being less precise, you
> merely look for good moves, that keep a trustworthy evaluation of the
> position, and try to avoid brittle ones.
>   
Again, you are semantically challenged but basically correct.   A move
that you can statically evaluate as being on the lower end of the scale
does not have much information content - in other words evaluating it in
the tree has very little chance of changing the score.   

The alpha-beta procedure in general is about information content in a
big way. Many of the branches cut off by alpha-beta spring from great
moves that nevertheless have no chance of changing the score, so there
is no useful information in them. You would not look at those moves just
because they happen to be "really great" moves.

> In the playouts, that's another matter. I would say that (almost) always
> playing 'out of atari' would add stability, much in the way Magnus
> Persson very well explained.
>
> What do we want of playout policies?
> As static evaluation functions, we would want them to give the right
> ordering of move values, with differences as wide as possible.
> More precisely, we want that among the best moves; it is not
> important if the evaluation is imprecise for bad moves.
>   
Maybe I'm semantically challenged now, but the only correct ordering of
move values is win or lose. I suppose you are saying that there
should be a variety of scores that somehow reflect the difficulty of
winning?

This is hard to talk about without quantifying exactly what should be
returned in an ideal world.  If the evaluation can be perfect of course
you 

Re: [computer-go] Some ideas how to make strong heavy playouts

2008-04-01 Thread terry mcintyre
Don, 

I'd strongly agree. You must know whether ladders work
or not, whether a nakade play works or not, whether
various monkey jumps and hanes and so forth succeed or
not. In and of themselves, few moves are objectively
good or bad in any sense - one has to try them and see
what happens.

Some form of search or playout is needed to determine
this. Even patterns which are completely understood
must be evaluated in the context of a game.

To take a trivial example, three liberties in a row -
should the middle point be played to reduce it to one
eye, or to create two eyes, depending on whose move it
is? Usually the answer is "yes, if the life of a group
depends on that play, and there is nothing bigger to
do - otherwise, it's a bad play."

It's important to try that play and see what happens.
It might be a good play or a bad play. Static patterns
can't make that decision without more information. (
is the group isolated? how big is it? what else is on
the board? )



Terry McIntyre <[EMAIL PROTECTED]>

“Wherever is found what is called a paternal government, there is found state 
education. It has been discovered that the best way to insure implicit 
obedience is to commence tyranny in the nursery.”

Benjamin Disraeli, Speech in the House of Commons [June 15, 1874]


  



Re: [computer-go] Some ideas how to make strong heavy playouts

2008-04-01 Thread Don Dailey


terry mcintyre wrote:
> Don, 
>
> I'd strongly agree. You must know whether ladders work
> or not, whether a nakade play works or not, whether
> various monkey jumps and hanes and so forth succeed or
> not. In and of themselves, few moves are objectively
> good or bad in any sense - one has to try them and see
> what happens.
>
> Some form of search or playout is needed to determine
> this. Even patterns which are completely understood
> must be evaluated in the context of a game.
>   
Yes, and even the strongest patterns may not be relevant, especially in
our MC programs which lose interest once the game is won or lost.  

That's partly why I'm interested in exploring "on the fly" learning.
Learning outside the context of the position being played may not have
much relevance.

- Don


> To take a trivial example, three liberties in a row -
> should the middle point be played to reduce it to one
> eye, or to create two eyes, depending on whose move it
> is? Usually the answer is "yes, if the life of a group
> depends on that play, and there is nothing bigger to
> do - otherwise, it's a bad play."
>
> It's important to try that play and see what happens.
> It might be a good play or a bad play. Static patterns
> can't make that decision without more information. (
> is the group isolated? how big is it? what else is on
> the board? )