Re: [computer-go] Should 9x9 komi be 8.0 ?

2008-02-27 Thread Vlad Dumitrescu
Hi,

On Wed, Feb 13, 2008 at 7:51 PM, Don Dailey <[EMAIL PROTECTED]> wrote:
>  In a serious competition you would want to throttle down the playing
>  strength (when playing black) so that it could win more and not just
>  quit (resign) out of frustration!

Why throttle the playing strength? Wouldn't it be enough to raise the
threshold at which the program resigns?

Naively put: if all results say the game is lost, switch the
evaluation to "best possible score" and continue playing for a while.
If any winning paths appear, switch back to normal evaluation; else
resign.
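A minimal sketch of that idea (the threshold values and policy names here are invented for illustration, not from any real engine):

```python
RESIGN_THRESHOLD = 0.02   # resign only when truly hopeless
SWITCH_THRESHOLD = 0.10   # below this, play for best score instead
GRACE_MOVES = 15          # how long to keep fighting before giving up

def choose_policy(win_prob, moves_since_switch):
    """Pick which evaluation the engine should use for its next move."""
    if win_prob >= SWITCH_THRESHOLD:
        return "maximize_win_probability"   # normal MC evaluation
    if moves_since_switch < GRACE_MOVES or win_prob > RESIGN_THRESHOLD:
        return "maximize_expected_score"    # play on, hoping wins reappear
    return "resign"
```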

best regards,
Vlad
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


[EMAIL PROTECTED]: Re: [computer-go] Should 9x9 komi be 8.0 ?]

2008-02-27 Thread jonas . kahn
Hi there

I am new here, but have been reading the list for a few months.
I am a mathematician, finishing my PhD on quantum statistics (that is,
statistics on quantum objects, quantum information, etc.).
So do not expect me to write any code, but I may have suggestions for
heuristics in the choice of moves and so on.

I'll speak more about one of them in the coming days.

I shall just begin with a short reflection on the evaluation function,
starting from Don's fear that we might have to curb overly strong and
pessimistic programs.

> The funny thing about MC scoring is that if white DOES win,  then it
> doesn't matter what black plays, everything loses!
>
> That would mean that in a scalability study,   where colors are
> alternated,  really strong versions of Mogo would score poorly against
> the weaker programs.
>
> In a serious competition you would want to throttle down the playing
> strength (when playing black) so that it could win more and not just
> quit (resign) out of frustration!


There was a suggestion of using something like tanh(K*score) instead
of 0-1 scoring.

A similar system would make some sense, without loss of strength, if we
scale K like 1/n, where n is the number of simulations. A consequence
would be that programs would choose among moves that are all 100% wins
(or all 100% losses) according to the score they expect.

Let us analyze all this more precisely.

First, I consider that the "true" evaluation function E of the node is
the exact probability of winning under the MC simulations. In other
words, that is the value we would get with infinitely many runs and
0-1 scoring, and that is what I want to approximate as precisely as
possible with my finite number of runs.

Second, I evaluate here only at one node, without growth of the tree.
Given the conclusions, there should not be much difference. If the
conclusion had been that we can do much better than 0-1, that would
have been quite another matter.

The 0-1 evaluation function converges in 1/\sqrt{n}, which is
impossible to beat if there is sufficient irregularity in the
expectation of each result.

Now, let us write p_i for the probability that the score of an MC
simulation is i; so i = 0.5, -3.5, or 7.5, for example.
We have \sum_i p_i = 1, and our evaluation function is
E = \sum_i p_i sign(i).
(Strictly speaking that counts -1 for a defeat rather than 0, but it is
equivalent and more symmetrical, hence nicer for presenting the
following calculations.)

We observe
\hat{p}_i = n_i / n,
where n_i is the number of our n simulations that give result i.
The 0-1 evaluation corresponds to estimating E by
\hat{E} = \sum_i \hat{p}_i sign(i).

Now the p_i are heavily correlated (**), so I am convinced that I
could get better results than 0-1 for a low number of simulations.
For a high number of simulations, certainly not (the autocorrelation
would dominate).

Such methods (probably using low-pass filters on the array of
\hat{p}_i) would, however, probably be much slower than running a few
more simulations...

If we keep to something simple, we could use functions of the form
\hat{E}_{\lambda} = \sum_i \hat{p}_i sign(i) (1 - \lambda_i).

This includes the suggestion of using tanh(K i).
Notice that 0-1 corresponds to taking \lambda_i = 0.
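For concreteness, here is a small sketch (with made-up helper names) of the plain 0-1 estimate next to a tanh(K*score) estimate computed from the same simulation results; for large K the two coincide:

```python
import math

def e_hat_01(scores):
    """Plain 0-1 (here +1/-1) estimate: the average of sign(score)."""
    return sum(math.copysign(1.0, s) for s in scores) / len(scores)

def e_hat_tanh(scores, K):
    """tanh(K*score) estimate, i.e. sign(i) * (1 - lambda_i)
    with lambda_i = 1 - |tanh(K*i)|."""
    return sum(math.tanh(K * s) for s in scores) / len(scores)

# Example: four playout scores (half-integers because of the komi)
scores = [0.5, -3.5, 7.5, 0.5]
print(e_hat_01(scores))          # 0.5 : three wins, one loss
print(e_hat_tanh(scores, 0.25))  # a softer value that rewards the margin
```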

Let us compute the expected mean square error (MSE) = Bias^2 + Variance.

Bias:
B = - E + \sum_i p_i sign(i) (1 - \lambda_i)
  = - \sum_i p_i sign(i) \lambda_i

This is zero with the 0-1 evaluation function.

Variance:
V = (1/n) (\sum_i p_i (1 - \lambda_i)^2 - (E + B)^2)

We are only interested in the difference with the 0-1 case, so I do not
expand the variance further. For 0-1, the variance is (1/n) (1 - E^2).

MSE(\lambda) - MSE(0-1) =
B^2 (1 - 1/n) - 2 E B / n + (1/n) \sum_i p_i (\lambda_i^2 - 2 \lambda_i)
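As a sanity check, the algebra can be verified numerically for an arbitrary (made-up) score distribution p_i and weights \lambda_i; this is a throwaway sketch, not part of any engine:

```python
def moments(p, lam, n):
    """Exact E, bias B and MSE of E_hat_lambda = sum_i p_hat_i sign(i)(1 - lambda_i)."""
    sign = lambda i: 1.0 if i > 0 else -1.0
    E = sum(p[i] * sign(i) for i in p)
    B = -sum(p[i] * sign(i) * lam[i] for i in p)
    V = (sum(p[i] * (1 - lam[i]) ** 2 for i in p) - (E + B) ** 2) / n
    return E, B, B * B + V

# Arbitrary distribution over final scores, and small lambda_i
p   = {-2.5: 0.2, -0.5: 0.25, 0.5: 0.35, 2.5: 0.2}
lam = {-2.5: 0.02, -0.5: 0.1, 0.5: 0.1, 2.5: 0.02}
n = 50

E, B, mse_lam = moments(p, lam, n)
mse_01 = (1 - E ** 2) / n                 # 0-1 case: B = 0, V = (1 - E^2)/n
lhs = mse_lam - mse_01                    # direct difference of the two MSEs
rhs = (B ** 2 * (1 - 1 / n) - 2 * E * B / n
       + sum(p[i] * (lam[i] ** 2 - 2 * lam[i]) for i in p) / n)
assert abs(lhs - rhs) < 1e-12             # the two expressions agree
```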

So if we want this difference to be negative, showing better results
than 0-1, we first need the bias to be of order 1/n. The only way to
achieve this with pre-chosen \lambda_i is to take \lambda_i of order
1/n. The change would then be only cosmetic: it means changing at most
a fixed number of outcomes as n goes to infinity!

Then the leading order (that is, 1/n^2) can be rewritten:
B^2 + (2/n) \sum_i p_i \lambda_i (E sign(i) - 1)

If we restrict to the case where \lambda_i = \lambda_{-i}, then we see
that the positive contributions involve (products of) terms of the form
(p_i - p_{-i}) \lambda_i,
whereas the negative contribution involves
(p_i + p_{-i}) \lambda_i.

As we want the difference to be as negative as possible, we could get
a small improvement if \lambda_i is bigger when p_i is near p_{-i}.
A priori, this should be the case for small i: if there is a high
chance that the final score is -0.5, there is a high chance that it is
1.5. No link between 20.5 and -19.5 a priori, on the other hand...

We have to keep \sum_i \lambda_i very low, though, otherwise the
squared bias becomes dominant.


*** CONCLUSION ***

If we restrict our attention to schemes where we add a value to the
node after the sim

[computer-go] March KGS bot tournament

2008-02-27 Thread Nick Wedd
Registration is now open for this Sunday's bot tournament.  This will 
use full-sized boards for both divisions. It will start at 16:00 GMT, 
and take place in the Asian night, European evening, and American 
daytime.  Time limits will be 45 minutes each, sudden death. It will end 
around half an hour before midnight GMT.


Registration is as usual, and is as described at
http://www.weddslist.com/kgs/how/index.html

The tournaments themselves are on the KGS site at
http://www.gokgs.com/tournInfo.jsp?id=366 and
http://www.gokgs.com/tournInfo.jsp?id=367.

Nick
--
Nick Wedd    [EMAIL PROTECTED]


Re: [computer-go] Should 9x9 komi be 8.0 ?

2008-02-27 Thread Matthew Woodcraft
Vlad Dumitrescu wrote:
> Why throttle the playing strength? Wouldn't it be enough to raise the
> threshold at which the program resigns?

> Naively put: if all results say the game is lost, switch the
> evaluation to "best possible score" and continue playing for a while.
> If any winning paths appear, switch back to normal evaluation, else
> resign.

I experimented with something similar a while ago, using the
publicly available MoGo and manipulating the komi between moves.

If its win probability fell below a certain threshold (and the move
number wasn't too high), I told it to play on the assumption that it
would receive a few points more komi (and similarly when the win
probability became high).

This did seem to make it stronger against GNU Go on 13x13, but I didn't
investigate thoroughly.
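A sketch of that scheme, with invented threshold values and the komi counted in points in the program's favor (the actual values used in the experiment are not given above):

```python
LOW, HIGH = 0.25, 0.75   # win-probability band in which komi is left alone
STEP = 2.0               # points of komi to shift at a time
MAX_MOVE = 40            # stop adjusting once the game is too far along

def adjust_komi(komi, win_prob, move_number):
    """Return the komi (in the program's favor) to assume for the next move."""
    if move_number > MAX_MOVE:
        return komi                # too late in the game: leave it alone
    if win_prob < LOW:
        return komi + STEP         # pretend to receive a few extra points
    if win_prob > HIGH:
        return komi - STEP         # aim for a bigger target when ahead
    return komi
```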

I hope that techniques like this will be able to avoid the situation
where the computer starts playing very risky moves when it judges it is
only a little behind (which can be foolish if the opponent is likely to
make plenty of mistakes of its own). They could also help a computer
make better use of handicap stones.

-M-



[computer-go] f(score) instead of sign(score)

2008-02-27 Thread Rémi Coulom

Hi Jonas,

welcome to the list.

The idea of using f(score) instead of sign(score) is interesting. Long 
ago, I tried tanh(K*score) on 9x9 (that was before the 2006 Olympiad, so 
it may be worth trying again), and I found that the higher K, the 
stronger the program. Still, I believe that other f may be worth trying.


By the way, Olivier and Sylvain mentioned earlier on this list that they 
were using floating point in their tree data structure. So MoGo may be 
using a floating point function to estimate the score of a playout, 
otherwise there would be no reason to use floating point. But I may be 
guessing wrong. Maybe they can tell us?


Rémi


[computer-go] Open source paper

2008-02-27 Thread David Silver
For those who haven't seen it, here is a nice paper describing the  
benefits of open source software. This journal (JMLR) is also starting  
a new track for publications relating to open source software.


http://jmlr.csail.mit.edu/papers/volume8/sonnenburg07a/sonnenburg07a.pdf

-Dave