Re: [computer-go] 19x19 Study - prior in bayeselo, and KGS study

2008-01-29 Thread dhillismail
> From: Don Dailey <[EMAIL PROTECTED]> > ... > > Rémi Coulom wrote: > > ... > > Instead of playing UCT bot vs UCT bot, I am thinking about running a > > scaling experiment against humans on KGS. I'll probably start with 2k, > > 8k, 16k, and 32k playouts. > That would be a great experiment.   Ther

RE: [computer-go] 19x19 Study - prior in bayeselo, and KGS study

2008-01-29 Thread David Fotland
3 kyu at this level is a lot for a person. I've known club players who never got better than 9k, and people who study and play may still take a year or more to make this much improvement. Many club players stall somewhere between 7k and 4k and never get any better. David > -Original Message-

Re: [computer-go] Re: Scalbility study: low end

2008-01-29 Thread Heikki Levanto
On Thu, Jan 24, 2008 at 03:19:52PM +0100, Magnus Persson wrote: > > As a rule of thumb I want 300 games for each datapoint and at least > 500 if I am going to make any conclusions. OK, I think we are starting to have those 500 games. To my eyes, FatMan shows a clear turn in the curve at FatMan_03. Be

Re: [computer-go] 19x19 Study - prior in bayeselo, and KGS study

2008-01-29 Thread Michael Williams
I don't feel like searching for it right now, but not too long ago someone posted a link to a chart that gave the winrates and equivalent rankings for different rating systems. Don Dailey wrote: I wish I knew how that translates to win expectancy (ELO rating.)Is 3 kyu at this level a prett

Re: [computer-go] 19x19 Study - prior in bayeselo, and KGS study

2008-01-29 Thread Don Dailey
I wish I knew how that translates to win expectancy (ELO rating.) Is 3 kyu at this level a pretty significant improvement? - Don Hiroshi Yamashita wrote: >> Instead of playing UCT bot vs UCT bot, I am thinking about running a >> scaling experiment against humans on KGS. I'll probably start
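
[Editor's note] The translation Don is asking about follows from the standard logistic Elo model (the textbook formula, not something stated in the thread): a rating gap of d points predicts a win expectancy of 1/(1 + 10^(-d/400)). A minimal sketch:

```python
import math

def win_expectancy(elo_diff):
    """Expected score for the higher-rated player under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_gap(win_rate):
    """Inverse mapping: rating gap implied by an observed win rate (0 < win_rate < 1)."""
    return -400.0 * math.log10(1.0 / win_rate - 1.0)
```

For example, a 64% win rate corresponds to roughly a 100-Elo gap; how many KGS kyu grades that buys depends on the rank-to-Elo mapping, which is the kind of table the chart Michael mentions would provide.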

Re: [computer-go] 19x19 Study - prior in bayeselo, and KGS study

2008-01-29 Thread Don Dailey
We can say with absolute statistical certainty that humans when playing chess improve steadily with each doubling of time. This is not a hunch, guess or theory, it's verified by the FACT that we know exactly how much computers improve with extra time and we also know for sure that humans play

Re: [computer-go] 19x19 Study - prior in bayeselo, and KGS study

2008-01-29 Thread Don Dailey
Hiroshi Yamashita wrote: >> What are the time controls for the games? > > Both are 10 minutes + 30 seconds byo-yomi. > > Hiroshi Yamashita Good. I think that is a good way to test. - Don > > > ___ > computer-go mailing list > computer-go@computer

Re: [computer-go] 19x19 Study - prior in bayeselo, and KGS study

2008-01-29 Thread Hiroshi Yamashita
What are the time controls for the games? Both are 10 minutes + 30 seconds byo-yomi. Hiroshi Yamashita ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 19x19 Study - prior in bayeselo, and KGS study

2008-01-29 Thread Don Dailey
What are the time controls for the games? - Don Hiroshi Yamashita wrote: >> Instead of playing UCT bot vs UCT bot, I am thinking about running a >> scaling experiment against humans on KGS. I'll probably start with >> 2k, 8k, 16k, and 32k playouts. > > I have a result on KGS. > > AyaMC 6k (5.9k

Re: [computer-go] scalability and transitivity

2008-01-29 Thread Don Dailey
Jeff Nowakowski wrote: > On Tue, 2008-01-29 at 17:41 -0500, Don Dailey wrote: > >> This is in response to a few posts about the "self-test" effect in ELO >> rating tests. >> > [...] > >> So my assertion is that scalability based on sound principles is more or >> less universal with per

Re: [computer-go] 19x19 Study - prior in bayeselo, and KGS study

2008-01-29 Thread Don Dailey
Rémi Coulom wrote: > Don Dailey wrote: >> They seem under-rated to me also. Bayeselo pushes the ratings together >> because that is apparently a valid initial assumption. With enough >> games I believe that effect goes away. >> >> I could test that theory with some work. Unless there is a

Re: [computer-go] 19x19 Study - prior in bayeselo, and KGS study

2008-01-29 Thread Hiroshi Yamashita
Instead of playing UCT bot vs UCT bot, I am thinking about running a scaling experiment against humans on KGS. I'll probably start with 2k, 8k, 16k, and 32k playouts. I have a result on KGS. AyaMC 6k (5.9k) 16po http://www.gokgs.com/graphPage.jsp?user=AyaMC AyaMC2 9k (8.4k) 1po http:

Re: [computer-go] scalability and transitivity

2008-01-29 Thread Michael Williams
He stated that there is a depth limit in FatMan. IMO, it is quite likely that is the limiting factor. Jeff Nowakowski wrote: On Tue, 2008-01-29 at 17:41 -0500, Don Dailey wrote: This is in response to a few posts about the "self-test" effect in ELO rating tests. [...] So my assertion is th

Re: [computer-go] scalability and transitivity

2008-01-29 Thread Jeff Nowakowski
On Tue, 2008-01-29 at 17:41 -0500, Don Dailey wrote: > This is in response to a few posts about the "self-test" effect in ELO > rating tests. [...] > So my assertion is that scalability based on sound principles is more or > less universal with perhaps a small amount of self-play distortion, but >

Re: [computer-go] 19x19 Study - prior in bayeselo, and KGS study

2008-01-29 Thread Rémi Coulom
Don Dailey wrote: They seem under-rated to me also. Bayeselo pushes the ratings together because that is apparently a valid initial assumption. With enough games I believe that effect goes away. I could test that theory with some work. Unless there is a way to turn that off in bayeselo (I d

Re: [computer-go] 19x19 Study

2008-01-29 Thread Don Dailey
In other tests I've done, bayeselo comes out pretty much the same as an iterative method I use where players are batch rated over and over until they converge. I did a similar thing with the current data and it comes out very similar to bayeselo, so I have no reason to believe that with a few hu
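
[Editor's note] Don's actual batch-rating code is not shown in the thread, but a method in the same spirit is the standard Bradley-Terry fixed-point (MM) fit, iterated until the ratings converge. A minimal sketch under that assumption:

```python
import math

def fit_ratings(wins, num_iters=200):
    """Bradley-Terry strengths via the standard MM fixed-point update.
    wins[i][j] = games i won against j.  Assumes every player has at
    least one win.  Returns Elo-scale ratings centered on zero."""
    n = len(wins)
    gamma = [1.0] * n
    for _ in range(num_iters):
        new = []
        for i in range(n):
            w_i = sum(wins[i])                       # total wins by player i
            denom = sum((wins[i][j] + wins[j][i]) / (gamma[i] + gamma[j])
                        for j in range(n) if j != i)
            new.append(w_i / denom if denom else gamma[i])
        # renormalize so the geometric mean of strengths stays at 1
        g = math.exp(sum(math.log(x) for x in new) / n)
        gamma = [x / g for x in new]
    return [400.0 * math.log10(x) for x in gamma]
```

With two players and a 64-36 head-to-head record this converges to a gap of about 100 Elo, consistent with the logistic model.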

Re: [computer-go] 19x19 Study

2008-01-29 Thread Don Dailey
Yes, you are right. But gnugo's 1800 rating is the only real point of reference that I have. As you get farther away from 1800 I believe it's the case that the "true" rating can be sloppy. - Don Sylvain Gelly wrote: > > between pairs > of programs, you can get a more and more conf

[computer-go] scalability and transitivity

2008-01-29 Thread Don Dailey
This is in response to a few posts about the "self-test" effect in ELO rating tests. I'll start by claiming right up front that, for certain types of programs, I don't believe this is something we have to worry unduly about. I'll explain why I feel that way in a moment. One general obser

Re: [computer-go] 19x19 Study

2008-01-29 Thread Christoph Birk
On Tue, 29 Jan 2008, Álvaro Begué wrote: Of course the value of A is the interesting number here. If they are using anything like the common UCT exploration/exploitation formula, I think we will see a roof fairly soon. In my own experiments at long time controls, the program would spend entirely
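
[Editor's note] The "common UCT exploration/exploitation formula" referred to here is UCB1 applied at each tree node; a generic textbook sketch, not any particular program's code:

```python
import math

def ucb1_select(children, c=1.0):
    """Pick the index of the child maximizing mean value plus exploration
    bonus.  children: list of (wins, visits); c is the exploration constant."""
    total = sum(v for _, v in children)
    def score(stats):
        w, v = stats
        if v == 0:
            return float('inf')   # always try unvisited moves first
        return w / v + c * math.sqrt(math.log(total) / v)
    return max(range(len(children)), key=lambda i: score(children[i]))
```

With a fixed exploration constant the bonus term shrinks like sqrt(ln N / n), which is one intuition behind the "roof" Christoph expects at very large playout counts.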

Re: [computer-go] 19x19 Study

2008-01-29 Thread Sylvain Gelly
> > between pairs > of programs, you can get a more and more confident belief about > the actual ELO. so they'll converge to the correct values, and > should do so reasonably rapidly. > You are right. My point was that here we have only 1 fixed rating, which is very low, and all the higher levels

Re: [computer-go] 19x19 Study

2008-01-29 Thread Álvaro Begué
On Jan 29, 2008 3:18 PM, steve uurtamo <[EMAIL PROTECTED]> wrote: > > But that's not relevant for this study. All that matters is the 14 > out > > of 20 total score and the order should not matter one little bit. > > With players that change, that is very relevant however. > > lemme think tha

Re: [computer-go] 19x19 Study

2008-01-29 Thread steve uurtamo
sorry, of course you're right -- linearity of expectation, after all. the total number of games is relevant for the std. dev., though. which has settled down for everyone but the new guys to a quite respectably small number of elo points. s. - Original Message From: Don Dailey <[EMAIL
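
[Editor's note] Steve's point about the standard deviation can be made concrete: the 1-sigma uncertainty of an Elo estimate from n games shrinks like 1/sqrt(n). A sketch via the delta method on the logistic Elo curve (an illustration, not the list's rating code):

```python
import math

def elo_error(p, n):
    """Approximate 1-sigma Elo uncertainty from an observed win rate p
    over n independent games, by the delta method."""
    sigma_p = math.sqrt(p * (1.0 - p) / n)          # std. dev. of the win rate
    # derivative of elo(p) = -400*log10(1/p - 1) with respect to p
    deriv = 400.0 / (math.log(10.0) * p * (1.0 - p))
    return deriv * sigma_p
```

At a 50% score, it takes about 400 games to bring the 1-sigma error down to roughly 17 Elo.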

Re: [computer-go] 19x19 Study

2008-01-29 Thread steve uurtamo
> But that's not relevant for this study. All that matters is the 14 out > of 20 total score and the order should not matter one little bit. > With players that change, that is very relevant however. lemme think that over. by the way, attached is what a quadratic fit looks like to the
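
[Editor's note] The attachment with the quadratic fit is not preserved in the archive, but the fit itself is easy to reproduce: regress Elo against the doubling index with a second-degree polynomial and inspect the curvature term. A stdlib-only sketch (any data fed to it here would be hypothetical):

```python
def quadfit(xs, ys):
    """Least-squares fit y ~ a*x^2 + b*x + c via the 3x3 normal
    equations, solved by Gaussian elimination with partial pivoting."""
    s = [sum(x ** k for x in xs) for k in range(5)]           # power sums of x
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # augmented normal-equation matrix for coefficients (a, b, c)
    m = [[s[4], s[3], s[2], t[2]],
         [s[3], s[2], s[1], t[1]],
         [s[2], s[1], s[0], t[0]]]
    for col in range(3):                                      # forward elimination
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    coeffs = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):                                       # back substitution
        coeffs[r] = (m[r][3] - sum(m[r][k] * coeffs[k]
                                   for k in range(r + 1, 3))) / m[r][r]
    return coeffs  # (a, b, c)
```

A negative curvature term a would indicate the diminishing returns being debated in this thread.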

Re: [computer-go] 19x19 Study

2008-01-29 Thread Don Dailey
steve uurtamo wrote: > it is a good thing to make your prior knowledge completely > fair (in the sense of not having any bias) when doing bayesian > calculations. any estimator being used will reshape that > knowledge on the fly. > > the idea is that your prior knowledge of the ELO ranking shoul

Re: [computer-go] 19x19 Study

2008-01-29 Thread steve uurtamo
it is a good thing to make your prior knowledge completely fair (in the sense of not having any bias) when doing bayesian calculations. any estimator being used will reshape that knowledge on the fly. the idea is that your prior knowledge of the ELO ranking should be about the same for every sing

Re: [computer-go] 19x19 Study

2008-01-29 Thread terry mcintyre
Sylvain, in the download notes, you mention that Mogo has some troubles with "very long" timescales, due to the low resolution of single floats. Do you have any estimate of how many simulations would lead to this situation? Terry McIntyre <[EMAIL PROTECTED]> - Original Message From:
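
[Editor's note] Terry's question has a back-of-the-envelope answer independent of MoGo's internals: an IEEE-754 single can represent consecutive integers only up to 2^24 = 16,777,216, so a win or visit counter kept in a single float simply stops incrementing around 16.7 million simulations (and mean values lose precision well before that). A small demonstration using `struct` to round-trip through 32-bit precision:

```python
import struct

def to_f32(x):
    """Round a Python double to the nearest IEEE-754 single."""
    return struct.unpack('f', struct.pack('f', x))[0]

LIMIT = 2.0 ** 24   # 16,777,216: last point where singles count by ones

count = to_f32(LIMIT)
stepped = to_f32(count + 1.0)   # simulate "count += 1" in single precision
# stepped == count here: the counter can no longer advance
```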

Re: [computer-go] 19x19 Study

2008-01-29 Thread Don Dailey
They seem under-rated to me also. Bayeselo pushes the ratings together because that is apparently a valid initial assumption. With enough games I believe that effect goes away. I could test that theory with some work. Unless there is a way to turn that off in bayeselo (I don't see it) I could
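
[Editor's note] The pulling-together Don describes is what a Bayesian prior does to small samples. A toy illustration (it mimics the effect; it is not bayeselo's actual model): treat the prior as a couple of virtual drawn games against an equal-rated opponent, which shrinks extreme short-sample results toward zero and fades as real games accumulate.

```python
import math

def elo_estimate(wins, losses, prior_games=2.0):
    """Rating gap vs. an equal-strength anchor, where the prior adds
    prior_games virtual draws (half a win and half a loss each).
    Illustrative shrinkage only, not bayeselo's code."""
    w = wins + prior_games / 2.0
    l = losses + prior_games / 2.0
    return 400.0 * math.log10(w / l)
```

A 9-1 record rates about 382 Elo raw but only about 280 with the prior; at 900-100 the prior moves the estimate by less than 2 Elo, matching Don's expectation that the effect goes away with enough games.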

Re: [computer-go] 19x19 Study

2008-01-29 Thread Sylvain Gelly
> > but not linearly and you can see a nice gradual curve in the plot. > > Now we have something we can argue about for weeks. Why is it not > mostly linear? Could it be the memory issue I just mentioned? > Hi Don and all participants in that study, that is very interesting! The memory constr

Re: [computer-go] 19x19 Study

2008-01-29 Thread Don Dailey
No, FatMan is a primitive UCT player, so it builds a tree in memory. I am not willing to extend it beyond the current level as it is a resource hog. Mogo is playing at the same strength in far less time. - Don Jason House wrote: > > > On Jan 29, 2008 10:01 AM, Don Dailey <[EMAIL PROTECTED

Re: [computer-go] 19x19 Study

2008-01-29 Thread Jason House
On Jan 29, 2008 10:01 AM, Don Dailey <[EMAIL PROTECTED]> wrote: > FatMan seems to hit some kind of hard limit rather suddenly. It > could be an implementation bug or something else - I don't really > understand this. It's very difficult to test a program for > scalability since you are lim

Re: [computer-go] 19x19 Study

2008-01-29 Thread Joshua Shriver
I'd be willing to donate some CPU time. I run Ubuntu linux on a P4 3ghz with 1gig RAM. Can be configured with or w/o HT. (usually leave it off) -Josh On Jan 29, 2008 10:01 AM, Don Dailey <[EMAIL PROTECTED]> wrote: > The 9x9 scalability study has been a huge success with 35 cpu's > participating

[computer-go] 19x19 Study

2008-01-29 Thread Don Dailey
The 9x9 scalability study has been a huge success with 35 CPUs participating and several volunteers. This means we got about a month of testing done per day and have the equivalent of about a year's worth of data or more. We are considering whether to extend the study a bit more to test some