To answer several emails in one:

Hideki:  One question: why is the rating of FatMan-1 not 1800?

They are probably not identical - if one plays much faster than the other, it
will slightly cripple programs that think on the opponent's time.  That's the
only explanation I can think of.

Also, I fixed FatMan at 1800, not FatMan-1, because FatMan has more games.
However, I am going to change this to FatMan-1, since it has many more RECENT
games.  On CGOS they are of course both fixed at 1800.  This will cause a
shift in the entire rating pool.
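To make the "shift" concrete: re-anchoring just adds a constant offset to every rating, so all the rating differences are preserved.  A toy sketch with made-up numbers (the ratings below are illustrative, not actual CGOS values):

```python
# Toy sketch of re-anchoring a rating pool (hypothetical ratings).
# Fixing a different player at 1800 shifts everyone by the same offset,
# so differences between any two players are unchanged.
ratings = {"FatMan": 1795, "FatMan-1": 1810, "Lazarus": 2050}

def anchor(pool, player, target=1800):
    """Shift all ratings so `player` sits exactly at `target`."""
    offset = target - pool[player]
    return {name: r + offset for name, r in pool.items()}

print(anchor(ratings, "FatMan-1"))
```

Here every rating drops by 10 points, but Lazarus is still exactly 255 points above FatMan.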

Hideki:  And I'd be happier if there were links to the programs on the HOF
Hideki:  page so that I can see their rating pages.

This is easy - the next version will do this.

Rémi: Readpgn games.pgn
Rémi: elo
Rémi: advantage 0 ;no advantage for playing White
Rémi: drawelo 0.01 ;draws are extremely unlikely
Rémi: mm
Rémi: exactdist
Rémi: ratings

I'll switch over to this.
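For reference, Rémi's settings can be fed to bayeselo non-interactively with a here-document (a sketch - the `games.pgn` filename and a `bayeselo` binary on the PATH are assumptions):

```shell
# Sketch: run the recommended settings through bayeselo in one shot.
# Assumes `bayeselo` is on PATH and the game records are in games.pgn.
bayeselo <<'EOF'
readpgn games.pgn
elo
advantage 0
drawelo 0.01
mm
exactdist
ratings
EOF
```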

Rémi: No draw at all would be "drawelo 0", but it generates some numerical 
Rémi: problems in the algorithm for computing confidence intervals.

Ahh!  I tried to set this to zero and got bizarre results, so I left it alone.
Now I know what to do.



Rémi: I would be very interested if you could lower your threshold to 197 
Rémi: games, and include december results, which would include Crazy Stone in 
Rémi: the list ;-)

I plan to do this at the end of each month, strictly by month, with an
automated script (it's mostly automated now, including a little module to
convert the results to a PGN format that bayeselo can read).  But until I get
this fully automated I will just include all the games available from now on.
I think Crazy Stone was at the top before I applied these constraints.  I want
there to be a "price of admission" to get programs to play lots of games.
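For the curious, the conversion module only has to emit the few header tags bayeselo's readpgn command looks at.  A minimal sketch - the `(white, black, result)` input tuple is an assumption of mine, not CGOS's actual log format:

```python
# Minimal sketch: emit one PGN game record from a (white, black, result)
# tuple.  bayeselo only needs the White, Black, and Result tags.
# The input tuple layout is an assumption, not CGOS's actual log format.
def to_pgn(white, black, result):
    """result is 'W' (White won), 'B' (Black won), or 'D' (draw)."""
    score = {"W": "1-0", "B": "0-1", "D": "1/2-1/2"}[result]
    return (f'[White "{white}"]\n'
            f'[Black "{black}"]\n'
            f'[Result "{score}"]\n\n'
            f'{score}\n\n')

print(to_pgn("Lazarus-1-0", "FatMan-1", "W"))
```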

Rémi: Another idea: in order to give more significance to the results, games 
Rémi: between different versions of the same program should be excluded. These 
Rémi: games tend to strongly overestimate the strength of the stronger version.

There is no way to do this without adding more infrastructure.  It is
impossible to determine with complete confidence which version is which, and
even if I could, how do I determine which version should represent the
program family?  There are simple heuristics that might work most of the time
but are not completely reliable.  The convention I once suggested for
assigning versions to programs isn't widely followed and isn't enforced.  The
idea was to version your program like this: "Lazarus-1-0", where everything
before the first hyphen is the generic program name and everything after the
first hyphen is the version number.  But a malicious user could kick your
program off by using your name with a different version number.

It's possible to enforce this policy by letting the server impose it - your
password would apply to the characters before the first hyphen.  Even then,
how does the server determine which version is representative?  When I
developed Lazarus I added many experimental versions which didn't make the
cut but had more recent version numbers.  I would not want those versions to
"carry the flag" for the Lazarus family of programs.  So the only way to get
this working right is to add more infrastructure to CGOS.

I think a better approach is to provide a way for users to specify that they
don't want to play their own program.  We could consider a "family" of
programs to be specified by password.  Or it might be reasonable to say that
if you don't want your program to play other versions of itself, the password
must match as well as all characters before the first hyphen.  So essentially
you are forced to version with a hyphen if you want this behavior.
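The proposed rule could be sketched like this (function and parameter names are mine for illustration, not anything CGOS actually implements):

```python
# Sketch of the proposed pairing rule: two players belong to the same
# "family" (and should not be paired against each other) only if BOTH
# the password and the base name before the first hyphen match.
# All names here are illustrative, not real CGOS internals.
def base_name(player):
    """Everything before the first hyphen is the generic program name."""
    return player.split("-", 1)[0]

def same_family(name_a, pw_a, name_b, pw_b):
    return pw_a == pw_b and base_name(name_a) == base_name(name_b)

# Two Lazarus versions under the same password: excluded from pairing.
print(same_family("Lazarus-1-0", "secret", "Lazarus-1-1", "secret"))  # True
# Same base name but a different password (an impostor): still paired.
print(same_family("Lazarus-1-0", "secret", "Lazarus-2-0", "other"))   # False
```

Requiring the password to match means an impostor reusing your program name cannot block your pairings, which addresses the malicious-user concern above.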

Does that sound like a reasonable improvement to the system?


Rémi Coulom wrote:
> Don Dailey wrote:
>> I put up a web page that displays EVERY player who has played at least
>> 200 games on CGOS.
>>
>> It uses the bayeselo program that Rémi authored.    
>>
>>       http://cgos.boardspace.net/9x9/hof.html
>>
>>
>> I'm not sure I used the program correctly - it's rather complicated and
>> I'm not that great with statistics.   If anyone is interested in the
>> settings I used I can provide that.
>>
>> - Don
> Another idea: in order to give more significance to the results, games
> between different versions of the same program should be excluded.
> These games tend to strongly overestimate the strength of the stronger
> version.
>
> Rémi
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
