Rémi Coulom wrote:
> Accelerated UCT does this:
> https://www.conftool.net/acg13/index.php/Hashimoto-Accelerated_UCT_and_Its_Application_to_Two-Player_Games-111.pdf?page=downloadPaper&filename=Hashimoto-Accelerated_UCT_and_Its_Application_to_Two-Player_Games-111.pdf&form_id=111&form_version=final

This idea was mentioned, circa 2009, on this list. It seemed intuitively right that giving more weight to the most recent results should improve play. I implemented it in 2009, pretty much as the authors of the paper describe, and it is still a configurable option in my program. I also used simulated sources of results. In simulation it became clear that it was working fine: the "learning" evaluation was a much better estimator of the final value than the "non-learning" one (in my implementation it is called "estimate trend").

When playing games, not only did it not work, it made the program clearly weaker. Even constants giving very slow learning can lose 10 to 20 Elo points. No value has ever made the program stronger; at best I can fade it out completely, making it irrelevant. And I have tested it more than once, because I believed in it, as the program evolved from double-digit kyu to 3-4 kyu, always with negative results.

Has someone else tried it? I am still interested in understanding why it doesn't work (for me), as it seems a good idea.

Jacques.
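P.S. For readers not familiar with the idea, here is a minimal sketch of a recency-weighted node statistic of this general kind. It is only an illustration, not the paper's formula and not my actual "estimate trend" code; the NodeStats class and the decay parameter lam are made-up names for the sketch.

class NodeStats:
    def __init__(self, lam=0.999):
        self.visits = 0
        self.mean = 0.0         # standard UCT: plain average of playout results
        self.decayed_sum = 0.0  # recency-weighted variant: exponentially decayed sums
        self.decayed_n = 0.0
        self.lam = lam          # lam -> 1.0 fades the scheme out (back to the plain mean)

    def update(self, result):
        # result: playout outcome from this node's point of view, in [0, 1]
        self.visits += 1
        self.mean += (result - self.mean) / self.visits
        # older results are down-weighted by a factor lam per new result
        self.decayed_sum = self.lam * self.decayed_sum + result
        self.decayed_n = self.lam * self.decayed_n + 1.0

    def value(self, use_recency_weighting):
        if use_recency_weighting and self.decayed_n > 0.0:
            return self.decayed_sum / self.decayed_n
        return self.mean

With lam very close to 1 the two estimators coincide, which is the "fade it out completely" setting mentioned above; smaller lam means faster "learning", and in my games those settings are the ones that cost Elo.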
_______________________________________________ Computer-go mailing list [email protected] http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
