  Hi!

On Fri, Nov 13, 2015 at 08:39:20AM +0000, Josef Moudrik wrote:
> There has been some debate in science about making the research more
> reproducible and open. Recently, I have been thinking about making a
> standard public fixed dataset of Go games, mainly to ease comparison of
> different methods, to make results more reproducible and maybe free the
> authors of the burden of composing a dataset. I think that the current
> practice can be improved a lot.

I think the current de facto standard dataset is GoGoD (some year, not
quite fixed). So I think it's useful to differentiate your proposal
from this dataset: what are the current problems, and what will the
advantage be?

One advantage, of course, would be if the dataset is freely available.
But it's not clear how to achieve that, i.e. where to get a large
professional game collection without copyright protection.

> 2a) Size: My current view is that at least 2 sizes are necessary: small
> (1000-2000 games?) and large dataset (50000-60000 games).

What's the use case for a small dataset?

-- 
				Petr Baudis
	If you have good ideas, good data and fast computers,
	you can do almost anything. -- Geoffrey Hinton
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go