Hello,

On Fri, Nov 13, 2015 at 10:13 AM <fotl...@smart-games.com> wrote:
> I would only use it if it is licensed for commercial use.

Yes, I would like to license it that way, please see below.

On Fri, Nov 13, 2015 at 10:23 AM Petr Baudis <pa...@ucw.cz> wrote:
> I think the current de facto standard dataset is GoGoD (some year, not
> quite fixed). So I think it's useful to differentiate your proposal
> against this dataset - what are the current problems and what will be
> the advantage?

Yes, I know GoGoD is used frequently, but I think the lack of a "precise"
specification is the problem. There are many choices an author has to make
when using the GoGoD database: year of release, year span, handicap games?,
amateur/professional? (how to tell? the pro rank is recorded as d, not p).
A related issue is that some of the games (if I remember my experience
correctly) cannot be parsed by some libraries, in which case they are
usually skipped. All of these are branching points that make "precise"
replication of results hard.

> One advantage would be of course if the dataset is freely available.
> But it's not clear how to achieve that, i.e. where to get a large
> professional game collection without copyright protection.

I consider this "negotiation" the hardest work I will have to do, but
before I start, I want to find out whether the dataset would even be used.

From the point of view of copyright law, I believe that what is protected
is the "collection of games" and the "additional materials" (comments,
etc.), not the individual games themselves (which, as records of a
historical event, afaik cannot be copyrighted). The rights of the current
collection owners to the "collection of games" and the "additional
materials" could be protected by anonymizing the records and mixing
different databases, if the current owners agree.

From the licensing point of view, again provided that the owners agree, I
would like to release the dataset under something like a
free-for-all-purposes-with-attribution license. I still have to research
this.

> What's the usecase for a small dataset?

I had prototype testing in mind, so that authors can say "our method is
slow, so we only tested on the SmallGoDataset" instead of "we randomly took
1000 games from the BigGoDataset", but I assume there would be other use
cases as well. Anyway, I think the big and small datasets would not cause
much fragmentation of use, because the use cases for big vs. small would be
different.

But maybe I am overthinking things and this would not be used much.

Regards,
Josef
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go