Data (facts) are not copyrightable, but databases (collections of facts) can be. See Feist v. Rural for precedent; in short, there must be a non-obvious and creative aspect to the database for it to qualify for copyright protection. I doubt that a collection of datasets would clear this bar, but it's still worth noting.
--t

> On Mar 4, 2016, at 6:22 AM, Robert M. Flight <rfligh...@gmail.com> wrote:
>
> I am pretty sure in general "data" is not copyrightable per se
> (http://www.lib.umich.edu/copyright/facts-and-data), so while I might
> contact the original authors as a courtesy, if the data has been released
> into any public database, then you should be free to do with it as you
> please. Providing the original accession numbers for the data and relevant
> citations (if they exist) so that it is easy for you and others to be given
> credit if the data is used would be a good thing to do.
>
> Also, I would personally go with the CC0 (waiver of copyright, see
> https://wiki.creativecommons.org/wiki/CC0) for a data package, as the data
> is already publicly available; you have just packaged it together into a
> useful set.
>
> My 2 cents.
>
> -Robert
>
> Robert M Flight, PhD
> Bioinformatics Research Associate
> Resource Center for Stable Isotope Resolved Metabolomics
> Manager, Systems Biology and Omics Integration Journal Club
> Markey Cancer Center
> CC434 Roach Building
> University of Kentucky
> Lexington, KY
>
> Twitter: @rmflight
> Web: rmflight.github.io
> ORCID: http://orcid.org/0000-0001-8141-7788
> EM rfligh...@gmail.com
> PH 502-509-1827
>
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to say
> what the experiment died of. - Ronald Fisher
>
> On Fri, Mar 4, 2016 at 8:52 AM Kasper Daniel Hansen
> <kasperdanielhan...@gmail.com> wrote:
>
>> For data packages, which do not contain any code, it seems weird to use a
>> software license such as GPL or GPL-2. It seems better to use something
>> like Artistic-2.0 or one of the CC licenses.
>>
>> On Thu, Mar 3, 2016 at 5:15 PM, davide risso <risso.dav...@gmail.com>
>> wrote:
>>
>>> Hi Hervé and Sean,
>>>
>>> thanks for your help. It will indeed be interesting to hear how other
>>> people chose the license, especially for those packages that redistribute
>>> a dataset not from their lab.
>>>
>>> I do have an experimental data package in Bioc, zebrafishRNASeq, but it's
>>> an experiment from a collaborator and at the time I didn't pay much
>>> attention to which license to use.
>>> In this case, I'd like to redistribute data from different labs. I guess I
>>> will contact the original authors at least as a courtesy.
>>> But I'm still keen to hear opinions on which license(s) is appropriate for
>>> experimental data sharing.
>>>
>>> Best,
>>> davide
>>>
>>> On Thu, Mar 3, 2016 at 12:50 PM Hervé Pagès <hpa...@fredhutch.org> wrote:
>>>
>>>> Hi Davide,
>>>>
>>>>> On 03/01/2016 02:25 PM, davide risso wrote:
>>>>> Dear Bioc developers,
>>>>>
>>>>> I recently downloaded three publicly available single-cell RNA-seq
>>>>> datasets from the NCBI GEO/SRA repository and created an R package
>>>>> with some gene-level summaries (read counts and FPKMs).
>>>>>
>>>>> I'm currently using the package locally for my own tests, but I'm
>>>>> thinking that this may be a useful resource for the community and
>>>>> thinking of sharing it on github and eventually submitting it to
>>>>> Bioconductor.
>>>>>
>>>>> I was not involved in any way with the original studies, and I'm
>>>>> wondering what is the best practice in terms of license / data
>>>>> sharing. Since there are many experimental data packages in
>>>>> Bioconductor, I'm guessing that I'm not the first person wondering
>>>>> about this.
>>>>>
>>>>> From the NCBI website, I read (quote from
>>>>> https://www.ncbi.nlm.nih.gov/home/about/policies.shtml):
>>>>> Databases of molecular data on the NCBI Web site include such examples
>>>>> as nucleotide sequences (GenBank), protein sequences, macromolecular
>>>>> structures, molecular variation, gene expression, and mapping data.
>>>>> They are designed to provide and encourage access within the
>>>>> scientific community to sources of current and comprehensive
>>>>> information. Therefore, NCBI itself places no restrictions on the use
>>>>> or distribution of the data contained therein. Nor do we accept data
>>>>> when the submitter has requested restrictions on reuse or
>>>>> redistribution. However, some submitters of the original data (or the
>>>>> country of origin of such data) may claim patent, copyright, or other
>>>>> intellectual property rights in all or a portion of the data (that
>>>>> has been submitted). NCBI is not in a position to assess the validity
>>>>> of such claims and since there is no transfer of rights from
>>>>> submitters to NCBI, NCBI has no rights to transfer to a third party.
>>>>> Therefore, NCBI cannot provide comment or unrestricted permission
>>>>> concerning the use, copying, or distribution of the information
>>>>> contained in the molecular databases.
>>>>>
>>>>> Should I contact the original authors for permission? Or is the fact
>>>>> that the data were publicly shared enough to grant me permission to
>>>>> redistribute?
>>>>> In that case, is there a standard license that I should use?
>>>>>
>>>>> Thanks for any feedback / thought!
>>>>
>>>> I don't have much to offer. AFAIK we don't really have guidelines or
>>>> recommendations for what license to use for experimental data packages,
>>>> except for the usual "make sure you use an appropriate license" advice.
>>>> So far it has really been up to each author/maintainer to make sure
>>>> they pick up a license that is compatible with the original
>>>> license/copyright/patent of the original data they are packaging
>>>> and with its redistribution thru the Bioconductor channel.
>>>>
>>>> FWIW here is a summary of the licenses used by the 276 experimental
>>>> data packages currently in BioC devel:
>>>>
>>>> License        Nb of packages
>>>> ------------   --------------
>>>> GPL            135
>>>> Artistic-2.0   96
>>>> LGPL           41
>>>> other          4
>>>>
>>>> Would be interesting to hear from other developers about this. For
>>>> example, how do people choose between GPL and Artistic-2.0? Is one
>>>> license typically more appropriate for packaging and redistributing
>>>> data that is already publicly available?
>>>>
>>>> H.
>>>>
>>>>>
>>>>> Best,
>>>>> davide
>>>>>
>>>>> _______________________________________________
>>>>> Bioc-devel@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>
>>>> --
>>>> Hervé Pagès
>>>>
>>>> Program in Computational Biology
>>>> Division of Public Health Sciences
>>>> Fred Hutchinson Cancer Research Center
>>>> 1100 Fairview Ave. N, M1-B514
>>>> P.O. Box 19024
>>>> Seattle, WA 98109-1024
>>>>
>>>> E-mail: hpa...@fredhutch.org
>>>> Phone: (206) 667-5791
>>>> Fax: (206) 667-1319