Dear all,
The discussion about keeping primary data, and what level of data can
be considered 'primary', has - rather unsurprisingly - come up also in
areas other than structural biology.
An example is next-generation sequencing. A full dataset is a few
terabytes, but post-processing reduces it to sub-Gb size. However, the
post-processed data, as in our case,
have suffered the inadequacy of computational "reduction" ... At least
our institute has decided to create a double back-up of the primary data
in triplicate. For that reason our facility bought
three -80 freezers, one on site in the basement, one on the top floor,
and one off-site, and they keep the DNA to be sequenced. A sequencing
run is already sub-1k$ and it will not become
more expensive. So, if it's important, do it again. It's cheaper and it's
better.
At first sight, that does not apply to MX. Or does it?
So, maybe the question is not "To archive or not to archive" but "What
to archive".
(Similarly, it never crossed my mind whether I should "be or not be" - I
always wondered "what to be".)
A.
On Oct 30, 2011, at 11:59, Kay Diederichs wrote:
At 20:59, Jrh wrote:
...
So:- Universities are now establishing their own institutional
repositories, driven largely by Open Access demands of funders. For
these to host raw datasets that underpin publications is a reasonable
role in my view and indeed they already have this category in the
University of Manchester eScholar system, for example. I am set to
explore locally here whether they would accommodate all our lab's raw
X-ray image datasets per annum that underpin our published crystal
structures.
It would be helpful if readers of this CCP4bb could kindly also
explore with their own universities if they have such an
institutional repository and if raw data sets could be accommodated.
Please do email me off list with this information if you prefer but
within the CCP4bb is also good.
Dear John,
I'm pretty sure that there exists no consistent policy to provide an
"institutional repository" for deposition of scientific data at
German universities or Max-Planck institutes or Helmholtz
institutions; at least, I have never heard of anything like this. More
specifically, our University of Konstanz certainly does not have the
infrastructure to provide this.
I don't think that Germany is the only exception to any rule that an
"institutional repository" is available. Rather, I'm almost amazed
that British and American institutions seem to support this.
Thus I suggest not focusing exclusively on official institutional
repositories, but exploring alternatives: distributed filestores
like Google's Bigtable, BitTorrent or others might be just as
suitable - check out http://en.wikipedia.org/wiki/Distributed_data_store .
I guess that any crystallographic lab could easily sacrifice/donate
a TB of storage for the purposes of this project in 2011 (and
maybe 2 TB in 2012, 3 in 2013, ...), but clearly the level of work
to set this up should be kept as low as possible (a BitTorrent
daemon seems simple enough).
Just my 2 cents,
Kay
Please don't print this e-mail unless you really need to
Anastassis (Tassos) Perrakis, Principal Investigator / Staff Member
Department of Biochemistry (B8)
Netherlands Cancer Institute,
Dept. B8, 1066 CX Amsterdam, The Netherlands
Tel: +31 20 512 1951 Fax: +31 20 512 1954 Mobile / SMS: +31 6 28 597791