Dear James,

Perhaps it is time for us to admit that this is too large, expensive and complex a problem for us to resolve without help from one or more of the commercial data managers, such as Google or Amazon. I know that dealing with ads is a nuisance, introducing a loss of time for research, but going nuts trying to recover lost data also costs time. Perhaps we should show a willingness to sell a little of our eyeball time seeing some ads in order to have access to the most cost-effective data management systems currently in existence.

Regards,
Herbert

On Sat, Jul 14, 2018 at 2:23 PM, James Holton <jmhol...@slac.stanford.edu> wrote:

> Why not just upload it to proteindiffraction.org? Or the SBGrid data
> bank (https://data.sbgrid.org/)? Or both for "redundancy"?
>
> Yes, I did once do some calculations on what it would take to preserve
> data for tens of thousands of years, and the only proven storage medium
> for that timescale is clay tablets. Assuming 1 mm^3 is all you need to
> store one bit, it comes to about $3000/GB.
>
> Hard drives, however, are now down to $33/TB, which is comparable to a
> box of pipette tips, and takes up less space. LTO-6 tapes are $3/TB. So
> I don't think the cost of storage is any real burden; it's the cost of
> managing that storage. If you buy a box of 12 TB bare drives, then you
> need to spend a lot of time and effort getting your data onto them, and
> then wondering if they will still work after a few years. Modern drives
> are much more reliable than they used to be, but maybe you want two
> copies? Or a parity disk? What you pay for when you buy a NAS,
> particularly a high-end NAS like NetApp, is the cost and quality of
> management. Rolled into the price of the product is not just redundant
> bits and the wires to connect them, but a team of people who get paid
> to make sure your data are always safe and available.
>
> The question then always comes down to cost/benefit. What is the
> consequence of data loss? What is the probability of data loss? And are
> you feeling lucky?
>
> A few years ago I got a panicked email from a user whom I will not
> name, but this user had just been "Rupp-ed". As in, Bernhard had found a
> deposit of theirs that looked a lot like a fake structure, and asked
> about it. This deposition had been made ten years earlier, and the
> student who did it had left science and could not be reached. This left
> the PI holding the bag. Turns out the student had made a mistake and
> deposited Fcalc instead of Fobs. But how do you prove that? This user
> was VERY happy to find out that I still had their images on DVD. I was
> able to restore them and re-process them in about an hour.
>
> Lucky? Perhaps. Not every beamline at every synchrotron backs up data,
> and not every DVD I've written can be read back. About 3000 images are
> still unrecoverable from those days. On the other hand, there are other
> beamlines that make a point of destroying any traces of user data as
> part of their data protection plan. Most, I think, are
> middle-of-the-road with a data retention policy like "we'll do what we
> can, but can't promise anything". Even at the same synchrotron, policies
> can vary from beamline to beamline. So again: do you feel lucky? Do you?
>
> -James Holton
> MAD Scientist
>
> On 7/13/2018 2:30 AM, Sergei Strelkov wrote:
>
> > Dear All,
> >
> > I believe this question may be of some interest.
> >
> > In the past, we always stored all raw data ever collected by the lab.
> >
> > With the recent advances, such as
> >
> > (a) automated/on-the-fly processing offered by some (European)
> > synchrotrons, and
> >
> > (b) an ongoing discussion on centralized raw data archiving,
> >
> > I wonder if it is time to revise the strict policy of keeping all data
> > (before we invest in a new NAS system...)
> >
> > Best wishes,
> > Sergei
> >
> > Prof. Sergei V. Strelkov
> > Laboratory for Biocrystallography
> > Department of Pharmaceutical Sciences, KU Leuven

########################################################################

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1