I'm glad that this discussion has finally started, and would only like to comment on the practicability of storing images.

Mischa Machius wrote:
I don't think archiving images would be that expensive. For one, I have found that most formats can be compressed quite substantially using simple, standard procedures like bzip2. If optimized, raw images won't take up that much space. Also, initially, only those images that have been used to obtain phases and to refine finally deposited structures could be archived. If the average structure takes up 20GB of space,

that's on the high side I'd say; I would have estimated 1.5 GB (native alone) to 5 GB for e.g. a native and 3 wavelengths (after bzip2).

5,000 structures would be 1TB, which fits on a single hard drive for

5,000 structures of 20GB would be 100 TB

If the PDB were to require all images of a _single_ dataset for molecular-replacement structures or mutant studies, and all images of all wavelengths/derivatives for experimentally phased structures, that would come to roughly (40,000 X-ray structures) * (on average 2 GB per structure) = 80 TB of data. At €250 per TB, that would be €20,000 - an estimate of what it takes to store all the raw data for _all_ the X-ray structures in the PDB - less than what a single protein cloning/purification/crystallization/structure project costs per year.
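For anyone who wants to redo the arithmetic with their own numbers, here is a minimal Python sketch. The structure counts, per-structure sizes and the €250 per TB price are the figures quoted in this thread; the function names are made up for illustration, and decimal units (1 TB = 1000 GB) are an assumption.

```python
# Back-of-the-envelope storage estimate using the figures from this thread.
# Assumption: decimal units, 1 TB = 1000 GB, as used for hard drive capacities.
GB_PER_TB = 1000


def archive_size_tb(n_structures, gb_per_structure):
    """Total archive size in TB for a given number of structures."""
    return n_structures * gb_per_structure / GB_PER_TB


def archive_cost_eur(size_tb, eur_per_tb=250):
    """Disk cost in EUR at a flat price per TB (hypothetical helper)."""
    return size_tb * eur_per_tb


# The quoted scenario: 5,000 structures at 20 GB each.
quoted_tb = archive_size_tb(5_000, 20)

# The whole-PDB scenario: 40,000 X-ray structures at 2 GB each on average.
pdb_tb = archive_size_tb(40_000, 2)
pdb_cost = archive_cost_eur(pdb_tb)

print(f"5,000 structures x 20 GB = {quoted_tb:.0f} TB")
print(f"40,000 structures x 2 GB = {pdb_tb:.0f} TB, about EUR {pdb_cost:,.0f}")
```

Changing the per-structure size or the price per TB shows how sensitive the total is to either assumption.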


less than $400. If the community thinks this is a worthwhile endeavor, money should be available from granting agencies to establish a central repository (e.g., at the RCSB). Imagine what could be done with as little as $50,000. For large detectors, binning could be used, but given current hard drive prices and future developments, that won't be necessary. Best - MM


Archiving images is quite practical even for those data that do not directly correspond to deposited PDB entries. In 1999 we abandoned tape storage of raw data in favor of disk storage. Everything we have collected at synchrotrons since then still fits on two 750GB disks. In 2000 we also needed two disks, and we have upgraded the disks whenever the old ones filled up. Having these data online means that one can easily look at them again, to test data reduction and phasing programs, and to try to solve, with new programs, those structures whose crystals could never be reproduced.

just my 2 cents -

Kay Diederichs
--
Kay Diederichs                 http://strucbio.biologie.uni-konstanz.de
email: [EMAIL PROTECTED]     Tel +49 7531 88 4049 Fax 3183
Fachbereich Biologie, Universitaet Konstanz, Box M647, D-78457 Konstanz
