On Thu, Aug 16, 2007 at 03:13:29PM +0100, Phil Evans wrote:
> What do you count as raw data? Rawest are the images - everything  
> beyond that is modelling - but archiving images is _expensive_!  

Hmmm - not sure: let's say that a typical dataset requires about 180
images at 10 MB per image. With the current total of roughly 40000
X-ray structures in the PDB this gives:

  40000 * 180 * 10 MB = 72 TB, i.e. ~ 70 TB of data

With simple 1 TB external disks at about GBP 200 each we get a price
of GBP 14000, i.e. 35 pence per dataset.

Ok, this is not a proper calculation (more data collected, fine-phi
slicing, MAD datasets etc etc), so let's apply a 'safety factor' of 10:
even then I think this is easily doable.
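
The back-of-envelope estimate above can be sketched in a few lines of
Python. This is only an illustration: the function name and parameters
are made up here, decimal units are assumed (1 TB = 10^6 MB), and the
disk price is the rough GBP 200 per TB quoted above. Unrounded, the
numbers come out at 72 TB and 36 pence per dataset rather than the
rounded 70 TB and 35 pence.

```python
def archive_cost(n_datasets=40_000, images_per_dataset=180,
                 mb_per_image=10, gbp_per_tb=200, safety_factor=1):
    """Rough storage estimate for archiving raw diffraction images.

    All defaults are the ballpark figures from the message above,
    not measured values. Returns (total TB, total GBP, pence/dataset).
    """
    # Total volume in MB, scaled by the safety factor
    total_mb = n_datasets * images_per_dataset * mb_per_image * safety_factor
    # Assumption: decimal units, 1 TB = 1,000,000 MB
    total_tb = total_mb / 1_000_000
    # Assumption: cost scales linearly with disk capacity
    total_gbp = total_tb * gbp_per_tb
    pence_per_dataset = total_gbp * 100 / n_datasets
    return total_tb, total_gbp, pence_per_dataset


if __name__ == "__main__":
    for factor in (1, 10):
        tb, gbp, pence = archive_cost(safety_factor=factor)
        print(f"safety factor {factor:>2}: {tb:.0f} TB, "
              f"GBP {gbp:.0f}, {pence:.0f} pence per dataset")
```

Even with the safety factor of 10 this stays at a few pounds per
dataset, which is the point of the estimate.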

As Tassos remarked as well: if we could store/deposit and manage PDB
files in the 70s we should be able to do the same now (30 years
later!) with images ... easily.

Cheers

Clemens

> Unmerged intensities are probably more manageable
> 
> Phil
> 
> 
> On  16 Aug 2007, at 15:05, Ashley Buckle wrote:
> 
> >Dear Randy
> >
> >These are very valid points, and I'm so glad you've taken the  
> >important step of initiating this. For now I'd like to respond to  
> >one of them, as it concerns something I and colleagues in Australia  
> >are doing:
> >>
> >>The more information that is available, the easier it will be to  
> >>detect fabrication (because it is harder to make up more  
> >>information convincingly). For instance, if the diffraction data  
> >>are deposited, we can check for consistency with the known  
> >>properties of real macromolecular crystals, e.g. that they contain  
> >>disordered solvent and not vacuum. As Tassos Perrakis has  
> >>discovered, there are characteristic ways in which the standard  
> >>deviations depend on the intensities and the resolution. If  
> >>unmerged data are deposited, there will probably be evidence of  
> >>radiation damage, weak effects from intrinsic anomalous  
> >>scatterers, etc. Raw images are probably even harder to simulate  
> >>convincingly.
> >
> >After the recent Science retractions we realised that it's about  
> >time raw data was made available. So, we have set about creating  
> >the necessary IT and software to do this for our diffraction data,  
> >and are encouraging Australian colleagues to do the same. We are  
> >about a week away from launching a web-accessible repository for  
> >our recently published (e.g. deposited in PDB) data, and this should  
> >coincide with an upcoming publication describing a new structure  
> >from our labs. The aim is that publication occurs simultaneously  
> >with release in PDB as well as raw diffraction data on our website.  
> >We hope to house as much of our data as possible, as well as data  
> >from other Australian labs, but obviously the potential dataset  
> >will be huge, so we are trying to develop, and make available  
> >freely to the community, software tools that allow others to easily  
> >set up their own repositories.  After brief discussion with PDB the  
> >plan is that PDB include links from coordinates/SF's to the raw  
> >data using a simple handle that can be incorporated into a URL.  We  
> >would hope that we can convince the journals that raw data must be  
> >made available at the time of publication, in the same way as  
> >coordinates and structure factors.  Of course, we realise that  
> >there will be many hurdles along the way but we are convinced that  
> >simply making the raw data available ASAP is a 'good thing'.
> >
> >We are happy to share more details of our IT plans with the CCP4BB,  
> >such that they can be improved, and look forward to hearing feedback.
> >
> >cheers
> 

-- 

***************************************************************
* Clemens Vonrhein, Ph.D.     vonrhein AT GlobalPhasing DOT com
*
*  Global Phasing Ltd.
*  Sheraton House, Castle Park 
*  Cambridge CB3 0AX, UK
*--------------------------------------------------------------
* BUSTER Development Group      (http://www.globalphasing.com)
***************************************************************
