On Mar 7, 2009, at 14:51, Gerard Bricogne wrote:

> Thank you for your comments on this topic. I think, however, that no amount of format extension or sanity checking will ever replace the ultimate sanity of depositing the images themselves. This would eliminate the many

Depositing the images would certainly be the best solution in the long run. But I don't expect it to happen soon: the software infrastructure isn't there yet, and I suppose most scientists' minds aren't quite ready yet either.

In the meantime, the PDB could improve the quality of its structure factor files with little effort through better sanity checking. Here are two examples I stumbled over recently, both of which would have been very easy to catch with straightforward verification tools:

1) PDB entry 2PL8

Lines 1222ff of the structure factor file are:

1 1 1    5    3    0    6339.16   761.27 o
1 1 1    5    3   -1    6810.46   580.22 o
1 1 1    5    3   -2    2976.58   253.95 f
1 1 1    5    3 -312354       0.85  1051.53 o
1 1 1    5    3   -4    5875.30   500.59 o

This looks like a basic mmCIF conversion mistake, which a simple sanity check on the Miller indices would have detected.
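
To illustrate how little is needed, here is a minimal sketch of such a check in Python. It assumes the whitespace-separated layout of the excerpt above, with h, k, l in the fourth to sixth columns. The function name and the bound of 500 on the absolute index value are my own illustrative choices, not anything the PDB uses; a serious check would derive the bound from the unit cell and the resolution limit.

```python
def find_suspicious_indices(lines, max_abs_index=500):
    """Return (line_number, text) pairs whose Miller indices look implausible.

    Assumes columns 4-6 (1-based) hold h, k, l, as in the excerpt above.
    max_abs_index is a hypothetical, deliberately generous bound.
    """
    suspicious = []
    for line_no, line in enumerate(lines, start=1):
        fields = line.split()
        if len(fields) < 6:
            continue  # not a reflection row in the assumed layout
        try:
            h, k, l = (int(f) for f in fields[3:6])
        except ValueError:
            # An index that is not an integer is itself a conversion error.
            suspicious.append((line_no, line))
            continue
        if max(abs(h), abs(k), abs(l)) > max_abs_index:
            suspicious.append((line_no, line))
    return suspicious


rows = [
    "1 1 1    5    3    0    6339.16   761.27 o",
    "1 1 1    5    3   -1    6810.46   580.22 o",
    "1 1 1    5    3   -2    2976.58   253.95 f",
    "1 1 1    5    3 -312354       0.85  1051.53 o",
    "1 1 1    5    3   -4    5875.30   500.59 o",
]
flagged = find_suspicious_indices(rows)
for line_no, text in flagged:
    print(f"row {line_no}: {text}")
```

Run on the five rows quoted above, this flags only the fourth one.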

2) PDB entry 2P2O

The PDB file says:

REMARK 200  NUMBER OF UNIQUE REFLECTIONS   : 107292
REMARK 200  RESOLUTION RANGE HIGH      (A) : 1.740
REMARK 200  RESOLUTION RANGE LOW       (A) : 50.000
REMARK 200  REJECTION CRITERIA  (SIGMA(I)) : 1.000
REMARK 200
REMARK 200 OVERALL.
REMARK 200  COMPLETENESS FOR RANGE     (%) : 92.3
REMARK 200  DATA REDUNDANCY                : 3.200
REMARK 200  R MERGE                    (I) : NULL
REMARK 200  R SYM                      (I) : 0.04700
REMARK 200  <I/SIGMA(I)> FOR THE DATA SET  : 15.0000

But the structure factor file doesn't agree on the number of reflections:

_reflns.number_all       116714
_reflns.number_obs         7733

A look at the reflections tells a bit more about the discrepancy:

1 1 1 -38 0 3 x ? ? ? ? ? ?
?      ?
1 1 1 -38 0 4 x ? ? ? ? ? ?
?      ?
1 1 1 -38 0 5 x ? ? ? ? ? ?
?      ?
1 1 1 -38 0 6 x ? ? ? ? ? ?
?      ?

All but 7733 of the 116714 listed reflections have status x and no data. It is hard to say whether this is due to a mistake or due to the wish to deposit a structure factor file without actually revealing any data, but in any case a simple sanity check would have detected the discrepancy.
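
A check for this kind of discrepancy could look roughly like the following Python sketch. It assumes the status flags have already been extracted from the reflection records, and it treats every status other than x as carrying data, which is what the excerpt above suggests. The function name and the 5% tolerance are hypothetical choices of mine.

```python
from collections import Counter


def check_reflection_counts(statuses, declared_unique, tolerance=0.05):
    """Compare the number of reflections with data to the declared count.

    statuses: one status flag per listed reflection.
    declared_unique: NUMBER OF UNIQUE REFLECTIONS from the PDB file.
    tolerance: hypothetical relative deviation accepted without complaint.
    """
    counts = Counter(statuses)
    with_data = len(statuses) - counts["x"]  # assumption: x means no data
    problems = []
    if abs(with_data - declared_unique) > tolerance * declared_unique:
        problems.append(
            f"{with_data} reflections with data, "
            f"but {declared_unique} unique reflections declared"
        )
    return with_data, problems


# Counts taken from entry 2P2O as described above.
statuses = ["x"] * (116714 - 7733) + ["o"] * 7733
with_data, problems = check_reflection_counts(statuses, declared_unique=107292)
for message in problems:
    print(message)
```

With the numbers from 2P2O, the 7733 reflections that carry data are far from the 107292 declared in the PDB file, so the check reports a problem.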

> be given top priority. Any half-way house that would pin its hopes on more massaging of reduced data would seem to me pure procrastination, as what is accepted as "reduced" today will be seen as "massacred" tomorrow.

I think the best way would be a firm decision to go for depositing images within a well-defined time frame while at the same time improving the verification of deposited structure factor data.

Konrad.
--
---------------------------------------------------------------------
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: hin...@cnrs-orleans.fr
Web: http://dirac.cnrs-orleans.fr/~hinsen/
---------------------------------------------------------------------
