On Jan 30, 2012, at 10:28 AM, Jacob Keller wrote:

>> I'm intrigued: how come this apparently excellent idea has not become
>> standard best practice in the 14 years since it was published?
>
> It would seem because too few people know about it, and it is not
> implemented in any software in the usual pipeline. Maybe it could be?
Phenix.model_vs_data calculates a sigmaA vs resolution plot (in
comprehensive validation in the GUI). Pavel would probably have replied
by now, but I don't think the discussion has been cross-posted to the
phenix bb.

Cheers,
Florian

> Perhaps the way to do it would be always to integrate to
> ridiculously-high resolution, give that to Refmac, and, starting from
> lower resolution, to iterate to higher resolution according to the most
> recent sigmaA calculation, and cut off according to some reasonable
> sigmaA value?
>
> JPK
>
>> phx
>>
>> On 30/01/2012 09:40, Randy Read wrote:
>>
>> Hi,
>>
>> Here are a couple of links on the idea of judging resolution by a type
>> of cross-validation with data not used in refinement:
>>
>> Ling et al, 1998: http://pubs.acs.org/doi/full/10.1021/bi971806n
>> Brunger et al, 2008: http://journals.iucr.org/d/issues/2009/02/00/ba5131/index.html
>> (cites earlier relevant papers from Brunger's group)
>>
>> Best wishes,
>>
>> Randy Read
>>
>> On 30 Jan 2012, at 07:09, arka chakraborty wrote:
>>
>> Hi all,
>>
>> In the context of the ongoing discussion, can anybody post links for a
>> few relevant articles?
>>
>> Thanks in advance,
>>
>> ARKO
>>
>> On Mon, Jan 30, 2012 at 3:05 AM, Randy Read <rj...@cam.ac.uk> wrote:
>>>
>>> Just one thing to add to that very detailed response from Ian.
>>>
>>> We've tended to use a slightly different approach to determining a
>>> sensible resolution cutoff, where we judge whether there's useful
>>> information in the highest resolution data by whether it agrees with
>>> calculated structure factors computed from a model that hasn't been
>>> refined against those data. We first did this with the complex of the
>>> Shiga-like toxin B-subunit pentamer with the Gb3 trisaccharide (Ling
>>> et al, 1998). From memory, the point where the average I/sig(I) drops
>>> below 2 was around 3.3A. However, we had a good molecular replacement
>>> model to solve this structure and, after just carrying out rigid-body
>>> refinement, we computed a SigmaA plot using data to the edge of the
>>> detector (somewhere around 2.7A, again from memory). The SigmaA plot
>>> dropped off smoothly to 2.8A resolution, with values well above zero
>>> (indicating significantly better than random agreement), then dropped
>>> suddenly. So we chose 2.8A as the cutoff. Because there were four
>>> pentamers in the asymmetric unit, we could then use 20-fold NCS
>>> averaging, which gave a fantastic map. In this case, the averaging
>>> certainly helped to pull out something very useful from a very weak
>>> signal, because the maps weren't nearly as clear at lower resolution.
>>>
>>> Since then, a number of other people have applied similar tests.
>>> Notably, Axel Brunger has done some careful analysis to show that it
>>> can indeed be useful to take data beyond the conventional limits.
>>>
>>> When you don't have a great MR model, you can do something similar by
>>> limiting the resolution for the initial refinement and rebuilding,
>>> then assessing whether there's useful information at higher resolution
>>> by using the improved model (which hasn't seen the higher resolution
>>> data) to compute Fcalcs. By the way, it's not necessary to use a
>>> SigmaA plot -- the correlation between Fo and Fc probably works just
>>> as well. Note that, when the model has been refined against the lower
>>> resolution data, you'll expect a drop in correlation at the resolution
>>> cutoff you used for refinement, unless you only use the
>>> cross-validation data for the resolution range used in refinement.
>>>
>>> -----
>>> Randy J. Read
>>> Department of Haematology, University of Cambridge
>>> Cambridge Institute for Medical Research    Tel: +44 1223 336500
>>> Wellcome Trust/MRC Building                 Fax: +44 1223 336827
>>> Hills Road                                  E-mail: rj...@cam.ac.uk
>>> Cambridge CB2 0XY, U.K.
>>> www-structmed.cimr.cam.ac.uk
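
The binned Fo/Fc correlation Randy describes is easy to script. A minimal
numpy sketch, assuming you have already extracted arrays fo, fc and
d_star_sq (1/d^2) for the reflections that were held back from refinement;
the array names and the equal-count binning are choices made here, not
anything prescribed above:

    import numpy as np

    def binned_fo_fc_cc(fo, fc, d_star_sq, n_bins=20):
        """Correlation between Fo and Fc in equal-count resolution bins.

        fo, fc    : observed and calculated amplitudes (1-D arrays)
        d_star_sq : 1/d^2 for each reflection
        Returns (d_min_of_bin, cc_of_bin) for plotting CC vs resolution."""
        order = np.argsort(d_star_sq)               # low to high resolution
        fo, fc, s2 = fo[order], fc[order], d_star_sq[order]
        chunks = np.array_split(np.arange(fo.size), n_bins)
        d_min, cc = [], []
        for idx in chunks:
            d_min.append(1.0 / np.sqrt(s2[idx].max()))        # bin's high-res limit
            cc.append(np.corrcoef(fo[idx], fc[idx])[0, 1])    # Pearson CC(Fo, Fc)
        return np.array(d_min), np.array(cc)

    # Values well above zero indicate better-than-random agreement in that
    # bin; a sudden drop towards zero marks a sensible place for the cutoff,
    # much like the SigmaA plot described above.
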
>>> On 29 Jan 2012, at 17:25, Ian Tickle wrote:
>>>
>>>> Jacob, here's my (personal) take on this:
>>>>
>>>> The data quality metrics that everyone uses clearly fall into 2
>>>> classes: 'consistency' metrics, i.e. Rmerge/meas/pim and CC(1/2),
>>>> which measure how well redundant observations agree, and signal/noise
>>>> ratio metrics, i.e. mean(I/sigma) and completeness, which relate to
>>>> the information content of the data.
>>>>
>>>> IMO the basic problem with all the consistency metrics is that they
>>>> are not measuring the quantity that is relevant to refinement and
>>>> electron density maps, namely the information content of the data, at
>>>> least not in a direct and meaningful way. This is because there are 2
>>>> contributors to any consistency metric: the systematic errors (e.g.
>>>> differences in illuminated volume and absorption) and the random
>>>> errors (from counting statistics, detector noise etc.). If the data
>>>> are collected with sufficient redundancy the systematic errors should
>>>> hopefully largely cancel, and therefore only the random errors will
>>>> determine the information content. Therefore the systematic error
>>>> component of the consistency measure (which I suspect is the biggest
>>>> component, at least for the strong reflections) is not relevant to
>>>> measuring the information content. If the consistency measure only
>>>> took into account the random error component (which it can't), then
>>>> it would essentially be a measure of information content, if only
>>>> indirectly (but then why not simply use a direct measure such as the
>>>> signal/noise ratio?).
>>>>
>>>> There are clearly at least 2 distinct problems with Rmerge: first,
>>>> it includes systematic errors in its measure of consistency; second,
>>>> it's not invariant with respect to the redundancy (and third, it's
>>>> useless as a statistic anyway because you can't do any significance
>>>> tests on it!). The redundancy problem is fixed to some extent with
>>>> Rpim etc., but that still leaves the other problems. It's not clear
>>>> to me that CC(1/2) is any better in this respect, since (as far as I
>>>> understand how it's implemented) one cannot be sure that the
>>>> systematic errors will cancel for each half-dataset Imean, so it's
>>>> still likely to contain a large contribution from the irrelevant
>>>> systematic error component and so mislead in respect of the real data
>>>> quality in exactly the same way that Rmerge/meas/pim do. One may as
>>>> well use the Rmerge between the half-dataset Imeans, since there
>>>> would be no redundancy effect (i.e. the redundancy would be 2 for all
>>>> included reflections).
>>>>
>>>> I did some significance tests on CC(1/2) and I got silly results; for
>>>> example, it says that the significance level for the CC is ~ 0.1, but
>>>> this corresponded to a huge Rmerge (200%) and a tiny mean(I/sigma)
>>>> (0.4). It seems that (without any basis in statistics whatsoever) the
>>>> rule-of-thumb CC > 0.5 is what is generally used, but I would be
>>>> worried that the statistics are so far divorced from the reality - it
>>>> suggests that something is seriously wrong with the assumptions!
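
For reference, the textbook significance test for a Pearson correlation
coefficient (presumably something close to what Ian ran, though that is my
assumption) is simple to reproduce, and it does give numbers of the order
he quotes. A small Python/scipy sketch, where n is the number of
half-dataset pairs in the shell:

    import numpy as np
    from scipy import stats

    def cc_significance_threshold(n, alpha=0.05):
        """Smallest CC that differs significantly from zero for n
        half-dataset pairs, via the usual t-test on a Pearson correlation."""
        t_crit = stats.t.ppf(1.0 - alpha, df=n - 2)        # one-sided test
        return t_crit / np.sqrt(n - 2 + t_crit ** 2)

    # For a shell of a few hundred reflection pairs this comes out around
    # 0.1, far below the ad hoc CC(1/2) > 0.5 rule of thumb - which is the
    # mismatch between formal significance and practical usefulness that
    # Ian is pointing at.
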
>>>> Having said all that, the mean(I/sigma) metric, which on the face of
>>>> it is much more closely related to the information content and
>>>> therefore should be a more relevant metric than Rmerge/meas/pim &
>>>> CC(1/2), is not without its own problems (which probably explains the
>>>> continuing popularity of the other metrics!). First and most obvious,
>>>> it's a hostage to the estimate of sigma(I) used. I've never been
>>>> happy with inflating the counting sigmas to include effects of
>>>> systematic error based on the consistency of redundant measurements,
>>>> since, as I indicated above, if the data are collected redundantly in
>>>> such a way that the systematic errors largely cancel, it implies that
>>>> the systematic errors should not be included in the estimate of
>>>> sigma. The fact that the sigma(I)'s would then generally be smaller
>>>> (at least for the large I's), so that the sample variances would be
>>>> much larger than the counting variances, is irrelevant, because the
>>>> former include the systematic errors. Also, the I/sigma cut-off used
>>>> would probably not need to be changed, since it affects only the
>>>> weakest reflections, which are largely unaffected by the systematic
>>>> error correction.
>>>>
>>>> The second problem with mean(I/sigma) is also obvious: it's a mean,
>>>> and as such it's rather insensitive to the actual distribution of
>>>> I/sigma(I). For example, if a shell contained a few highly
>>>> significant intensities, these could be overwhelmed by a large number
>>>> of weak data and give an insignificant mean(I/sigma). It seems to me
>>>> that one should be considering the significance of individual
>>>> reflections, not the shell averages. Also, the average will depend on
>>>> the width of the resolution bin, so one will get the strange effect
>>>> that the apparent resolution depends on how one bins the data! The
>>>> assumption being made in taking the bin average is that I/sigma(I)
>>>> falls off smoothly with d*, but that's unlikely to be the reality.
>>>>
>>>> It seems to me that a chi-square statistic which takes into account
>>>> the actual distribution of I/sigma(I) would be a better bet than the
>>>> bin average, though it's not entirely clear how one would formulate
>>>> such a metric. One would have to consider subsets of the data as a
>>>> whole, sorted by increasing d* (i.e. not in resolution bins, to avoid
>>>> the 'bin averaging effect' described above), and apply the resolution
>>>> cut-off where the chi-square statistic has maximum probability. This
>>>> would automatically take care of incompleteness effects, since all
>>>> unmeasured reflections would be included with I/sigma = 0 just for
>>>> the purposes of working out the cut-off point. I've skipped the
>>>> details of implementation and I've no idea how it would work in
>>>> practice!
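
Just to make that suggestion concrete, here is one way such a statistic
could be set up; this is a sketch only, since Ian deliberately leaves the
formulation open, and the variable names and the alpha level are
illustrative choices. Reflections are sorted by d*, unmeasured ones are
included with I/sigma = 0, and the cutoff is placed at the highest
resolution at which the remaining high-resolution tail of the data still
differs significantly from pure noise:

    import numpy as np
    from scipy import stats

    def chi2_resolution_cutoff(i_over_sig, d_star, alpha=0.01):
        """Sketch of a cumulative chi-square resolution test.

        i_over_sig : I/sigma(I) for every reflection, with unmeasured
                     reflections already present as zeros
        d_star     : 1/d for each reflection
        Under the null hypothesis I = 0 (and correct sigmas), the sum of
        (I/sigma)^2 over any subset is approximately chi-square distributed
        with one degree of freedom per reflection."""
        order = np.argsort(d_star)
        z2 = np.asarray(i_over_sig, dtype=float)[order] ** 2
        tail_chi2 = np.cumsum(z2[::-1])[::-1]            # sum over d* >= d*[j]
        tail_n = np.arange(z2.size, 0, -1)               # reflections in each tail
        p_noise = stats.chi2.sf(tail_chi2, df=tail_n)    # P(tail is pure noise)
        signif = np.nonzero(p_noise < alpha)[0]          # tails with real signal
        j = signif.max() if signif.size else 0
        # unmeasured reflections add degrees of freedom but no signal, so
        # incompleteness automatically pushes the cutoff to lower resolution
        return 1.0 / np.asarray(d_star)[order][j]        # proposed d_min

Whether this behaves sensibly on real data is exactly the open question;
misestimated sigmas will bias it just as they bias mean(I/sigma).
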
>>>> An obvious question is: do we really need to worry about the exact
>>>> cut-off anyway - won't our sophisticated maximum likelihood
>>>> refinement programs handle the weak data correctly? Note that in
>>>> theory weak intensities should be handled correctly; however, the
>>>> problem may instead lie with incorrectly estimated sigmas: these are
>>>> obviously much more of an issue for any software which depends
>>>> critically on accurate estimates of uncertainty! I did some tests
>>>> where I refined data for a known protein-ligand complex using the
>>>> original apo model, and looked at the difference density for the
>>>> ligand, using data cut at 2.5, 2 and 1.5 Ang, where the standard
>>>> metrics strongly suggested there was only data to 2.5 Ang.
>>>>
>>>> I have to say that the differences were tiny, well below what I would
>>>> deem significant (i.e. not only the map resolutions but all the map
>>>> details were essentially the same), and certainly I would question
>>>> whether it was worth all the soul-searching on this topic over the
>>>> years! So it seems that the refinement programs do indeed handle weak
>>>> data correctly, but I guess this should hardly come as a surprise
>>>> (but well done to the software developers anyway!). This was actually
>>>> using Buster: Refmac seems to have more of a problem with scaling &
>>>> TLS if you include a load of high-resolution junk data. However,
>>>> before anyone acts on this information I would _very_ strongly advise
>>>> them to repeat the experiment and verify the results for themselves!
>>>> The bottom line may be that the actual cut-off used only matters for
>>>> the purpose of quoting the true resolution of the map, but it doesn't
>>>> significantly affect the appearance of the map itself.
>>>>
>>>> Finally, an effect which confounds all the quality metrics is data
>>>> anisotropy: ideally the cut-off surface of significance in reciprocal
>>>> space should perhaps be an ellipsoid, not a sphere. I know there are
>>>> several programs for anisotropic scaling, but I'm not aware of any
>>>> that apply anisotropic resolution cutoffs (or even whether this would
>>>> be advisable).
>>>>
>>>> Cheers
>>>>
>>>> -- Ian
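
On that last point, the ellipsoidal test itself is at least simple to write
down, whatever the merits of actually applying it. A sketch, assuming for
simplicity an orthogonal cell with the ellipsoid axes along a*, b*, c*; a
real implementation would need the full reciprocal metric and the principal
directions found by the anisotropic scaling:

    import numpy as np

    def ellipsoidal_mask(hkl, cell_abc, d_limits):
        """Boolean mask selecting reflections inside an ellipsoidal
        resolution boundary with high-resolution limits
        d_limits = (d_a, d_b, d_c) along the reciprocal axes of an
        orthogonal cell (a, b, c in Angstrom). With d_a = d_b = d_c this
        reduces to the usual spherical cutoff."""
        hkl = np.asarray(hkl, dtype=float)
        s = hkl / np.asarray(cell_abc, dtype=float)    # (h/a, k/b, l/c)
        r = s * np.asarray(d_limits, dtype=float)      # scale by per-axis d limit
        return (r ** 2).sum(axis=1) <= 1.0             # True = keep reflection

For example, ellipsoidal_mask(hkl, (60., 80., 100.), (2.0, 2.3, 2.8)) would
keep a 2.0 A limit along a* but trim to 2.8 A along c* (numbers purely for
illustration).
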
>>>> On 27 January 2012 17:47, Jacob Keller <j-kell...@fsm.northwestern.edu> wrote:
>>>>> Dear Crystallographers,
>>>>>
>>>>> I cannot think why any of the various flavors of Rmerge/meas/pim
>>>>> should be used as a data cutoff and not simply I/sigma--can somebody
>>>>> make a good argument or point me to a good reference? My thinking is
>>>>> that signal:noise of >2 is definitely still signal, no matter what
>>>>> the R values are. Am I wrong? I was thinking also possibly the R
>>>>> value cutoff was a historical accident/expedient from when one tried
>>>>> to limit the amount of data in the face of limited computational
>>>>> power--true? So perhaps now, when the computers are so much more
>>>>> powerful, we have the luxury of including more weak data?
>>>>>
>>>>> JPK
>>>>>
>>>>> --
>>>>> *******************************************
>>>>> Jacob Pearson Keller
>>>>> Northwestern University
>>>>> Medical Scientist Training Program
>>>>> email: j-kell...@northwestern.edu
>>>>> *******************************************
>>
>> --
>> ARKA CHAKRABORTY
>> CAS in Crystallography and Biophysics
>> University of Madras
>> Chennai, India
>>
>> ------
>> Randy J. Read
>> Department of Haematology, University of Cambridge
>> Cambridge Institute for Medical Research    Tel: +44 1223 336500
>> Wellcome Trust/MRC Building                 Fax: +44 1223 336827
>> Hills Road                                  E-mail: rj...@cam.ac.uk
>> Cambridge CB2 0XY, U.K.
>> www-structmed.cimr.cam.ac.uk
>
> --
> *******************************************
> Jacob Pearson Keller
> Northwestern University
> Medical Scientist Training Program
> email: j-kell...@northwestern.edu
> *******************************************