On Jan 30, 2012, at 10:28 AM, Jacob Keller wrote:

>> I'm intrigued: how come this apparently excellent idea has not become
>> standard best practice in the 14 years since it was published?
>
> It would seem because too few people know about it, and it is not
> implemented in any software in the usual pipeline. Maybe it could be?
Phenix.model_vs_data calculates a sigmaA vs resolution plot (in
comprehensive validation in the GUI). Pavel would probably have replied
by now, but I don't think the discussion has been cross-posted to the
phenix bb.

Cheers,
Florian

> Perhaps the way to do it would be always to integrate to
> ridiculously-high resolution, give that to Refmac, and, starting from
> lower resolution, to iterate to higher resolution according to the most
> recent sigmaA calculation, and cut off according to some reasonable
> sigmaA value?
>
> JPK
>
>> phx
>>
>> On 30/01/2012 09:40, Randy Read wrote:
>>
>> Hi,
>>
>> Here are a couple of links on the idea of judging resolution by a type
>> of cross-validation with data not used in refinement:
>>
>> Ling et al, 1998: http://pubs.acs.org/doi/full/10.1021/bi971806n
>> Brunger et al, 2008: http://journals.iucr.org/d/issues/2009/02/00/ba5131/index.html
>> (cites earlier relevant papers from Brunger's group)
>>
>> Best wishes,
>>
>> Randy Read
>>
>> On 30 Jan 2012, at 07:09, arka chakraborty wrote:
>>
>> Hi all,
>>
>> In the context of the ongoing discussion, can anybody post links for a
>> few relevant articles?
>>
>> Thanks in advance,
>>
>> ARKO
>>
>> On Mon, Jan 30, 2012 at 3:05 AM, Randy Read <rj...@cam.ac.uk> wrote:
>>>
>>> Just one thing to add to that very detailed response from Ian.
>>>
>>> We've tended to use a slightly different approach to determining a
>>> sensible resolution cutoff, where we judge whether there's useful
>>> information in the highest resolution data by whether it agrees with
>>> calculated structure factors computed from a model that hasn't been
>>> refined against those data. We first did this with the complex of the
>>> Shiga-like toxin B-subunit pentamer with the Gb3 trisaccharide (Ling
>>> et al, 1998). From memory, the point where the average I/sig(I) drops
>>> below 2 was around 3.3A. However, we had a good molecular replacement
>>> model to solve this structure and, after just carrying out rigid-body
>>> refinement, we computed a SigmaA plot using data to the edge of the
>>> detector (somewhere around 2.7A, again from memory). The SigmaA plot
>>> dropped off smoothly to 2.8A resolution, with values well above zero
>>> (indicating significantly better than random agreement), then dropped
>>> suddenly. So we chose 2.8A as the cutoff. Because there were four
>>> pentamers in the asymmetric unit, we could then use 20-fold NCS
>>> averaging, which gave a fantastic map. In this case, the averaging
>>> certainly helped to pull out something very useful from a very weak
>>> signal, because the maps weren't nearly as clear at lower resolution.
>>>
>>> Since then, a number of other people have applied similar tests.
>>> Notably, Axel Brunger has done some careful analysis to show that it
>>> can indeed be useful to take data beyond the conventional limits.
>>>
>>> When you don't have a great MR model, you can do something similar by
>>> limiting the resolution for the initial refinement and rebuilding,
>>> then assessing whether there's useful information at higher resolution
>>> by using the improved model (which hasn't seen the higher resolution
>>> data) to compute Fcalcs. By the way, it's not necessary to use a
>>> SigmaA plot -- the correlation between Fo and Fc probably works just
>>> as well. Note that, when the model has been refined against the lower
>>> resolution data, you'll expect a drop in correlation at the resolution
>>> cutoff you used for refinement, unless you only use the
>>> cross-validation data for the resolution range used in refinement.
>>>
>>> -----
>>> Randy J. Read
>>> Department of Haematology, University of Cambridge
>>> Cambridge Institute for Medical Research    Tel: +44 1223 336500
>>> Wellcome Trust/MRC Building                 Fax: +44 1223 336827
>>> Hills Road                                  E-mail: rj...@cam.ac.uk
>>> Cambridge CB2 0XY, U.K.
>>> www-structmed.cimr.cam.ac.uk
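
The binned Fo/Fc correlation Randy describes is easy to script. A minimal
numpy sketch, assuming you have already extracted arrays fo, fc and
d_star_sq (1/d^2) for the reflections that were held back from refinement;
the array names and the equal-count binning are choices made here, not
anything prescribed above:

    import numpy as np

    def binned_fo_fc_cc(fo, fc, d_star_sq, n_bins=20):
        """Correlation between Fo and Fc in equal-count resolution bins.

        fo, fc    : observed and calculated amplitudes (1-D arrays)
        d_star_sq : 1/d^2 for each reflection
        Returns (d_min_of_bin, cc_of_bin) for plotting CC vs resolution."""
        order = np.argsort(d_star_sq)               # low to high resolution
        fo, fc, s2 = fo[order], fc[order], d_star_sq[order]
        chunks = np.array_split(np.arange(fo.size), n_bins)
        d_min, cc = [], []
        for idx in chunks:
            d_min.append(1.0 / np.sqrt(s2[idx].max()))        # bin's high-res limit
            cc.append(np.corrcoef(fo[idx], fc[idx])[0, 1])    # Pearson CC(Fo, Fc)
        return np.array(d_min), np.array(cc)

    # Values well above zero indicate better-than-random agreement in that
    # bin; a sudden drop towards zero marks a sensible place for the cutoff,
    # much like the SigmaA plot described above.
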
>>> On 29 Jan 2012, at 17:25, Ian Tickle wrote:
>>>
>>>> Jacob, here's my (personal) take on this:
>>>>
>>>> The data quality metrics that everyone uses clearly fall into 2
>>>> classes: 'consistency' metrics, i.e. Rmerge/meas/pim and CC(1/2),
>>>> which measure how well redundant observations agree, and signal/noise
>>>> ratio metrics, i.e. mean(I/sigma) and completeness, which relate to
>>>> the information content of the data.
>>>>
>>>> IMO the basic problem with all the consistency metrics is that they
>>>> are not measuring the quantity that is relevant to refinement and
>>>> electron density maps, namely the information content of the data, at
>>>> least not in a direct and meaningful way. This is because there are 2
>>>> contributors to any consistency metric: the systematic errors (e.g.
>>>> differences in illuminated volume and absorption) and the random
>>>> errors (from counting statistics, detector noise etc.). If the data
>>>> are collected with sufficient redundancy the systematic errors should
>>>> hopefully largely cancel, and therefore only the random errors will
>>>> determine the information content. Therefore the systematic error
>>>> component of the consistency measure (which I suspect is the biggest
>>>> component, at least for the strong reflections) is not relevant to
>>>> measuring the information content. If the consistency measure only
>>>> took into account the random error component (which it can't), then
>>>> it would essentially be a measure of information content, if only
>>>> indirectly (but then why not simply use a direct measure such as the
>>>> signal/noise ratio?).
>>>>
>>>> There are clearly at least 2 distinct problems with Rmerge: first,
>>>> it includes systematic errors in its measure of consistency; second,
>>>> it's not invariant with respect to the redundancy (and third, it's
>>>> useless as a statistic anyway because you can't do any significance
>>>> tests on it!). The redundancy problem is fixed to some extent with
>>>> Rpim etc., but that still leaves the other problems. It's not clear
>>>> to me that CC(1/2) is any better in this respect, since (as far as I
>>>> understand how it's implemented) one cannot be sure that the
>>>> systematic errors will cancel for each half-dataset Imean, so it's
>>>> still likely to contain a large contribution from the irrelevant
>>>> systematic error component and so mislead in respect of the real data
>>>> quality in exactly the same way that Rmerge/meas/pim do. One may as
>>>> well use the Rmerge between the half-dataset Imeans, since there
>>>> would be no redundancy effect (i.e. the redundancy would be 2 for all
>>>> included reflections).
>>>>
>>>> I did some significance tests on CC(1/2) and I got silly results; for
>>>> example, it says that the significance level for the CC is ~ 0.1, but
>>>> this corresponded to a huge Rmerge (200%) and a tiny mean(I/sigma)
>>>> (0.4). It seems that (without any basis in statistics whatsoever) the
>>>> rule-of-thumb CC > 0.5 is what is generally used, but I would be
>>>> worried that the statistics are so far divorced from the reality - it
>>>> suggests that something is seriously wrong with the assumptions!
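
For reference, the textbook significance test for a Pearson correlation
coefficient (presumably something close to what Ian ran, though that is my
assumption) is simple to reproduce, and it does give numbers of the order
he quotes. A small Python/scipy sketch, where n is the number of
half-dataset pairs in the shell:

    import numpy as np
    from scipy import stats

    def cc_significance_threshold(n, alpha=0.05):
        """Smallest CC that differs significantly from zero for n
        half-dataset pairs, via the usual t-test on a Pearson correlation."""
        t_crit = stats.t.ppf(1.0 - alpha, df=n - 2)        # one-sided test
        return t_crit / np.sqrt(n - 2 + t_crit ** 2)

    # For a shell of a few hundred reflection pairs this comes out around
    # 0.1, far below the ad hoc CC(1/2) > 0.5 rule of thumb - which is the
    # mismatch between formal significance and practical usefulness that
    # Ian is pointing at.
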
>>>> Having said all that, the mean(I/sigma) metric, which on the face of
>>>> it is much more closely related to the information content and
>>>> therefore should be a more relevant metric than Rmerge/meas/pim &
>>>> CC(1/2), is not without its own problems (which probably explains the
>>>> continuing popularity of the other metrics!). First and most obvious,
>>>> it's a hostage to the estimate of sigma(I) used. I've never been
>>>> happy with inflating the counting sigmas to include effects of
>>>> systematic error based on the consistency of redundant measurements,
>>>> since, as I indicated above, if the data are collected redundantly in
>>>> such a way that the systematic errors largely cancel, it implies that
>>>> the systematic errors should not be included in the estimate of
>>>> sigma. The fact that the sigma(I)'s would then generally be smaller
>>>> (at least for the large I's), so that the sample variances would be
>>>> much larger than the counting variances, is irrelevant, because the
>>>> former include the systematic errors. Also, the I/sigma cut-off used
>>>> would probably not need to be changed, since it affects only the
>>>> weakest reflections, which are largely unaffected by the systematic
>>>> error correction.
>>>>
>>>> The second problem with mean(I/sigma) is also obvious: it's a mean,
>>>> and as such it's rather insensitive to the actual distribution of
>>>> I/sigma(I). For example, if a shell contained a few highly
>>>> significant intensities, these could be overwhelmed by a large number
>>>> of weak data and give an insignificant mean(I/sigma). It seems to me
>>>> that one should be considering the significance of individual
>>>> reflections, not the shell averages. Also, the average will depend on
>>>> the width of the resolution bin, so one will get the strange effect
>>>> that the apparent resolution depends on how one bins the data! The
>>>> assumption being made in taking the bin average is that I/sigma(I)
>>>> falls off smoothly with d*, but that's unlikely to be the reality.
>>>>
>>>> It seems to me that a chi-square statistic which takes into account
>>>> the actual distribution of I/sigma(I) would be a better bet than the
>>>> bin average, though it's not entirely clear how one would formulate
>>>> such a metric. One would have to consider subsets of the data as a
>>>> whole, sorted by increasing d* (i.e. not in resolution bins, to avoid
>>>> the 'bin averaging effect' described above), and apply the resolution
>>>> cut-off where the chi-square statistic has maximum probability. This
>>>> would automatically take care of incompleteness effects, since all
>>>> unmeasured reflections would be included with I/sigma = 0 just for
>>>> the purposes of working out the cut-off point. I've skipped the
>>>> details of implementation and I've no idea how it would work in
>>>> practice!
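
Just to make that suggestion concrete, here is one way such a statistic
could be set up; this is a sketch only, since Ian deliberately leaves the
formulation open, and the variable names and the alpha level are
illustrative choices. Reflections are sorted by d*, unmeasured ones are
included with I/sigma = 0, and the cutoff is placed at the highest
resolution at which the remaining high-resolution tail of the data still
differs significantly from pure noise:

    import numpy as np
    from scipy import stats

    def chi2_resolution_cutoff(i_over_sig, d_star, alpha=0.01):
        """Sketch of a cumulative chi-square resolution test.

        i_over_sig : I/sigma(I) for every reflection, with unmeasured
                     reflections already present as zeros
        d_star     : 1/d for each reflection
        Under the null hypothesis I = 0 (and correct sigmas), the sum of
        (I/sigma)^2 over any subset is approximately chi-square distributed
        with one degree of freedom per reflection."""
        order = np.argsort(d_star)
        z2 = np.asarray(i_over_sig, dtype=float)[order] ** 2
        tail_chi2 = np.cumsum(z2[::-1])[::-1]            # sum over d* >= d*[j]
        tail_n = np.arange(z2.size, 0, -1)               # reflections in each tail
        p_noise = stats.chi2.sf(tail_chi2, df=tail_n)    # P(tail is pure noise)
        signif = np.nonzero(p_noise < alpha)[0]          # tails with real signal
        j = signif.max() if signif.size else 0
        # unmeasured reflections add degrees of freedom but no signal, so
        # incompleteness automatically pushes the cutoff to lower resolution
        return 1.0 / np.asarray(d_star)[order][j]        # proposed d_min

Whether this behaves sensibly on real data is exactly the open question;
misestimated sigmas will bias it just as they bias mean(I/sigma).
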
>>>> An obvious question is: do we really need to worry about the exact
>>>> cut-off anyway - won't our sophisticated maximum likelihood
>>>> refinement programs handle the weak data correctly? Note that in
>>>> theory weak intensities should be handled correctly; however, the
>>>> problem may instead lie with incorrectly estimated sigmas: these are
>>>> obviously much more of an issue for any software which depends
>>>> critically on accurate estimates of uncertainty! I did some tests
>>>> where I refined data for a known protein-ligand complex using the
>>>> original apo model, and looked at the difference density for the
>>>> ligand, using data cut at 2.5, 2 and 1.5 Ang, where the standard
>>>> metrics strongly suggested there was only data to 2.5 Ang.
>>>>
>>>> I have to say that the differences were tiny, well below what I would
>>>> deem significant (i.e. not only the map resolutions but all the map
>>>> details were essentially the same), and certainly I would question
>>>> whether it was worth all the soul-searching on this topic over the
>>>> years! So it seems that the refinement programs do indeed handle weak
>>>> data correctly, but I guess this should hardly come as a surprise
>>>> (but well done to the software developers anyway!). This was actually
>>>> using Buster: Refmac seems to have more of a problem with scaling &
>>>> TLS if you include a load of high-resolution junk data. However,
>>>> before anyone acts on this information I would _very_ strongly advise
>>>> them to repeat the experiment and verify the results for themselves!
>>>> The bottom line may be that the actual cut-off used only matters for
>>>> the purpose of quoting the true resolution of the map, but it doesn't
>>>> significantly affect the appearance of the map itself.
>>>>
>>>> Finally, an effect which confounds all the quality metrics is data
>>>> anisotropy: ideally the cut-off surface of significance in reciprocal
>>>> space should perhaps be an ellipsoid, not a sphere. I know there are
>>>> several programs for anisotropic scaling, but I'm not aware of any
>>>> that apply anisotropic resolution cutoffs (or even whether this would
>>>> be advisable).
>>>>
>>>> Cheers
>>>>
>>>> -- Ian
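
On that last point, the ellipsoidal test itself is at least simple to write
down, whatever the merits of actually applying it. A sketch, assuming for
simplicity an orthogonal cell with the ellipsoid axes along a*, b*, c*; a
real implementation would need the full reciprocal metric and the principal
directions found by the anisotropic scaling:

    import numpy as np

    def ellipsoidal_mask(hkl, cell_abc, d_limits):
        """Boolean mask selecting reflections inside an ellipsoidal
        resolution boundary with high-resolution limits
        d_limits = (d_a, d_b, d_c) along the reciprocal axes of an
        orthogonal cell (a, b, c in Angstrom). With d_a = d_b = d_c this
        reduces to the usual spherical cutoff."""
        hkl = np.asarray(hkl, dtype=float)
        s = hkl / np.asarray(cell_abc, dtype=float)    # (h/a, k/b, l/c)
        r = s * np.asarray(d_limits, dtype=float)      # scale by per-axis d limit
        return (r ** 2).sum(axis=1) <= 1.0             # True = keep reflection

For example, ellipsoidal_mask(hkl, (60., 80., 100.), (2.0, 2.3, 2.8)) would
keep a 2.0 A limit along a* but trim to 2.8 A along c* (numbers purely for
illustration).
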
>>>> On 27 January 2012 17:47, Jacob Keller <j-kell...@fsm.northwestern.edu> wrote:
>>>>> Dear Crystallographers,
>>>>>
>>>>> I cannot think why any of the various flavors of Rmerge/meas/pim
>>>>> should be used as a data cutoff and not simply I/sigma--can somebody
>>>>> make a good argument or point me to a good reference? My thinking is
>>>>> that signal:noise of >2 is definitely still signal, no matter what
>>>>> the R values are. Am I wrong? I was thinking also possibly the R
>>>>> value cutoff was a historical accident/expedient from when one tried
>>>>> to limit the amount of data in the face of limited computational
>>>>> power--true? So perhaps now, when the computers are so much more
>>>>> powerful, we have the luxury of including more weak data?
>>>>>
>>>>> JPK
>>>>>
>>>>> --
>>>>> *******************************************
>>>>> Jacob Pearson Keller
>>>>> Northwestern University
>>>>> Medical Scientist Training Program
>>>>> email: j-kell...@northwestern.edu
>>>>> *******************************************
>>
>> --
>> ARKA CHAKRABORTY
>> CAS in Crystallography and Biophysics
>> University of Madras
>> Chennai, India
>>
>> ------
>> Randy J. Read
>> Department of Haematology, University of Cambridge
>> Cambridge Institute for Medical Research    Tel: +44 1223 336500
>> Wellcome Trust/MRC Building                 Fax: +44 1223 336827
>> Hills Road                                  E-mail: rj...@cam.ac.uk
>> Cambridge CB2 0XY, U.K.
>> www-structmed.cimr.cam.ac.uk
>
> --
> *******************************************
> Jacob Pearson Keller
> Northwestern University
> Medical Scientist Training Program
> email: j-kell...@northwestern.edu
> *******************************************