Re: [ccp4bb] Review: Linearity and Resolution in X-Ray Crystallography and Electron Microscopy

Randy John Read Thu, 10 Oct 2024 02:13:01 -0700

Hi,

Ian makes a good case that “resolution” already has a useful definition. I 
would hesitate to take any single number, including the half-bit-per-reflection 
criterion, as a measure of diffraction limit, because there’s useful 
information beyond that point, especially for crystals that diffract 
anisotropically. In Phaser, we use reflections with an information gain down to 
as little as 0.01, because the ones between 0.01 and 0.5 do add some signal. 
Below some small number like 0.01, the log-likelihood-gain contribution that 
could be made is so small that it’s just a waste of computer time to compute 
it, and writing the code necessary to deal with the numerical complications of 
exceptionally weak data would be a waste of programming time! Interestingly, 
there are data sets in the PDB demonstrating that, if you integrate to high 
enough resolution, there will come a point where none of the reflections convey 
even as little as 0.01 bit of information. This is what I’ve heard Keith Wilson 
describe as “collecting hkl values”.
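
To make thresholds like 0.01 and 0.5 bit more concrete, here is a rough numerical sketch (Python with NumPy) of an information gain computed as a Kullback-Leibler divergence. It is not the Phaser implementation. It assumes, purely for illustration, an acentric reflection, a normalised true intensity J with Wilson prior exp(-J), and a Gaussian measurement error; the function name and the integration grid are invented for this example.

```python
import numpy as np

LN2 = np.log(2.0)

def information_gain_bits(i_obs, sigma, j_max=60.0, n=200001):
    """KL divergence, in bits, of the posterior for the true normalised
    intensity J from its acentric Wilson prior exp(-J), given one
    observation i_obs with Gaussian standard deviation sigma.
    Simplified illustrative model, not Phaser's calculation."""
    j = np.linspace(0.0, j_max, n)
    dj = j[1] - j[0]

    def integrate(y):
        # Trapezoidal rule on the uniform grid.
        return float(np.sum(0.5 * (y[1:] + y[:-1])) * dj)

    log_like = -0.5 * ((i_obs - j) / sigma) ** 2  # Gaussian, constant dropped
    log_w = -j + log_like                         # log(prior * likelihood)
    shift = log_w.max()                           # avoid underflow in exp
    w = np.exp(log_w - shift)
    z = integrate(w)
    post = w / z                                  # normalised posterior density
    log_ratio = log_like - shift - np.log(z)      # log(posterior / prior)
    return integrate(post * log_ratio) / LN2

# An observation at the mean intensity, measured with increasing error.
for sigma in (0.1, 1.0, 10.0, 1e6):
    print(sigma, information_gain_bits(1.0, sigma))
```

Under these assumptions the gain falls steadily towards zero as sigma grows. An observation at the mean intensity with sigma equal to the mean intensity carries roughly 0.1 bit, so it would survive a 0.01 cutoff but not a 0.5 cutoff.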


In practice, I think the main reason we still need numbers describing the 
resolution limit or diffraction limit is that most programs can’t deal properly 
with extremely weak data. Data converted from intensities to amplitudes with 
the French-Wilson (truncate) procedure look like they convey information even 
if they don’t: an intensity associated with an infinite standard deviation 
would be turned into the average amplitude from the Wilson distribution of 
amplitudes and the standard deviation of that Wilson distribution, which has a 
finite value. When you provide Phaser with intensities, our experience is that 
you don’t really need a resolution cutoff, because it uses the LLGI target that 
treats weak data properly. If we can get that error model into other programs, 
then they should all be able to deal well with such data.
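
As a small worked illustration of the French-Wilson point (a sketch, not code from truncate or Phaser): when the measurement carries no information, the posterior is just the Wilson prior, so the reported amplitude and its standard deviation are the mean and standard deviation of the Wilson amplitude distribution. The function name and the symbol sigma_n (the expected intensity) are chosen here for illustration.

```python
import math

def wilson_amplitude_moments(sigma_n, centric=False):
    """Mean and standard deviation of |F| under the Wilson distribution
    for a reflection whose expected intensity is sigma_n."""
    if centric:
        # |F| is half-normal with variance parameter sigma_n.
        mean = math.sqrt(2.0 * sigma_n / math.pi)
        var = sigma_n * (1.0 - 2.0 / math.pi)
    else:
        # |F| is Rayleigh distributed with <F^2> = sigma_n.
        mean = math.sqrt(math.pi * sigma_n) / 2.0
        var = sigma_n * (1.0 - math.pi / 4.0)
    return mean, math.sqrt(var)

for centric in (False, True):
    f, sigf = wilson_amplitude_moments(1.0, centric=centric)
    print("centric" if centric else "acentric", f, sigf, f / sigf)
```

The ratio F/sigF comes out at about 1.9 for acentric and about 1.3 for centric reflections, whatever the value of sigma_n, which is why an unmeasured reflection can look as if it carries signal once it has been converted to an amplitude.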

Until then, a useful approach would be to compute the information gain for each 
reflection in Phaser, then for developers of other programs to decide what 
would be a good information gain cutoff for their algorithms, maybe something 
between 0.01 and 0.5. Unfortunately, we didn’t make it very easy to get those 
values out of Phaser! But it’s easy to get them from the new phasertng code, 
which is very close to an official release. For anyone who’s particularly 
interested, Airlie and I could tell you how to get them now.
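
Once per-reflection information gains are available, applying a cutoff is simple. The sketch below (Python with NumPy) assumes the values have already been exported as plain arrays; it does not show how to get them out of Phaser or phasertng. The function names, the equal-count shells and the numbers are all made up for illustration.

```python
import numpy as np

def keep_mask(info_bits, cutoff):
    """True for reflections whose information gain (bits) is >= cutoff."""
    return np.asarray(info_bits, dtype=float) >= cutoff

def shell_means(d_spacing, info_bits, n_shells):
    """Mean information gain in equal-count resolution shells, ordered
    from low to high resolution. Returns (d_min_of_shell, mean_bits)."""
    d = np.asarray(d_spacing, dtype=float)
    info = np.asarray(info_bits, dtype=float)
    order = np.argsort(-d)                  # largest d-spacing first
    shells = np.array_split(order, n_shells)
    return [(float(d[idx].min()), float(info[idx].mean())) for idx in shells]

# Invented values, for illustration only.
d = [4.0, 3.5, 3.0, 2.5, 2.2, 2.0, 1.9, 1.8]
bits = [3.1, 2.4, 1.6, 0.9, 0.4, 0.08, 0.02, 0.004]

mask = keep_mask(bits, cutoff=0.01)         # per-reflection selection
shells = shell_means(d, bits, n_shells=4)
# Highest-resolution shell whose average is still at least half a bit.
ok = [d_min for d_min, mean_bits in shells if mean_bits >= 0.5]
nominal_limit = ok[-1] if ok else None
print(mask.sum(), shells, nominal_limit)
```

A per-reflection mask keeps the useful reflections in the well-diffracting directions of an anisotropic data set, whereas the shell average only gives a single nominal number.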

As to when you can compute these values, probably the useful point is 
immediately after scaling and merging the data. 

Best wishes,

Randy

> On 9 Oct 2024, at 15:25, Frank von Delft <frank.vonde...@cmd.ox.ac.uk> wrote:
> 
> So Randy, what should we be saying/using, and where do we find it... and (not 
> least!), when in the experiment-to-final-model process?  
> 
> By "what" I mean, the specific words - since as Ian points out, the words 
> "resolution limit" and "diffraction limit" are quite different, whatever we 
> thoughtlessly use in day-to-day parlance.
> 
> (A very interesting discussion, thanks!)
> 
> 
> On 08/10/2024 09:05, Randy John Read wrote:
>> Dear Marin,
>> 
>> In crystallography we do have the information gain measure (based on 
>> Kullback-Leibler divergence) that my group put forward and implemented in 
>> our Phaser program (https://doi.org/10.1107/s2059798320001588). Signal and 
>> noise aren’t isotropic, so information gain isn’t isotropic either. However, 
>> we’ve observed that the resolution at which the average information gain is 
>> about 1/2 bit per reflection corresponds roughly to the resolution limits 
>> suggested by other techniques. Given the interpretation of information gain 
>> as the maximum log-likelihood-gain that one could achieve from an 
>> observation with a perfect model, it’s a very natural measure to use for the 
>> useful resolution. I don’t think this measure has gained much traction in 
>> the crystallographic community yet, but it’s becoming more widely available 
>> in some data analysis tools.
>> 
>> We’ve used the same KL-divergence approach to estimate the information gain 
>> from a Fourier term in a cryo-EM reconstruction 
>> (https://doi.org/10.1107/s2059798323001596). In the implementation of this 
>> in our EM-placement docking software, we have anisotropic estimates of 
>> signal and noise, so again the information gain is anisotropic. Somewhat to 
>> my surprise (given the differences in the derivations), our information gain 
>> measure turns out to be equivalent to yours 
>> (https://doi.org/10.48550/arXiv.2009.03223) if we assume that the signal and 
>> noise are isotropic. As you point out there, for cryo-EM reconstructions 
>> it’s essential to consider the effect of over-sampling of the Fourier 
>> transform and the corresponding lack of independence of the Fourier terms, 
>> so this has an over-sampling correction factor.
>> 
>> Best wishes,
>> 
>> Randy Read
>> 
>> 
>>> On 8 Oct 2024, at 00:02, Marin van Heel <marin.vanh...@gmail.com> wrote:
>>> 
>>> Dear Marius Schmidt
>>> 
>>> In my (our) original FRC/FSC papers (1982; 1986; 2000; 2004; 2017; 2020; 
>>> 2024) the linearity of these correlation functions/metrics has been 
>>> extensively discussed. Historically, EM started at a low resolution 
>>> "blobology" level whereas X-ray crystallography (XRC) at that time, already 
>>> had reached atomic resolution. This led to the belief that the XRC 
>>> resolution metrics (like phase residuals and R-factors) were also 
>>> appropriate as resolution metrics for EM. However, in XRC the measurables 
>>> are diffraction patterns, for which the phases corresponding to the 
>>> amplitudes had to be derived iteratively. In EM, and in imaging in general, 
>>> the measurables are the images themselves, which contain both the amplitude 
>>> information and the phase information. Reverting to the then already 
>>> established XRC resolution metrics, like phase residuals or R-factors, 
>>> implied discarding the most important part of the available information 
>>> (see the Why-O-Why: 
>>> https://www.linkedin.com/posts/marin-van-heel-5845b422b_whyowhyarchive-activity-7149738255154946048-Oc93/?utm_source=share&utm_medium=member_desktop).
>>> That problem was soon realized, and the FRC and FSC metrics mentioned above, 
>>> which exploit all the available information, were suggested. Thus, the XRC 
>>> atomic resolution technique of the 1980s came with a low-quality resolution 
>>> metric whereas the Cryo-EM low-resolution blobology approach of the 1980s 
>>> came with a high-quality resolution metric.
>>> Thus, in summary, all resolution criteria in XRC are ad-hoc non-linear 
>>> metrics that have no general validity outside of XRC. Looking at only the 
>>> amplitudes of a diffraction pattern is like finding the highest resolution 
>>> spot in a diffraction pattern, where, even if the spot is clearly visible, 
>>> that does not mean one would be able to find its phase. We need a more 
>>> comprehensive metric that has a wide range of applicability. In other 
>>> words, where a CC1/2 metric cannot be applied to assess the 3D brain scan 
>>> of a brain-tumor patient, the FRC / FSC, and the newest FRI / FSI metrics 
>>> can be applied in all cases where 2D and 3D data are dealt with! 
>>> Hope this helps, 
>>> 
>>> Marin van Heel
>>> 
>>> On Mon, Oct 7, 2024 at 3:04 PM Marius Schmidt <smar...@uwm.edu> wrote:
>>> I think this is taken care of:
>>> The CC1/2 and the CC1/2* are appropriate metrics for the resolution limit.
>>> They are all spit out by newer data processing software.
>>> The CC1/2 is directly comparable to the FSC. Many people use CC1/2 = 1/e as
>>> the resolution limit.
>>> In many cases of data the CC1/2 = 1/e is equivalent to I/sigI of 1, which
>>> is used sometimes as a metric for the resolution limit (some use I/sigI = 2),
>>> and in more cases the CC1/2 corresponds to Rmerge in the range of 40%.
>>> For serial crystallography, the R-split goes through the roof at CC1/2 = 1/e,
>>> so the CC1/2 is the better metric.
>>> 
>>> Best
>>> Marius
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Marius Schmidt, Dr. rer. Nat. (habil.)
>>> Professor
>>> University of Wisconsin-Milwaukee
>>> Kenwood Interdisciplinary Research Complex
>>> Physics Department, Room 3087
>>> 3135 North Maryland Avenue
>>> Milwaukee, Wi 53211
>>> phone (office): 1-414-229-4338
>>> phone (lab): 414-229-3946
>>> email: smar...@uwm.edu
>>> https://uwm.edu/physics/people/schmidt-marius/
>>> https://sites.uwm.edu/smarius/
>>> https://www.bioxfel.org/
>>> Nature News and Views: https://www.nature.com/articles/d41586-023-00504-4
>>> 
>>> From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> on behalf of Marin van 
>>> Heel <marin.vanh...@gmail.com>
>>> Sent: Monday, October 7, 2024 11:24 AM
>>> To: CCP4BB@JISCMAIL.AC.UK <CCP4BB@JISCMAIL.AC.UK>
>>> Subject: [ccp4bb] Review: Linearity and Resolution in X-Ray Crystallography 
>>> and Electron Microscopy 
>>> Dear All,
>>> 
>>> Sayan Bhakta and I have recently posted the preprint of a review on 
>>> resolution and linearity which will appear in a book to be launched on the 
>>> 16th of October 2024. 
>>> ( https://doi.org/10.1201/9781003326106 ). It is the first Cryo-EM review 
>>> that I have been involved in for 25 years. 
>>> In our preparation, I was quite amazed about what other authors wrote (or 
>>> did not write) in their many reviews on these matters.
>>> For example, I missed any serious discussion about resolution metrics in 
>>> X-ray crystallography, a technique that is fundamentally non-linear. 
>>> Linearity is a prerequisite for defining the resolution of any instrument. 
>>> The iterative refinements applied in X-ray crystallography (and sometimes 
>>> Cryo-EM) mean that phase residuals, R-factors, and fixed threshold 
>>> values cannot be used to compare the results of independently conducted 
>>> experiments. An obvious consequence of the lack of universality of 
>>> metrics such as phase residuals and R-factors is that they cannot be 
>>> used outside of the immediate context in which they were defined, like 
>>> X-ray crystallography or structural biology. In contrast, the 
>>> Fourier-Ring-Correlation (FRC), the Fourier-Shell-Correlation (FSC), and 
>>> their recent successors, the Fourier-Ring-Information (FRI) and the 
>>> Fourier-Shell-Information (FSI), plus their integrated versions, are 
>>> universal metrics that are applicable to all fields of science where 2D and 
>>> 3D data are dealt with!
>>> 
>>> https://doi.org/10.31219/osf.io/5empt
>>> 
>>> Have fun reading it!
>>> 
>>> Marin 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> To unsubscribe from the CCP4BB list, click the following link:
>>> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1 
>>> 
>> -----
>> Randy J. Read
>> Department of Haematology, University of Cambridge
>> Cambridge Institute for Medical Research Tel: +44 1223 336500
>> The Keith Peters Building
>> Hills Road E-mail: rj...@cam.ac.uk
>> Cambridge CB2 0XY, U.K. www-structmed.cimr.cam.ac.uk
>> 
>> 
>> ########################################################################
>> 
>> This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing 
>> list hosted by www.jiscmail.ac.uk, terms & conditions are available at 
>> https://www.jiscmail.ac.uk/policyandsecurity/
>> 
> 

-----
Randy J. Read
Department of Haematology, University of Cambridge
Cambridge Institute for Medical Research     Tel: +44 1223 336500
The Keith Peters Building
Hills Road                                   E-mail: rj...@cam.ac.uk
Cambridge CB2 0XY, U.K.                      www-structmed.cimr.cam.ac.uk


