This is hunch-speak, not proper analysis, but it is possible to get huge Fcalc values, and hence large difference-map terms, at low resolution by assuming the solvent volume is a vacuum rather than full of partially ordered water molecules. Babinet scaling can do something to correct this, but it is a very blunt tool. And once a structure is more or less complete the solvent-masked contribution to Fcalc helps, but there is an intermediate stage where spurious differences can distort maps.
As Randy says, if either Eobs or Ecalc is small the FOM is also small. The worst offenders are when Eobs is large but Ecalc is crazy. I like to look at the plot of <Fobs> vs <Fcalc> vs resolution, output by REFMAC along with the R-factor plots. If there are large discrepancies, maybe it is time to worry about scaling options.

Eleanor

PS - But are difference map terms weighted by FOM?

On Thu, 17 Oct 2019 at 08:55, Jan Dohnalek <dohnalek...@gmail.com> wrote:

> Dear all,
> regarding the "remaining strong differences" between measured data and calculated SFs from a finished (high-resolution) structure, I once investigated this a bit, going back to images and looking up some extreme outliers.
> I found the same - those were clear, strong diffraction spots: not ice, not small molecule, genuine protein diffraction. So I had no explanation for them. Some were even "forbidden" intensities, because of screw axes which were correct. The structure refined perfectly, no problems at all.
> I then found some literature about the possibility of multiple reflections - I guess this is possible, but I wonder if you could easily get, say, a 25-sigma I that way.
>
> And as we often end our beer discussions - maybe all protein space groups are actually true P1, just close enough to satisfy the high-symmetry rules .. but this is getting a bit philosophical, I know ..
>
> Jan Dohnalek
>
> On Wed, Oct 16, 2019 at 6:24 PM Randy Read <rj...@cam.ac.uk> wrote:
>
>> James,
>>
>> Where we diverge is with your interpretation that big differences lead to small FOMs. The size of the FOM depends on the product of Fo and Fc, not their difference. The FOM for a reflection where Fo=1000 and Fc=10 is very different from the FOM for a reflection with Fo=5000 and Fc=4010, even though the difference is the same.
>>
>> Expanding on this:
>>
>> 1. The FOM actually depends more on the E values, i.e. reflections smaller than average get lower FOM values than ones bigger than average.
>> In the resolution bin from 5.12 to 5.64 Å of 2vb1, the mean observed intensity is 20687 and the mean calculated intensity is 20022, which means that Eobs = Sqrt(145.83/20687) = 0.084 and Ecalc = Sqrt(7264/20022) = 0.602. This reflection gets a low FOM because the product (0.050) is such a small number, not because the difference is big.
>>
>> 2. You have to consider the role of the model error in the difference, because for precisely measured data most of the difference comes from model error. In this resolution shell, the correlation coefficient between Iobs and Fcalc^2 is about 0.88, which means that sigmaA is about Sqrt(0.88) = 0.94. The variance of both the real and imaginary components of Ec (as an estimate of the phased true E) will be (1-0.94^2)/2 = 0.058, so the standard deviations of the real and imaginary components of Ec will be about 0.24. In that context, the difference between Eobs and Ecalc is nothing like a 2000-sigma outlier.
>>
>> Looking at this another way, the reason why the FOM is low for this reflection is that the conditional probability distribution of Eo given Ec has significant values on the other side of the origin of the complex plane. That means that the *phase* of the complex Eo is very uncertain. The figures on this web page (https://www-structmed.cimr.cam.ac.uk/Course/Statistics/statistics.html) should help to explain that idea.
>>
>> Best wishes,
>>
>> Randy
>>
>> On 16 Oct 2019, at 16:02, James Holton <jmhol...@lbl.gov> wrote:
>>
>> All very true Randy,
>>
>> But nevertheless every hkl has an FOM assigned to it, and that is used to calculate the map. Statistical distribution or not, the trend is that hkls with big amplitude differences get smaller FOMs, which means large model-to-data discrepancies are down-weighted. I wonder sometimes at what point this becomes a self-fulfilling prophecy?
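To make the product argument concrete: for an acentric reflection, the FOM in the sigmaA formulation is m = I1(X)/I0(X) with X = 2·sigmaA·Eobs·Ecalc/(1 - sigmaA^2), so m is driven by the product Eobs·Ecalc. A minimal Python sketch using the shell statistics quoted above; the dependency-free Bessel evaluation and the "both-Es-above-average" comparison values are my own additions for illustration:

```python
import math

def bessel_i(n, x, steps=2000):
    """I_n(x) via the integral form (1/pi) * int_0^pi exp(x cos t) cos(n t) dt,
    evaluated with the trapezoid rule to avoid any external dependency."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        w = 0.5 if k in (0, steps) else 1.0
        total += w * math.exp(x * math.cos(t)) * math.cos(n * t)
    return total * h / math.pi

def fom_acentric(E_obs, E_calc, sigmaA):
    """Acentric figure of merit m = I1(X)/I0(X), X = 2*sigmaA*Eo*Ec/(1 - sigmaA^2)."""
    X = 2.0 * sigmaA * E_obs * E_calc / (1.0 - sigmaA ** 2)
    return bessel_i(1, X) / bessel_i(0, X)

# Reflection -5,2,2 of 2vb1 in the 5.12-5.64 A shell: Eobs=0.084, Ecalc=0.602, sigmaA~0.94
m_weak = fom_acentric(0.084, 0.602, 0.94)
# A hypothetical reflection with both E values above average gets a much higher FOM
m_strong = fom_acentric(1.5, 1.4, 0.94)
```

Note that m_weak comes out low even though sigmaA is high, purely because the product of the E values is small, which is the point being made above.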
>> If you look in detail at the Fo-Fc differences in pretty much any refined structure in the PDB you will find huge outliers. Some are hundreds of sigmas, and they can go in either direction.
>>
>> Take for example reflection -5,2,2 in the highest-resolution lysozyme structure in the PDB: 2vb1. Iobs(-5,2,2) was recorded as 145.83 ± 3.62 (at 5.4 Ang) with Fcalc^2(-5,2,2) = 7264. A 2000-sigma outlier! What are the odds? On the other hand, Iobs(4,-6,2) = 1611.21 ± 30.67 vs Fcalc^2(4,-6,2) = 73, which is in the opposite direction. One can always suppose "experimental errors", but ZD sent me these images and I have looked at all the spots involved in these hkls. I don't see anything wrong with any of them. The average multiplicity of this data set was 7.1 and involved 3 different kappa angles, so I don't think these are "zingers" or other weird measurement problems.
>>
>> I'm not just picking on 2vb1 here. EVERY PDB entry has this problem. Not sure where it comes from, but the FOM assigned to these huge differences is always small, so whatever is causing them won't show up in an FOM-weighted map.
>>
>> Is there a way to "change up" the statistical distribution that assigns FOMs to hkls? Or are we stuck with this systematic error?
>>
>> -James Holton
>> MAD Scientist
>>
>> On 10/4/2019 9:31 AM, Randy Read wrote:
>>
>> Hi James,
>>
>> I'm sure you realise this, but it's important for other readers to remember that the FOM is a statistical quantity: we have a probability distribution for the true phase, we pick one phase (the "centroid" phase that should minimise the RMS error in the density map), and then the FOM is the expected value of the cosine of the phase error, obtained by taking the cosines of all possible phase differences and weighting by the probability of that phase difference. Because it's a statistical quantity from a random distribution, you really can't expect this to agree reflection by reflection!
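Whether -5,2,2 really is a "2000-sigma" outlier depends on which sigma you divide by. A back-of-the-envelope check, using only numbers quoted in this thread; the normalisation is a rough application of Randy's sigmaA argument, not any refinement program's actual outlier test:

```python
import math

# Reflection -5,2,2 of 2vb1 (values quoted in the thread)
I_obs, sig_I = 145.83, 3.62
F2_calc = 7264.0
mean_I_obs, mean_I_calc = 20687.0, 20022.0   # 5.12-5.64 A shell averages
sigmaA = math.sqrt(0.88)                      # from CC(Iobs, Fcalc^2) ~ 0.88

# Naive view: difference measured in units of the experimental sigma alone
z_measurement = (I_obs - F2_calc) / sig_I     # about -1966: the "2000-sigma outlier"

# Model-error view: compare normalised amplitudes, where the expected value of
# Eo is sigmaA*Ec and each component has variance (1 - sigmaA^2)/2
E_obs = math.sqrt(I_obs / mean_I_obs)
E_calc = math.sqrt(F2_calc / mean_I_calc)
sd = math.sqrt((1.0 - sigmaA ** 2) / 2.0)
z_model = (E_obs - sigmaA * E_calc) / sd      # about -2: unremarkable
```

The same discrepancy is enormous against the measurement error but ordinary once model error dominates, which is exactly the disagreement between the two views in this thread.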
>> It's a good start to see that the overall values are good, but if you want to look more closely you have to look at groups of reflections, e.g. bins of resolution, bins of observed amplitude, bins of calculated amplitude. However, each bin has to have enough members that the average will generally be close to the expected value.
>>
>> Best wishes,
>>
>> Randy Read
>>
>> On 4 Oct 2019, at 16:38, James Holton <jmhol...@lbl.gov> wrote:
>>
>> I've done a few little experiments over the years using simulated data where I know the "correct" phase, trying to see just how accurate FOMs are. What I have found in general is that overall FOM values are fairly well correlated with overall phase error, but if you go reflection-by-reflection they are terrible. I suppose this is because FOM estimates are rooted in amplitudes. Good agreement in amplitude gives you more confidence in the model (and therefore the phases), but if your R factor is 55% then your phases probably aren't very good either. However, if you look at any given h,k,l those assumptions become less and less applicable. Still, it's the only thing we've got.
>>
>> At the end of the day, the phase you get out of a refinement program is the phase of the model. All those fancy "FWT" coefficients with "m" and "D" or "FOM" weights are modifications to the amplitudes, not the phases. The phases in your 2mFo-DFc map are identical to those of just an Fc map. Seriously, have a look! Sometimes you will get a 180-degree flip to keep the sign of the amplitude positive, but that's it. Nevertheless, the electron density of a 2mFo-DFc map is closer to the "correct" electron density than that of any other map. This is quite remarkable considering that the "phase error" is the same.
>>
>> This realization is what led my colleagues and me to forget about "phase error" and start looking at the error in the electron density itself (10.1073/pnas.1302823110).
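The point about 2mFo-DFc coefficients keeping the model phase is easy to verify numerically: the m and D weights only rescale the amplitude, and the only possible "phase change" is a 180-degree flip when the reweighted amplitude goes negative. A small sketch, with function name and numbers invented for illustration:

```python
import cmath
import math

def two_mfo_dfc(m, F_obs, D, F_calc):
    """Map coefficient (2m|Fo| - D|Fc|) * exp(i*phi_calc).

    m, D: FOM and sigmaA-derived weights; F_obs: observed amplitude (real);
    F_calc: complex model structure factor. Only the amplitude is reweighted;
    the phase is taken straight from the model."""
    amp = 2.0 * m * F_obs - D * abs(F_calc)
    return cmath.rect(amp, cmath.phase(F_calc))

Fc = cmath.rect(1000.0, 0.7)                # model SF with phase 0.7 rad

good = two_mfo_dfc(0.9, 1200.0, 0.95, Fc)   # reweighted amplitude positive:
                                            # phase of the coefficient is exactly 0.7
weak = two_mfo_dfc(0.3, 100.0, 0.95, Fc)    # reweighted amplitude negative:
                                            # phase flips by pi, as described above
```

However the weights are chosen, the map phase is the model phase (up to that sign flip), which is what makes the density-space error analysis in the PNAS paper attractive.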
>> We did this rather pedagogically. Basically, pretend you did the whole experiment again, but "change up" the source of error of interest. For example, if you want to see the effect of sigma(F), then you add random noise with the same magnitude as sigma(F) to the Fs, and then re-refine the structure. This gives you your new phases, and a new map. Do this 50 or so times and you get a pretty good idea of how any source of error of interest propagates into your map. There is even a little feature in Coot for animating these maps, which gives a much more intuitive view of the "noise". You can also look at the variation of model parameters, like the refined occupancy of a ligand, which is a good way to put an "error bar" on it. The trick is finding the right source of error to propagate.
>>
>> -James Holton
>> MAD Scientist
>>
>> On 10/2/2019 2:47 PM, Andre LB Ambrosio wrote:
>>
>> Dear all,
>>
>> How is the phase error estimated for any given reflection, specifically in the context of model refinement? In terms of math, I mean.
>>
>> How useful is FOM in assessing the phase quality, when not for initial experimental phases?
>>
>> Many thanks in advance,
>>
>> Andre.
>>
>> ------
>> Randy J. Read
>> Department of Haematology, University of Cambridge
>> Cambridge Institute for Medical Research
>> The Keith Peters Building
>> Hills Road
>> Cambridge CB2 0XY, U.K.
>> Tel: +44 1223 336500   Fax: +44 1223 336827
>> E-mail: rj...@cam.ac.uk
>> www-structmed.cimr.cam.ac.uk
>
> --
> Jan Dohnalek, Ph.D
> Institute of Biotechnology
> Academy of Sciences of the Czech Republic
> Biocev
> Prumyslova 595
> 252 50 Vestec near Prague
> Czech Republic
>
> Tel. +420 325 873 758

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1
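The noise-propagation recipe described in the thread (perturb the data by sigma(F), re-refine, repeat ~50 times, look at the spread of whatever you care about) can be sketched with a toy stand-in for refinement. Here a single least-squares scale factor plays the role of the refined parameter, and all data, sigmas, and trial counts are synthetic, purely to illustrate the resampling idea:

```python
import random
import statistics

def lsq_scale(F_obs, F_calc):
    """Least-squares scale k minimising sum((Fo - k*Fc)^2): our toy 'refinement'."""
    return sum(o * c for o, c in zip(F_obs, F_calc)) / sum(c * c for c in F_calc)

def propagate(F_obs, sig_F, F_calc, n_trials=200, seed=42):
    """Re-'refine' against noise-perturbed data many times; the spread of the
    refined parameter is its error bar with respect to that noise source."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        shaken = [rng.gauss(f, s) for f, s in zip(F_obs, sig_F)]
        trials.append(lsq_scale(shaken, F_calc))
    return statistics.mean(trials), statistics.stdev(trials)

# Synthetic example: true scale 1.1, uniform sigma(F) = 2.0
F_calc = [100.0, 150.0, 200.0, 250.0, 300.0]
F_obs = [1.1 * f for f in F_calc]
mean_k, sd_k = propagate(F_obs, [2.0] * len(F_obs), F_calc)
```

In real use, lsq_scale would be replaced by an actual refinement run and mean_k/sd_k by the map or parameter of interest, but the propagate loop is the whole trick.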