We can use model B factors to validate structures - see

Analysis and validation of macromolecular B values
R. C. Masmaliyeva and G. N. Murshudov
Acta Cryst. (2019). D75, 505-518
https://doi.org/10.1107/S2059798319004807

HTH
Kay


On Sun, 8 Mar 2020 09:08:32 +0000, Rangana Warshamanage <ranga...@gmail.com> wrote:

>"The best estimate we have of the "true" B factor is the model B factors
>we get at the end of refinement, once everything is converged, after we
>have done all the building we can.  It is this "true B factor" that is a
>property of the data, not the model, "
>
>If this is the case, why can't we use model B factors to validate our
>structure? I know some people are skeptical about this approach because B
>factors are refinable parameters.
>
>Rangana
>
>On Sat, Mar 7, 2020 at 8:01 PM James Holton <jmhol...@lbl.gov> wrote:
>
>> Yes, that's right.  Model B factors are fit to the data.  That Boverall
>> gets added to all atomic B factors in the model before the structure is
>> written out, yes?
>>
>> The best estimate we have of the "true" B factor is the model B factors
>> we get at the end of refinement, once everything is converged, after we
>> have done all the building we can.  It is this "true B factor" that is a
>> property of the data, not the model, and it has the relationship to
>> resolution and map appearance that I describe below.  Does that make sense?
>>
>> -James Holton
>> MAD Scientist
>>
>> On 3/7/2020 10:45 AM, dusan turk wrote:
>> > James,
>> >
>> > The case you’ve chosen is not a good illustration of the relationship
>> > between atomic B and resolution.  The problem is that during scaling of
>> > Fcalc to Fobs, the B-factor difference between the two sets of numbers is
>> > also minimized.  In the simplest form, with two constants Koverall and
>> > Boverall, it looks like this:
>> >
>> > sum_to_be_minimized = sum( FOBS**2 - Koverall * FCALC**2 * exp(-Boverall/d**2) )
>> >
>> > Then one can include bulk solvent correction, anisotropic scaling, …  In
>> > PHENIX it gets quite complex.
>> >
>> > Hence, almost regardless of the average model B, you will always get the
>> > same map, because the "B" of the map will reflect the B of the FOBS.  When
>> > all atomic Bs are equal, they are also equal to the average B.
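As a rough illustration of that simplest two-parameter form, here is a minimal
Python/NumPy sketch; the synthetic data and variable names are invented for this
example, and real programs also fit bulk solvent, anisotropy, etc.:

    import numpy as np
    from scipy.optimize import least_squares

    # Toy data: |Fobs|, |Fcalc| and resolution d (Angstrom), one entry per reflection.
    rng = np.random.default_rng(0)
    d = rng.uniform(1.5, 20.0, 1000)
    fcalc = rng.uniform(10.0, 100.0, 1000)
    fobs = 0.8 * fcalc * np.exp(-0.5 * 15.0 / d**2)   # i.e. Koverall=0.64, Boverall=15

    def residuals(params):
        """FOBS**2 - Koverall * FCALC**2 * exp(-Boverall/d**2), per reflection."""
        k, b = params
        return fobs**2 - k * fcalc**2 * np.exp(-b / d**2)

    k_overall, b_overall = least_squares(residuals, x0=[1.0, 0.0]).x
    print(k_overall, b_overall)   # recovers roughly 0.64 and 15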
>> >
>> > best, dusan
>> >
>> >
>> >> On 7 Mar 2020, at 01:01, CCP4BB automatic digest system <lists...@jiscmail.ac.uk> wrote:
>> >>
>> >>> On Thu, 5 Mar 2020 01:11:33 +0100, James Holton <jmhol...@lbl.gov> wrote:
>> >>>
>> >>>> The funny thing is, although we generally regard resolution as a primary
>> >>>> indicator of data quality, the appearance of a density map at the classic
>> >>>> "1-sigma" contour has very little to do with resolution, and everything
>> >>>> to do with the B factor.
>> >>>>
>> >>>> Seriously, try it.  Take any structure you like, set all the B factors to
>> >>>> 30 with PDBSET, calculate a map with SFALL or phenix.fmodel, and have a
>> >>>> look at the density of tyrosine (Tyr) side chains.  Even if you calculate
>> >>>> structure factors all the way out to 1.0 A, the holes in the Tyr rings
>> >>>> look exactly the same: just barely starting to form.  This is because the
>> >>>> structure factors from atoms with B=30 are essentially zero out at 1.0 A,
>> >>>> and adding zeroes does not change the map.  You can adjust the contour
>> >>>> level, of course, and solvent content will have some effect on where the
>> >>>> "1-sigma" contour lies, but generally B=30 is the point where Tyr side
>> >>>> chains start to form their holes.  Traditionally, this is attributed to
>> >>>> 1.8 A resolution, but it is really at B=30.  The point where waters first
>> >>>> start to poke out above the 1-sigma contour is at B=60, despite being
>> >>>> generally attributed to d=2.7 A.
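For anyone who wants to try this without PDBSET, a minimal Python sketch that
just overwrites the B-factor column (columns 61-66) of ATOM/HETATM records; the
file names are placeholders, and any ANISOU records are left untouched:

    # Set every atomic B factor in a PDB file to the same value.
    def set_all_bfactors(pdb_in, pdb_out, bfac=30.0):
        with open(pdb_in) as src, open(pdb_out, "w") as dst:
            for line in src:
                if line.startswith(("ATOM", "HETATM")) and len(line) >= 66:
                    line = line[:60] + f"{bfac:6.2f}" + line[66:]
                dst.write(line)

    set_all_bfactors("model.pdb", "model_B30.pdb")  # then run SFALL / phenix.fmodel on the output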
>> >>>>
>> >>>> Now, of course, if you cut off this B=30 data at 3.5 A then the Tyr side
>> >>>> chains become blobs, but that is equivalent to collecting data with the
>> >>>> detector way too far away and losing your high-resolution spots off the
>> >>>> edges.  I have seen a few people do that, but not usually for a published
>> >>>> structure.  Most people fight very hard for those faint, barely-existing
>> >>>> high-angle spots.  But why do we do that if the map is going to look the
>> >>>> same anyway?  The reason is that resolution and B factors are linked.
>> >>>>
>> >>>> Resolution is about separation vs width, and the width of the density
>> >>>> peak from any atom is set by its B factor.  Yes, atoms have an intrinsic
>> >>>> width, but it is very quickly washed out by even modest B factors
>> >>>> (B > 10).  This is true for both x-ray and electron form factors.  To a
>> >>>> very good approximation, the FWHM of C, N and O atoms is given by:
>> >>>>
>> >>>> FWHM = sqrt(B*log(2))/pi + 0.15
>> >>>>
>> >>>> where "B" is the B factor assigned to the atom and the 0.15 fudge factor
>> >>>> accounts for its intrinsic width when B=0.  Now that we know the peak
>> >>>> width, we can start to ask if two peaks are "resolved".
>> >>>>
>> >>>> Start with the classical definition of "resolution" (call it after Airy,
>> >>>> Rayleigh, Dawes, or whatever famous person you like), but essentially you
>> >>>> are asking the question: "how close can two peaks be before they merge
>> >>>> into one peak?"  For Gaussian peaks this is 0.849*FWHM.  Simple enough.
>> >>>> However, when you look at the density of two atoms this far apart you
>> >>>> will see the peak is highly oblong.  Yes, the density has one maximum,
>> >>>> but there are clearly two atoms in there.  It is also pretty obvious that
>> >>>> the long axis of the peak is the line between the two atoms, and if you
>> >>>> fit two round atoms into this peak you recover the distance between them
>> >>>> quite accurately.  Are they really not "resolved" if it is so clear
>> >>>> where they are?
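In Python, those two numbers for a few B values (the list of B values is an
arbitrary choice for illustration):

    import math

    def fwhm(B):
        """Approximate FWHM (Angstrom) of a C/N/O density peak with B factor B."""
        return math.sqrt(B * math.log(2)) / math.pi + 0.15

    for B in (10, 30, 60, 100):
        sep = 0.849 * fwhm(B)   # two-peak merging distance for Gaussian peaks
        print(f"B = {B:3d}: FWHM = {fwhm(B):.2f} A, peaks merge below ~{sep:.2f} A")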
>> >>>>
>> >>>> In such cases you usually want to sharpen, as that will make the oblong
>> >>>> blob turn into two resolved peaks.  Sharpening reduces the B factor and
>> >>>> therefore the FWHM of every atom, making the "resolution" (0.849*FWHM) a
>> >>>> shorter distance.  So, we have improved resolution with sharpening!  Why
>> >>>> don't we always do this?  Well, the reason is noise.  Sharpening
>> >>>> up-weights the noise of high-order Fourier terms and therefore degrades
>> >>>> the overall signal-to-noise ratio (SNR) of the map.  This is what I
>> >>>> believe Colin would call reduced "contrast".  Of course, since we view
>> >>>> maps with a threshold (aka contour), a map with SNR=5 will look almost
>> >>>> identical to a map with SNR=500.  The "noise floor" is generally well
>> >>>> below the 1-sigma threshold, or even the 0-sigma threshold
>> >>>> (https://doi.org/10.1073/pnas.1302823110).  As you turn up the sharpening
>> >>>> you will see blobs split apart and also see new peaks rising above your
>> >>>> map contouring threshold.  Are these new peaks real?  Or are they noise?
>> >>>> That is the difference between SNR=500 and SNR=5, respectively.  The
>> >>>> tricky part of sharpening is knowing when you have reached the point
>> >>>> where you are introducing more noise than signal.  There are some good
>> >>>> methods out there, but none of them are perfect.
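Numerically, sharpening is just an exponential reweighting of the amplitudes.
A minimal sketch, assuming the usual convention F' = F * exp(-B_applied/(4*d**2))
so that a negative B_applied sharpens and a positive one blurs; the example
numbers are made up:

    import numpy as np

    def apply_bfactor(f, d, b_applied):
        """Scale amplitudes f at resolutions d (Angstrom) by exp(-b_applied/(4*d**2))."""
        return f * np.exp(-b_applied / (4.0 * d**2))

    d = np.array([4.0, 2.0, 1.5])        # resolution of each reflection
    f = np.array([100.0, 50.0, 20.0])    # amplitude of each reflection
    print(apply_bfactor(f, d, -30.0))    # high-resolution terms (and their noise) boosted most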
>> >>>>
>> >>>> What about filtering out the noise?  An ideal noise-suppression filter
>> >>>> has the same shape as the signal (I found that in Numerical Recipes),
>> >>>> and the shape of the signal from a macromolecule is a Gaussian in
>> >>>> reciprocal space (aka a straight line on a Wilson plot).  This is true,
>> >>>> by the way, both for a molecule packed into a crystal and for one free in
>> >>>> solution.  So, the ideal noise-suppression filter is simply applying a B
>> >>>> factor.  The only problem is: sharpening is generally done by applying a
>> >>>> negative B factor, so applying a Gaussian blur is equivalent to just not
>> >>>> sharpening as much.  So, we are back to "optimal sharpening" again.
>> >>>>
>> >>>> Why not use a filter that is non-Gaussian?  We do this all the time!
>> >>>> Cutting off the data at a given resolution (d) is equivalent to blurring
>> >>>> the map with this function:
>> >>>>
>> >>>> kernel_d(r) = 4/3*pi/d**3 * sinc3(2*pi*r/d)
>> >>>> sinc3(x) = (x==0 ? 1 : 3*(sin(x)/x - cos(x))/(x*x))
>> >>>>
>> >>>> where kernel_d(r) is the normalized weight given to a point "r" Angstrom
>> >>>> away from the center of each blurring operation, and "sinc3" is the
>> >>>> Fourier synthesis of a solid sphere.  That is, if you make an HKL file
>> >>>> with all F=1 and PHI=0 out to a resolution d, then effectively all hkls
>> >>>> beyond the resolution limit are zero.  If you calculate a map with those
>> >>>> Fs, you will find the kernel_d(r) function at the origin.  What that
>> >>>> means is: by applying a resolution cutoff, you are effectively
>> >>>> multiplying your data by this sphere of unit Fs, and since a
>> >>>> multiplication in reciprocal space is a convolution in real space, the
>> >>>> effect is convolving (blurring) with kernel_d(r).
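The same two lines transcribed into Python/NumPy, as a small sketch:

    import numpy as np

    def sinc3(x):
        """Fourier synthesis of a solid sphere, normalized so sinc3(0) = 1."""
        x = np.asarray(x, dtype=float)
        out = np.ones_like(x)
        nz = x != 0.0
        xn = x[nz]
        out[nz] = 3.0 * (np.sin(xn) / xn - np.cos(xn)) / (xn * xn)
        return out

    def kernel_d(r, d):
        """Real-space blurring kernel equivalent to a sharp resolution cutoff at d."""
        return 4.0 / 3.0 * np.pi / d**3 * sinc3(2.0 * np.pi * r / d)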
>> >>>>
>> >>>> For comparison, if you apply a B factor, the real-space blurring kernel
>> >>>> is this:
>> >>>>
>> >>>> kernel_B(r) = (4*pi/B)**1.5 * exp(-4*pi**2/B*r*r)
>> >>>>
>> >>>> If you graph these two kernels (the format above is for gnuplot) you will
>> >>>> find that they have the same FWHM whenever B = 80*(d/3)**2.  This "rule"
>> >>>> is the one I used for the resolution demonstration movie I made back in
>> >>>> the late 20th century:
>> >>>> https://bl831.als.lbl.gov/~jamesh/movies/index.html#resolution
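A quick numerical check of that rule, reusing sinc3()/kernel_d() from the sketch
above; the radial grid and the test resolutions are arbitrary choices:

    import numpy as np

    def kernel_B(r, B):
        """Real-space blurring kernel corresponding to applying a B factor B."""
        return (4.0 * np.pi / B)**1.5 * np.exp(-4.0 * np.pi**2 / B * r * r)

    def fwhm_of(kernel, *args):
        """Numerical FWHM of a radially symmetric, centrally peaked kernel."""
        r = np.linspace(0.0, 20.0, 200001)
        y = kernel(r, *args)
        return 2.0 * r[y >= 0.5 * y[0]].max()   # 2 * half-width at half maximum

    for d in (2.0, 3.0, 4.0):
        B = 80.0 * (d / 3.0)**2
        print(f"d = {d} A, B = {B:5.1f} ->",
              f"FWHM cutoff kernel: {fwhm_of(kernel_d, d):.2f} A,",
              f"FWHM B-factor kernel: {fwhm_of(kernel_B, B):.2f} A")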
>> >>>>
>> >>>> What I did then was set all atomic B factors to B = 80*(d/3)**2 and then
>> >>>> cut the resolution at "d".  Seemed sensible at the time.  I suppose I
>> >>>> could have used the PDB-wide average atomic B factor reported for
>> >>>> structures with resolution "d", which roughly follows:
>> >>>>
>> >>>> B = 4*d**2 + 12
>> >>>> https://bl831.als.lbl.gov/~jamesh/pickup/reso_vs_avgB.png
>> >>>>
>> >>>> The reason I didn't use this formula for the movie is that I didn't
>> >>>> figure it out until about 10 years later.  These two curves cross at
>> >>>> about 1.5 A, but diverge significantly at poor resolution.  So, which one
>> >>>> is right?  It depends on how well you can measure really, really faint
>> >>>> spots, and we've been getting better at that in recent decades.
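The two rules side by side; the crossing point is just the algebraic solution of
80*(d/3)**2 = 4*d**2 + 12, and the test resolutions are arbitrary:

    import numpy as np

    def b_movie(d):
        """B used for the resolution movie: B = 80*(d/3)**2."""
        return 80.0 * (d / 3.0)**2

    def b_pdb_avg(d):
        """Rough PDB-wide average atomic B at resolution d: B = 4*d**2 + 12."""
        return 4.0 * d**2 + 12.0

    d_cross = np.sqrt(12.0 / (80.0 / 9.0 - 4.0))
    print(f"curves cross near d = {d_cross:.2f} A")   # about 1.5-1.6 A

    for d in (1.0, 1.5, 2.0, 3.0, 4.0):
        print(f"d = {d:.1f} A: movie rule B = {b_movie(d):5.1f}, PDB-average B = {b_pdb_avg(d):5.1f}")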
>> >>>>
>> >>>> So, what I'm trying to say here is that just because your data has CC1/2
>> >>>> or FSC dropping off to insignificance at 1.8 A doesn't mean you are
>> >>>> going to see holes in Tyr side chains.  However, if you measure your
>> >>>> weak, high-resolution data really well (high multiplicity), you might be
>> >>>> able to sharpen your way to a much clearer map.
>> >>>>
>> >>>> -James Holton
>> >>>> MAD Scientist
>> >>>>