Re: [ccp4bb] Series termination effect calculation.

Pavel Afonine Thu, 20 Sep 2012 02:24:13 -0700

Hi James,

using dynamic N-Gaussian approximation to form-factor tables as described
here (pages 27-29):


http://cci.lbl.gov/publications/download/iucrcompcomm_jan2004.pdf

and used in Phenix since 2004, avoids both: singularity at B=0 and
inaccurate density values (compared to the raw forma-factor tables) for
B->0.

Attached is the script that proves this point. To run, simply
"phenix.python run.py".

Pavel

On Sun, Sep 16, 2012 at 11:32 PM, James Holton <jmhol...@lbl.gov> wrote:

> Yes, the constant term in the "5-Gaussian" structure factor tables does
> become annoying when you try to plot electron density in real space, but
> only if you try to make the B factor zero.  If the B factors are ~12 (like
> they are in 1m1n), then the electron density 2.0 A from an Fe atom is not
> -0.2 e-/A^3, it is 0.025 e-/A^3. This is only 1% of the electron density at
> the center of a nitrogen atom with the same B factor.
>
> But if you do set the B factor to zero, then the electron density at the
> center of any atom (using the 5-Gaussian model) is infinity.  To put it in
> gnuplot-ish, the structure factor of Fe (in reciprocal space) can be
> plotted with this function:
> Fe_sf(s)=Fe_a1*exp(-Fe_b1*s*s)**+Fe_a2*exp(-Fe_b2*s*s)+Fe_a3***
> exp(-Fe_b3*s*s)+Fe_a4*exp(-Fe_**b4*s*s)+Fe_c
>
> where:
> Fe_c = 1.036900;
> Fe_a1 = 11.769500; Fe_a2 = 7.357300; Fe_a3 = 3.522200; Fe_a4 = 2.304500;
> Fe_b1 = 4.761100; Fe_b2 = 0.307200; Fe_b3 = 15.353500; Fe_b4 = 76.880501;
> and "s" is sin(theta)/lambda
>
> applying a B factor is then just multiplication by exp(-B*s*s)
>
>
> Since the terms are all Gaussians, the inverse Fourier transform can
> actually be done analytically, giving the real-space version, or the
> expression for electron density vs distance from the nucleus (r):
>
> Fe_ff(r,B) = \
>   +Fe_a1*(4*pi/(Fe_b1+B))**1.5***safexp(-4*pi**2/(Fe_b1+B)*r*r) \
>   +Fe_a2*(4*pi/(Fe_b2+B))**1.5***safexp(-4*pi**2/(Fe_b2+B)*r*r) \
>   +Fe_a3*(4*pi/(Fe_b3+B))**1.5***safexp(-4*pi**2/(Fe_b3+B)*r*r) \
>   +Fe_a4*(4*pi/(Fe_b4+B))**1.5***safexp(-4*pi**2/(Fe_b4+B)*r*r) \
>   +Fe_c *(4*pi/(B))**1.5*safexp(-4*pi****2/(B)*r*r);
>
> Where here applying a B factor requires folding it into each Gaussian
> term.  Notice how the Fe_c term blows up as B->0? This is where most of the
> series-termination effects come from. If you want the above equations for
> other atoms, you can get them from here:
> http://bl831.als.lbl.gov/~**jamesh/pickup/all_atomsf.**gnuplot<http://bl831.als.lbl.gov/~jamesh/pickup/all_atomsf.gnuplot>
> http://bl831.als.lbl.gov/~**jamesh/pickup/all_atomff.**gnuplot<http://bl831.als.lbl.gov/~jamesh/pickup/all_atomff.gnuplot>
>
> This "infinitely sharp spike problem" seems to have led some people to
> conclude that a zero B factor is non-physical, but nothing could be further
> from the truth!  The scattering from mono-atomic gasses is an excellent
> example of how one can observe the B=0 structure factor.   In fact, gas
> scattering is how the quantum mechanical self-consistent field calculations
> of electron clouds around atoms was experimentally verified.  Does this
> mean that there really is an infinitely sharp "spike" in the middle of
> every atom?  Of course not.  But there is a "very" sharp spike.
>
> So, the problem of "infinite density" at the nucleus is really just an
> artifact of the 5-Gaussian formalism.  Strictly speaking, the "5-Gaussian"
> structure factor representation you find in ${CLIBD}/atomsf.lib (or Table
> 6.1.1.4 in the International Tables volume C) is nothing more than a curve
> fit to the "true" values listed in ITC volume C tables 6.1.1.1 (neutral
> atoms) and 6.1.1.3 (ions).  These latter tables are the Fourier transform
> of the "true" electron density distribution around a particular atom/ion
> obtained from quantum mechanical self-consistent field calculations (like
> those of Cromer, Mann and many others).
>
> The important thing to realize is that the fit was done in _reciprocal_
> space, and if you look carefully at tables 6.1.1.1 and 6.1.1.3, you can see
> that even at REALLY high angle (sin(theta)/lambda = 6, or 0.083 A
> resolution) there is still significant elastic scattering from the heavier
> atoms.  The purpose of the "constant term" in the 5-Gaussian representation
> is to try and capture this high-angle "tail", and for the really heavy
> atoms this can be more than 5 electron equivalents.  In real space, this is
> equivalent to saying that about 5 electrons are located within at least
> ~0.03 A of the nucleus.  That's a very short distance, but it is also not
> zero.  This is because the first few shells of electrons around things like
> a Uranium nucleus actually are very small and dense.  How, then, can we
> have any hope of modelling heavy atoms properly without using a map grid
> sampling of 0.01A ?  Easy!  The B factors are never zero.
>
> Even for a truly infinitely sharp peak (aka a single electron), it doesn't
> take much of a B factor to spread it out to a reasonable size.  For
> example, applying a B factor of 9 to a point charge will give it a
> full-width-half max (FWHM) of 0.8 A, the same as the "diameter" of a carbon
> atom.  A carbon atom with B=12 has FWHM = 1.1 A, the same as a "point"
> charge with B=16.  Carbon at B=80 and a point with B=93 both have FWHM =
> 2.6 A.  As the B factor becomes larger and larger, it tends to dominate the
> atomic shape (looks like a single Gaussian).  This is why it is so hard to
> assign atom types from density alone.  In fact, with B=80, a Uranium atom
> at 1/100th occupancy is essentially indistinguishable from a hydrogen atom.
> That is, even a modest B factor pretty much "washes out" any sharp features
> the atoms might have.  Sometimes I wonder why we bother with "form factors"
> at all, since at modest resolutions all we really need is Z (the atomic
> number) and the B factor.  But, then again, I suppose it doesn't hurt
> either.
>
>
> So, what does this have to do with series termination?  Series termination
> arises in the inverse Fourier transform (making a map from structure
> factors).  Technically, the "tails" of a Gaussian never reach zero, so any
> sort of "resolution cutoff" always introduces some error into the electron
> density calculation.  That is, if you create an arbitrary electron-density
> map, convert it into structure factors and then "fft" it back, you do _not_
> get the same map that you started with!  How much do they differ? Depends
> on the RMS value of the high-angle structure factors that have been cut off
> (Parseval's theorem).  The "infinitely sharp spike" problem exacerbates
> this, because the B=0 structure factors do not tend toward zero as fast as
> a Gaussian with the "atomic width" would.
>
> So, for a given resolution, when does the B factor get "too sharp"?  Well,
> for "protein" atoms, the following B factors will introduce an rms error in
> the electron density map equal to about 5% of the peak height of the atoms
> when the data are cut to the following resolution:
> d     B
> 1.0 <5
> 1.5 8
> 2.0 27
> 2.5 45
> 3.0 65
> 3.5 86
> 4.0 >99
>
> smaller B factors than this will introduce more than 5% error at each of
> these resolutions.  Now, of course, one is often not nearly as concerned
> with the average error in the map as you are with the error at a particular
> point of interest, but the above numbers can serve as a rough guide.  If
> you want to see the series-termination error at a particular point in the
> map, you will have to calculate the "true" map of your model (using a
> program like SFALL), and then run the map back and forth through the
> Fourier transform and resolution cutoff (such as with SFALL and FFT).  You
> can then use MAPMAN or Chimera to probe the electron density at the point
> of interest.
>
> But, to answer the OP's question, I would not recommend trying to do fancy
> map interpretation to identify a mystery atom.  Instead, just refine the
> occupancy of the mystery atom and see where that goes.  Perhaps jiggling
> the rest of the molecule with "kick maps" to see how stable the occupancy
> is.  Since refinement only does forward-FFTs, it is formally insensitive to
> series termination errors.  It is only map calculation where series
> termination can become a problem.
>
> Thanks to Garib for clearing up that last point for me!
>
> -James Holton
> MAD Scientist
>
>
>
> On 9/15/2012 3:12 AM, Tim Gruene wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Dear Ian,
>>
>> provided that f(s) is given by the formula in the Cromer/Mann article,
>> which I believe we have agreed on, the inset of Fig.1 of the Science
>> article we are talking about is claimed to be the graph of the
>> function g, which I added as pdf to this email for better readability.
>>
>> Irrespective of what has been plotted in any other article meantioned
>> throughout this thread, this claim is incorrect, given a_i, b_i, c > 0.
>>
>> I am sure you can figure this out yourself. My argument was not
>> involving mathematical programs but only one-dimensional calculus.
>>
>> Cheers,
>> Tim
>>
>> On 09/14/2012 04:46 PM, Ian Tickle wrote:
>>
>>> On 14 September 2012 15:15, Tim Gruene <t...@shelx.uni-ac.gwdg.de>
>>> wrote:
>>>
>>>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>>>>
>>>> Hello Ian,
>>>>
>>>> your article describes f(s) as sum of four Gaussians, which is
>>>> not the same f(s) from Cromer's and Mann's paper and the one used
>>>> both by Niu and me. Here, f(s) contains a constant, as I pointed
>>>> out to in my response, which makes the integral oscillate between
>>>> plus and minus infinity as the upper integral border (called
>>>> 1/dmax in the article Niu refers to) goes to infinity).
>>>>
>>>> Maybe you can shed some light on why your article uses a
>>>> different f(s) than Cromer/Mann. This explanation might be the
>>>> answer to Nius question, I reckon, and feed my curiosity, too.
>>>>
>>> Tim & Niu, oops yes a small slip in the paper there, it should
>>> have read "4 Gaussians + constant term": this is clear from the
>>> ITC reference given and the $CLIBD/atomsf.lib table referred to.
>>> In practice it's actually rendered as a sum of 5 Gaussians after
>>> you multiply the f(s) and atomic Biso factor terms, so unless Biso
>>> = 0 (very unphysical!) there is actually no constant term.  My
>>> integral for rho(r) certainly doesn't oscillate between plus and
>>> minus infinity as d_min -> zero.  If yours does then I suspect that
>>> either the Biso term was forgotten or if not then a bug in the
>>> integration routine (e.g. can it handle properly the point at r = 0
>>> where the standard formula for the density gives 0/0?).  I used
>>> QUADPACK
>>> (http://people.sc.fsu.edu/~**jburkardt/f_src/quadpack/**quadpack.html<http://people.sc.fsu.edu/~jburkardt/f_src/quadpack/quadpack.html>
>>> )
>>> which seems pretty good at taking care of such singularities
>>> (assuming of course that the integral does actually converge).
>>>
>>> Cheers
>>>
>>> -- Ian
>>>
>>>  - -- - --
>> Dr Tim Gruene
>> Institut fuer anorganische Chemie
>> Tammannstr. 4
>> D-37077 Goettingen
>>
>> GPG Key ID = A46BEE1A
>>
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG v1.4.12 (GNU/Linux)
>> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>>
>> iD8DBQFQVFSBUxlJ7aRr7hoRAoPYAK**DNQu84ozIz5Mn/qmRKiLxXPw/**zPgCgwd75
>> KUHsKzaSdi9mL5kzZBeOqUI=
>> =mbnY
>> -----END PGP SIGNATURE-----
>>
>

run.py
Description: Binary data

Re: [ccp4bb] Series termination effect calculation.

Reply via email to